web/views/embedding.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Mon, 05 Jul 2010 17:06:11 +0200
branchstable
changeset 5886 00a78298d30d
parent 5424 8ecbcbff9777
child 5890 141b935a38fc
permissions -rw-r--r--
cleanups
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     1
# copyright 2003-2010 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     3
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     4
# This file is part of CubicWeb.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     5
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     9
# any later version.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    10
#
5424
8ecbcbff9777 replace logilab-common by CubicWeb in disclaimer
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5421
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    14
# details.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    15
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    18
"""Objects interacting together to provides the external page embeding
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    19
functionality.
5886
00a78298d30d cleanups
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    20
"""
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    21
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    22
__docformat__ = "restructuredtext en"
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    23
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    24
import re
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    25
from urlparse import urljoin
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    26
from urllib2 import urlopen, Request, HTTPError
2808
497424219fb0 fix urlquote imports
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2798
diff changeset
    27
from urllib import quote as urlquote # XXX should use view.url_quote method
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    28
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    29
from logilab.mtconverter import guess_encoding
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    30
692
800592b8d39b replace deprecated cubicweb.common.selectors by its new module path (cubicweb.selectors)
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 631
diff changeset
    31
from cubicweb.selectors import (one_line_rset, score_entity,
800592b8d39b replace deprecated cubicweb.common.selectors by its new module path (cubicweb.selectors)
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 631
diff changeset
    32
                                match_search_state, implements)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    33
from cubicweb.interfaces import IEmbedable
800
860451b72ab7 update import
sylvain.thenault@logilab.fr
parents: 742
diff changeset
    34
from cubicweb.view import NOINDEX, NOFOLLOW
4023
eae23c40627a drop common subpackage
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3460
diff changeset
    35
from cubicweb.uilib import soup2xhtml
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    36
from cubicweb.web.controller import Controller
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    37
from cubicweb.web.action import Action
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    38
from cubicweb.web.views import basetemplates
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    39
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    40
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    41
class ExternalTemplate(basetemplates.TheMainTemplate):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    42
    """template embeding an external web pages into CubicWeb web interface
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    43
    """
3377
dd9d292b6a6d use __regid__ instead of id on appobject classes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2808
diff changeset
    44
    __regid__ = 'external'
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
    45
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    46
    def call(self, body):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    47
        # XXX fallback to HTML 4 mode when embeding ?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    48
        self.set_request_content_type()
3451
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    49
        self._cw.search_state = ('normal',)
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    50
        self.template_header(self.content_type, None, self._cw._('external page'),
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    51
                             [NOINDEX, NOFOLLOW])
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    52
        self.content_header()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    53
        self.w(body)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    54
        self.content_footer()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    55
        self.template_footer()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    56
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    57
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    58
class EmbedController(Controller):
3377
dd9d292b6a6d use __regid__ instead of id on appobject classes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2808
diff changeset
    59
    __regid__ = 'embed'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    60
    template = 'external'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    61
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    62
    def publish(self, rset=None):
3451
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    63
        req = self._cw
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    64
        if 'custom_css' in req.form:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    65
            req.add_css(req.form['custom_css'])
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    66
        embedded_url = req.form['url']
4083
3b285889b8e9 3.6 api update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4023
diff changeset
    67
        allowed = self._cw.vreg.config['embed-allowed']
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    68
        _ = req._
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    69
        if allowed is None or not allowed.match(embedded_url):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    70
            body = '<h2>%s</h2><h3>%s</h3>' % (
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    71
                _('error while embedding page'),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    72
                _('embedding this url is forbidden'))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    73
        else:
3451
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    74
            prefix = req.build_url(self.__regid__, url='')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    75
            authorization = req.get_header('Authorization')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    76
            if authorization:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    77
                headers = {'Authorization' : authorization}
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    78
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    79
                headers = {}
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    80
            try:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    81
                body = embed_external_page(embedded_url, prefix,
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    82
                                           headers, req.form.get('custom_css'))
3451
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    83
                body = soup2xhtml(body, self._cw.encoding)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    84
            except HTTPError, err:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    85
                body = '<h2>%s</h2><h3>%s</h3>' % (
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    86
                    _('error while embedding page'), err)
1092
b8fbb95dc0eb process_rql now done in the controller
sylvain.thenault@logilab.fr
parents: 800
diff changeset
    87
        self.process_rql(req.form.get('rql'))
3451
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    88
        return self._cw.vreg['views'].main_template(req, self.template,
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
    89
                                                rset=self.cw_rset, body=body)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    90
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    91
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    92
def entity_has_embedable_url(entity):
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    93
    """return 1 if the entity provides an allowed embedable url"""
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    94
    url = entity.embeded_url()
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    95
    if not url or not url.strip():
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    96
        return 0
4083
3b285889b8e9 3.6 api update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4023
diff changeset
    97
    allowed = entity._cw.vreg.config['embed-allowed']
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    98
    if allowed is None or not allowed.match(url):
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    99
        return 0
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   100
    return 1
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   101
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   102
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   103
class EmbedAction(Action):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   104
    """display an 'embed' link on entity implementing `embeded_url` method
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   105
    if the returned url match embeding configuration
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   106
    """
3377
dd9d292b6a6d use __regid__ instead of id on appobject classes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2808
diff changeset
   107
    __regid__ = 'embed'
742
99115e029dca replaced most of __selectors__ assignments with __select__
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 692
diff changeset
   108
    __select__ = (one_line_rset() & match_search_state('normal')
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   109
                  & implements(IEmbedable)
742
99115e029dca replaced most of __selectors__ assignments with __select__
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 692
diff changeset
   110
                  & score_entity(entity_has_embedable_url))
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   111
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   112
    title = _('embed')
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   113
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   114
    def url(self, row=0):
3451
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
   115
        entity = self.cw_rset.get_entity(row, 0)
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
   116
        url = urljoin(self._cw.base_url(), entity.embeded_url())
6b46d73823f5 [api] work in progress, use __regid__, cw_*, etc.
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3377
diff changeset
   117
        if self._cw.form.has_key('rql'):
3460
e4843535db25 [api] some more _cw / __regid__, automatic tests now pass again
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3451
diff changeset
   118
            return self._cw.build_url('embed', url=url, rql=self._cw.form['rql'])
e4843535db25 [api] some more _cw / __regid__, automatic tests now pass again
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 3451
diff changeset
   119
        return self._cw.build_url('embed', url=url)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   120
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   121
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   122
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   123
# functions doing necessary substitutions to embed an external html page ######
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   124
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   125
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   126
BODY_RGX = re.compile('<body.*?>(.*?)</body>', re.I | re.S | re.U)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   127
HREF_RGX = re.compile('<a\s+href="([^"]*)"', re.I | re.S | re.U)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   128
SRC_RGX = re.compile('<img\s+src="([^"]*)"', re.I | re.S | re.U)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   129
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   130
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   131
class replace_href:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   132
    def __init__(self, prefix, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   133
        self.prefix = prefix
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   134
        self.custom_css = custom_css
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   135
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   136
    def __call__(self, match):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   137
        original_url = match.group(1)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   138
        url = self.prefix + urlquote(original_url, safe='')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   139
        if self.custom_css is not None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   140
            if '?' in url:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   141
                url = '%s&amp;custom_css=%s' % (url, self.custom_css)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   142
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   143
                url = '%s?custom_css=%s' % (url, self.custom_css)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   144
        return '<a href="%s"' % url
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   145
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   146
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   147
class absolutize_links:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   148
    def __init__(self, embedded_url, tag, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   149
        self.embedded_url = embedded_url
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   150
        self.tag = tag
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   151
        self.custom_css = custom_css
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   152
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   153
    def __call__(self, match):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   154
        original_url = match.group(1)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   155
        if '://' in original_url:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   156
            return match.group(0) # leave it unchanged
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   157
        return '%s="%s"' % (self.tag, urljoin(self.embedded_url, original_url))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   158
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   159
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   160
def prefix_links(body, prefix, embedded_url, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   161
    filters = ((HREF_RGX, absolutize_links(embedded_url, '<a href', custom_css)),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   162
               (SRC_RGX, absolutize_links(embedded_url, '<img src')),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   163
               (HREF_RGX, replace_href(prefix, custom_css)))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   164
    for rgx, repl in filters:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   165
        body = rgx.sub(repl, body)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   166
    return body
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   167
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   168
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   169
def embed_external_page(url, prefix, headers=None, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   170
    req = Request(url, headers=(headers or {}))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   171
    content = urlopen(req).read()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   172
    page_source = unicode(content, guess_encoding(content), 'replace')
1132
96752791c2b6 pylint cleanup
sylvain.thenault@logilab.fr
parents: 1092
diff changeset
   173
    page_source = page_source
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   174
    match = BODY_RGX.search(page_source)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   175
    if match is None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   176
        return page_source
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   177
    return prefix_links(match.group(1), prefix, url, custom_css)