web/views/embedding.py
author Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
Thu, 14 May 2009 11:38:40 +0200
branchtls-sprint
changeset 1802 d628defebc17
parent 1132 96752791c2b6
child 1977 606923dff11b
permissions -rw-r--r--
delete-trailing-whitespace + some copyright update
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     1
"""Objects interacting together to provides the external page embeding
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     2
functionality.
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     3
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     4
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     5
:organization: Logilab
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
     6
:copyright: 2001-2009 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     7
:contact: http://www.logilab.fr/ -- mailto:contact@logilab.fr
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     8
"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
     9
__docformat__ = "restructuredtext en"
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    10
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    11
import re
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    12
from urlparse import urljoin
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    13
from urllib2 import urlopen, Request, HTTPError
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    14
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    15
from logilab.mtconverter import guess_encoding
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    16
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    17
from cubicweb import urlquote # XXX should use view.url_quote method
692
800592b8d39b replace deprecated cubicweb.common.selectors by its new module path (cubicweb.selectors)
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 631
diff changeset
    18
from cubicweb.selectors import (one_line_rset, score_entity,
800592b8d39b replace deprecated cubicweb.common.selectors by its new module path (cubicweb.selectors)
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 631
diff changeset
    19
                                match_search_state, implements)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    20
from cubicweb.interfaces import IEmbedable
800
860451b72ab7 update import
sylvain.thenault@logilab.fr
parents: 742
diff changeset
    21
from cubicweb.view import NOINDEX, NOFOLLOW
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    22
from cubicweb.common.uilib import soup2xhtml
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    23
from cubicweb.web.controller import Controller
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    24
from cubicweb.web.action import Action
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    25
from cubicweb.web.views import basetemplates
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    26
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    27
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    28
class ExternalTemplate(basetemplates.TheMainTemplate):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    29
    """template embeding an external web pages into CubicWeb web interface
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    30
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    31
    id = 'external'
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
    32
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    33
    def call(self, body):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    34
        # XXX fallback to HTML 4 mode when embeding ?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    35
        self.set_request_content_type()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    36
        self.req.search_state = ('normal',)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    37
        self.template_header(self.content_type, None, self.req._('external page'),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    38
                             [NOINDEX, NOFOLLOW])
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    39
        self.content_header()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    40
        self.w(body)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    41
        self.content_footer()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    42
        self.template_footer()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    43
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    44
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    45
class EmbedController(Controller):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    46
    id = 'embed'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    47
    template = 'external'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    48
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    49
    def publish(self, rset=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    50
        req = self.req
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    51
        if 'custom_css' in req.form:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    52
            req.add_css(req.form['custom_css'])
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    53
        embedded_url = req.form['url']
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    54
        allowed = self.config['embed-allowed']
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    55
        _ = req._
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    56
        if allowed is None or not allowed.match(embedded_url):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    57
            body = '<h2>%s</h2><h3>%s</h3>' % (
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    58
                _('error while embedding page'),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    59
                _('embedding this url is forbidden'))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    60
        else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    61
            prefix = req.build_url(self.id, url='')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    62
            authorization = req.get_header('Authorization')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    63
            if authorization:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    64
                headers = {'Authorization' : authorization}
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    65
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    66
                headers = {}
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    67
            try:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    68
                body = embed_external_page(embedded_url, prefix,
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    69
                                           headers, req.form.get('custom_css'))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    70
                body = soup2xhtml(body, self.req.encoding)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    71
            except HTTPError, err:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    72
                body = '<h2>%s</h2><h3>%s</h3>' % (
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    73
                    _('error while embedding page'), err)
1092
b8fbb95dc0eb process_rql now done in the controller
sylvain.thenault@logilab.fr
parents: 800
diff changeset
    74
        self.process_rql(req.form.get('rql'))
b8fbb95dc0eb process_rql now done in the controller
sylvain.thenault@logilab.fr
parents: 800
diff changeset
    75
        return self.vreg.main_template(req, self.template, rset=self.rset, body=body)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    76
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    77
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    78
def entity_has_embedable_url(entity):
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    79
    """return 1 if the entity provides an allowed embedable url"""
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    80
    url = entity.embeded_url()
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    81
    if not url or not url.strip():
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    82
        return 0
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    83
    allowed = entity.config['embed-allowed']
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    84
    if allowed is None or not allowed.match(url):
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    85
        return 0
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    86
    return 1
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    87
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    88
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    89
class EmbedAction(Action):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    90
    """display an 'embed' link on entity implementing `embeded_url` method
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    91
    if the returned url match embeding configuration
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    92
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    93
    id = 'embed'
742
99115e029dca replaced most of __selectors__ assignments with __select__
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 692
diff changeset
    94
    __select__ = (one_line_rset() & match_search_state('normal')
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
    95
                  & implements(IEmbedable)
742
99115e029dca replaced most of __selectors__ assignments with __select__
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 692
diff changeset
    96
                  & score_entity(entity_has_embedable_url))
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
    97
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    98
    title = _('embed')
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
    99
    controller = 'embed'
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   100
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   101
    def url(self, row=0):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   102
        entity = self.rset.get_entity(row, 0)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   103
        url = urljoin(self.req.base_url(), entity.embeded_url())
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   104
        if self.req.form.has_key('rql'):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   105
            return self.build_url(url=url, rql=self.req.form['rql'])
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   106
        return self.build_url(url=url)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   107
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   108
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   109
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   110
# functions doing necessary substitutions to embed an external html page ######
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   111
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   112
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   113
BODY_RGX = re.compile('<body.*?>(.*?)</body>', re.I | re.S | re.U)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   114
HREF_RGX = re.compile('<a\s+href="([^"]*)"', re.I | re.S | re.U)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   115
SRC_RGX = re.compile('<img\s+src="([^"]*)"', re.I | re.S | re.U)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   116
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   117
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   118
class replace_href:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   119
    def __init__(self, prefix, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   120
        self.prefix = prefix
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   121
        self.custom_css = custom_css
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   122
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   123
    def __call__(self, match):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   124
        original_url = match.group(1)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   125
        url = self.prefix + urlquote(original_url, safe='')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   126
        if self.custom_css is not None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   127
            if '?' in url:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   128
                url = '%s&amp;custom_css=%s' % (url, self.custom_css)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   129
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   130
                url = '%s?custom_css=%s' % (url, self.custom_css)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   131
        return '<a href="%s"' % url
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   132
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   133
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   134
class absolutize_links:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   135
    def __init__(self, embedded_url, tag, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   136
        self.embedded_url = embedded_url
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   137
        self.tag = tag
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   138
        self.custom_css = custom_css
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1132
diff changeset
   139
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   140
    def __call__(self, match):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   141
        original_url = match.group(1)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   142
        if '://' in original_url:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   143
            return match.group(0) # leave it unchanged
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   144
        return '%s="%s"' % (self.tag, urljoin(self.embedded_url, original_url))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   145
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   146
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   147
def prefix_links(body, prefix, embedded_url, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   148
    filters = ((HREF_RGX, absolutize_links(embedded_url, '<a href', custom_css)),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   149
               (SRC_RGX, absolutize_links(embedded_url, '<img src')),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   150
               (HREF_RGX, replace_href(prefix, custom_css)))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   151
    for rgx, repl in filters:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   152
        body = rgx.sub(repl, body)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   153
    return body
631
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   154
99f5852f8604 major selector refactoring (mostly to avoid looking for select parameters on the target class), start accept / interface unification)
sylvain.thenault@logilab.fr
parents: 431
diff changeset
   155
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   156
def embed_external_page(url, prefix, headers=None, custom_css=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   157
    req = Request(url, headers=(headers or {}))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   158
    content = urlopen(req).read()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   159
    page_source = unicode(content, guess_encoding(content), 'replace')
1132
96752791c2b6 pylint cleanup
sylvain.thenault@logilab.fr
parents: 1092
diff changeset
   160
    page_source = page_source
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   161
    match = BODY_RGX.search(page_source)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   162
    if match is None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   163
        return page_source
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   164
    return prefix_links(match.group(1), prefix, url, custom_css)