entities/sources.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Wed, 09 Nov 2011 14:34:03 +0100
changeset 8068 72210779ff6d
parent 7995 9a9f35ef418c
child 8483 4ba11607d84a
permissions -rw-r--r--
[data import log] log on the source so we properly have source name information
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     1
# copyright 2003-2011 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     3
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     4
# This file is part of CubicWeb.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     5
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     9
# any later version.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    10
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    14
# details.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    15
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    18
"""data source related entities"""
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    19
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    20
__docformat__ = "restructuredtext en"
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    21
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    22
import re
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    23
from socket import gethostname
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
    24
import logging
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    25
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    26
from logilab.common.textutils import text_to_dict
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    27
from logilab.common.configuration import OptionError
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
    28
from logilab.mtconverter import xml_escape
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    29
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    30
from cubicweb import ValidationError
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    31
from cubicweb.entities import AnyEntity, fetch_config
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    32
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
class _CWSourceCfgMixIn(object):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    34
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    35
    def dictconfig(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    36
        return self.config and text_to_dict(self.config) or {}
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    37
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    38
    def update_config(self, skip_unknown=False, **config):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    39
        from cubicweb.server import SOURCE_TYPES
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
        from cubicweb.server.serverconfig import (SourceConfiguration,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    41
                                                  generate_source_config)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    42
        cfg = self.dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
        cfg.update(config)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
        options = SOURCE_TYPES[self.type].options
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
        sconfig = SourceConfiguration(self._cw.vreg.config, options=options)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    46
        for opt, val in cfg.iteritems():
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    47
            try:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    48
                sconfig.set_option(opt, val)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    49
            except OptionError:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    50
                if skip_unknown:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    51
                    continue
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    52
                raise
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    53
        cfgstr = unicode(generate_source_config(sconfig), self._cw.encoding)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
        self.set_attributes(config=cfgstr)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    55
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
class CWSource(_CWSourceCfgMixIn, AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    58
    __regid__ = 'CWSource'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
    59
    fetch_attrs, cw_fetch_order = fetch_config(['name', 'type'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    60
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    61
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    62
    def host_config(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
        dictconfig = self.dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
        host = gethostname()
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    65
        for hostcfg in self.host_configs:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    66
            if hostcfg.match(host):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    67
                self.info('matching host config %s for source %s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    68
                          hostcfg.match_host, self.name)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
                dictconfig.update(hostcfg.dictconfig)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
        return dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    72
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
    def host_configs(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    74
        return self.reverse_cw_host_config_of
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    75
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    76
    def init_mapping(self, mapping):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    77
        for key, options in mapping:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    78
            if isinstance(key, tuple): # relation definition
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    79
                assert len(key) == 3
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    80
                restrictions = ['X relation_type RT, RT name %(rt)s']
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    81
                kwargs = {'rt': key[1]}
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
                if key[0] != '*':
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
                    restrictions.append('X from_entity FT, FT name %(ft)s')
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    84
                    kwargs['ft'] = key[0]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    85
                if key[2] != '*':
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    86
                    restrictions.append('X to_entity TT, TT name %(tt)s')
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    87
                    kwargs['tt'] = key[2]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
                rql = 'Any X WHERE %s' % ','.join(restrictions)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    89
                schemarset = self._cw.execute(rql, kwargs)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    90
            elif key[0].isupper(): # entity type
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    91
                schemarset = self._cw.execute('CWEType X WHERE X name %(et)s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    92
                                              {'et': key})
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    93
            else: # relation type
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    94
                schemarset = self._cw.execute('CWRType X WHERE X name %(rt)s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    95
                                              {'rt': key})
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    96
            for schemaentity in schemarset.entities():
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    97
                self._cw.create_entity('CWSourceSchemaConfig',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    98
                                       cw_for_source=self,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    99
                                       cw_schema=schemaentity,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   100
                                       options=options)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   101
6945
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   102
    @property
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   103
    def repo_source(self):
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   104
        """repository only property, not available from the web side (eg
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   105
        self._cw is expected to be a server session)
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   106
        """
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   107
        return self._cw.repo.sources_by_eid[self.eid]
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   108
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   109
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   110
class CWSourceHostConfig(_CWSourceCfgMixIn, AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   111
    __regid__ = 'CWSourceHostConfig'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
   112
    fetch_attrs, cw_fetch_order = fetch_config(['match_host', 'config'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   113
6945
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   114
    @property
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   115
    def cwsource(self):
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   116
        return self.cw_host_config_of[0]
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   117
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   118
    def match(self, hostname):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   119
        return re.match(self.match_host, hostname)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   120
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   121
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   122
class CWSourceSchemaConfig(AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   123
    __regid__ = 'CWSourceSchemaConfig'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
   124
    fetch_attrs, cw_fetch_order = fetch_config(['cw_for_source', 'cw_schema', 'options'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   125
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   126
    def dc_title(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   127
        return self._cw._(self.__regid__) + ' #%s' % self.eid
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   128
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   129
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   130
    def schema(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   131
        return self.cw_schema[0]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   132
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   133
    @property
6957
ffda12be2e9f [repository] #1460066: backport datafeed cube as cubicweb source
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6945
diff changeset
   134
    def cwsource(self):
ffda12be2e9f [repository] #1460066: backport datafeed cube as cubicweb source
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6945
diff changeset
   135
        return self.cw_for_source[0]
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   136
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   137
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   138
class CWDataImport(AnyEntity):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   139
    __regid__ = 'CWDataImport'
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   140
    repo_source = _logs = None # please pylint
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   141
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   142
    def init(self):
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   143
        self._logs = []
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   144
        self.repo_source = self.cwsource.repo_source
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   145
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   146
    def dc_title(self):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   147
        return '%s [%s]' % (self.printable_value('start_timestamp'),
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   148
                            self.printable_value('status'))
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   149
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   150
    @property
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   151
    def cwsource(self):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   152
        return self.cw_import_of[0]
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   153
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   154
    def record_debug(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   155
        self._log(logging.DEBUG, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   156
        self.repo_source.debug(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   157
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   158
    def record_info(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   159
        self._log(logging.INFO, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   160
        self.repo_source.info(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   161
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   162
    def record_warning(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   163
        self._log(logging.WARNING, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   164
        self.repo_source.warning(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   165
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   166
    def record_error(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   167
        self._status = u'failed'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   168
        self._log(logging.ERROR, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   169
        self.repo_source.error(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   170
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   171
    def record_fatal(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   172
        self._status = u'failed'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   173
        self._log(logging.FATAL, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   174
        self.repo_source.fatal(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   175
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   176
    def _log(self, severity, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   177
        encodedmsg =  u'%s\t%s\t%s\t%s<br/>' % (severity, path or u'',
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   178
                                                line or u'', xml_escape(msg))
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   179
        self._logs.append(encodedmsg)
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   180
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   181
    def write_log(self, session, **kwargs):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   182
        if 'status' not in kwargs:
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   183
            kwargs['status'] = getattr(self, '_status', u'success')
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   184
        self.set_attributes(log=u'<br/>'.join(self._logs), **kwargs)
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   185
        self._logs = []