entities/sources.py
author Julien Cristau <julien.cristau@logilab.fr>
Wed, 09 Jul 2014 15:59:34 +0200
changeset 9907 696b81eba218
parent 8930 6a02be304486
child 10662 10942ed172de
permissions -rw-r--r--
[dataimport] Update RQLObjectStore to Connection API Take a connection instead of a session. Don't play games with set_cnxset.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     1
# copyright 2003-2011 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     3
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     4
# This file is part of CubicWeb.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     5
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     9
# any later version.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    10
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    14
# details.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    15
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    18
"""data source related entities"""
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    19
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    20
__docformat__ = "restructuredtext en"
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    21
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    22
import re
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    23
from socket import gethostname
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
    24
import logging
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    25
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    26
from logilab.common.textutils import text_to_dict
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    27
from logilab.common.configuration import OptionError
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
    28
from logilab.mtconverter import xml_escape
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    29
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    30
from cubicweb.entities import AnyEntity, fetch_config
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    31
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    32
class _CWSourceCfgMixIn(object):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    34
    def dictconfig(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    35
        return self.config and text_to_dict(self.config) or {}
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    36
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    37
    def update_config(self, skip_unknown=False, **config):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    38
        from cubicweb.server import SOURCE_TYPES
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    39
        from cubicweb.server.serverconfig import (SourceConfiguration,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
                                                  generate_source_config)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    41
        cfg = self.dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    42
        cfg.update(config)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
        options = SOURCE_TYPES[self.type].options
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
        sconfig = SourceConfiguration(self._cw.vreg.config, options=options)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
        for opt, val in cfg.iteritems():
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    46
            try:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    47
                sconfig.set_option(opt, val)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    48
            except OptionError:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    49
                if skip_unknown:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    50
                    continue
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    51
                raise
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    52
        cfgstr = unicode(generate_source_config(sconfig), self._cw.encoding)
8483
4ba11607d84a [entity api] unify set_attributes / set_relations into a cw_set method. Closes #2423719
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8068
diff changeset
    53
        self.cw_set(config=cfgstr)
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    55
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
class CWSource(_CWSourceCfgMixIn, AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
    __regid__ = 'CWSource'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
    58
    fetch_attrs, cw_fetch_order = fetch_config(['name', 'type'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    59
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    60
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    61
    def host_config(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    62
        dictconfig = self.dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
        host = gethostname()
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
        for hostcfg in self.host_configs:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    65
            if hostcfg.match(host):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    66
                self.info('matching host config %s for source %s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    67
                          hostcfg.match_host, self.name)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    68
                dictconfig.update(hostcfg.dictconfig)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
        return dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    72
    def host_configs(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
        return self.reverse_cw_host_config_of
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    74
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    75
    def init_mapping(self, mapping):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    76
        for key, options in mapping:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    77
            if isinstance(key, tuple): # relation definition
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    78
                assert len(key) == 3
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    79
                restrictions = ['X relation_type RT, RT name %(rt)s']
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    80
                kwargs = {'rt': key[1]}
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    81
                if key[0] != '*':
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
                    restrictions.append('X from_entity FT, FT name %(ft)s')
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
                    kwargs['ft'] = key[0]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    84
                if key[2] != '*':
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    85
                    restrictions.append('X to_entity TT, TT name %(tt)s')
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    86
                    kwargs['tt'] = key[2]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    87
                rql = 'Any X WHERE %s' % ','.join(restrictions)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
                schemarset = self._cw.execute(rql, kwargs)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    89
            elif key[0].isupper(): # entity type
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    90
                schemarset = self._cw.execute('CWEType X WHERE X name %(et)s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    91
                                              {'et': key})
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    92
            else: # relation type
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    93
                schemarset = self._cw.execute('CWRType X WHERE X name %(rt)s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    94
                                              {'rt': key})
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    95
            for schemaentity in schemarset.entities():
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    96
                self._cw.create_entity('CWSourceSchemaConfig',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    97
                                       cw_for_source=self,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    98
                                       cw_schema=schemaentity,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    99
                                       options=options)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   100
6945
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   101
    @property
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   102
    def repo_source(self):
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   103
        """repository only property, not available from the web side (eg
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   104
        self._cw is expected to be a server session)
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   105
        """
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   106
        return self._cw.repo.sources_by_eid[self.eid]
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   107
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   108
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   109
class CWSourceHostConfig(_CWSourceCfgMixIn, AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   110
    __regid__ = 'CWSourceHostConfig'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
   111
    fetch_attrs, cw_fetch_order = fetch_config(['match_host', 'config'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   112
6945
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   113
    @property
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   114
    def cwsource(self):
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   115
        return self.cw_host_config_of[0]
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   116
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   117
    def match(self, hostname):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   118
        return re.match(self.match_host, hostname)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   119
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   120
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   121
class CWSourceSchemaConfig(AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   122
    __regid__ = 'CWSourceSchemaConfig'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
   123
    fetch_attrs, cw_fetch_order = fetch_config(['cw_for_source', 'cw_schema', 'options'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   124
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   125
    def dc_title(self):
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8483
diff changeset
   126
        return self._cw._(self.cw_etype) + ' #%s' % self.eid
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   127
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   128
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   129
    def schema(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   130
        return self.cw_schema[0]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   131
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   132
    @property
6957
ffda12be2e9f [repository] #1460066: backport datafeed cube as cubicweb source
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6945
diff changeset
   133
    def cwsource(self):
ffda12be2e9f [repository] #1460066: backport datafeed cube as cubicweb source
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6945
diff changeset
   134
        return self.cw_for_source[0]
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   135
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   136
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   137
class CWDataImport(AnyEntity):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   138
    __regid__ = 'CWDataImport'
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   139
    repo_source = _logs = None # please pylint
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   140
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   141
    def init(self):
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   142
        self._logs = []
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   143
        self.repo_source = self.cwsource.repo_source
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   144
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   145
    def dc_title(self):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   146
        return '%s [%s]' % (self.printable_value('start_timestamp'),
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   147
                            self.printable_value('status'))
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   148
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   149
    @property
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   150
    def cwsource(self):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   151
        return self.cw_import_of[0]
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   152
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   153
    def record_debug(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   154
        self._log(logging.DEBUG, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   155
        self.repo_source.debug(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   156
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   157
    def record_info(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   158
        self._log(logging.INFO, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   159
        self.repo_source.info(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   160
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   161
    def record_warning(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   162
        self._log(logging.WARNING, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   163
        self.repo_source.warning(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   164
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   165
    def record_error(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   166
        self._status = u'failed'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   167
        self._log(logging.ERROR, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   168
        self.repo_source.error(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   169
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   170
    def record_fatal(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   171
        self._status = u'failed'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   172
        self._log(logging.FATAL, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   173
        self.repo_source.fatal(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   174
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   175
    def _log(self, severity, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   176
        encodedmsg =  u'%s\t%s\t%s\t%s<br/>' % (severity, path or u'',
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   177
                                                line or u'', xml_escape(msg))
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   178
        self._logs.append(encodedmsg)
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   179
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   180
    def write_log(self, session, **kwargs):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   181
        if 'status' not in kwargs:
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   182
            kwargs['status'] = getattr(self, '_status', u'success')
8483
4ba11607d84a [entity api] unify set_attributes / set_relations into a cw_set method. Closes #2423719
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8068
diff changeset
   183
        self.cw_set(log=u'<br/>'.join(self._logs), **kwargs)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   184
        self._logs = []