entities/sources.py
author Julien Cristau <julien.cristau@logilab.fr>
Mon, 28 Apr 2014 11:20:26 +0200
changeset 9708 b36bc18f6ef7
parent 8930 6a02be304486
child 10662 10942ed172de
permissions -rw-r--r--
[migration] move 'entities' table changes from 3.19.0 to bootstrap script The 'mtime' and 'source' columns need to go away before we attempt to do anything else with the repo, otherwise any addition of an entity is going to explode.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     1
# copyright 2003-2011 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     3
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     4
# This file is part of CubicWeb.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     5
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     9
# any later version.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    10
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    14
# details.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    15
#
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    18
"""data source related entities"""
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    19
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    20
__docformat__ = "restructuredtext en"
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    21
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    22
import re
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    23
from socket import gethostname
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
    24
import logging
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    25
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    26
from logilab.common.textutils import text_to_dict
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    27
from logilab.common.configuration import OptionError
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
    28
from logilab.mtconverter import xml_escape
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    29
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    30
from cubicweb.entities import AnyEntity, fetch_config
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    31
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    32
class _CWSourceCfgMixIn(object):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    34
    def dictconfig(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    35
        return self.config and text_to_dict(self.config) or {}
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    36
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    37
    def update_config(self, skip_unknown=False, **config):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    38
        from cubicweb.server import SOURCE_TYPES
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    39
        from cubicweb.server.serverconfig import (SourceConfiguration,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
                                                  generate_source_config)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    41
        cfg = self.dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    42
        cfg.update(config)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
        options = SOURCE_TYPES[self.type].options
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
        sconfig = SourceConfiguration(self._cw.vreg.config, options=options)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
        for opt, val in cfg.iteritems():
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    46
            try:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    47
                sconfig.set_option(opt, val)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    48
            except OptionError:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    49
                if skip_unknown:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    50
                    continue
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    51
                raise
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    52
        cfgstr = unicode(generate_source_config(sconfig), self._cw.encoding)
8483
4ba11607d84a [entity api] unify set_attributes / set_relations into a cw_set method. Closes #2423719
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8068
diff changeset
    53
        self.cw_set(config=cfgstr)
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    55
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
class CWSource(_CWSourceCfgMixIn, AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
    __regid__ = 'CWSource'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
    58
    fetch_attrs, cw_fetch_order = fetch_config(['name', 'type'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    59
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    60
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    61
    def host_config(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    62
        dictconfig = self.dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
        host = gethostname()
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
        for hostcfg in self.host_configs:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    65
            if hostcfg.match(host):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    66
                self.info('matching host config %s for source %s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    67
                          hostcfg.match_host, self.name)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    68
                dictconfig.update(hostcfg.dictconfig)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
        return dictconfig
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    72
    def host_configs(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
        return self.reverse_cw_host_config_of
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    74
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    75
    def init_mapping(self, mapping):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    76
        for key, options in mapping:
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    77
            if isinstance(key, tuple): # relation definition
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    78
                assert len(key) == 3
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    79
                restrictions = ['X relation_type RT, RT name %(rt)s']
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    80
                kwargs = {'rt': key[1]}
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    81
                if key[0] != '*':
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
                    restrictions.append('X from_entity FT, FT name %(ft)s')
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
                    kwargs['ft'] = key[0]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    84
                if key[2] != '*':
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    85
                    restrictions.append('X to_entity TT, TT name %(tt)s')
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    86
                    kwargs['tt'] = key[2]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    87
                rql = 'Any X WHERE %s' % ','.join(restrictions)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
                schemarset = self._cw.execute(rql, kwargs)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    89
            elif key[0].isupper(): # entity type
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    90
                schemarset = self._cw.execute('CWEType X WHERE X name %(et)s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    91
                                              {'et': key})
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    92
            else: # relation type
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    93
                schemarset = self._cw.execute('CWRType X WHERE X name %(rt)s',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    94
                                              {'rt': key})
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    95
            for schemaentity in schemarset.entities():
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    96
                self._cw.create_entity('CWSourceSchemaConfig',
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    97
                                       cw_for_source=self,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    98
                                       cw_schema=schemaentity,
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    99
                                       options=options)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   100
6945
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   101
    @property
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   102
    def repo_source(self):
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   103
        """repository only property, not available from the web side (eg
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   104
        self._cw is expected to be a server session)
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   105
        """
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   106
        return self._cw.repo.sources_by_eid[self.eid]
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   107
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   108
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   109
class CWSourceHostConfig(_CWSourceCfgMixIn, AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   110
    __regid__ = 'CWSourceHostConfig'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
   111
    fetch_attrs, cw_fetch_order = fetch_config(['match_host', 'config'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   112
6945
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   113
    @property
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   114
    def cwsource(self):
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   115
        return self.cw_host_config_of[0]
28bf94d062a9 [sources] refactor source creation and options handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6944
diff changeset
   116
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   117
    def match(self, hostname):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   118
        return re.match(self.match_host, hostname)
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   119
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   120
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   121
class CWSourceSchemaConfig(AnyEntity):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   122
    __regid__ = 'CWSourceSchemaConfig'
7827
9bbf83f68bcc [entity] upgrade fetch_[unrelated_]order to benefit from changes introduced in 3.14 (closes #1942758)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6957
diff changeset
   123
    fetch_attrs, cw_fetch_order = fetch_config(['cw_for_source', 'cw_schema', 'options'])
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   124
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   125
    def dc_title(self):
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8483
diff changeset
   126
        return self._cw._(self.cw_etype) + ' #%s' % self.eid
6944
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   127
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   128
    @property
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   129
    def schema(self):
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   130
        return self.cw_schema[0]
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   131
0cf10429ad39 [sources] rewrite the way pyrorql mapping are stored in the database so it can be reused for other sources (eg datafeed+cwxml)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   132
    @property
6957
ffda12be2e9f [repository] #1460066: backport datafeed cube as cubicweb source
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6945
diff changeset
   133
    def cwsource(self):
ffda12be2e9f [repository] #1460066: backport datafeed cube as cubicweb source
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6945
diff changeset
   134
        return self.cw_for_source[0]
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   135
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   136
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   137
class CWDataImport(AnyEntity):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   138
    __regid__ = 'CWDataImport'
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   139
    repo_source = _logs = None # please pylint
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   140
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   141
    def init(self):
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   142
        self._logs = []
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   143
        self.repo_source = self.cwsource.repo_source
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   144
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   145
    def dc_title(self):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   146
        return '%s [%s]' % (self.printable_value('start_timestamp'),
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   147
                            self.printable_value('status'))
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   148
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   149
    @property
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   150
    def cwsource(self):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   151
        return self.cw_import_of[0]
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   152
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   153
    def record_debug(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   154
        self._log(logging.DEBUG, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   155
        self.repo_source.debug(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   156
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   157
    def record_info(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   158
        self._log(logging.INFO, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   159
        self.repo_source.info(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   160
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   161
    def record_warning(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   162
        self._log(logging.WARNING, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   163
        self.repo_source.warning(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   164
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   165
    def record_error(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   166
        self._status = u'failed'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   167
        self._log(logging.ERROR, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   168
        self.repo_source.error(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   169
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   170
    def record_fatal(self, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   171
        self._status = u'failed'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   172
        self._log(logging.FATAL, msg, path, line)
8068
72210779ff6d [data import log] log on the source so we properly have source name information
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   173
        self.repo_source.fatal(msg)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   174
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   175
    def _log(self, severity, msg, path=None, line=None):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   176
        encodedmsg =  u'%s\t%s\t%s\t%s<br/>' % (severity, path or u'',
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   177
                                                line or u'', xml_escape(msg))
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   178
        self._logs.append(encodedmsg)
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   179
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   180
    def write_log(self, session, **kwargs):
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   181
        if 'status' not in kwargs:
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   182
            kwargs['status'] = getattr(self, '_status', u'success')
8483
4ba11607d84a [entity api] unify set_attributes / set_relations into a cw_set method. Closes #2423719
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8068
diff changeset
   183
        self.cw_set(log=u'<br/>'.join(self._logs), **kwargs)
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7827
diff changeset
   184
        self._logs = []