sobjects/parsers.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Tue, 11 Oct 2011 13:50:32 +0200
branchstable
changeset 7932 2ad26cc3b5c6
parent 7916 54e240c1b419
child 7957 5da7ede69063
permissions -rw-r--r--
[datafeed parser] only update an entity attributes when needed. Closes #1989142 This may avoid error with some entities which have immutable attributes and don't check something actually changed. Also, avoid modifying the cached item representation, this may cause pb when an item is seen several times during an import.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     1
# copyright 2010-2011 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     3
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     4
# This file is part of CubicWeb.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     5
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     9
# any later version.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    10
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    14
# details.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    15
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    18
"""datafeed parser for xml generated by cubicweb
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    19
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    20
Example of mapping for CWEntityXMLParser::
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    21
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    22
  {u'CWUser': {                                        # EntityType
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    23
      (u'in_group', u'subject', u'link'): [            # (rtype, role, action)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    24
          (u'CWGroup', {u'linkattr': u'name'})],       #   -> rules = [(EntityType, options), ...]
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    25
      (u'tags', u'object', u'link-or-create'): [       # (...)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    26
          (u'Tag', {u'linkattr': u'name'})],           #   -> ...
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    27
      (u'use_email', u'subject', u'copy'): [           # (...)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    28
          (u'EmailAddress', {})]                       #   -> ...
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    29
      }
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    30
   }
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    31
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    32
"""
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
6963
5774d4ba4306 [datafeed] introduce a host mapping so dev instance may transparently redirect request to another host than the actual's one
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6960
diff changeset
    34
import os.path as osp
7879
9aae456abab5 [pylint] fix pylint detected errors and tweak it so that pylint -E will be much less verbose next time (+ update some copyrights on the way)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7727
diff changeset
    35
from datetime import datetime, timedelta, time
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
    36
from urllib import urlencode
7532
76914c7ba1b0 [python 2.5] compat fix
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7531
diff changeset
    37
from cgi import parse_qs # in urlparse with python >= 2.6
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    38
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    39
from logilab.common.date import todate, totime
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
from logilab.common.textutils import splitstrip, text_to_dict
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    41
from logilab.common.decorators import classproperty
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    42
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
from yams.constraints import BASE_CONVERTERS
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
from yams.schema import role_name as rn
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    46
from cubicweb import ValidationError, RegistryException, typed_eid
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    47
from cubicweb.view import Component
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    48
from cubicweb.server.sources import datafeed
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    49
from cubicweb.server.hook import match_rtype
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    50
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    51
# XXX see cubicweb.cwvreg.YAMS_TO_PY
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    52
# XXX see cubicweb.web.views.xmlrss.SERIALIZERS
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    53
DEFAULT_CONVERTERS = BASE_CONVERTERS.copy()
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
DEFAULT_CONVERTERS['String'] = unicode
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    55
DEFAULT_CONVERTERS['Password'] = lambda x: x.encode('utf8')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
def convert_date(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
    return todate(datetime.strptime(ustr, '%Y-%m-%d'))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    58
DEFAULT_CONVERTERS['Date'] = convert_date
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    59
def convert_datetime(ustr):
7002
29f085f6177b [sobjects/parsers] compatibility with xml view of cw <= 3.10
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7001
diff changeset
    60
    if '.' in ustr: # assume %Y-%m-%d %H:%M:%S.mmmmmm
29f085f6177b [sobjects/parsers] compatibility with xml view of cw <= 3.10
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7001
diff changeset
    61
        ustr = ustr.split('.',1)[0]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    62
    return datetime.strptime(ustr, '%Y-%m-%d %H:%M:%S')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
DEFAULT_CONVERTERS['Datetime'] = convert_datetime
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
def convert_time(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    65
    return totime(datetime.strptime(ustr, '%H:%M:%S'))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    66
DEFAULT_CONVERTERS['Time'] = convert_time
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    67
def convert_interval(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    68
    return time(seconds=int(ustr))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
DEFAULT_CONVERTERS['Interval'] = convert_interval
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
def extract_typed_attrs(eschema, stringdict, converters=DEFAULT_CONVERTERS):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    72
    typeddict = {}
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
    for rschema in eschema.subject_relations():
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    74
        if rschema.final and rschema in stringdict:
7932
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
    75
            if rschema in ('eid', 'cwuri', 'cwtype', 'cwsource'):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    76
                continue
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    77
            attrtype = eschema.destination(rschema)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    78
            value = stringdict[rschema]
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    79
            if value is not None:
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    80
                value = converters[attrtype](value)
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    81
            typeddict[rschema.type] = value
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
    return typeddict
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    84
def rtype_role_rql(rtype, role):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    85
    if role == 'object':
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    86
        return 'Y %s X WHERE X eid %%(x)s' % rtype
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    87
    else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
        return 'X %s Y WHERE X eid %%(x)s' % rtype
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    89
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    90
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    91
class CWEntityXMLParser(datafeed.DataFeedXMLParser):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    92
    """datafeed parser for the 'xml' entity view
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    93
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    94
    Most of the logic is delegated to the following components:
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    95
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    96
    * an "item builder" component, turning an etree xml node into a specific
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    97
      python dictionnary representing an entity
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    98
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    99
    * "action" components, selected given an entity, a relation and its role in
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   100
      the relation, and responsible to link the entity to given related items
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   101
      (eg dictionnary)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   102
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   103
    So the parser is only doing the gluing service and the connection to the
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   104
    source.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   105
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   106
    __regid__ = 'cw.entityxml'
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   107
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   108
    def __init__(self, *args, **kwargs):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   109
        super(CWEntityXMLParser, self).__init__(*args, **kwargs)
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   110
        self._parsed_urls = {}
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   111
        self._processed_entities = set()
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   112
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   113
    def select_linker(self, action, rtype, role, entity=None):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   114
        try:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   115
            return self._cw.vreg['components'].select(
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   116
                'cw.entityxml.action.%s' % action, self._cw, entity=entity,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   117
                rtype=rtype, role=role, parser=self)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   118
        except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   119
            raise RegistryException('Unknown action %s' % action)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   120
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   121
    def list_actions(self):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   122
        reg = self._cw.vreg['components']
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   123
        return sorted(clss[0].action for rid, clss in reg.iteritems()
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   124
                      if rid.startswith('cw.entityxml.action.'))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   125
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   126
    # mapping handling #########################################################
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   127
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   128
    def add_schema_config(self, schemacfg, checkonly=False):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   129
        """added CWSourceSchemaConfig, modify mapping accordingly"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   130
        _ = self._cw._
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   131
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   132
            rtype = schemacfg.schema.rtype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   133
        except AttributeError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   134
            msg = _("entity and relation types can't be mapped, only attributes "
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   135
                    "or relations")
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   136
            raise ValidationError(schemacfg.eid, {rn('cw_for_schema', 'subject'): msg})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   137
        if schemacfg.options:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   138
            options = text_to_dict(schemacfg.options)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   139
        else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   140
            options = {}
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   141
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   142
            role = options.pop('role')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   143
            if role not in ('subject', 'object'):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   144
                raise KeyError
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   145
        except KeyError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   146
            msg = _('"role=subject" or "role=object" must be specified in options')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   147
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   148
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   149
            action = options.pop('action')
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   150
            linker = self.select_linker(action, rtype, role)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   151
            linker.check_options(options, schemacfg.eid)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   152
        except KeyError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   153
            msg = _('"action" must be specified in options; allowed values are '
7879
9aae456abab5 [pylint] fix pylint detected errors and tweak it so that pylint -E will be much less verbose next time (+ update some copyrights on the way)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7727
diff changeset
   154
                    '%s') % ', '.join(self.list_actions())
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   155
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   156
        except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   157
            msg = _('allowed values for "action" are %s') % ', '.join(self.list_actions())
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   158
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   159
        if not checkonly:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   160
            if role == 'subject':
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   161
                etype = schemacfg.schema.stype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   162
                ttype = schemacfg.schema.otype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   163
            else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   164
                etype = schemacfg.schema.otype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   165
                ttype = schemacfg.schema.stype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   166
            etyperules = self.source.mapping.setdefault(etype, {})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   167
            etyperules.setdefault((rtype, role, action), []).append(
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   168
                (ttype, options) )
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   169
            self.source.mapping_idx[schemacfg.eid] = (
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   170
                etype, rtype, role, action, ttype)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   171
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   172
    def del_schema_config(self, schemacfg, checkonly=False):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   173
        """deleted CWSourceSchemaConfig, modify mapping accordingly"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   174
        etype, rtype, role, action, ttype = self.source.mapping_idx[schemacfg.eid]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   175
        rules = self.source.mapping[etype][(rtype, role, action)]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   176
        rules = [x for x in rules if not x[0] == ttype]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   177
        if not rules:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   178
            del self.source.mapping[etype][(rtype, role, action)]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   179
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   180
    # import handling ##########################################################
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   181
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   182
    def process(self, url, raise_on_error=False, partialcommit=True):
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   183
        """IDataFeedParser main entry point"""
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   184
        if url.startswith('http'): # XXX similar loose test as in parse of sources.datafeed
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   185
            url = self.complete_url(url)
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   186
        super(CWEntityXMLParser, self).process(url, raise_on_error, partialcommit)
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   187
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   188
    def parse_etree(self, parent):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   189
        for node in list(parent):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   190
            builder = self._cw.vreg['components'].select(
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   191
                'cw.entityxml.item-builder', self._cw, node=node,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   192
                parser=self)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   193
            yield builder.build_item()
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   194
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   195
    def process_item(self, item, rels):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   196
        """
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   197
        item and rels are what's returned by the item builder `build_item` method:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   198
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   199
        * `item` is an {attribute: value} dictionary
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   200
        * `rels` is for relations and structured as
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   201
           {role: {relation: [(related item, related rels)...]}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   202
        """
7932
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   203
        entity = self.extid2entity(str(item['cwuri']),  item['cwtype'],
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   204
                                   cwsource=item['cwsource'], item=item)
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   205
        if entity is None:
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   206
            return None
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   207
        if entity.eid in self._processed_entities:
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   208
            return entity
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   209
        self._processed_entities.add(entity.eid)
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   210
        if not (self.created_during_pull(entity) or self.updated_during_pull(entity)):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   211
            self.notify_updated(entity)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   212
            attrs = extract_typed_attrs(entity.e_schema, item)
7932
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   213
            # check modification date and compare attribute values to only
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   214
            # update what's actually needed
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   215
            entity.complete(tuple(attrs))
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   216
            mdate = attrs.get('modification_date')
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   217
            if not mdate or mdate > entity.modification_date:
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   218
                attrs = dict( (k, v) for k, v in attrs.iteritems()
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   219
                              if v != getattr(entity, k))
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   220
                if attrs:
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
   221
                    entity.set_attributes(**attrs)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   222
        self.process_relations(entity, rels)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   223
        return entity
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   224
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   225
    def process_relations(self, entity, rels):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   226
        etype = entity.__regid__
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   227
        for (rtype, role, action), rules in self.source.mapping.get(etype, {}).iteritems():
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   228
            try:
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   229
                related_items = rels[role][rtype]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   230
            except KeyError:
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   231
                self.source.error('relation %s-%s not found in xml export of %s',
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   232
                                  rtype, role, etype)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   233
                continue
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   234
            try:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   235
                linker = self.select_linker(action, rtype, role, entity)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   236
            except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   237
                self.source.error('no linker for action %s', action)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   238
            else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   239
                linker.link_items(related_items, rules)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   240
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   241
    def before_entity_copy(self, entity, sourceparams):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   242
        """IDataFeedParser callback"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   243
        attrs = extract_typed_attrs(entity.e_schema, sourceparams['item'])
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   244
        entity.cw_edited.update(attrs)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   245
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   246
    def complete_url(self, url, etype=None, known_relations=None):
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   247
        """append to the url's query string information about relation that should
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   248
        be included in the resulting xml, according to source mapping.
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   249
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   250
        If etype is not specified, try to guess it using the last path part of
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   251
        the url, i.e. the format used by default in cubicweb to map all entities
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   252
        of a given type as in 'http://mysite.org/EntityType'.
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   253
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   254
        If `known_relations` is given, it should be a dictionary of already
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   255
        known relations, so they don't get queried again.
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   256
        """
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   257
        try:
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   258
            url, qs = url.split('?', 1)
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   259
        except ValueError:
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   260
            qs = ''
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   261
        params = parse_qs(qs)
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   262
        if not 'vid' in params:
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   263
            params['vid'] = ['xml']
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   264
        if etype is None:
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   265
            try:
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   266
                etype = url.rsplit('/', 1)[1]
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   267
            except ValueError:
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   268
                return url + '?' + self._cw.build_url_params(**params)
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   269
            try:
7701
e5ba5de1d485 [datafeed] case insensitive etype. Closes #1877014
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7700
diff changeset
   270
                etype = self._cw.vreg.case_insensitive_etypes[etype.lower()]
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   271
            except KeyError:
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   272
                return url + '?' + self._cw.build_url_params(**params)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   273
        relations = params.setdefault('relation', [])
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   274
        for rtype, role, _ in self.source.mapping.get(etype, ()):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   275
            if known_relations and rtype in known_relations.get('role', ()):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   276
                continue
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   277
            reldef = '%s-%s' % (rtype, role)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   278
            if not reldef in relations:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   279
                relations.append(reldef)
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   280
        return url + '?' + self._cw.build_url_params(**params)
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   281
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   282
    def complete_item(self, item, rels):
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   283
        try:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   284
            return self._parsed_urls[item['cwuri']]
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   285
        except KeyError:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   286
            itemurl = self.complete_url(item['cwuri'], item['cwtype'], rels)
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   287
            item_rels = list(self.parse(itemurl))
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   288
            assert len(item_rels) == 1, 'url %s expected to bring back one '\
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   289
                   'and only one entity, got %s' % (itemurl, len(item_rels))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   290
            self._parsed_urls[item['cwuri']] = item_rels[0]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   291
            if rels:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   292
                # XXX (do it better) merge relations
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   293
                new_rels = item_rels[0][1]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   294
                new_rels.get('subject', {}).update(rels.get('subject', {}))
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   295
                new_rels.get('object', {}).update(rels.get('object', {}))
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   296
            return item_rels[0]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   297
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   298
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   299
class CWEntityXMLItemBuilder(Component):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   300
    __regid__ = 'cw.entityxml.item-builder'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   301
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   302
    def __init__(self, _cw, parser, node, **kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   303
        super(CWEntityXMLItemBuilder, self).__init__(_cw, **kwargs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   304
        self.parser = parser
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   305
        self.node = node
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   306
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   307
    def build_item(self):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   308
        """parse a XML document node and return two dictionaries defining (part
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   309
        of) an entity:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   310
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   311
        - {attribute: value}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   312
        - {role: {relation: [(related item, related rels)...]}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   313
        """
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   314
        node = self.node
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   315
        item = dict(node.attrib.items())
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   316
        item['cwtype'] = unicode(node.tag)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   317
        item.setdefault('cwsource', None)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   318
        try:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   319
            item['eid'] = typed_eid(item['eid'])
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   320
        except KeyError:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   321
            # cw < 3.11 compat mode XXX
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   322
            item['eid'] = typed_eid(node.find('eid').text)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   323
            item['cwuri'] = node.find('cwuri').text
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   324
        rels = {}
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   325
        for child in node:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   326
            role = child.get('role')
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   327
            if role:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   328
                # relation
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   329
                related = rels.setdefault(role, {}).setdefault(child.tag, [])
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   330
                related += self.parser.parse_etree(child)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   331
            elif child.text:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   332
                # attribute
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   333
                item[child.tag] = unicode(child.text)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   334
            else:
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   335
                # None attribute (empty tag)
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   336
                item[child.tag] = None
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   337
        return item, rels
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   338
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   339
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   340
class CWEntityXMLActionCopy(Component):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   341
    """implementation of cubicweb entity xml parser's'copy' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   342
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   343
    Takes no option.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   344
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   345
    __regid__ = 'cw.entityxml.action.copy'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   346
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   347
    def __init__(self, _cw, parser, rtype, role, entity=None, **kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   348
        super(CWEntityXMLActionCopy, self).__init__(_cw, **kwargs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   349
        self.parser = parser
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   350
        self.rtype = rtype
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   351
        self.role = role
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   352
        self.entity = entity
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   353
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   354
    @classproperty
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   355
    def action(cls):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   356
        return cls.__regid__.rsplit('.', 1)[-1]
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   357
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   358
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   359
        self._check_no_options(options, eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   360
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   361
    def _check_no_options(self, options, eid, msg=None):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   362
        if options:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   363
            if msg is None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   364
                msg = self._cw._("'%s' action doesn't take any options") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   365
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   366
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   367
    def link_items(self, others, rules):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   368
        assert not any(x[1] for x in rules), "'copy' action takes no option"
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   369
        ttypes = frozenset([x[0] for x in rules])
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   370
        eids = [] # local eids
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   371
        for item, rels in others:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   372
            if item['cwtype'] in ttypes:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   373
                item, rels = self.parser.complete_item(item, rels)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   374
                other_entity = self.parser.process_item(item, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   375
                if other_entity is not None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   376
                    eids.append(other_entity.eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   377
        if eids:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   378
            self._set_relation(eids)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   379
        else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   380
            self._clear_relation(ttypes)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   381
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   382
    def _clear_relation(self, ttypes):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   383
        if not self.parser.created_during_pull(self.entity):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   384
            if len(ttypes) > 1:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   385
                typerestr = ', Y is IN(%s)' % ','.join(ttypes)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   386
            else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   387
                typerestr = ', Y is %s' % ','.join(ttypes)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   388
            self._cw.execute('DELETE ' + rtype_role_rql(self.rtype, self.role) + typerestr,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   389
                             {'x': self.entity.eid})
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   390
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   391
    def _set_relation(self, eids):
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   392
        assert eids
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   393
        rtype = self.rtype
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   394
        rqlbase = rtype_role_rql(rtype, self.role)
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   395
        eidstr = ','.join(str(eid) for eid in eids)
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   396
        self._cw.execute('DELETE %s, NOT Y eid IN (%s)' % (rqlbase, eidstr),
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   397
                         {'x': self.entity.eid})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   398
        if self.role == 'object':
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   399
            rql = 'SET %s, Y eid IN (%s), NOT Y %s X' % (rqlbase, eidstr, rtype)
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   400
        else:
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   401
            rql = 'SET %s, Y eid IN (%s), NOT X %s Y' % (rqlbase, eidstr, rtype)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   402
        self._cw.execute(rql, {'x': self.entity.eid})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   403
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   404
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   405
class CWEntityXMLActionLink(CWEntityXMLActionCopy):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   406
    """implementation of cubicweb entity xml parser's'link' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   407
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   408
    requires a 'linkattr' option to control search of the linked entity.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   409
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   410
    __regid__ = 'cw.entityxml.action.link'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   411
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   412
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   413
        if not 'linkattr' in options:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   414
            msg = self._cw._("'%s' action requires 'linkattr' option") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   415
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   416
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   417
    create_when_not_found = False
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   418
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   419
    def link_items(self, others, rules):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   420
        for ttype, options in rules:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   421
            searchattrs = splitstrip(options.get('linkattr', ''))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   422
            self._related_link(ttype, others, searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   423
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   424
    def _related_link(self, ttype, others, searchattrs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   425
        def issubset(x,y):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   426
            return all(z in y for z in x)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   427
        eids = [] # local eids
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   428
        source = self.parser.source
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   429
        for item, rels in others:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   430
            if item['cwtype'] != ttype:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   431
                continue
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   432
            if not issubset(searchattrs, item):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   433
                item, rels = self.parser.complete_item(item, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   434
                if not issubset(searchattrs, item):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   435
                    source.error('missing attribute, got %s expected keys %s',
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   436
                                 item, searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   437
                    continue
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   438
            # XXX str() needed with python < 2.6
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   439
            kwargs = dict((str(attr), item[attr]) for attr in searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   440
            targets = self._find_entities(item, kwargs)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   441
            if len(targets) == 1:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   442
                entity = targets[0]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   443
            elif not targets and self.create_when_not_found:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   444
                entity = self._cw.create_entity(item['cwtype'], **kwargs)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   445
            else:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   446
                if len(targets) > 1:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   447
                    source.error('ambiguous link: found %s entity %s with attributes %s',
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   448
                                 len(targets), item['cwtype'], kwargs)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   449
                else:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   450
                    source.error('can not find %s entity with attributes %s',
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   451
                                 item['cwtype'], kwargs)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   452
                continue
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   453
            eids.append(entity.eid)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   454
            self.parser.process_relations(entity, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   455
        if eids:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   456
            self._set_relation(eids)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   457
        else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   458
            self._clear_relation((ttype,))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   459
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   460
    def _find_entities(self, item, kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   461
        return tuple(self._cw.find_entities(item['cwtype'], **kwargs))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   462
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   463
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   464
class CWEntityXMLActionLinkInState(CWEntityXMLActionLink):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   465
    """custom implementation of cubicweb entity xml parser's'link' action for
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   466
    in_state relation
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   467
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   468
    __select__ = match_rtype('in_state')
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   469
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   470
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   471
        super(CWEntityXMLActionLinkInState, self).check_options(options, eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   472
        if not 'name' in options['linkattr']:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   473
            msg = self._cw._("'%s' action for in_state relation should at least have 'linkattr=name' option") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   474
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   475
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   476
    def _find_entities(self, item, kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   477
        assert 'name' in item # XXX else, complete_item
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   478
        state_name = item['name']
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   479
        wf = self.entity.cw_adapt_to('IWorkflowable').current_workflow
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   480
        state = wf.state_by_name(state_name)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   481
        if state is None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   482
            return ()
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   483
        return (state,)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   484
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   485
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   486
class CWEntityXMLActionLinkOrCreate(CWEntityXMLActionLink):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   487
    """implementation of cubicweb entity xml parser's'link-or-create' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   488
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   489
    requires a 'linkattr' option to control search of the linked entity.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   490
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   491
    __regid__ = 'cw.entityxml.action.link-or-create'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   492
    create_when_not_found = True
6963
5774d4ba4306 [datafeed] introduce a host mapping so dev instance may transparently redirect request to another host than the actual's one
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6960
diff changeset
   493
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   494
6963
5774d4ba4306 [datafeed] introduce a host mapping so dev instance may transparently redirect request to another host than the actual's one
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6960
diff changeset
   495
def registration_callback(vreg):
6970
a6ccbfbacf3d [parser host mapping] take care to cases where apphome is None (c-c i18ncube command for instance)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6963
diff changeset
   496
    vreg.register_all(globals().values(), __name__)
7727
70ea754d3e04 [server] host mapping is better named url mapping (closes #1892461)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7701
diff changeset
   497
    global URL_MAPPING
70ea754d3e04 [server] host mapping is better named url mapping (closes #1892461)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7701
diff changeset
   498
    URL_MAPPING = {}
6970
a6ccbfbacf3d [parser host mapping] take care to cases where apphome is None (c-c i18ncube command for instance)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6963
diff changeset
   499
    if vreg.config.apphome:
7727
70ea754d3e04 [server] host mapping is better named url mapping (closes #1892461)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7701
diff changeset
   500
        url_mapping_file = osp.join(vreg.config.apphome, 'urlmapping.py')
70ea754d3e04 [server] host mapping is better named url mapping (closes #1892461)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7701
diff changeset
   501
        if osp.exists(url_mapping_file):
70ea754d3e04 [server] host mapping is better named url mapping (closes #1892461)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7701
diff changeset
   502
            URL_MAPPING = eval(file(url_mapping_file).read())
70ea754d3e04 [server] host mapping is better named url mapping (closes #1892461)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7701
diff changeset
   503
            vreg.info('using url mapping %s from %s', URL_MAPPING, url_mapping_file)