sobjects/cwxmlparser.py
author Julien Cristau <julien.cristau@logilab.fr>
Wed, 09 Dec 2015 18:21:55 +0100
changeset 10996 dc572d116731
parent 10758 a34edc1057c0
child 11034 75d752e6daf7
permissions -rw-r--r--
Backout "[entity,storage] remove hackish code from the bfss path The merge of the repo-side and web-side entity caches doesn't actually work, because entities have a _cw attribute which is sometimes a connection and sometimes a web request, and whose nature changes not-really-predictably whenever ResultSet.get_entity() is called and brings it out of the cache. This reverts commit 59a79300f213.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
8188
1867e252e487 [repository] ldap-feed source. Closes #2086984
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
     1
# copyright 2010-2012 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     3
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     4
# This file is part of CubicWeb.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     5
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     9
# any later version.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    10
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    14
# details.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    15
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    18
"""datafeed parser for xml generated by cubicweb
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    19
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    20
Example of mapping for CWEntityXMLParser::
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    21
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    22
  {u'CWUser': {                                        # EntityType
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    23
      (u'in_group', u'subject', u'link'): [            # (rtype, role, action)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    24
          (u'CWGroup', {u'linkattr': u'name'})],       #   -> rules = [(EntityType, options), ...]
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    25
      (u'tags', u'object', u'link-or-create'): [       # (...)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    26
          (u'Tag', {u'linkattr': u'name'})],           #   -> ...
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    27
      (u'use_email', u'subject', u'copy'): [           # (...)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    28
          (u'EmailAddress', {})]                       #   -> ...
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    29
      }
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    30
   }
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    31
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    32
"""
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
    34
from datetime import datetime, time
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
    35
import urllib
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    36
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
    37
from six import text_type
10756
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
    38
from six.moves.urllib.parse import urlparse, urlunparse, parse_qs, urlencode
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
    39
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
from logilab.common.date import todate, totime
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    41
from logilab.common.textutils import splitstrip, text_to_dict
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    42
from logilab.common.decorators import classproperty
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
from yams.constraints import BASE_CONVERTERS
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
from yams.schema import role_name as rn
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    46
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8409
diff changeset
    47
from cubicweb import ValidationError, RegistryException
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    48
from cubicweb.view import Component
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    49
from cubicweb.server.sources import datafeed
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    50
from cubicweb.server.hook import match_rtype
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    51
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    52
# XXX see cubicweb.cwvreg.YAMS_TO_PY
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    53
# XXX see cubicweb.web.views.xmlrss.SERIALIZERS
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
DEFAULT_CONVERTERS = BASE_CONVERTERS.copy()
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
    55
DEFAULT_CONVERTERS['String'] = text_type
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
DEFAULT_CONVERTERS['Password'] = lambda x: x.encode('utf8')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
def convert_date(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    58
    return todate(datetime.strptime(ustr, '%Y-%m-%d'))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    59
DEFAULT_CONVERTERS['Date'] = convert_date
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    60
def convert_datetime(ustr):
7002
29f085f6177b [sobjects/parsers] compatibility with xml view of cw <= 3.10
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7001
diff changeset
    61
    if '.' in ustr: # assume %Y-%m-%d %H:%M:%S.mmmmmm
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
    62
        ustr = ustr.split('.', 1)[0]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
    return datetime.strptime(ustr, '%Y-%m-%d %H:%M:%S')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
DEFAULT_CONVERTERS['Datetime'] = convert_datetime
7957
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    65
# XXX handle timezone, though this will be enough as TZDatetime are
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    66
# serialized without time zone by default (UTC time). See
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    67
# cw.web.views.xmlrss.SERIALIZERS.
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    68
DEFAULT_CONVERTERS['TZDatetime'] = convert_datetime
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
def convert_time(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
    return totime(datetime.strptime(ustr, '%H:%M:%S'))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
DEFAULT_CONVERTERS['Time'] = convert_time
7957
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    72
DEFAULT_CONVERTERS['TZTime'] = convert_time
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
def convert_interval(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    74
    return time(seconds=int(ustr))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    75
DEFAULT_CONVERTERS['Interval'] = convert_interval
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    76
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    77
def extract_typed_attrs(eschema, stringdict, converters=DEFAULT_CONVERTERS):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    78
    typeddict = {}
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    79
    for rschema in eschema.subject_relations():
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    80
        if rschema.final and rschema in stringdict:
7932
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
    81
            if rschema in ('eid', 'cwuri', 'cwtype', 'cwsource'):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
                continue
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
            attrtype = eschema.destination(rschema)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    84
            value = stringdict[rschema]
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    85
            if value is not None:
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    86
                value = converters[attrtype](value)
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    87
            typeddict[rschema.type] = value
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
    return typeddict
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    89
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    90
def rtype_role_rql(rtype, role):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    91
    if role == 'object':
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    92
        return 'Y %s X WHERE X eid %%(x)s' % rtype
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    93
    else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    94
        return 'X %s Y WHERE X eid %%(x)s' % rtype
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    95
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    96
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    97
class CWEntityXMLParser(datafeed.DataFeedXMLParser):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    98
    """datafeed parser for the 'xml' entity view
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    99
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   100
    Most of the logic is delegated to the following components:
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   101
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   102
    * an "item builder" component, turning an etree xml node into a specific
8239
c6cdd060212e backport stable
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8189
diff changeset
   103
      python dictionary representing an entity
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   104
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   105
    * "action" components, selected given an entity, a relation and its role in
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   106
      the relation, and responsible to link the entity to given related items
8239
c6cdd060212e backport stable
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8189
diff changeset
   107
      (eg dictionary)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   108
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   109
    So the parser is only doing the gluing service and the connection to the
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   110
    source.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   111
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   112
    __regid__ = 'cw.entityxml'
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   113
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   114
    def __init__(self, *args, **kwargs):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   115
        super(CWEntityXMLParser, self).__init__(*args, **kwargs)
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   116
        self._parsed_urls = {}
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   117
        self._processed_entities = set()
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   118
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   119
    def select_linker(self, action, rtype, role, entity=None):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   120
        try:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   121
            return self._cw.vreg['components'].select(
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   122
                'cw.entityxml.action.%s' % action, self._cw, entity=entity,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   123
                rtype=rtype, role=role, parser=self)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   124
        except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   125
            raise RegistryException('Unknown action %s' % action)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   126
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   127
    def list_actions(self):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   128
        reg = self._cw.vreg['components']
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   129
        return sorted(clss[0].action for rid, clss in reg.items()
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   130
                      if rid.startswith('cw.entityxml.action.'))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   131
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   132
    # mapping handling #########################################################
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   133
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   134
    def add_schema_config(self, schemacfg, checkonly=False):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   135
        """added CWSourceSchemaConfig, modify mapping accordingly"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   136
        _ = self._cw._
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   137
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   138
            rtype = schemacfg.schema.rtype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   139
        except AttributeError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   140
            msg = _("entity and relation types can't be mapped, only attributes "
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   141
                    "or relations")
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   142
            raise ValidationError(schemacfg.eid, {rn('cw_for_schema', 'subject'): msg})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   143
        if schemacfg.options:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   144
            options = text_to_dict(schemacfg.options)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   145
        else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   146
            options = {}
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   147
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   148
            role = options.pop('role')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   149
            if role not in ('subject', 'object'):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   150
                raise KeyError
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   151
        except KeyError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   152
            msg = _('"role=subject" or "role=object" must be specified in options')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   153
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   154
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   155
            action = options.pop('action')
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   156
            linker = self.select_linker(action, rtype, role)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   157
            linker.check_options(options, schemacfg.eid)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   158
        except KeyError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   159
            msg = _('"action" must be specified in options; allowed values are '
7879
9aae456abab5 [pylint] fix pylint detected errors and tweak it so that pylint -E will be much less verbose next time (+ update some copyrights on the way)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7727
diff changeset
   160
                    '%s') % ', '.join(self.list_actions())
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   161
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   162
        except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   163
            msg = _('allowed values for "action" are %s') % ', '.join(self.list_actions())
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   164
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   165
        if not checkonly:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   166
            if role == 'subject':
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   167
                etype = schemacfg.schema.stype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   168
                ttype = schemacfg.schema.otype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   169
            else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   170
                etype = schemacfg.schema.otype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   171
                ttype = schemacfg.schema.stype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   172
            etyperules = self.source.mapping.setdefault(etype, {})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   173
            etyperules.setdefault((rtype, role, action), []).append(
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
   174
                (ttype, options))
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   175
            self.source.mapping_idx[schemacfg.eid] = (
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   176
                etype, rtype, role, action, ttype)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   177
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   178
    def del_schema_config(self, schemacfg, checkonly=False):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   179
        """deleted CWSourceSchemaConfig, modify mapping accordingly"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   180
        etype, rtype, role, action, ttype = self.source.mapping_idx[schemacfg.eid]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   181
        rules = self.source.mapping[etype][(rtype, role, action)]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   182
        rules = [x for x in rules if not x[0] == ttype]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   183
        if not rules:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   184
            del self.source.mapping[etype][(rtype, role, action)]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   185
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   186
    # import handling ##########################################################
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   187
8409
79534887943e [datafeed] fix/finish cleanup started by auc in 8393:c25b96ae4f8a: parser.process prototytpe is (url, raise_on_error=False). Drop partialcommit argument which were never specified
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8299
diff changeset
   188
    def process(self, url, raise_on_error=False):
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   189
        """IDataFeedParser main entry point"""
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   190
        if url.startswith('http'): # XXX similar loose test as in parse of sources.datafeed
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   191
            url = self.complete_url(url)
8409
79534887943e [datafeed] fix/finish cleanup started by auc in 8393:c25b96ae4f8a: parser.process prototytpe is (url, raise_on_error=False). Drop partialcommit argument which were never specified
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8299
diff changeset
   192
        super(CWEntityXMLParser, self).process(url, raise_on_error)
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   193
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   194
    def parse_etree(self, parent):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   195
        for node in list(parent):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   196
            builder = self._cw.vreg['components'].select(
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   197
                'cw.entityxml.item-builder', self._cw, node=node,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   198
                parser=self)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   199
            yield builder.build_item()
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   200
10089
6346f53c85f1 [datafeed] Add a raise_on_error parameter to DataFeedSource.extid2entity
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 9750
diff changeset
   201
    def process_item(self, item, rels, raise_on_error=False):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   202
        """
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   203
        item and rels are what's returned by the item builder `build_item` method:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   204
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   205
        * `item` is an {attribute: value} dictionary
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   206
        * `rels` is for relations and structured as
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   207
           {role: {relation: [(related item, related rels)...]}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   208
        """
10758
a34edc1057c0 [datafeed] use explicit encode/decode instead of casting to str
Julien Cristau <julien.cristau@logilab.fr>
parents: 10756
diff changeset
   209
        entity = self.extid2entity(item['cwuri'].encode('ascii'), item['cwtype'],
10089
6346f53c85f1 [datafeed] Add a raise_on_error parameter to DataFeedSource.extid2entity
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 9750
diff changeset
   210
                                   cwsource=item['cwsource'], item=item,
6346f53c85f1 [datafeed] Add a raise_on_error parameter to DataFeedSource.extid2entity
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 9750
diff changeset
   211
                                   raise_on_error=raise_on_error)
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   212
        if entity is None:
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   213
            return None
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   214
        if entity.eid in self._processed_entities:
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   215
            return entity
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   216
        self._processed_entities.add(entity.eid)
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   217
        if not (self.created_during_pull(entity) or self.updated_during_pull(entity)):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   218
            attrs = extract_typed_attrs(entity.e_schema, item)
8188
1867e252e487 [repository] ldap-feed source. Closes #2086984
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   219
            self.update_if_necessary(entity, attrs)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   220
        self.process_relations(entity, rels)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   221
        return entity
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   222
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   223
    def process_relations(self, entity, rels):
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8748
diff changeset
   224
        etype = entity.cw_etype
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   225
        for (rtype, role, action), rules in self.source.mapping.get(etype, {}).items():
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   226
            try:
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   227
                related_items = rels[role][rtype]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   228
            except KeyError:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   229
                self.import_log.record_error('relation %s-%s not found in xml export of %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   230
                                             % (rtype, role, etype))
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   231
                continue
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   232
            try:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   233
                linker = self.select_linker(action, rtype, role, entity)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   234
            except RegistryException:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   235
                self.import_log.record_error('no linker for action %s' % action)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   236
            else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   237
                linker.link_items(related_items, rules)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   238
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   239
    def before_entity_copy(self, entity, sourceparams):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   240
        """IDataFeedParser callback"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   241
        attrs = extract_typed_attrs(entity.e_schema, sourceparams['item'])
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   242
        entity.cw_edited.update(attrs)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   243
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   244
    def normalize_url(self, url):
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   245
        """overridden to add vid=xml if vid is not set in the qs"""
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   246
        url = super(CWEntityXMLParser, self).normalize_url(url)
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   247
        purl = urlparse(url)
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   248
        if purl.scheme in ('http', 'https'):
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   249
            params = parse_qs(purl.query)
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   250
            if 'vid' not in params:
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   251
                params['vid'] = ['xml']
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   252
                purl = list(purl)
10756
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
   253
                purl[4] = urlencode(params, doseq=True)
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
   254
                return urlunparse(purl)
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   255
        return url
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   256
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   257
    def complete_url(self, url, etype=None, known_relations=None):
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   258
        """append to the url's query string information about relation that should
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   259
        be included in the resulting xml, according to source mapping.
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   260
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   261
        If etype is not specified, try to guess it using the last path part of
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   262
        the url, i.e. the format used by default in cubicweb to map all entities
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   263
        of a given type as in 'http://mysite.org/EntityType'.
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   264
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   265
        If `known_relations` is given, it should be a dictionary of already
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   266
        known relations, so they don't get queried again.
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   267
        """
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   268
        purl = urlparse(url)
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   269
        params = parse_qs(purl.query)
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   270
        if etype is None:
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   271
            etype = purl.path.split('/')[-1]
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   272
        try:
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   273
            etype = self._cw.vreg.case_insensitive_etypes[etype.lower()]
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   274
        except KeyError:
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   275
            return url
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   276
        relations = params['relation'] = set(params.get('relation', ()))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   277
        for rtype, role, _ in self.source.mapping.get(etype, ()):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   278
            if known_relations and rtype in known_relations.get('role', ()):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   279
                continue
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   280
            relations.add('%s-%s' % (rtype, role))
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   281
        purl = list(purl)
10756
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
   282
        purl[4] = urlencode(params, doseq=True)
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   283
        return urlunparse(purl)
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   284
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   285
    def complete_item(self, item, rels):
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   286
        try:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   287
            return self._parsed_urls[item['cwuri']]
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   288
        except KeyError:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   289
            itemurl = self.complete_url(item['cwuri'], item['cwtype'], rels)
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   290
            item_rels = list(self.parse(itemurl))
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   291
            assert len(item_rels) == 1, 'url %s expected to bring back one '\
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   292
                   'and only one entity, got %s' % (itemurl, len(item_rels))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   293
            self._parsed_urls[item['cwuri']] = item_rels[0]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   294
            if rels:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   295
                # XXX (do it better) merge relations
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   296
                new_rels = item_rels[0][1]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   297
                new_rels.get('subject', {}).update(rels.get('subject', {}))
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   298
                new_rels.get('object', {}).update(rels.get('object', {}))
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   299
            return item_rels[0]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   300
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   301
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   302
class CWEntityXMLItemBuilder(Component):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   303
    __regid__ = 'cw.entityxml.item-builder'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   304
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   305
    def __init__(self, _cw, parser, node, **kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   306
        super(CWEntityXMLItemBuilder, self).__init__(_cw, **kwargs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   307
        self.parser = parser
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   308
        self.node = node
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   309
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   310
    def build_item(self):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   311
        """parse a XML document node and return two dictionaries defining (part
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   312
        of) an entity:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   313
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   314
        - {attribute: value}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   315
        - {role: {relation: [(related item, related rels)...]}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   316
        """
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   317
        node = self.node
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   318
        item = dict(node.attrib.items())
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
   319
        item['cwtype'] = text_type(node.tag)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   320
        item.setdefault('cwsource', None)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   321
        try:
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8409
diff changeset
   322
            item['eid'] = int(item['eid'])
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   323
        except KeyError:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   324
            # cw < 3.11 compat mode XXX
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8409
diff changeset
   325
            item['eid'] = int(node.find('eid').text)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   326
            item['cwuri'] = node.find('cwuri').text
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   327
        rels = {}
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   328
        for child in node:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   329
            role = child.get('role')
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   330
            if role:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   331
                # relation
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   332
                related = rels.setdefault(role, {}).setdefault(child.tag, [])
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   333
                related += self.parser.parse_etree(child)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   334
            elif child.text:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   335
                # attribute
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
   336
                item[child.tag] = text_type(child.text)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   337
            else:
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   338
                # None attribute (empty tag)
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   339
                item[child.tag] = None
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   340
        return item, rels
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   341
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   342
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   343
class CWEntityXMLActionCopy(Component):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   344
    """implementation of cubicweb entity xml parser's'copy' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   345
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   346
    Takes no option.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   347
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   348
    __regid__ = 'cw.entityxml.action.copy'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   349
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   350
    def __init__(self, _cw, parser, rtype, role, entity=None, **kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   351
        super(CWEntityXMLActionCopy, self).__init__(_cw, **kwargs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   352
        self.parser = parser
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   353
        self.rtype = rtype
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   354
        self.role = role
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   355
        self.entity = entity
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   356
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   357
    @classproperty
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   358
    def action(cls):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   359
        return cls.__regid__.rsplit('.', 1)[-1]
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   360
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   361
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   362
        self._check_no_options(options, eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   363
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   364
    def _check_no_options(self, options, eid, msg=None):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   365
        if options:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   366
            if msg is None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   367
                msg = self._cw._("'%s' action doesn't take any options") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   368
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   369
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   370
    def link_items(self, others, rules):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   371
        assert not any(x[1] for x in rules), "'copy' action takes no option"
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   372
        ttypes = frozenset([x[0] for x in rules])
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   373
        eids = [] # local eids
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   374
        for item, rels in others:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   375
            if item['cwtype'] in ttypes:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   376
                item, rels = self.parser.complete_item(item, rels)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   377
                other_entity = self.parser.process_item(item, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   378
                if other_entity is not None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   379
                    eids.append(other_entity.eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   380
        if eids:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   381
            self._set_relation(eids)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   382
        else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   383
            self._clear_relation(ttypes)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   384
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   385
    def _clear_relation(self, ttypes):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   386
        if not self.parser.created_during_pull(self.entity):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   387
            if len(ttypes) > 1:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   388
                typerestr = ', Y is IN(%s)' % ','.join(ttypes)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   389
            else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   390
                typerestr = ', Y is %s' % ','.join(ttypes)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   391
            self._cw.execute('DELETE ' + rtype_role_rql(self.rtype, self.role) + typerestr,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   392
                             {'x': self.entity.eid})
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   393
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   394
    def _set_relation(self, eids):
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   395
        assert eids
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   396
        rtype = self.rtype
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   397
        rqlbase = rtype_role_rql(rtype, self.role)
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   398
        eidstr = ','.join(str(eid) for eid in eids)
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   399
        self._cw.execute('DELETE %s, NOT Y eid IN (%s)' % (rqlbase, eidstr),
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   400
                         {'x': self.entity.eid})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   401
        if self.role == 'object':
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   402
            rql = 'SET %s, Y eid IN (%s), NOT Y %s X' % (rqlbase, eidstr, rtype)
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   403
        else:
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   404
            rql = 'SET %s, Y eid IN (%s), NOT X %s Y' % (rqlbase, eidstr, rtype)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   405
        self._cw.execute(rql, {'x': self.entity.eid})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   406
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   407
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   408
class CWEntityXMLActionLink(CWEntityXMLActionCopy):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   409
    """implementation of cubicweb entity xml parser's'link' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   410
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   411
    requires a 'linkattr' option to control search of the linked entity.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   412
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   413
    __regid__ = 'cw.entityxml.action.link'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   414
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   415
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   416
        if not 'linkattr' in options:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   417
            msg = self._cw._("'%s' action requires 'linkattr' option") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   418
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   419
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   420
    create_when_not_found = False
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   421
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   422
    def link_items(self, others, rules):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   423
        for ttype, options in rules:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   424
            searchattrs = splitstrip(options.get('linkattr', ''))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   425
            self._related_link(ttype, others, searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   426
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   427
    def _related_link(self, ttype, others, searchattrs):
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
   428
        def issubset(x, y):
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   429
            return all(z in y for z in x)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   430
        eids = [] # local eids
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   431
        log = self.parser.import_log
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   432
        for item, rels in others:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   433
            if item['cwtype'] != ttype:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   434
                continue
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   435
            if not issubset(searchattrs, item):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   436
                item, rels = self.parser.complete_item(item, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   437
                if not issubset(searchattrs, item):
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   438
                    log.record_error('missing attribute, got %s expected keys %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   439
                                     % (item, searchattrs))
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   440
                    continue
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   441
            # XXX str() needed with python < 2.6
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   442
            kwargs = dict((str(attr), item[attr]) for attr in searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   443
            targets = self._find_entities(item, kwargs)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   444
            if len(targets) == 1:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   445
                entity = targets[0]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   446
            elif not targets and self.create_when_not_found:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   447
                entity = self._cw.create_entity(item['cwtype'], **kwargs)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   448
            else:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   449
                if len(targets) > 1:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   450
                    log.record_error('ambiguous link: found %s entity %s with attributes %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   451
                                     % (len(targets), item['cwtype'], kwargs))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   452
                else:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   453
                    log.record_error('can not find %s entity with attributes %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   454
                                     % (item['cwtype'], kwargs))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   455
                continue
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   456
            eids.append(entity.eid)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   457
            self.parser.process_relations(entity, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   458
        if eids:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   459
            self._set_relation(eids)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   460
        else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   461
            self._clear_relation((ttype,))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   462
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   463
    def _find_entities(self, item, kwargs):
9750
566f8fce5168 [dataimport, migration] silence find_entities / find_one_entity warning
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 9474
diff changeset
   464
        return tuple(self._cw.find(item['cwtype'], **kwargs).entities())
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   465
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   466
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   467
class CWEntityXMLActionLinkInState(CWEntityXMLActionLink):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   468
    """custom implementation of cubicweb entity xml parser's'link' action for
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   469
    in_state relation
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   470
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   471
    __select__ = match_rtype('in_state')
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   472
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   473
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   474
        super(CWEntityXMLActionLinkInState, self).check_options(options, eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   475
        if not 'name' in options['linkattr']:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   476
            msg = self._cw._("'%s' action for in_state relation should at least have 'linkattr=name' option") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   477
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   478
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   479
    def _find_entities(self, item, kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   480
        assert 'name' in item # XXX else, complete_item
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   481
        state_name = item['name']
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   482
        wf = self.entity.cw_adapt_to('IWorkflowable').current_workflow
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   483
        state = wf.state_by_name(state_name)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   484
        if state is None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   485
            return ()
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   486
        return (state,)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   487
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   488
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   489
class CWEntityXMLActionLinkOrCreate(CWEntityXMLActionLink):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   490
    """implementation of cubicweb entity xml parser's'link-or-create' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   491
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   492
    requires a 'linkattr' option to control search of the linked entity.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   493
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   494
    __regid__ = 'cw.entityxml.action.link-or-create'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   495
    create_when_not_found = True