sobjects/cwxmlparser.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Thu, 24 Sep 2015 11:26:11 +0200
changeset 10921 977def81780a
parent 10758 a34edc1057c0
child 11034 75d752e6daf7
permissions -rw-r--r--
[migration] don't handle data deletion anymore on schema changes In most cases when we want to drop some entity/relation type, we don't care whether hooks are called on their deletion. There is even low chances that some hooks still exists, based on an old version of the schema. Last but not least, this is horribly inefficient. So this should be clearly documented and handled by application's programmer if desired. This patch removes unnecessary deletion (because table or column will be later dropped) and reimplements the case of partial deletion (only one relation definition among several, hence the database structure isn't modified) using sql. Only one test regarding deletion of inlined relation def is added as other cases seem to be covered by existing tests. Closes #7023315
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
8188
1867e252e487 [repository] ldap-feed source. Closes #2086984
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
     1
# copyright 2010-2012 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     3
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     4
# This file is part of CubicWeb.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     5
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
     9
# any later version.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    10
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    14
# details.
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    15
#
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    18
"""datafeed parser for xml generated by cubicweb
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    19
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    20
Example of mapping for CWEntityXMLParser::
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    21
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    22
  {u'CWUser': {                                        # EntityType
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    23
      (u'in_group', u'subject', u'link'): [            # (rtype, role, action)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    24
          (u'CWGroup', {u'linkattr': u'name'})],       #   -> rules = [(EntityType, options), ...]
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    25
      (u'tags', u'object', u'link-or-create'): [       # (...)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    26
          (u'Tag', {u'linkattr': u'name'})],           #   -> ...
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    27
      (u'use_email', u'subject', u'copy'): [           # (...)
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    28
          (u'EmailAddress', {})]                       #   -> ...
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    29
      }
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    30
   }
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    31
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    32
"""
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
    34
from datetime import datetime, time
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
    35
import urllib
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    36
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
    37
from six import text_type
10756
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
    38
from six.moves.urllib.parse import urlparse, urlunparse, parse_qs, urlencode
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
    39
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
from logilab.common.date import todate, totime
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    41
from logilab.common.textutils import splitstrip, text_to_dict
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    42
from logilab.common.decorators import classproperty
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
from yams.constraints import BASE_CONVERTERS
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
from yams.schema import role_name as rn
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    46
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8409
diff changeset
    47
from cubicweb import ValidationError, RegistryException
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    48
from cubicweb.view import Component
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    49
from cubicweb.server.sources import datafeed
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    50
from cubicweb.server.hook import match_rtype
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    51
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    52
# XXX see cubicweb.cwvreg.YAMS_TO_PY
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
    53
# XXX see cubicweb.web.views.xmlrss.SERIALIZERS
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
DEFAULT_CONVERTERS = BASE_CONVERTERS.copy()
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
    55
DEFAULT_CONVERTERS['String'] = text_type
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
DEFAULT_CONVERTERS['Password'] = lambda x: x.encode('utf8')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
def convert_date(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    58
    return todate(datetime.strptime(ustr, '%Y-%m-%d'))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    59
DEFAULT_CONVERTERS['Date'] = convert_date
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    60
def convert_datetime(ustr):
7002
29f085f6177b [sobjects/parsers] compatibility with xml view of cw <= 3.10
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7001
diff changeset
    61
    if '.' in ustr: # assume %Y-%m-%d %H:%M:%S.mmmmmm
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
    62
        ustr = ustr.split('.', 1)[0]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
    return datetime.strptime(ustr, '%Y-%m-%d %H:%M:%S')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
DEFAULT_CONVERTERS['Datetime'] = convert_datetime
7957
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    65
# XXX handle timezone, though this will be enough as TZDatetime are
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    66
# serialized without time zone by default (UTC time). See
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    67
# cw.web.views.xmlrss.SERIALIZERS.
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    68
DEFAULT_CONVERTERS['TZDatetime'] = convert_datetime
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
def convert_time(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
    return totime(datetime.strptime(ustr, '%H:%M:%S'))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
DEFAULT_CONVERTERS['Time'] = convert_time
7957
5da7ede69063 [test] fix some failures recently introduced
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7932
diff changeset
    72
DEFAULT_CONVERTERS['TZTime'] = convert_time
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
def convert_interval(ustr):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    74
    return time(seconds=int(ustr))
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    75
DEFAULT_CONVERTERS['Interval'] = convert_interval
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    76
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    77
def extract_typed_attrs(eschema, stringdict, converters=DEFAULT_CONVERTERS):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    78
    typeddict = {}
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    79
    for rschema in eschema.subject_relations():
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    80
        if rschema.final and rschema in stringdict:
7932
2ad26cc3b5c6 [datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7916
diff changeset
    81
            if rschema in ('eid', 'cwuri', 'cwtype', 'cwsource'):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
                continue
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
            attrtype = eschema.destination(rschema)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    84
            value = stringdict[rschema]
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    85
            if value is not None:
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    86
                value = converters[attrtype](value)
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
    87
            typeddict[rschema.type] = value
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
    return typeddict
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    89
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    90
def rtype_role_rql(rtype, role):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    91
    if role == 'object':
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    92
        return 'Y %s X WHERE X eid %%(x)s' % rtype
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    93
    else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    94
        return 'X %s Y WHERE X eid %%(x)s' % rtype
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    95
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    96
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    97
class CWEntityXMLParser(datafeed.DataFeedXMLParser):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
    98
    """datafeed parser for the 'xml' entity view
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    99
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   100
    Most of the logic is delegated to the following components:
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   101
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   102
    * an "item builder" component, turning an etree xml node into a specific
8239
c6cdd060212e backport stable
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8189
diff changeset
   103
      python dictionary representing an entity
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   104
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   105
    * "action" components, selected given an entity, a relation and its role in
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   106
      the relation, and responsible to link the entity to given related items
8239
c6cdd060212e backport stable
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8189
diff changeset
   107
      (eg dictionary)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   108
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   109
    So the parser is only doing the gluing service and the connection to the
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   110
    source.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   111
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   112
    __regid__ = 'cw.entityxml'
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   113
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   114
    def __init__(self, *args, **kwargs):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   115
        super(CWEntityXMLParser, self).__init__(*args, **kwargs)
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   116
        self._parsed_urls = {}
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   117
        self._processed_entities = set()
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   118
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   119
    def select_linker(self, action, rtype, role, entity=None):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   120
        try:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   121
            return self._cw.vreg['components'].select(
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   122
                'cw.entityxml.action.%s' % action, self._cw, entity=entity,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   123
                rtype=rtype, role=role, parser=self)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   124
        except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   125
            raise RegistryException('Unknown action %s' % action)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   126
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   127
    def list_actions(self):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   128
        reg = self._cw.vreg['components']
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   129
        return sorted(clss[0].action for rid, clss in reg.items()
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   130
                      if rid.startswith('cw.entityxml.action.'))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   131
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   132
    # mapping handling #########################################################
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   133
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   134
    def add_schema_config(self, schemacfg, checkonly=False):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   135
        """added CWSourceSchemaConfig, modify mapping accordingly"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   136
        _ = self._cw._
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   137
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   138
            rtype = schemacfg.schema.rtype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   139
        except AttributeError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   140
            msg = _("entity and relation types can't be mapped, only attributes "
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   141
                    "or relations")
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   142
            raise ValidationError(schemacfg.eid, {rn('cw_for_schema', 'subject'): msg})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   143
        if schemacfg.options:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   144
            options = text_to_dict(schemacfg.options)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   145
        else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   146
            options = {}
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   147
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   148
            role = options.pop('role')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   149
            if role not in ('subject', 'object'):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   150
                raise KeyError
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   151
        except KeyError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   152
            msg = _('"role=subject" or "role=object" must be specified in options')
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   153
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   154
        try:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   155
            action = options.pop('action')
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   156
            linker = self.select_linker(action, rtype, role)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   157
            linker.check_options(options, schemacfg.eid)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   158
        except KeyError:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   159
            msg = _('"action" must be specified in options; allowed values are '
7879
9aae456abab5 [pylint] fix pylint detected errors and tweak it so that pylint -E will be much less verbose next time (+ update some copyrights on the way)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7727
diff changeset
   160
                    '%s') % ', '.join(self.list_actions())
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   161
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   162
        except RegistryException:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   163
            msg = _('allowed values for "action" are %s') % ', '.join(self.list_actions())
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   164
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   165
        if not checkonly:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   166
            if role == 'subject':
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   167
                etype = schemacfg.schema.stype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   168
                ttype = schemacfg.schema.otype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   169
            else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   170
                etype = schemacfg.schema.otype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   171
                ttype = schemacfg.schema.stype.name
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   172
            etyperules = self.source.mapping.setdefault(etype, {})
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   173
            etyperules.setdefault((rtype, role, action), []).append(
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
   174
                (ttype, options))
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   175
            self.source.mapping_idx[schemacfg.eid] = (
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   176
                etype, rtype, role, action, ttype)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   177
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   178
    def del_schema_config(self, schemacfg, checkonly=False):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   179
        """deleted CWSourceSchemaConfig, modify mapping accordingly"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   180
        etype, rtype, role, action, ttype = self.source.mapping_idx[schemacfg.eid]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   181
        rules = self.source.mapping[etype][(rtype, role, action)]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   182
        rules = [x for x in rules if not x[0] == ttype]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   183
        if not rules:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   184
            del self.source.mapping[etype][(rtype, role, action)]
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   185
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   186
    # import handling ##########################################################
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   187
8409
79534887943e [datafeed] fix/finish cleanup started by auc in 8393:c25b96ae4f8a: parser.process prototytpe is (url, raise_on_error=False). Drop partialcommit argument which were never specified
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8299
diff changeset
   188
    def process(self, url, raise_on_error=False):
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   189
        """IDataFeedParser main entry point"""
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   190
        if url.startswith('http'): # XXX similar loose test as in parse of sources.datafeed
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   191
            url = self.complete_url(url)
8409
79534887943e [datafeed] fix/finish cleanup started by auc in 8393:c25b96ae4f8a: parser.process prototytpe is (url, raise_on_error=False). Drop partialcommit argument which were never specified
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8299
diff changeset
   192
        super(CWEntityXMLParser, self).process(url, raise_on_error)
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   193
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   194
    def parse_etree(self, parent):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   195
        for node in list(parent):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   196
            builder = self._cw.vreg['components'].select(
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   197
                'cw.entityxml.item-builder', self._cw, node=node,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   198
                parser=self)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   199
            yield builder.build_item()
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   200
10089
6346f53c85f1 [datafeed] Add a raise_on_error parameter to DataFeedSource.extid2entity
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 9750
diff changeset
   201
    def process_item(self, item, rels, raise_on_error=False):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   202
        """
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   203
        item and rels are what's returned by the item builder `build_item` method:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   204
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   205
        * `item` is an {attribute: value} dictionary
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   206
        * `rels` is for relations and structured as
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   207
           {role: {relation: [(related item, related rels)...]}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   208
        """
10758
a34edc1057c0 [datafeed] use explicit encode/decode instead of casting to str
Julien Cristau <julien.cristau@logilab.fr>
parents: 10756
diff changeset
   209
        entity = self.extid2entity(item['cwuri'].encode('ascii'), item['cwtype'],
10089
6346f53c85f1 [datafeed] Add a raise_on_error parameter to DataFeedSource.extid2entity
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 9750
diff changeset
   210
                                   cwsource=item['cwsource'], item=item,
6346f53c85f1 [datafeed] Add a raise_on_error parameter to DataFeedSource.extid2entity
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 9750
diff changeset
   211
                                   raise_on_error=raise_on_error)
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   212
        if entity is None:
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   213
            return None
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   214
        if entity.eid in self._processed_entities:
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   215
            return entity
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   216
        self._processed_entities.add(entity.eid)
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   217
        if not (self.created_during_pull(entity) or self.updated_during_pull(entity)):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   218
            attrs = extract_typed_attrs(entity.e_schema, item)
8188
1867e252e487 [repository] ldap-feed source. Closes #2086984
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7995
diff changeset
   219
            self.update_if_necessary(entity, attrs)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   220
        self.process_relations(entity, rels)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   221
        return entity
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   222
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   223
    def process_relations(self, entity, rels):
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8748
diff changeset
   224
        etype = entity.cw_etype
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   225
        for (rtype, role, action), rules in self.source.mapping.get(etype, {}).items():
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   226
            try:
7354
f627ab500fda [parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7351
diff changeset
   227
                related_items = rels[role][rtype]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   228
            except KeyError:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   229
                self.import_log.record_error('relation %s-%s not found in xml export of %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   230
                                             % (rtype, role, etype))
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   231
                continue
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   232
            try:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   233
                linker = self.select_linker(action, rtype, role, entity)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   234
            except RegistryException:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   235
                self.import_log.record_error('no linker for action %s' % action)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   236
            else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   237
                linker.link_items(related_items, rules)
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   238
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   239
    def before_entity_copy(self, entity, sourceparams):
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   240
        """IDataFeedParser callback"""
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   241
        attrs = extract_typed_attrs(entity.e_schema, sourceparams['item'])
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   242
        entity.cw_edited.update(attrs)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   243
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   244
    def normalize_url(self, url):
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   245
        """overridden to add vid=xml if vid is not set in the qs"""
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   246
        url = super(CWEntityXMLParser, self).normalize_url(url)
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   247
        purl = urlparse(url)
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   248
        if purl.scheme in ('http', 'https'):
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   249
            params = parse_qs(purl.query)
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   250
            if 'vid' not in params:
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   251
                params['vid'] = ['xml']
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   252
                purl = list(purl)
10756
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
   253
                purl[4] = urlencode(params, doseq=True)
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
   254
                return urlunparse(purl)
8189
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   255
        return url
2ee0ef069fa7 [datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8188
diff changeset
   256
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   257
    def complete_url(self, url, etype=None, known_relations=None):
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   258
        """append to the url's query string information about relation that should
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   259
        be included in the resulting xml, according to source mapping.
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   260
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   261
        If etype is not specified, try to guess it using the last path part of
7689
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   262
        the url, i.e. the format used by default in cubicweb to map all entities
a86fd3ec322e [datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7553
diff changeset
   263
        of a given type as in 'http://mysite.org/EntityType'.
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   264
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   265
        If `known_relations` is given, it should be a dictionary of already
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   266
        known relations, so they don't get queried again.
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   267
        """
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   268
        purl = urlparse(url)
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   269
        params = parse_qs(purl.query)
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   270
        if etype is None:
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   271
            etype = purl.path.split('/')[-1]
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   272
        try:
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   273
            etype = self._cw.vreg.case_insensitive_etypes[etype.lower()]
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   274
        except KeyError:
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   275
            return url
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   276
        relations = params['relation'] = set(params.get('relation', ()))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   277
        for rtype, role, _ in self.source.mapping.get(etype, ()):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   278
            if known_relations and rtype in known_relations.get('role', ()):
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   279
                continue
10515
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   280
            relations.add('%s-%s' % (rtype, role))
70ed2067fdb5 [cwxml parser] refactor url handling code to use urlparse and urlencode (related to #5456849)
David Douard <david.douard@logilab.fr>
parents: 10089
diff changeset
   281
        purl = list(purl)
10756
3eb527ce5f0f [py3k] get urllib foo from six.moves
Julien Cristau <julien.cristau@logilab.fr>
parents: 10687
diff changeset
   282
        purl[4] = urlencode(params, doseq=True)
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10515
diff changeset
   283
        return urlunparse(purl)
7531
e891ca479586 [datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7481
diff changeset
   284
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   285
    def complete_item(self, item, rels):
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   286
        try:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   287
            return self._parsed_urls[item['cwuri']]
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   288
        except KeyError:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   289
            itemurl = self.complete_url(item['cwuri'], item['cwtype'], rels)
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   290
            item_rels = list(self.parse(itemurl))
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   291
            assert len(item_rels) == 1, 'url %s expected to bring back one '\
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   292
                   'and only one entity, got %s' % (itemurl, len(item_rels))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   293
            self._parsed_urls[item['cwuri']] = item_rels[0]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   294
            if rels:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   295
                # XXX (do it better) merge relations
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   296
                new_rels = item_rels[0][1]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   297
                new_rels.get('subject', {}).update(rels.get('subject', {}))
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   298
                new_rels.get('object', {}).update(rels.get('object', {}))
7474
7dc405ad7bf3 [datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7470
diff changeset
   299
            return item_rels[0]
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   300
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   301
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   302
class CWEntityXMLItemBuilder(Component):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   303
    __regid__ = 'cw.entityxml.item-builder'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   304
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   305
    def __init__(self, _cw, parser, node, **kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   306
        super(CWEntityXMLItemBuilder, self).__init__(_cw, **kwargs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   307
        self.parser = parser
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   308
        self.node = node
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   309
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   310
    def build_item(self):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   311
        """parse a XML document node and return two dictionaries defining (part
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   312
        of) an entity:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   313
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   314
        - {attribute: value}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   315
        - {role: {relation: [(related item, related rels)...]}
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   316
        """
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   317
        node = self.node
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   318
        item = dict(node.attrib.items())
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
   319
        item['cwtype'] = text_type(node.tag)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   320
        item.setdefault('cwsource', None)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   321
        try:
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8409
diff changeset
   322
            item['eid'] = int(item['eid'])
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   323
        except KeyError:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   324
            # cw < 3.11 compat mode XXX
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8409
diff changeset
   325
            item['eid'] = int(node.find('eid').text)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   326
            item['cwuri'] = node.find('cwuri').text
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   327
        rels = {}
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   328
        for child in node:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   329
            role = child.get('role')
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   330
            if role:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   331
                # relation
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   332
                related = rels.setdefault(role, {}).setdefault(child.tag, [])
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   333
                related += self.parser.parse_etree(child)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   334
            elif child.text:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   335
                # attribute
10687
d394bfcd8c25 [py3k] unicode → six.text_type
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10662
diff changeset
   336
                item[child.tag] = text_type(child.text)
7700
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   337
            else:
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   338
                # None attribute (empty tag)
0010dde5352a [datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7689
diff changeset
   339
                item[child.tag] = None
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   340
        return item, rels
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   341
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   342
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   343
class CWEntityXMLActionCopy(Component):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   344
    """implementation of cubicweb entity xml parser's'copy' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   345
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   346
    Takes no option.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   347
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   348
    __regid__ = 'cw.entityxml.action.copy'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   349
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   350
    def __init__(self, _cw, parser, rtype, role, entity=None, **kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   351
        super(CWEntityXMLActionCopy, self).__init__(_cw, **kwargs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   352
        self.parser = parser
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   353
        self.rtype = rtype
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   354
        self.role = role
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   355
        self.entity = entity
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   356
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   357
    @classproperty
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   358
    def action(cls):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   359
        return cls.__regid__.rsplit('.', 1)[-1]
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   360
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   361
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   362
        self._check_no_options(options, eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   363
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   364
    def _check_no_options(self, options, eid, msg=None):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   365
        if options:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   366
            if msg is None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   367
                msg = self._cw._("'%s' action doesn't take any options") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   368
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   369
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   370
    def link_items(self, others, rules):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   371
        assert not any(x[1] for x in rules), "'copy' action takes no option"
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   372
        ttypes = frozenset([x[0] for x in rules])
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   373
        eids = [] # local eids
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   374
        for item, rels in others:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   375
            if item['cwtype'] in ttypes:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   376
                item, rels = self.parser.complete_item(item, rels)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   377
                other_entity = self.parser.process_item(item, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   378
                if other_entity is not None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   379
                    eids.append(other_entity.eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   380
        if eids:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   381
            self._set_relation(eids)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   382
        else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   383
            self._clear_relation(ttypes)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   384
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   385
    def _clear_relation(self, ttypes):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   386
        if not self.parser.created_during_pull(self.entity):
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   387
            if len(ttypes) > 1:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   388
                typerestr = ', Y is IN(%s)' % ','.join(ttypes)
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   389
            else:
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   390
                typerestr = ', Y is %s' % ','.join(ttypes)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   391
            self._cw.execute('DELETE ' + rtype_role_rql(self.rtype, self.role) + typerestr,
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   392
                             {'x': self.entity.eid})
6960
822f2530570d [datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
   393
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   394
    def _set_relation(self, eids):
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   395
        assert eids
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   396
        rtype = self.rtype
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   397
        rqlbase = rtype_role_rql(rtype, self.role)
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   398
        eidstr = ','.join(str(eid) for eid in eids)
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   399
        self._cw.execute('DELETE %s, NOT Y eid IN (%s)' % (rqlbase, eidstr),
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   400
                         {'x': self.entity.eid})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   401
        if self.role == 'object':
7399
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   402
            rql = 'SET %s, Y eid IN (%s), NOT Y %s X' % (rqlbase, eidstr, rtype)
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   403
        else:
972ed1843bd8 [multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7378
diff changeset
   404
            rql = 'SET %s, Y eid IN (%s), NOT X %s Y' % (rqlbase, eidstr, rtype)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   405
        self._cw.execute(rql, {'x': self.entity.eid})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   406
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   407
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   408
class CWEntityXMLActionLink(CWEntityXMLActionCopy):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   409
    """implementation of cubicweb entity xml parser's'link' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   410
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   411
    requires a 'linkattr' option to control search of the linked entity.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   412
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   413
    __regid__ = 'cw.entityxml.action.link'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   414
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   415
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   416
        if not 'linkattr' in options:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   417
            msg = self._cw._("'%s' action requires 'linkattr' option") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   418
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   419
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   420
    create_when_not_found = False
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   421
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   422
    def link_items(self, others, rules):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   423
        for ttype, options in rules:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   424
            searchattrs = splitstrip(options.get('linkattr', ''))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   425
            self._related_link(ttype, others, searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   426
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   427
    def _related_link(self, ttype, others, searchattrs):
9474
dd96b877879d [cwxml parser] minor pylint fixes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8900
diff changeset
   428
        def issubset(x, y):
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   429
            return all(z in y for z in x)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   430
        eids = [] # local eids
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   431
        log = self.parser.import_log
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   432
        for item, rels in others:
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   433
            if item['cwtype'] != ttype:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   434
                continue
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   435
            if not issubset(searchattrs, item):
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   436
                item, rels = self.parser.complete_item(item, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   437
                if not issubset(searchattrs, item):
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   438
                    log.record_error('missing attribute, got %s expected keys %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   439
                                     % (item, searchattrs))
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   440
                    continue
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   441
            # XXX str() needed with python < 2.6
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   442
            kwargs = dict((str(attr), item[attr]) for attr in searchattrs)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   443
            targets = self._find_entities(item, kwargs)
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   444
            if len(targets) == 1:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   445
                entity = targets[0]
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   446
            elif not targets and self.create_when_not_found:
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   447
                entity = self._cw.create_entity(item['cwtype'], **kwargs)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   448
            else:
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   449
                if len(targets) > 1:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   450
                    log.record_error('ambiguous link: found %s entity %s with attributes %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   451
                                     % (len(targets), item['cwtype'], kwargs))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   452
                else:
7995
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   453
                    log.record_error('can not find %s entity with attributes %s'
9a9f35ef418c Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents: 7957
diff changeset
   454
                                     % (item['cwtype'], kwargs))
7916
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   455
                continue
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   456
            eids.append(entity.eid)
54e240c1b419 [datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7879
diff changeset
   457
            self.parser.process_relations(entity, rels)
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   458
        if eids:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   459
            self._set_relation(eids)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   460
        else:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   461
            self._clear_relation((ttype,))
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   462
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   463
    def _find_entities(self, item, kwargs):
9750
566f8fce5168 [dataimport, migration] silence find_entities / find_one_entity warning
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 9474
diff changeset
   464
        return tuple(self._cw.find(item['cwtype'], **kwargs).entities())
7553
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   465
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   466
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   467
class CWEntityXMLActionLinkInState(CWEntityXMLActionLink):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   468
    """custom implementation of cubicweb entity xml parser's'link' action for
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   469
    in_state relation
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   470
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   471
    __select__ = match_rtype('in_state')
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   472
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   473
    def check_options(self, options, eid):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   474
        super(CWEntityXMLActionLinkInState, self).check_options(options, eid)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   475
        if not 'name' in options['linkattr']:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   476
            msg = self._cw._("'%s' action for in_state relation should at least have 'linkattr=name' option") % self.action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   477
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   478
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   479
    def _find_entities(self, item, kwargs):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   480
        assert 'name' in item # XXX else, complete_item
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   481
        state_name = item['name']
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   482
        wf = self.entity.cw_adapt_to('IWorkflowable').current_workflow
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   483
        state = wf.state_by_name(state_name)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   484
        if state is None:
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   485
            return ()
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   486
        return (state,)
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   487
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   488
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   489
class CWEntityXMLActionLinkOrCreate(CWEntityXMLActionLink):
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   490
    """implementation of cubicweb entity xml parser's'link-or-create' action
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   491
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   492
    requires a 'linkattr' option to control search of the linked entity.
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   493
    """
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   494
    __regid__ = 'cw.entityxml.action.link-or-create'
935423529f45 [datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7534
diff changeset
   495
    create_when_not_found = True