author | Vincent Michel <vincent.michel@logilab.fr> |
Mon, 07 Apr 2014 08:44:56 +0000 | |
branch | stable |
changeset 9696 | 4de6e73f580d |
parent 8900 | 010a59e12d89 |
child 9474 | dd96b877879d |
permissions | -rw-r--r-- |
# copyright 2010-2012 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
#
# This file is part of CubicWeb.
#
# CubicWeb is free software: you can redistribute it and/or modify it under the
# terms of the GNU Lesser General Public License as published by the Free
# Software Foundation, either version 2.1 of the License, or (at your option)
# any later version.
#
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License along
# with CubicWeb. If not, see <http://www.gnu.org/licenses/>.
"""datafeed parser for xml generated by cubicweb

Example of mapping for CWEntityXMLParser::

  {u'CWUser': {                                   # EntityType
      (u'in_group', u'subject', u'link'): [       # (rtype, role, action)
          (u'CWGroup', {u'linkattr': u'name'})],  # -> rules = [(EntityType, options), ...]
      (u'tags', u'object', u'link-or-create'): [  # (...)
          (u'Tag', {u'linkattr': u'name'})],      # -> ...
      (u'use_email', u'subject', u'copy'): [      # (...)
          (u'EmailAddress', {})]                  # -> ...
      }
   }

"""

from datetime import datetime, timedelta, time
from urllib import urlencode
from cgi import parse_qs # in urlparse with python >= 2.6

from logilab.common.date import todate, totime
from logilab.common.textutils import splitstrip, text_to_dict
from logilab.common.decorators import classproperty

from yams.constraints import BASE_CONVERTERS
from yams.schema import role_name as rn

from cubicweb import ValidationError, RegistryException
from cubicweb.view import Component
from cubicweb.server.sources import datafeed
from cubicweb.server.hook import match_rtype

# XXX see cubicweb.cwvreg.YAMS_TO_PY
# XXX see cubicweb.web.views.xmlrss.SERIALIZERS
DEFAULT_CONVERTERS = BASE_CONVERTERS.copy()
DEFAULT_CONVERTERS['String'] = unicode
DEFAULT_CONVERTERS['Password'] = lambda x: x.encode('utf8')
def convert_date(ustr):
    return todate(datetime.strptime(ustr, '%Y-%m-%d'))
DEFAULT_CONVERTERS['Date'] = convert_date
def convert_datetime(ustr):
    if '.' in ustr: # assume %Y-%m-%d %H:%M:%S.mmmmmm
        ustr = ustr.split('.', 1)[0]
    return datetime.strptime(ustr, '%Y-%m-%d %H:%M:%S')
DEFAULT_CONVERTERS['Datetime'] = convert_datetime
# XXX handle timezone, though this will be enough as TZDatetime are
# serialized without time zone by default (UTC time). See
# cw.web.views.xmlrss.SERIALIZERS.
DEFAULT_CONVERTERS['TZDatetime'] = convert_datetime
def convert_time(ustr):
    return totime(datetime.strptime(ustr, '%H:%M:%S'))
DEFAULT_CONVERTERS['Time'] = convert_time
DEFAULT_CONVERTERS['TZTime'] = convert_time
def convert_interval(ustr):
    # an interval is a duration in seconds: build a timedelta, since
    # datetime.time() accepts no `seconds` argument
    return timedelta(seconds=int(ustr))
DEFAULT_CONVERTERS['Interval'] = convert_interval
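# The converter table above can be exercised in isolation. What follows is a
# minimal standalone sketch (Python 3, standard library only), not the module's
# own code: the names sketch_convert_datetime, sketch_convert_interval and
# SKETCH_CONVERTERS are hypothetical, and the interval converter assumes the
# serialized value is a whole number of seconds.

```python
from datetime import datetime, timedelta


def sketch_convert_datetime(ustr):
    """Parse 'YYYY-MM-DD HH:MM:SS', ignoring any fractional seconds."""
    if '.' in ustr:
        # drop the '.mmmmmm' suffix before parsing
        ustr = ustr.split('.', 1)[0]
    return datetime.strptime(ustr, '%Y-%m-%d %H:%M:%S')


def sketch_convert_interval(ustr):
    """Turn a number of seconds (as text) into a timedelta."""
    return timedelta(seconds=int(ustr))


# attribute type name -> callable turning the serialized text into a value
SKETCH_CONVERTERS = {
    'Int': int,
    'Datetime': sketch_convert_datetime,
    'Interval': sketch_convert_interval,
}

parsed = SKETCH_CONVERTERS['Datetime']('2014-04-07 08:44:56.123456')
duration = SKETCH_CONVERTERS['Interval']('90')
```

# Keying the table by type name keeps the conversion step a plain dictionary
# lookup, so supporting another type only means registering one more callable.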

def extract_typed_attrs(eschema, stringdict, converters=DEFAULT_CONVERTERS):
    typeddict = {}
    for rschema in eschema.subject_relations():
        if rschema.final and rschema in stringdict:
            if rschema in ('eid', 'cwuri', 'cwtype', 'cwsource'):
                continue
            attrtype = eschema.destination(rschema)
            value = stringdict[rschema]
            if value is not None:
                value = converters[attrtype](value)
            typeddict[rschema.type] = value
    return typeddict
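# The typing step above can be illustrated without a yams schema. This is a
# hedged sketch only: a plain dict (attribute name -> type name) stands in for
# the entity schema, and sketch_extract_typed_attrs and SKIPPED are
# hypothetical names that do not exist in CubicWeb.

```python
# attributes handled by the data feed machinery itself, never copied over
SKIPPED = ('eid', 'cwuri', 'cwtype', 'cwsource')


def sketch_extract_typed_attrs(attr_types, stringdict, converters):
    """Convert the text values of known attributes to typed values.

    attr_types maps attribute name -> type name (stand-in for the schema),
    stringdict maps attribute name -> text, or None for an empty tag.
    """
    typeddict = {}
    for attr, attrtype in attr_types.items():
        if attr in SKIPPED or attr not in stringdict:
            continue
        value = stringdict[attr]
        if value is not None:
            # None means "no value" and is passed through unconverted
            value = converters[attrtype](value)
        typeddict[attr] = value
    return typeddict


typed = sketch_extract_typed_attrs(
    {'login': 'String', 'age': 'Int', 'eid': 'Int', 'firstname': 'String'},
    {'login': 'alice', 'age': '42', 'eid': '1234', 'firstname': None},
    {'String': str, 'Int': int},
)
```

# In this sketch 'eid' is dropped, 'age' becomes an integer and the empty
# 'firstname' stays None.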

def rtype_role_rql(rtype, role):
    if role == 'object':
        return 'Y %s X WHERE X eid %%(x)s' % rtype
    else:
        return 'X %s Y WHERE X eid %%(x)s' % rtype
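# The doubled %% in the templates above survives the first substitution as a
# literal %, leaving a %(x)s placeholder for the eid to be filled in later.
# A standalone copy of the helper (renamed sketch_rtype_role_rql, a
# hypothetical name) shows the resulting strings:

```python
def sketch_rtype_role_rql(rtype, role):
    """Build the relation part of a query, oriented by the entity's role."""
    if role == 'object':
        # the entity X is the object of the relation
        return 'Y %s X WHERE X eid %%(x)s' % rtype
    # the entity X is the subject of the relation
    return 'X %s Y WHERE X eid %%(x)s' % rtype


as_object = sketch_rtype_role_rql('tags', 'object')
as_subject = sketch_rtype_role_rql('in_group', 'subject')
```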


class CWEntityXMLParser(datafeed.DataFeedXMLParser):
    """datafeed parser for the 'xml' entity view

    Most of the logic is delegated to the following components:

    * an "item builder" component, turning an etree xml node into a specific
      python dictionary representing an entity

    * "action" components, selected given an entity, a relation and its role in
      the relation, and responsible for linking the entity to the given related
      items (e.g. dictionaries)

    So the parser only provides the glue between these components and the
    connection to the source.
    """
    __regid__ = 'cw.entityxml'

    def __init__(self, *args, **kwargs):
        super(CWEntityXMLParser, self).__init__(*args, **kwargs)
        self._parsed_urls = {}
        self._processed_entities = set()

    def select_linker(self, action, rtype, role, entity=None):
        try:
            return self._cw.vreg['components'].select(
                'cw.entityxml.action.%s' % action, self._cw, entity=entity,
                rtype=rtype, role=role, parser=self)
        except RegistryException:
            raise RegistryException('Unknown action %s' % action)

    def list_actions(self):
        reg = self._cw.vreg['components']
        return sorted(clss[0].action for rid, clss in reg.iteritems()
                      if rid.startswith('cw.entityxml.action.'))

    # mapping handling #########################################################

    def add_schema_config(self, schemacfg, checkonly=False):
        """added CWSourceSchemaConfig, modify mapping accordingly"""
        _ = self._cw._
        try:
            rtype = schemacfg.schema.rtype.name
        except AttributeError:
            msg = _("entity and relation types can't be mapped, only attributes "
                    "or relations")
            raise ValidationError(schemacfg.eid, {rn('cw_for_schema', 'subject'): msg})
        if schemacfg.options:
            options = text_to_dict(schemacfg.options)
        else:
            options = {}
        try:
            role = options.pop('role')
            if role not in ('subject', 'object'):
                raise KeyError
        except KeyError:
            msg = _('"role=subject" or "role=object" must be specified in options')
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
        try:
            action = options.pop('action')
            linker = self.select_linker(action, rtype, role)
            linker.check_options(options, schemacfg.eid)
        except KeyError:
            msg = _('"action" must be specified in options; allowed values are '
                    '%s') % ', '.join(self.list_actions())
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
        except RegistryException:
            msg = _('allowed values for "action" are %s') % ', '.join(self.list_actions())
            raise ValidationError(schemacfg.eid, {rn('options', 'subject'): msg})
        if not checkonly:
            if role == 'subject':
                etype = schemacfg.schema.stype.name
                ttype = schemacfg.schema.otype.name
            else:
                etype = schemacfg.schema.otype.name
                ttype = schemacfg.schema.stype.name
            etyperules = self.source.mapping.setdefault(etype, {})
            etyperules.setdefault((rtype, role, action), []).append(
                (ttype, options))
            self.source.mapping_idx[schemacfg.eid] = (
                etype, rtype, role, action, ttype)
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
175 |
|
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
176 |
def del_schema_config(self, schemacfg, checkonly=False): |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
177 |
"""a CWSourceSchemaConfig was deleted, modify mapping accordingly""" |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
178 |
etype, rtype, role, action, ttype = self.source.mapping_idx[schemacfg.eid] |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
179 |
rules = self.source.mapping[etype][(rtype, role, action)] |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
180 |
rules[:] = [x for x in rules if x[0] != ttype]  # filter in place so source.mapping sees the change |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
181 |
if not rules: |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
182 |
del self.source.mapping[etype][(rtype, role, action)] |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
183 |
|
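The two schema-config callbacks above maintain a pair of dictionaries: `mapping`, keyed by entity type and then by `(rtype, role, action)`, and `mapping_idx`, keyed by the configuration's eid. The following is a minimal standalone sketch of that bookkeeping, not CubicWeb code: `SchemaCfg` and `MappingStore` are hypothetical names introduced for illustration, and unlike the method above the sketch also drops the `mapping_idx` entry on deletion.

```python
from collections import namedtuple

# Hypothetical stand-in for a CWSourceSchemaConfig entity: only the fields
# this sketch needs, not the real CubicWeb class.
SchemaCfg = namedtuple('SchemaCfg', 'eid etype rtype role action ttype options')


class MappingStore(object):
    """Toy container mirroring `source.mapping` and `source.mapping_idx`."""

    def __init__(self):
        self.mapping = {}      # {etype: {(rtype, role, action): [(ttype, options)]}}
        self.mapping_idx = {}  # {cfg eid: (etype, rtype, role, action, ttype)}

    def add(self, cfg):
        key = (cfg.rtype, cfg.role, cfg.action)
        rules = self.mapping.setdefault(cfg.etype, {}).setdefault(key, [])
        rules.append((cfg.ttype, cfg.options))
        self.mapping_idx[cfg.eid] = (cfg.etype,) + key + (cfg.ttype,)

    def delete(self, cfg):
        # assumption of this sketch: the index entry is removed as well
        etype, rtype, role, action, ttype = self.mapping_idx.pop(cfg.eid)
        key = (rtype, role, action)
        rules = self.mapping[etype][key]
        # slice assignment filters the very list object stored in `mapping`
        rules[:] = [rule for rule in rules if rule[0] != ttype]
        if not rules:
            del self.mapping[etype][key]
```

Filtering through a slice assignment matters when several configurations share the same `(rtype, role, action)` key: rebinding a local name would leave the stored list untouched whenever it is still non-empty after filtering.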
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
184 |
# import handling ########################################################## |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
185 |
|
8409
79534887943e
[datafeed] fix/finish cleanup started by auc in 8393:c25b96ae4f8a: parser.process prototytpe is (url, raise_on_error=False). Drop partialcommit argument which were never specified
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8299
diff
changeset
|
186 |
def process(self, url, raise_on_error=False): |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
187 |
"""IDataFeedParser main entry point""" |
7689
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
188 |
if url.startswith('http'): # XXX similar loose test as in parse of sources.datafeed |
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
189 |
url = self.complete_url(url) |
8409
79534887943e
[datafeed] fix/finish cleanup started by auc in 8393:c25b96ae4f8a: parser.process prototytpe is (url, raise_on_error=False). Drop partialcommit argument which were never specified
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8299
diff
changeset
|
190 |
super(CWEntityXMLParser, self).process(url, raise_on_error) |
7689
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
191 |
|
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
192 |
def parse_etree(self, parent): |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
193 |
for node in list(parent): |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
194 |
builder = self._cw.vreg['components'].select( |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
195 |
'cw.entityxml.item-builder', self._cw, node=node, |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
196 |
parser=self) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
197 |
yield builder.build_item() |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
198 |
|
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
199 |
def process_item(self, item, rels): |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
200 |
""" |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
201 |
item and rels are what's returned by the item builder `build_item` method: |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
202 |
|
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
203 |
* `item` is an {attribute: value} dictionary |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
204 |
* `rels` is for relations and structured as |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
205 |
{role: {relation: [(related item, related rels)...]}} |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
206 |
""" |
7932
2ad26cc3b5c6
[datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7916
diff
changeset
|
207 |
entity = self.extid2entity(str(item['cwuri']), item['cwtype'], |
2ad26cc3b5c6
[datafeed parser] only update an entity attributes when needed. Closes #1989142
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7916
diff
changeset
|
208 |
cwsource=item['cwsource'], item=item) |
7399
972ed1843bd8
[multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7378
diff
changeset
|
209 |
if entity is None: |
972ed1843bd8
[multi-sources] support for moving an entity from an external source (closes #343818)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7378
diff
changeset
|
210 |
return None |
7474
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
211 |
if entity.eid in self._processed_entities: |
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
212 |
return entity |
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
213 |
self._processed_entities.add(entity.eid) |
7354
f627ab500fda
[parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents:
7351
diff
changeset
|
214 |
if not (self.created_during_pull(entity) or self.updated_during_pull(entity)): |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
215 |
attrs = extract_typed_attrs(entity.e_schema, item) |
8188
1867e252e487
[repository] ldap-feed source. Closes #2086984
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7995
diff
changeset
|
216 |
self.update_if_necessary(entity, attrs) |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
217 |
self.process_relations(entity, rels) |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
218 |
return entity |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
219 |
|
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
220 |
def process_relations(self, entity, rels): |
8900
010a59e12d89
use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents:
8748
diff
changeset
|
221 |
etype = entity.cw_etype |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
222 |
for (rtype, role, action), rules in self.source.mapping.get(etype, {}).iteritems(): |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
223 |
try: |
7354
f627ab500fda
[parsers] various refactorings
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents:
7351
diff
changeset
|
224 |
related_items = rels[role][rtype] |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
225 |
except KeyError: |
7995
9a9f35ef418c
Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents:
7957
diff
changeset
|
226 |
self.import_log.record_error('relation %s-%s not found in xml export of %s' |
9a9f35ef418c
Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents:
7957
diff
changeset
|
227 |
% (rtype, role, etype)) |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
228 |
continue |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
229 |
try: |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
230 |
linker = self.select_linker(action, rtype, role, entity) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
231 |
except RegistryException: |
7995
9a9f35ef418c
Record a log of datafeed source imports (closes #2026097)
Julien Cristau <julien.cristau@logilab.fr>
parents:
7957
diff
changeset
|
232 |
self.import_log.record_error('no linker for action %s' % action) |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
233 |
else: |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
234 |
linker.link_items(related_items, rules) |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
235 |
|
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
236 |
def before_entity_copy(self, entity, sourceparams): |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
237 |
"""IDataFeedParser callback""" |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
238 |
attrs = extract_typed_attrs(entity.e_schema, sourceparams['item']) |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
239 |
entity.cw_edited.update(attrs) |
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
240 |
|
8189
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
241 |
|
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
242 |
def normalize_url(self, url): |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
243 |
"""overridden to add vid=xml""" |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
244 |
url = super(CWEntityXMLParser, self).normalize_url(url) |
8299
e5d636081571
[datafeed xml parser] fix typo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8239
diff
changeset
|
245 |
if url.startswith('http'): |
8189
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
246 |
try: |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
247 |
url, qs = url.split('?', 1) |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
248 |
except ValueError: |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
249 |
params = {} |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
250 |
else: |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
251 |
params = parse_qs(qs) |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
252 |
if 'vid' not in params: |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
253 |
params['vid'] = ['xml'] |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
254 |
return url + '?' + self._cw.build_url_params(**params) |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
255 |
return url |
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
256 |
|
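The query-string handling in `normalize_url` can be tried outside CubicWeb. The sketch below is an approximation under stated assumptions: `add_default_vid` is a hypothetical helper name, and the standard library's `urlencode` stands in for `self._cw.build_url_params`, whose exact output format is not shown in this listing.

```python
try:  # Python 3
    from urllib.parse import parse_qs, urlencode
except ImportError:  # Python 2, which the surrounding module targets
    from urlparse import parse_qs
    from urllib import urlencode


def add_default_vid(url, default_vid='xml'):
    """Return `url` with vid=<default_vid> added unless a vid is already set."""
    if not url.startswith('http'):
        # same loose test as the method above: non-http urls are left alone
        return url
    base, sep, qs = url.partition('?')
    params = parse_qs(qs) if sep else {}
    params.setdefault('vid', [default_vid])
    # doseq=True expands list values into repeated key=value pairs;
    # sorting only makes the output order deterministic
    return base + '?' + urlencode(sorted(params.items()), doseq=True)
```

An explicit `vid` in the incoming url wins over the default, which matches the `'vid' in params` test in the method.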
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
257 |
def complete_url(self, url, etype=None, known_relations=None): |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
258 |
"""append to the url's query string information about relations that should |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
259 |
be included in the resulting xml, according to source mapping. |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
260 |
|
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
261 |
If etype is not specified, try to guess it using the last path part of |
7689
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
262 |
the url, i.e. the format used by default in cubicweb to map all entities |
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
263 |
of a given type as in 'http://mysite.org/EntityType'. |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
264 |
|
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
265 |
If `known_relations` is given, it should be a dictionary of already |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
266 |
known relations, so they don't get queried again. |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
267 |
""" |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
268 |
try: |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
269 |
url, qs = url.split('?', 1) |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
270 |
except ValueError: |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
271 |
qs = '' |
8189
2ee0ef069fa7
[datafeed] when checking for deletion using cwentity xml parser, use vid=xml to limit resources consumption
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
8188
diff
changeset
|
272 |
# XXX vid will be added by later call to normalize_url (in parent class) |
7689
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
273 |
params = parse_qs(qs) |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
274 |
if etype is None: |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
275 |
try: |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
276 |
etype = url.rsplit('/', 1)[1] |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
277 |
except IndexError:  # indexing the rsplit result fails with IndexError when there is no '/' |
7689
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
278 |
return url + '?' + self._cw.build_url_params(**params) |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
279 |
try: |
7701
e5ba5de1d485
[datafeed] case insensitive etype. Closes #1877014
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7700
diff
changeset
|
280 |
etype = self._cw.vreg.case_insensitive_etypes[etype.lower()] |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
281 |
except KeyError: |
7689
a86fd3ec322e
[datafeed] closes #1873616 (user's url corruption when etype is not recognized
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7553
diff
changeset
|
282 |
return url + '?' + self._cw.build_url_params(**params) |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
283 |
relations = params.setdefault('relation', []) |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
284 |
for rtype, role, _ in self.source.mapping.get(etype, ()): |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
285 |
if known_relations and rtype in known_relations.get(role, ()): |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
286 |
continue |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
287 |
reldef = '%s-%s' % (rtype, role) |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
288 |
if reldef not in relations: |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
289 |
relations.append(reldef) |
7531
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
290 |
return url + '?' + self._cw.build_url_params(**params) |
e891ca479586
[datafeed, cwxml] parser now try to complete source's urls so one can add url such as <myinstance>/project. Closes #1759908
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7481
diff
changeset
|
291 |
|
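The core of `complete_url` is the selection of `relation` parameters from the source mapping. Here is a minimal sketch of that step alone, with `relation_params` as a hypothetical helper name. It assumes `known_relations` is keyed by the actual role (`'subject'` or `'object'`), as the `{role: {relation: [...]}}` structure documented in `process_item` suggests, and it sorts the mapping keys only to get a reproducible order.

```python
def relation_params(mapping, etype, known_relations=None, existing=()):
    """List the `relation` query values to request for `etype`.

    `mapping` follows the {etype: {(rtype, role, action): rules}} layout;
    `known_relations` follows {role: {rtype: [...]}}.
    """
    relations = list(existing)
    known_relations = known_relations or {}
    for rtype, role, _action in sorted(mapping.get(etype, ())):
        if rtype in known_relations.get(role, ()):
            continue  # already fetched, do not ask for it again
        reldef = '%s-%s' % (rtype, role)
        if reldef not in relations:
            relations.append(reldef)
    return relations
```

For a mapping with `concerns` (object) and `in_state` (subject) rules on `Project`, passing `{'object': {'concerns': []}}` as `known_relations` leaves only `in_state-subject` to be requested.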
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
292 |
def complete_item(self, item, rels): |
7474
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
293 |
try: |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
294 |
return self._parsed_urls[item['cwuri']] |
7474
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
295 |
except KeyError: |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
296 |
itemurl = self.complete_url(item['cwuri'], item['cwtype'], rels) |
7474
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
297 |
item_rels = list(self.parse(itemurl)) |
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
298 |
assert len(item_rels) == 1, 'url %s expected to bring back one '\ |
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
299 |
'and only one entity, got %s' % (itemurl, len(item_rels)) |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
300 |
self._parsed_urls[item['cwuri']] = item_rels[0] |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
301 |
if rels: |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
302 |
# XXX (do it better) merge relations |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
303 |
new_rels = item_rels[0][1] |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
304 |
new_rels.setdefault('subject', {}).update(rels.get('subject', {})) |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
305 |
new_rels.setdefault('object', {}).update(rels.get('object', {})) |
7474
7dc405ad7bf3
[datafeed cwxml parser] cache processed urls/entities to avoid unnecessary http requests and processing
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7470
diff
changeset
|
306 |
return item_rels[0] |
6960
822f2530570d
[datafeed] add parser to import cubicweb xml
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff
changeset
|
307 |
|
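The relation merge at the end of `complete_item` can be isolated as below. This is only a sketch, with `merge_rels` as a hypothetical helper name. It uses `setdefault` so that a role missing from the freshly parsed relations still receives the already-known ones; a plain `get(role, {})` would update a temporary dictionary that is then discarded.

```python
def merge_rels(new_rels, rels):
    """Merge `rels` into `new_rels`, both shaped {role: {rtype: [...]}}."""
    for role in ('subject', 'object'):
        # setdefault keeps the merged dict attached to new_rels even when
        # the role key was absent from it
        new_rels.setdefault(role, {}).update(rels.get(role, {}))
    return new_rels
```

As in the method above, already-known relations take precedence over the freshly parsed ones for the same relation type.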
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
308 |
|
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
309 |
class CWEntityXMLItemBuilder(Component): |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
310 |
__regid__ = 'cw.entityxml.item-builder' |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
311 |
|
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
312 |
def __init__(self, _cw, parser, node, **kwargs): |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
313 |
super(CWEntityXMLItemBuilder, self).__init__(_cw, **kwargs) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
314 |
self.parser = parser |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
315 |
self.node = node |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
316 |
|
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
317 |
def build_item(self): |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
318 |
"""parse an XML document node and return two dictionaries defining (part |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
319 |
of) an entity: |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
320 |
|
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
321 |
- {attribute: value} |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
322 |
- {role: {relation: [(related item, related rels)...]} |
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
323 |
""" |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
324 |
node = self.node |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
325 |
item = dict(node.attrib.items()) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
326 |
item['cwtype'] = unicode(node.tag) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
327 |
item.setdefault('cwsource', None) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
328 |
try: |
8748
f5027f8d2478
drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents:
8409
diff
changeset
|
329 |
item['eid'] = int(item['eid']) |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
330 |
except KeyError: |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
331 |
# cw < 3.11 compat mode XXX |
8748
f5027f8d2478
drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents:
8409
diff
changeset
|
332 |
item['eid'] = int(node.find('eid').text) |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
333 |
item['cwuri'] = node.find('cwuri').text |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
334 |
rels = {} |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
335 |
for child in node: |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
336 |
role = child.get('role') |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
337 |
if role: |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
338 |
# relation |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
339 |
related = rels.setdefault(role, {}).setdefault(child.tag, []) |
7916
54e240c1b419
[datafeed parser] properly recurse on nodes to follow all mapped relations (closes #1988432)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7879
diff
changeset
|
340 |
related += self.parser.parse_etree(child) |
7700
0010dde5352a
[datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7689
diff
changeset
|
341 |
elif child.text: |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
342 |
# attribute |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
343 |
item[child.tag] = unicode(child.text) |
7700
0010dde5352a
[datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7689
diff
changeset
|
344 |
else: |
0010dde5352a
[datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7689
diff
changeset
|
345 |
# None attribute (empty tag) |
0010dde5352a
[datafeed] closes #1875322: empty tag means None value
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7689
diff
changeset
|
346 |
item[child.tag] = None |
7553
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
347 |
return item, rels |
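
# Illustration (not part of the module): the shape of the two dictionaries
# returned by build_item() is easier to see on a small standalone sketch. It
# uses only the standard library and Python 3 strings. The sample XML and the
# names build_item_sketch() and parse_children() are assumptions made for this
# example; parse_children() is only a simplified stand-in for
# parser.parse_etree(), and the cw < 3.11 compatibility fallback is left out.

```python
import xml.etree.ElementTree as ET


def parse_children(parent):
    """Hypothetical stand-in for parser.parse_etree(): build every child node."""
    return [build_item_sketch(node) for node in parent]


def build_item_sketch(node):
    """Return (item, rels) for one entity node, following build_item()'s logic."""
    item = dict(node.attrib)
    item['cwtype'] = node.tag
    item.setdefault('cwsource', None)
    item['eid'] = int(item['eid'])
    rels = {}
    for child in node:
        role = child.get('role')
        if role:
            # relation: nested entity nodes, grouped by role then relation name
            related = rels.setdefault(role, {}).setdefault(child.tag, [])
            related += parse_children(child)
        elif child.text:
            # attribute with a value
            item[child.tag] = child.text
        else:
            # empty tag means a None attribute value
            item[child.tag] = None
    return item, rels


# Made-up sample document, only meant to exercise the three branches above.
SAMPLE = """
<CWUser eid="5" cwuri="http://example.org/5">
  <login>alice</login>
  <firstname/>
  <in_group role="subject">
    <CWGroup eid="7" cwuri="http://example.org/7"/>
  </in_group>
</CWUser>
""".strip()

item, rels = build_item_sketch(ET.fromstring(SAMPLE))
print(item)
print(rels)
```

# In this sketch `item` carries the node attributes plus 'login' and a None
# 'firstname', while `rels` nests the related CWGroup item under
# rels['subject']['in_group'] as an (item, rels) pair.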


class CWEntityXMLActionCopy(Component):
    """implementation of cubicweb entity xml parser's 'copy' action

    Takes no option.
    """
    __regid__ = 'cw.entityxml.action.copy'

    def __init__(self, _cw, parser, rtype, role, entity=None, **kwargs):
        super(CWEntityXMLActionCopy, self).__init__(_cw, **kwargs)
        self.parser = parser
        self.rtype = rtype
        self.role = role
        self.entity = entity

    @classproperty
    def action(cls):
        return cls.__regid__.rsplit('.', 1)[-1]

    def check_options(self, options, eid):
        self._check_no_options(options, eid)

    def _check_no_options(self, options, eid, msg=None):
        if options:
            if msg is None:
                msg = self._cw._("'%s' action doesn't take any options") % self.action
            raise ValidationError(eid, {rn('options', 'subject'): msg})

    def link_items(self, others, rules):
        assert not any(x[1] for x in rules), "'copy' action takes no option"
        ttypes = frozenset([x[0] for x in rules])
        eids = []  # local eids
        for item, rels in others:
            if item['cwtype'] in ttypes:
                item, rels = self.parser.complete_item(item, rels)
                other_entity = self.parser.process_item(item, rels)
                if other_entity is not None:
                    eids.append(other_entity.eid)
        if eids:
            self._set_relation(eids)
        else:
            self._clear_relation(ttypes)

    def _clear_relation(self, ttypes):
        if not self.parser.created_during_pull(self.entity):
            if len(ttypes) > 1:
                typerestr = ', Y is IN(%s)' % ','.join(ttypes)
            else:
                typerestr = ', Y is %s' % ','.join(ttypes)
            self._cw.execute('DELETE ' + rtype_role_rql(self.rtype, self.role) + typerestr,
                             {'x': self.entity.eid})

    def _set_relation(self, eids):
        assert eids
        rtype = self.rtype
        rqlbase = rtype_role_rql(rtype, self.role)
        eidstr = ','.join(str(eid) for eid in eids)
        self._cw.execute('DELETE %s, NOT Y eid IN (%s)' % (rqlbase, eidstr),
                         {'x': self.entity.eid})
        if self.role == 'object':
            rql = 'SET %s, Y eid IN (%s), NOT Y %s X' % (rqlbase, eidstr, rtype)
        else:
            rql = 'SET %s, Y eid IN (%s), NOT X %s Y' % (rqlbase, eidstr, rtype)
        self._cw.execute(rql, {'x': self.entity.eid})
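
# Illustration (not part of the module): _set_relation() first deletes the
# relations that point to entities outside `eids`, then sets the ones that are
# still missing. The sketch below only builds the two RQL strings so they can
# be inspected without a repository. rtype_role_rql_sketch() is an assumption
# about what the rtype_role_rql() helper returns (its definition is not shown
# in this section), and set_relation_queries() is a hypothetical name.

```python
def rtype_role_rql_sketch(rtype, role):
    """Assumed shape of the base restriction; the real helper lives elsewhere."""
    if role == 'object':
        return 'Y %s X WHERE X eid %%(x)s' % rtype
    return 'X %s Y WHERE X eid %%(x)s' % rtype


def set_relation_queries(rtype, role, eids):
    """Return the (delete, set) RQL strings, composed as in _set_relation()."""
    assert eids
    rqlbase = rtype_role_rql_sketch(rtype, role)
    eidstr = ','.join(str(eid) for eid in eids)
    # drop relations to entities that are no longer in the feed
    delete_rql = 'DELETE %s, NOT Y eid IN (%s)' % (rqlbase, eidstr)
    # add only the relations that do not exist yet
    if role == 'object':
        set_rql = 'SET %s, Y eid IN (%s), NOT Y %s X' % (rqlbase, eidstr, rtype)
    else:
        set_rql = 'SET %s, Y eid IN (%s), NOT X %s Y' % (rqlbase, eidstr, rtype)
    return delete_rql, set_rql


delete_rql, set_rql = set_relation_queries('in_group', 'subject', [7, 9])
print(delete_rql)
# DELETE X in_group Y WHERE X eid %(x)s, NOT Y eid IN (7,9)
print(set_rql)
# SET X in_group Y WHERE X eid %(x)s, Y eid IN (7,9), NOT X in_group Y
```

# The %(x)s placeholder is left in place on purpose: in the method above it is
# bound through the {'x': self.entity.eid} argument passed to execute().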


class CWEntityXMLActionLink(CWEntityXMLActionCopy):
    """implementation of cubicweb entity xml parser's 'link' action

    Requires a 'linkattr' option to control search of the linked entity.
    """
    __regid__ = 'cw.entityxml.action.link'

    def check_options(self, options, eid):
        if 'linkattr' not in options:
            msg = self._cw._("'%s' action requires 'linkattr' option") % self.action
            raise ValidationError(eid, {rn('options', 'subject'): msg})

    create_when_not_found = False

    def link_items(self, others, rules):
        for ttype, options in rules:
            searchattrs = splitstrip(options.get('linkattr', ''))
            self._related_link(ttype, others, searchattrs)

    def _related_link(self, ttype, others, searchattrs):
        def issubset(x, y):
            return all(z in y for z in x)
        eids = []  # local eids
        log = self.parser.import_log
        for item, rels in others:
            if item['cwtype'] != ttype:
                continue
            if not issubset(searchattrs, item):
                item, rels = self.parser.complete_item(item, rels)
                if not issubset(searchattrs, item):
                    log.record_error('missing attribute, got %s expected keys %s'
                                     % (item, searchattrs))
                    continue
            # XXX str() needed with python < 2.6
            kwargs = dict((str(attr), item[attr]) for attr in searchattrs)
            targets = self._find_entities(item, kwargs)
            if len(targets) == 1:
                entity = targets[0]
            elif not targets and self.create_when_not_found:
                entity = self._cw.create_entity(item['cwtype'], **kwargs)
            else:
                if len(targets) > 1:
                    log.record_error('ambiguous link: found %s entity %s with attributes %s'
                                     % (len(targets), item['cwtype'], kwargs))
                else:
                    log.record_error('can not find %s entity with attributes %s'
                                     % (item['cwtype'], kwargs))
                continue
            eids.append(entity.eid)
            self.parser.process_relations(entity, rels)
        if eids:
            self._set_relation(eids)
        else:
            self._clear_relation((ttype,))

    def _find_entities(self, item, kwargs):
        return tuple(self._cw.find_entities(item['cwtype'], **kwargs))
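
# Illustration (not part of the module): the decision taken for each related
# item by the 'link' action can be sketched without a repository. Here `known`
# is a hypothetical in-memory list of dicts standing in for the local entities
# that _find_entities() would search, and resolve_target() is a made-up name.
# The sketch skips the complete_item() retry done above when a search attribute
# is missing, and returns an outcome instead of recording errors in the log.

```python
def issubset(x, y):
    return all(z in y for z in x)


def resolve_target(item, searchattrs, known, create_when_not_found=False):
    """Return (outcome, payload) for one related item."""
    if not issubset(searchattrs, item):
        return 'error', 'missing attribute, got %s expected keys %s' % (
            sorted(item), searchattrs)
    # restrict the search to the attributes named by 'linkattr'
    kwargs = dict((attr, item[attr]) for attr in searchattrs)
    targets = [entity for entity in known
               if entity['cwtype'] == item['cwtype']
               and all(entity.get(key) == value for key, value in kwargs.items())]
    if len(targets) == 1:
        return 'linked', targets[0]
    if not targets and create_when_not_found:
        return 'created', dict(kwargs, cwtype=item['cwtype'])
    if len(targets) > 1:
        return 'error', 'ambiguous link: found %s entity %s' % (
            len(targets), item['cwtype'])
    return 'error', 'can not find %s entity' % item['cwtype']


known = [{'cwtype': 'CWGroup', 'name': 'users', 'eid': 7},
         {'cwtype': 'CWGroup', 'name': 'guests', 'eid': 8}]

outcome, payload = resolve_target(
    {'cwtype': 'CWGroup', 'name': 'users'}, ['name'], known)
print(outcome, payload)
```

# Exactly one match links to the existing entity; no match is an error unless
# create_when_not_found is set; several matches are always reported as
# ambiguous.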


class CWEntityXMLActionLinkInState(CWEntityXMLActionLink):
    """custom implementation of cubicweb entity xml parser's 'link' action for
    in_state relation
    """
    __select__ = match_rtype('in_state')

    def check_options(self, options, eid):
        super(CWEntityXMLActionLinkInState, self).check_options(options, eid)
        if 'name' not in options['linkattr']:
            msg = self._cw._("'%s' action for in_state relation should at least "
                             "have 'linkattr=name' option") % self.action
            raise ValidationError(eid, {rn('options', 'subject'): msg})
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
485 |
|
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
486 |
def _find_entities(self, item, kwargs): |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
487 |
assert 'name' in item # XXX else, complete_item |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
488 |
state_name = item['name'] |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
489 |
wf = self.entity.cw_adapt_to('IWorkflowable').current_workflow |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
490 |
state = wf.state_by_name(state_name) |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
491 |
if state is None: |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
492 |
return () |
935423529f45
[datafeed cw parser] refactor: split logic that was in the parser
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
7534
diff
changeset
|
493 |
return (state,) |


class CWEntityXMLActionLinkOrCreate(CWEntityXMLActionLink):
    """implementation of cubicweb entity xml parser's 'link-or-create' action

    requires a 'linkattr' option to control search of the linked entity.
    """
    __regid__ = 'cw.entityxml.action.link-or-create'
    create_when_not_found = True