sobjects/textparsers.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Thu, 08 Apr 2010 13:38:36 +0200
changeset 5195 e8235b2789fc
parent 4252 6c4f109c2b03
child 5421 8167de96c523
permissions -rw-r--r--
the move-pdf-export-to-cubes patch has been imported
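changesets contributing lines to this file:
3860 2e7d399ee075 Sylvain Thénault <sylvain.thenault@logilab.fr>: add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
4004 c52619c738a5 Sandrine Ribeau <sandrine.ribeau@logilab.fr>: api renaming update (parents: 3860)
4056 f4634710e20c Sylvain Thénault <sylvain.thenault@logilab.fr>: api update (parents: 4004)
4212 ab6573088b4a Adrien Di Mascio <Adrien.DiMascio@logilab.fr>: update copyright: welcome 2010 (parents: 3860)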
"""hooks triggered on email entities creation:

* look for state change instruction (XXX security)
* set email content as a comment on an entity when comments are supported and
  linking information is found

:organization: Logilab
:copyright: 2003-2010 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
:contact: http://www.logilab.fr/ -- mailto:contact@logilab.fr
"""
__docformat__ = "restructuredtext en"

import re

from cubicweb import UnknownEid, typed_eid
from cubicweb.view import Component

# XXX use user session if gpg signature validated

class TextAnalyzer(Component):
    """analyze and extract information from plain text by calling registered
    text parsers
    """
    __regid__ = 'textanalyzer'

    def parse(self, caller, text):
        for parsercls in self._cw.vreg['components'].get('textparser', ()):
            parsercls(self._cw).parse(caller, text)


class TextParser(Component):
    """base class for text parsers, responsible for extracting some information
    from plain text. When something is done, it usually calls the

      .fire_event(something, {event args})

    method on the caller.
    """
    __regid__ = 'textparser'
    __abstract__ = True

    def parse(self, caller, text):
        raise NotImplementedError


class ChangeStateTextParser(TextParser):
    """search some text for change state instructions of the form

         :<transition name>: #?<eid>
    """
    # raw string, so that \w, \s and \d reach the regex engine unchanged
    instr_rgx = re.compile(r':(\w+):\s*#?(\d+)', re.U)

    def parse(self, caller, text):
        for trname, eid in self.instr_rgx.findall(text):
            try:
                entity = self._cw.entity_from_eid(typed_eid(eid))
            except UnknownEid:
                self.error("can't get entity with eid %s", eid)
                continue
            # skip entities that have no in_state attribute
            if not hasattr(entity, 'in_state'):
                self.error('bad change state instruction for eid %s', eid)
                continue
            tr = entity.current_workflow and entity.current_workflow.transition_by_name(trname)
            if tr and tr.may_be_fired(entity.eid):
                try:
                    trinfo = entity.fire_transition(tr)
                    caller.fire_event('state-changed', {'trinfo': trinfo,
                                                        'entity': entity})
                except Exception:
                    # log the failure and go on with the next instruction
                    self.exception('while changing state of %s', entity)
            else:
                self.error("can't pass transition %s on entity %s",
                           trname, entity)
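
A minimal sketch of the instruction syntax handled by ChangeStateTextParser, runnable without CubicWeb. The pattern is the same as instr_rgx, written here as a raw string. RecordingCaller, scan_instructions and the 'instruction-found' event name are hypothetical stand-ins, not part of CubicWeb: they only illustrate a caller exposing fire_event(event, args) and the matching loop of parse. No entity lookup or workflow transition is performed.

```python
import re

# Same pattern as ChangeStateTextParser.instr_rgx, as a raw string.
INSTR_RGX = re.compile(r':(\w+):\s*#?(\d+)', re.U)


class RecordingCaller(object):
    """Hypothetical caller: it only records the events it is asked to fire."""

    def __init__(self):
        self.events = []

    def fire_event(self, event, args):
        self.events.append((event, args))


def scan_instructions(caller, text):
    """Hypothetical helper: forward each ':<transition name>: #?<eid>' match.

    Unlike ChangeStateTextParser.parse, nothing is looked up and no
    transition is fired; the raw match is simply reported to the caller
    under an event name invented for this example.
    """
    for trname, eid in INSTR_RGX.findall(text):
        caller.fire_event('instruction-found',
                          {'transition': trname, 'eid': int(eid)})


caller = RecordingCaller()
# The '#' before the eid and the whitespace after the second ':' are optional.
scan_instructions(caller, "fixed in rev 12\n:resolve: #1234\n:close:42")
for event, args in caller.events:
    print(event, sorted(args.items()))
```

Since the pattern accepts any word between two colons followed by digits, text such as a timestamp-like ":30:15" would also be picked up as an instruction; in the real parser such a match is then rejected or logged by the entity and transition checks.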