cubicweb/sobjects/textparsers.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Wed, 09 Nov 2016 11:42:33 +0100
branch3.24
changeset 11811 f09efeead7f9
parent 11767 432f87a63057
child 12355 c703dc95c82e
permissions -rw-r--r--
Fix broken flake8 configuration and flake8 errors which were hidden by this breakage. flake8 --filename options doesn't work as expected: * it's expected to be a shell pattern, using stdlib's fnmatch.fnmatch function internally. This funciton thinks that 'cubicweb/x.py' doesn't match 'cubicweb/x.py' (there must be a reason but that's not the point), hence no file was actually checked ; * as this is a list of pattern, each encountered file is checked against each pattern, leading to run time explosion. So maintain list of files to check in a separated file and give this list to flake8 using unix's xarg command.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
7815
2a164a9cf81c [exceptions] stop catching any exception in various places (closes #1942716)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6911
diff changeset
     1
# copyright 2003-2011 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     3
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     4
# This file is part of CubicWeb.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     5
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
     9
# any later version.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    10
#
5424
8ecbcbff9777 replace logilab-common by CubicWeb in disclaimer
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5421
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    14
# details.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    15
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
6911
75849076fd6c cleanup, docstring update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5556
diff changeset
    18
"""Some parsers to detect action to do from text
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    19
6911
75849076fd6c cleanup, docstring update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5556
diff changeset
    20
Currently only a parser to look for state change instruction is provided.
75849076fd6c cleanup, docstring update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5556
diff changeset
    21
Take care to security when you're using it, think about the user that
75849076fd6c cleanup, docstring update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5556
diff changeset
    22
will provide the text to analyze...
75849076fd6c cleanup, docstring update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5556
diff changeset
    23
"""
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    24
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    25
import re
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    26
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    27
from cubicweb import UnknownEid
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    28
from cubicweb.view import Component
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    29
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    30
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    31
class TextAnalyzer(Component):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    32
    """analyze and extract information from plain text by calling registered
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    33
    text parsers
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    34
    """
4004
c52619c738a5 api renaming update
Sandrine Ribeau <sandrine.ribeau@logilab.fr>
parents: 3860
diff changeset
    35
    __regid__ = 'textanalyzer'
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    36
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    37
    def parse(self, caller, text):
4056
f4634710e20c api update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4004
diff changeset
    38
        for parsercls in self._cw.vreg['components'].get('textparser', ()):
f4634710e20c api update
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4004
diff changeset
    39
            parsercls(self._cw).parse(caller, text)
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    40
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    41
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    42
class TextParser(Component):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    43
    """base class for text parser, responsible to extract some information
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    44
    from plain text. When something is done, it usually call the
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    45
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    46
      .fire_event(something, {event args})
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    47
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    48
    method on the caller.
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    49
    """
4004
c52619c738a5 api renaming update
Sandrine Ribeau <sandrine.ribeau@logilab.fr>
parents: 3860
diff changeset
    50
    __regid__ = 'textparser'
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    51
    __abstract__ = True
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    52
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    53
    def parse(self, caller, text):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    54
        raise NotImplementedError
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    55
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    56
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    57
class ChangeStateTextParser(TextParser):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    58
    """search some text for change state instruction in the form
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    59
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    60
         :<transition name>: #?<eid>
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    61
    """
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    62
    instr_rgx = re.compile(':(\w+):\s*#?(\d+)', re.U)
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    63
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    64
    def parse(self, caller, text):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    65
        for trname, eid in self.instr_rgx.findall(text):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    66
            try:
8748
f5027f8d2478 drop typed_eid() in favour of int() (closes #2742462)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    67
                entity = self._cw.entity_from_eid(int(eid))
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    68
            except UnknownEid:
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    69
                self.error("can't get entity with eid %s", eid)
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    70
                continue
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    71
            if not hasattr(entity, 'in_state'):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    72
                self.error('bad change state instruction for eid %s', eid)
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    73
                continue
5556
9ab2b4c74baf [entity] introduce a new 'adapters' registry
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    74
            iworkflowable = entity.cw_adapt_to('IWorkflowable')
9ab2b4c74baf [entity] introduce a new 'adapters' registry
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    75
            if iworkflowable.current_workflow:
9ab2b4c74baf [entity] introduce a new 'adapters' registry
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    76
                tr = iworkflowable.current_workflow.transition_by_name(trname)
9ab2b4c74baf [entity] introduce a new 'adapters' registry
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    77
            else:
9ab2b4c74baf [entity] introduce a new 'adapters' registry
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    78
                tr = None
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    79
            if tr and tr.may_be_fired(entity.eid):
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    80
                try:
5556
9ab2b4c74baf [entity] introduce a new 'adapters' registry
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
    81
                    trinfo = iworkflowable.fire_transition(tr)
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    82
                    caller.fire_event('state-changed', {'trinfo': trinfo,
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    83
                                                        'entity': entity})
7815
2a164a9cf81c [exceptions] stop catching any exception in various places (closes #1942716)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6911
diff changeset
    84
                except Exception:
3860
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    85
                    self.exception('while changing state of %s', entity)
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    86
            else:
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    87
                self.error("can't pass transition %s on entity %s",
2e7d399ee075 add textparser object, designed to trigger some actions from textual content such as email or checkin-message (unused in the library itself yet, see email cube)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents:
diff changeset
    88
                           trname, entity)