doc/book/en/devrepo/repo/hooks.rst
branch stable
changeset 5394 105011657405
parent 5318 81cd2540c7d2
child 6147 95c604ec89bf
child 6152 6824f8b61098
5393:875bdc0fe8ce 5394:105011657405
       
     1 .. -*- coding: utf-8 -*-
       
     2 
       
     3 .. _hooks:
       
     4 
       
     5 Hooks and Operations
       
     6 ====================
       
     7 
       
     8 Generalities
       
     9 ------------
       
    10 
       
    11 Paraphrasing the `emacs`_ documentation, let us say that hooks are an
       
    12 important mechanism for customizing an application. A hook is
       
    13 basically a list of functions to be called on some well-defined
       
    14 occasion (this is called `running the hook`).
       
    15 
       
    16 .. _`emacs`: http://www.gnu.org/software/emacs/manual/html_node/emacs/Hooks.html
       
    17 
       
    18 In CubicWeb, hooks are subclasses of the Hook class in
       
     19 `server/hook.py`, implementing their own `__call__` method, and selected
       
    20 over a set of pre-defined `events` (and possibly more conditions,
       
    21 hooks being selectable AppObjects like views and components).
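
The idea of a hook as a list of functions run on a named event can be
sketched in plain Python. This is a toy registry for illustration only,
not the CubicWeb implementation; `register` and `run_hook` are
hypothetical names.

.. sourcecode:: python

   from collections import defaultdict

   # Toy registry (an assumption, not CubicWeb code): event name -> callables.
   _hooks = defaultdict(list)

   def register(event, func):
       """Attach func to the list of functions run on event."""
       _hooks[event].append(func)

   def run_hook(event, **kwargs):
       """Run the hook: call every registered function, in registration order."""
       return [func(**kwargs) for func in _hooks[event]]

   register('before_add_entity', lambda entity: 'checked %s' % entity)
   register('before_add_entity', lambda entity: 'logged %s' % entity)
   results = run_hook('before_add_entity', entity='Person')

CubicWeb replaces the explicit `register` call with selection: a Hook
subclass declares its `events` and `__select__`, and the registry picks
the matching hooks.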
       
    22 
       
    23 There are two families of events: data events and server events. In a
       
    24 typical application, most of the Hooks are defined over data
       
    25 events.
       
    26 
       
    27 The purpose of data hooks is to complement the data model as defined
       
    28 in the schema.py, which is static by nature, with dynamic or value
       
     29 driven behaviours. A data hook is functionally equivalent to a
       
     30 `database trigger`_, except that trigger definition languages are not
       
     31 standardized, hence not portable (for instance, Oracle's PL/SQL and
       
     32 PostgreSQL's PL/pgSQL differ, and neither runs on SQL Server or SQLite).
       
    33 
       
    34 .. _`database trigger`: http://en.wikipedia.org/wiki/Database_trigger
       
    35 
       
    36 Data hooks can serve the following purposes:
       
    37 
       
    38 * enforcing constraints that the static schema cannot express
       
    39   (spanning several entities/relations, exotic value ranges and
       
    40   cardinalities, etc.)
       
    41 
       
     42 * implementing computed attributes
       
    43 
       
    44 Operations are Hook-like objects that may be created by Hooks and
       
    45 scheduled to happen just before (or after) the `commit` event. Hooks
       
     46 being fired immediately on data operations, it is sometimes necessary
       
     47 to delay the actual work until all other Hooks have run, for
       
     48 instance for a validation check that requires all relations to be
       
     49 already set on an entity. Also, while the order of execution of Hooks
       
     50 is data dependent (and thus hard to predict), it is possible to force
       
    51 an order on Operations.
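
The deferral described above can be illustrated with a minimal sketch.
`ToyTransaction` and `hook_on_relation_added` are hypothetical stand-ins,
not CubicWeb classes; they only show that work scheduled by hooks runs at
commit time, after every hook has fired.

.. sourcecode:: python

   class ToyTransaction(object):
       """Hypothetical stand-in for a session; not a CubicWeb class."""

       def __init__(self):
           self.pending = []  # operations scheduled by hooks
           self.log = []      # records the order in which things happen

       def add_operation(self, op):
           self.pending.append(op)

       def commit(self):
           # Operations run here, in scheduling order, once all hooks have fired.
           for op in self.pending:
               op(self)
           self.pending = []

   def hook_on_relation_added(txn, eid):
       # The hook fires immediately but postpones its check until commit.
       txn.log.append('hook %d' % eid)
       txn.add_operation(lambda t: t.log.append('check %d' % eid))

   txn = ToyTransaction()
   hook_on_relation_added(txn, 1)
   hook_on_relation_added(txn, 2)
   txn.commit()

After `commit()`, the log shows both hooks first and both checks last.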
       
    52 
       
    53 Operations also may be used to process various side effects associated
       
     54 with a transaction such as filesystem updates, mail notifications,
       
    55 etc.
       
    56 
       
    57 Operations are subclasses of the Operation class in `server/hook.py`,
       
    58 implementing `precommit_event` and other standard methods (wholly
       
    59 described in :ref:`operations_api`).
       
    60 
       
    61 Events
       
    62 ------
       
    63 
       
    64 Hooks are mostly defined and used to handle `dataflow`_ operations. It
       
    65 means as data gets in (entities added, updated, relations set or
       
    66 unset), specific events are issued and the Hooks matching these events
       
    67 are called.
       
    68 
       
    69 .. _`dataflow`: http://en.wikipedia.org/wiki/Dataflow
       
    70 
       
     71 Below is a list of the dataflow events related to entity operations:
       
    72 
       
    73 * before_add_entity
       
    74 
       
    75 * before_update_entity
       
    76 
       
    77 * before_delete_entity
       
    78 
       
    79 * after_add_entity
       
    80 
       
    81 * after_update_entity
       
    82 
       
    83 * after_delete_entity
       
    84 
       
     85 These define ENTITY HOOKS. RELATION HOOKS are defined
       
    86 over the following events:
       
    87 
       
    88 * after_add_relation
       
    89 
       
    90 * after_delete_relation
       
    91 
       
    92 * before_add_relation
       
    93 
       
    94 * before_delete_relation
       
    95 
       
    96 This is an occasion to remind us that relations support the add/delete
       
    97 operation, but no update.
       
    98 
       
     99 Non-data events also exist. Hooks over them are called SYSTEM HOOKS.
       
   100 
       
   101 * server_startup
       
   102 
       
   103 * server_shutdown
       
   104 
       
   105 * server_maintenance
       
   106 
       
   107 * server_backup
       
   108 
       
   109 * server_restore
       
   110 
       
   111 * session_open
       
   112 
       
   113 * session_close
       
   114 
       
   115 
       
   116 Using dataflow Hooks
       
   117 --------------------
       
   118 
       
   119 Dataflow hooks either automate data operations or maintain the
       
    120 consistency of the data model. In the latter case, we must use a
       
    121 specific exception named ValidationError.
       
   122 
       
   123 Validation Errors
       
   124 ~~~~~~~~~~~~~~~~~
       
   125 
       
   126 When a condition is not met in a Hook/Operation, it must raise a
       
   127 `ValidationError`. Raising anything but a (subclass of)
       
   128 ValidationError is a programming error. Raising a ValidationError
       
   129 entails aborting the current transaction.
       
   130 
       
   131 The ValidationError exception is used to convey enough information up
       
   132 to the user interface. Hence its constructor is different from the
       
   133 default Exception constructor. It accepts, positionally:
       
   134 
       
   135 * an entity eid,
       
   136 
       
   137 * a dict whose keys represent attribute (or relation) names and values
       
   138   an end-user facing message (hence properly translated) relating the
       
   139   problem.
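
The shape of these two positional arguments can be shown with a
stand-in exception. `ToyValidationError` and its `eid` and `errors`
attribute names are assumptions made for this sketch; refer to the real
`cubicweb.ValidationError` for the actual attributes.

.. sourcecode:: python

   class ToyValidationError(Exception):
       """Hypothetical stand-in mirroring the (eid, errors dict) signature."""

       def __init__(self, eid, errors):
           Exception.__init__(self, eid, errors)
           self.eid = eid        # the entity the error relates to
           self.errors = errors  # {attribute or relation name: message}

   def check_age(eid, age):
       if not 0 <= age <= 120:
           raise ToyValidationError(eid, {'age': 'age must be between 0 and 120'})

   try:
       check_age(42, 130)
   except ToyValidationError as exc:
       caught = (exc.eid, sorted(exc.errors))

Keying the messages by attribute name is what lets a user interface
display each message next to the offending form field.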
       
   140 
       
   141 An entity hook
       
   142 ~~~~~~~~~~~~~~
       
   143 
       
   144 We will use a very simple example to show hooks usage. Let us start
       
   145 with the following schema.
       
   146 
       
   147 .. sourcecode:: python
       
   148 
       
   149    class Person(EntityType):
       
   150        age = Int(required=True)
       
   151 
       
   152 We would like to add a range constraint over a person's age. Let's
       
    153 write a hook. It shall be placed into mycube/hooks.py. If this file
       
   154 were to grow too much, we can easily have a mycube/hooks/... package
       
   155 containing hooks in various modules.
       
   156 
       
   157 .. sourcecode:: python
       
   158 
       
   159    from cubicweb import ValidationError
       
   160    from cubicweb.selectors import implements
       
   161    from cubicweb.server.hook import Hook
       
   162 
       
   163    class PersonAgeRange(Hook):
       
   164         __regid__ = 'person_age_range'
       
   165         events = ('before_add_entity', 'before_update_entity')
       
   166         __select__ = Hook.__select__ & implements('Person')
       
   167 
       
   168         def __call__(self):
       
    169             if 0 <= self.entity.age <= 120:
       
    170                 return
       
   171             msg = self._cw._('age must be between 0 and 120')
       
   172             raise ValidationError(self.entity.eid, {'age': msg})
       
   173 
       
   174 Hooks being AppObjects like views, they have a __regid__ and a
       
   175 __select__ class attribute. The base __select__ is augmented with an
       
   176 `implements` selector matching the desired entity type. The `events`
       
   177 tuple is used by the Hook.__select__ base selector to dispatch the
       
   178 hook on the right events. In an entity hook, it is possible to
       
   179 dispatch on any entity event (e.g. 'before_add_entity',
       
   180 'before_update_entity') at once if needed.
       
   181 
       
   182 Like all appobjects, hooks have the `self._cw` attribute which
       
   183 represents the current session. In entity hooks, a `self.entity`
       
   184 attribute is also present.
       
   185 
       
   186 
       
   187 A relation hook
       
   188 ~~~~~~~~~~~~~~~
       
   189 
       
   190 Let us add another entity type with a relation to person (in
       
   191 mycube/schema.py).
       
   192 
       
   193 .. sourcecode:: python
       
   194 
       
   195    class Company(EntityType):
       
   196         name = String(required=True)
       
   197         boss = SubjectRelation('Person', cardinality='1*')
       
   198 
       
   199 We would like to constrain the company's bosses to have a minimum
       
    200 (legal) age. Let's write a hook for this, which will be fired when
       
   201 the `boss` relation is established.
       
   202 
       
   203 .. sourcecode:: python
       
   204 
       
   205    class CompanyBossLegalAge(Hook):
       
   206         __regid__ = 'company_boss_legal_age'
       
   207         events = ('before_add_relation',)
       
   208         __select__ = Hook.__select__ & match_rtype('boss')
       
   209 
       
   210         def __call__(self):
       
   211             boss = self._cw.entity_from_eid(self.eidto)
       
   212             if boss.age < 18:
       
   213                 msg = self._cw._('the minimum age for a boss is 18')
       
   214                 raise ValidationError(self.eidfrom, {'boss': msg})
       
   215 
       
    216 We use the `match_rtype` selector (from `cubicweb.server.hook`) to select the proper relation type.
       
   217 
       
   218 The essential difference with respect to an entity hook is that there
       
   219 is no self.entity, but `self.eidfrom` and `self.eidto` hook attributes
       
   220 which represent the subject and object eid of the relation.
       
   221 
       
   222 
       
   223 Using Operations
       
   224 ----------------
       
   225 
       
   226 Let's augment our example with a new `subsidiary_of` relation on Company.
       
   227 
       
   228 .. sourcecode:: python
       
   229 
       
   230    class Company(EntityType):
       
   231         name = String(required=True)
       
   232         boss = SubjectRelation('Person', cardinality='1*')
       
   233         subsidiary_of = SubjectRelation('Company', cardinality='*?')
       
   234 
       
   235 Base example
       
   236 ~~~~~~~~~~~~
       
   237 
       
   238 We would like to check that there is no cycle by the `subsidiary_of`
       
   239 relation. This is best achieved in an Operation since all relations
       
   240 are likely to be set at commit time.
       
   241 
       
   242 .. sourcecode:: python
       
   243 
       
    244     def check_cycle(session, eid, rtype, role='subject'):
       
   245         parents = set([eid])
       
   246         parent = session.entity_from_eid(eid)
       
    247         while parent.related(rtype, role, entities=True):
       
    248             parent = parent.related(rtype, role, entities=True)[0]
       
   249             if parent.eid in parents:
       
    250                 msg = session._('detected %s cycle') % rtype
       
   251                 raise ValidationError(eid, {rtype: msg})
       
   252             parents.add(parent.eid)
       
   253 
       
   254     class CheckSubsidiaryCycleOp(Operation):
       
   255 
       
   256         def precommit_event(self):
       
   257             check_cycle(self.session, self.eidto, 'subsidiary_of')
       
   258 
       
   259 
       
   260     class CheckSubsidiaryCycleHook(Hook):
       
   261         __regid__ = 'check_no_subsidiary_cycle'
       
   262         events = ('after_add_relation',)
       
   263         __select__ = Hook.__select__ & match_rtype('subsidiary_of')
       
   264 
       
   265         def __call__(self):
       
   266             CheckSubsidiaryCycleOp(self._cw, eidto=self.eidto)
       
   267 
       
   268 The operation is instantiated in the Hook.__call__ method.
       
   269 
       
   270 An operation always takes a session object as first argument
       
   271 (accessible as `.session` from the operation instance), and optionally
       
   272 all keyword arguments needed by the operation. These keyword arguments
       
   273 will be accessible as attributes from the operation instance.
       
   274 
       
   275 Like in Hooks, ValidationError can be raised in Operations. Other
       
   276 exceptions are programming errors.
       
   277 
       
   278 Notice how our hook will instantiate an operation each time the Hook
       
   279 is called, i.e. each time the `subsidiary_of` relation is set.
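
The traversal performed by `check_cycle` can be tried without a CubicWeb
instance. The sketch below replaces entities with a plain dictionary
mapping each eid to its parent eid; `has_cycle` and `parent_of` are
hypothetical names introduced for this illustration.

.. sourcecode:: python

   def has_cycle(parent_of, start):
       """Follow start -> parent -> grandparent; True if an eid is seen twice."""
       seen = set([start])
       current = start
       while current in parent_of:
           current = parent_of[current]
           if current in seen:
               return True
           seen.add(current)
       return False

   acyclic = {1: 2, 2: 3}        # 1 is a subsidiary of 2, 2 of 3
   cyclic = {1: 2, 2: 3, 3: 1}   # 3 is (wrongly) a subsidiary of 1

Because the `subsidiary_of` cardinality allows at most one parent per
company, following a single chain of parents is enough to find a cycle.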
       
   280 
       
   281 Using set_operation
       
   282 ~~~~~~~~~~~~~~~~~~~
       
   283 
       
   284 There is an alternative method to schedule an Operation from a Hook,
       
   285 using the `set_operation` function.
       
   286 
       
   287 .. sourcecode:: python
       
   288 
       
   289    from cubicweb.server.hook import set_operation
       
   290 
       
   291    class CheckSubsidiaryCycleHook(Hook):
       
   292        __regid__ = 'check_no_subsidiary_cycle'
       
   293        events = ('after_add_relation',)
       
   294        __select__ = Hook.__select__ & match_rtype('subsidiary_of')
       
   295 
       
   296        def __call__(self):
       
   297            set_operation(self._cw, 'subsidiary_cycle_detection', self.eidto,
       
   298                          CheckSubsidiaryCycleOp, rtype=self.rtype)
       
   299 
       
   300    class CheckSubsidiaryCycleOp(Operation):
       
   301 
       
   302        def precommit_event(self):
       
    303            for eid in self.session.transaction_data['subsidiary_cycle_detection']:
       
   304                check_cycle(self.session, eid, self.rtype)
       
   305 
       
   306 Here, we call set_operation with a session object, a specially forged
       
   307 key, a value that is the actual payload of an individual operation (in
       
    308 our case, the object of the subsidiary_of relation), the class of the
       
    309 Operation, and more optional parameters to give to the operation (here
       
    310 the rtype, which does not vary across operations).
       
   311 
       
   312 The body of the operation must then iterate over the values that have
       
   313 been mapped in the transaction_data dictionary to the forged key.
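
The accumulation pattern can be sketched as follows. `toy_set_operation`
is a hypothetical helper, not the CubicWeb function, and storing the
values in a set is an assumption of this sketch; the point is that many
hook calls lead to a single scheduled operation.

.. sourcecode:: python

   def toy_set_operation(transaction_data, key, value, scheduled, opname):
       """Collect value under key; schedule opname only on first use of key."""
       if key not in transaction_data:
           transaction_data[key] = set()
           scheduled.append(opname)
       transaction_data[key].add(value)

   transaction_data = {}
   scheduled = []
   # Four hook calls, one of them on an eid that was already recorded.
   for eid in (10, 11, 10, 12):
       toy_set_operation(transaction_data, 'subsidiary_cycle_detection', eid,
                         scheduled, 'CheckSubsidiaryCycleOp')

Here a single operation is scheduled and it will iterate over three
distinct eids at commit time.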
       
   314 
       
   315 This mechanism is especially useful on two occasions (not shown in our
       
   316 example):
       
   317 
       
   318 * massive data import (reduced memory consumption within a large
       
   319   transaction)
       
   320 
       
   321 * when several hooks need to instantiate the same operation (e.g. an
       
   322   entity and a relation hook).
       
   323 
       
   324 .. note::
       
   325 
       
   326   A more realistic example can be found in the advanced tutorial
       
   327   chapter :ref:`adv_tuto_security_propagation`.
       
   328 
       
   329 .. _operations_api:
       
   330 
       
   331 Operation: a small API overview
       
   332 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       
   333 
       
   334 .. autoclass:: cubicweb.server.hook.Operation
       
   335 .. autoclass:: cubicweb.server.hook.LateOperation
       
   336 .. autofunction:: cubicweb.server.hook.set_operation
       
   337 
       
   338 Hooks writing rules
       
   339 -------------------
       
   340 
       
    341 Reminder
       
   342 ~~~~~~~~~
       
   343 
       
   344 Never, ever use the `entity.foo = 42` notation to update an entity. It
       
   345 will not work.
       
   346 
       
    347 How to choose between a before and an after event?
       
   348 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       
   349 
       
   350 Before hooks give you access to the old attribute (or relation)
       
   351 values. By definition the database is not yet updated in a before
       
   352 hook.
       
   353 
       
    354 To access old and new values in a before_update_entity hook, one can
       
   355 use the `server.hook.entity_oldnewvalue` function which returns a
       
   356 tuple of the old and new values. This function takes an entity and an
       
   357 attribute name as parameters.
       
   358 
       
   359 In a 'before_add|update_entity' hook the self.entity contains the new
       
   360 values. One is allowed to further modify them before database
       
   361 operations, using the dictionary notation.
       
   362 
       
   363 .. sourcecode:: python
       
   364 
       
   365    self.entity['age'] = 42
       
   366 
       
   367 This is because using self.entity.set_attributes(age=42) will
       
   368 immediately update the database (which does not make sense in a
       
   369 pre-database hook), and will trigger any existing
       
   370 before_add|update_entity hook, thus leading to infinite hook loops or
       
   371 such awkward situations.
       
   372 
       
   373 Beyond these specific cases, updating an entity attribute or relation
       
   374 must *always* be done using `set_attributes` and `set_relations`
       
   375 methods.
       
   376 
       
   377 (Of course, ValidationError will always abort the current transaction,
       
    378 whatever the event).
       
   379 
       
   380 Peculiarities of inlined relations
       
   381 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       
   382 
       
   383 Some relations are defined in the schema as `inlined` (see
       
   384 :ref:`RelationType` for details). In this case, they are inserted in
       
   385 the database at the same time as entity attributes.
       
   386 
       
   387 Hence in the case of before_add_relation, such relations already exist
       
   388 in the database.
       
   389 
       
   390 Edited attributes
       
   391 ~~~~~~~~~~~~~~~~~
       
   392 
       
    393 On updates, it is possible to ask the `entity.edited_attributes`
       
    394 variable whether a given attribute has been updated.
       
   395 
       
   396 .. sourcecode:: python
       
   397 
       
    398   if 'age' not in entity.edited_attributes:
       
   399       return
       
   400 
       
   401 Deleted in transaction
       
   402 ~~~~~~~~~~~~~~~~~~~~~~
       
   403 
       
   404 The session object has a deleted_in_transaction method, which can help
       
   405 writing deletion Hooks.
       
   406 
       
   407 .. sourcecode:: python
       
   408 
       
   409    if self._cw.deleted_in_transaction(self.eidto):
       
   410       return
       
   411 
       
   412 Given this predicate, we can avoid scheduling an operation.
       
   413 
       
   414 Disabling hooks
       
   415 ~~~~~~~~~~~~~~~
       
   416 
       
    417 It is sometimes convenient to disable some hooks, for instance to
       
    418 avoid infinite Hook loops. To do so, use the `hooks_control` context
       
    419 manager.
       
   420 
       
   421 This can be controlled more finely through the `category` Hook class
       
   422 attribute, which is a string.
       
   423 
       
   424 .. sourcecode:: python
       
   425 
       
   426    with hooks_control(self.session, self.session.HOOKS_ALLOW_ALL, <category>):
       
   427        # ... do stuff
       
   428 
       
   429 .. autoclass:: cubicweb.server.session.hooks_control
       
   430 
       
   431 The existing categories are: ``email``, ``syncsession``,
       
    432 ``syncschema``, ``bookmark``, ``security``, ``workflow``,
       
   433 ``metadata``, ``notification``, ``integrity``, ``activeintegrity``.
       
   434 
       
    435 Nothing prevents one from inventing new categories and using the
       
   436 hooks_control context manager to filter them (in or out).
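
The principle of category filtering through a context manager can be
sketched in plain Python. `ToySession`, `toy_hooks_control` and
`should_run` are hypothetical names, and the signature is deliberately
simpler than that of the real `hooks_control` (which also takes a mode
argument); this only shows categories being disabled inside a `with`
block and restored afterwards.

.. sourcecode:: python

   from contextlib import contextmanager

   class ToySession(object):
       """Hypothetical session holding the set of disabled hook categories."""

       def __init__(self):
           self.disabled = set()

   @contextmanager
   def toy_hooks_control(session, *categories):
       """Disable the given categories inside the with block, then restore."""
       previous = set(session.disabled)
       session.disabled.update(categories)
       try:
           yield session
       finally:
           # Restore the previous state even if the block raised.
           session.disabled = previous

   def should_run(session, category):
       return category not in session.disabled

   session = ToySession()
   with toy_hooks_control(session, 'notification'):
       inside = should_run(session, 'notification')
   outside = should_run(session, 'notification')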