[CWEP002] document computed relations and attributes
author Anthony Truchet <anthony.truchet@logilab.fr>
Mon, 10 Feb 2014 11:03:43 +0100
changeset 9972 e210f0e082b0
parent 9971 5e44dd9dde6b
child 9973 bbe05c74eb92
[CWEP002] document computed relations and attributes Related to #3546717.
doc/book/en/devrepo/datamodel/definition.rst
--- a/doc/book/en/devrepo/datamodel/definition.rst	Thu Aug 28 17:41:34 2014 +0200
+++ b/doc/book/en/devrepo/datamodel/definition.rst	Mon Feb 10 11:03:43 2014 +0100
@@ -1,4 +1,4 @@
- .. -*- coding: utf-8 -*-
+.. -*- coding: utf-8 -*-
 
 .. _datamodel_definition:
 
@@ -503,6 +503,210 @@
 
 .. _yams_example:
 
+
+Derived attributes and relations
+--------------------------------
+
+.. note:: **TODO** Check organisation of the whole chapter of the documentation
+
+CubicWeb offers the possibility to *query* data using so-called
+*computed* relations and attributes. Those are *seen* by RQL requests
+as normal attributes and relations but are actually derived from other
+attributes and relations. In a first section we'll informally review
+two typical use cases. Then we'll see how to use computed attributes
+and relations in your schema. Last, we will consider various significant
+aspects of their implementation and the impact on their usage.
+
+Motivating use cases
+~~~~~~~~~~~~~~~~~~~~
+
+Computed (or reified) relations
+```````````````````````````````
+
+One often needs to represent a ternary relation, or a
+family of relations. For example, in the context of an exhibition
+catalog you might want to link all *contributors* to the *work* they
+contributed to, but this contribution can be as *illustrator*,
+*author*, *performer*, ...
+
+The classical way to describe this kind of information within an
+entity-relationship schema is to *reify* the relation, that is, to turn
+the relation into an entity. In our example the schema will have a
+*Contribution* entity type used to represent the family of
+contribution relations.
+
+
+.. sourcecode:: python
+
+    class ArtWork(EntityType):
+        name = String()
+        ...
+
+    class Person(EntityType):
+        name = String()
+        ...
+
+    class Contribution(EntityType):
+        contributor = SubjectRelation('Person', cardinality='1*', inlined=True)
+        manifestation = SubjectRelation('ArtWork')
+        role = SubjectRelation('Role')
+
+    class Role(EntityType):
+        name = String()
+
+But then, in order to query the illustrator(s) ``I`` of a work ``W``,
+one has to write::
+
+    Any I, W WHERE C is Contribution, C contributor I, C manifestation W,
+                   C role R, R name 'illustrator'
+
+whereas we would like to be able to simply write::
+
+    Any I, W WHERE I illustrator_of W
+
+This is precisely what computed relations allow.
+
+
+Computed (or synthesised) attribute
+```````````````````````````````````
+
+Assuming a trivial schema for describing employees in companies, one
+can be interested in the total of salaries paid by each company to
+all its employees. One has to write::
+
+    Any C, SUM(SA) GROUPBY C WHERE E works_for C, E salary SA
+
+whereas it would be most convenient to simply write::
+
+    Any C, TS WHERE C total_salary TS
+
+And this is again what computed attributes provide.
+
+
+Using computed attributes and relations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Computed (or reified) relations
+```````````````````````````````
+
+In the above case we would define the *computed relation*
+``illustrator_of`` in the schema by:
+
+.. sourcecode:: python
+
+    class illustrator_of(ComputedRelationType):
+        rule = ('C is Contribution, C contributor S, C manifestation O, '
+                'C role R, R name "illustrator"')
+
+You will note that:
+
+* the ``S`` and ``O`` RQL variables implicitly identify the subject and
+  object of the defined computed relation, akin to what happens in
+  ``RRQLExpression``;
+* the possible subject and object entity types are inferred from the rule;
+* computed relation definitions always have empty *add* and *delete* permissions;
+* 'read' permissions can be defined, but permissions from the relations used in
+  the rewrite rule **are not considered**;
+* nothing else may be defined on the ``ComputedRelation`` subclass besides
+  permissions and rule (e.g. no cardinality, composite, etc.).
+  ``BadSchemaDefinition`` is raised on attempt to specify other attributes;
+* computed relations cannot be used in 'SET' and 'DELETE' RQL queries
+  (a ``BadQuery`` exception is raised).
+
+
+NB: The fact that the *add* and *delete* permissions are *empty* even
+for managers is expected to make the automatic UI not attempt to edit
+them.
+
+.. note:: **TODO** Clarify read permissions
+
+          Are the permissions from the relations used in the rewrite
+          rule **never considered**, or only when read permissions are
+          explicitly defined?
+
+Computed (or synthesised) attribute
+```````````````````````````````````
+
+In the above case we would define the *computed attribute*
+``total_salary`` on the ``Company`` entity type in the schema by:
+
+.. sourcecode:: python
+
+    class Company(EntityType):
+        name = String()
+        total_salary = Int(formula='Any SUM(SA) GROUPBY X WHERE P works_for X, P salary SA')
+
+* the ``X`` RQL variable implicitly identifies the entity holding the
+  computed attribute, akin to what happens in ``ERQLExpression``;
+
+* the type inferred from the formula is checked against the declared type;
+* computed attributes always have empty *update* permissions;
+* ``BadSchemaDefinition`` is raised on attempt to set 'update' permissions;
+* 'read' permissions can be defined, but permissions regarding the formula
+  **are not considered**;
+* other attribute properties (inlined, ...) can be defined as for normal attributes;
+* as with computed relations, computed attributes cannot be used in 'SET' and
+  'DELETE' RQL queries (a ``BadQuery`` exception is raised).
+
+.. note:: **TODO** Specify the error raised if the type checking fails
+
+
+API and implementation
+~~~~~~~~~~~~~~~~~~~~~~
+
+Representation in the data back-end
+```````````````````````````````````
+
+Computed relations have no direct representation at the SQL table
+level. Instead, each time a query is issued, it is rewritten to
+replace the computed relation by its equivalent definition, and the
+resulting query is executed in the usual way.
+
+In contrast, computed attributes are represented as a column in the
+table for their host entity type, just like normal attributes. Their
+value is kept up-to-date with respect to their definition by a system
+of hooks (also called triggers in most RDBMS) which recompute them
+when the relations and attributes they depend on are modified.
+
+Yams API
+````````
+
+When accessing the schema through the *yams API* (not when defining a
+schema in a ``schema.py`` file) the computed attributes and relations
+are represented as follows:
+
+relations
+    The ``yams.RelationSchema`` class has a new ``rule`` attribute
+    holding the rule as a string. If this attribute is set, no other
+    attribute may be set.
+attributes
+    A new property ``formula`` is added on class
+    ``yams.RelationDefinitionSchema`` along with a new keyword
+    argument ``formula`` on the initializer.
+
+Migration
+`````````
+
+The migrations are to be handled as summarized in the table below.
+
++------------+---------------------------------------------------+---------------------------------------+
+|            | Computed rtype                                    | Computed attribute                    |
++============+===================================================+=======================================+
+| add        | * add_relation_type                               | * add_attribute                       |
+|            | * add_relation_definition should trigger an error | * add_relation_definition             |
++------------+---------------------------------------------------+---------------------------------------+
+| modify     | * sync_schema_prop_perms:                         | * sync_schema_prop_perms:             |
+|            |   checks the rule is                              |                                       |
+| (rule or   |   synchronized with the database                  |   - empty the cache,                  |
+| formula)   |                                                   |   - check formula,                    |
+|            |                                                   |   - make sure all the values get      |
+|            |                                                   |     updated                           |
++------------+---------------------------------------------------+---------------------------------------+
+| del        | * drop_relation_type                              | * drop_attribute                      |
+|            | * drop_relation_definition should trigger an error| * drop_relation_definition            |
++------------+---------------------------------------------------+---------------------------------------+
+
+
 Defining your schema using yams
 -------------------------------
 
@@ -670,7 +874,7 @@
   RelationType declaration which offers some advantages in the context
   of reusable cubes.
 
-  
+
 
 
 Handling schema changes