# HG changeset patch
# User Anthony Truchet
# Date 1392026623 -3600
# Node ID e210f0e082b078fb25984673e08f1d4c909f603f
# Parent  5e44dd9dde6b6fc0af41945da046b8e9441f419e
[CWEP002] document computed relations and attributes

Related to #3546717.

diff -r 5e44dd9dde6b -r e210f0e082b0 doc/book/en/devrepo/datamodel/definition.rst
--- a/doc/book/en/devrepo/datamodel/definition.rst	Thu Aug 28 17:41:34 2014 +0200
+++ b/doc/book/en/devrepo/datamodel/definition.rst	Mon Feb 10 11:03:43 2014 +0100
@@ -1,4 +1,4 @@
- .. -*- coding: utf-8 -*-
+.. -*- coding: utf-8 -*-
 
 .. _datamodel_definition:
 
@@ -503,6 +503,210 @@
 
 .. _yams_example:
 
+
+Derived attributes and relations
+--------------------------------
+
+.. note:: **TODO** Check organisation of the whole chapter of the documentation
+
+CubicWeb offers the possibility to *query* data using so-called
+*computed* relations and attributes. Those are *seen* by RQL requests
+as normal attributes and relations but are actually derived from other
+attributes and relations. In a first section we'll informally review
+two typical use cases. Then we'll see how to use computed attributes
+and relations in your schema. Finally, we'll consider various significant
+aspects of their implementation and the impact on their usage.
+
+Motivating use cases
+~~~~~~~~~~~~~~~~~~~~
+
+Computed (or reified) relations
+```````````````````````````````
+
+It often arises that one must represent a ternary relation, or a
+family of relations. For example, in the context of an exhibition
+catalog you might want to link all *contributors* to the *work* they
+contributed to, but this contribution can be as *illustrator*,
+*author*, *performer*, ...
+
+The classical way to describe this kind of information within an
+entity-relationship schema is to *reify* the relation, that is, to turn
+the relation into an entity. In our example the schema will have a
+*Contribution* entity type used to represent the family of the
+contribution relations.
+
+
+.. sourcecode:: python
+
+    class ArtWork(EntityType):
+        name = String()
+        ...
+
+    class Person(EntityType):
+        name = String()
+        ...
+
+    class Contribution(EntityType):
+        contributor = SubjectRelation('Person', cardinality='1*', inlined=True)
+        manifestation = SubjectRelation('ArtWork')
+        role = SubjectRelation('Role')
+
+    class Role(EntityType):
+        name = String()
+
+But then, in order to query the illustrator(s) ``I`` of a work ``W``,
+one has to write::
+
+    Any I, W WHERE C is Contribution, C contributor I, C manifestation W,
+    C role R, R name 'illustrator'
+
+whereas we would like to be able to simply write::
+
+    Any I, W WHERE I illustrator_of W
+
+This is precisely what the computed relations allow.
+
+
+Computed (or synthesised) attribute
+```````````````````````````````````
+
+Assuming a trivial schema for describing employees in companies, one
+can be interested in the total of salaries paid by each company to
+all its employees. One has to write::
+
+    Any C, SUM(SA) GROUPBY C WHERE E works_for C, E salary SA
+
+whereas it would be most convenient to simply write::
+
+    Any C, TS WHERE C total_salary TS
+
+And this is again what computed attributes provide.
+
+
+Using computed attributes and relations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Computed (or reified) relations
+```````````````````````````````
+
+In the above case we would define the *computed relation*
+``illustrator_of`` in the schema by:
+
+.. sourcecode:: python
+
+    class illustrator_of(ComputedRelation):
+        rule = ('C is Contribution, C contributor S, C manifestation O,'
+                'C role R, R name "illustrator"')
+
+You will note that:
+
+* the ``S`` and ``O`` RQL variables implicitly identify the subject and
+  object of the defined computed relation, akin to what happens in
+  RRQLExpression;
+* the possible subject and object entity types are inferred from the rule;
+* computed relation definitions always have empty *add* and *delete* permissions;
+* 'read' permissions can be defined, but permissions from the relations used in
+  the rewrite rule **are not considered**;
+* nothing else may be defined on the `ComputedRelation` subclass besides
+  permissions and rule (e.g. no cardinality, composite, etc.).
+  `BadSchemaDefinition` is raised on attempt to specify other attributes;
+* computed relations cannot be used in 'SET' and 'DELETE' RQL queries
+  (`BadQuery` exception raised).
+
+
+NB: The fact that the *add* and *delete* permissions are *empty* even
+for managers is expected to make the automatic UI not attempt to edit
+them.
+
+.. note:: **TODO** Clarify read permissions
+
+   Are the permissions from the relations used in the rewrite
+   rule **never considered** or only when read permissions are
+   explicitly defined?
+
+Computed (or synthesised) attribute
+```````````````````````````````````
+
+In the above case we would define the *computed attribute*
+``total_salary`` on the ``Company`` entity type in the schema by:
+
+.. sourcecode:: python
+
+    class Company(EntityType):
+        name = String()
+        total_salary = Int(formula='Any SUM(SA) GROUPBY X WHERE E works_for X, E salary SA')
+
+* the ``X`` RQL variable implicitly identifies the entity holding the
+  computed attribute, akin to what happens in ERQLExpression;
+
+* the type inferred from the formula is checked against the type declared;
+* the computed attributes always have empty *update* permissions;
+* `BadSchemaDefinition` is raised on attempt to set 'update' permissions;
+* 'read' permissions can be defined, but permissions regarding the formula
+  **are not considered**;
+* other attribute properties (inlined, ...) can be defined as for normal attributes;
+* similarly to computed relations, computed attributes can't be used in 'SET' and
+  'DELETE' RQL queries (`BadQuery` exception raised).
+
+.. note:: **TODO** Specify the error raised if the type checking fails
+
+
+API and implementation
+~~~~~~~~~~~~~~~~~~~~~~
+
+Representation in the data back-end
+```````````````````````````````````
+
+Computed relations have no direct representation at the SQL table
+level. Instead, each time a query is issued the query is rewritten to
+replace the computed relation by its equivalent definition and the
+resulting rewritten query is performed in the usual way.
+
+On the contrary, computed attributes are represented as a column in the
+table for their host entity type, just like normal attributes. Their
+value is kept up-to-date with respect to their definition by a system
+of hooks (akin to triggers in most RDBMS) which recompute them
+when the relations and attributes they depend on are modified.
+
+Yams API
+````````
+
+When accessing the schema through the *yams API* (not when defining a
+schema in a ``schema.py`` file) the computed attributes and relations
+are represented as follows:
+
+relations
+    The ``yams.RelationSchema`` class has a new ``rule`` attribute
+    holding the rule as a string. If this attribute is set, all others
+    must not be set.
+attributes
+    A new property ``formula`` is added on class
+    ``yams.RelationDefinitionSchema`` along with a new keyword
+    argument ``formula`` on the initializer.
+
+Migration
+`````````
+
+The migrations are to be handled as summarized in the table below.
+
++------------+---------------------------------------------------+---------------------------------------+
+|            | Computed rtype                                    | Computed attribute                    |
++============+===================================================+=======================================+
+| add        | * add_relation_type                               | * add_attribute                       |
+|            | * add_relation_definition should trigger an error | * add_relation_definition             |
++------------+---------------------------------------------------+---------------------------------------+
+| modify     | * sync_schema_prop_perms:                         | * sync_schema_prop_perms:             |
+|            |   checks the rule is                              |                                       |
+| (rule or   |   synchronized with the database                  |   - empty the cache,                  |
+| formula)   |                                                   |   - check formula,                    |
+|            |                                                   |   - make sure all the values get      |
+|            |                                                   |     updated                           |
++------------+---------------------------------------------------+---------------------------------------+
+| del        | * drop_relation_type                              | * drop_attribute                      |
+|            | * drop_relation_definition should trigger an error| * drop_relation_definition            |
++------------+---------------------------------------------------+---------------------------------------+
+
+
 Defining your schema using yams
 -------------------------------
 
@@ -670,7 +874,7 @@
 RelationType declaration which offers some advantages in the context of
 reusable cubes.
 
-
+
 
 Handling schema changes