# HG changeset patch
# User Nicolas Chauvat
# Date 1227215242 -3600
# Node ID 7a56ca431d65ea241b7d7da26d12feb72ea27c7f
# Parent fae5651e75934ab41d77d8da6aa577f75bd892a7
[doc] missing file

diff -r fae5651e7593 -r 7a56ca431d65 doc/book/en/02-01-concepts.en.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/book/en/02-01-concepts.en.txt Thu Nov 20 22:07:22 2008 +0100
@@ -0,0 +1,547 @@
+.. -*- coding: utf-8 -*-
+
+Concepts
+--------
+
+Global architecture
+~~~~~~~~~~~~~~~~~~~
+.. image:: images/archi_globale.png
+
+.. note::
+  In practice, the client and server sides are integrated in the same
+  process and interact directly, without the need for remote
+  calls using Pyro. It is important to note that those two
+  sides, client and server, are separate, and it is possible to run
+  them in distinct processes to balance the load of
+  your web site on one or more machines.
+
+.. _TermsVocabulary:
+
+Terms and vocabulary
+~~~~~~~~~~~~~~~~~~~~~
+
+*schema*
+  The schema defines the data model of an application based on entities
+  and relations, modeled with a comprehensive language made of Python
+  classes based on the `yams`_ library. This is the core piece
+  of an application. It is initially defined in the file system and is
+  stored in the database at the time an instance is created. `CubicWeb`
+  provides a certain number of system entities that are included automatically
+  because they are necessary for the core of `CubicWeb`, and a library of
+  cubes that can be explicitly included if necessary.
+
+
+*entity type*
+  An entity is a set of attributes; the essential attribute of
+  an entity is its key, named eid.
+
+*relation type*
+  Entities are linked to each other by relations. In `CubicWeb`
+  relations are binary: by convention we name the first item of
+  a relation the `subject` and the second the `object`.
+
+*final entity type*
+  Final types correspond to the basic types such as character strings,
+  integers...
+  Those types have one main property: they can
+  only be used as the `object` of a relation. The attributes of a
+  (non-final) entity are (final) entities.
+
+*final relation type*
+  A relation is said to be final if its `object` is a final type. This is equivalent
+  to an entity attribute.
+
+*relation definition*
+  A relation definition is a 3-tuple (subject entity type, relation type, object entity type),
+  with an associated set of properties such as cardinality, constraints...
+
+*repository*
+  This is the RQL server side of `CubicWeb`. Be careful not to
+  confuse it with a Mercurial repository or a Debian repository.
+
+*source*
+  A data source is a container of data (DBMS, LDAP directory, `Google
+  App Engine`'s datastore...) integrated in the
+  `CubicWeb` repository. This repository has at least one source, `system`, which
+  contains the schema of the application, the plain-text index and other
+  information vital to the system.
+
+*configuration*
+  It is possible to create different configurations for an instance:
+
+  - ``repository`` : repository only, accessible to clients using Pyro
+  - ``twisted`` : web interface only, accesses the repository using Pyro
+  - ``all-in-one`` : web interface and repository in a single process.
+    The repository may or may not be accessible using Pyro.
+
+*cube*
+  A cube is a model grouping one or more data types and/or views
+  to provide a specific functionality or a complete `CubicWeb` application,
+  potentially using other cubes. The available cubes are located in the file
+  system at `/path/to/forest/cubicweb/cubes`.
+  Larger applications can be built faster by importing cubes,
+  adding entities and relationships and overriding the
+  views that need to display or edit information not provided by
+  cubes.
+
+*instance*
+  An instance is a specific installation of a cube. All the
+  configuration files required for your web application to run properly
+  are grouped in an instance.
+  The instance refers to the cube(s) your application
+  is based on.
+  For example, logilab.org and our intranet are two instances of a single
+  cube, jpl, developed internally.
+  The instances are defined in the directory `~/etc/cubicweb.d`.
+
+*application*
+  The term application is sometimes used to talk about an instance
+  and sometimes to talk about a cube, depending on the context.
+  So we would rather avoid this term and use *cube* and
+  *instance* instead.
+
+*result set*
+  This object contains the results of an RQL query sent to the source
+  and information about the query.
+
+*Pyro*
+  `Python Remote Object`_, a distributed objects system similar to Java's RMI
+  (Remote Method Invocation), which can be used for the dialog between the web
+  side of the framework and the RQL repository.
+
+*query language*
+  A full-blown query language named RQL is used to formulate requests
+  to the database or any source such as LDAP or `Google App Engine`'s
+  datastore.
+
+*views*
+  A view is applied to a `result set` to present it as HTML, XML,
+  JSON, CSV, etc. Views are implemented as Python classes. There is no
+  templating language.
+
+*generated user interface*
+  A user interface is generated on-the-fly from the schema definition:
+  entities can be created, displayed, updated and deleted. As display
+  views are not very fancy, it is usually necessary to develop your
+  own. Any generated view can be overridden by defining a new one with
+  the same identifier.
+
+
+.. _`Python Remote Object`: http://pyro.sourceforge.net/
+.. _`yams`: http://www.logilab.org/project/yams/
+
+
+`CubicWeb` engine
+~~~~~~~~~~~~~~~~~
+
+The engine in `CubicWeb` is a set of classes managing a set of objects loaded
+dynamically at the startup of `CubicWeb` (*appobjects*). Those dynamic objects, based on the schema
+or the library, build the final application.
+The different dynamic components are,
+for example:
+
+* client and server side
+
+  - entity definitions, containing the logic which enables application data manipulation
+
+* client side
+
+  - *views*, or more specifically
+
+    - boxes
+    - header and footer
+    - forms
+    - page templates
+
+  - *actions*
+  - *controllers*
+
+* server side
+
+  - notification hooks
+  - notification views
+
+The components of the engine are:
+
+* a web front end (only twisted is available so far), transparent for dynamic objects
+* an object that encapsulates the configuration
+* a `registry` (`cubicweb.cwvreg`) containing the dynamic objects loaded automatically
+
+Every *appobject* may access the instance configuration using its *config* attribute
+and the registry using its *vreg* attribute.
+
+Details of the registration process
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+At startup, the `registry`, or base of registers, inspects a number of directories
+looking for compatible class definitions. After a registration process, the objects
+are assigned to registers so that they can be selected dynamically while the
+application is running.
+
+The base class of those objects is `AppRsetObject` (module `cubicweb.common.appobject`).
+
+XXX registers example
+XXX actual details of the registration process!
+
+Runtime objects selection
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+XXX tell why it's a cw foundation!
+
+Application objects are stored in the registry using a two-level hierarchy:
+
+  object's `__registry__` : object's `id` : [list of app objects]
+
+The following rules are applied to select an object given a register, an id and an input context:
+
+* each object has a selector
+
+  - its selector may be derived from a set of basic (or not :)
+    selectors using the `chainall` or `chainfirst` combinators
+
+* a selector returns a score >= 0
+* a score of 0 means the object can't be applied to the input context
+* the object with the greatest score is selected.
+  If multiple objects have an
+  identical score, one of them is selected randomly (this is usually a bug)
+
+The object's selector is the `__selector__` class method on the object's class.
+
+The score is used to choose the most pertinent object when more than
+one object is selectable. For instance, if you're selecting the primary
+(i.e. `id = 'primary'`) view (i.e. `__registry__ = 'view'`) for a result set containing
+a `Card` entity, 2 objects will probably be selectable:
+
+* the default primary view (`accepts = 'Any'`)
+* the specific `Card` primary view (`accepts = 'Card'`)
+
+This is because primary views use the `accept_selector`, which considers the
+`accepts` class attribute of the object's class. Other primary views specific to other
+entity types won't be selectable in this case. And among the selectable objects, the
+accept selector will return a higher score for the second view since it's more
+specific, so it will be selected as expected.
+
+Usually, you won't define `__selector__` directly but by defining the `__selectors__` tuple
+on the class, with ::
+
+  __selectors__ = (sel1, sel2)
+
+which is equivalent to ::
+
+  __selector__ = chainall(sel1, sel2)
+
+The former is preferred since it's shorter and it eases overriding in
+subclasses (you have access to sub-selectors instead of the wrapping function).
+
+:chainall(selectors...): if one selector returns 0, return 0, else return the sum of scores
+
+:chainfirst(selectors...): return the score of the first selector which has a non-zero score
+
+XXX describe standard selector (link to generated api doc!)
+
+Example
+````````
+
+The end goal: when we are on a Blog, we want its RSS link to point
+to the entries of this blog, not to the blog entity itself.
+
+The general idea to solve this: we define a method on entity classes
+that returns the URL of the RSS feed for the entity in question.
+There is a default implementation
+on AnyEntity and a specific implementation on Blog that does what
+we want.
+
+The limitation: we are stuck when, for example, we have a result set containing
+several Blog entities (or something else), since we don't know on which entity to call
+the aforementioned method. In this case, we keep the current behaviour (i.e. a call
+to limited_rql).
+
+So we want two cases here: one for an rset which contains one and only one entity,
+the other for an rset which contains several entities.
+
+So... We already have in web/views/boxes.py the class RSSIconBox, which works. Its
+selector ::
+
+  class RSSIconBox(ExtResourcesBoxTemplate):
+      """just display the RSS icon on uniform result set"""
+      __selectors__ = ExtResourcesBoxTemplate.__selectors__ + (nfentity_selector,)
+
+
+indicates that it takes into account:
+
+* the conditions under which the box appears (look up the parent classes
+  for details)
+* nfentity_selector, which filters on rsets containing a list of non-final entities
+
+This corresponds to our second case. What remains is to provide a more specific component
+for the first case ::
+
+  class EntityRSSIconBox(RSSIconBox):
+      """just display the RSS icon on uniform result set for a single entity"""
+      __selectors__ = RSSIconBox.__selectors__ + (onelinerset_selector,)
+
+
+Here, we add onelinerset_selector, which filters on rsets of size 1. Note
+that when selectors are chained, the final score is the sum of the scores
+returned by each selector (unless one of them returns zero, in which case the object is
+not selectable). So here, on an rset with several entities, onelinerset_selector
+makes the EntityRSSIconBox class non-selectable, and we get the
+RSSIconBox class. For an rset with one entity, the EntityRSSIconBox class has a
+higher score than RSSIconBox, so it is the one selected.
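
The scoring rule used in this example can be made concrete with a small, self-contained sketch. This is not CubicWeb's code: `chainall` and `chainfirst` are re-implemented here only from the descriptions given in this section, a result set is modelled as a plain list, and `nonfinal_entities`, `one_line_rset` and `select` are hypothetical stand-ins introduced for illustration.

```python
# Toy model of selector chaining; NOT the CubicWeb implementation.
# Assumption: a "result set" is just a list of entity type names.

def chainall(*selectors):
    """Return 0 if any selector returns 0, else the sum of all scores."""
    def selector(rset):
        total = 0
        for sel in selectors:
            score = sel(rset)
            if score == 0:
                return 0
            total += score
        return total
    return selector


def chainfirst(*selectors):
    """Return the score of the first selector with a non-zero score."""
    def selector(rset):
        for sel in selectors:
            score = sel(rset)
            if score:
                return score
        return 0
    return selector


# Hypothetical stand-ins for nfentity_selector and onelinerset_selector.
def nonfinal_entities(rset):
    return 1 if rset else 0


def one_line_rset(rset):
    return 1 if len(rset) == 1 else 0


class RSSIconBox(object):
    __selectors__ = (nonfinal_entities,)


class EntityRSSIconBox(RSSIconBox):
    __selectors__ = RSSIconBox.__selectors__ + (one_line_rset,)


def select(candidates, rset):
    """Pick the candidate class with the highest non-zero score, or None."""
    scored = [(chainall(*cls.__selectors__)(rset), cls) for cls in candidates]
    scored = [(score, cls) for score, cls in scored if score > 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


boxes = [RSSIconBox, EntityRSSIconBox]
single = select(boxes, ["Blog"])            # one entity: both boxes selectable
several = select(boxes, ["Blog", "Blog"])   # several entities: only RSSIconBox
print(single.__name__, several.__name__)
```

With one entity, both boxes are selectable and the more specific one wins because its chained score is higher; with several entities, the extra selector returns zero and removes the specific box from the candidates.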
+
+There you go; to finish all this, what remains is:
+
+* to define the content of the call method of EntityRSSIconBox
+* to provide the default implementation of the method returning the RSS feed URL on
+  AnyEntity
+* to override this method in blog.Blog
+
+
+When to use selectors?
+```````````````````````
+
+Selectors should be used to do different things depending on
+the input. As soon as an "if" tests the
+nature of `self.rset` in an object, you should very seriously
+ask yourself whether it would be better to have two different objects
+with appropriate selectors.
+
+If this is so fundamental, why don't I see them more often?
+```````````````````````````````````````````````````````````
+
+Because you're usually using base classes which hide the plumbing
+of `__registry__` (almost always), `id` (often when using "standard" objects),
+register and selector.
+
+API Python/RQL
+~~~~~~~~~~~~~~
+
+Inspired by the standard db-api, with a Connection object essentially having the methods
+cursor, rollback and commit. The most important method is
+the `execute` method of a cursor:
+
+`execute(rqlstring, args=None, eid_key=None, build_descr=True)`
+
+:rqlstring: the RQL query to execute (unicode)
+:args: if the query contains substitutions, a dictionary containing the values to use
+:eid_key:
+   an implementation detail of the RQL query cache implies that if a substitution
+   is used to introduce an eid *liable to raise ambiguities in the query
+   type resolution*, then we have to specify the corresponding key in the dictionary
+   through this argument
+
+
+The `Connection` object owns the methods `commit` and `rollback`. You *should
+never need to use them* during the development of the web interface based on
+the `CubicWeb` framework as it determines the end of the transaction depending
+on the query execution success.
+
+.. note::
+  While executing update queries (SET, INSERT, DELETE), if a query generates
+  a security-related error, a rollback is automatically done on the current
+  transaction.
+
+
+The `Request` class (`cubicweb.web`)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A request instance is created when an HTTP request is sent to the web server.
+It contains information such as form parameters, the authenticated user, etc.
+
+**Globally, a request represents a user query, either through HTTP or not
+(for example, we also talk about RQL queries on the server side)**
+
+An instance of `Request` has the following attributes:
+
+* `user`, instance of `cubicweb.common.utils.User` corresponding to the authenticated
+  user
+* `form`, dictionary containing the values of a web form
+* `encoding`, character encoding to use in the response
+
+But also:
+
+:Session data handling:
+  * `session_data()`, returns a dictionary containing all the session data
+  * `get_session_data(key, default=None)`, returns the value associated with the given
+    key or the value `default` if the key is not defined
+  * `set_session_data(key, value)`, assigns a value to a key
+  * `del_session_data(key)`, removes the value associated with a key
+
+
+:Cookies handling:
+  * `get_cookie()`, returns a dictionary containing the value of the HTTP
+    'Cookie' header
+  * `set_cookie(cookie, key, maxage=300)`, adds an HTTP `Set-Cookie` header,
+    with a lifetime of 5 minutes by default (`maxage` = None
+    gives a *session* cookie which will expire when the user closes the browser
+    window)
+  * `remove_cookie(cookie, key)`, forces a value to expire
+
+:URL handling:
+  * `url()`, returns the full URL of the HTTP request
+  * `base_url()`, returns the root URL of the web application
+  * `relative_path()`, returns the relative path of the request
+
+:And more...:
+  * `set_content_type(content_type, filename=None)`, adds the HTTP
+    'Content-Type' header
+  * `get_header(header)`, returns the value associated with an
+    arbitrary header of the HTTP request
+  * `set_header(header, value)`, adds an arbitrary header in the response
+  * `cursor()` returns an RQL cursor on the session
+  * `execute(*args, **kwargs)`, shortcut to ``.cursor().execute()``
+  * `property_value(key)`, properties management (`EProperty`)
+  * dictionary `data` to store data and share information between components
+    *while a request is executed*
+
+Please note that this class is abstract and that a concrete implementation
+will be provided by the web *frontend* used (in particular *twisted* as of
+today). For the views or other objects that are executed on the server side,
+most of the interface of `Request` is defined in the session associated
+with the client.
+
+The `AppObject` class
+~~~~~~~~~~~~~~~~~~~~~
+
+In general:
+
+* we do not inherit directly from this class but from a more specific
+  class such as `AnyEntity`, `EntityView`, `AnyRsetView`,
+  `Action`...
+
+* to be registrable, a subclass has to define its own register (attribute
+  `__registry__`) and its identifier (attribute `id`). Usually we do not have
+  to take care of the register, only the identifier `id`.
+
+A certain number of attributes and methods are defined in this class
+and so are common to all the application objects:
+
+At registration time, the following attributes are dynamically added to
+the *subclasses*:
+
+* `vreg`, the `vregistry` of the application
+* `schema`, the application schema
+* `config`, the application configuration
+
+Instances also have the following attributes:
+
+* `req`, `Request` instance
+* `rset`, the *result set* associated with the object, if applicable
+* `cursor`, RQL cursor on the session
+
+
+:URL handling:
+  * `build_url(method=None, **kwargs)`, returns an absolute URL based on
+    the given arguments. The *controller* supposed to handle the response
+    can be specified through the special parameter `method` (the connection
+    is theoretically done automatically :).
+
+  * `datadir_url()`, returns the directory of the application data
+    (which contains static files such as images, css, js...)
+
+  * `base_url()`, shortcut to `req.base_url()`
+
+  * `url_quote(value)`, *unicode safe* version of the function `urllib.quote`
+
+:Data manipulation:
+
+  * `etype_rset(etype, size=1)`, shortcut to `vreg.etype_rset()`
+
+  * `eid_rset(eid, rql=None, descr=True)`, returns a *result set* object for
+    the given eid
+  * `entity(row, col=0)`, returns the entity corresponding to the data position
+    in the *result set* associated with the object
+
+  * `complete_entity(row, col=0, skip_bytes=True)`, is equivalent to `entity` but
+    also calls the method `complete()` on the entity before returning it
+
+:Data formatting:
+  * `format_date(date, date_format=None, time=False)`
+  * `format_time(time)`
+
+:And more...:
+
+  * `external_resource(rid, default=_MARKER)`, access to a value defined in the
+    configuration file `external_resource`
+
+  * `tal_render(template, variables)`,
+
+
+.. note::
+  When you inherit from `AppObject` (even indirectly), you *always* have to use
+  **super()** to get the methods and attributes of the superclasses, and not
+  use the class identifier.
+  For example, instead of writing: ::
+
+      class Truc(PrimaryView):
+          def f(self, arg1):
+              PrimaryView.f(self, arg1)
+
+  You'd better write: ::
+
+      class Truc(PrimaryView):
+          def f(self, arg1):
+              super(Truc, self).f(arg1)
+
+
+Standard structure for a cube
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A complex cube is structured as follows:
+
+::
+
+  mycube/
+  |
+  |-- schema.py
+  |
+  |-- entities/
+  |
+  |-- sobjects/
+  |
+  |-- views/
+  |
+  |-- test/
+  |
+  |-- i18n/
+  |
+  |-- data/
+  |
+  |-- migration/
+  |   |- postcreate.py
+  |   \- depends.map
+  |
+  |-- debian/
+  |
+  \-- __pkginfo__.py
+
+We can use simple Python modules instead of packages, for example:
+
+::
+
+  mycube/
+  |
+  |-- entities.py
+  |-- hooks.py
+  \-- views.py
+
+
+where:
+
+* ``schema`` contains the schema definition (server side only)
+* ``entities`` contains the entity definitions (server side and web interface)
+* ``sobjects`` contains hooks and/or notification views (server side only)
+* ``views`` contains the different components of the web interface (web interface only)
+* ``test`` contains tests specific to the application (not installed)
+* ``i18n`` contains the message catalogs for supported languages (server side and
+  web interface)
+* ``data`` contains arbitrary data files served statically
+  (images, css, javascript files...) (web interface only)
+* ``migration`` contains the initialization file for new instances
+  (``postcreate.py``) and in general a file containing the `CubicWeb` dependencies
+  of the cube depending on its version (``depends.map``)
+* ``debian`` contains all the files that manage the debian packaging
+  (you will find there the classical structure with ``control``, ``rules``,
+  ``changelog``...)
+  (not installed)
+* the file ``__pkginfo__.py`` provides meta-data on the cube, especially the
+  distribution name and the current version (server side and web interface), as
+  well as the sub-cubes used by this cube
+
+The only required files are:
+
+* the file ``__pkginfo__.py``
+* the schema definition
+  XXX false, we may want to have cubes which are only adding a service, no persistent data (eg embedding for instance)
+
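
As a rough illustration of the layout above, the following sketch checks a cube directory for the two entries listed as required. The helper name `missing_required_entries` is hypothetical and not part of CubicWeb. It also assumes that the schema may be either a `schema.py` module or a `schema/` package, and, as the XXX remark notes, a cube without any schema may still be legitimate.

```python
import os
import tempfile


def missing_required_entries(cube_dir):
    """Return the names of required entries missing from cube_dir.

    Hypothetical helper, not a CubicWeb API: it only inspects the file
    system and accepts either schema.py or a schema/ package (assumption).
    """
    missing = []
    if not os.path.isfile(os.path.join(cube_dir, "__pkginfo__.py")):
        missing.append("__pkginfo__.py")
    has_schema = (os.path.isfile(os.path.join(cube_dir, "schema.py"))
                  or os.path.isdir(os.path.join(cube_dir, "schema")))
    if not has_schema:
        missing.append("schema")
    return missing


# Build a throwaway cube skeleton to exercise the helper.
cube_dir = tempfile.mkdtemp(prefix="mycube")
before = missing_required_entries(cube_dir)
for name in ("__pkginfo__.py", "schema.py"):
    with open(os.path.join(cube_dir, name), "w") as stream:
        stream.write("# placeholder\n")
after = missing_required_entries(cube_dir)
print(before, after)
```

An empty directory is reported as missing both entries; once the two placeholder files exist, nothing is reported.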