[doc] missing file
authorNicolas Chauvat <nicolas.chauvat@logilab.fr>
Thu, 20 Nov 2008 22:07:22 +0100
changeset 119 7a56ca431d65
parent 118 fae5651e7593
child 120 ec4055760654
[doc] missing file
doc/book/en/02-01-concepts.en.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/book/en/02-01-concepts.en.txt	Thu Nov 20 22:07:22 2008 +0100
@@ -0,0 +1,547 @@
+.. -*- coding: utf-8 -*-
+
+Concepts
+--------
+
+Global architecture
+~~~~~~~~~~~~~~~~~~~
+.. image:: images/archi_globale.png
+
+.. note::
+  For real, the client and server sides are integrated in the same
+  process and interact directly, without the needs for distants
+  calls using Pyro. It is important to note down that those two
+  sides, client/server, are disjointed and it is possible to execute
+  a couple of calls in distincts processes to balance the load of
+  your web site on one or more machines.
+
+.. _TermsVocabulary:
+
+Terms and vocabulary
+~~~~~~~~~~~~~~~~~~~~~
+
+*schema*
+  The schema defines the data model of an application based on entities
+  and relations, modeled with a comprehensive language made of Python
+  classes based on `yams`_ library. This is the core piece
+  of an application. It is initially defined in the file system and is
+  stored in the database at the time an instance is created. `CubicWeb`
+  provides a certain number of system entities included automatically as
+  it is necessarry for the core of `CubicWeb` and a library of
+  cubes that can be explicitely included if necessary.
+
+
+*entity type*
+  An entity is a set of attributes; the essential attribute of
+  an entity is its key, named eid
+
+*relation type*
+  Entities are linked to each others by relations. In `CubicWeb`
+  relations are binary: by convention we name the first item of
+  a relation the `subject` and the second the `object`.
+
+*final entity type*
+  Final types corresponds to the basic types such as string of characters,
+  integers... Those types have a main property which is that they can
+  only be used as `object` of a relation. The attributes of an entity
+  (non final) are entities (finals).
+
+*final relation type*
+  A relation is said final if its `object` is a final type. This is equivalent
+  to an entity attribute.
+
+*relation definition*
+  a relation definition is a 3-uple (subject entity type, relation type, object entity type),
+  with an associated set of property such as cardinality, constraints...
+  
+*repository*
+  This is the RQL server side of `CubicWeb`. Be carefull not to get
+  confused with a Mercurial repository or a debian repository.
+
+*source*
+  A data source is a container of data (SGBD, LDAP directory, `Google
+  App Engine`'s datastore ...) integrated in the
+  `CubicWeb` repository. This repository has at least one source, `system` which 
+  contains the schema of the application, plain-text index and others
+  vital informations for the system.
+
+*configuration*
+  It is possible to create differents configurations for an instance:
+
+  - ``repository`` : repository only, accessible for clients using Pyro
+  - ``twisted`` : web interface only, access the repository using Pyro
+  - ``all-in-one`` : web interface and repository in a single process. 
+     The repository could be or not accessible using Pyro.
+
+*cube*
+  A cube is a model grouping one or multiple data types and/or views
+  to provide a specific functionnality or a complete `CubicWeb` application
+  potentially using other cubes. The available subes are located in the file
+  system at `/path/to/forest/cubicweb/cubes`.
+  Larger applications can be built faster by importing cubes,
+  adding entities and relationships and overriding the
+  views that need to display or edit informations not provided by
+  cubes.
+
+*instance*
+  An instance is a specific installation of a cube. All the required 
+  configuration files necessarry for the well being of your web application
+  are grouped in an instance. This will refer to the cube(s) your application
+  is based on.
+  By example logilab.org and our intranet are two instances of a single
+  cube jpl, developped internally.
+  The instances are defined in the directory `~/etc/cubicweb.d`.
+
+*application*
+  The term application is sometime used to talk about an instance
+  and sometimes to talk of a cube depending on the context. 
+  So we would like to avoid using this term and try to use *cube* and
+  *instance* instead.
+
+*result set*
+  This object contains the results of an RQL query sent to the source
+  and information on the query.
+
+*Pyro*
+  `Python Remote Object`_, distributed objects system similar to Java's RMI
+  (Remote Method Invocation), which can be used for the dialog between the web
+  side of the framework and the RQL repository.
+
+*query language*
+  A full-blown query language named RQL is used to formulate requests
+  to the database or any sources such as LDAP or `Google App Engine`'s 
+  datastore.
+
+*views*
+  A view is applied to a `result set` to present it as HTML, XML,
+  JSON, CSV, etc. Views are implemented as Python classes. There is no
+  templating language.
+
+*generated user interface*
+  A user interface is generated on-the-fly from the schema definition:
+  entities can be created, displayed, updated and deleted. As display
+  views are not very fancy, it is usually necessary to develop your
+  own. Any generated view can be overridden by defining a new one with
+  the same identifier.
+
+  
+.. _`Python Remote Object`: http://pyro.sourceforge.net/
+.. _`yams`: http://www.logilab.org/project/yams/
+
+
+`CubicWeb` engine
+~~~~~~~~~~~~~~~~~
+
+The engine in `CubicWeb` is a set of classes managing a set of objects loaded
+dynamically at the startup of `CubicWeb` (*appobjects*). Those dynamics objects, based on the schema
+or the library, are building the final application. The differents dymanic components are
+by example:
+
+* client and server side
+
+  - entities definition, containing the logic which enables application data manipulation
+
+* client side
+
+  - *views*, or more specifically
+
+    - boxes
+    - header and footer
+    - forms
+    - page templates
+
+  - *actions*
+  - *controllers*
+
+* server side
+
+  - notification hooks
+  - notification views
+
+The components of the engine are:
+
+* a frontal web (only twisted is available so far), transparent for dynamic objects
+* an object that encapsulates the configuration
+* a `registry` (`cubicweb.cwvreg`) containing the dynamic objects loaded automatically
+
+Every *appobject* may access to the instance configuration using its *config* attribute
+and to the registry using its *vreg* attribute.
+
+Details of the recording process
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+At startup, the `registry` or registers base, inspects a number of directories
+looking for compatible classes definition. After a recording process, the objects
+are assigned to registers so that they can be selected dynamically while the
+application is running.
+
+The base class of those objects is `AppRsetObject` (module `cubicweb.common.appobject`).
+
+XXX registers example
+XXX actual details of the recording process!
+
+Runtime objects selection
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+XXX tell why it's a cw foundation!
+
+Application objects are stored in the registry using a two level hierarchy :
+
+  object's `__registry__` : object's `id` : [list of app objects]
+
+The following rules are applied to select an object given a register and an id and an input context:
+* each object has a selector
+  - its selector may be derivated from a set of basic (or not :)
+    selectors using `chainall` or `chainfirst` combinators
+* a selector return a score >= 0
+* a score of 0 means the objects can't be applied to the input context
+* the object with the greatest score is selected. If multiple objects have an
+  identical score, one of them is selected randomly (this is usually a bug)
+
+The object's selector is the `__selector__` class method on the object's class.
+
+The score is used to choose the most pertinent objects where there are more than
+one selectable object. For instance, if you're selecting the primary
+(eg `id = 'primary'`) view (eg `__registry__ = 'view'`) for a result set containing
+a `Card` entity, 2 objects will probably be selectable:
+
+* the default primary view (`accepts = 'Any'`)
+* the specific `Card` primary view (`accepts = 'Card'`)
+
+This is because primary views are using the `accept_selector` which is considering the
+`accepts` class attribute of the object's class. Other primary views specific to other
+entity types won't be selectable in this case. And among selectable objects, the
+accept selector will return a higher score the the second view since it's more
+specific, so it will be selected as expected.
+
+Usually, you won't define it directly but by defining the `__selectors__` tuple
+on the class, with ::
+
+  __selectors__ = (sel1, sel2)
+
+which is equivalent to ::
+
+  __selector__ = chainall(sel1, sel2)
+
+The former is prefered since it's shorter and it's ease overriding in
+subclasses (you have access to sub-selectors instead of the wrapping function).
+
+:chainall(selectors...): if one selector return 0, return 0, else return the sum of scores
+
+:chainfirst(selectors...): return the score of the first selector which has a non zero score
+
+XXX describe standard selector (link to generated api doc!)
+
+Example
+````````
+
+Le but final : quand on est sur un Blog, on veut que le lien rss de celui-ci pointe
+vers les entrées de ce blog, non vers l'entité blog elle-même.
+
+L'idée générale pour résoudre ça : on définit une méthode sur les classes d'entité
+qui renvoie l'url du flux rss pour l'entité en question. Avec une implémentation
+par défaut sur AnyEntity et une implémentation particulière sur Blog qui fera ce
+qu'on veut.
+
+La limitation : on est embêté dans le cas ou par ex. on a un result set qui contient
+plusieurs entités Blog (ou autre chose), car on ne sait pas sur quelle entité appeler
+la méthode sus-citée. Dans ce cas, on va conserver le comportement actuel (eg appel
+à limited_rql)
+
+Donc : on veut deux cas ici, l'un pour un rset qui contient une et une seule entité,
+l'autre pour un rset qui contient plusieurs entité.
+
+Donc... On a déja dans web/views/boxes.py la classe RSSIconBox qui fonctionne. Son
+sélecteur ::
+
+  class RSSIconBox(ExtResourcesBoxTemplate):
+    """just display the RSS icon on uniform result set"""
+    __selectors__ = ExtResourcesBoxTemplate.__selectors__ + (nfentity_selector,)
+
+
+indique qu'il prend en compte :
+
+* les conditions d'apparition de la boite (faut remonter dans les classes parentes
+  pour voir le détail)
+* nfentity_selector, qui filtre sur des rset contenant une liste d'entité non finale
+
+ça correspond donc à notre 2eme cas. Reste à fournir un composant plus spécifique
+pour le 1er cas ::
+
+  class EntityRSSIconBox(RSSIconBox):
+    """just display the RSS icon on uniform result set for a single entity"""
+    __selectors__ = RSSIconBox.__selectors__ + (onelinerset_selector,)
+
+
+Ici, on ajoute onelinerset_selector, qui filtre sur des rset de taille 1. Il faut
+savoir que quand on chaine des selecteurs, le score final est la somme des scores
+renvoyés par chaque sélecteur (sauf si l'un renvoie zéro, auquel cas l'objet est
+non sélectionnable). Donc ici, sur un rset avec plusieurs entités, onelinerset_selector
+rendra la classe EntityRSSIconBox non sélectionnable, et on obtiendra bien la
+classe RSSIconBox. Pour un rset avec une entité, la classe EntityRSSIconBox aura un
+score supérieur à RSSIconBox et c'est donc bien elle qui sera sélectionnée.
+
+Voili voilou, il reste donc pour finir tout ça :
+
+* à définir le contenu de la méthode call de EntityRSSIconBox
+* fournir l'implémentation par défaut de la méthode renvoyant l'url du flux rss sur
+  AnyEntity
+* surcharger cette methode dans blog.Blog
+
+
+When to use selectors?
+```````````````````````
+
+Il faut utiliser les sélecteurs pour faire des choses différentes en
+fonction de ce qu'on a en entrée. Dès qu'on a un "if" qui teste la
+nature de `self.rset` dans un objet, il faut très sérieusement se
+poser la question s'il ne vaut pas mieux avoir deux objets différent
+avec des sélecteurs approprié.
+
+If this is so fundamental, why don't I see them more often?
+```````````````````````````````````````````````````````````
+
+Because you're usually using base classes which are hiding the plumbing
+of __registry__ (almost always), id (often when using "standard" object),
+register and selector.
+
+API Python/RQL
+~~~~~~~~~~~~~~
+
+Inspired from the standard db-api, with a Connection object having the methods
+cursor, rollback and commit essentially. The most important method is
+the `execute` method of a cursor :
+
+`execute(rqlstring, args=None, eid_key=None, build_descr=True)`
+
+:rqlstring: the RQL query to execute (unicode)
+:args: if the query contains substitutions, a dictionnary containing the values to use
+:eid_key: 
+   an implementation detail of the RQL queries cache implies that if a substitution
+   is used to introduce an eid *susceptible to raise the ambiguities in the query
+   type resolution*, then we have to specify the correponding key in the dictionnary
+   through this argument
+
+
+The `Connection` object owns the methods `commit` and `rollback`. You *should
+never need to use them* during the development of the web interface based on
+the `CubicWeb` framework as it determines the end of the transaction depending 
+on the query execution success.
+
+.. note::
+  While executing updates queries (SET, INSERT, DELETE), if a query generates
+  an error related to security, a rollback is automatically done on the current
+  transaction.
+  
+
+The `Request` class (`cubicweb.web`)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A request instance is created when an HTPP request is sent to the web server.
+It contains informations such as forms parameters, user authenticated, etc.
+
+**Globally, a request represents a user query, either through HTTP or not
+(we also talk about RQL queries on the server side by example)**
+
+An instance of `Request` has the following attributes:
+
+* `user`, instance of `cubicweb.common.utils.User` corresponding to the authenticated
+  user
+* `form`, dictionnary containing the values of a web form
+* `encoding`, characters encoding to use in the response
+
+But also:
+
+:Session data handling:
+  * `session_data()`, returns a dictinnary containing all the session data
+  * `get_session_data(key, default=None)`, returns a value associated to the given
+    key or the value `default` if the key is not defined
+  * `set_session_data(key, value)`, assign a value to a key
+  * `del_session_data(key)`,  suppress the value associated to a key
+    
+
+:Cookies handling:
+  * `get_cookie()`, returns a dictionnary containing the value of the header
+    HTTP 'Cookie'
+  * `set_cookie(cookie, key, maxage=300)`, adds a header HTTP `Set-Cookie`,
+    with a minimal 5 minutes length of duration by default (`maxage` = None
+    returns a *session* cookie which will expire when the user closes the browser
+    window
+  * `remove_cookie(cookie, key)`, forces a value to expire
+
+:URL handling:
+  * `url()`, returns the full URL of the HTTP request
+  * `base_url()`, returns the root URL of the web application
+  * `relative_path()`, returns the relative path of the request
+
+:And more...:
+  * `set_content_type(content_type, filename=None)`, adds the header HTTP
+    'Content-Type'
+  * `get_header(header)`, returns the value associated to an arbitrary header
+    of the HTTP request
+  * `set_header(header, value)`, adds an arbitrary header in the response
+  * `cursor()` returns a RQL cursor on the session
+  * `execute(*args, **kwargs)`, shortcut to ``.cursor().execute()``
+  * `property_value(key)`, properties management (`EProperty`)
+  * dictionnary `data` to store data to share informations between components
+    *while a request is executed*
+
+Please note down that this class is abstract and that a concrete implementation
+will be provided by the *frontend* web used (in particular *twisted* as of
+today). For the views or others that are executed on the server side,
+most of the interface of `Request` is defined in the session associated
+to the client.
+
+The `AppObject` class
+~~~~~~~~~~~~~~~~~~~~~
+
+In general:
+
+* we do not inherit directly from this class but from a more specific
+  class such as `AnyEntity`, `EntityView`, `AnyRsetView`,
+  `Action`...
+
+* to be recordable, a subclass has to define its own register (attribute
+  `__registry__`) and its identifier (attribute `id`). Usually we do not have
+  to take care of the register, only the identifier `id`.
+
+We can find a certain number of attributes and methods defined in this class 
+and so common to all the application objects:
+
+At the recording, the following attributes are dynamically added to
+the *subclasses*:
+
+* `vreg`, the `vregistry` of the application
+* `schema`, the application schema
+* `config`, the application configuration
+
+We also find on instances, the following attributes:
+
+* `req`, `Request` instance
+* `rset`, the *result set* associated to the object if necessarry
+* `cursor`, rql cursor on the session
+
+
+:URL handling:
+  * `build_url(method=None, **kwargs)`, returns an absolute URL based on
+    the given arguments. The *controller* supposed to handle the response
+    can be specified through the special parameter `method` (the connection
+    is theoretically done automatically :).
+
+  * `datadir_url()`, returns the directory of the application data
+    (contains static files such as images, css, js...)
+
+  * `base_url()`, shortcut to `req.base_url()`
+
+  * `url_quote(value)`, version *unicode safe* of the function `urllib.quote`
+
+:Data manipulation:
+
+  * `etype_rset(etype, size=1)`, shortcut to `vreg.etype_rset()`
+
+  * `eid_rset(eid, rql=None, descr=True)`, returns a *result set* object for
+    the given eid
+  * `entity(row, col=0)`, returns the entity corresponding to the data position
+    in the *result set* associated to the object
+
+  * `complete_entity(row, col=0, skip_bytes=True)`, is equivalent to `entity` but
+    also call the method `complete()` on the entity before returning it
+
+:Data formatting:
+  * `format_date(date, date_format=None, time=False)`
+  * `format_time(time)`
+
+:And more...:
+
+  * `external_resource(rid, default=_MARKER)`, access to a value defined in the
+    configuration file `external_resource`
+    
+  * `tal_render(template, variables)`, 
+
+
+.. note::
+  When we inherit from `AppObject` (even not directly), you *always* have to use
+  **super()** to get the methods and attributes of the superclasses, and not
+  use the class identifier.
+  By example, instead of writting: ::
+
+      class Truc(PrimaryView):
+          def f(self, arg1):
+              PrimaryView.f(self, arg1)
+
+  You'd better write: ::
+  
+      class Truc(PrimaryView):
+          def f(self, arg1):
+              super(Truc, self).f(arg1)
+
+
+Standard structure for a cube
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A complex cube is structured as follows:
+
+::
+  
+  mycube/
+  |
+  |-- schema.py
+  |
+  |-- entities/
+  |
+  |-- sobjects/
+  |
+  |-- views/
+  |
+  |-- test/
+  |
+  |-- i18n/
+  |
+  |-- data/
+  |
+  |-- migration/
+  | |- postcreate.py
+  | \- depends.map
+  |
+  |-- debian/
+  |
+  \-- __pkginfo__.py
+    
+We can use simple Python module instead of packages, by example: 
+
+::
+  
+  mycube/
+  |
+  |-- entities.py
+  |-- hooks.py
+  \-- views.py
+    
+
+where :
+
+* ``schema`` contains the schema definition (server side only)
+* ``entities`` contains the entities definition (server side and web interface)
+* ``sobjects`` contains hooks and/or views notifications (server side only)
+* ``views`` contains the different components of the web interface (web interface only)
+* ``test`` contains tests specifics to the application (not installed)
+* ``i18n`` contains the messages catalog for supported languages (server side and 
+  web interface) 
+* ``data`` contains arbitrary data files served statically
+  (images, css, javascripts files)... (web interface only)
+* ``migration`` contains the initialization file for new instances
+  (``postcreate.py``) and in general a file containing the `CubicWeb` dependancies 
+  of the cube depending on its version (``depends.map``)
+* ``debian`` contains all the files that manages the debian packaging
+  (you would find there the classical structure with ``control``, ``rules``, 
+  ``changelog``... (not installed)
+* the file ``__pkginfo__.py`` provides meta-data on the cube, especially the 
+  distribution name and the current version (server side and web interface) or
+  also the sub-cubes used by this cube
+
+The only required files are:
+
+* the file ``__pkginfo__.py``
+* the schema definition
+  XXX false, we may want to have cubes which are only adding a service, no persistent data (eg embeding for instance)
+