doc/book/en/A03a-concepts.en.txt
author Nicolas Chauvat <nicolas.chauvat@logilab.fr>
Sat, 22 Nov 2008 23:59:42 +0100
[doc] divided book in parts

.. -*- coding: utf-8 -*-

Concepts
--------

Global architecture
~~~~~~~~~~~~~~~~~~~
.. image:: images/archi_globale.png

.. note::
  In practice, the client and server sides are integrated in the same
  process and interact directly, without the need for remote calls
  using Pyro. It is important to note that the two sides, client and
  server, remain separate: they can be run in distinct processes to
  balance the load of your web site across one or more machines.

.. _TermsVocabulary:

Terms and vocabulary
~~~~~~~~~~~~~~~~~~~~~

*schema*
  The schema defines the data model of an application in terms of entities
  and relations, modeled with a comprehensive language made of Python
  classes based on the `yams`_ library. This is the core piece
  of an application. It is initially defined in the file system and is
  stored in the database at the time an instance is created. `CubicWeb`
  provides a number of system entities that are included automatically
  because the core of `CubicWeb` needs them, and a library of
  cubes that can be explicitly included if necessary.


*entity type*
  An entity is a set of attributes; the essential attribute of
  an entity is its key, named `eid`.

*relation type*
  Entities are linked to each other by relations. In `CubicWeb`
  relations are binary: by convention we name the first item of
  a relation the `subject` and the second the `object`.

*final entity type*
  Final types correspond to the basic types such as character strings,
  integers... The main property of those types is that they can
  only be used as the `object` of a relation. The attributes of a
  (non-final) entity are (final) entities.

*final relation type*
  A relation is said to be final if its `object` is a final type. This is
  equivalent to an entity attribute.

*relation definition*
  A relation definition is a 3-tuple (subject entity type, relation type, object entity type),
  with an associated set of properties such as cardinality, constraints...
  
*repository*
  This is the RQL server side of `CubicWeb`. Be careful not to confuse
  it with a Mercurial repository or a Debian repository.

*source*
  A data source is a container of data (RDBMS, LDAP directory, `Google
  App Engine`'s datastore ...) integrated in the
  `CubicWeb` repository. This repository has at least one source, `system`, which
  contains the schema of the application, the plain-text index and other
  vital information for the system.

*configuration*
  It is possible to create different configurations for an instance:

  - ``repository`` : repository only, accessible to clients using Pyro
  - ``twisted`` : web interface only, accesses the repository using Pyro
  - ``all-in-one`` : web interface and repository in a single process.
    The repository may or may not be accessible using Pyro.

*cube*
  A cube is a model grouping one or more data types and/or views
  to provide a specific functionality or a complete `CubicWeb` application,
  potentially using other cubes. The available cubes are located in the file
  system at `/path/to/forest/cubicweb/cubes`.
  Larger applications can be built faster by importing cubes,
  adding entities and relationships and overriding the
  views that need to display or edit information not provided by
  cubes.

*instance*
  An instance is a specific installation of a cube. All the
  configuration files required to run your web application
  are grouped in an instance, which refers to the cube(s) your application
  is based on.
  For example, logilab.org and our intranet are two instances of a single
  cube, jpl, developed internally.
  The instances are defined in the directory `~/etc/cubicweb.d`.

*application*
  The term application is sometimes used to talk about an instance
  and sometimes to talk about a cube, depending on the context.
  We therefore try to avoid this term and use *cube* and
  *instance* instead.

*result set*
  This object contains the results of an RQL query sent to the source
  and information on the query.

*Pyro*
  `Python Remote Object`_, a distributed object system similar to Java's RMI
  (Remote Method Invocation), which can be used for the communication between
  the web side of the framework and the RQL repository.

*query language*
  A full-blown query language named RQL is used to formulate requests
  to the database or to any other source such as LDAP or
  `Google App Engine`'s datastore.

*views*
  A view is applied to a `result set` to present it as HTML, XML,
  JSON, CSV, etc. Views are implemented as Python classes. There is no
  templating language.

*generated user interface*
  A user interface is generated on-the-fly from the schema definition:
  entities can be created, displayed, updated and deleted. As display
  views are not very fancy, it is usually necessary to develop your
  own. Any generated view can be overridden by defining a new one with
  the same identifier.

*rql*
 XXX
 
.. _`Python Remote Object`: http://pyro.sourceforge.net/
.. _`yams`: http://www.logilab.org/project/yams/
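
To make the schema vocabulary above more concrete, here is a minimal sketch in
plain Python. It deliberately does not use the yams API: the names
``RelationDefinition``, ``FINAL_TYPES`` and ``is_final`` are hypothetical, and
the entity types and property values are only illustrative. It shows how a
relation definition ties together a subject type, a relation type and an
object type, and how a final relation plays the role of an attribute.

```python
from dataclasses import dataclass, field

# Hypothetical set of final (basic) types, for illustration only.
FINAL_TYPES = {"String", "Int", "Date"}


@dataclass(frozen=True)
class RelationDefinition:
    """A (subject type, relation type, object type) triple plus properties."""
    subject_type: str
    relation_type: str
    object_type: str
    # Properties such as cardinality or constraints (illustrative values).
    properties: dict = field(default_factory=dict, compare=False)

    def is_final(self):
        # A relation is final when its object is a final type,
        # which makes it equivalent to an entity attribute.
        return self.object_type in FINAL_TYPES


# "Person name String" behaves like an attribute of Person.
name_def = RelationDefinition("Person", "name", "String",
                              {"cardinality": "11"})
# "Person works_for Company" links two non-final entity types.
works_for_def = RelationDefinition("Person", "works_for", "Company",
                                   {"cardinality": "?*"})

print(name_def.is_final(), works_for_def.is_final())
```

In this sketch ``name_def`` is final and ``works_for_def`` is not, which matches
the distinction between an attribute and a relation between two entities.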


`CubicWeb` engine
~~~~~~~~~~~~~~~~~

The engine in `CubicWeb` is a set of classes managing a set of objects loaded
dynamically at the startup of `CubicWeb` (*appobjects*). Those dynamic objects, based on the schema
or the library, build the final application. The different dynamic components are,
for example:

* client and server side

  - entity definitions, containing the logic which enables application data manipulation

* client side

  - *views*, or more specifically

    - boxes
    - header and footer
    - forms
    - page templates

  - *actions*
  - *controllers*

* server side

  - notification hooks
  - notification views

The components of the engine are:

* a web front end (only twisted is available so far), transparent for dynamic objects
* an object that encapsulates the configuration
* a `registry` (`cubicweb.cwvreg`) containing the dynamic objects loaded automatically

Every *appobject* may access the instance configuration using its *config* attribute
and the registry using its *vreg* attribute.
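
The relationship between the registry, the configuration and the dynamic
objects can be pictured with a toy model. This is only a sketch under stated
assumptions: the ``Registry`` class and its ``register`` and ``select`` methods
are hypothetical and far simpler than `cubicweb.cwvreg`, and the rule that the
last registration wins is a simplification used here to illustrate overriding
an object by reusing its identifier.

```python
class Registry:
    """Toy registry: objects are stored per registry name, then per id."""

    def __init__(self, config):
        self.config = config
        self._objects = {}

    def register(self, cls):
        # Make the configuration and the registry reachable from the class,
        # mirroring the `config` and `vreg` attributes described above.
        cls.config = self.config
        cls.vreg = self
        self._objects.setdefault(cls.__registry__, {})[cls.id] = cls
        return cls

    def select(self, registry, oid):
        # Return the class registered under this registry name and id.
        return self._objects[registry][oid]


class PrimaryView:
    __registry__ = "views"
    id = "primary"

    def render(self):
        return "default primary view"


class MyPrimaryView(PrimaryView):
    # Same identifier: registering this class replaces the default view.
    def render(self):
        return "custom primary view for " + self.config["instance"]


vreg = Registry(config={"instance": "demo"})
vreg.register(PrimaryView)
vreg.register(MyPrimaryView)

view = vreg.select("views", "primary")()
print(view.render())
```

The view obtained from the registry is the overriding one, and it reaches the
configuration through its ``config`` attribute without receiving it explicitly.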

API Python/RQL
~~~~~~~~~~~~~~

The API is inspired by the standard DB-API, with a `Connection` object whose
essential methods are `cursor`, `rollback` and `commit`. The most important method is
the `execute` method of a cursor:

`execute(rqlstring, args=None, eid_key=None, build_descr=True)`

:rqlstring: the RQL query to execute (unicode)
:args: if the query contains substitutions, a dictionary containing the values to use
:eid_key: 
   an implementation detail of the RQL query cache implies that if a substitution
   is used to introduce an eid *that may cause ambiguities in the query
   type resolution*, then the corresponding key in the dictionary has to be
   specified through this argument


The `Connection` object owns the methods `commit` and `rollback`. You *should
never need to use them* while developing a web interface based on
the `CubicWeb` framework, as the framework determines the end of the transaction
depending on the success of the query execution.

.. note::
  While executing update queries (SET, INSERT, DELETE), if a query generates
  an error related to security, the current transaction is automatically
  rolled back.
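
Since the API follows the DB-API pattern, the connection, cursor and
transaction workflow can be illustrated with another DB-API module. The sketch
below uses ``sqlite3`` from the Python standard library, so it runs SQL rather
than RQL, and sqlite3's ``:name`` placeholders stand in for RQL substitutions.
The table and values are made up for the example; only the overall shape
(cursor, ``execute`` with a dictionary of arguments, ``commit``, ``rollback``)
is meant to carry over.

```python
import sqlite3

# sqlite3 follows the same DB-API pattern: connection -> cursor -> execute.
cnx = sqlite3.connect(":memory:")
cursor = cnx.cursor()
cursor.execute("CREATE TABLE person (eid INTEGER PRIMARY KEY, name TEXT)")

# Values are passed separately from the query text, in a dictionary.
cursor.execute("INSERT INTO person (name) VALUES (:name)", {"name": "Alice"})
cnx.commit()  # end the transaction and keep the changes

cursor.execute("INSERT INTO person (name) VALUES (:name)", {"name": "Bob"})
cnx.rollback()  # end the transaction and discard the pending insert

cursor.execute("SELECT name FROM person WHERE eid = :eid", {"eid": 1})
rows = cursor.fetchall()
print(rows)
cnx.close()
```

Only the committed row is returned by the final query; the insert that was
rolled back leaves no trace.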
  

The `Request` class (`cubicweb.web`)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A request instance is created when an HTTP request is sent to the web server.
It contains information such as form parameters, the authenticated user, etc.

**More generally, a request represents a user query, whether it comes through HTTP or not
(we also talk about RQL queries on the server side, for example).**

An instance of `Request` has the following attributes:

* `user`, instance of `cubicweb.common.utils.User` corresponding to the authenticated
  user
* `form`, dictionary containing the values of a web form
* `encoding`, characters encoding to use in the response

But also:

:Session data handling:
  * `session_data()`, returns a dictionary containing all the session data
  * `get_session_data(key, default=None)`, returns the value associated with the given
    key or the value `default` if the key is not defined
  * `set_session_data(key, value)`, assigns a value to a key
  * `del_session_data(key)`, deletes the value associated with a key
    

:Cookies handling:
  * `get_cookie()`, returns a dictionary containing the value of the HTTP
    header 'Cookie'
  * `set_cookie(cookie, key, maxage=300)`, adds an HTTP `Set-Cookie` header,
    with a default lifetime of 5 minutes (`maxage` = None
    produces a *session* cookie, which expires when the user closes the browser
    window)
  * `remove_cookie(cookie, key)`, forces a value to expire

:URL handling:
  * `url()`, returns the full URL of the HTTP request
  * `base_url()`, returns the root URL of the web application
  * `relative_path()`, returns the relative path of the request

:And more...:
  * `set_content_type(content_type, filename=None)`, adds the HTTP header
    'Content-Type'
  * `get_header(header)`, returns the value associated with an arbitrary header
    of the HTTP request
  * `set_header(header, value)`, adds an arbitrary header to the response
  * `cursor()`, returns an RQL cursor on the session
  * `execute(*args, **kwargs)`, shortcut to ``.cursor().execute()``
  * `property_value(key)`, properties management (`EProperty`)
  * dictionary `data`, used to share information between components
    *while a request is executed*

Please note that this class is abstract and that a concrete implementation
is provided by the web *frontend* in use (in particular *twisted* as of
today). For views or other objects that are executed on the server side,
most of the interface of `Request` is defined in the session associated
with the client.
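
The difference between a cookie with a lifetime and a *session* cookie, as
described for ``set_cookie`` above, can be sketched with the standard library
alone. The helper ``build_set_cookie`` below is hypothetical and is not the
`Request` method (its signature differs on purpose); it only shows what the
resulting ``Set-Cookie`` header looks like in each case.

```python
from http.cookies import SimpleCookie


def build_set_cookie(key, value, maxage=300):
    """Return a Set-Cookie header line; maxage=None yields a session cookie."""
    cookie = SimpleCookie()
    cookie[key] = value
    if maxage is not None:
        # Lifetime in seconds; 300 seconds is 5 minutes.
        cookie[key]["max-age"] = maxage
    return cookie.output()


timed = build_set_cookie("lang", "en")
session = build_set_cookie("lang", "en", maxage=None)
print(timed)
print(session)
```

Without a ``Max-Age`` attribute the browser keeps the cookie only for the
current browsing session, which is the behaviour described for ``maxage`` =
None.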

The `AppObject` class
~~~~~~~~~~~~~~~~~~~~~

In general:

* we do not inherit directly from this class but from a more specific
  class such as `AnyEntity`, `EntityView`, `AnyRsetView`,
  `Action`...

* to be registrable, a subclass has to define its own registry (attribute
  `__registry__`) and its identifier (attribute `id`). Usually we do not have
  to take care of the registry, only of the identifier `id`.

This class defines a number of attributes and methods that are
therefore common to all the application objects.

At registration time, the following attributes are dynamically added to
the *subclasses*:

* `vreg`, the `vregistry` of the application
* `schema`, the application schema
* `config`, the application configuration

Instances also have the following attributes:

* `req`, `Request` instance
* `rset`, the *result set* associated with the object if necessary
* `cursor`, RQL cursor on the session


:URL handling:
  * `build_url(method=None, **kwargs)`, returns an absolute URL based on
    the given arguments. The *controller* supposed to handle the response
    can be specified through the special parameter `method` (in principle,
    the connection is made automatically).

  * `datadir_url()`, returns the directory of the application data
    (contains static files such as images, css, js...)

  * `base_url()`, shortcut to `req.base_url()`

  * `url_quote(value)`, *unicode safe* version of the function `urllib.quote`

:Data manipulation:

  * `etype_rset(etype, size=1)`, shortcut to `vreg.etype_rset()`

  * `eid_rset(eid, rql=None, descr=True)`, returns a *result set* object for
    the given eid
  * `entity(row, col=0)`, returns the entity corresponding to the data position
    in the *result set* associated to the object

  * `complete_entity(row, col=0, skip_bytes=True)`, is equivalent to `entity` but
    also calls the method `complete()` on the entity before returning it

:Data formatting:
  * `format_date(date, date_format=None, time=False)`
  * `format_time(time)`

:And more...:

  * `external_resource(rid, default=_MARKER)`, gives access to a value defined in the
    configuration file `external_resource`
    
  * `tal_render(template, variables)`, 
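
As an illustration of what a *unicode safe* quoting function such as
``url_quote`` amounts to, here is a minimal sketch written for Python 3, where
the function lives in ``urllib.parse``. The choice of UTF-8 is an assumption
made for the example, and this stand-in is not the framework's implementation.

```python
from urllib.parse import quote


def url_quote(value):
    """Percent-encode a text value, encoding non-ASCII characters as UTF-8."""
    return quote(value.encode("utf-8"))


quoted = url_quote("café au lait")
print(quoted)
```

Non-ASCII characters are first turned into bytes and each byte is then
percent-encoded, so the result contains only characters that are safe in a URL.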


.. note::
  When you inherit from `AppObject` (even indirectly), you *always* have to use
  **super()** to get the methods and attributes of the superclasses, and not
  use the class identifier.
  For example, instead of writing: ::

      class Truc(PrimaryView):
          def f(self, arg1):
              PrimaryView.f(self, arg1)

  You should write: ::
  
      class Truc(PrimaryView):
          def f(self, arg1):
              super(Truc, self).f(arg1)
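
The text above does not say why **super()** is required. One general Python
reason, which may or may not be the one that matters for the framework, is
that naming the parent class bypasses the method resolution order when several
classes are combined. The sketch below uses stand-in classes: ``PrimaryView``
here is a plain class, and ``LoggingMixin`` and ``HardCodedTruc`` are
hypothetical names introduced for the comparison.

```python
class PrimaryView:
    def f(self, arg1):
        return ["PrimaryView.f(%s)" % arg1]


class LoggingMixin(PrimaryView):
    def f(self, arg1):
        return ["LoggingMixin.f(%s)" % arg1] + super().f(arg1)


class Truc(PrimaryView):
    def f(self, arg1):
        # super() follows the method resolution order of the actual instance.
        return ["Truc.f(%s)" % arg1] + super(Truc, self).f(arg1)


class HardCodedTruc(PrimaryView):
    def f(self, arg1):
        # Naming the parent class jumps straight to PrimaryView.
        return ["HardCodedTruc.f(%s)" % arg1] + PrimaryView.f(self, arg1)


class A(Truc, LoggingMixin):
    pass


class B(HardCodedTruc, LoggingMixin):
    pass


calls_with_super = A().f(1)
calls_hard_coded = B().f(1)
print(calls_with_super)
print(calls_hard_coded)
```

With ``super()`` every class in the hierarchy gets its turn, whereas the
hard-coded call silently skips ``LoggingMixin.f``.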


Standard structure for a cube
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A complex cube is structured as follows:

::
  
  mycube/
  |
  |-- schema.py
  |
  |-- entities/
  |
  |-- sobjects/
  |
  |-- views/
  |
  |-- test/
  |
  |-- i18n/
  |
  |-- data/
  |
  |-- migration/
  | |- postcreate.py
  | \- depends.map
  |
  |-- debian/
  |
  \-- __pkginfo__.py
    
We can use simple Python modules instead of packages, for example:

::
  
  mycube/
  |
  |-- entities.py
  |-- hooks.py
  \-- views.py
    

where:

* ``schema`` contains the schema definition (server side only)
* ``entities`` contains the entity definitions (server side and web interface)
* ``sobjects`` contains hooks and/or notification views (server side only)
* ``views`` contains the different components of the web interface (web interface only)
* ``test`` contains tests specific to the application (not installed)
* ``i18n`` contains the message catalogs for supported languages (server side and
  web interface)
* ``data`` contains arbitrary data files served statically
  (images, css, javascript files...) (web interface only)
* ``migration`` contains the initialization file for new instances
  (``postcreate.py``) and in general a file containing the `CubicWeb` dependencies
  of the cube depending on its version (``depends.map``)
* ``debian`` contains all the files that manage the Debian packaging
  (you will find there the classical structure with ``control``, ``rules``,
  ``changelog``...) (not installed)
* the file ``__pkginfo__.py`` provides metadata on the cube, especially the
  distribution name and the current version (server side and web interface), as
  well as the sub-cubes used by this cube

The only required files are:

* the file ``__pkginfo__.py``
* the schema definition
  XXX false, we may want to have cubes which are only adding a service, no persistent data (eg embedding for instance)
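
The list of required files can be turned into a small check. This is a sketch
only: the helper ``missing_required_files`` is hypothetical and not part of
`CubicWeb`, accepting either a ``schema.py`` module or a ``schema`` package is
an assumption, and, as the note above suggests, a cube without persistent data
might legitimately have no schema at all.

```python
import tempfile
from pathlib import Path


def missing_required_files(cube_dir):
    """Return the names of required cube files that are absent."""
    cube_dir = Path(cube_dir)
    missing = []
    if not (cube_dir / "__pkginfo__.py").is_file():
        missing.append("__pkginfo__.py")
    # Assumption: the schema may be a single module or a package.
    has_schema = ((cube_dir / "schema.py").is_file()
                  or (cube_dir / "schema").is_dir())
    if not has_schema:
        missing.append("schema")
    return missing


# Try the check on a throwaway directory standing in for a cube.
with tempfile.TemporaryDirectory() as tmp:
    cube = Path(tmp) / "mycube"
    cube.mkdir()
    before = missing_required_files(cube)
    (cube / "__pkginfo__.py").write_text("")
    (cube / "schema.py").write_text("")
    after = missing_required_files(cube)

print(before, after)
```

The empty directory is reported as missing both files, and nothing is reported
once ``__pkginfo__.py`` and ``schema.py`` exist.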