doc/book/en/C050-rql.en.txt
changeset 1808 aa09e20dd8c0
parent 1693 49075f57cf2c
parent 1807 6d541c610165
child 1810 e95e876be17c
equal deleted inserted replaced
1693:49075f57cf2c 1808:aa09e20dd8c0
     1 .. -*- coding: utf-8 -*-
       
     2 
       
     3 .. _RQL:
       
     4 
       
     5 ======================================
       
     6 RQL language (Relation Query Language)
       
     7 ======================================
       
     8 
       
     9 Introduction
       
    10 ============
       
    11 
       
    12 Goals of RQL
       
    13 ------------
       
    14 
       
    15 The goal is to have a language emphasizing the way of browsing
       
    16 relations. As such, attributes will be regarded as cases of
       
    17 special relations (in terms of implementation, the language
       
    18 user should see virtually no difference between an attribute and a
       
    19 relation).
       
    20 
       
    21 RQL is inspired by SQL but is the highest level. A knowledge of the 
       
    22 `CubicWeb` schema defining the application is necessary.
       
    23 
       
    24 Comparison with existing languages
       
    25 ----------------------------------
       
    26 
       
    27 SQL
       
    28 ```
       
    29 RQL builds on the features of SQL but is at a higher level
       
    30 (the current implementation of RQL generates SQL). For that it is limited
       
    31 to the way of browsing relations and introduces variables. 
       
    32 The user does not need to know the model underlying SQL, but the `CubicWeb` 
       
    33 schema defining the application.
       
    34 
       
    35 Versa
       
    36 `````
       
    37 We should look in more detail, but here are already some ideas for
       
    38 the moment ... Versa_ is the language most similar to what we wanted
       
    39 to do, but the model underlying data being RDF, there is some
       
    40 number of things such as namespaces or handling of the RDF types which 
       
    41 does not interest us. On the functionality level, Versa_ is very comprehensive
       
    42 including through many functions of conversion and basic types manipulation,
       
    43 which may need to be guided at one time or another. 
       
    44 Finally, the syntax is a little esoteric.
       
    45 
       
    46 Sparql
       
    47 ``````
       
    48 
       
    49 The query language most similar to RQL is SPARQL_, defined by the W3C to serve
       
    50 for the semantic web. 
       
    51 
       
    52 
       
    53 The different types of queries
       
    54 ------------------------------
       
    55 
       
    56 Search (`Any`)
       
    57    Extract entities and attributes of entities.
       
    58 
       
    59 Insert entities (`INSERT`)
       
    60    Insert new entities or relations in the database.
       
    61    It can also directly create relationships for the newly created entities.
       
    62 
       
    63 Update entities, create relations (`SET`)
       
    64    Update existing entities in the database,
       
    65    or create relations between existing entities.
       
    66 
       
    67 Delete entities or relationship (`DELETE`)
       
    68    Remove entities or relations existing in the database.
       
    69 
       
    70 Search Query
       
    71 ------------
       
    72 
       
    73    [ `DISTINCT`] <entity type> V1 (V2) \ *
       
    74    [ `GROUPBY` V1 (V2) \*] [ `ORDERBY` <orderterms>]
       
    75    [ `WHERE` <restriction>]
       
    76    [ `LIMIT` <value>] [ `OFFSET` <value>]
       
    77 
       
    78 :entity type:
       
    79    Type of selected variables.
       
    80    The special type `Any` is equivalent to not specify a type.
       
    81 :restriction:
       
    82    list of conditions to test successively 
       
    83      `V1 relation V2 | <static value>`
       
    84 :orderterms:
       
    85    Definition of the selection order: variable or column number followed by
       
    86    sorting method ( `ASC`, `DESC`), ASC is the default.
       
    87 :note for grouped queries:
       
    88    For grouped queries (e.g., a clause `GROUPBY`), all
       
    89    selected variables must be aggregated or grouped.
       
    90 
       
    91 
       
    92 
       
    93 - *Search for the object of identifier 53*
       
    94   ::
       
    95 
       
    96         Any WHERE X
       
    97         X eid 53
       
    98 
       
    99 - *Search material such as comics, owned by syt and available*
       
   100   ::
       
   101 
       
   102         WHERE X Document
       
   103         X occurence_of F, F class C, C name 'Comics'
       
   104         X owned_by U, U login 'syt'
       
   105         X available true
       
   106 
       
   107 - *Looking for people working for eurocopter interested in training*
       
   108   ::
       
   109 
       
   110         Any P WHERE
       
   111         P is Person, P work_for S, S name 'Eurocopter'
       
   112         P interested_by T, T name 'training'
       
   113 
       
   114 - *Search note less than 10 days old written by jphc or ocy*
       
   115   ::
       
   116 
       
   117         Any N WHERE
       
   118         N is Note, N written_on D, D day> (today -10),
       
   119         N written_by P, P name 'jphc' or P name 'ocy'
       
   120 
       
   121 - *Looking for people interested in training or living in Paris*
       
   122   ::
       
   123 
       
   124         Any P WHERE
       
   125         P is Person, (P interested_by T, T name 'training') OR
       
   126         (P city 'Paris')
       
   127 
       
   128 - *The name and surname of all people*
       
   129   ::
       
   130 
       
   131         Any N, P WHERE
       
   132         X is Person, X name N, X first_name P
       
   133 
       
   134   Note that the selection of several entities generally force
       
   135   the use of "Any" because the type specification applies otherwise
       
   136   to all the selected variables. We could write here
       
   137   ::
       
   138 
       
   139         String N, P WHERE
       
   140         X is Person, X name N, X first_name P
       
   141 
       
   142 
       
   143   Note: You can not specify several types with * ... where X is FirstType or X is SecondType*.
       
   144   To specify several types explicitely, you have to do
       
   145 
       
   146   ::
       
   147 
       
   148         Any X where X is in (FirstType, SecondType)
       
   149 
       
   150 
       
   151 Insertion query
       
   152 ---------------
       
   153 
       
   154     `INSERT` <entity type> V1 (, <entity type> V2) \ * `:` <assignments>
       
   155     [ `WHERE` <restriction>]
       
   156 
       
   157 :assignments:
       
   158    list of relations to assign in the form `V1 relationship V2 | <static value>`
       
   159 
       
   160 The restriction can define variables used in assignments.
       
   161 
       
   162 Caution, if a restriction is specified, the insertion is done for 
       
   163 *each line result returned by the restriction*.
       
   164 
       
   165 - *Insert a new person named 'foo'*
       
   166   ::
       
   167 
       
   168         INSERT Person X: X name 'foo'
       
   169 
       
   170 - *Insert a new person named 'foo', another called 'nice' and a 'friend' relation
       
   171   between them*
       
   172   ::
       
   173 
       
   174         INSERT Person X, Person Y: X name 'foo', Y name 'nice', X friend Y
       
   175 
       
   176 - *Insert a new person named 'foo' and a 'friend' relation with an existing 
       
   177   person called 'nice'*
       
   178   ::
       
   179 
       
   180         INSERT Person X: X name 'foo', X friend  Y WHERE name 'nice'
       
   181 
       
   182 Update and relation creation queries
       
   183 ------------------------------------
       
   184     `SET` <assignements>
       
   185     [ `WHERE` <restriction>]
       
   186 
       
   187 Caution, if a restriction is specified, the update is done *for
       
   188 each result line returned by the restriction*.
       
   189 
       
   190 - *Renaming of the person named 'foo' to 'bar' with the first name changed*
       
   191   ::
       
   192 
       
   193         SET X name 'bar', X first_name 'original' WHERE X is Person, X name 'foo'
       
   194 
       
   195 - *Insert a relation of type 'know' between objects linked by 
       
   196   the relation of type 'friend'*
       
   197   ::
       
   198 
       
   199         SET X know Y  WHERE X friend Y
       
   200 
       
   201 
       
   202 Deletion query
       
   203 --------------
       
   204     `DELETE` (<entity type> V) | (V1 relation v2 ),...
       
   205     [ `WHERE` <restriction>]
       
   206 
       
   207 Caution, if a restriction is specified, the deletion is made *for
       
   208 each line result returned by the restriction*.
       
   209 
       
   210 - *Deletion of the person named 'foo'*
       
   211   ::
       
   212 
       
   213         DELETE Person X WHERE X name 'foo'
       
   214 
       
   215 - *Removal of all relations of type 'friend' from the person named 'foo'*
       
   216   ::
       
   217 
       
   218         DELETE X friend Y WHERE X is Person, X name 'foo'
       
   219 
       
   220 
       
   221 (yet) Undocumented types of queries
       
   222 -----------------------------------
       
   223 
       
   224 **Limit / offset**
       
   225 ::
       
   226     
       
   227     Any P ORDERBY N LIMIT 5 OFFSET 10 WHERE P is Person, P firstname N
       
   228 
       
   229 **Function calls**
       
   230 ::
       
   231     
       
   232     Any UPPER(N) WHERE P firstname N
       
   233 
       
   234 **Exists**
       
   235 ::
       
   236     
       
   237     Any X ORDERBY PN,N
       
   238     WHERE X num N, X version_of P, P name PN, 
       
   239           EXISTS(X in_state S, S name IN ("dev", "ready"))
       
   240           OR EXISTS(T tags X, T name "priority")
       
   241 
       
   242 **Left outer join**
       
   243 ::
       
   244 
       
   245     Any T,P,V WHERE T is Ticket, T concerns P, T done_in V?
       
   246     
       
   247     
       
   248 **Having**
       
   249 ::
       
   250     
       
   251     Any X GROUPBY X WHERE X knows Y HAVING COUNT(Y) > 10
       
   252 
       
   253 **Simple union**
       
   254 ::
       
   255 
       
   256     (Any X WHERE X is Person) UNION (Any X WHERE X is Company)
       
   257     
       
   258 **Complex union**
       
   259 ::
       
   260 
       
   261      DISTINCT Any W, REF
       
   262         WITH W, REF BEING 
       
   263             (
       
   264 	      (Any W, REF WHERE W is Workcase, W ref REF, 
       
   265                                  W concerned_by D, D name "Logilab")
       
   266                UNION 
       
   267               (Any W, REF WHERE W is Workcase, W ref REF, '
       
   268                                 W split_into WP, WP name "WP1")
       
   269             )
       
   270 
       
   271 
       
   272 Language definition
       
   273 ===================
       
   274 
       
   275 Reserved keywords
       
   276 -----------------
       
   277 The keywords are not case sensitive.
       
   278 
       
   279 ::
       
   280 
       
   281      DISTINCT, INSERT, SET, DELETE,
       
   282      WHERE, AND, OR, NOT, EXISTS,
       
   283      IN, LIKE, UNION, WITH, BEING,
       
   284      TRUE, FALSE, NULL, TODAY, NOW,
       
   285      LIMIT, OFFSET,
       
   286      HAVING, GROUPBY, ORDERBY, ASC, DESC
       
   287 
       
   288 
       
   289 Variables and Typing
       
   290 --------------------
       
   291 
       
   292 With RQL, we do not distinguish between entities and attributes. The
       
   293 value of an attribute is considered an entity of a particular type (see
       
   294 below), linked to one (real) entity by a relation called the name of
       
   295 the attribute.
       
   296 
       
   297 Entities and values to browse and/or select are represented in
       
   298 the query by *variables* that must be written in capital letters.
       
   299 
       
   300 There is a special type **Any**, referring to a non specific type.
       
   301 
       
   302 We can restrict the possible types for a variable using the
       
   303 special relation **is**.
       
   304 The possible type(s) for each variable is derived from the schema
       
   305 according to the constraints expressed above and thanks to the relations between
       
   306 each variable.
       
   307 
       
   308 Built-in types
       
   309 ``````````````
       
   310 
       
   311 The base types supported are string (between double or single quotes),
       
   312 integers or floats (the separator is '.'), dates and
       
   313 boolean. We expect to receive a schema in which types String,
       
   314 Int, Float, Date and Boolean are defined.
       
   315 
       
   316 * `String` (literal: between double or single quotes).
       
   317 * `Int`, `Float` (separator being'.').
       
   318 * `Date`, `Datetime`, `Time` (literal: string YYYY/MM/DD [hh:mm] or keywords
       
   319   `TODAY` and `NOW`).
       
   320 * `Boolean` (keywords `TRUE` and `FALSE`).
       
   321 * `Keyword` NULL.
       
   322 
       
   323 
       
   324 Operators
       
   325 ---------
       
   326 
       
   327 Logical Operators
       
   328 `````````````````
       
   329 ::
       
   330 
       
   331      AND, OR, NOT, ','
       
   332 
       
   333 ',' is equivalent to 'AND' but with the smallest among the priority
       
   334 of logical operators (see :ref:`PriorityOperators`).
       
   335 
       
   336 Mathematical Operators
       
   337 ``````````````````````
       
   338 ::
       
   339 
       
   340      +, -, *, /
       
   341 
       
   342 Comparison operators
       
   343 ````````````````````
       
   344 ::
       
   345 
       
   346      =, <, <=, >=, >, ~=, IN, LIKE
       
   347 
       
   348 * The operator `=` is the default operator.
       
   349 
       
   350 * The operator `LIKE` equivalent to `~=` can be used with the
       
   351   special character `%` in a string to indicate that the chain 
       
   352   must start or finish by a prefix/suffix:
       
   353   ::
       
   354 
       
   355      Any X WHERE X name ~= 'Th%'
       
   356      Any X WHERE X name LIKE '%lt'
       
   357 
       
   358 * The operator `IN` provides a list of possible values:
       
   359   ::
       
   360   
       
   361     Any X WHERE X name IN ( 'chauvat', 'fayolle', 'di mascio', 'thenault')
       
   362 
       
   363 
       
   364 XXX nico: "A trick <> 'bar'" wouldn't it be more convenient than 
       
   365 "NOT A trick 'bar'" ?
       
   366 
       
   367 .. _PriorityOperators:
       
   368 
       
   369 Operator priority
       
   370 `````````````````
       
   371 
       
   372 1. '*', '/'
       
   373 
       
   374 2. '+', '-'
       
   375 
       
   376 3. 'not'
       
   377 
       
   378 4 'and'
       
   379 
       
   380 5 'or'
       
   381 
       
   382 6 ','
       
   383 
       
   384 
       
   385 Advanced Features
       
   386 -----------------
       
   387 
       
   388 Aggregate Functions
       
   389 ```````````````````
       
   390 ::
       
   391 
       
   392      COUNT, MIN, MAX, AVG, SUM
       
   393 
       
   394 Functions on string
       
   395 ```````````````````
       
   396 ::
       
   397 
       
   398      UPPER, LOWER
       
   399 
       
   400 Optional relations
       
   401 ``````````````````
       
   402 
       
   403 * They allow you to select entities related or not to another.
       
   404 
       
   405 * You must use the `?` behind the variable to specify that the relation
       
   406   toward it is optional:
       
   407 
       
   408    - Anomalies of a project attached or not to a version ::
       
   409 
       
   410        Any X, V WHERE X concerns P, P eid 42, X corrected_in V?
       
   411 
       
   412    - All cards and the project they document if necessary ::
       
   413 
       
   414        Any C, P WHERE C is Card, P? documented_by C
       
   415 
       
   416 
       
   417 
       
   418 BNF grammar
       
   419 -----------
       
   420 
       
   421 The terminal elements are in capital letters, non-terminal in lowercase.
       
   422 The value of the terminal elements (between quotes) is a Python regular
       
   423 expression.
       
   424 ::
       
   425 
       
   426      statement:: = (select | delete | insert | update) ';'
       
   427 
       
   428 
       
   429      # select specific rules
       
   430      select      ::= 'DISTINCT'? E_TYPE selected_terms restriction? group? sort?
       
   431 
       
   432      selected_terms ::= expression ( ',' expression)*
       
   433 
       
   434      group       ::= 'GROUPBY' VARIABLE ( ',' VARIABLE)*
       
   435 
       
   436      sort        ::= 'ORDERBY' sort_term ( ',' sort_term)*
       
   437 
       
   438      sort_term   ::=  VARIABLE sort_method =?
       
   439 
       
   440      sort_method ::= 'ASC' | 'DESC'
       
   441 
       
   442 
       
   443      # delete specific rules
       
   444      delete ::= 'DELETE' (variables_declaration | relations_declaration) restriction?
       
   445 
       
   446 
       
   447      # insert specific rules
       
   448      insert ::= 'INSERT' variables_declaration ( ':' relations_declaration)? restriction?
       
   449 
       
   450 
       
   451      # update specific rules
       
   452      update ::= 'SET' relations_declaration restriction
       
   453 
       
   454 
       
   455      # common rules
       
   456      variables_declaration ::= E_TYPE VARIABLE (',' E_TYPE VARIABLE)*
       
   457 
       
   458      relations_declaration ::= simple_relation (',' simple_relation)*
       
   459 
       
   460      simple_relation ::= VARIABLE R_TYPE expression
       
   461 
       
   462      restriction ::= 'WHERE' relations
       
   463 
       
   464      relations   ::= relation (LOGIC_OP relation)*
       
   465                    | '(' relations')'
       
   466 
       
   467      relation    ::= 'NOT'? VARIABLE R_TYPE COMP_OP? expression
       
   468                    | 'NOT'? R_TYPE VARIABLE 'IN' '(' expression (',' expression)* ')'
       
   469                    
       
   470      expression  ::= var_or_func_or_const (MATH_OP var_or_func_or_const) *
       
   471                    | '(' expression ')'
       
   472 
       
   473      var_or_func_or_const ::= VARIABLE | function | constant
       
   474 
       
   475      function    ::= FUNCTION '(' expression ( ',' expression) * ')'
       
   476 
       
   477      constant    ::= KEYWORD | STRING | FLOAT | INT
       
   478 
       
   479      # tokens
       
   480      LOGIC_OP ::= ',' | 'OR' | 'AND'
       
   481      MATH_OP  ::= '+' | '-' | '/' | '*'
       
   482      COMP_OP  ::= '>' | '>=' | '=' | '<=' | '<' | '~=' | 'LIKE'
       
   483 
       
   484      FUNCTION ::= 'MIN' | 'MAX' | 'SUM' | 'AVG' | 'COUNT' | 'UPPER' | 'LOWER'
       
   485 
       
   486      VARIABLE ::= '[A-Z][A-Z0-9]*'
       
   487      E_TYPE   ::= '[A-Z]\w*'
       
   488      R_TYPE   ::= '[a-z_]+'
       
   489 
       
   490      KEYWORD  ::= 'TRUE' | 'FALSE' | 'NULL' | 'TODAY' | 'NOW'
       
   491      STRING   ::= "'([^'\]|\\.)*'" |'"([^\"]|\\.)*\"'
       
   492      FLOAT    ::= '\d+\.\d*'
       
   493      INT      ::= '\d+'
       
   494 
       
   495 
       
   496 Remarks
       
   497 -------
       
   498 
       
   499 Sorting and groups
       
   500 ``````````````````
       
   501 
       
   502 - For grouped queries (e.g. with a GROUPBY clause), all
       
   503   selected variables should be grouped.
       
   504 
       
   505 - To group and/or sort by attributes, we can do: "X,L user U, U
       
   506   login L GROUPBY L, X ORDERBY L"
       
   507 
       
   508 - If the sorting method (SORT_METHOD) is not specified, then the sorting is
       
   509   ascendant.
       
   510 
       
   511 Negation
       
   512 ````````
       
   513 
       
   514 * A query such as `Document X WHERE NOT X owned_by U` means "the
       
   515   documents have no relation `owned_by`".
       
   516 * But the query `Document X WHERE NOT X owned_by U, U login "syt"`
       
   517   means "the documents have no relation `owned_by` with the user
       
   518   syt". They may have a relation "owned_by" with another user.
       
   519 
       
   520 Identity
       
   521 ````````
       
   522 
       
   523 You can use the special relation `identity` in a query to 
       
   524 add an identity constraint between two variables. This is equivalent
       
   525 to ``is`` in python::
       
   526 
       
   527    Any A WHERE A comments B, A identity B
       
   528 
       
   529 return all objects that comment themselves. The relation
       
   530 `identity` is especially useful when defining the rules for securities
       
   531 with `RQLExpressions`.
       
   532 
       
   533 Implementation
       
   534 ==============
       
   535 
       
   536 Internal representation (syntactic tree)
       
   537 ----------------------------------------
       
   538 
       
   539 The tree research does not contain the selected variables 
       
   540 (e.g. there is only what follows "WHERE").
       
   541 
       
   542 The insertion tree does not contain the variables inserted or relations
       
   543 defined on these variables (e.g. there is only what follows "WHERE").
       
   544 
       
   545 The removal tree does not contain the deleted variables and relations
       
   546 (e.g. there is only what follows the "WHERE").
       
   547 
       
   548 The update tree does not contain the variables and relations updated
       
   549 (e.g. there is only what follows the "WHERE").
       
   550 
       
   551 ::
       
   552 
       
   553      Select         ((Relationship | And | Or)?, Group?, Sort?)
       
   554      Insert         (Relations | And | Or)?
       
   555      Delete         (Relationship | And | Or)?
       
   556      Update         (Relations | And | Or)?
       
   557 
       
   558      And            ((Relationship | And | Or), (Relationship | And | Or))
       
   559      Or             ((Relationship | And | Or), (Relationship | And | Or))
       
   560 
       
   561      Relationship   ((VariableRef, Comparison))
       
   562 
       
   563      Comparison     ((Function | MathExpression | Keyword | Constant | VariableRef) +)
       
   564 
       
   565      Function       (())
       
   566      MathExpression ((MathExpression | Keyword | Constant | VariableRef), (MathExpression | Keyword | Constant | VariableRef))
       
   567 
       
   568      Group          (VariableRef +)
       
   569      Sort           (SortTerm +)
       
   570      SortTerm       (VariableRef +)
       
   571 
       
   572      VariableRef    ()
       
   573      Variable       ()
       
   574      Keyword        ()
       
   575      Constant       ()
       
   576 
       
   577 
       
   578 Remarks
       
   579 -------
       
   580 
       
   581 - The current implementation does not support linking two relations of type
       
   582   'is' with a OR. I do not think that the negation is  supported on this type 
       
   583   of relation (XXX FIXME to be confirmed).
       
   584 
       
   585 - Relations defining the variables must be left to those using them. 
       
   586   For example::
       
   587 
       
   588      Point P where P abs X, P ord Y, P value X+Y
       
   589 
       
   590   is valid, but::
       
   591 
       
   592      Point P where P abs X, P value X+Y, P ord Y
       
   593 
       
   594   is not.
       
   595 
       
   596 RQL logs
       
   597 --------
       
   598 
       
   599 You can configure the `CubicWeb` application to keep a log
       
   600 of the queries executed against your database. To do so, 
       
   601 edit the configuration file of your application 
       
   602 ``.../etc/cubicweb.d/myapp/all-in-one.conf`` and uncomment the
       
   603 variable ``query-log-file``::
       
   604 
       
   605   # web application query log file
       
   606   query-log-file=/tmp/rql-myapp.log
       
   607 
       
   608 
       
   609 Conclusion
       
   610 ==========
       
   611 
       
   612 Limitations
       
   613 -----------
       
   614 
       
   615 It lacks at the moment:
       
   616 
       
   617 - COALESCE
       
   618 
       
   619 - restrictions on groups (HAVING)
       
   620 
       
   621 and certainly other things ...
       
   622 
       
   623 A disadvantage is that to use this language we must know the
       
   624 format used (with real relation names and entities, not those viewing
       
   625 in the user interface). On the other hand, we can not really bypass
       
   626 that, and it is the job of a user interface to hide the RQL.
       
   627 
       
   628 
       
   629 Topics
       
   630 ------
       
   631 
       
   632 It would be convenient to express the schema matching
       
   633 relations (non-recursive rules)::
       
   634 
       
   635      Document class Type <-> Document occurence_of Fiche class Type
       
   636      Sheet class Type    <-> Form collection Collection class Type
       
   637     
       
   638 Therefore 1. becomes::
       
   639 
       
   640      Document X where
       
   641      X class C, C name 'Cartoon'
       
   642      X owned_by U, U login 'syt'
       
   643      X available true
       
   644 
       
   645 I'm not sure that we should handle this at RQL level ...
       
   646 
       
   647 There should also be a special relation 'anonymous'.
       
   648 
       
   649 
       
   650 
       
   651 .. _Versa: http://uche.ogbuji.net/tech/rdf/versa/
       
   652 .. _SPARQL: http://www.w3.org/TR/rdf-sparql-query/
       
   653 
       
   654 
       
   655 [FIXME] see also RQL documentation in source rql/doc.