diff -r 6520efef1824 -r 023b123ab76b doc/book/en/C050-rql.en.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/doc/book/en/C050-rql.en.txt Thu Apr 23 11:58:11 2009 -0700 @@ -0,0 +1,655 @@ +.. -*- coding: utf-8 -*- + +.. _RQL: + +====================================== +RQL language (Relation Query Language) +====================================== + +Introduction +============ + +Goals of RQL +------------ + +The goal is to have a language emphasizing the way of browsing +relations. As such, attributes will be regarded as cases of +special relations (in terms of implementation, the language +user should see virtually no difference between an attribute and a +relation). + +RQL is inspired by SQL but is the highest level. A knowledge of the +`CubicWeb` schema defining the application is necessary. + +Comparison with existing languages +---------------------------------- + +SQL +``` +RQL builds on the features of SQL but is at a higher level +(the current implementation of RQL generates SQL). For that it is limited +to the way of browsing relations and introduces variables. +The user does not need to know the model underlying SQL, but the `CubicWeb` +schema defining the application. + +Versa +````` +We should look in more detail, but here are already some ideas for +the moment ... Versa_ is the language most similar to what we wanted +to do, but the model underlying data being RDF, there is some +number of things such as namespaces or handling of the RDF types which +does not interest us. On the functionality level, Versa_ is very comprehensive +including through many functions of conversion and basic types manipulation, +which may need to be guided at one time or another. +Finally, the syntax is a little esoteric. + +Sparql +`````` + +The query language most similar to RQL is SPARQL_, defined by the W3C to serve +for the semantic web. + + +The different types of queries +------------------------------ + +Search (`Any`) + Extract entities and attributes of entities. + +Insert entities (`INSERT`) + Insert new entities or relations in the database. + It can also directly create relationships for the newly created entities. + +Update entities, create relations (`SET`) + Update existing entities in the database, + or create relations between existing entities. + +Delete entities or relationship (`DELETE`) + Remove entities or relations existing in the database. + +Search Query +------------ + + [ `DISTINCT`] V1 (V2) \ * + [ `GROUPBY` V1 (V2) \*] [ `ORDERBY` ] + [ `WHERE` ] + [ `LIMIT` ] [ `OFFSET` ] + +:entity type: + Type of selected variables. + The special type `Any` is equivalent to not specify a type. +:restriction: + list of conditions to test successively + `V1 relation V2 | ` +:orderterms: + Definition of the selection order: variable or column number followed by + sorting method ( `ASC`, `DESC`), ASC is the default. +:note for grouped queries: + For grouped queries (e.g., a clause `GROUPBY`), all + selected variables must be aggregated or grouped. + + + +- *Search for the object of identifier 53* + :: + + Any WHERE X + X eid 53 + +- *Search material such as comics, owned by syt and available* + :: + + WHERE X Document + X occurence_of F, F class C, C name 'Comics' + X owned_by U, U login 'syt' + X available true + +- *Looking for people working for eurocopter interested in training* + :: + + Any P WHERE + P is Person, P work_for S, S name 'Eurocopter' + P interested_by T, T name 'training' + +- *Search note less than 10 days old written by jphc or ocy* + :: + + Any N WHERE + N is Note, N written_on D, D day> (today -10), + N written_by P, P name 'jphc' or P name 'ocy' + +- *Looking for people interested in training or living in Paris* + :: + + Any P WHERE + P is Person, (P interested_by T, T name 'training') OR + (P city 'Paris') + +- *The name and surname of all people* + :: + + Any N, P WHERE + X is Person, X name N, X first_name P + + Note that the selection of several entities generally force + the use of "Any" because the type specification applies otherwise + to all the selected variables. We could write here + :: + + String N, P WHERE + X is Person, X name N, X first_name P + + + Note: You can not specify several types with * ... where X is FirstType or X is SecondType*. + To specify several types explicitely, you have to do + + :: + + Any X where X is in (FirstType, SecondType) + + +Insertion query +--------------- + + `INSERT` V1 (, V2) \ * `:` + [ `WHERE` ] + +:assignments: + list of relations to assign in the form `V1 relationship V2 | ` + +The restriction can define variables used in assignments. + +Caution, if a restriction is specified, the insertion is done for +*each line result returned by the restriction*. + +- *Insert a new person named 'foo'* + :: + + INSERT Person X: X name 'foo' + +- *Insert a new person named 'foo', another called 'nice' and a 'friend' relation + between them* + :: + + INSERT Person X, Person Y: X name 'foo', Y name 'nice', X friend Y + +- *Insert a new person named 'foo' and a 'friend' relation with an existing + person called 'nice'* + :: + + INSERT Person X: X name 'foo', X friend Y WHERE name 'nice' + +Update and relation creation queries +------------------------------------ + `SET` + [ `WHERE` ] + +Caution, if a restriction is specified, the update is done *for +each result line returned by the restriction*. + +- *Renaming of the person named 'foo' to 'bar' with the first name changed* + :: + + SET X name 'bar', X first_name 'original' WHERE X is Person, X name 'foo' + +- *Insert a relation of type 'know' between objects linked by + the relation of type 'friend'* + :: + + SET X know Y WHERE X friend Y + + +Deletion query +-------------- + `DELETE` ( V) | (V1 relation v2 ),... + [ `WHERE` ] + +Caution, if a restriction is specified, the deletion is made *for +each line result returned by the restriction*. + +- *Deletion of the person named 'foo'* + :: + + DELETE Person X WHERE X name 'foo' + +- *Removal of all relations of type 'friend' from the person named 'foo'* + :: + + DELETE X friend Y WHERE X is Person, X name 'foo' + + +(yet) Undocumented types of queries +----------------------------------- + +**Limit / offset** +:: + + Any P ORDERBY N LIMIT 5 OFFSET 10 WHERE P is Person, P firstname N + +**Function calls** +:: + + Any UPPER(N) WHERE P firstname N + +**Exists** +:: + + Any X ORDERBY PN,N + WHERE X num N, X version_of P, P name PN, + EXISTS(X in_state S, S name IN ("dev", "ready")) + OR EXISTS(T tags X, T name "priority") + +**Left outer join** +:: + + Any T,P,V WHERE T is Ticket, T concerns P, T done_in V? + + +**Having** +:: + + Any X GROUPBY X WHERE X knows Y HAVING COUNT(Y) > 10 + +**Simple union** +:: + + (Any X WHERE X is Person) UNION (Any X WHERE X is Company) + +**Complex union** +:: + + DISTINCT Any W, REF + WITH W, REF BEING + ( + (Any W, REF WHERE W is Workcase, W ref REF, + W concerned_by D, D name "Logilab") + UNION + (Any W, REF WHERE W is Workcase, W ref REF, ' + W split_into WP, WP name "WP1") + ) + + +Language definition +=================== + +Reserved keywords +----------------- +The keywords are not case sensitive. + +:: + + DISTINCT, INSERT, SET, DELETE, + WHERE, AND, OR, NOT, EXISTS, + IN, LIKE, UNION, WITH, BEING, + TRUE, FALSE, NULL, TODAY, NOW, + LIMIT, OFFSET, + HAVING, GROUPBY, ORDERBY, ASC, DESC + + +Variables and Typing +-------------------- + +With RQL, we do not distinguish between entities and attributes. The +value of an attribute is considered an entity of a particular type (see +below), linked to one (real) entity by a relation called the name of +the attribute. + +Entities and values to browse and/or select are represented in +the query by *variables* that must be written in capital letters. + +There is a special type **Any**, referring to a non specific type. + +We can restrict the possible types for a variable using the +special relation **is**. +The possible type(s) for each variable is derived from the schema +according to the constraints expressed above and thanks to the relations between +each variable. + +Built-in types +`````````````` + +The base types supported are string (between double or single quotes), +integers or floats (the separator is '.'), dates and +boolean. We expect to receive a schema in which types String, +Int, Float, Date and Boolean are defined. + +* `String` (literal: between double or single quotes). +* `Int`, `Float` (separator being'.'). +* `Date`, `Datetime`, `Time` (literal: string YYYY/MM/DD [hh:mm] or keywords + `TODAY` and `NOW`). +* `Boolean` (keywords `TRUE` and `FALSE`). +* `Keyword` NULL. + + +Operators +--------- + +Logical Operators +````````````````` +:: + + AND, OR, NOT, ',' + +',' is equivalent to 'AND' but with the smallest among the priority +of logical operators (see :ref:`PriorityOperators`). + +Mathematical Operators +`````````````````````` +:: + + +, -, *, / + +Comparison operators +```````````````````` +:: + + =, <, <=, >=, >, ~=, IN, LIKE + +* The operator `=` is the default operator. + +* The operator `LIKE` equivalent to `~=` can be used with the + special character `%` in a string to indicate that the chain + must start or finish by a prefix/suffix: + :: + + Any X WHERE X name ~= 'Th%' + Any X WHERE X name LIKE '%lt' + +* The operator `IN` provides a list of possible values: + :: + + Any X WHERE X name IN ( 'chauvat', 'fayolle', 'di mascio', 'thenault') + + +XXX nico: "A trick <> 'bar'" wouldn't it be more convenient than +"NOT A trick 'bar'" ? + +.. _PriorityOperators: + +Operator priority +````````````````` + +1. '*', '/' + +2. '+', '-' + +3. 'not' + +4 'and' + +5 'or' + +6 ',' + + +Advanced Features +----------------- + +Aggregate Functions +``````````````````` +:: + + COUNT, MIN, MAX, AVG, SUM + +Functions on string +``````````````````` +:: + + UPPER, LOWER + +Optional relations +`````````````````` + +* They allow you to select entities related or not to another. + +* You must use the `?` behind the variable to specify that the relation + toward it is optional: + + - Anomalies of a project attached or not to a version :: + + Any X, V WHERE X concerns P, P eid 42, X corrected_in V? + + - All cards and the project they document if necessary :: + + Any C, P WHERE C is Card, P? documented_by C + + + +BNF grammar +----------- + +The terminal elements are in capital letters, non-terminal in lowercase. +The value of the terminal elements (between quotes) is a Python regular +expression. +:: + + statement:: = (select | delete | insert | update) ';' + + + # select specific rules + select ::= 'DISTINCT'? E_TYPE selected_terms restriction? group? sort? + + selected_terms ::= expression ( ',' expression)* + + group ::= 'GROUPBY' VARIABLE ( ',' VARIABLE)* + + sort ::= 'ORDERBY' sort_term ( ',' sort_term)* + + sort_term ::= VARIABLE sort_method =? + + sort_method ::= 'ASC' | 'DESC' + + + # delete specific rules + delete ::= 'DELETE' (variables_declaration | relations_declaration) restriction? + + + # insert specific rules + insert ::= 'INSERT' variables_declaration ( ':' relations_declaration)? restriction? + + + # update specific rules + update ::= 'SET' relations_declaration restriction + + + # common rules + variables_declaration ::= E_TYPE VARIABLE (',' E_TYPE VARIABLE)* + + relations_declaration ::= simple_relation (',' simple_relation)* + + simple_relation ::= VARIABLE R_TYPE expression + + restriction ::= 'WHERE' relations + + relations ::= relation (LOGIC_OP relation)* + | '(' relations')' + + relation ::= 'NOT'? VARIABLE R_TYPE COMP_OP? expression + | 'NOT'? R_TYPE VARIABLE 'IN' '(' expression (',' expression)* ')' + + expression ::= var_or_func_or_const (MATH_OP var_or_func_or_const) * + | '(' expression ')' + + var_or_func_or_const ::= VARIABLE | function | constant + + function ::= FUNCTION '(' expression ( ',' expression) * ')' + + constant ::= KEYWORD | STRING | FLOAT | INT + + # tokens + LOGIC_OP ::= ',' | 'OR' | 'AND' + MATH_OP ::= '+' | '-' | '/' | '*' + COMP_OP ::= '>' | '>=' | '=' | '<=' | '<' | '~=' | 'LIKE' + + FUNCTION ::= 'MIN' | 'MAX' | 'SUM' | 'AVG' | 'COUNT' | 'UPPER' | 'LOWER' + + VARIABLE ::= '[A-Z][A-Z0-9]*' + E_TYPE ::= '[A-Z]\w*' + R_TYPE ::= '[a-z_]+' + + KEYWORD ::= 'TRUE' | 'FALSE' | 'NULL' | 'TODAY' | 'NOW' + STRING ::= "'([^'\]|\\.)*'" |'"([^\"]|\\.)*\"' + FLOAT ::= '\d+\.\d*' + INT ::= '\d+' + + +Remarks +------- + +Sorting and groups +`````````````````` + +- For grouped queries (e.g. with a GROUPBY clause), all + selected variables should be grouped. + +- To group and/or sort by attributes, we can do: "X,L user U, U + login L GROUPBY L, X ORDERBY L" + +- If the sorting method (SORT_METHOD) is not specified, then the sorting is + ascendant. + +Negation +```````` + +* A query such as `Document X WHERE NOT X owned_by U` means "the + documents have no relation `owned_by`". +* But the query `Document X WHERE NOT X owned_by U, U login "syt"` + means "the documents have no relation `owned_by` with the user + syt". They may have a relation "owned_by" with another user. + +Identity +```````` + +You can use the special relation `identity` in a query to +add an identity constraint between two variables. This is equivalent +to ``is`` in python:: + + Any A WHERE A comments B, A identity B + +return all objects that comment themselves. The relation +`identity` is especially useful when defining the rules for securities +with `RQLExpressions`. + +Implementation +============== + +Internal representation (syntactic tree) +---------------------------------------- + +The tree research does not contain the selected variables +(e.g. there is only what follows "WHERE"). + +The insertion tree does not contain the variables inserted or relations +defined on these variables (e.g. there is only what follows "WHERE"). + +The removal tree does not contain the deleted variables and relations +(e.g. there is only what follows the "WHERE"). + +The update tree does not contain the variables and relations updated +(e.g. there is only what follows the "WHERE"). + +:: + + Select ((Relationship | And | Or)?, Group?, Sort?) + Insert (Relations | And | Or)? + Delete (Relationship | And | Or)? + Update (Relations | And | Or)? + + And ((Relationship | And | Or), (Relationship | And | Or)) + Or ((Relationship | And | Or), (Relationship | And | Or)) + + Relationship ((VariableRef, Comparison)) + + Comparison ((Function | MathExpression | Keyword | Constant | VariableRef) +) + + Function (()) + MathExpression ((MathExpression | Keyword | Constant | VariableRef), (MathExpression | Keyword | Constant | VariableRef)) + + Group (VariableRef +) + Sort (SortTerm +) + SortTerm (VariableRef +) + + VariableRef () + Variable () + Keyword () + Constant () + + +Remarks +------- + +- The current implementation does not support linking two relations of type + 'is' with a OR. I do not think that the negation is supported on this type + of relation (XXX FIXME to be confirmed). + +- Relations defining the variables must be left to those using them. + For example:: + + Point P where P abs X, P ord Y, P value X+Y + + is valid, but:: + + Point P where P abs X, P value X+Y, P ord Y + + is not. + +RQL logs +-------- + +You can configure the `CubicWeb` application to keep a log +of the queries executed against your database. To do so, +edit the configuration file of your application +``.../etc/cubicweb.d/myapp/all-in-one.conf`` and uncomment the +variable ``query-log-file``:: + + # web application query log file + query-log-file=/tmp/rql-myapp.log + + +Conclusion +========== + +Limitations +----------- + +It lacks at the moment: + +- COALESCE + +- restrictions on groups (HAVING) + +and certainly other things ... + +A disadvantage is that to use this language we must know the +format used (with real relation names and entities, not those viewing +in the user interface). On the other hand, we can not really bypass +that, and it is the job of a user interface to hide the RQL. + + +Topics +------ + +It would be convenient to express the schema matching +relations (non-recursive rules):: + + Document class Type <-> Document occurence_of Fiche class Type + Sheet class Type <-> Form collection Collection class Type + +Therefore 1. becomes:: + + Document X where + X class C, C name 'Cartoon' + X owned_by U, U login 'syt' + X available true + +I'm not sure that we should handle this at RQL level ... + +There should also be a special relation 'anonymous'. + + + +.. _Versa: http://uche.ogbuji.net/tech/rdf/versa/ +.. _SPARQL: http://www.w3.org/TR/rdf-sparql-query/ + + +[FIXME] see also RQL documentation in source rql/doc.