author | Julien Cristau <julien.cristau@logilab.fr> |
Tue, 23 Jun 2015 17:04:40 +0200 | |
changeset 10495 | 5bd914ebf3ae |
parent 10491 | c67bcee93248 |
child 10847 | ce5403611cbe |
permissions | -rw-r--r-- |
8518
153a7c9cdca9
[fti] add some documentation
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff
changeset
.. _fti:

Full Text Indexing in CubicWeb
------------------------------

When an attribute is tagged as *fulltext-indexable* in the datamodel,
CubicWeb will automatically trigger hooks to update the internal
fulltext index (i.e. the ``appears`` SQL table) each time this attribute
is modified.

CubicWeb also provides a ``db-rebuild-fti`` command to rebuild the whole
fulltext index on demand:

.. sourcecode:: bash

    cubicweb@esope~$ cubicweb db-rebuild-fti my_tracker_instance

You can also rebuild the fulltext index for a given set of entity types:

.. sourcecode:: bash

    cubicweb@esope~$ cubicweb db-rebuild-fti my_tracker_instance Ticket Version

In the above example, only the fulltext index of the ``Ticket`` and ``Version``
entity types will be rebuilt.


Standard FTI process
~~~~~~~~~~~~~~~~~~~~

Considering an entity type ``ET``, the default *fti* process is to:

1. fetch all entities of type ``ET``

2. for each entity, adapt it to ``IFTIndexable`` (see
   :class:`~cubicweb.entities.adapters.IFTIndexableAdapter`)

3. call
   :meth:`~cubicweb.entities.adapters.IFTIndexableAdapter.get_words` on
   the adapter, which is expected to return a dictionary mapping each
   *weight* to a *list of words*, in the form expected by
   :meth:`~logilab.database.fti.FTIndexerMixIn.index_object`. The
   tokenization of each attribute value is done by
   :func:`~logilab.database.fti.tokenize`.


See :class:`~cubicweb.entities.adapters.IFTIndexableAdapter` for more documentation.
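
The three steps of the standard process can be pictured with a small,
self-contained sketch. It does not use CubicWeb at all: ``FakeEntity``,
``FakeIFTIndexable``, ``simple_tokenize`` and ``build_index`` are hypothetical
stand-ins for the entities, the ``IFTIndexable`` adapter, the tokenizer and the
indexer, and the weight ``'C'`` is an arbitrary choice made for the example.

```python
import re


class FakeEntity(object):
    """Hypothetical stand-in for a CubicWeb entity."""

    def __init__(self, eid, **attributes):
        self.eid = eid
        self.attributes = attributes


def simple_tokenize(text):
    """Toy tokenizer (assumption): lowercased runs of word characters."""
    return re.findall(r'\w+', text.lower())


class FakeIFTIndexable(object):
    """Hypothetical adapter exposing a get_words() method."""

    def __init__(self, entity):
        self.entity = entity

    def get_words(self):
        # return a dictionary mapping a weight to a list of words
        words = []
        for name in sorted(self.entity.attributes):
            words.extend(simple_tokenize(self.entity.attributes[name]))
        return {'C': words}


def build_index(entities):
    """Walk the entities, adapt each one and collect its words."""
    index = {}
    for entity in entities:                      # 1. entities of a given type
        adapter = FakeIFTIndexable(entity)       # 2. adapt to "IFTIndexable"
        index[entity.eid] = adapter.get_words()  # 3. weight -> list of words
    return index


tickets = [FakeEntity(1, title=u'Crash on login'),
           FakeEntity(2, title=u'Update docs', description=u'FTI chapter')]
index = build_index(tickets)
```

In this toy version the "index" is a plain dictionary keyed by entity
identifier; in CubicWeb the words end up in the ``appears`` SQL table instead.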


Yams and ``fulltext_container``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is possible in the datamodel to indicate that fulltext-indexed
attributes defined for an entity type will be used to index not the
entity itself but a related entity. This is especially useful for
composite entities. Let's take a look at (a simplified version of)
the base schema defined in CubicWeb (see :mod:`cubicweb.schemas.base`):

.. sourcecode:: python

    from yams.buildobjs import EntityType, RelationDefinition, String, Password
    from cubicweb.schema import WorkflowableEntityType

    class CWUser(WorkflowableEntityType):
        login = String(required=True, unique=True, maxsize=64)
        upassword = Password(required=True)

    class EmailAddress(EntityType):
        address = String(required=True, fulltextindexed=True,
                         indexed=True, unique=True, maxsize=128)


    class use_email_relation(RelationDefinition):
        name = 'use_email'
        subject = 'CWUser'
        object = 'EmailAddress'
        cardinality = '*?'
        composite = 'subject'


The schema above states that there is a relation between ``CWUser`` and ``EmailAddress``
and that the ``address`` field of ``EmailAddress`` is fulltext indexed. Therefore,
in your application, if you use fulltext search to look for an email address, CubicWeb
will return the ``EmailAddress`` itself. But the object we would like to find
is more likely the associated ``CWUser`` than the ``EmailAddress`` itself.

The simplest way to achieve that is to tag the ``use_email`` relation in
the datamodel:

.. sourcecode:: python

    class use_email(RelationType):
        # index the object (EmailAddress) as part of the subject (CWUser)
        fulltext_container = 'subject'
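
To make the effect of ``fulltext_container`` more concrete, here is a toy model
written in plain Python. It is only a sketch and not the CubicWeb or Yams API:
``RELATIONS``, ``FULLTEXT_CONTAINER``, ``OWN_WORDS``, ``container_of`` and
``build_index`` are hypothetical names, and entities are reduced to simple
strings.

```python
# Toy model (not the CubicWeb API): each relation is a
# (subject, relation name, object) triple.
RELATIONS = [
    ('CWUser#1', 'use_email', 'EmailAddress#10'),
    ('CWUser#1', 'use_email', 'EmailAddress#11'),
]
# Stand-in for tagging use_email with fulltext_container = 'subject'.
FULLTEXT_CONTAINER = {'use_email': 'subject'}
# Words coming from each entity's own fulltext-indexed attributes.
OWN_WORDS = {
    'EmailAddress#10': ['alice', 'example', 'org'],
    'EmailAddress#11': ['alice', 'work', 'example', 'com'],
}


def container_of(entity):
    """Return the entity under which `entity` should be indexed."""
    for subj, rtype, obj in RELATIONS:
        role = FULLTEXT_CONTAINER.get(rtype)
        if role == 'subject' and obj == entity:
            return subj
        if role == 'object' and subj == entity:
            return obj
    return entity  # no container: the entity is indexed as itself


def build_index():
    """Attach the words of each entity to its container."""
    index = {}
    for entity, words in sorted(OWN_WORDS.items()):
        index.setdefault(container_of(entity), []).extend(words)
    return index


index = build_index()
```

With the container set to ``'subject'``, the words of both addresses are filed
under the user, so a search for one of them would lead to ``CWUser#1`` in this
model.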


Customizing how entities are fetched during ``db-rebuild-fti``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``db-rebuild-fti`` will call the
:meth:`~cubicweb.entities.AnyEntity.cw_fti_index_rql_queries` class
method on your entity type.

.. automethod:: cubicweb.entities.AnyEntity.cw_fti_index_rql_queries

Now, suppose you've got a *huge* table to index: you probably don't want to
get all entities at once. So here's a simple customized example that will
process blocks of 10000 entities:

.. sourcecode:: python

    from cubicweb.entities import AnyEntity


    class MyEntityClass(AnyEntity):
        __regid__ = 'MyEntityClass'

        @classmethod
        def cw_fti_index_rql_queries(cls, req):
            # get the default RQL query and insert LIMIT / OFFSET instructions
            base_rql = super(MyEntityClass, cls).cw_fti_index_rql_queries(req)[0]
            selected, restrictions = base_rql.split(' WHERE ')
            rql_template = '%s ORDERBY X LIMIT %%(limit)s OFFSET %%(offset)s WHERE %s' % (
                selected, restrictions)
            # count how many entities you'll have to index
            count = req.execute('Any COUNT(X) WHERE X is MyEntityClass')[0][0]
            # iterate by blocks of 10000 entities
            chunksize = 10000
            for offset in range(0, count, chunksize):
                rql = rql_template % {'limit': chunksize, 'offset': offset}
                print('SENDING %s' % rql)
                yield rql

Since you have access to ``req``, you can more or less fetch whatever you want.
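
The string manipulation behind this kind of chunking can be tried outside of
CubicWeb. The sketch below is only an illustration: ``chunked_rql_queries`` is
a hypothetical helper, and the base query ``'Any X WHERE X is MyEntityClass'``
is an assumed shape for what the default implementation returns.

```python
def chunked_rql_queries(base_rql, count, chunksize=10000):
    """Yield one RQL query per block of `chunksize` entities."""
    # split the query so that ORDERBY / LIMIT / OFFSET land before WHERE
    selected, restrictions = base_rql.split(' WHERE ')
    template = '%s ORDERBY X LIMIT %%(limit)s OFFSET %%(offset)s WHERE %s' % (
        selected, restrictions)
    for offset in range(0, count, chunksize):
        yield template % {'limit': chunksize, 'offset': offset}


# assumed base query and entity count, for illustration only
queries = list(chunked_rql_queries('Any X WHERE X is MyEntityClass', 25000))
```

The ``ORDERBY X`` clause matters here: without a stable ordering, successive
``LIMIT`` / ``OFFSET`` windows are not guaranteed to cover every entity exactly
once.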


Customizing :meth:`~cubicweb.entities.adapters.IFTIndexableAdapter.get_words`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can also customize the FTI process by providing your own ``get_words()``
implementation:

.. sourcecode:: python

    from cubicweb.predicates import is_instance
    from cubicweb.entities.adapters import IFTIndexableAdapter

    class SearchIndexAdapter(IFTIndexableAdapter):
        __regid__ = 'IFTIndexable'
        __select__ = is_instance('MyEntityClass')

        def fti_containers(self, _done=None):
            """this should yield any entity that must be considered to
            fulltext-index self.entity

            CubicWeb's default implementation will look for yams'
            ``fulltext_container`` property.
            """
            yield self.entity
            # ``some_related_entity`` stands for any related entity of yours
            yield self.entity.some_related_entity


        def get_words(self):
            # implement any logic here
            # see http://www.postgresql.org/docs/9.1/static/textsearch-controls.html
            # for the actual meaning of 'C'
            return {'C': ['any', 'word', 'I', 'want']}
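
As an illustration of the kind of logic a custom ``get_words()`` might contain,
the following standalone sketch gives a different weight to each attribute. It
is an assumption-laden example rather than CubicWeb code: ``ATTRIBUTE_WEIGHTS``
and this module-level ``get_words`` function are hypothetical, and splitting on
whitespace is a deliberately naive replacement for the real tokenizer. The
weights follow PostgreSQL's convention, where ``'A'`` ranks highest and ``'D'``
lowest.

```python
# Hypothetical mapping from attribute name to PostgreSQL weight.
ATTRIBUTE_WEIGHTS = {'title': 'A', 'description': 'C'}


def get_words(attributes, default_weight='D'):
    """Build a weight -> list of words dictionary from attribute values."""
    words = {}
    for name, value in sorted(attributes.items()):
        if not value:
            continue  # nothing to index for empty or missing values
        weight = ATTRIBUTE_WEIGHTS.get(name, default_weight)
        # naive tokenization (assumption): lowercase and split on whitespace
        words.setdefault(weight, []).extend(value.lower().split())
    return words


words = get_words({'title': u'Release notes',
                   'description': u'Fixes for the FTI hooks',
                   'comment': None})
```

Words found in the title would then weigh more in the ranking than words found
in the description, which is the usual reason for overriding ``get_words()``.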