author | David Douard <david.douard@logilab.fr> |
Fri, 07 Jun 2013 16:48:20 +0200 | |
changeset 9003 | dfd818290ffb |
parent 8625 | 7ee0752178e5 |
child 10457 | 1f5026e7d848 |
permissions | -rw-r--r-- |
8625
7ee0752178e5
[dataimport] Add SQL Store for faster import - works ONLY with Postgres for now, as it requires "copy from" command - closes #2410822
Vincent Michel <vincent.michel@logilab.fr>
parents:
diff
changeset
|
1 |
.. -*- coding: utf-8 -*-

.. _dataimport:

Dataimport
==========

*CubicWeb* is designed to manipulate huge amounts of data, and provides helper
functions to do so. These functions insert data at different levels of the
*CubicWeb* API, allowing different speed/security tradeoffs. Those that keep all
the *CubicWeb* hooks and security checks are slower, but they raise the possible
insertion errors (bad data types, integrity errors, ...).

These dataimport functions are provided in the file `dataimport.py`.

All the stores have the following API::

  >>> store = ObjectStore()
  >>> user = store.create_entity('CWUser', login=u'johndoe')
  >>> group = store.create_entity('CWGroup', name=u'unknown')
  >>> store.relate(user.eid, 'in_group', group.eid)
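
To make the shared API more concrete, here is a minimal sketch of an in-memory store exposing the same two calls. ``ToyStore`` and ``ToyEntity`` are hypothetical names made up for this illustration; they are not part of *CubicWeb*, and the real stores do considerably more (validation, hooks, database access).

```python
import itertools


class ToyEntity(object):
    """Hypothetical stand-in for an entity: a type, an eid and attributes."""

    def __init__(self, etype, eid, **attrs):
        self.etype = etype
        self.eid = eid
        self.attrs = attrs


class ToyStore(object):
    """Hypothetical in-memory store mimicking create_entity() / relate()."""

    def __init__(self):
        self._eids = itertools.count(1)  # generates 1, 2, 3, ...
        self.entities = {}               # eid -> ToyEntity
        self.relations = set()           # (subject eid, relation type, object eid)

    def create_entity(self, etype, **attrs):
        # Allocate the next eid and keep the entity in memory.
        entity = ToyEntity(etype, next(self._eids), **attrs)
        self.entities[entity.eid] = entity
        return entity

    def relate(self, eid_from, rtype, eid_to):
        # Record the relation; nothing is checked against a schema here.
        self.relations.add((eid_from, rtype, eid_to))


store = ToyStore()
user = store.create_entity('CWUser', login=u'johndoe')
group = store.create_entity('CWGroup', name=u'unknown')
store.relate(user.eid, 'in_group', group.eid)
```

Because every store is driven through the same calls, an import script written this way can in principle swap one store for another to trade safety for speed.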


ObjectStore
-----------

This store keeps objects in memory for *faster* validation. It may be useful
in development mode. However, as it does not enforce the constraints of the
schema, it may miss some problems.


RQLObjectStore
--------------

This store works with an actual RQL repository and may be used in production mode.


NoHookRQLObjectStore
--------------------

This store works similarly to the *RQLObjectStore* but bypasses some *CubicWeb* hooks to be faster.


SQLGenObjectStore
-----------------

This store relies on the *COPY FROM* command and on "execute many" SQL statements
to push data directly into the database, rather than going through the whole
*CubicWeb* API. For now, **it only works with PostgreSQL**, as it requires
the *COPY FROM* command.
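
As a rough illustration of why *COPY FROM* is fast, the sketch below serializes a batch of rows into the tab-separated text format that PostgreSQL's ``COPY ... FROM`` reads, so that the whole batch can be sent in one command. The helper ``rows_to_copy_buffer``, the column names and the table name are assumptions made for this example; this is not necessarily how ``SQLGenObjectStore`` builds its input.

```python
import io


def rows_to_copy_buffer(rows, columns):
    """Serialize dict rows into the text format read by COPY FROM.

    One line per row, tab-separated fields, NULL written as \\N.
    """
    buf = io.StringIO()
    for row in rows:
        fields = []
        for col in columns:
            value = row.get(col)
            if value is None:
                fields.append(u'\\N')
            else:
                text = u'%s' % (value,)
                # Escape the characters that are special in COPY text format.
                text = (text.replace(u'\\', u'\\\\')
                            .replace(u'\t', u'\\t')
                            .replace(u'\n', u'\\n')
                            .replace(u'\r', u'\\r'))
                fields.append(text)
        buf.write(u'\t'.join(fields) + u'\n')
    buf.seek(0)  # rewind so that a consumer reads from the start
    return buf


rows = [{'name': u'Alice', 'age': 30}, {'name': u'Bob', 'age': None}]
buf = rows_to_copy_buffer(rows, ['name', 'age'])
# Assumption: with a driver such as psycopg2, the buffer could then be sent with
# cursor.copy_from(buf, 'some_table', columns=('name', 'age')).
```

Sending one buffer per batch avoids a round trip per row, which is where most of the speedup over row-by-row inserts would be expected to come from.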

The API is similar to that of the other stores, but **it requires a flush** after
some imports to copy the data into the database. Flushes may be done several times
during the import, or only once at the end if memory is not an issue::

  >>> store = SQLGenObjectStore(session)
  >>> store.create_entity('Person', ...)
  >>> store.flush()
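
When the data set is too large to buffer entirely, flushing at regular intervals keeps memory use bounded. The following sketch shows one possible pattern. ``RecordingStore`` is a hypothetical stub that only mimics the ``create_entity()`` and ``flush()`` calls so the example can run without a database, and ``import_in_batches`` and its ``batch_size`` parameter are likewise made up for this illustration.

```python
class RecordingStore(object):
    """Hypothetical stub with the create_entity() / flush() calls used above."""

    def __init__(self):
        self.pending = []      # entities buffered since the last flush
        self.flush_sizes = []  # number of entities written by each flush

    def create_entity(self, etype, **attrs):
        self.pending.append((etype, attrs))

    def flush(self):
        self.flush_sizes.append(len(self.pending))
        self.pending = []


def import_in_batches(store, etype, rows, batch_size=1000):
    """Create one entity per row, flushing every `batch_size` entities."""
    count = 0
    for count, attrs in enumerate(rows, 1):
        store.create_entity(etype, **attrs)
        if count % batch_size == 0:
            store.flush()
    if count % batch_size:
        store.flush()  # push the last, incomplete batch


store = RecordingStore()
rows = [{'name': u'person %d' % i} for i in range(5)]
import_in_batches(store, 'Person', rows, batch_size=2)
```

A larger ``batch_size`` means fewer flushes but more data held in memory between them, so the right value depends on the size of the entities being imported.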