server/checkintegrity.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Thu, 10 Jun 2010 16:28:06 +0200
changeset 5718 8d246203730a
parent 5693 8af6623f3d4e
child 5850 fabff2813ee4
permissions -rw-r--r--
[ITree] fix adapter/_done_init to handle tree_[attribute|relation], add maxlevel argument to base tree view since one side effect of the new code is that it find tree view for entity types it didn't find before, hence one may want to limit tree view depth when it wasn't necessary before
# copyright 2003-2010 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
#
# This file is part of CubicWeb.
#
# CubicWeb is free software: you can redistribute it and/or modify it under the
# terms of the GNU Lesser General Public License as published by the Free
# Software Foundation, either version 2.1 of the License, or (at your option)
# any later version.
#
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License along
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
"""Check integrity of a CubicWeb repository. Note that currently only the
system database is checked.

"""
from __future__ import with_statement

__docformat__ = "restructuredtext en"

import sys
from datetime import datetime

from logilab.common.shellutils import ProgressBar

from cubicweb.schema import PURE_VIRTUAL_RTYPES
from cubicweb.server.sqlutils import SQL_PREFIX
from cubicweb.server.session import security_enabled

def has_eid(session, sqlcursor, eid, eids):
    """return true if the eid is a valid eid

    The answer is cached in the `eids` dictionary (eid -> bool).
    """
    if eid in eids:
        return eids[eid]
    sqlcursor.execute('SELECT type, source FROM entities WHERE eid=%s' % eid)
    try:
        etype, source = sqlcursor.fetchone()
    except: # no matching row: fetchone() returned None
        eids[eid] = False
        return False
    if source and source != 'system':
        try:
            # query on eid *and* etype to check that the entity has not been
            # replaced by another one after the restore of an old dump
            if session.execute('Any X WHERE X is %s, X eid %%(x)s' % etype,
                               {'x': eid}):
                eids[eid] = True
                return True
        except: # TypeResolverError, Unauthorized...
            pass
        eids[eid] = False
        return False
    sqlcursor.execute('SELECT * FROM %s%s WHERE %seid=%s' % (SQL_PREFIX, etype,
                                                             SQL_PREFIX, eid))
    result = sqlcursor.fetchall()
    if len(result) == 0:
        eids[eid] = False
        return False
    elif len(result) > 1:
        msg = '  More than one entity with eid %s exists in source !'
        print >> sys.stderr, msg % eid
        print >> sys.stderr, '  WARNING : Unable to fix this, do it yourself !'
    eids[eid] = True
    return True

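The caching pattern used by has_eid can be sketched in isolation. This is a minimal illustration, not the CubicWeb implementation: it assumes Python 3 and an in-memory sqlite3 database, the function name has_eid_cached is hypothetical, and only the `entities` table name is taken from the code above. It also uses a bound query parameter rather than string interpolation.

```python
import sqlite3


def has_eid_cached(cursor, eid, cache):
    """Return True if `eid` is in the `entities` table, memoizing the answer."""
    if eid in cache:
        # Both positive and negative answers are served from the cache.
        return cache[eid]
    # Bound parameter instead of building the SQL string by hand.
    cursor.execute('SELECT type FROM entities WHERE eid=?', (eid,))
    cache[eid] = cursor.fetchone() is not None
    return cache[eid]


# Toy database standing in for the system source (assumption for the demo).
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
cur.execute("INSERT INTO entities VALUES (1, 'CWUser')")

cache = {}
print(has_eid_cached(cur, 1, cache))  # True
print(has_eid_cached(cur, 2, cache))  # False
print(cache)                          # {1: True, 2: False}
```

Because the same dictionary is shared by every check, a later pass can consult it directly instead of querying again, which appears to be what check_entities relies on.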
# XXX move to yams?
def etype_fti_containers(eschema, _done=None):
    """yield the entity schemas acting as full-text containers for `eschema`,
    or `eschema` itself if it has no container
    """
    if _done is None:
        _done = set()
    _done.add(eschema)
    containers = tuple(eschema.fulltext_containers())
    if containers:
        for rschema, target in containers:
            if target == 'object':
                targets = rschema.objects(eschema)
            else:
                targets = rschema.subjects(eschema)
            for targeteschema in targets:
                if targeteschema in _done:
                    continue
                _done.add(targeteschema)
                for container in etype_fti_containers(targeteschema, _done):
                    yield container
    else:
        yield eschema

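The traversal in etype_fti_containers is a recursive generator guarded by a visited set. The sketch below reproduces that shape on a plain dictionary; the name outermost_containers and the `containers` mapping are hypothetical stand-ins for the yams schema objects, and the type names are made up for the demo.

```python
def outermost_containers(node, containers, _done=None):
    """Yield the outermost containers reachable from `node`.

    `containers` maps a node to the nodes that contain it; a node with no
    entry is its own container.
    """
    if _done is None:
        _done = set()
    _done.add(node)
    parents = containers.get(node, ())
    if parents:
        for parent in parents:
            if parent in _done:
                continue  # already visited: prevents endless recursion on cycles
            for top in outermost_containers(parent, containers, _done):
                yield top
    else:
        yield node


# Hypothetical containment graph: Comment -> Ticket -> Project.
containers = {'Comment': ['Ticket'], 'Ticket': ['Project']}
print(list(outermost_containers('Comment', containers)))  # ['Project']
print(list(outermost_containers('Project', containers)))  # ['Project']
```

Sharing `_done` across the recursive calls means each node is expanded at most once, so a cyclic graph terminates (and simply yields nothing for the nodes on the cycle).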
def reindex_entities(schema, session, withpb=True):
    """reindex all entities in the repository"""
    # deactivate modification_date hook since we don't want them
    # to be updated due to the reindexing
    repo = session.repo
    cursor = session.pool['system']
    if not repo.system_source.dbhelper.has_fti_table(cursor):
        print 'no text index table'
        repo.system_source.dbhelper.init_fti(cursor)
    repo.system_source.do_fti = True  # ensure full-text indexation is activated
    etypes = set()
    for eschema in schema.entities():
        if eschema.final:
            continue
        indexable_attrs = tuple(eschema.indexable_attributes()) # generator
        if not indexable_attrs:
            continue
        for container in etype_fti_containers(eschema):
            etypes.add(container)
    print 'Reindexing entities of type %s' % \
          ', '.join(sorted(str(e) for e in etypes))
    if withpb:
        pb = ProgressBar(len(etypes) + 1)
    # first monkey patch Entity.check to disable validation
    # clear fti table first
    session.system_sql('DELETE FROM %s' % session.repo.system_source.dbhelper.fti_table)
    if withpb:
        pb.update()
    # reindex entities by generating rql queries which set all indexable
    # attributes to their current value
    source = repo.system_source
    for eschema in etypes:
        for entity in session.execute('Any X WHERE X is %s' % eschema).entities():
            source.fti_index_entity(session, entity)
        if withpb:
            pb.update()


def check_schema(schema, session, eids, fix=1):
    """check serialized schema"""
    print 'Checking serialized schema'
    unique_constraints = ('SizeConstraint', 'FormatConstraint',
                          'VocabularyConstraint',
                          'RQLVocabularyConstraint')
    rql = ('Any COUNT(X),RN,SN,ON,CTN GROUPBY RN,SN,ON,CTN ORDERBY 1 '
           'WHERE X is CWConstraint, R constrained_by X, '
           'R relation_type RT, RT name RN, R from_entity ST, ST name SN, '
           'R to_entity OT, OT name ON, X cstrtype CT, CT name CTN')
    for count, rn, sn, on, cstrname in session.execute(rql):
        if count == 1:
            continue
        if cstrname in unique_constraints:
            print "ERROR: got %s %r constraints on relation %s.%s.%s" % (
                count, cstrname, sn, rn, on)
            if fix:
                print 'dunno how to fix, do it yourself'



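The RQL query in check_schema groups constraints by relation name, subject type, object type and constraint type, then reports groups with more than one member when the constraint type must be unique. The same idea can be sketched in plain Python with collections.Counter. The rows below are invented sample data and duplicated_constraints is a hypothetical helper; only the tuple of constraint type names comes from the code above.

```python
from collections import Counter

# Constraint types that should appear at most once per relation definition
# (taken from check_schema above).
UNIQUE_CONSTRAINTS = ('SizeConstraint', 'FormatConstraint',
                      'VocabularyConstraint', 'RQLVocabularyConstraint')

# Invented sample rows: (relation name, subject type, object type, constraint type).
rows = [
    ('name', 'Person', 'String', 'SizeConstraint'),
    ('name', 'Person', 'String', 'SizeConstraint'),
    ('name', 'Person', 'String', 'RQLConstraint'),
    ('name', 'Person', 'String', 'RQLConstraint'),
    ('title', 'Book', 'String', 'SizeConstraint'),
]


def duplicated_constraints(rows, unique=UNIQUE_CONSTRAINTS):
    """Return (row, count) pairs for unique constraint types seen more than once."""
    counts = Counter(rows)  # plays the role of COUNT(X) ... GROUPBY
    return sorted((key, n) for key, n in counts.items()
                  if n > 1 and key[3] in unique)


print(duplicated_constraints(rows))
# [(('name', 'Person', 'String', 'SizeConstraint'), 2)]
```

The duplicated RQLConstraint rows are not reported, mirroring the fact that RQLConstraint is absent from the tuple of unique constraint types.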
def check_text_index(schema, session, eids, fix=1):
    """check all entities registered in the text index"""
    print 'Checking text index'
    cursor = session.system_sql('SELECT uid FROM appears;')
    for row in cursor.fetchall():
        eid = row[0]
        if not has_eid(session, cursor, eid, eids):
            msg = '  Entity with eid %s exists in the text index but in no source'
            print >> sys.stderr, msg % eid,
            if fix:
                session.system_sql('DELETE FROM appears WHERE uid=%s;' % eid)
                print >> sys.stderr, ' [FIXED]'
            else:
                print >> sys.stderr


b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   166
def check_entities(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   167
    """check all entities registered in the repo system table"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   168
    print 'Checking entities system table'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   169
    cursor = session.system_sql('SELECT eid FROM entities;')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   170
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   171
        eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   172
        if not has_eid(session, cursor, eid, eids):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   173
            msg = '  Entity with eid %s exists in the system table but in no source'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   174
            print >> sys.stderr, msg % eid,
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   175
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   176
                session.system_sql('DELETE FROM entities WHERE eid=%s;' % eid)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   177
                print >> sys.stderr, ' [FIXED]'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   178
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   179
                print >> sys.stderr
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   180
    print 'Checking entities tables'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   181
    for eschema in schema.entities():
3689
deb13e88e037 follow yams 0.25 api changes to improve performance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3374
diff changeset
   182
        if eschema.final:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   183
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   184
        table = SQL_PREFIX + eschema.type
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   185
        column = SQL_PREFIX +  'eid'
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   186
        cursor = session.system_sql('SELECT %s FROM %s;' % (column, table))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   187
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   188
            eid = row[0]
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
   189
            # eids is full since we have fetched everything from the entities table,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   190
            # no need to call has_eid
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   191
            if eid not in eids or not eids[eid]:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   192
                msg = '  Entity with eid %s exists in the %s table but not in the system table'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   193
                print >> sys.stderr, msg % (eid, eschema.type),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   194
                if fix:
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   195
                    session.system_sql('DELETE FROM %s WHERE %s=%s;' % (table, column, eid))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   196
                    print >> sys.stderr, ' [FIXED]'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   197
                else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   198
                    print >> sys.stderr
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   199
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   200
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   201
def bad_related_msg(rtype, target, eid, fix):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   202
    msg = '  A relation %s with %s eid %s exists but no such entity in sources'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   203
    print >> sys.stderr, msg % (rtype, target, eid),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   204
    if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   205
        print >> sys.stderr, ' [FIXED]'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   206
    else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   207
        print >> sys.stderr
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   208
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   209
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   210
def check_relations(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   211
    """check all relations registered in the repo system table"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   212
    print 'Checking relations'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   213
    for rschema in schema.relations():
3689
deb13e88e037 follow yams 0.25 api changes to improve performance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3374
diff changeset
   214
        if rschema.final or rschema in PURE_VIRTUAL_RTYPES:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   215
            continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   216
        if rschema.inlined:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   217
            for subjtype in rschema.subjects():
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   218
                table = SQL_PREFIX + str(subjtype)
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   219
                column = SQL_PREFIX + str(rschema)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   220
                sql = 'SELECT %s FROM %s WHERE %s IS NOT NULL;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   221
                    column, table, column)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   222
                cursor = session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   223
                for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   224
                    eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   225
                    if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   226
                        bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   227
                        if fix:
3374
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   228
                            sql = 'UPDATE %s SET %s=NULL WHERE %s=%s;' % (
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   229
                                table, column, column, eid)
381
sylvain.thenault@logilab.fr
parents: 380
diff changeset
   230
                            session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   231
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   232
        cursor = session.system_sql('SELECT eid_from FROM %s_relation;' % rschema)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   233
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   234
            eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   235
            if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   236
                bad_related_msg(rschema, 'subject', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   237
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   238
                    sql = 'DELETE FROM %s_relation WHERE eid_from=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   239
                        rschema, eid)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   240
                    session.system_sql(sql)
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   241
        cursor = session.system_sql('SELECT eid_to FROM %s_relation;' % rschema)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   242
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   243
            eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   244
            if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   245
                bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   246
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   247
                    sql = 'DELETE FROM %s_relation WHERE eid_to=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   248
                        rschema, eid)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   249
                    session.system_sql(sql)
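The non-inlined branch of check_relations above fetches every eid_from and eid_to and tests each one with has_eid. As a rough illustration of the same idea, the sketch below finds dangling eids with a single NOT EXISTS query. It is only a sketch under stated assumptions: the standard-library sqlite3 module stands in for session.system_sql, find_orphan_eids is a hypothetical helper that is not part of CubicWeb, and it looks only at the entities table, so it does not reproduce whatever has_eid does for entities coming from external sources.

```python
import re
import sqlite3

# Identifiers cannot be passed as bound parameters, so validate them before
# interpolating them into the SQL text.
_IDENT = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')


def find_orphan_eids(conn, rtype, side):
    """Return eids used on `side` of <rtype>_relation but absent from entities.

    Hypothetical helper; `side` must be 'eid_from' or 'eid_to'.
    """
    if side not in ('eid_from', 'eid_to'):
        raise ValueError('unexpected column: %r' % side)
    if not _IDENT.match(rtype):
        raise ValueError('unexpected relation name: %r' % rtype)
    sql = ('SELECT DISTINCT r.%s FROM %s_relation AS r '
           'WHERE NOT EXISTS (SELECT 1 FROM entities AS e WHERE e.eid = r.%s) '
           'ORDER BY r.%s' % (side, rtype, side, side))
    return [row[0] for row in conn.execute(sql)]


def demo():
    """Build a tiny in-memory database with one dangling eid on each side."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
    conn.execute('CREATE TABLE owned_by_relation (eid_from INTEGER, eid_to INTEGER)')
    conn.executemany('INSERT INTO entities VALUES (?, ?)',
                     [(1, 'CWUser'), (2, 'Card')])
    conn.executemany('INSERT INTO owned_by_relation VALUES (?, ?)',
                     [(2, 1), (3, 1), (2, 9)])
    return (find_orphan_eids(conn, 'owned_by', 'eid_from'),
            find_orphan_eids(conn, 'owned_by', 'eid_to'))
```

Letting the database do the anti-join avoids one lookup per relation row, at the cost of bypassing the eids cache that the listing shares between checks.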
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   250
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   251
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   252
def check_metadata(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   253
    """check entities has required metadata
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   254
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   255
    FIXME: rewrite using RQL queries?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   256
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   257
    print 'Checking metadata'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   258
    cursor = session.system_sql("SELECT DISTINCT type FROM entities;")
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   259
    eidcolumn = SQL_PREFIX + 'eid'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   260
    for etype, in cursor.fetchall():
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   261
        table = SQL_PREFIX + etype
1016
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
   262
        for rel, default in ( ('creation_date', datetime.now()),
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
   263
                              ('modification_date', datetime.now()), ):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   264
            column = SQL_PREFIX + rel
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   265
            cursor = session.system_sql("SELECT %s FROM %s WHERE %s is NULL"
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   266
                                        % (eidcolumn, table, column))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   267
            for eid, in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   268
                msg = '  %s with eid %s has no %s'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   269
                print >> sys.stderr, msg % (etype, eid, rel),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   270
                if fix:
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   271
                    session.system_sql("UPDATE %s SET %s=%%(v)s WHERE %s=%s ;"
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   272
                                       % (table, column, eidcolumn, eid),
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   273
                                       {'v': default})
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   274
                    print >> sys.stderr, ' [FIXED]'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   275
                else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   276
                    print >> sys.stderr
1398
5fe84a5f7035 rename internal entity types to have CW prefix instead of E
sylvain.thenault@logilab.fr
parents: 1263
diff changeset
   277
    cursor = session.system_sql('SELECT MIN(%s) FROM %sCWUser;' % (eidcolumn,
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   278
                                                                  SQL_PREFIX))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   279
    default_user_eid = cursor.fetchone()[0]
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   280
    assert default_user_eid is not None, 'no user defined !'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   281
    for rel, default in ( ('owned_by', default_user_eid), ):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   282
        cursor = session.system_sql("SELECT eid, type FROM entities "
5340
4de474016568 [db-check] don't check entities from external sources have owned_by
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5339
diff changeset
   283
                                    "WHERE source='system' AND NOT EXISTS "
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   284
                                    "(SELECT 1 FROM %s_relation WHERE eid_from=eid);"
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   285
                                    % rel)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   286
        for eid, etype in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   287
            msg = '  %s with eid %s has no %s relation'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   288
            print >> sys.stderr, msg % (etype, eid, rel),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   289
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   290
                session.system_sql('INSERT INTO %s_relation VALUES (%s, %s) ;'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   291
                                   % (rel, eid, default))
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   292
                print >> sys.stderr, ' [FIXED]'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   293
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   294
                print >> sys.stderr
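check_metadata above already passes the default value as a bound parameter (the %(v)s placeholder) while interpolating the eid into the SQL text. The sketch below shows the same repair with both values bound. It is a minimal illustration, not the module's code: sqlite3 stands in for session.system_sql and uses the :name placeholder style instead of %(name)s, fill_missing is a hypothetical helper, the date is stored as an ISO string, and the cw_Card table exists only for the demo.

```python
import sqlite3
from datetime import datetime


def fill_missing(conn, table, column, eidcolumn, default):
    """Set `column` to `default` on every row where it is NULL.

    Table and column names are interpolated because identifiers cannot be
    bound; they are assumed to come from a trusted schema, as in the listing.
    Returns the eids that were repaired.
    """
    cursor = conn.execute('SELECT %s FROM %s WHERE %s IS NULL'
                          % (eidcolumn, table, column))
    fixed = []
    for (eid,) in cursor.fetchall():
        # Both the new value and the eid travel as bound parameters.
        conn.execute('UPDATE %s SET %s = :v WHERE %s = :eid'
                     % (table, column, eidcolumn),
                     {'v': default, 'eid': eid})
        fixed.append(eid)
    return fixed


def demo():
    """Repair one row with a missing creation date in an in-memory table."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE cw_Card (cw_eid INTEGER PRIMARY KEY, '
                 'cw_creation_date TEXT)')
    conn.executemany('INSERT INTO cw_Card VALUES (?, ?)',
                     [(1, '2010-01-01T00:00:00'), (2, None)])
    fixed = fill_missing(conn, 'cw_Card', 'cw_creation_date', 'cw_eid',
                         datetime.now().isoformat())
    return conn, fixed
```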
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   295
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   296
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   297
def check(repo, cnx, checks, reindex, fix, withpb=True):
2476
1294a6bdf3bf application -> instance where it makes sense
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2248
diff changeset
   298
    """check integrity of instance's repository,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   299
    using the given connection to access the repository locally
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   300
    (no running cubicweb server needed)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   301
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   302
    session = repo._get_session(cnx.sessionid, setpool=True)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   303
    # run the requested checks
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   304
    if checks:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   305
        eids_cache = {}
4835
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   306
        with security_enabled(session, read=False): # ensure no read security
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   307
            for check in checks:
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   308
                check_func = globals()['check_%s' % check]
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   309
                check_func(repo.schema, session, eids_cache, fix=fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   310
        if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   311
            cnx.commit()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   312
        else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   313
            print
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   314
        if not fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   315
            print 'WARNING: Diagnostic run, nothing has been corrected'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   316
    if reindex:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   317
        cnx.rollback()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   318
        session.set_pool()
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   319
        reindex_entities(repo.schema, session, withpb=withpb)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   320
        cnx.commit()
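The check function above resolves each requested check with globals()['check_%s' % check], so an unknown name surfaces as a bare KeyError. One possible alternative, sketched below, is an explicit registry that rejects unknown names before anything runs. Everything here is hypothetical: register, CHECKS and run_checks do not exist in the module, and the two check functions are stubs that only share names and signatures with the real ones.

```python
# Hypothetical registry mapping short check names to functions.
CHECKS = {}


def register(func):
    """Register a function named check_<name> under the key <name>."""
    prefix = 'check_'
    if not func.__name__.startswith(prefix):
        raise ValueError('not a check function: %s' % func.__name__)
    CHECKS[func.__name__[len(prefix):]] = func
    return func


@register
def check_entities(schema, session, eids, fix=1):
    # Stub standing in for the real check of the same name.
    return 'entities checked'


@register
def check_metadata(schema, session, eids, fix=1):
    # Stub standing in for the real check of the same name.
    return 'metadata checked'


def run_checks(names, schema=None, session=None, fix=False):
    """Run the named checks in order, sharing one eids cache between them."""
    unknown = [name for name in names if name not in CHECKS]
    if unknown:
        raise ValueError('unknown checks: %s' % ', '.join(unknown))
    eids_cache = {}
    return [CHECKS[name](schema, session, eids_cache, fix=fix)
            for name in names]
```

Validating all names up front means a typo in the last check does not abort the run after earlier checks have already modified the database.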