server/checkintegrity.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Tue, 11 Sep 2012 12:42:26 +0200
changeset 8581 ac3cbf55d9fb
parent 8578 f6ea4ea63d57
child 8694 d901c36bcfce
permissions -rw-r--r--
[entity attr cache] mark attribute as uncacheable in the underlying function, else we may miss some changes. Also rewrite a storage test that currently fails because the cache of the entity created by the test transaction, distinct from the entity created internally and given to hooks and all, does not have its attributes cache updated. As this doesn't seem a proper usage, rewrite it as expected. Most probably closes #2423719 definitely.
# copyright 2003-2012 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
#
# This file is part of CubicWeb.
#
# CubicWeb is free software: you can redistribute it and/or modify it under the
# terms of the GNU Lesser General Public License as published by the Free
# Software Foundation, either version 2.1 of the License, or (at your option)
# any later version.
#
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License along
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
"""Integrity checking tool for instances:

* integrity of a CubicWeb repository. Currently, only the system database is
  checked.
"""

from __future__ import with_statement

__docformat__ = "restructuredtext en"

import sys
from datetime import datetime

from logilab.common.shellutils import ProgressBar

from cubicweb.schema import PURE_VIRTUAL_RTYPES, VIRTUAL_RTYPES
from cubicweb.server.sqlutils import SQL_PREFIX

def notify_fixed(fix):
    """end the current stderr report line, flagging it as fixed if `fix` is true"""
    if fix:
        sys.stderr.write(' [FIXED]')
    sys.stderr.write('\n')

def has_eid(session, sqlcursor, eid, eids):
    """return true if the eid is a valid eid (answers are cached in `eids`)"""
    if eid in eids:
        return eids[eid]
    sqlcursor.execute('SELECT type, source FROM entities WHERE eid=%s' % eid)
    try:
        etype, source = sqlcursor.fetchone()
    except Exception:
        eids[eid] = False
        return False
    if source and source != 'system':
        try:
            # insert eid *and* etype to attempt checking entity has not been
            # replaced by another subsequently to a restore of an old dump
            if session.execute('Any X WHERE X is %s, X eid %%(x)s' % etype,
                               {'x': eid}):
                eids[eid] = True
                return True
        except Exception: # TypeResolverError, Unauthorized...
            pass
        eids[eid] = False
        return False
    sqlcursor.execute('SELECT * FROM %s%s WHERE %seid=%s' % (SQL_PREFIX, etype,
                                                             SQL_PREFIX, eid))
    result = sqlcursor.fetchall()
    if len(result) == 0:
        eids[eid] = False
        return False
    elif len(result) > 1:
        msg = ('  More than one entity with eid %s exists in source !\n'
               '  WARNING : Unable to fix this, do it yourself !\n')
        sys.stderr.write(msg % eid)
    eids[eid] = True
    return True

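The caching pattern used by has_eid can be tried in isolation. The following is a minimal sketch, not CubicWeb code: it assumes an in-memory SQLite database with a simplified `entities` table, and the helper name `eid_exists` is hypothetical. Unlike the module above it targets Python 3, and it binds the eid as a query parameter rather than formatting it into the SQL string.

```python
import sqlite3


def eid_exists(cursor, eid, cache):
    """Return True if `eid` is in the entities table, memoizing the answer."""
    if eid in cache:
        return cache[eid]
    # bind the eid as a parameter instead of formatting it into the query
    cursor.execute('SELECT type FROM entities WHERE eid=?', (eid,))
    cache[eid] = cursor.fetchone() is not None
    return cache[eid]


# simplified stand-in for the system database (assumption, not the real schema)
cnx = sqlite3.connect(':memory:')
cursor = cnx.cursor()
cursor.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
cursor.execute("INSERT INTO entities VALUES (1, 'CWUser')")

cache = {}
found = eid_exists(cursor, 1, cache)
missing = eid_exists(cursor, 2, cache)
```

Sharing one `cache` dictionary across all checks, as the `eids` argument does above, means each eid is looked up in the database at most once per run.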
# XXX move to yams?
def etype_fti_containers(eschema, _done=None):
    """yield the entity types holding the indexed text of `eschema`: follow
    its fulltext containers recursively, or yield `eschema` itself if it has
    none
    """
    if _done is None:
        _done = set()
    _done.add(eschema)
    containers = tuple(eschema.fulltext_containers())
    if containers:
        for rschema, target in containers:
            if target == 'object':
                targets = rschema.objects(eschema)
            else:
                targets = rschema.subjects(eschema)
            for targeteschema in targets:
                if targeteschema in _done:
                    continue
                _done.add(targeteschema)
                for container in etype_fti_containers(targeteschema, _done):
                    yield container
    else:
        yield eschema

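The recursion in etype_fti_containers can be illustrated without a yams schema. This is a sketch under stated assumptions: the function name `fti_containers`, the plain dictionary standing in for the schema, and the entity type names are all hypothetical, and the code targets Python 3.

```python
def fti_containers(etype, containers, _done=None):
    """Yield the entity types that ultimately hold the indexed text of `etype`.

    `containers` maps an entity type to the types that contain its text;
    a type missing from the mapping is its own container.
    """
    if _done is None:
        _done = set()
    _done.add(etype)
    targets = containers.get(etype, ())
    if targets:
        for target in targets:
            if target in _done:
                continue  # already visited: avoids looping on cycles
            _done.add(target)
            for container in fti_containers(target, containers, _done):
                yield container
    else:
        yield etype


# hypothetical schema: a Comment's text is indexed with its Ticket,
# and a Ticket's text is indexed with its Project
toy_containers = {'Comment': ['Ticket'], 'Ticket': ['Project']}
comment_containers = list(fti_containers('Comment', toy_containers))
project_containers = list(fti_containers('Project', toy_containers))
```

Here both calls resolve to `Project`, which is why reindexing only needs to process the outermost container types.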
def reindex_entities(schema, session, withpb=True, etypes=None):
    """reindex all entities in the repository"""
    # deactivate modification_date hook since we don't want them
    # to be updated due to the reindexation
    repo = session.repo
    cursor = session.cnxset['system']
    dbhelper = session.repo.system_source.dbhelper
    if not dbhelper.has_fti_table(cursor):
        print 'no text index table'
        dbhelper.init_fti(cursor)
    repo.system_source.do_fti = True  # ensure full-text indexation is activated
    if etypes is None:
        print 'Reindexing entities'
        etypes = set()
        for eschema in schema.entities():
            if eschema.final:
                continue
            indexable_attrs = tuple(eschema.indexable_attributes()) # generator
            if not indexable_attrs:
                continue
            for container in etype_fti_containers(eschema):
                etypes.add(container)
        # clear fti table first
        session.system_sql('DELETE FROM %s' % dbhelper.fti_table)
    else:
        print 'Reindexing entities of type %s' % \
              ', '.join(sorted(str(e) for e in etypes))
        # clear fti table first. Use subquery for sql compatibility
        session.system_sql("DELETE FROM %s WHERE EXISTS(SELECT 1 FROM ENTITIES "
                           "WHERE eid=%s AND type IN (%s))" % (
                               dbhelper.fti_table, dbhelper.fti_uid_attr,
                               ','.join("'%s'" % etype for etype in etypes)))
    if withpb:
        pb = ProgressBar(len(etypes) + 1)
        pb.update()
    # reindex entities by generating rql queries which set all indexable
    # attribute to their current value
    source = repo.system_source
    for eschema in etypes:
        etype_class = session.vreg['etypes'].etype_class(str(eschema))
        for fti_rql in etype_class.cw_fti_index_rql_queries(session):
            rset = session.execute(fti_rql)
            source.fti_index_entities(session, rset.entities())
            # clear entity cache to avoid high memory consumption on big tables
            session.drop_entity_cache()
        if withpb:
            pb.update()


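The partial clean-up in reindex_entities relies on a correlated subquery to delete only the index rows of the selected entity types. The sketch below reproduces that statement against an in-memory SQLite database. The table layouts are simplified assumptions (in particular the `words` column and the sample rows are made up), the code targets Python 3, and the type names are bound as parameters instead of being quoted by hand.

```python
import sqlite3

cnx = sqlite3.connect(':memory:')
cursor = cnx.cursor()
cursor.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
cursor.execute('CREATE TABLE appears (uid INTEGER, words TEXT)')
cursor.executemany('INSERT INTO entities VALUES (?, ?)',
                   [(1, 'Ticket'), (2, 'Comment'), (3, 'Project')])
cursor.executemany('INSERT INTO appears VALUES (?, ?)',
                   [(1, 'crash'), (2, 'me too'), (3, 'cubicweb')])

etypes = ['Ticket', 'Comment']
placeholders = ','.join('?' for _ in etypes)
# correlated subquery: `appears.uid` is the index row considered for deletion
cursor.execute('DELETE FROM appears WHERE EXISTS('
               'SELECT 1 FROM entities '
               'WHERE eid=appears.uid AND type IN (%s))' % placeholders,
               etypes)
remaining = [uid for (uid,) in
             cursor.execute('SELECT uid FROM appears ORDER BY uid')]
```

Only the row belonging to the `Project` entity survives, so entities of other types keep their index entries while the selected types are rebuilt.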
def check_schema(schema, session, eids, fix=1):
    """check serialized schema"""
    print 'Checking serialized schema'
    unique_constraints = ('SizeConstraint', 'FormatConstraint',
                          'VocabularyConstraint',
                          'RQLVocabularyConstraint')
    rql = ('Any COUNT(X),RN,SN,ON,CTN GROUPBY RN,SN,ON,CTN ORDERBY 1 '
           'WHERE X is CWConstraint, R constrained_by X, '
           'R relation_type RT, RT name RN, R from_entity ST, ST name SN, '
           'R to_entity OT, OT name ON, X cstrtype CT, CT name CTN')
    for count, rn, sn, on, cstrname in session.execute(rql):
        if count == 1:
            continue
        if cstrname in unique_constraints:
            print "ERROR: got %s %r constraints on relation %s.%s.%s" % (
                count, cstrname, sn, rn, on)
            if fix:
                print 'dunno how to fix, do it yourself'



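The RQL query in check_schema groups constraints by relation, subject type, object type and constraint type, then reports groups of more than one for constraint types that must be unique. The same logic can be sketched in plain Python 3 with `collections.Counter`. The function name `duplicated_constraints` and the sample rows are hypothetical; only the list of unique constraint types comes from the code above.

```python
from collections import Counter

UNIQUE_CONSTRAINTS = ('SizeConstraint', 'FormatConstraint',
                      'VocabularyConstraint', 'RQLVocabularyConstraint')


def duplicated_constraints(rows):
    """Return {(subject, relation, object, constraint type): count} for duplicates.

    `rows` is an iterable of (relation, subject, object, constraint type)
    tuples, one per constraint entity.
    """
    counts = Counter((sn, rn, on, ctn) for rn, sn, on, ctn in rows)
    return dict((key, count) for key, count in counts.items()
                if count > 1 and key[3] in UNIQUE_CONSTRAINTS)


# made-up rows for illustration
rows = [
    ('name', 'Project', 'String', 'SizeConstraint'),
    ('name', 'Project', 'String', 'SizeConstraint'),  # duplicate, reported
    ('state', 'Ticket', 'String', 'RQLConstraint'),
    ('state', 'Ticket', 'String', 'RQLConstraint'),   # may appear several times
]
duplicates = duplicated_constraints(rows)
```

The repeated `RQLConstraint` is not reported because that type is absent from the unique list, matching the behaviour of the check above.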
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   167
def check_text_index(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   168
    """check all entities registered in the text index"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   169
    print 'Checking text index'
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   170
    msg = '  Entity with eid %s exists in the text index but in no source (autofix will remove from text index)'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   171
    cursor = session.system_sql('SELECT uid FROM appears;')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   172
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   173
        eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   174
        if not has_eid(session, cursor, eid, eids):
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   175
            sys.stderr.write(msg % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   176
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   177
                session.system_sql('DELETE FROM appears WHERE uid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   178
            notify_fixed(fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   179
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   180
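# Illustrative sketch only, not part of checkintegrity.py: the same
# "delete index rows whose eid is unknown" idea as check_text_index above,
# written against an in-memory SQLite database with bound parameters instead
# of '%' string formatting. The helper name, the set of known eids and the
# one-column `appears` table are assumptions made for this example.

```python
import sqlite3


def remove_orphans_from_text_index(conn, known_eids, fix=True):
    """Return uids found in `appears` that are not in `known_eids`.

    When `fix` is true, the orphan rows are also deleted.
    """
    orphans = []
    rows = conn.execute('SELECT uid FROM appears ORDER BY uid').fetchall()
    for (uid,) in rows:
        if uid not in known_eids:
            orphans.append(uid)
            if fix:
                # bound parameter: the driver handles quoting of the value
                conn.execute('DELETE FROM appears WHERE uid=?', (uid,))
    return orphans


def build_demo_index():
    """Build a toy text index holding uids 1, 2 and 3 (hypothetical data)."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE appears (uid INTEGER)')
    conn.executemany('INSERT INTO appears VALUES (?)', [(1,), (2,), (3,)])
    return conn
```

# With known eids {1, 3}, only uid 2 is reported and, when fix is true, removed.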
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   181
def check_entities(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   182
    """check all entities registered in the repo system table"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   183
    print 'Checking entities system table'
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   184
    # system table but no source
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   185
    msg = '  Entity with eid %s exists in the system table but in no source (autofix will delete the entity)'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   186
    cursor = session.system_sql('SELECT eid FROM entities;')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   187
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   188
        eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   189
        if not has_eid(session, cursor, eid, eids):
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   190
            sys.stderr.write(msg % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   191
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   192
                session.system_sql('DELETE FROM entities WHERE eid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   193
            notify_fixed(fix)
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   194
    # source in entities, but no relation cw_source
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   195
    applcwversion = session.repo.get_versions().get('cubicweb')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   196
    if applcwversion >= (3, 13, 1):  # entities.asource appeared in 3.13.1
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   197
        cursor = session.system_sql('SELECT e.eid FROM entities as e, cw_CWSource as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   198
                                    'WHERE s.cw_name=e.asource AND '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   199
                                    'NOT EXISTS(SELECT 1 FROM cw_source_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   200
                                    '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   201
                                    'ORDER BY e.eid')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   202
        msg = ('  Entity with eid %s refers to source in entities table, '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   203
               'but is missing relation cw_source (autofix will create the relation)\n')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   204
        for row in cursor.fetchall():
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   205
            sys.stderr.write(msg % row[0])
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   206
        if fix:
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   207
            session.system_sql('INSERT INTO cw_source_relation (eid_from, eid_to) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   208
                               'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWSource as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   209
                               'WHERE s.cw_name=e.asource AND NOT EXISTS(SELECT 1 FROM cw_source_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   210
                               '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   211
            notify_fixed(True)
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   212
    # inconsistencies for 'is'
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   213
    msg = '  %s #%s is missing relation "is" (autofix will create the relation)\n'
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   214
    cursor = session.system_sql('SELECT e.type, e.eid FROM entities as e, cw_CWEType as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   215
                                'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   216
                                '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   217
                                'ORDER BY e.eid')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   218
    for row in cursor.fetchall():
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   219
        sys.stderr.write(msg % row)
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   220
    if fix:
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   221
        session.system_sql('INSERT INTO is_relation (eid_from, eid_to) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   222
                           'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWEType as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   223
                           'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   224
                           '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   225
        notify_fixed(True)
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   226
    # inconsistencies for 'is_instance_of'
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   227
    msg = '  %s #%s is missing relation "is_instance_of" (autofix will create the relation)\n'
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   228
    cursor = session.system_sql('SELECT e.type, e.eid FROM entities as e, cw_CWEType as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   229
                                'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_instance_of_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   230
                                '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   231
                                'ORDER BY e.eid')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   232
    for row in cursor.fetchall():
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   233
        sys.stderr.write(msg % row)
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   234
    if fix:
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   235
        session.system_sql('INSERT INTO is_instance_of_relation (eid_from, eid_to) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   236
                           'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWEType as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   237
                           'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_instance_of_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   238
                           '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   239
        notify_fixed(True)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   240
    print 'Checking entities tables'
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   241
    msg = '  Entity with eid %s exists in the %s table but not in the system table (autofix will delete the entity)'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   242
    for eschema in schema.entities():
3689
deb13e88e037 follow yams 0.25 api changes to improve performance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3374
diff changeset
   243
        if eschema.final:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   244
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   245
        table = SQL_PREFIX + eschema.type
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   246
        column = SQL_PREFIX + 'eid'
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   247
        cursor = session.system_sql('SELECT %s FROM %s;' % (column, table))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   248
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   249
            eid = row[0]
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
   250
            # eids is full since we have fetched everything from the entities table,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   251
            # no need to call has_eid
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   252
            if eid not in eids or not eids[eid]:
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   253
                sys.stderr.write(msg % (eid, eschema.type))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   254
                if fix:
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   255
                    session.system_sql('DELETE FROM %s WHERE %s=%s;' % (table, column, eid))
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   256
                notify_fixed(fix)
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   257
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   258
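# Illustrative sketch only, not part of checkintegrity.py: the NOT EXISTS
# queries used by check_entities above to detect and repair a missing "is"
# relation, run against a toy in-memory SQLite database. The table layouts
# are reduced to the columns the queries touch, and the sample rows and
# helper names are assumptions made for this example.

```python
import sqlite3

# Shared tail of both statements: entities whose type is known but which
# have no matching row in is_relation.
_MISSING_IS = (
    'FROM entities AS e, cw_CWEType AS s '
    'WHERE s.cw_name=e.type AND NOT EXISTS('
    '  SELECT 1 FROM is_relation AS cs '
    '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')


def find_missing_is_relation(conn):
    """Return (type, eid) rows for entities lacking their 'is' relation."""
    sql = 'SELECT e.type, e.eid ' + _MISSING_IS + ' ORDER BY e.eid'
    return conn.execute(sql).fetchall()


def fix_missing_is_relation(conn):
    """Create the missing 'is' relations in a single INSERT ... SELECT."""
    conn.execute('INSERT INTO is_relation (eid_from, eid_to) '
                 'SELECT e.eid, s.cw_eid ' + _MISSING_IS)


def build_demo_db():
    """Build a toy database where entity 11 misses its 'is' relation."""
    conn = sqlite3.connect(':memory:')
    conn.executescript(
        'CREATE TABLE entities (eid INTEGER, type TEXT);'
        'CREATE TABLE cw_CWEType (cw_eid INTEGER, cw_name TEXT);'
        'CREATE TABLE is_relation (eid_from INTEGER, eid_to INTEGER);')
    conn.execute("INSERT INTO cw_CWEType VALUES (1, 'CWUser')")
    conn.executemany('INSERT INTO entities VALUES (?, ?)',
                     [(10, 'CWUser'), (11, 'CWUser')])
    # only entity 10 is linked to its type
    conn.execute('INSERT INTO is_relation VALUES (10, 1)')
    return conn
```

# Because detection and repair share the same condition, running the repair
# leaves the detection query with nothing to report.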
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   259
def bad_related_msg(rtype, target, eid, fix):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   260
    msg = '  A relation %s with %s eid %s exists but no such entity in sources'
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   261
    sys.stderr.write(msg % (rtype, target, eid))
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   262
    notify_fixed(fix)
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   263
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   264
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   265
def check_relations(schema, session, eids, fix=1):
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   266
    """check that eids referenced by relations are registered in the repo system
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   267
    table
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   268
    """
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   269
    print 'Checking relations'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   270
    for rschema in schema.relations():
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   271
        if rschema.final or rschema.type in PURE_VIRTUAL_RTYPES:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   272
            continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   273
        if rschema.inlined:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   274
            for subjtype in rschema.subjects():
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   275
                table = SQL_PREFIX + str(subjtype)
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   276
                column = SQL_PREFIX + str(rschema)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   277
                sql = 'SELECT %s FROM %s WHERE %s IS NOT NULL;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   278
                    column, table, column)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   279
                cursor = session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   280
                for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   281
                    eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   282
                    if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   283
                        bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   284
                        if fix:
3374
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   285
                            sql = 'UPDATE %s SET %s=NULL WHERE %s=%s;' % (
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   286
                                table, column, column, eid)
381
sylvain.thenault@logilab.fr
parents: 380
diff changeset
   287
                            session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   288
            continue
6185
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   289
        try:
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   290
            cursor = session.system_sql('SELECT eid_from FROM %s_relation;' % rschema)
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   291
        except Exception, ex:
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   292
            # usually because table doesn't exist
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   293
            print 'ERROR', ex
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   294
            continue
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   295
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   296
            eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   297
            if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   298
                bad_related_msg(rschema, 'subject', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   299
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   300
                    sql = 'DELETE FROM %s_relation WHERE eid_from=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   301
                        rschema, eid)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   302
                    session.system_sql(sql)
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   303
        cursor = session.system_sql('SELECT eid_to FROM %s_relation;' % rschema)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   304
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   305
            eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   306
            if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   307
                bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   308
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   309
                    sql = 'DELETE FROM %s_relation WHERE eid_to=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   310
                        rschema, eid)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   311
                    session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   312
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   313
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   314
def check_mandatory_relations(schema, session, eids, fix=1):
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   315
    """check entities missing some mandatory relation"""
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   316
    print 'Checking mandatory relations'
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   317
    msg = '%s #%s is missing mandatory %s relation %s (autofix will delete the entity)'
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   318
    for rschema in schema.relations():
8527
6151b0199bc1 [c-c db-check] don't skip is/is_instance_of mandatory relation to avoid telling two opposite messages when the relation is missing. Closes #2465659
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8450
diff changeset
   319
        if rschema.final or rschema in PURE_VIRTUAL_RTYPES or rschema in ('is', 'is_instance_of'):
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   320
            continue
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   321
        smandatory = set()
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   322
        omandatory = set()
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   323
        for rdef in rschema.rdefs.values():
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   324
            if rdef.cardinality[0] in '1+':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   325
                smandatory.add(rdef.subject)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   326
            if rdef.cardinality[1] in '1+':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   327
                omandatory.add(rdef.object)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   328
        for role, etypes in (('subject', smandatory), ('object', omandatory)):
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   329
            for etype in etypes:
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   330
                if role == 'subject':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   331
                    rql = 'Any X WHERE NOT X %s Y, X is %s' % (rschema, etype)
                else:
                    rql = 'Any X WHERE NOT Y %s X, X is %s' % (rschema, etype)
                for entity in session.execute(rql).entities():
                    sys.stderr.write(msg % (entity.__regid__, entity.eid, role, rschema))
                    if fix:
                        #if entity.cw_describe()['source']['uri'] == 'system': XXX
                        entity.cw_delete() # XXX this is BRUTAL!
                    notify_fixed(fix)
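
The query-building step above can be isolated from the repository for illustration. The following is a minimal sketch, not the module's code: `mandatory_relation_queries` is a hypothetical helper, and plain `(subject, object, cardinality)` tuples stand in for the schema's relation definitions. It is written for Python 3, unlike the Python 2 listing.

```python
def mandatory_relation_queries(rtype, rdefs):
    """Return (role, etype, rql) triples for one relation type.

    `rdefs` is an iterable of (subject, object, cardinality) tuples, a
    simplified stand-in (assumption) for the schema's relation definitions.
    """
    smandatory, omandatory = set(), set()
    for subject, obj, cardinality in rdefs:
        # cardinality[0] constrains the subject side,
        # cardinality[1] constrains the object side
        if cardinality[0] in '1+':
            smandatory.add(subject)
        if cardinality[1] in '1+':
            omandatory.add(obj)
    queries = []
    for role, etypes in (('subject', smandatory), ('object', omandatory)):
        for etype in sorted(etypes):  # sorted only to make the output stable
            if role == 'subject':
                rql = 'Any X WHERE NOT X %s Y, X is %s' % (rtype, etype)
            else:
                rql = 'Any X WHERE NOT Y %s X, X is %s' % (rtype, etype)
            queries.append((role, etype, rql))
    return queries


if __name__ == '__main__':
    # illustrative relation definition, not read from a real schema
    for triple in mandatory_relation_queries(
            'in_group', [('CWUser', 'CWGroup', '+*')]):
        print(triple)
```

Each returned query selects the entities of one type that lack the relation in the given role, which is the set the check reports and, with `fix` enabled, deletes.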
def check_mandatory_attributes(schema, session, eids, fix=1):
    """check for entities stored in the system source missing some mandatory
    attribute
    """
    print 'Checking mandatory attributes'
    msg = '%s #%s is missing mandatory attribute %s (autofix will delete the entity)'
    for rschema in schema.relations():
        if not rschema.final or rschema in VIRTUAL_RTYPES:
            continue
        for rdef in rschema.rdefs.values():
            if rdef.cardinality[0] in '1+':
                rql = 'Any X WHERE X %s NULL, X is %s, X cw_source S, S name "system"' % (
                    rschema, rdef.subject)
                for entity in session.execute(rql).entities():
                    sys.stderr.write(msg % (entity.__regid__, entity.eid, rschema))
                    if fix:
                        entity.cw_delete()
                    notify_fixed(fix)
def check_metadata(schema, session, eids, fix=1):
    """check that entities have the required metadata
    FIXME: rewrite using RQL queries ?
    """
    print 'Checking metadata'
    cursor = session.system_sql("SELECT DISTINCT type FROM entities;")
    eidcolumn = SQL_PREFIX + 'eid'
    msg = '  %s with eid %s has no %s (autofix will set it to now)'
    for etype, in cursor.fetchall():
        table = SQL_PREFIX + etype
        for rel, default in ( ('creation_date', datetime.now()),
                              ('modification_date', datetime.now()), ):
            column = SQL_PREFIX + rel
            cursor = session.system_sql("SELECT %s FROM %s WHERE %s is NULL"
                                        % (eidcolumn, table, column))
            for eid, in cursor.fetchall():
                sys.stderr.write(msg % (etype, eid, rel))
                if fix:
                    session.system_sql("UPDATE %s SET %s=%%(v)s WHERE %s=%s ;"
                                       % (table, column, eidcolumn, eid),
                                       {'v': default})
                notify_fixed(fix)
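
The find-then-update pattern used by `check_metadata` can be tried outside CubicWeb. The sketch below is an illustration under stated assumptions: it uses the standard `sqlite3` module instead of `session.system_sql`, a hypothetical `fix_missing_dates` helper, and a hand-made `cw_Card` table. Unlike the listing, it also binds the eid as a parameter rather than interpolating it. It is written for Python 3.

```python
import sqlite3
from datetime import datetime

SQL_PREFIX = 'cw_'  # same prefix convention as in the listing


def fix_missing_dates(cnx, etype, fix=True):
    """Report, and optionally fill in, NULL date columns of one entity table.

    Returns the list of (eid, attribute) pairs found missing.
    """
    table = SQL_PREFIX + etype
    eidcolumn = SQL_PREFIX + 'eid'
    now = datetime.now().isoformat(' ')  # stored as text in this sketch
    missing = []
    for rel in ('creation_date', 'modification_date'):
        column = SQL_PREFIX + rel
        # table and column names cannot be bound as parameters,
        # so they are interpolated into the statement
        rows = cnx.execute('SELECT %s FROM %s WHERE %s IS NULL'
                           % (eidcolumn, table, column)).fetchall()
        for eid, in rows:
            missing.append((eid, rel))
            if fix:
                # values are bound through named placeholders
                cnx.execute('UPDATE %s SET %s=:v WHERE %s=:eid'
                            % (table, column, eidcolumn),
                            {'v': now, 'eid': eid})
    return missing


if __name__ == '__main__':
    cnx = sqlite3.connect(':memory:')
    cnx.execute('CREATE TABLE cw_Card (cw_eid INTEGER PRIMARY KEY, '
                'cw_creation_date TEXT, cw_modification_date TEXT)')
    cnx.execute("INSERT INTO cw_Card VALUES (1, '2012-09-11 12:42:26', NULL)")
    print(fix_missing_dates(cnx, 'Card'))
```

Running it with `fix=False` gives the diagnostic-only behaviour: rows are reported but left untouched.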
def check(repo, cnx, checks, reindex, fix, withpb=True):
    """check integrity of instance's repository,
    using the given connection to work locally on the repository
    (no running cubicweb server needed)
    """
    session = repo._get_session(cnx.sessionid, setcnxset=True)
    # run the requested checks
    if checks:
        eids_cache = {}
        with session.security_enabled(read=False, write=False): # disable read and write security
            for check in checks:
                check_func = globals()['check_%s' % check]
                check_func(repo.schema, session, eids_cache, fix=fix)
        if fix:
            session.commit()
        else:
            print
        if not fix:
            print 'WARNING: Diagnostic run, nothing has been corrected'
    if reindex:
        session.rollback()
        session.set_cnxset()
        reindex_entities(repo.schema, session, withpb=withpb)
        session.commit()
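
`check` resolves each check name through `globals()['check_%s' % check]`, so an unknown name surfaces as a bare `KeyError`. One possible alternative, shown here only as a sketch, is an explicit registry. The names `CHECKS`, `register_check` and `run_checks` are hypothetical and not part of the module, and the registered function is a stub standing in for the real check. The sketch is written for Python 3.

```python
CHECKS = {}  # hypothetical registry: check name -> check function


def register_check(func):
    """Register `func` under its name minus the 'check_' prefix."""
    CHECKS[func.__name__[len('check_'):]] = func
    return func


@register_check
def check_metadata(schema, session, eids, fix=True):
    # stub standing in for the real check, for illustration only
    return 'metadata checked (fix=%s)' % fix


def run_checks(names, schema=None, session=None, fix=True):
    """Run the named checks in order, sharing one eids cache."""
    unknown = [name for name in names if name not in CHECKS]
    if unknown:
        raise ValueError('unknown checks: %s' % ', '.join(unknown))
    eids_cache = {}
    return [CHECKS[name](schema, session, eids_cache, fix=fix)
            for name in names]


if __name__ == '__main__':
    print(run_checks(['metadata'], fix=False))
```

The registry validates all names before any check runs, so a typo in the list of checks fails early instead of after some checks have already modified data.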