server/checkintegrity.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Thu, 03 Mar 2011 16:05:34 +0100
branch stable
changeset 7035 8d2cf36bd79d
parent 6889 37668bf302f5
child 7036 63386b35ec69
permissions -rw-r--r--
[c-c db-check] factorize code by introducing notify_fixed dumb function
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     1
# copyright 2003-2010 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     3
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     4
# This file is part of CubicWeb.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     5
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     9
# any later version.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    10
#
5424
8ecbcbff9777 replace logilab-common by CubicWeb in disclaimer
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5421
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    14
# details.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    15
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
6127
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    18
"""Integrity checking tool for instances:
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    19
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    20
* integrity of a CubicWeb repository. Currently only the system database is
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    21
  checked.
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    22
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    23
* consistency of multi-sources instance mapping file
5999
eaf8219f8b7d [migration] fix rename_entity_type to avoid to loose some relations on the way
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5954
diff changeset
    24
"""
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    25
4835
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
    26
from __future__ import with_statement
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
    27
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    28
__docformat__ = "restructuredtext en"
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    29
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    30
import sys
1016
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
    31
from datetime import datetime
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    32
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    33
from logilab.common.shellutils import ProgressBar
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    34
6127
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    35
from cubicweb.schema import META_RTYPES, VIRTUAL_RTYPES, PURE_VIRTUAL_RTYPES
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    36
from cubicweb.server.sqlutils import SQL_PREFIX
4835
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
    37
from cubicweb.server.session import security_enabled
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    38
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    39
def notify_fixed(fix):
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    40
    if fix:
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    41
        print >> sys.stderr, ' [FIXED]'
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    42
    else:
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    43
        print >> sys.stderr
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    44
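The `print >> sys.stderr` statements in `notify_fixed` are Python 2 syntax. As a rough sketch only, assuming a Python 3 interpreter (which this revision does not target), the same helper might look like the following. The optional `stream` parameter is a hypothetical addition, not part of the original signature, included so the output can be captured.

```python
import io
import sys


def notify_fixed(fix, stream=None):
    """Terminate a pending error line, appending ' [FIXED]' when fixed."""
    # Resolve the stream at call time so a redirected sys.stderr is honoured.
    # `stream` is a hypothetical parameter, absent from the original helper.
    out = sys.stderr if stream is None else stream
    if fix:
        print(' [FIXED]', file=out)
    else:
        print(file=out)


# Usage: the caller writes the message without a newline, then the helper
# completes the line.
buf = io.StringIO()
print('  Entity with eid 42 exists in the text index but in no source',
      end='', file=buf)
notify_fixed(True, stream=buf)
```

Resolving `sys.stderr` inside the function rather than as a default argument keeps the helper usable when stderr is swapped out, for instance in tests.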
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    45
def has_eid(session, sqlcursor, eid, eids):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    46
    """return true if the eid is a valid eid"""
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
    47
    if eid in eids:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    48
        return eids[eid]
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    49
    sqlcursor.execute('SELECT type, source FROM entities WHERE eid=%s' % eid)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    50
    try:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    51
        etype, source = sqlcursor.fetchone()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    52
    except Exception: # e.g. TypeError when fetchone() returns None
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    53
        eids[eid] = False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    54
        return False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    55
    if source and source != 'system':
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    56
        try:
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    57
            # insert eid *and* etype to attempt checking entity has not been
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    58
            # replaced by another subsequently to a restore of an old dump
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    59
            if session.execute('Any X WHERE X is %s, X eid %%(x)s' % etype,
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    60
                               {'x': eid}):
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    61
                eids[eid] = True
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    62
                return True
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    63
        except Exception: # TypeResolverError, Unauthorized...
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    64
            pass
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    65
        eids[eid] = False
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
    66
        return False
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    67
    sqlcursor.execute('SELECT * FROM %s%s WHERE %seid=%s' % (SQL_PREFIX, etype,
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    68
                                                             SQL_PREFIX, eid))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    69
    result = sqlcursor.fetchall()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    70
    if len(result) == 0:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    71
        eids[eid] = False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    72
        return False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    73
    elif len(result) > 1:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    74
        msg = '  More than one entity with eid %s exists in source !'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    75
        print >> sys.stderr, msg % eid
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    76
        print >> sys.stderr, '  WARNING : Unable to fix this, do it yourself !'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    77
    eids[eid] = True
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    78
    return True
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    79
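The caching pattern used by `has_eid` (the `eids` dict remembers every answer, positive or negative) can be sketched in isolation. This is a simplified Python 3 illustration, not the module's implementation: it assumes an in-memory `sqlite3` database with a reduced `entities` table, skips the external-source and per-type table checks, and uses sqlite's `?` placeholder, which may differ from the parameter style of the real system source cursor.

```python
import sqlite3


def has_eid(cursor, eid, eids):
    """Return True if `eid` is registered in the `entities` table.

    `eids` is a dict used as a cache, so each eid is looked up at most once.
    """
    if eid in eids:
        return eids[eid]
    # Bound parameter instead of string interpolation into the SQL text.
    cursor.execute('SELECT type FROM entities WHERE eid=?', (eid,))
    eids[eid] = cursor.fetchone() is not None
    return eids[eid]


# Hypothetical test data: a single known entity.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
conn.execute("INSERT INTO entities VALUES (1, 'CWUser')")
cache = {}
cur = conn.cursor()
found = has_eid(cur, 1, cache)
missing = has_eid(cur, 2, cache)
```

Testing `fetchone()` against `None` avoids relying on an exception from tuple unpacking to detect a missing row.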
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    80
# XXX move to yams?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    81
def etype_fti_containers(eschema, _done=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    82
    if _done is None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    83
        _done = set()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    84
    _done.add(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    85
    containers = tuple(eschema.fulltext_containers())
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    86
    if containers:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    87
        for rschema, target in containers:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    88
            if target == 'object':
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    89
                targets = rschema.objects(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    90
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    91
                targets = rschema.subjects(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    92
            for targeteschema in targets:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    93
                if targeteschema in _done:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    94
                    continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    95
                _done.add(targeteschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    96
                for container in etype_fti_containers(targeteschema, _done):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    97
                    yield container
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    98
    else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    99
        yield eschema
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   100
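The recursion in `etype_fti_containers` can be illustrated with a toy schema. In this Python 3 sketch the schema objects are replaced by a plain dict mapping an entity type to the types that contain it for full-text indexing; the dict, the function name `fti_containers`, and the type names `Comment`, `Ticket` and `Project` are all hypothetical stand-ins for the yams schema API.

```python
def fti_containers(etype, container_map, _done=None):
    """Yield the entity types under which `etype` is ultimately indexed."""
    if _done is None:
        _done = set()
    _done.add(etype)
    parents = container_map.get(etype, ())
    if parents:
        for parent in parents:
            if parent in _done:
                continue  # already visited: guards against cycles
            _done.add(parent)
            yield from fti_containers(parent, container_map, _done)
    else:
        # No container: the type is indexed under itself.
        yield etype


# A Comment is indexed through its Ticket, itself indexed through a Project.
container_map = {'Comment': ['Ticket'], 'Ticket': ['Project']}
roots = sorted(fti_containers('Comment', container_map))
```

The shared `_done` set is what keeps a cyclic container relation from recursing forever; a type that only appears inside a cycle yields nothing.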
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   101
def reindex_entities(schema, session, withpb=True, etypes=None):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   102
    """reindex all entities in the repository"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   103
    # deactivate modification_date hook since we don't want them
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   104
    # to be updated due to the reindexation
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   105
    repo = session.repo
2248
cbf043a2134a try to create fti table if not existant on rebuild-fti
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 1977
diff changeset
   106
    cursor = session.pool['system']
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   107
    dbhelper = session.repo.system_source.dbhelper
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   108
    if not dbhelper.has_fti_table(cursor):
2248
cbf043a2134a try to create fti table if not existant on rebuild-fti
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 1977
diff changeset
   109
        print 'no text index table'
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   110
        dbhelper.init_fti(cursor)
4806
4f12f59b1a13 [fti] refactor and fix full text indexation handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4691
diff changeset
   111
    repo.system_source.do_fti = True  # ensure full-text indexation is activated
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   112
    if etypes is None:
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   113
        print 'Reindexing entities'
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   114
        etypes = set()
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   115
        for eschema in schema.entities():
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   116
            if eschema.final:
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   117
                continue
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   118
            indexable_attrs = tuple(eschema.indexable_attributes()) # generator
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   119
            if not indexable_attrs:
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   120
                continue
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   121
            for container in etype_fti_containers(eschema):
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   122
                etypes.add(container)
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   123
        # clear fti table first
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   124
        session.system_sql('DELETE FROM %s' % dbhelper.fti_table)
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   125
    else:
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   126
        print 'Reindexing entities of type %s' % \
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   127
              ', '.join(sorted(str(e) for e in etypes))
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   128
        # clear fti table first. Use subquery for sql compatibility
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   129
        session.system_sql("DELETE FROM %s WHERE EXISTS(SELECT 1 FROM ENTITIES "
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   130
                           "WHERE eid=%s AND type IN (%s))" % (
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   131
                               dbhelper.fti_table, dbhelper.fti_uid_attr,
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   132
                               ','.join("'%s'" % etype for etype in etypes)))
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   133
    if withpb:
6112
913979c79244 [db-fti-index] simple fix fpr progressbar-related crash when etypes is None in reindex_entities()
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 5999
diff changeset
   134
        pb = ProgressBar(len(etypes) + 1)
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   135
        pb.update()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   136
    # reindex entities by fetching every entity of each type and passing them
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   137
    # to the system source's full-text indexer
4816
c02583cb80a9 repair stuff broken by fti handling changes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4806
diff changeset
   138
    source = repo.system_source
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   139
    for eschema in etypes:
6889
37668bf302f5 improve massive deletion performance
Alexandre Fayolle <alexandre.fayolle@logilab.fr>
parents: 6624
diff changeset
   140
        rset = session.execute('Any X WHERE X is %s' % eschema)
37668bf302f5 improve massive deletion performance
Alexandre Fayolle <alexandre.fayolle@logilab.fr>
parents: 6624
diff changeset
   141
        source.fti_index_entities(session, rset.entities())
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   142
        if withpb:
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   143
            pb.update()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   144
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   145
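When `etypes` is given, `reindex_entities` clears only the matching index rows with a `DELETE ... WHERE EXISTS` subquery. A minimal Python 3 sketch of that statement against an in-memory `sqlite3` database follows. It assumes the index table and uid column are named `appears` and `uid` (as in `check_text_index`), whereas the real code takes them from `dbhelper.fti_table` and `dbhelper.fti_uid_attr`; the rows and the `Card`/`Note` type names are made up.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE appears (uid INTEGER, words TEXT);
    INSERT INTO entities VALUES (1, 'Card'), (2, 'Note'), (3, 'Card');
    INSERT INTO appears VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

etypes = ['Card']
# One placeholder per entity type; the values are bound, not interpolated.
placeholders = ','.join('?' for _ in etypes)
conn.execute(
    'DELETE FROM appears WHERE EXISTS('
    'SELECT 1 FROM entities WHERE eid=appears.uid AND type IN (%s))'
    % placeholders, etypes)
remaining = [row[0]
             for row in conn.execute('SELECT uid FROM appears ORDER BY uid')]
```

Only the placeholder list is formatted into the SQL text, so the type names themselves never need quoting.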
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   146
def check_schema(schema, session, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   147
    """check serialized schema"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   148
    print 'Checking serialized schema'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   149
    unique_constraints = ('SizeConstraint', 'FormatConstraint',
5523
4bf975c049a6 [db-check] RQLConstraint is not a 'unique' constraint
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
   150
                          'VocabularyConstraint',
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   151
                          'RQLVocabularyConstraint')
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   152
    rql = ('Any COUNT(X),RN,SN,ON,CTN GROUPBY RN,SN,ON,CTN ORDERBY 1 '
1398
5fe84a5f7035 rename internal entity types to have CW prefix instead of E
sylvain.thenault@logilab.fr
parents: 1263
diff changeset
   153
           'WHERE X is CWConstraint, R constrained_by X, '
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   154
           'R relation_type RT, RT name RN, R from_entity ST, ST name SN, '
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   155
           'R to_entity OT, OT name ON, X cstrtype CT, CT name CTN')
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   156
    for count, rn, sn, on, cstrname in session.execute(rql):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   157
        if count == 1:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   158
            continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   159
        if cstrname in unique_constraints:
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   160
            print "ERROR: got %s %r constraints on relation %s.%s.%s" % (
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   161
                count, cstrname, sn, rn, on)
5523
4bf975c049a6 [db-check] RQLConstraint is not a 'unique' constraint
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
   162
            if fix:
4bf975c049a6 [db-check] RQLConstraint is not a 'unique' constraint
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
   163
                print 'dunno how to fix, do it yourself'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   164
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   165
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   166
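The duplicate detection performed by the `COUNT ... GROUPBY` query in `check_schema` can be sketched in plain Python 3. This is an illustration only: the input rows are invented, and `collections.Counter` stands in for the grouping that the RQL query does on the server side.

```python
from collections import Counter

# Constraint types that may appear at most once per relation definition.
UNIQUE_CONSTRAINTS = ('SizeConstraint', 'FormatConstraint',
                      'VocabularyConstraint', 'RQLVocabularyConstraint')


def duplicated_constraints(rows):
    """Return {(rtype, subject, object, cstrtype): count} for duplicates.

    `rows` is an iterable of (rtype, subject, object, cstrtype) tuples, one
    per constraint entity.
    """
    counts = Counter(rows)
    return {key: count for key, count in counts.items()
            if count > 1 and key[3] in UNIQUE_CONSTRAINTS}


# Hypothetical data: one duplicated unique constraint, one repeatable one.
rows = [
    ('name', 'Project', 'String', 'SizeConstraint'),
    ('name', 'Project', 'String', 'SizeConstraint'),  # duplicate: reported
    ('name', 'Project', 'String', 'RQLConstraint'),
    ('name', 'Project', 'String', 'RQLConstraint'),   # allowed to repeat
]
dups = duplicated_constraints(rows)
```

As in the original, `RQLConstraint` is deliberately absent from the unique list, so repeated occurrences of it are not reported.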
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   167
def check_text_index(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   168
    """check all entities registered in the text index"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   169
    print 'Checking text index'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   170
    cursor = session.system_sql('SELECT uid FROM appears;')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   171
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   172
        eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   173
        if not has_eid(session, cursor, eid, eids):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   174
            msg = '  Entity with eid %s exists in the text index but in no source'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   175
            print >> sys.stderr, msg % eid,
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   176
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   177
                session.system_sql('DELETE FROM appears WHERE uid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   178
            notify_fixed(fix)
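The orphan check in `check_text_index` can be sketched in isolation as follows. This is a minimal illustration, not the CubicWeb implementation: it uses the standard-library `sqlite3` module, a simplified one-column `appears` table, and a plain set of known eids in place of `has_eid`. The helper names `find_orphans` and `delete_orphans` are hypothetical. Unlike the original, the eid is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3


def find_orphans(conn, known_eids):
    """Return uids present in the index table but absent from known_eids."""
    cursor = conn.execute('SELECT uid FROM appears')
    return [uid for (uid,) in cursor.fetchall() if uid not in known_eids]


def delete_orphans(conn, orphans):
    """Delete orphan rows, passing each eid as a bound parameter."""
    for uid in orphans:
        conn.execute('DELETE FROM appears WHERE uid=?', (uid,))


# Hypothetical data: eids 1 and 2 are known, 3 exists only in the index.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE appears (uid INTEGER)')
conn.executemany('INSERT INTO appears VALUES (?)', [(1,), (2,), (3,)])

orphans = find_orphans(conn, set([1, 2]))
delete_orphans(conn, orphans)
remaining = [uid for (uid,) in
             conn.execute('SELECT uid FROM appears ORDER BY uid')]
```

Separating detection from deletion keeps a diagnostic run (the `fix=0` case) free of side effects.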
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   179
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   180
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   181
def check_entities(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   182
    """check all entities registered in the repo system table"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   183
    print 'Checking entities system table'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   184
    cursor = session.system_sql('SELECT eid FROM entities;')
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   185
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   186
        eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   187
        if not has_eid(session, cursor, eid, eids):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   188
            msg = '  Entity with eid %s exists in the system table but in no source'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   189
            print >> sys.stderr, msg % eid,
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   190
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   191
                session.system_sql('DELETE FROM entities WHERE eid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   192
            notify_fixed(fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   193
    print 'Checking entities tables'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   194
    for eschema in schema.entities():
3689
deb13e88e037 follow yams 0.25 api changes to improve performance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3374
diff changeset
   195
        if eschema.final:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   196
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   197
        table = SQL_PREFIX + eschema.type
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   198
        column = SQL_PREFIX + 'eid'
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   199
        cursor = session.system_sql('SELECT %s FROM %s;' % (column, table))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   200
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   201
            eid = row[0]
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
   202
            # eids is full since we have fetched everything from the entities table,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   203
            # no need to call has_eid
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   204
            if eid not in eids or not eids[eid]:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   205
                msg = '  Entity with eid %s exists in the %s table but not in the system table'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   206
                print >> sys.stderr, msg % (eid, eschema.type),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   207
                if fix:
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   208
                    session.system_sql('DELETE FROM %s WHERE %s=%s;' % (table, column, eid))
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   209
                notify_fixed(fix)
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   210
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   211
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   212
def bad_related_msg(rtype, target, eid, fix):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   213
    msg = '  A relation %s with %s eid %s exists but no such entity in sources'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   214
    print >> sys.stderr, msg % (rtype, target, eid),
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   215
    notify_fixed(fix)
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   216
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   217
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   218
def check_relations(schema, session, eids, fix=1):
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   219
    """check that eids referenced by relations are registered in the repo system
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   220
    table
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   221
    """
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   222
    print 'Checking relations'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   223
    for rschema in schema.relations():
3689
deb13e88e037 follow yams 0.25 api changes to improve performance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3374
diff changeset
   224
        if rschema.final or rschema in PURE_VIRTUAL_RTYPES:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   225
            continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   226
        if rschema.inlined:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   227
            for subjtype in rschema.subjects():
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   228
                table = SQL_PREFIX + str(subjtype)
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   229
                column = SQL_PREFIX + str(rschema)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   230
                sql = 'SELECT %s FROM %s WHERE %s IS NOT NULL;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   231
                    column, table, column)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   232
                cursor = session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   233
                for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   234
                    eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   235
                    if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   236
                        bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   237
                        if fix:
3374
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   238
                            sql = 'UPDATE %s SET %s=NULL WHERE %s=%s;' % (
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   239
                                table, column, column, eid)
381
sylvain.thenault@logilab.fr
parents: 380
diff changeset
   240
                            session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   241
            continue
6185
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   242
        try:
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   243
            cursor = session.system_sql('SELECT eid_from FROM %s_relation;' % rschema)
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   244
        except Exception, ex:
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   245
            # usually because table doesn't exist
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   246
            print 'ERROR', ex
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   247
            continue
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   248
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   249
            eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   250
            if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   251
                bad_related_msg(rschema, 'subject', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   252
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   253
                    sql = 'DELETE FROM %s_relation WHERE eid_from=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   254
                        rschema, eid)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   255
                    session.system_sql(sql)
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   256
        cursor = session.system_sql('SELECT eid_to FROM %s_relation;' % rschema)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   257
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   258
            eid = row[0]
5339
b83327846450 [db-check] fix unexistent multisource entity detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5338
diff changeset
   259
            if not has_eid(session, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   260
                bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   261
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   262
                    sql = 'DELETE FROM %s_relation WHERE eid_to=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   263
                        rschema, eid)
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   264
                    session.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   265
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   266
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   267
def check_metadata(schema, session, eids, fix=1):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   268
    """check that entities have the required metadata
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   269
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   270
    FIXME: rewrite using RQL queries ?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   271
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   272
    print 'Checking metadata'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   273
    cursor = session.system_sql("SELECT DISTINCT type FROM entities;")
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   274
    eidcolumn = SQL_PREFIX + 'eid'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   275
    for etype, in cursor.fetchall():
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   276
        table = SQL_PREFIX + etype
1016
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
   277
        for rel, default in ( ('creation_date', datetime.now()),
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
   278
                              ('modification_date', datetime.now()), ):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   279
            column = SQL_PREFIX + rel
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   280
            cursor = session.system_sql("SELECT %s FROM %s WHERE %s is NULL"
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   281
                                        % (eidcolumn, table, column))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   282
            for eid, in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   283
                msg = '  %s with eid %s has no %s'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   284
                print >> sys.stderr, msg % (etype, eid, rel),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   285
                if fix:
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   286
                    session.system_sql("UPDATE %s SET %s=%%(v)s WHERE %s=%s ;"
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   287
                                       % (table, column, eidcolumn, eid),
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   288
                                       {'v': default})
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   289
                notify_fixed(fix)
1398
5fe84a5f7035 rename internal entity types to have CW prefix instead of E
sylvain.thenault@logilab.fr
parents: 1263
diff changeset
   290
    cursor = session.system_sql('SELECT MIN(%s) FROM %sCWUser;' % (eidcolumn,
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   291
                                                                  SQL_PREFIX))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   292
    default_user_eid = cursor.fetchone()[0]
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   293
    assert default_user_eid is not None, 'no user defined !'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   294
    for rel, default in ( ('owned_by', default_user_eid), ):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   295
        cursor = session.system_sql("SELECT eid, type FROM entities "
5340
4de474016568 [db-check] don't check entities from external sources have owned_by
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5339
diff changeset
   296
                                    "WHERE source='system' AND NOT EXISTS "
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   297
                                    "(SELECT 1 FROM %s_relation WHERE eid_from=eid);"
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   298
                                    % rel)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   299
        for eid, etype in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   300
            msg = '  %s with eid %s has no %s relation'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   301
            print >> sys.stderr, msg % (etype, eid, rel),
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   302
            if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   303
                session.system_sql('INSERT INTO %s_relation VALUES (%s, %s) ;'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   304
                                   % (rel, eid, default))
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   305
            notify_fixed(fix)
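The first half of `check_metadata` (find rows where a metadata column is NULL, then write a default) can be sketched on its own. This is an illustration under stated assumptions: it uses `sqlite3` with `?` placeholders instead of the `%(v)s` style seen above, the table `cw_Note` and the helper `fill_missing` are hypothetical, and the default date is stored as an ISO string. As in the original, identifiers are interpolated while the value is bound as a parameter.

```python
import sqlite3
from datetime import datetime


def fill_missing(conn, table, column, eidcolumn, default):
    """Set `column` to `default` where it is NULL; return the fixed eids."""
    # Identifiers cannot be bound as parameters, so they are interpolated.
    # They must come from trusted code (e.g. the schema), never user input.
    select = 'SELECT %s FROM %s WHERE %s IS NULL' % (eidcolumn, table, column)
    eids = [eid for (eid,) in conn.execute(select).fetchall()]
    update = 'UPDATE %s SET %s=? WHERE %s=?' % (table, column, eidcolumn)
    for eid in eids:
        conn.execute(update, (default, eid))
    return eids


# Hypothetical data: entity 2 has no creation date.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE cw_Note (cw_eid INTEGER, cw_creation_date TEXT)')
conn.executemany('INSERT INTO cw_Note VALUES (?, ?)',
                 [(1, '2011-03-03T16:05:34'), (2, None)])

now = datetime.now().isoformat()
fixed = fill_missing(conn, 'cw_Note', 'cw_creation_date', 'cw_eid', now)
missing = conn.execute(
    'SELECT COUNT(*) FROM cw_Note WHERE cw_creation_date IS NULL'
).fetchone()[0]
```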
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   306
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   307
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   308
def check(repo, cnx, checks, reindex, fix, withpb=True):
2476
1294a6bdf3bf application -> instance where it makes sense
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2248
diff changeset
   309
    """check integrity of instance's repository,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   310
    using the given connection `cnx` to reach the repository locally
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   311
    (no running cubicweb server needed)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   312
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   313
    session = repo._get_session(cnx.sessionid, setpool=True)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   314
    # launch the requested checks
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   315
    if checks:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   316
        eids_cache = {}
4835
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   317
        with security_enabled(session, read=False): # ensure no read security
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   318
            for check in checks:
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   319
                check_func = globals()['check_%s' % check]
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   320
                check_func(repo.schema, session, eids_cache, fix=fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   321
        if fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   322
            cnx.commit()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   323
        else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   324
            print
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   325
        if not fix:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   326
            print 'WARNING: Diagnostic run, nothing has been corrected'
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   327
    if reindex:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   328
        cnx.rollback()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   329
        session.set_pool()
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   330
        reindex_entities(repo.schema, session, withpb=withpb)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   331
        cnx.commit()
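`check` resolves each check by name through `globals()['check_%s' % check]`, which raises a bare `KeyError` for an unknown name. One possible alternative, shown here only as a sketch, is an explicit registry that limits dispatch to listed functions and reports unknown names clearly. The names `check_alpha`, `check_beta`, `CHECKS` and `run_checks` are hypothetical and not part of the module.

```python
def check_alpha(log):
    """Stand-in for a real check; records that it ran."""
    log.append('alpha')


def check_beta(log):
    """Stand-in for a real check; records that it ran."""
    log.append('beta')


# Explicit registry: only the functions listed here can be dispatched to.
CHECKS = {
    'alpha': check_alpha,
    'beta': check_beta,
}


def run_checks(names, log):
    """Run each named check in order; raise ValueError on an unknown name."""
    for name in names:
        try:
            check_func = CHECKS[name]
        except KeyError:
            raise ValueError('unknown check: %s' % name)
        check_func(log)
    return log


ran = run_checks(['alpha', 'beta'], [])
```

The `globals()` lookup in the original is shorter and picks up new `check_*` functions automatically, so the registry is a trade-off rather than a strict improvement.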
6127
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   332
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   333
6624
b30e5428048b [d-c check-mapping] small enhancements to avoid spurious warnings
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6185
diff changeset
   334
def info(msg, *args):
b30e5428048b [d-c check-mapping] small enhancements to avoid spurious warnings
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6185
diff changeset
   335
    if args:
b30e5428048b [d-c check-mapping] small enhancements to avoid spurious warnings
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6185
diff changeset
   336
        msg = msg % args
b30e5428048b [d-c check-mapping] small enhancements to avoid spurious warnings
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6185
diff changeset
   337
    print 'INFO: %s' % msg
b30e5428048b [d-c check-mapping] small enhancements to avoid spurious warnings
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6185
diff changeset
   338
6127
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   339
def warning(msg, *args):
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   340
    if args:
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   341
        msg = msg % args
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   342
    print 'WARNING: %s' % msg
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   343
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   344
def error(msg, *args):
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   345
    if args:
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   346
        msg = msg % args
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   347
    print 'ERROR: %s' % msg
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   348
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   349
def check_mapping(schema, mapping, warning=warning, error=error):
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
   350
    # first check stuff found in mapping file exists in the schema
    for attr in ('support_entities', 'support_relations'):
        for ertype in mapping[attr].keys():
            try:
                mapping[attr][ertype] = erschema = schema[ertype]
            except KeyError:
                error('reference to unknown type %s in %s', ertype, attr)
                del mapping[attr][ertype]
            else:
                if erschema.final or erschema in META_RTYPES:
                    error('type %s should not be mapped in %s', ertype, attr)
                    del mapping[attr][ertype]
    for attr in ('dont_cross_relations', 'cross_relations'):
        for rtype in list(mapping[attr]):
            try:
                rschema = schema.rschema(rtype)
            except KeyError:
                error('reference to unknown relation type %s in %s', rtype, attr)
                mapping[attr].remove(rtype)
            else:
                if rschema.final or rschema in VIRTUAL_RTYPES:
                    error('relation type %s should not be mapped in %s',
                          rtype, attr)
                    mapping[attr].remove(rtype)
    # check that relations in dont_cross_relations aren't in support_relations
    for rschema in mapping['dont_cross_relations']:
        if rschema in mapping['support_relations']:
            info('relation %s is in dont_cross_relations and in support_relations',
                 rschema)
    # check that relations in cross_relations are in support_relations
    for rschema in mapping['cross_relations']:
        if rschema not in mapping['support_relations']:
            info('relation %s is in cross_relations but not in support_relations',
                 rschema)
    # check for relations in both cross_relations and dont_cross_relations
    for rschema in mapping['cross_relations'] & mapping['dont_cross_relations']:
        error('relation %s is in both cross_relations and dont_cross_relations',
              rschema)
    # now look for less obvious flaws and possible enhancements
    seen = set()
    for eschema in mapping['support_entities'].values():
        for rschema, ttypes, role in eschema.relation_definitions():
            if rschema in META_RTYPES:
                continue
            ttypes = [ttype for ttype in ttypes if ttype in mapping['support_entities']]
            if rschema not in mapping['support_relations']:
                somethingprinted = False
                for ttype in ttypes:
                    rdef = rschema.role_rdef(eschema, ttype, role)
                    seen.add(rdef)
                    if rdef.role_cardinality(role) in '1+':
                        error('relation %s with %s as %s and target type %s is '
                              'mandatory but not supported',
                              rschema, eschema, role, ttype)
                        somethingprinted = True
                    elif ttype in mapping['support_entities']:
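                        # NOTE: rdef was added to `seen` just above, so this
                        # test is always false and the warning never fires
                        if rdef not in seen: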
                            warning('%s could be supported', rdef)
                        somethingprinted = True
                if rschema not in mapping['dont_cross_relations']:
                    if role == 'subject' and rschema.inlined:
                        error('inlined relation %s of %s should be supported',
                              rschema, eschema)
                    elif not somethingprinted and rschema not in seen and rschema not in mapping['cross_relations']:
                        print 'you may want to specify something for %s' % rschema
                        seen.add(rschema)
            else:
                if not ttypes:
                    warning('relation %s with %s as %s is supported but no target '
                            'type supported', rschema, eschema, role)
                if rschema in mapping['cross_relations'] and rschema.inlined:
                    error('you should uninline relation %s which is supported and '
                          'may be crossed', rschema)
    for rschema in mapping['support_relations'].values():
        if rschema in META_RTYPES:
            continue
        for subj, obj in rschema.rdefs:
            if subj in mapping['support_entities'] and obj in mapping['support_entities']:
                break
        else:
            error('relation %s is supported but none of its definitions '
                  'matches supported entities', rschema)