cubicweb/server/checkintegrity.py
author Sylvain Thénault <sylvain.thenault@logilab.fr>
Wed, 14 Dec 2016 17:04:29 +0100
branch 3.24
changeset 11893 b230f4a4010d
parent 11776 b49684ddd543
child 11894 ec29989fba13
permissions -rw-r--r--
Extract functions to list expected/found database indexes from check_indexes so they may be used from other contexts. Use consistent naming along the way: always use indexes, not indices.
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
     1
# copyright 2003-2016 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     3
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     4
# This file is part of CubicWeb.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     5
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     9
# any later version.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    10
#
5424
8ecbcbff9777 replace logilab-common by CubicWeb in disclaimer
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5421
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    14
# details.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    15
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
6127
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    18
"""Integrity checking tool for instances:
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    19
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    20
* integrity of a CubicWeb repository. Note that only the system database is
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    21
  checked.
5999
eaf8219f8b7d [migration] fix rename_entity_type to avoid to loose some relations on the way
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5954
diff changeset
    22
"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
    23
from __future__ import print_function
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
    24
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    25
import sys
1016
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
    26
from datetime import datetime
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    27
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    28
from logilab.common.shellutils import ProgressBar
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    29
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    30
from yams.constraints import UniqueConstraint
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    31
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    32
from cubicweb.toolsutils import underline_title
10813
ab626726a70a [syncschema] don't arbitrarily remove constraints from the in-memory schema
Julien Cristau <julien.cristau@logilab.fr>
parents: 10663
diff changeset
    33
from cubicweb.schema import PURE_VIRTUAL_RTYPES, VIRTUAL_RTYPES, UNIQUE_CONSTRAINTS
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    34
from cubicweb.server.sqlutils import SQL_PREFIX
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    35
from cubicweb.server.schema2sql import iter_unique_index_names, build_index_name
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    36
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    37
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    38
def notify_fixed(fix):
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    39
    if fix:
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    40
        sys.stderr.write(' [FIXED]')
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    41
    sys.stderr.write('\n')
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    42
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    43
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
    44
def has_eid(cnx, sqlcursor, eid, eids):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    45
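    """Return True if `eid` is valid: listed in `entities`, of a type known to
    the schema, and present in that type's table (results cached in `eids`)."""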
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
    46
    if eid in eids:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    47
        return eids[eid]
9469
032825bbacab [multi-sources-removal] Drop entities.source column
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 9468
diff changeset
    48
    sqlcursor.execute('SELECT type FROM entities WHERE eid=%s' % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    49
    try:
9469
032825bbacab [multi-sources-removal] Drop entities.source column
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 9468
diff changeset
    50
        etype = sqlcursor.fetchone()[0]
7815
2a164a9cf81c [exceptions] stop catching any exception in various places (closes #1942716)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7687
diff changeset
    51
    except Exception:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    52
        eids[eid] = False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    53
        return False
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
    54
    if etype not in cnx.vreg.schema:
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
    55
        eids[eid] = False
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
    56
        return False
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    57
    sqlcursor.execute('SELECT * FROM %s%s WHERE %seid=%s' % (SQL_PREFIX, etype,
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    58
                                                             SQL_PREFIX, eid))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    59
    result = sqlcursor.fetchall()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    60
    if len(result) == 0:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    61
        eids[eid] = False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    62
        return False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    63
    elif len(result) > 1:
9492
c7fc56eecd1a English typography
Dimitri Papadopoulos <dimitri.papadopoulos@cea.fr>
parents: 9171
diff changeset
    64
        msg = ('  More than one entity with eid %s exists in source!\n'
c7fc56eecd1a English typography
Dimitri Papadopoulos <dimitri.papadopoulos@cea.fr>
parents: 9171
diff changeset
    65
               '  WARNING : Unable to fix this, do it yourself!\n')
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    66
        sys.stderr.write(msg % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    67
    eids[eid] = True
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    68
    return True
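
The memoized two-step lookup performed by `has_eid` can be tried outside CubicWeb. The following is only a minimal sketch, assuming an in-memory SQLite database with a hypothetical `cw_Card` table; it leaves out the schema check against `cnx.vreg.schema` and binds the eid as a query parameter instead of interpolating it into the SQL string.

```python
import sqlite3


def has_eid(cursor, eid, cache):
    """Return True if `eid` is in `entities` and in its type table (memoized)."""
    if eid in cache:
        return cache[eid]
    # The eid is bound with a '?' placeholder rather than interpolated.
    cursor.execute('SELECT type FROM entities WHERE eid=?', (eid,))
    row = cursor.fetchone()
    if row is None:
        cache[eid] = False
        return False
    # Table names cannot be bound as parameters; in this toy database the
    # name comes from the entities table, not from user input.
    cursor.execute('SELECT 1 FROM cw_%s WHERE cw_eid=?' % row[0], (eid,))
    cache[eid] = cursor.fetchone() is not None
    return cache[eid]


# Hypothetical toy data: eid 1 is consistent, eid 2 has no row in cw_Card,
# eid 3 is unknown to the entities table.
conn = sqlite3.connect(':memory:')
cu = conn.cursor()
cu.execute('CREATE TABLE entities (eid INTEGER, type TEXT)')
cu.execute('CREATE TABLE cw_Card (cw_eid INTEGER)')
cu.executemany('INSERT INTO entities VALUES (?, ?)', [(1, 'Card'), (2, 'Card')])
cu.execute('INSERT INTO cw_Card VALUES (1)')

cache = {}
results = [has_eid(cu, eid, cache) for eid in (1, 2, 3)]
print(results, cache)
```

The cache matters because the same eid is typically looked up by several checks in one run; a negative answer is stored as well, so a broken eid is queried only once.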
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    69
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    70
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    71
# XXX move to yams?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    72
def etype_fti_containers(eschema, _done=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    73
    if _done is None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    74
        _done = set()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    75
    _done.add(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    76
    containers = tuple(eschema.fulltext_containers())
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    77
    if containers:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    78
        for rschema, target in containers:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    79
            if target == 'object':
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    80
                targets = rschema.objects(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    81
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    82
                targets = rschema.subjects(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    83
            for targeteschema in targets:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    84
                if targeteschema in _done:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    85
                    continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    86
                _done.add(targeteschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    87
                for container in etype_fti_containers(targeteschema, _done):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    88
                    yield container
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    89
    else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    90
        yield eschema
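
The recursion in `etype_fti_containers` can be illustrated without a yams schema. This is a sketch under the assumption that the container relationships are given as a plain dict (`container_map`, a hypothetical stand-in for `eschema.fulltext_containers()` and the `rschema.objects()` / `rschema.subjects()` lookups): a type without containers is indexed as itself, otherwise the walk continues to its containers, and `_done` guards against cycles.

```python
def fti_containers(etype, container_map, _done=None):
    """Yield the outermost full-text containers reachable from `etype`."""
    if _done is None:
        _done = set()
    _done.add(etype)
    parents = container_map.get(etype, ())
    if parents:
        for parent in parents:
            if parent in _done:
                # Already visited: skip to avoid infinite recursion.
                continue
            _done.add(parent)
            for container in fti_containers(parent, container_map, _done):
                yield container
    else:
        # No container: this type holds its own full-text index entry.
        yield etype


# Hypothetical schema: a Comment is indexed with its Ticket, which is
# itself indexed with its Project.
toy = {'Comment': ['Ticket'], 'Ticket': ['Project']}
chain = list(fti_containers('Comment', toy))
alone = list(fti_containers('Project', toy))
cyclic = list(fti_containers('A', {'A': ['B'], 'B': ['A']}))
print(chain, alone, cyclic)
```

Note that a purely cyclic container graph yields nothing in this sketch, since every type has a container and none is ever treated as the root.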
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
    91
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    92
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
    93
def reindex_entities(schema, cnx, withpb=True, etypes=None):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    94
    """reindex all entities in the repository"""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    95
    # deactivate the modification_date hook since we don't want dates
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    96
    # to be updated by the reindexing
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
    97
    repo = cnx.repo
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
    98
    dbhelper = repo.system_source.dbhelper
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
    99
    cursor = cnx.cnxset.cu
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   100
    if not dbhelper.has_fti_table(cursor):
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   101
        print('no text index table')
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   102
        dbhelper.init_fti(cursor)
4806
4f12f59b1a13 [fti] refactor and fix full text indexation handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4691
diff changeset
   103
    repo.system_source.do_fti = True  # ensure full-text indexation is activated
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   104
    if etypes is None:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   105
        print('Reindexing entities')
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   106
        etypes = set()
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   107
        for eschema in schema.entities():
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   108
            if eschema.final:
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   109
                continue
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   110
            indexable_attrs = tuple(eschema.indexable_attributes())  # generator
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   111
            if not indexable_attrs:
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   112
                continue
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   113
            for container in etype_fti_containers(eschema):
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   114
                etypes.add(container)
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   115
        # clear fti table first
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   116
        cnx.system_sql('DELETE FROM %s' % dbhelper.fti_table)
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   117
    else:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   118
        print('Reindexing entities of type %s' %
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   119
              ', '.join(sorted(str(e) for e in etypes)))
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   120
        # clear fti table first. Use subquery for sql compatibility
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   121
        cnx.system_sql("DELETE FROM %s WHERE EXISTS(SELECT 1 FROM ENTITIES "
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   122
                       "WHERE eid=%s AND type IN (%s))" % (
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   123
                           dbhelper.fti_table, dbhelper.fti_uid_attr,
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   124
                           ','.join("'%s'" % etype for etype in etypes)))
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   125
    if withpb:
6112
913979c79244 [db-fti-index] simple fix fpr progressbar-related crash when etypes is None in reindex_entities()
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 5999
diff changeset
   126
        pb = ProgressBar(len(etypes) + 1)
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   127
        pb.update()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   128
    # reindex entities by generating rql queries which set all indexable
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   129
    # attribute to their current value
4816
c02583cb80a9 repair stuff broken by fti handling changes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4806
diff changeset
   130
    source = repo.system_source
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   131
    for eschema in etypes:
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   132
        etype_class = cnx.vreg['etypes'].etype_class(str(eschema))
10847
ce5403611cbe [fti] add cw_fti_index_rql_limit method
Julien Cristau <julien.cristau@logilab.fr>
parents: 10846
diff changeset
   133
        for rset in etype_class.cw_fti_index_rql_limit(cnx):
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   134
            source.fti_index_entities(cnx, rset.entities())
8450
11063635c4e4 [fti] allow usage of custom RQL to fetch entities to index (closes #2410509)
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 8447
diff changeset
   135
            # clear entity cache to avoid high memory consumption on big tables
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   136
            cnx.drop_entity_cache()
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   137
        if withpb:
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   138
            pb.update()
9578
68049d53426c [fti] properly close the ProgressBar
Julien Cristau <julien.cristau@logilab.fr>
parents: 9577
diff changeset
   139
    if withpb:
68049d53426c [fti] properly close the ProgressBar
Julien Cristau <julien.cristau@logilab.fr>
parents: 9577
diff changeset
   140
        pb.finish()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   141
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   142
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   143
def check_schema(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   144
    """check serialized schema"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   145
    print('Checking serialized schema')
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   146
    rql = ('Any COUNT(X),RN,SN,ON,CTN GROUPBY RN,SN,ON,CTN ORDERBY 1 '
1398
5fe84a5f7035 rename internal entity types to have CW prefix instead of E
sylvain.thenault@logilab.fr
parents: 1263
diff changeset
   147
           'WHERE X is CWConstraint, R constrained_by X, '
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   148
           'R relation_type RT, RT name RN, R from_entity ST, ST name SN, '
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   149
           'R to_entity OT, OT name ON, X cstrtype CT, CT name CTN')
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   150
    for count, rn, sn, on, cstrname in cnx.execute(rql):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   151
        if count == 1:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   152
            continue
10813
ab626726a70a [syncschema] don't arbitrarily remove constraints from the in-memory schema
Julien Cristau <julien.cristau@logilab.fr>
parents: 10663
diff changeset
   153
        if cstrname in UNIQUE_CONSTRAINTS:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   154
            print("ERROR: got %s %r constraints on relation %s.%s.%s" % (
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   155
                count, cstrname, sn, rn, on))
5523
4bf975c049a6 [db-check] RQLConstraint is not a 'unique' constraint
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
   156
            if fix:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   157
                print('dunno how to fix, do it yourself')
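
The RQL query in `check_schema` groups constraints by relation, subject type, object type and constraint type, then reports groups counted more than once. The same idea is sketched below in plain Python with `collections.Counter`; the `UNIQUE` set and the sample rows are hypothetical stand-ins for `UNIQUE_CONSTRAINTS` and the query results.

```python
from collections import Counter

# Hypothetical stand-in for cubicweb.schema.UNIQUE_CONSTRAINTS.
UNIQUE = {'SizeConstraint', 'FormatConstraint'}


def duplicated_constraints(rows, unique=UNIQUE):
    """Return ((rtype, subject, object, cstrtype), count) for duplicated rows.

    Only constraint types listed in `unique` are reported, mirroring the
    `if cstrname in UNIQUE_CONSTRAINTS` test of the original function.
    """
    counts = Counter(rows)
    return sorted((key, n) for key, n in counts.items()
                  if n > 1 and key[3] in unique)


rows = [
    ('name', 'Card', 'String', 'SizeConstraint'),
    ('name', 'Card', 'String', 'SizeConstraint'),   # duplicate, reported
    ('title', 'Card', 'String', 'SizeConstraint'),
    ('owner', 'Card', 'CWUser', 'RQLConstraint'),
    ('owner', 'Card', 'CWUser', 'RQLConstraint'),   # duplicate, but allowed
]
dups = duplicated_constraints(rows)
print(dups)
```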
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   158
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   159
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   160
def check_text_index(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   161
    """check all entities registered in the text index"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   162
    print('Checking text index')
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   163
    msg = '  Entity with eid %s exists in the text index but in no source (autofix will remove from text index)'
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   164
    cursor = cnx.system_sql('SELECT uid FROM appears;')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   165
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   166
        eid = row[0]
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   167
        if not has_eid(cnx, cursor, eid, eids):
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   168
            sys.stderr.write(msg % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   169
            if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   170
                cnx.system_sql('DELETE FROM appears WHERE uid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   171
            notify_fixed(fix)


def check_entities(schema, cnx, eids, fix=1):
    """check all entities registered in the repo system table"""
    print('Checking entities system table')
    # system table but no source
    msg = '  Entity %s with eid %s exists in the system table but in no source (autofix will delete the entity)'
    cursor = cnx.system_sql('SELECT eid,type FROM entities;')
    for row in cursor.fetchall():
        eid, etype = row
        if not has_eid(cnx, cursor, eid, eids):
            sys.stderr.write(msg % (etype, eid))
            if fix:
                cnx.system_sql('DELETE FROM entities WHERE eid=%s;' % eid)
            notify_fixed(fix)
    # source in entities, but no relation cw_source
    # XXX this (get_versions) requires a second connection to the db when we already have one open
    cursor = cnx.system_sql('SELECT e.eid FROM entities as e '
                            'WHERE NOT EXISTS(SELECT 1 FROM cw_source_relation as cs '
                            '  WHERE cs.eid_from=e.eid) '
                            'ORDER BY e.eid')
    msg = '  Entity with eid %s is missing relation cw_source (autofix will create the relation)\n'
    for row in cursor.fetchall():
        sys.stderr.write(msg % row[0])
    if fix:
        cnx.system_sql('INSERT INTO cw_source_relation (eid_from, eid_to) '
                       'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWSource as s '
                       "WHERE s.cw_name='system' AND NOT EXISTS(SELECT 1 FROM cw_source_relation as cs "
                       '  WHERE cs.eid_from=e.eid)')
        notify_fixed(True)
    # inconsistencies for 'is'
    msg = '  %s #%s is missing relation "is" (autofix will create the relation)\n'
    cursor = cnx.system_sql('SELECT e.type, e.eid FROM entities as e, cw_CWEType as s '
                            'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_relation as cs '
                            '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
                            'ORDER BY e.eid')
    for row in cursor.fetchall():
        sys.stderr.write(msg % tuple(row))
    if fix:
        cnx.system_sql('INSERT INTO is_relation (eid_from, eid_to) '
                       'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWEType as s '
                       'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_relation as cs '
                       '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
        notify_fixed(True)
    # inconsistencies for 'is_instance_of'
    msg = '  %s #%s is missing relation "is_instance_of" (autofix will create the relation)\n'
    cursor = cnx.system_sql('SELECT e.type, e.eid FROM entities as e, cw_CWEType as s '
                            'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_instance_of_relation as cs '
                            '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
                            'ORDER BY e.eid')
    for row in cursor.fetchall():
        sys.stderr.write(msg % tuple(row))
    if fix:
        cnx.system_sql('INSERT INTO is_instance_of_relation (eid_from, eid_to) '
                       'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWEType as s '
                       'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_instance_of_relation as cs '
                       '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
        notify_fixed(True)
    print('Checking entities tables')
    msg = '  Entity with eid %s exists in the %s table but not in the system table (autofix will delete the entity)'
    for eschema in schema.entities():
        if eschema.final:
            continue
        table = SQL_PREFIX + eschema.type
        column = SQL_PREFIX + 'eid'
        cursor = cnx.system_sql('SELECT %s FROM %s;' % (column, table))
        for row in cursor.fetchall():
            eid = row[0]
            # eids is full since we have fetched everything from the entities table,
            # no need to call has_eid
            if eid not in eids or not eids[eid]:
                sys.stderr.write(msg % (eid, eschema.type))
                if fix:
                    cnx.system_sql('DELETE FROM %s WHERE %s=%s;' % (table, column, eid))
                notify_fixed(fix)
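
The `cw_source`, `is` and `is_instance_of` repairs in `check_entities` share one shape: an anti-join (`NOT EXISTS`) lists the rows that lack a relation, and an `INSERT ... SELECT` with the same condition creates them. Below is a minimal sketch of that pattern, assuming toy `sqlite3` tables that only mimic the column names used above. `find_and_fix_missing_is` is a hypothetical helper for illustration, not part of CubicWeb.

```python
import sqlite3


def find_and_fix_missing_is(conn, fix=True):
    """Return eids lacking an 'is' row; optionally insert the missing rows.

    Hypothetical helper: it mirrors the SELECT / INSERT ... SELECT pair of
    check_entities, but runs against simplified in-memory tables.
    """
    where = ('WHERE s.cw_name=e.type AND NOT EXISTS('
             'SELECT 1 FROM is_relation AS cs '
             'WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
    # report step: entities whose type is known but that have no 'is' row
    missing = [row[0] for row in conn.execute(
        'SELECT e.eid FROM entities AS e, cw_CWEType AS s %s '
        'ORDER BY e.eid' % where)]
    if fix:
        # repair step: same condition, so only the reported rows are created
        conn.execute('INSERT INTO is_relation (eid_from, eid_to) '
                     'SELECT e.eid, s.cw_eid '
                     'FROM entities AS e, cw_CWEType AS s %s' % where)
    return missing


conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE cw_CWEType (cw_eid INTEGER PRIMARY KEY, cw_name TEXT);
    CREATE TABLE is_relation (eid_from INTEGER, eid_to INTEGER);
    INSERT INTO cw_CWEType VALUES (1, 'CWUser');
    INSERT INTO entities VALUES (10, 'CWUser');
    INSERT INTO entities VALUES (11, 'CWUser');
    INSERT INTO is_relation VALUES (10, 1);
""")
first_pass = find_and_fix_missing_is(conn)
second_pass = find_and_fix_missing_is(conn)
```

Because the report and the repair use the same condition, running the check a second time finds nothing left to fix, which is a cheap way to confirm the repair is idempotent.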


def bad_related_msg(rtype, target, eid, fix):
    msg = '  A relation %s with %s eid %s exists but no such entity in sources'
    sys.stderr.write(msg % (rtype, target, eid))
    notify_fixed(fix)


def bad_inlined_msg(rtype, parent_eid, eid, fix):
    msg = ('  An inlined relation %s from %s to %s exists but the latter '
           'entity does not exist')
    sys.stderr.write(msg % (rtype, parent_eid, eid))
    notify_fixed(fix)


def check_relations(schema, cnx, eids, fix=1):
    """check that eids referenced by relations are registered in the repo system
    table
    """
    print('Checking relations')
    for rschema in schema.relations():
        if rschema.final or rschema.type in PURE_VIRTUAL_RTYPES:
            continue
        if rschema.inlined:
            # inlined relations are stored as a column of the subject entity
            # table, so a dangling reference is reset to NULL rather than deleted
            for subjtype in rschema.subjects():
                table = SQL_PREFIX + str(subjtype)
                column = SQL_PREFIX + str(rschema)
                sql = 'SELECT cw_eid,%s FROM %s WHERE %s IS NOT NULL;' % (
                    column, table, column)
                cursor = cnx.system_sql(sql)
                for row in cursor.fetchall():
                    parent_eid, eid = row
                    if not has_eid(cnx, cursor, eid, eids):
                        bad_inlined_msg(rschema, parent_eid, eid, fix)
                        if fix:
                            sql = 'UPDATE %s SET %s=NULL WHERE %s=%s;' % (
                                table, column, column, eid)
                            cnx.system_sql(sql)
            continue
        try:
            cursor = cnx.system_sql('SELECT eid_from FROM %s_relation;' % rschema)
        except Exception as ex:
            # usually because table doesn't exist
            print('ERROR', ex)
            continue
        for row in cursor.fetchall():
            eid = row[0]
            if not has_eid(cnx, cursor, eid, eids):
                bad_related_msg(rschema, 'subject', eid, fix)
                if fix:
                    sql = 'DELETE FROM %s_relation WHERE eid_from=%s;' % (
                        rschema, eid)
                    cnx.system_sql(sql)
        cursor = cnx.system_sql('SELECT eid_to FROM %s_relation;' % rschema)
        for row in cursor.fetchall():
            eid = row[0]
            if not has_eid(cnx, cursor, eid, eids):
                bad_related_msg(rschema, 'object', eid, fix)
                if fix:
                    sql = 'DELETE FROM %s_relation WHERE eid_to=%s;' % (
                        rschema, eid)
                    cnx.system_sql(sql)
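
For non-inlined relations, `check_relations` scans each side of the `<rtype>_relation` table and drops rows that point at an unknown eid. Below is a minimal sketch of that scan, assuming an in-memory `sqlite3` table and a plain set of known eids in place of `has_eid` and its cache. `dangling_rows` and the `in_project` relation are hypothetical names used only for illustration.

```python
import sqlite3


def dangling_rows(conn, rtype, known_eids, fix=False):
    """Return (column, eid) pairs referencing eids absent from known_eids.

    Hypothetical helper: rtype is expected to come from the schema, which is
    why it is interpolated into the table name; eid values are bound with
    placeholders instead.
    """
    table = '%s_relation' % rtype
    bad = []
    for column in ('eid_from', 'eid_to'):
        # fetch everything first so rows can be deleted while looping
        rows = conn.execute('SELECT %s FROM %s' % (column, table)).fetchall()
        for (eid,) in rows:
            if eid not in known_eids:
                bad.append((column, eid))
                if fix:
                    conn.execute('DELETE FROM %s WHERE %s=?' % (table, column),
                                 (eid,))
    return bad


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE in_project_relation (eid_from INTEGER, eid_to INTEGER)')
conn.executemany('INSERT INTO in_project_relation VALUES (?, ?)',
                 [(1, 2), (1, 99), (98, 2)])
reported = dangling_rows(conn, 'in_project', known_eids={1, 2}, fix=True)
remaining = conn.execute('SELECT eid_from, eid_to FROM in_project_relation').fetchall()
```

Binding the eid with a `?` placeholder is a choice made for this sketch; the original code formats the eid into the SQL string, which is safe there only because the value was just read back from the database.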


def check_mandatory_relations(schema, cnx, eids, fix=1):
    """check entities missing some mandatory relation"""
    print('Checking mandatory relations')
    msg = '%s #%s is missing mandatory %s relation %s (autofix will delete the entity)'
    for rschema in schema.relations():
        if rschema.final or rschema in PURE_VIRTUAL_RTYPES or rschema in ('is', 'is_instance_of'):
            continue
        smandatory = set()
        omandatory = set()
        # cardinality[0] applies to the subject, cardinality[1] to the object;
        # '1' and '+' both require at least one related entity
        for rdef in rschema.rdefs.values():
            if rdef.cardinality[0] in '1+':
                smandatory.add(rdef.subject)
            if rdef.cardinality[1] in '1+':
                omandatory.add(rdef.object)
        for role, etypes in (('subject', smandatory), ('object', omandatory)):
            for etype in etypes:
                if role == 'subject':
                    rql = 'Any X WHERE NOT X %s Y, X is %s' % (rschema, etype)
                else:
                    rql = 'Any X WHERE NOT Y %s X, X is %s' % (rschema, etype)
                for entity in cnx.execute(rql).entities():
                    sys.stderr.write(msg % (entity.cw_etype, entity.eid, role, rschema))
                    if fix:
                        entity.cw_delete()  # XXX this is BRUTAL!
                    notify_fixed(fix)
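
`check_mandatory_relations` derives the entity types to inspect from each relation definition's two-character cardinality, then builds one RQL query per type and role. Below is a minimal sketch of that derivation, assuming a hypothetical `RDef` namedtuple in place of the real relation definition objects. The entity and relation names are made up for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for a relation definition; only the three attributes
# read by check_mandatory_relations are modelled.
RDef = namedtuple('RDef', 'subject object cardinality')


def mandatory_types(rdefs):
    """Split entity types into those that must be subject / object."""
    smandatory, omandatory = set(), set()
    for rdef in rdefs:
        if rdef.cardinality[0] in '1+':   # subject side needs >= 1 object
            smandatory.add(rdef.subject)
        if rdef.cardinality[1] in '1+':   # object side needs >= 1 subject
            omandatory.add(rdef.object)
    return smandatory, omandatory


def missing_rql(rtype, etype, role):
    """Build the RQL selecting entities of etype lacking the relation."""
    if role == 'subject':
        return 'Any X WHERE NOT X %s Y, X is %s' % (rtype, etype)
    return 'Any X WHERE NOT Y %s X, X is %s' % (rtype, etype)


rdefs = [RDef('Ticket', 'Project', '1*'),
         RDef('Comment', 'Ticket', '?*'),
         RDef('Version', 'Project', '1+')]
subjects, objects = mandatory_types(rdefs)
query = missing_rql('concerns', 'Ticket', 'subject')
```

Collecting the types into sets first means each entity type is queried once per role, even when several relation definitions make it mandatory.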
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   336
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   337
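The role-dependent query construction above can be illustrated in isolation. This is a minimal sketch: `missing_relation_rql` is a hypothetical helper, not part of CubicWeb, and it only builds the RQL string without executing it. Unlike the code above, which treats every non-subject role as the object side, the sketch rejects unknown roles explicitly.

```python
def missing_relation_rql(rtype, etype, role):
    """Build the RQL selecting `etype` entities that lack relation `rtype`.

    Hypothetical helper mirroring the two string templates used above.
    """
    if role == 'subject':
        # the entity is the subject: no "X rtype Y" may be missing
        return 'Any X WHERE NOT X %s Y, X is %s' % (rtype, etype)
    if role == 'object':
        # the entity is the object: no "Y rtype X" may be missing
        return 'Any X WHERE NOT Y %s X, X is %s' % (rtype, etype)
    raise ValueError('role must be "subject" or "object", got %r' % (role,))


subject_rql = missing_relation_rql('in_group', 'CWUser', 'subject')
object_rql = missing_relation_rql('in_group', 'CWGroup', 'object')
```

The relation and entity type names in the usage lines are only examples; any pair taken from the schema would be formatted the same way.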
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   338
def check_mandatory_attributes(schema, cnx, eids, fix=1):
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   339
    """check for entities stored in the system source missing some mandatory
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   340
    attribute
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   341
    """
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   342
    print('Checking mandatory attributes')
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   343
    msg = '%s #%s is missing mandatory attribute %s (autofix will delete the entity)'
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   344
    for rschema in schema.relations():
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   345
        if not rschema.final or rschema in VIRTUAL_RTYPES:
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   346
            continue
10663
54b8a1f249fb [py3k] dict.itervalues → dict.values
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10651
diff changeset
   347
        for rdef in rschema.rdefs.values():
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   348
            if rdef.cardinality[0] in '1+':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   349
                rql = 'Any X WHERE X %s NULL, X is %s, X cw_source S, S name "system"' % (
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   350
                    rschema, rdef.subject)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   351
                for entity in cnx.execute(rql).entities():
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8696
diff changeset
   352
                    sys.stderr.write(msg % (entity.cw_etype, entity.eid, rschema))
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   353
                    if fix:
7682
07b592a62be3 [c-c db-check] drop some deprecation warning
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7473
diff changeset
   354
                        entity.cw_delete()
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   355
                    notify_fixed(fix)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   356
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   357
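The cardinality test in check_mandatory_attributes can be sketched without a real schema. In the example below, `RDef` is a hypothetical stand-in for a relation definition that models only the fields the check reads, and `mandatory_attribute_queries` is an illustrative name. The first cardinality character applies to the subject; '1' and '+' both require at least one value.

```python
from collections import namedtuple

# Hypothetical stand-in for a relation definition: only the fields used
# by the check are modelled here.
RDef = namedtuple('RDef', 'subject rtype cardinality')


def mandatory_attribute_queries(rdefs):
    """Return one RQL query per attribute that must always have a value."""
    queries = []
    for rdef in rdefs:
        # cardinality[0] is the subject side; '1' or '+' means mandatory
        if rdef.cardinality[0] in '1+':
            queries.append(
                'Any X WHERE X %s NULL, X is %s, X cw_source S, S name "system"'
                % (rdef.rtype, rdef.subject))
    return queries


# Illustrative definitions: one mandatory attribute, one optional
rdefs = [RDef('CWUser', 'login', '11'), RDef('CWUser', 'surname', '?1')]
queries = mandatory_attribute_queries(rdefs)
```

Only the mandatory attribute yields a query; the optional one is skipped, so entities with a NULL value there are never reported.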
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   358
def check_metadata(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   359
    """check that entities have the required metadata
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   360
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   361
    FIXME: rewrite using RQL queries ?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   362
    """
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   363
    print('Checking metadata')
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   364
    cursor = cnx.system_sql("SELECT DISTINCT type FROM entities;")
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   365
    eidcolumn = SQL_PREFIX + 'eid'
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   366
    msg = '  %s with eid %s has no %s (autofix will set it to now)'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   367
    for etype, in cursor.fetchall():
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   368
        if etype not in cnx.vreg.schema:
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   369
            sys.stderr.write('entities table references unknown type %s\n' %
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   370
                             etype)
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   371
            if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   372
                cnx.system_sql("DELETE FROM entities WHERE type = %(type)s",
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   373
                                   {'type': etype})
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   374
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   375
        table = SQL_PREFIX + etype
11033
63d860a14a17 [schema] Use TZDatetime for creation_date and modification_date
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 10847
diff changeset
   376
        for rel, default in ( ('creation_date', datetime.utcnow()),
63d860a14a17 [schema] Use TZDatetime for creation_date and modification_date
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 10847
diff changeset
   377
                              ('modification_date', datetime.utcnow()), ):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   378
            column = SQL_PREFIX + rel
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   379
            cursor = cnx.system_sql("SELECT %s FROM %s WHERE %s is NULL"
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   380
                                        % (eidcolumn, table, column))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   381
            for eid, in cursor.fetchall():
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   382
                sys.stderr.write(msg % (etype, eid, rel))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   383
                if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   384
                    cnx.system_sql("UPDATE %s SET %s=%%(v)s WHERE %s=%s ;"
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   385
                                       % (table, column, eidcolumn, eid),
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   386
                                       {'v': default})
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   387
                notify_fixed(fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   388
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   389
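The select-then-update pattern of check_metadata can be tried against an in-memory SQLite database. This is a sketch under stated assumptions: the `cw_Note` table, its contents and the `fix_missing` helper are invented for the example, and the standard `sqlite3` module is used directly instead of `cnx.system_sql`.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(':memory:')
# Hypothetical entity table following the cw_ prefix convention
conn.execute('CREATE TABLE cw_Note '
             '(cw_eid INTEGER PRIMARY KEY, cw_creation_date TEXT)')
conn.executemany('INSERT INTO cw_Note VALUES (?, ?)',
                 [(1, '2016-12-14T17:04:29+00:00'), (2, None)])


def fix_missing(conn, table, column, eidcolumn='cw_eid'):
    """Set `column` to the current UTC time where it is NULL.

    Return the list of eids that were fixed.
    """
    # Identifiers cannot be bound as SQL parameters, hence the string
    # formatting; they must come from trusted code, never from user input.
    rows = conn.execute('SELECT %s FROM %s WHERE %s IS NULL'
                        % (eidcolumn, table, column)).fetchall()
    now = datetime.now(timezone.utc).isoformat()
    for eid, in rows:
        # values, unlike identifiers, are passed as bound parameters
        conn.execute('UPDATE %s SET %s = :v WHERE %s = :eid'
                     % (table, column, eidcolumn), {'v': now, 'eid': eid})
    return [eid for eid, in rows]


fixed = fix_missing(conn, 'cw_Note', 'cw_creation_date')
```

As in the function above, table and column names are interpolated while the replacement value is bound, which keeps the default timestamp out of the SQL text.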
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   390
def check(repo, cnx, checks, reindex, fix, withpb=True):
2476
1294a6bdf3bf application -> instance where it makes sense
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2248
diff changeset
   391
    """check integrity of instance's repository,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   392
    using the given connection to access the repository locally
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   393
    (no running cubicweb server needed)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   394
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   395
    # yo, launch checks
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   396
    if checks:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   397
        eids_cache = {}
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   398
        with cnx.security_enabled(read=False, write=False):  # disable read and write security
4835
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   399
            for check in checks:
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   400
                check_func = globals()['check_%s' % check]
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   401
                check_func(repo.schema, cnx, eids_cache, fix=fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   402
        if fix:
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   403
            cnx.commit()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   404
        else:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   405
            print()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   406
        if not fix:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   407
            print('WARNING: Diagnostic run, nothing has been corrected')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   408
    if reindex:
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   409
        cnx.rollback()
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   410
        reindex_entities(repo.schema, cnx, withpb=withpb)
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   411
        cnx.commit()
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   412
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   413
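check() resolves each check by name through `globals()['check_%s' % check]`. The sketch below shows the same name-based dispatch with an explicit dictionary swapped in for the `globals()` lookup, so an unknown name produces a clear error. The two check functions and `run_checks` are hypothetical stand-ins that only record their calls.

```python
def check_entities(log, fix):
    # stand-in for a real check: only records that it ran
    log.append(('entities', fix))


def check_metadata(log, fix):
    log.append(('metadata', fix))


# Explicit registry, used here in place of the globals() lookup
CHECKS = {'entities': check_entities, 'metadata': check_metadata}


def run_checks(names, fix=False):
    """Run the named checks in order and return the call log."""
    log = []
    for name in names:
        try:
            check_func = CHECKS[name]
        except KeyError:
            raise ValueError('unknown check: %s' % name)
        check_func(log, fix=fix)
    return log


result = run_checks(['metadata', 'entities'], fix=True)
```

A registry trades the convenience of automatic discovery for a single place where the accepted check names are listed.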
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   414
SYSTEM_INDEXES = {
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   415
    # see cw/server/sources/native.py
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   416
    'transactions_tx_time_idx': ('transactions', 'tx_time'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   417
    'transactions_tx_user_idx': ('transactions', 'tx_user'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   418
    'tx_entity_actions_txa_action_idx': ('tx_entity_actions', 'txa_action'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   419
    'tx_entity_actions_txa_public_idx': ('tx_entity_actions', 'txa_public'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   420
    'tx_entity_actions_eid_idx': ('tx_entity_actions', 'txa_eid'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   421
    'tx_entity_actions_etype_idx': ('tx_entity_actions', 'txa_etype'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   422
    'tx_entity_actions_tx_uuid_idx': ('tx_entity_actions', 'tx_uuid'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   423
    'tx_relation_actions_txa_action_idx': ('tx_relation_actions', 'txa_action'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   424
    'tx_relation_actions_txa_public_idx': ('tx_relation_actions', 'txa_public'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   425
    'tx_relation_actions_eid_from_idx': ('tx_relation_actions', 'eid_from'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   426
    'tx_relation_actions_eid_to_idx': ('tx_relation_actions', 'eid_to'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   427
    'tx_relation_actions_tx_uuid_idx': ('tx_relation_actions', 'tx_uuid'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   428
}
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   429
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   430
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   431
def expected_indexes(cnx):
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   432
    """Return a dictionary describing indexes expected by the schema {index name: (table, column)}.
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   433
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   434
    This doesn't include primary key indexes.
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   435
    """
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   436
    source = cnx.repo.system_source
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   437
    dbh = source.dbhelper
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   438
    schema = cnx.repo.schema
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   439
    schema_indexes = SYSTEM_INDEXES.copy()
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   440
    if source.dbdriver == 'postgres':
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   441
        schema_indexes.update({'appears_words_idx': ('appears', 'words')})
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   442
    else:
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   443
        schema_indexes.update({'appears_uid': ('appears', 'uid'),
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   444
                               'appears_word_id': ('appears', 'word_id')})
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   445
    for rschema in schema.relations():
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   446
        if rschema.rule or rschema in PURE_VIRTUAL_RTYPES:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   447
            continue  # computed relation
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   448
        if rschema.final or rschema.inlined:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   449
            for rdef in rschema.rdefs.values():
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   450
                table = 'cw_{0}'.format(rdef.subject)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   451
                column = 'cw_{0}'.format(rdef.rtype)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   452
                if any(isinstance(cstr, UniqueConstraint) for cstr in rdef.constraints):
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   453
                    schema_indexes[dbh._index_name(table, column, unique=True)] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   454
                        table, [column])
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   455
                if rschema.inlined or rdef.indexed:
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   456
                    schema_indexes[dbh._index_name(table, column)] = (table, [column])
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   457
        else:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   458
            table = '{0}_relation'.format(rschema)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   459
            if source.dbdriver == 'postgres':
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   460
                # index built after the primary key constraint
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   461
                schema_indexes[build_index_name(table, ['eid_from', 'eid_to'], 'key_')] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   462
                    table, ['eid_from', 'eid_to'])
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   463
            schema_indexes[build_index_name(table, ['eid_from'], 'idx_')] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   464
                table, ['eid_from'])
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   465
            schema_indexes[build_index_name(table, ['eid_to'], 'idx_')] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   466
                table, ['eid_to'])
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   467
    for eschema in schema.entities():
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   468
        if eschema.final:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   469
            continue
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   470
        table = 'cw_{0}'.format(eschema)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   471
        for columns, index_name in iter_unique_index_names(eschema):
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   472
            schema_indexes[index_name] = (table, columns)
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   473
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   474
    return schema_indexes
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   475
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   476
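The shape of the dictionary returned by expected_indexes, {index name: (table, columns)}, can be shown for relation tables alone. In this sketch `toy_index_name` is a hypothetical naming scheme chosen for readability; the real names come from `build_index_name` and `dbh._index_name`, which are not reproduced here. `relation_table_indexes` is likewise an illustrative helper.

```python
def toy_index_name(table, columns, prefix='idx_'):
    # Hypothetical naming scheme for illustration only; it does not
    # match the names generated by build_index_name.
    return '%s%s_%s' % (prefix, table, '_'.join(columns))


def relation_table_indexes(rtypes, with_key_index):
    """Map index names to (table, columns) for non-inlined relations."""
    indexes = {}
    for rtype in rtypes:
        table = '%s_relation' % rtype
        if with_key_index:
            # mirrors the extra postgres index on (eid_from, eid_to)
            columns = ['eid_from', 'eid_to']
            indexes[toy_index_name(table, columns, 'key_')] = (table, columns)
        for column in ('eid_from', 'eid_to'):
            indexes[toy_index_name(table, [column])] = (table, [column])
    return indexes


indexes = relation_table_indexes(['in_group'], with_key_index=True)
```

Each non-inlined relation contributes two single-column indexes, plus the composite one when `with_key_index` is true, which matches the postgres branch above.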
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   477
def database_indexes(cnx):
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   478
    """Return a set of indexes found in the database, excluding primary key indexes."""
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   479
    source = cnx.repo.system_source
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   480
    dbh = source.dbhelper
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   481
    if source.dbdriver == 'postgres':
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   482
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   483
        def index_filter(idx):
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   484
            return not (idx.startswith('pg_') or idx.endswith('_pkey'))
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   485
    else:
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   486
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   487
        def index_filter(idx):
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   488
            return not idx.startswith('sqlite_')
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   489
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   490
    return set(idx for idx in dbh.list_indices(cnx.cnxset.cu)
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   491
               if index_filter(idx))
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   492
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   493
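The SQLite branch of the filter, and the set difference later applied to its result, can be reproduced with the standard `sqlite3` module. This is only a sketch: it reads `sqlite_master` directly rather than calling `dbh.list_indices`, and the table, index names and `user_indexes` helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# The UNIQUE constraint makes SQLite create an internal index whose
# name starts with "sqlite_autoindex_"
conn.execute('CREATE TABLE cw_Note '
             '(cw_eid INTEGER PRIMARY KEY, cw_title TEXT UNIQUE)')
conn.execute('CREATE INDEX cw_note_title_idx ON cw_Note (cw_title)')


def user_indexes(conn):
    """Return the names of indexes that were not generated by SQLite."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return set(name for name, in rows if not name.startswith('sqlite_'))


found = user_indexes(conn)
# Hypothetical expectation: one index exists, the other was never created
expected = {'cw_note_title_idx', 'cw_note_eid_idx'}
missing = expected - found
```

Filtering on the `sqlite_` prefix hides the automatically generated index, so only explicitly created indexes take part in the comparison.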
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   494
def check_indexes(cnx):
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   495
    """Check indexes of a system database: output missing expected indexes as well as unexpected ones.
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   496
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   497
    Return 0 if there are no differences, else 1.
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   498
    """
    schema_indexes = expected_indexes(cnx)
    db_indexes = database_indexes(cnx)

    # Indexes expected by the schema but absent from the database.
    missing_indexes = set(schema_indexes) - db_indexes
    if missing_indexes:
        print(underline_title('Missing indexes'))
        print('index expected by the schema but not found in the database:\n')
        missing = ['{0} ON {1[0]} {1[1]}'.format(idx, schema_indexes[idx])
                   for idx in missing_indexes]
        print('\n'.join(sorted(missing)))
        print()
        status = 1

    # Indexes present in the database but not expected by the schema.
    additional_indexes = db_indexes - set(schema_indexes)
    if additional_indexes:
        print(underline_title('Additional indexes'))
        print('index in the database but not expected by the schema:\n')
        print('\n'.join(sorted(additional_indexes)))
        print()
        status = 1

    if not (missing_indexes or additional_indexes):
        print('Everything is Ok')
        status = 0

    return status
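

# A minimal, self-contained sketch of the comparison performed by
# check_indexes, for readers who want to try the set logic without a database.
# The helper name `diff_indexes` and the sample data are hypothetical and not
# part of CubicWeb. It assumes, as the format string in check_indexes suggests,
# that the expected indexes map each index name to a (table, column) pair.

```python
def diff_indexes(expected, found):
    """Hypothetical helper: return (missing, additional) as sorted lists.

    `expected` maps an index name to an assumed (table, column) pair;
    `found` is the set of index names present in the database.
    """
    # Names expected by the schema but absent from the database,
    # rendered the same way check_indexes prints them.
    missing = sorted('{0} ON {1[0]} {1[1]}'.format(idx, expected[idx])
                     for idx in set(expected) - found)
    # Names present in the database but unknown to the schema.
    additional = sorted(found - set(expected))
    return missing, additional


# Made-up sample data, not taken from a real CubicWeb schema.
expected = {'idx_a': ('table_a', 'col_a'), 'idx_b': ('table_b', 'col_b')}
found = {'idx_a', 'idx_c'}

missing, additional = diff_indexes(expected, found)
# Same convention as check_indexes: 0 when nothing differs, else 1.
status = 1 if (missing or additional) else 0
```

# Returning the two lists instead of printing them keeps the comparison easy
# to test; check_indexes itself prints its report and only returns the status.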