cubicweb/server/checkintegrity.py
author Denis Laxalde <denis.laxalde@logilab.fr>
Thu, 21 Mar 2019 14:33:54 +0100
changeset 12530 9d88e1177c35
parent 11957 48b8fbc88209
child 12567 26744ad37953
permissions -rw-r--r--
Remove Twisted web server

Twisted web server is not used anymore and was superseded by pyramid many years ago. Furthermore, our usage is not compatible with Python 3. So we drop the "etwist" sub-package.

As a consequence, the "all-in-one" configuration type gets dropped as it was Twisted-specific. We resurrect it in cubicweb/pyramid/config.py, keeping only the options used by "pyramid". Similarly, we introduce an AllInOneCreateHandler in cubicweb/pyramid/pyramidctl.py that is basically the one that lived in cubicweb/etwist/twctl.py and is used to create the "all-in-one" instance. Added a TODO here about "pyramid.ini", which could be generated at the end of the bootstrap() method.

In cubicweb/devtools/httptest.py, CubicWebServerTC is now equivalent to CubicWebWsgiTC and the latter is dropped.
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
     1
# copyright 2003-2016 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     2
# contact http://www.logilab.fr/ -- mailto:contact@logilab.fr
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     3
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     4
# This file is part of CubicWeb.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     5
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     6
# CubicWeb is free software: you can redistribute it and/or modify it under the
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     7
# terms of the GNU Lesser General Public License as published by the Free
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     8
# Software Foundation, either version 2.1 of the License, or (at your option)
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
     9
# any later version.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    10
#
5424
8ecbcbff9777 replace logilab-common by CubicWeb in disclaimer
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5421
diff changeset
    11
# CubicWeb is distributed in the hope that it will be useful, but WITHOUT
5421
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    12
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    13
# FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    14
# details.
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    15
#
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    16
# You should have received a copy of the GNU Lesser General Public License along
8167de96c523 proper licensing information (LGPL-2.1). Hope I get it right this time.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5341
diff changeset
    17
# with CubicWeb.  If not, see <http://www.gnu.org/licenses/>.
6127
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    18
"""Integrity checking tool for instances:
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    19
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    20
* integrity of a CubicWeb repository. Note that only the system database is
747e423093fc [ms, c-c] new command checking for consistency / potentian flaws and enhancements of mapping file of a multi-sources instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6112
diff changeset
    21
  checked.
5999
eaf8219f8b7d [migration] fix rename_entity_type to avoid to loose some relations on the way
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5954
diff changeset
    22
"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
    23
from __future__ import print_function
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
    24
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    25
import sys
1016
26387b836099 use datetime instead of mx.DateTime
sylvain.thenault@logilab.fr
parents: 713
diff changeset
    26
from datetime import datetime
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    27
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    28
from logilab.common.shellutils import ProgressBar
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    29
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    30
from yams.constraints import UniqueConstraint
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    31
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    32
from cubicweb.toolsutils import underline_title
10813
ab626726a70a [syncschema] don't arbitrarily remove constraints from the in-memory schema
Julien Cristau <julien.cristau@logilab.fr>
parents: 10663
diff changeset
    33
from cubicweb.schema import PURE_VIRTUAL_RTYPES, VIRTUAL_RTYPES, UNIQUE_CONSTRAINTS
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    34
from cubicweb.server.sqlutils import SQL_PREFIX
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    35
from cubicweb.server.schema2sql import iter_unique_index_names, build_index_name
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    36
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    37
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    38
def notify_fixed(fix):
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    39
    if fix:
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    40
        sys.stderr.write(' [FIXED]')
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    41
    sys.stderr.write('\n')
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
    42
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    43
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
    44
def has_eid(cnx, sqlcursor, eid, eids):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    45
    """Return True if eid is in the entities table and its entity type table."""
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
    46
    if eid in eids:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    47
        return eids[eid]
9469
032825bbacab [multi-sources-removal] Drop entities.source column
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 9468
diff changeset
    48
    sqlcursor.execute('SELECT type FROM entities WHERE eid=%s' % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    49
    try:
9469
032825bbacab [multi-sources-removal] Drop entities.source column
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 9468
diff changeset
    50
        etype = sqlcursor.fetchone()[0]
7815
2a164a9cf81c [exceptions] stop catching any exception in various places (closes #1942716)
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7687
diff changeset
    51
    except Exception:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    52
        eids[eid] = False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    53
        return False
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
    54
    if etype not in cnx.vreg.schema:
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
    55
        eids[eid] = False
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
    56
        return False
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    57
    sqlcursor.execute('SELECT * FROM %s%s WHERE %seid=%s' % (SQL_PREFIX, etype,
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
    58
                                                             SQL_PREFIX, eid))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    59
    result = sqlcursor.fetchall()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    60
    if len(result) == 0:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    61
        eids[eid] = False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    62
        return False
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    63
    elif len(result) > 1:
9492
c7fc56eecd1a English typography
Dimitri Papadopoulos <dimitri.papadopoulos@cea.fr>
parents: 9171
diff changeset
    64
        msg = ('  More than one entity with eid %s exists in source!\n'
c7fc56eecd1a English typography
Dimitri Papadopoulos <dimitri.papadopoulos@cea.fr>
parents: 9171
diff changeset
    65
               '  WARNING : Unable to fix this, do it yourself!\n')
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
    66
        sys.stderr.write(msg % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    67
    eids[eid] = True
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    68
    return True
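
The caching behaviour of has_eid() above can be illustrated with a minimal, self-contained sketch. It is not the CubicWeb code: it assumes an in-memory SQLite database, checks only a simplified `entities` table (not the per-type table), and uses a hypothetical helper name `eid_exists`. It also binds the eid as a query parameter, which is a swapped-in technique rather than what the original does.

```python
import sqlite3


def eid_exists(cursor, eid, cache):
    """Return True if `eid` has a row in `entities`, remembering the answer in `cache`."""
    if eid in cache:
        # Already looked up once: reuse the stored answer, no SQL round trip.
        return cache[eid]
    # sqlite3 uses the "qmark" parameter style; other drivers may differ.
    cursor.execute('SELECT type FROM entities WHERE eid=?', (eid,))
    row = cursor.fetchone()
    cache[eid] = row is not None
    return cache[eid]


conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT)')
cur.execute("INSERT INTO entities VALUES (1, 'CWUser')")

cache = {}
found = eid_exists(cur, 1, cache)
missing = eid_exists(cur, 2, cache)
print(found, missing, cache)
```

Both positive and negative answers are cached, so an eid that is referenced many times costs a single lookup.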
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    69
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    70
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    71
# XXX move to yams?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    72
def etype_fti_containers(eschema, _done=None):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    73
    if _done is None:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    74
        _done = set()
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    75
    _done.add(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    76
    containers = tuple(eschema.fulltext_containers())
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    77
    if containers:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    78
        for rschema, target in containers:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    79
            if target == 'object':
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    80
                targets = rschema.objects(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    81
            else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    82
                targets = rschema.subjects(eschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    83
            for targeteschema in targets:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    84
                if targeteschema in _done:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    85
                    continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    86
                _done.add(targeteschema)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    87
                for container in etype_fti_containers(targeteschema, _done):
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    88
                    yield container
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    89
    else:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    90
        yield eschema
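
To see what the recursion in etype_fti_containers() yields, here is a small sketch using stand-in schema objects. `StubEntity` and `StubRelation` are hypothetical and only mimic the three methods the function calls (`fulltext_containers`, `objects`, `subjects`); the real yams schema classes are richer.

```python
class StubRelation:
    """Hypothetical relation schema: knows its subject and object entity types."""

    def __init__(self, subjects=(), objects=()):
        self._subjects = list(subjects)
        self._objects = list(objects)

    def subjects(self, eschema):
        return self._subjects

    def objects(self, eschema):
        return self._objects


class StubEntity:
    """Hypothetical entity schema with a list of (relation, role) containers."""

    def __init__(self, name):
        self.name = name
        self.containers = []

    def fulltext_containers(self):
        return iter(self.containers)

    def __repr__(self):
        return self.name


def containers_of(eschema, _done=None):
    """Same traversal as etype_fti_containers, restated for the stubs."""
    if _done is None:
        _done = set()
    _done.add(eschema)
    containers = tuple(eschema.fulltext_containers())
    if containers:
        for rschema, target in containers:
            if target == 'object':
                targets = rschema.objects(eschema)
            else:
                targets = rschema.subjects(eschema)
            for targeteschema in targets:
                if targeteschema in _done:
                    continue  # already visited, avoids infinite recursion
                _done.add(targeteschema)
                for container in containers_of(targeteschema, _done):
                    yield container
    else:
        # No container: the entity type is indexed on its own.
        yield eschema


ticket = StubEntity('Ticket')
comment = StubEntity('Comment')
# Assume a Comment's text is indexed with the Ticket it points to.
comment.containers = [(StubRelation(objects=[ticket]), 'object')]

print(list(containers_of(comment)), list(containers_of(ticket)))
```

An entity type without containers yields itself, while a contained type yields the type that ultimately holds its indexed text.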
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
    91
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
    92
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
    93
def reindex_entities(schema, cnx, withpb=True, etypes=None):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    94
    """Reindex all entities in the repository, or only those of the given etypes."""
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    95
    # deactivate modification_date hook since we don't want them
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
    96
    # to be updated due to the reindexation
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
    97
    repo = cnx.repo
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
    98
    dbhelper = repo.system_source.dbhelper
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
    99
    cursor = cnx.cnxset.cu
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   100
    if not dbhelper.has_fti_table(cursor):
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   101
        print('no text index table')
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   102
        dbhelper.init_fti(cursor)
4806
4f12f59b1a13 [fti] refactor and fix full text indexation handling
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4691
diff changeset
   103
    repo.system_source.do_fti = True  # ensure full-text indexation is activated
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   104
    if etypes is None:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   105
        print('Reindexing entities')
5850
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   106
        etypes = set()
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   107
        for eschema in schema.entities():
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   108
            if eschema.final:
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   109
                continue
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   110
            indexable_attrs = tuple(eschema.indexable_attributes())  # generator
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   111
            if not indexable_attrs:
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   112
                continue
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   113
            for container in etype_fti_containers(eschema):
fabff2813ee4 [migration] schema should be accessed through .repo
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5693
diff changeset
   114
                etypes.add(container)
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   115
        # clear fti table first
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   116
        cnx.system_sql('DELETE FROM %s' % dbhelper.fti_table)
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   117
    else:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   118
        print('Reindexing entities of type %s' %
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   119
              ', '.join(sorted(str(e) for e in etypes)))
5954
987086484876 [fti migration] test and fix reindexation of some specific entity types
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5850
diff changeset
   120
        # clear fti table first. Use subquery for sql compatibility
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   121
        cnx.system_sql("DELETE FROM %s WHERE EXISTS(SELECT 1 FROM ENTITIES "
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   122
                       "WHERE eid=%s AND type IN (%s))" % (
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   123
                           dbhelper.fti_table, dbhelper.fti_uid_attr,
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   124
                           ','.join("'%s'" % etype for etype in etypes)))
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   125
    if withpb:
6112
913979c79244 [db-fti-index] simple fix fpr progressbar-related crash when etypes is None in reindex_entities()
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 5999
diff changeset
   126
        pb = ProgressBar(len(etypes) + 1)
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   127
        pb.update()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   128
    # reindex entities by generating rql queries which set all indexable
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   129
    # attributes to their current value
4816
c02583cb80a9 repair stuff broken by fti handling changes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4806
diff changeset
   130
    source = repo.system_source
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   131
    for eschema in etypes:
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   132
        etype_class = cnx.vreg['etypes'].etype_class(str(eschema))
10847
ce5403611cbe [fti] add cw_fti_index_rql_limit method
Julien Cristau <julien.cristau@logilab.fr>
parents: 10846
diff changeset
   133
        for rset in etype_class.cw_fti_index_rql_limit(cnx):
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   134
            source.fti_index_entities(cnx, rset.entities())
8450
11063635c4e4 [fti] allow usage of custom RQL to fetch entities to index (closes #2410509)
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 8447
diff changeset
   135
            # clear entity cache to avoid high memory consumption on big tables
9574
2d4c4842bd04 [server] some s/session/cnx/
Julien Cristau <julien.cristau@logilab.fr>
parents: 9543
diff changeset
   136
            cnx.drop_entity_cache()
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   137
        if withpb:
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   138
            pb.update()
9578
68049d53426c [fti] properly close the ProgressBar
Julien Cristau <julien.cristau@logilab.fr>
parents: 9577
diff changeset
   139
    if withpb:
68049d53426c [fti] properly close the ProgressBar
Julien Cristau <julien.cristau@logilab.fr>
parents: 9577
diff changeset
   140
        pb.finish()
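
The two DELETE statements that reindex_entities() issues before reindexing can be sketched as a pure string-building helper. `build_fti_delete` is a hypothetical name, and the table and column names passed in below (`appears`, `uid`) are assumptions taken from the SQL used elsewhere in this module; the real values come from `dbhelper.fti_table` and `dbhelper.fti_uid_attr`.

```python
def build_fti_delete(fti_table, fti_uid_attr, etypes=None):
    """Return the SQL used to clear the full-text index before reindexing.

    With no entity types given, the whole index table is emptied; otherwise
    only rows whose uid belongs to an entity of one of the given types.
    """
    if etypes is None:
        return 'DELETE FROM %s' % fti_table
    # Sorted only to make the output deterministic in this sketch.
    type_list = ','.join("'%s'" % etype for etype in sorted(etypes))
    return ("DELETE FROM %s WHERE EXISTS(SELECT 1 FROM ENTITIES "
            "WHERE eid=%s AND type IN (%s))" % (fti_table, fti_uid_attr, type_list))


full = build_fti_delete('appears', 'uid')
partial = build_fti_delete('appears', 'uid', {'Note', 'Card'})
print(full)
print(partial)
```

The subquery form is what the original comment calls "sql compatibility": it avoids a join in the DELETE, which not every backend accepts.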
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   141
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   142
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   143
_CHECKERS = {}
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   144
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   145
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   146
def _checker(func):
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   147
    """Decorator to register a function as a checker for check()."""
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   148
    fname = func.__name__
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   149
    prefix = 'check_'
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   150
    assert fname.startswith(prefix), 'cannot register %s as a checker' % func
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   151
    _CHECKERS[fname[len(prefix):]] = func
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   152
    return func
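
The registration pattern used by _checker() can be tried in isolation. The sketch below restates the decorator with a standalone registry; `check_example` and its behaviour are hypothetical and only show how the `check_` prefix is stripped to form the registry key.

```python
CHECKERS = {}


def checker(func):
    """Register `func` under its name with the 'check_' prefix removed."""
    prefix = 'check_'
    name = func.__name__
    assert name.startswith(prefix), 'cannot register %s as a checker' % func
    CHECKERS[name[len(prefix):]] = func
    return func  # the function itself is returned unchanged


@checker
def check_example(items):
    """Hypothetical checker: report the negative values."""
    return [item for item in items if item < 0]


names = sorted(CHECKERS)
result = CHECKERS['example']([3, -1, 2])
print(names, result)
```

Because the decorator returns the function unchanged, a decorated checker can still be called directly by name as well as looked up in the registry.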
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   153
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   154
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   155
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   156
def check_schema(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   157
    """Check the serialized schema for duplicated constraints."""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   158
    print('Checking serialized schema')
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   159
    rql = ('Any COUNT(X),RN,SN,ON,CTN GROUPBY RN,SN,ON,CTN ORDERBY 1 '
1398
5fe84a5f7035 rename internal entity types to have CW prefix instead of E
sylvain.thenault@logilab.fr
parents: 1263
diff changeset
   160
           'WHERE X is CWConstraint, R constrained_by X, '
5338
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   161
           'R relation_type RT, RT name RN, R from_entity ST, ST name SN, '
3e5a256d17ba [db-check] fix duplicated schema constraint detection
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4835
diff changeset
   162
           'R to_entity OT, OT name ON, X cstrtype CT, CT name CTN')
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   163
    for count, rn, sn, on, cstrname in cnx.execute(rql):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   164
        if count == 1:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   165
            continue
10813
ab626726a70a [syncschema] don't arbitrarily remove constraints from the in-memory schema
Julien Cristau <julien.cristau@logilab.fr>
parents: 10663
diff changeset
   166
        if cstrname in UNIQUE_CONSTRAINTS:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   167
            print("ERROR: got %s %r constraints on relation %s.%s.%s" % (
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   168
                count, cstrname, sn, rn, on))
5523
4bf975c049a6 [db-check] RQLConstraint is not a 'unique' constraint
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5424
diff changeset
   169
            if fix:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   170
                print('dunno how to fix, do it yourself')
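
The duplicate detection that check_schema() delegates to the RQL `COUNT ... GROUPBY` query can be approximated in plain Python. This is a sketch under assumptions: `duplicated_constraints` is a hypothetical helper, the rows are made-up (relation, subject, object, constraint type) tuples, and the set of "unique" constraint types is a placeholder rather than the actual content of UNIQUE_CONSTRAINTS.

```python
from collections import Counter


def duplicated_constraints(rows, unique_types):
    """Return ((rn, sn, on, cstrname), count) pairs that occur more than once.

    Only constraint types listed in `unique_types` are reported, mirroring
    the `cstrname in UNIQUE_CONSTRAINTS` test in check_schema().
    """
    counts = Counter(rows)
    return sorted(
        (key, count) for key, count in counts.items()
        if count > 1 and key[3] in unique_types
    )


# Placeholder value, not the real UNIQUE_CONSTRAINTS tuple.
unique_types = {'SizeConstraint'}
rows = [
    ('name', 'CWUser', 'String', 'SizeConstraint'),
    ('name', 'CWUser', 'String', 'SizeConstraint'),    # duplicate, reported
    ('owner', 'Card', 'CWUser', 'RQLConstraint'),
    ('owner', 'Card', 'CWUser', 'RQLConstraint'),      # duplicate, ignored
    ('title', 'Card', 'String', 'SizeConstraint'),
]

duplicates = duplicated_constraints(rows, unique_types)
for (rn, sn, on, cstrname), count in duplicates:
    print("ERROR: got %s %r constraints on relation %s.%s.%s"
          % (count, cstrname, sn, rn, on))
```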
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   171
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   172
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   173
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   174
def check_text_index(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   175
    """check all entities registered in the text index"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   176
    print('Checking text index')
11957
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   177
    msg = ('  Entity with eid %s exists in the text index but not in any '
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   178
           'entity type table (autofix will remove from text index)')
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   179
    cursor = cnx.system_sql('SELECT uid FROM appears;')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   180
    for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   181
        eid = row[0]
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   182
        if not has_eid(cnx, cursor, eid, eids):
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   183
            sys.stderr.write(msg % eid)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   184
            if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   185
                cnx.system_sql('DELETE FROM appears WHERE uid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   186
            notify_fixed(fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   187
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   188
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   189
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   190
def check_entities(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   191
    """check all entities registered in the repo system table"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   192
    print('Checking entities system table')
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   193
    # system table but no source
11957
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   194
    msg = ('  Entity %s with eid %s exists in "entities" table but not in any '
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   195
           'entity type table (autofix will delete the entity)')
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   196
    cursor = cnx.system_sql('SELECT eid,type FROM entities;')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   197
    for row in cursor.fetchall():
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   198
        eid, etype = row
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   199
        if not has_eid(cnx, cursor, eid, eids):
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   200
            sys.stderr.write(msg % (etype, eid))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   201
            if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   202
                cnx.system_sql('DELETE FROM entities WHERE eid=%s;' % eid)
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   203
            notify_fixed(fix)
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   204
    # source in entities, but no relation cw_source
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   205
    # XXX this (get_versions) requires a second connection to the db when we already have one open
11772
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   206
    cursor = cnx.system_sql('SELECT e.eid FROM entities as e, cw_CWSource as s '
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   207
                            "WHERE s.cw_name='system' AND NOT EXISTS(SELECT 1 FROM cw_source_relation as cs "
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   208
                            '  WHERE cs.eid_from=e.eid) '
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   209
                            'ORDER BY e.eid')
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   210
    msg = ('  Entity with eid %s is missing relation cw_source (autofix will create the relation)\n')
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   211
    for row in cursor.fetchall():
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   212
        sys.stderr.write(msg % row[0])
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   213
    if fix:
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   214
        cnx.system_sql('INSERT INTO cw_source_relation (eid_from, eid_to) '
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   215
                       'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWSource as s '
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   216
                       "WHERE s.cw_name='system' AND NOT EXISTS(SELECT 1 FROM cw_source_relation as cs "
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   217
                       '  WHERE cs.eid_from=e.eid)')
54fa73a2f931 [integrity check] Stop using entities.asource column in integrity checks
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11767
diff changeset
   218
        notify_fixed(True)
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   219
    # inconsistencies for 'is'
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   220
    msg = '  %s #%s is missing relation "is" (autofix will create the relation)\n'
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   221
    cursor = cnx.system_sql('SELECT e.type, e.eid FROM entities as e, cw_CWEType as s '
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   222
                                'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   223
                                '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   224
                                'ORDER BY e.eid')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   225
    for row in cursor.fetchall():
10625
684dbf6722ef [server] fix checkintegrity with pyodbc
Julien Cristau <julien.cristau@logilab.fr>
parents: 10365
diff changeset
   226
        sys.stderr.write(msg % tuple(row))
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   227
    if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   228
        cnx.system_sql('INSERT INTO is_relation (eid_from, eid_to) '
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   229
                           'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWEType as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   230
                           'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   231
                           '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   232
        notify_fixed(True)
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   233
    # inconsistencies for 'is_instance_of'
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   234
    msg = '  %s #%s is missing relation "is_instance_of" (autofix will create the relation)\n'
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   235
    cursor = cnx.system_sql('SELECT e.type, e.eid FROM entities as e, cw_CWEType as s '
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   236
                                'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_instance_of_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   237
                                '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid) '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   238
                                'ORDER BY e.eid')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   239
    for row in cursor.fetchall():
10625
684dbf6722ef [server] fix checkintegrity with pyodbc
Julien Cristau <julien.cristau@logilab.fr>
parents: 10365
diff changeset
   240
        sys.stderr.write(msg % tuple(row))
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   241
    if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   242
        cnx.system_sql('INSERT INTO is_instance_of_relation (eid_from, eid_to) '
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   243
                           'SELECT e.eid, s.cw_eid FROM entities as e, cw_CWEType as s '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   244
                           'WHERE s.cw_name=e.type AND NOT EXISTS(SELECT 1 FROM is_instance_of_relation as cs '
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   245
                           '  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)')
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   246
        notify_fixed(True)
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   247
    print('Checking entities tables')
11957
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   248
    msg = ('  Entity with eid %s exists in the %s table but not in "entities" '
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   249
           'table (autofix will delete the entity)')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   250
    for eschema in schema.entities():
3689
deb13e88e037 follow yams 0.25 api changes to improve performance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 3374
diff changeset
   251
        if eschema.final:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   252
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   253
        table = SQL_PREFIX + eschema.type
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   254
        column = SQL_PREFIX + 'eid'
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   255
        cursor = cnx.system_sql('SELECT %s FROM %s;' % (column, table))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   256
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   257
            eid = row[0]
5341
0de53140bd29 [db-check] cleanup
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 5340
diff changeset
   258
            # eids is full since we have fetched everything from the entities table,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   259
            # no need to call has_eid
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   260
            if eid not in eids or not eids[eid]:
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   261
                sys.stderr.write(msg % (eid, eschema.type))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   262
                if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   263
                    cnx.system_sql('DELETE FROM %s WHERE %s=%s;' % (table, column, eid))
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   264
                notify_fixed(fix)
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   265
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   266
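The "is" check in check_entities above pairs a NOT EXISTS anti-join with an INSERT ... SELECT that reuses the same condition. The sketch below reproduces that pattern on a throwaway in-memory SQLite database. It is only an illustration: the helper names (find_missing_is, fix_missing_is, build_demo_db) and the sample rows are hypothetical, and sqlite3 stands in for the connection that cnx.system_sql wraps.

```python
import sqlite3

# Condition shared by the check and the fix, copied from the queries above.
MISSING_IS = (
    "FROM entities AS e, cw_CWEType AS s "
    "WHERE s.cw_name=e.type AND NOT EXISTS("
    "  SELECT 1 FROM is_relation AS cs "
    "  WHERE cs.eid_from=e.eid AND cs.eid_to=s.cw_eid)"
)


def find_missing_is(conn):
    """Return (type, eid) rows for entities lacking their "is" relation."""
    return conn.execute(
        "SELECT e.type, e.eid " + MISSING_IS + " ORDER BY e.eid").fetchall()


def fix_missing_is(conn):
    """Create every missing "is" row with a single INSERT ... SELECT."""
    conn.execute(
        "INSERT INTO is_relation (eid_from, eid_to) "
        "SELECT e.eid, s.cw_eid " + MISSING_IS)


def build_demo_db():
    """Build a tiny hypothetical database: entities 11 and 12 lack "is"."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entities (eid INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE cw_CWEType (cw_eid INTEGER PRIMARY KEY, cw_name TEXT);
        CREATE TABLE is_relation (eid_from INTEGER, eid_to INTEGER);
        INSERT INTO cw_CWEType VALUES (1, 'CWUser'), (2, 'CWGroup');
        INSERT INTO entities VALUES (10, 'CWUser'), (11, 'CWGroup'), (12, 'CWUser');
        INSERT INTO is_relation VALUES (10, 1);
    """)
    return conn
```

Because the report and the repair share one condition, running the check again after the fix should come back empty, which is a cheap way to confirm the repair did what the message promised.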
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   267
def bad_related_msg(rtype, target, eid, fix):
11957
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   268
    msg = ('  A relation %(rtype)s with %(target)s eid %(eid)d exists but '
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   269
           'entity #%(eid)d does not exist')
48b8fbc88209 [server] Reprase messages of db-check to avoid using "system" and "sources"
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11953
diff changeset
   270
    sys.stderr.write(msg % {'rtype': rtype, 'target': target, 'eid': eid})
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   271
    notify_fixed(fix)
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   272
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   273
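Messages such as the one built in bad_related_msg rely on named %-style placeholders filled from a dict. A minimal sketch of that formatting follows; format_bad_related is a hypothetical helper that returns the string instead of writing it to sys.stderr. Every named field needs its leading "%": when the right-hand operand is a dict, Python does not complain about text that merely looks like a placeholder and emits it literally.

```python
def format_bad_related(rtype, target, eid):
    """Build the dangling-relation message with named placeholders."""
    msg = ('  A relation %(rtype)s with %(target)s eid %(eid)d exists but '
           'entity #%(eid)d does not exist')
    # The same key ('eid') may be referenced several times in one template.
    return msg % {'rtype': rtype, 'target': target, 'eid': eid}


def format_without_percent(eid):
    """Show that a field written without '%' is left untouched."""
    return 'entity #(eid)d' % {'eid': eid}
```

For example, format_bad_related('in_group', 'subject', 42) mentions eid 42 twice, while format_without_percent(42) returns its template unchanged.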
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   274
def bad_inlined_msg(rtype, parent_eid, eid, fix):
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   275
    msg = ('  An inlined relation %s from %s to %s exists but the latter '
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   276
           'entity does not exist')
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   277
    sys.stderr.write(msg % (rtype, parent_eid, eid))
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   278
    notify_fixed(fix)
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   279
1802
d628defebc17 delete-trailing-whitespace + some copyright update
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents: 1398
diff changeset
   280
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   281
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   282
def check_relations(schema, cnx, eids, fix=1):
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   283
    """check that eids referenced by relations are registered in the repo system
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   284
    table
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   285
    """
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   286
    print('Checking relations')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   287
    for rschema in schema.relations():
11953
f24b115cca74 [server] Ignore computed relations in "relations" integrity checker
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11952
diff changeset
   288
        if rschema.final or rschema.rule or rschema.type in PURE_VIRTUAL_RTYPES:
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   289
            continue
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   290
        if rschema.inlined:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   291
            for subjtype in rschema.subjects():
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   292
                table = SQL_PREFIX + str(subjtype)
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   293
                column = SQL_PREFIX + str(rschema)
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   294
                sql = 'SELECT cw_eid,%s FROM %s WHERE %s IS NOT NULL;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   295
                    column, table, column)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   296
                cursor = cnx.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   297
                for row in cursor.fetchall():
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   298
                    parent_eid, eid = row
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   299
                    if not has_eid(cnx, cursor, eid, eids):
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   300
                        bad_inlined_msg(rschema, parent_eid, eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   301
                        if fix:
3374
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   302
                            sql = 'UPDATE %s SET %s=NULL WHERE %s=%s;' % (
d5bd1b659ce8 [db-check] fix sql to fix bad eid referenced by inlined relation
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2596
diff changeset
   303
                                table, column, column, eid)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   304
                            cnx.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   305
            continue
6185
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   306
        try:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   307
            cursor = cnx.system_sql('SELECT eid_from FROM %s_relation;' % rschema)
8695
358d8bed9626 [toward-py3k] rewrite to "except AnException as exc:" (part of #2711624)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 8694
diff changeset
   308
        except Exception as ex:
6185
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   309
            # usually because table doesn't exist
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   310
            print('ERROR', ex)
6185
229006accd26 [c-c db-check] skip error while checking relation, useful when analyzing really broken database (after a migration failure for instance
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6132
diff changeset
   311
            continue
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   312
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   313
            eid = row[0]
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   314
            if not has_eid(cnx, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   315
                bad_related_msg(rschema, 'subject', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   316
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   317
                    sql = 'DELETE FROM %s_relation WHERE eid_from=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   318
                        rschema, eid)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   319
                    cnx.system_sql(sql)
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   320
        cursor = cnx.system_sql('SELECT eid_to FROM %s_relation;' % rschema)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   321
        for row in cursor.fetchall():
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   322
            eid = row[0]
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   323
            if not has_eid(cnx, cursor, eid, eids):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   324
                bad_related_msg(rschema, 'object', eid, fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   325
                if fix:
380
06e7f2932afe db-check fixes
sylvain.thenault@logilab.fr
parents: 0
diff changeset
   326
                    sql = 'DELETE FROM %s_relation WHERE eid_to=%s;' % (
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   327
                        rschema, eid)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   328
                    cnx.system_sql(sql)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   329
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   330
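check_relations above scans one column of a relation table and deletes the rows whose eid is unknown, building each DELETE with "%" string formatting. The sketch below shows the same scan-then-delete loop with the value bound as a query parameter. It is an illustration under assumptions, not the module's code: delete_dangling is a hypothetical helper, a plain set replaces the has_eid cache, and the sqlite3 "?" placeholder style is used, which differs between database drivers.

```python
import sqlite3


def delete_dangling(conn, table, column, known_eids):
    """Delete rows of `table` whose `column` value is not a known eid.

    Returns the list of dangling eids that were removed.
    """
    # Identifiers cannot be bound as parameters, so validate them before
    # interpolating them into the statement.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError('unexpected identifier: %r.%r' % (table, column))
    rows = conn.execute(
        'SELECT DISTINCT %s FROM %s ORDER BY %s' % (column, table, column)
    ).fetchall()
    removed = []
    for (eid,) in rows:
        if eid not in known_eids:
            # The value is bound with a "?" placeholder instead of "%" formatting.
            conn.execute('DELETE FROM %s WHERE %s=?' % (table, column), (eid,))
            removed.append(eid)
    return removed
```

Calling the helper once for eid_from and once for eid_to mirrors the two passes of the original loop over each relation table.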
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   331
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   332
def check_mandatory_relations(schema, cnx, eids, fix=1):
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   333
    """check entities missing some mandatory relation"""
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   334
    print('Checking mandatory relations')
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   335
    msg = '%s #%s is missing mandatory %s relation %s (autofix will delete the entity)'
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   336
    for rschema in schema.relations():
8527
6151b0199bc1 [c-c db-check] don't skip is/is_instance_of mandatory relation to avoid telling two opposite messages when the relation is missing. Closes #2465659
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 8450
diff changeset
   337
        if rschema.final or rschema in PURE_VIRTUAL_RTYPES or rschema in ('is', 'is_instance_of'):
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   338
            continue
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   339
        smandatory = set()
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   340
        omandatory = set()
10663
54b8a1f249fb [py3k] dict.itervalues → dict.values
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10651
diff changeset
   341
        for rdef in rschema.rdefs.values():
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   342
            if rdef.cardinality[0] in '1+':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   343
                smandatory.add(rdef.subject)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   344
            if rdef.cardinality[1] in '1+':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   345
                omandatory.add(rdef.object)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   346
        for role, etypes in (('subject', smandatory), ('object', omandatory)):
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   347
            for etype in etypes:
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   348
                if role == 'subject':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   349
                    rql = 'Any X WHERE NOT X %s Y, X is %s' % (rschema, etype)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   350
                else:
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   351
                    rql = 'Any X WHERE NOT Y %s X, X is %s' % (rschema, etype)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   352
                for entity in cnx.execute(rql).entities():
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8696
diff changeset
   353
                    sys.stderr.write(msg % (entity.cw_etype, entity.eid, role, rschema))
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   354
                    if fix:
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   355
                        entity.cw_delete()  # XXX this is BRUTAL!
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   356
                    notify_fixed(fix)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   357
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   358
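# Illustrative sketch, not part of checkintegrity.py: how the cardinality test in
# check_mandatory_relations above splits entity types into "mandatory as subject"
# and "mandatory as object". `RDef` and `mandatory_types` are hypothetical stand-ins
# for the real schema objects. The only thing taken from the code above is that
# `cardinality` is a two-character string (subject side first, object side second)
# and that '1' and '+' mean "at least one".

```python
from collections import namedtuple

# Hypothetical stand-in for a relation definition; not the real schema class.
RDef = namedtuple('RDef', 'subject object cardinality')


def mandatory_types(rdefs):
    """Return (subject types, object types) for which the relation is mandatory."""
    smandatory, omandatory = set(), set()
    for rdef in rdefs:
        # first character: cardinality seen from the subject side
        if rdef.cardinality[0] in '1+':
            smandatory.add(rdef.subject)
        # second character: cardinality seen from the object side
        if rdef.cardinality[1] in '1+':
            omandatory.add(rdef.object)
    return smandatory, omandatory


# Made-up relation definitions, for illustration only.
rdefs = [RDef('Ticket', 'Project', '1*'), RDef('Comment', 'Ticket', '?+')]
print(mandatory_types(rdefs))
```

# Each type collected this way is then queried for entities lacking the relation
# in the corresponding role, as the RQL strings above do.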
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   359
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   360
def check_mandatory_attributes(schema, cnx, eids, fix=1):
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   361
    """check for entities stored in the system source missing some mandatory
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   362
    attribute
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   363
    """
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   364
    print('Checking mandatory attributes')
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   365
    msg = '%s #%s is missing mandatory attribute %s (autofix will delete the entity)'
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   366
    for rschema in schema.relations():
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   367
        if not rschema.final or rschema in VIRTUAL_RTYPES:
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   368
            continue
10663
54b8a1f249fb [py3k] dict.itervalues → dict.values
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10651
diff changeset
   369
        for rdef in rschema.rdefs.values():
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   370
            if rdef.cardinality[0] in '1+':
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   371
                rql = 'Any X WHERE X %s NULL, X is %s, X cw_source S, S name "system"' % (
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   372
                    rschema, rdef.subject)
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   373
                for entity in cnx.execute(rql).entities():
8900
010a59e12d89 use cw_etype instead of __regid__
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 8696
diff changeset
   374
                    sys.stderr.write(msg % (entity.cw_etype, entity.eid, rschema))
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   375
                    if fix:
7682
07b592a62be3 [c-c db-check] drop some deprecation warning
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7473
diff changeset
   376
                        entity.cw_delete()
7036
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   377
                    notify_fixed(fix)
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   378
63386b35ec69 [c-c db-check] new checks for entities missing a mandatory relation/attribute
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 7035
diff changeset
   379
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   380
@_checker
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   381
def check_metadata(schema, cnx, eids, fix=1):
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   382
    """check that entities have required metadata
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   383
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   384
    FIXME: rewrite using RQL queries ?
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   385
    """
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   386
    print('Checking metadata')
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   387
    cursor = cnx.system_sql("SELECT DISTINCT type FROM entities;")
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   388
    eidcolumn = SQL_PREFIX + 'eid'
7965
d45c0eb39e72 [server] fix integrity checks that destroy data (closes #1972842)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7896
diff changeset
   389
    msg = '  %s with eid %s has no %s (autofix will set it to now)'
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   390
    for etype, in cursor.fetchall():
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   391
        if etype not in cnx.vreg.schema:
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   392
            sys.stderr.write('entities table references unknown type %s\n' %
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   393
                             etype)
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   394
            if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   395
                cnx.system_sql("DELETE FROM entities WHERE type = %(type)s",
9171
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   396
                                   {'type': etype})
be9596750678 Fix two crashes in db-check (closes #3024964)
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 8900
diff changeset
   397
            continue
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   398
        table = SQL_PREFIX + etype
11033
63d860a14a17 [schema] Use TZDatetime for creation_date and modification_date
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 10847
diff changeset
   399
        for rel, default in (('creation_date', datetime.utcnow()),
63d860a14a17 [schema] Use TZDatetime for creation_date and modification_date
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 10847
diff changeset
   400
                             ('modification_date', datetime.utcnow())):
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   401
            column = SQL_PREFIX + rel
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   402
            cursor = cnx.system_sql("SELECT %s FROM %s WHERE %s IS NULL"
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   403
                                        % (eidcolumn, table, column))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   404
            for eid, in cursor.fetchall():
7896
4c954e1e73ef [lint] remove uses of "print >> sys.stderr" (closes #1908571)
Nicolas Chauvat <nicolas.chauvat@logilab.fr>
parents: 7815
diff changeset
   405
                sys.stderr.write(msg % (etype, eid, rel))
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   406
                if fix:
9577
c52441e4a3d7 [serverctl] use repoapi for db-check, add-source, rebuild-fti commands
Julien Cristau <julien.cristau@logilab.fr>
parents: 9574
diff changeset
   407
                    cnx.system_sql("UPDATE %s SET %s=%%(v)s WHERE %s=%s ;"
1251
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   408
                                       % (table, column, eidcolumn, eid),
af40e615dc89 introduce a 'cw_' prefix on entity table and column names so we don't conflict with sql or DBMS specific keywords
sylvain.thenault@logilab.fr
parents: 1161
diff changeset
   409
                                       {'v': default})
7035
8d2cf36bd79d [c-c db-check] factorize code by introducing notify_fixed dumb function
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 6889
diff changeset
   410
                notify_fixed(fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   411
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   412
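# Illustrative sketch, not part of checkintegrity.py: the "find rows with a NULL
# date, then set it to now" pattern used by check_metadata above, reproduced against
# an in-memory SQLite database so it can run standalone. The table `cw_Thing`, the
# helper `fix_missing_dates` and the use of the standard `sqlite3` module are
# assumptions made for the example; the real code goes through `cnx.system_sql`.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE cw_Thing (cw_eid INTEGER PRIMARY KEY, cw_creation_date TEXT)')
conn.executemany('INSERT INTO cw_Thing VALUES (?, ?)',
                 [(1, '2019-03-21T14:33:54+00:00'), (2, None)])


def fix_missing_dates(conn, table, column, eidcolumn='cw_eid', fix=True):
    """Return eids whose `column` is NULL, setting it to now when `fix` is true."""
    # Identifiers cannot be bound as parameters, so they are interpolated here;
    # they must come from a trusted source such as the schema.
    cursor = conn.execute('SELECT %s FROM %s WHERE %s IS NULL' % (eidcolumn, table, column))
    eids = [eid for eid, in cursor.fetchall()]
    if fix:
        now = datetime.now(timezone.utc).isoformat()
        for eid in eids:
            # Values, unlike identifiers, are passed as bound parameters.
            conn.execute('UPDATE %s SET %s=:v WHERE %s=:eid' % (table, column, eidcolumn),
                         {'v': now, 'eid': eid})
    return eids


print(fix_missing_dates(conn, 'cw_Thing', 'cw_creation_date'))
```

# With fix=False the helper only reports the offending eids, which mirrors the
# diagnostic-only mode of the checker.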
4675
9233a8350420 [test] don't display progress bar when testing checkintegrity
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4252
diff changeset
   413
def check(repo, cnx, checks, reindex, fix, withpb=True):
2476
1294a6bdf3bf application -> instance where it makes sense
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 2248
diff changeset
   414
    """check integrity of instance's repository,
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   415
    using the given connection to access the repository locally
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   416
    (no running cubicweb server needed)
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   417
    """
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   418
    # run the requested checks
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   419
    if checks:
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   420
        eids_cache = {}
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   421
        with cnx.security_enabled(read=False, write=False):  # no read/write security
4835
13b0b96d7982 [repo] enhanced security handling: deprecates unsafe_execute, in favor of explicit read/write security control using the `enabled_security` context manager. Also code executed on the repository side is now unsafe by default.
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 4834
diff changeset
   422
            for check in checks:
11952
9dbb0af82628 [server] Use global registry variable of integrity checker functions
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11894
diff changeset
   423
                check_func = _CHECKERS[check]
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   424
                check_func(repo.schema, cnx, eids_cache, fix=fix)
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   425
        if fix:
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   426
            cnx.commit()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   427
        else:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   428
            print()
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   429
        if not fix:
10589
7c23b7de2b8d [py3k] print function
Samuel Trégouët <samuel.tregouet@logilab.fr>
parents: 10365
diff changeset
   430
            print('WARNING: Diagnostic run, nothing has been corrected')
0
b97547f5f1fa Showtime !
Adrien Di Mascio <Adrien.DiMascio@logilab.fr>
parents:
diff changeset
   431
    if reindex:
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   432
        cnx.rollback()
10365
21461f80f348 [connection] remove ensure_cnx_set context manager uses
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9797
diff changeset
   433
        reindex_entities(repo.schema, cnx, withpb=withpb)
9797
4e640ab62f51 [tests/checkintegrity] use the new connection api
Aurelien Campeas <aurelien.campeas@logilab.fr>
parents: 9578
diff changeset
   434
        cnx.commit()
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   435
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   436
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   437
SYSTEM_INDEXES = {
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   438
    # see cw/server/sources/native.py
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   439
    'transactions_tx_time_idx': ('transactions', 'tx_time'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   440
    'transactions_tx_user_idx': ('transactions', 'tx_user'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   441
    'tx_entity_actions_txa_action_idx': ('tx_entity_actions', 'txa_action'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   442
    'tx_entity_actions_txa_public_idx': ('tx_entity_actions', 'txa_public'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   443
    'tx_entity_actions_eid_idx': ('tx_entity_actions', 'txa_eid'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   444
    'tx_entity_actions_etype_idx': ('tx_entity_actions', 'txa_etype'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   445
    'tx_entity_actions_tx_uuid_idx': ('tx_entity_actions', 'tx_uuid'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   446
    'tx_relation_actions_txa_action_idx': ('tx_relation_actions', 'txa_action'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   447
    'tx_relation_actions_txa_public_idx': ('tx_relation_actions', 'txa_public'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   448
    'tx_relation_actions_eid_from_idx': ('tx_relation_actions', 'eid_from'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   449
    'tx_relation_actions_eid_to_idx': ('tx_relation_actions', 'eid_to'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   450
    'tx_relation_actions_tx_uuid_idx': ('tx_relation_actions', 'tx_uuid'),
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   451
}
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   452
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   453
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   454
def expected_indexes(cnx):
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   455
    """Return a dictionary describing indexes expected by the schema {index name: (table, column)}.
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   456
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   457
    This doesn't include primary key indexes.
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   458
    """
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   459
    source = cnx.repo.system_source
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   460
    dbh = source.dbhelper
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   461
    schema = cnx.repo.schema
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   462
    schema_indexes = SYSTEM_INDEXES.copy()
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   463
    if source.dbdriver == 'postgres':
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   464
        schema_indexes.update({'appears_words_idx': ('appears', 'words')})
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   465
    else:
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   466
        schema_indexes.update({'appears_uid': ('appears', 'uid'),
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   467
                               'appears_word_id': ('appears', 'word_id')})
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   468
    for rschema in schema.relations():
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   469
        if rschema.rule or rschema in PURE_VIRTUAL_RTYPES:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   470
            continue  # computed relation
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   471
        if rschema.final or rschema.inlined:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   472
            for rdef in rschema.rdefs.values():
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   473
                table = 'cw_{0}'.format(rdef.subject)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   474
                column = 'cw_{0}'.format(rdef.rtype)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   475
                if any(isinstance(cstr, UniqueConstraint) for cstr in rdef.constraints):
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   476
                    schema_indexes[dbh._index_name(table, column, unique=True)] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   477
                        table, [column])
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   478
                if rschema.inlined or rdef.indexed:
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   479
                    schema_indexes[dbh._index_name(table, column)] = (table, [column])
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   480
        else:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   481
            table = '{0}_relation'.format(rschema)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   482
            if source.dbdriver == 'postgres':
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   483
                # index built after the primary key constraint
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   484
                schema_indexes[build_index_name(table, ['eid_from', 'eid_to'], 'key_')] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   485
                    table, ['eid_from', 'eid_to'])
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   486
            schema_indexes[build_index_name(table, ['eid_from'], 'idx_')] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   487
                table, ['eid_from'])
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   488
            schema_indexes[build_index_name(table, ['eid_to'], 'idx_')] = (
11361
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   489
                table, ['eid_to'])
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   490
    for eschema in schema.entities():
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   491
        if eschema.final:
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   492
            continue
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   493
        table = 'cw_{0}'.format(eschema)
5a857bba1b79 [c-c] Add a command to check database index
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11057
diff changeset
   494
        for columns, index_name in iter_unique_index_names(eschema):
11893
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   495
            schema_indexes[index_name] = (table, columns)
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   496
b230f4a4010d Extract functions to list expected/found database indexes from check_indexes
Sylvain Thénault <sylvain.thenault@logilab.fr>
parents: 11776
diff changeset
   497
    return schema_indexes


def database_indexes(cnx):
    """Return a set of indexes found in the database, excluding primary key indexes."""
    source = cnx.repo.system_source
    dbh = source.dbhelper
    if source.dbdriver == 'postgres':

        def index_filter(idx):
            return not (idx.startswith('pg_') or '_pkey' in idx or '_p_key' in idx
                        or idx.endswith('_key'))
    else:

        def index_filter(idx):
            return not idx.startswith('sqlite_')

    return set(idx for idx in dbh.list_indices(cnx.cnxset.cu)
               if index_filter(idx))
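
# The two filters in database_indexes() can be tried without a database.
# The sketch below copies both predicates into standalone functions and runs
# them over a list of index names. The function names and the sample index
# names are hypothetical, chosen only to exercise each branch.

```python
def postgres_index_filter(idx):
    # Same predicate as the postgres branch: drop system indexes (pg_*),
    # primary key indexes (*_pkey*, *_p_key*) and constraint indexes (*_key).
    return not (idx.startswith('pg_') or '_pkey' in idx or '_p_key' in idx
                or idx.endswith('_key'))


def sqlite_index_filter(idx):
    # Same predicate as the fallback branch: drop SQLite internal indexes.
    return not idx.startswith('sqlite_')


# Hypothetical index names, not taken from a real instance.
sample = [
    'pg_toast_123_index',
    'entities_pkey',
    'cw_user_login_key',
    'cw_user_login_idx',
    'sqlite_autoindex_entities_1',
]

kept_by_postgres_filter = sorted(i for i in sample if postgres_index_filter(i))
kept_by_sqlite_filter = sorted(i for i in sample if sqlite_index_filter(i))
print(kept_by_postgres_filter)
print(kept_by_sqlite_filter)
```

# Each filter only knows the naming conventions of its own backend, so the
# postgres filter keeps the 'sqlite_' name and the sqlite filter keeps the
# '_pkey' and '_key' names.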


def check_indexes(cnx):
    """Check indexes of a system database: output missing expected indexes as well as unexpected ones.

    Return 0 if there are no differences, else 1.
    """
    schema_indexes = expected_indexes(cnx)
    db_indexes = database_indexes(cnx)

    missing_indexes = set(schema_indexes) - db_indexes
    if missing_indexes:
        print(underline_title('Missing indexes'))
        print('index expected by the schema but not found in the database:\n')
        missing = ['{0} ON {1[0]} {1[1]}'.format(idx, schema_indexes[idx])
                   for idx in missing_indexes]
        print('\n'.join(sorted(missing)))
        print()
        status = 1

    additional_indexes = db_indexes - set(schema_indexes)
    if additional_indexes:
        print(underline_title('Additional indexes'))
        print('index in the database but not expected by the schema:\n')
        print('\n'.join(sorted(additional_indexes)))
        print()
        status = 1

    if not (missing_indexes or additional_indexes):
        print('Everything is Ok')
        status = 0

    return status
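

# The comparison done by check_indexes() comes down to two set differences.
# The sketch below isolates that step from the connection and from the
# printing; compare_indexes and the sample data are hypothetical and are not
# part of CubicWeb.

```python
def compare_indexes(schema_indexes, db_indexes):
    """Return (missing, additional, status) for the two index collections.

    schema_indexes maps an index name to a (table, columns) pair, as in
    expected_indexes(); db_indexes is a set of index names, as in
    database_indexes().
    """
    expected = set(schema_indexes)
    missing = sorted(expected - db_indexes)      # expected but absent
    additional = sorted(db_indexes - expected)   # present but unexpected
    status = 1 if (missing or additional) else 0
    return missing, additional, status


# Hypothetical data: one index missing from the database, one unexpected.
schema_indexes = {
    'idx_a': ('cw_a', ['eid_from']),
    'idx_b': ('cw_b', ['eid_to']),
}
db_indexes = {'idx_b', 'idx_legacy'}

missing, additional, status = compare_indexes(schema_indexes, db_indexes)
print(missing, additional, status)
```

# Returning the two lists instead of printing them makes the comparison easy
# to test; a caller can still format them as check_indexes() does.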