[ldapfeed] FIX: Unique Key violation when synchronizing with LDAPfeed
author julien tayon <julien.tayon@logilab.fr>
Thu, 16 May 2019 14:26:38 +0200
branch 3.26
changeset 12618 3f125fdbcd70
parent 12617 f59439bac0a8
child 12625 ba5231e1aa45
[ldapfeed] FIX: Unique Key violation when synchronizing with LDAPfeed

What was happening
==================

The bug appears when ldapfeed tries to insert a user while another user with the same login already exists in a different source.

Simple use case to reproduce:

- create a local user in cubicweb (source=system)
- sync with ldap
- ldapfeed will stop, complaining that the user already exists.

Without the fix, the test MUST fail with the message:

cubicweb/server/sources/native.py:714: UniqueTogetherError

The ldapfeed thus stops, ignoring any further LDAP entries.

The proposal
============

Prior to this patch, the insertion tried to create every CWUser with the computed login from eeimporter.extid2eid (method process of DataFeedLDAPAdapter). When a CWUser existed with a different cw_source ("system" for a user created with cubicweb, for instance), this created a conflict.

To avoid collisions, a bypass was added at the insertion layer, in the extentities_generator method. Prior to insertion, the computed login is checked against the logins of all existing users from a different source. If a collision is detected, we skip the LDAP record.

By short-circuiting at the CWUser entity insertion level, we also avoid processing the CWGroup and EmailAddress related to this user, hence ensuring a behaviour that will not break existing instances (conservative approach: faced with ambiguity, better do nothing than guess).

An error message is added stating the conflict explicitly.
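
The skip-on-collision logic described in the proposal can be illustrated outside of CubicWeb. This is only a minimal sketch under stated assumptions: the function name filter_colliding_users, the logger name and the plain dicts are hypothetical stand-ins for the RQL result and for the attributes computed by ldap2cwattrs (where login is a list, simplified here to a string). It is not the code of the patch itself.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("ldapfeed_sketch")


def filter_colliding_users(ldap_users, non_managed_users):
    """Yield LDAP user dicts whose login is not owned by another source.

    ldap_users: iterable of dicts with a 'login' key (stand-in for the
        attributes computed from each LDAP record).
    non_managed_users: dict mapping login -> source name for users that
        are *not* managed by the LDAP source.
    """
    for attrs in ldap_users:
        login = attrs["login"]
        source = non_managed_users.get(login)
        if source is not None:
            # Conservative choice: report the conflict and skip the record
            # instead of guessing which user should win.
            logger.error(
                "not synchronizing user %s. User already exist in source %s",
                login, source)
            continue
        yield attrs


# 'syt' already exists in the 'system' source, 'adim' does not.
existing = {"syt": "system"}
incoming = [{"login": "syt"}, {"login": "adim"}]
kept = list(filter_colliding_users(incoming, existing))
```

Because the colliding record is dropped before anything is yielded for it, nothing that depends on that user (email addresses, groups) is processed either, which mirrors the short-circuit described above.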
cubicweb/server/test/unittest_ldapsource.py
cubicweb/sobjects/ldapparser.py
--- a/cubicweb/server/test/unittest_ldapsource.py	Fri Mar 15 12:12:23 2019 +0100
+++ b/cubicweb/server/test/unittest_ldapsource.py	Thu May 16 14:26:38 2019 +0200
@@ -256,6 +256,33 @@
                               source.authenticate, cnx, 'syt', 'toto')
             self.assertTrue(source.authenticate(cnx, 'syt', 'syt'))
 
+    def test_ldapfeed_insert_collision(self):
+        """
+        when a user computed login from ldapfeed collides with a CWUser
+        login the user MUST not be inserted, and message MUST be present
+        at error level regarding the collision for troubleshooting purpose.
+        We also check that in case the CWUser is skipped, the entity EmailAddress
+        is not modified.
+        If EmailAddress are not modified, CWGroup are not.
+        """
+        with self.admin_access.cnx() as cnx:
+            user = cnx.find('CWUser', login=u'syt').one()
+            user.cw_set(cw_source=cnx.find('CWSource', name=u'system').one())
+            with cnx.security_enabled(write=False):
+                user.cw_set(cwuri=u'http://testing.fr/cubicweb/{}'.format(user.eid))
+                for mail in user.use_email:
+                    mail.cw_set(address=mail.address[:-3] + u".net")
+            cnx.commit()
+            with self.assertLogs('cubicweb.appobject', level='ERROR') as cm:
+                self.pull(cnx)
+                self.assertEqual(
+                    cm.output,
+                    ['ERROR:cubicweb.appobject:not synchronizing user syt.'
+                     ' User already exist in source system']
+                )
+            for mail in user.use_email:
+                self.assertTrue(mail.address.endswith(".net"))
+
     def test_base(self):
         with self.admin_access.repo_cnx() as cnx:
             # check a known one
--- a/cubicweb/sobjects/ldapparser.py	Fri Mar 15 12:12:23 2019 +0100
+++ b/cubicweb/sobjects/ldapparser.py	Thu May 16 14:26:38 2019 +0200
@@ -156,9 +156,26 @@
 
     def extentities_generator(self):
         self.debug('processing ldapfeed source %s %s', self.source, self.searchgroupfilterstr)
+        # get existing users *not* managed by this source
+        non_managed_users = dict(self._cw.execute(
+            'Any L, SN WHERE'
+            ' U is CWUser, U login L,'
+            ' U cw_source S, S name SN,'
+            ' NOT S eid %(eid)s',
+            {'eid': self.source.eid}, build_descr=False))
         # generate users and email addresses
         for userdict in self.user_source_entities_by_extid.values():
             attrs = self.ldap2cwattrs(userdict, 'CWUser')
+            login = attrs['login'][0]
+            try:
+                source = non_managed_users[login]
+            except KeyError:
+                pass
+            else:
+                self.error(
+                    'not synchronizing user %s. User already exist in source %s',
+                    login, source)
+                continue
             pwd = attrs.get('upassword')
             if not pwd:
                 # generate a dumb password if not fetched from ldap (see