docs/unstability.rst
author Pierre-Yves.David@ens-lyon.org
Wed, 09 May 2012 13:08:46 +0200
changeset 227 abe52cf492ee
doc: several update and review.


-----------------------------------
The Unstability Principle
-----------------------------------



An intrinsic contradiction
-----------------------------------

XXX start by talking about getting rid of changesets.

DVCSs bring two major new concepts to the version control scene:

    * Organisation of the history as a robust DAG,
    * Mutation of history.


However, these two concepts oppose each other:

To achieve a robust history, three key elements are gathered in a *changeset*:

    * A full snapshot of the versioned content,
    * A reference to the previous full snapshot used to build the new one,
    * A description of the change that leads from the old content to the new one.

All three elements, along with various other metadata, are used to generate a
*unique* hash that identifies the changeset. This identification is a key part
of DVCS design.

XXX missing lines ?

::

  Schema base,  A, B and B'

The old changeset is usually discarded from the DVCS history.


::

  Schema base,  A and A' and B.

Rewriting a changeset that has children does not change the children's parent!
And because the children of the rewritten changeset still **depend** on the
older "dead" version of the changeset, we cannot get rid of this dead version.

This is a very useful property, because changing B's parent means changing B's
content too. That requires the creation of **another** changeset.

I'll call those children **unstable** because they are based on a dead
changeset and prevent people from getting rid of it.

This instability is an **unavoidable consequence** of the strict dependency
between changesets. History rewriting always needs to take it into account and
provide a way to rewrite the descendants on top of the new changeset, to avoid
the coexistence of the old and new versions of a rewritten changeset.
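
The dependency described above can be illustrated with a toy model. This is
only a sketch: ``changeset_id`` is a hypothetical helper that hashes the three
elements listed earlier, not the way Mercurial actually computes its
identifiers::

    import hashlib


    def changeset_id(parent_id, content, description):
        """Toy identifier: a SHA-1 over parent, content and description."""
        data = "\0".join([parent_id, content, description]).encode("utf-8")
        return hashlib.sha1(data).hexdigest()[:12]


    base = changeset_id("", "v0", "base")
    a = changeset_id(base, "v1", "add feature")
    b = changeset_id(a, "v2", "use feature")

    # Rewriting A (here: only its description) yields a different identifier.
    a_prime = changeset_id(base, "v1", "add feature, typo fixed")

    # B still records the old A as its parent, so B is "unstable".
    # Moving B on top of A' changes B's identifier too: it is another changeset.
    b_prime = changeset_id(a_prime, "v2", "use feature")

    print(a != a_prime)  # True
    print(b != b_prime)  # True

In this model the only way to stop depending on the old A is to create B',
which is exactly the extra rewrite the text describes.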


Everybody is working around the issue
------------------------------------------------

I'm not claiming that rewriting history is impossible. People have been doing
it successfully for years. However, they all need to work around this
instability:



Rewriting all at once
``````````````````````````

The simplest way to avoid instability is to ensure that a rewriting operation
always ends in a stable situation. This is achieved by rewriting all impacted
changesets at the same time: all descendants are rewritten together with the
changeset itself.

::

  Schema!

Several Mercurial commands follow this idea: rebase, collapse, histedit.
Mercurial also refuses to amend a changeset that has descendants. The git
branch design enforces this approach in git too.
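
As a rough illustration of "rewriting all at once", the sketch below recomputes
a whole linear history after one changeset is modified. ``node_id`` and
``chain_ids`` are hypothetical helpers for this toy model; they do not reflect
how rebase or histedit are implemented::

    import hashlib


    def node_id(parent, content):
        """Toy identifier that depends on the parent identifier."""
        return hashlib.sha1(f"{parent}\0{content}".encode("utf-8")).hexdigest()[:12]


    def chain_ids(contents):
        """Return the identifiers of a linear history, oldest first."""
        ids, parent = [], ""
        for content in contents:
            parent = node_id(parent, content)
            ids.append(parent)
        return ids


    old = chain_ids(["base", "A", "B", "C"])
    new = chain_ids(["base", "A'", "B", "C"])

    # Only the changesets before the rewritten one keep their identifier.
    print([o == n for o, n in zip(old, new)])  # [True, False, False, False]

Every descendant of the rewritten changeset gets a new identifier in the same
operation, so no changeset is left sitting on the old version.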


However, DVCSs are **distributed**. This means that you do not control what
happens outside your repository. Once a changeset has been exchanged
*outside*, you can't be sure of its descendants. Therefore, **if you rewrite a
changeset that exists elsewhere, you can't eradicate the risk of
instability.**

Do not rewrite exchanged changesets
```````````````````````````````````

To work around this issue, Mercurial introduced phases, which prevent you from
rewriting exchanged changesets and ensure that others can't pull certain
changesets from you. But this is a very frustrating limitation that prevents
you from efficiently sharing, reviewing and collaborating on mutable
changesets.
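
The rule enforced by phases can be sketched as follows. The phase names
(public, draft, secret) are Mercurial's; the helper functions are hypothetical
and only model the behaviour described above, assuming a publishing
repository on the other side::

    PUBLIC, DRAFT, SECRET = "public", "draft", "secret"


    def can_rewrite(phase):
        """Public changesets have been exchanged, so they are immutable."""
        return phase != PUBLIC


    def can_exchange(phase):
        """Secret changesets are never pushed or pulled."""
        return phase != SECRET


    def phase_after_push(phase):
        """Pushing to a publishing repository makes a changeset public."""
        if not can_exchange(phase):
            raise ValueError("secret changesets are not exchanged")
        return PUBLIC


    phase = DRAFT
    print(can_rewrite(phase))                    # True
    print(can_rewrite(phase_after_push(phase)))  # False

Once shared this way, a changeset can no longer be rewritten, which is the
limitation the paragraph above complains about.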

The git world uses another approach to prevent instability. By convention,
only a single developer works on the changesets contained in a named branch.
But once again, this is a huge blocker for collaboration, and clueless people
**will** break the social convention sooner or later.


Lose the DAG robustness
````````````````````````````

The other approach used in Mercurial is to keep the mutable part of the
history outside the DVCS constraints. This is the MQ approach of sticking a
quilt queue on top of Mercurial.

This allows a much more flexible workflow, but two major features are lost in
the process:

  * Graceful merge. MQ uses plain patches to store changeset content, and
    patches have trouble applying in a changing context. Applying your queue
    can become very painful if the context changes.

  * Easy branching. A quilt queue is by definition a linear queue.

It is possible to collaborate over a versioned MQ! But you are heading for a
lot of trouble.

.. Ignore conflicts
.. ```````````````````````````````````
.. 
.. Another ignored issue is conflicting rewriting of the same changeset. If a
.. changeset is rewritten twice, we have two newer versions: duplicated history
.. that is complicated to merge.
.. 
.. Mercurial work around by
.. 
.. The "One set of mutable changesets == One developer" mantra is also a way to work
.. around conflicting rewriting of changesets. If two different people are able to
.. 
.. The git branch model allows one changeset version to be overwritten by another. But it
.. does not care about divergent versions. It is the equivalent of "common ftp" source
.. management for changesets.

Facing The Danger Once And For All
------------------------------------------------
The more effort you put into avoiding instability, the more options you deny
yourself. And even the most restrictive workflow can't guarantee that
instability will never show up!

Obsolete marker can handle the job
```````````````````````````````````

It is time to provide a full-featured solution to deal with instability and to
stop working around the issue! This is why I am developing a new feature for
Mercurial called "obsolete markers". Obsolete markers have two key properties:


* Any changeset we want to get rid of is **explicitly** marked as "obsolete"
  by history rewriting operations.

  By explicitly marking the obsolete part of the history, we will be able to
  easily detect the appearance of instability.

* Relations between old and new versions of changesets are tracked by obsolete
  markers.

  By storing a meta-history of changeset evolution, we are able to easily
  resolve instability and edition conflicts [#]_ .

.. [#] Edition conflicts are another major obstacle to collaboration. See the
       section dedicated to obsolete markers for details.
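
The two properties above can be sketched on a toy history. Everything here is
an assumption made for illustration: markers are modelled as plain
``(old, new)`` pairs, the history is linear, and divergent rewrites of the same
changeset are ignored. It is not the storage format or the algorithm used by
Mercurial::

    def ancestors(parents, node):
        """Yield every ancestor of ``node`` (single-parent history only)."""
        node = parents.get(node)
        while node is not None:
            yield node
            node = parents.get(node)


    def unstable(parents, markers):
        """Return the non-obsolete changesets that have an obsolete ancestor."""
        obsolete = {old for old, _new in markers}
        return sorted(
            n for n in parents
            if n not in obsolete
            and any(a in obsolete for a in ancestors(parents, n))
        )


    def latest_successor(markers, node):
        """Follow the markers from ``node`` to its newest version."""
        successors = dict(markers)
        seen = set()
        while node in successors and node not in seen:
            seen.add(node)
            node = successors[node]
        return node


    # A was rewritten as A'; B is still based on the old A.
    parents = {"base": None, "A": "base", "B": "A", "A'": "base"}
    markers = [("A", "A'")]

    print(unstable(parents, markers))      # ['B']
    print(latest_successor(markers, "A"))  # A'

The first function corresponds to detecting instability, the second to finding
where an unstable changeset should be moved in order to resolve it.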

Improving robustness improves simplicity
````````````````````````````````````````````````

This proposal should **first** be seen as a safety measure.

It allows instability to be detected as soon as possible:

::

    $ hg pull
    added 3 changeset
    +2 unstable changeset
    (do you want "hg stabilize" ?)
    working directory parent is obsolete!
    $ hg push
    outgoing unstable changesets
    (use "hg stabilize" or force the push)

And it should not encourage people to create instability:

::

    $ hg up 42
    $ hg commit --amend
    changeset have descendant.
    $ hg commit --amend -f
    +5 unstable changeset

    $ hg rebase -D --rev 40::44
    rebasing already obsolete changeset 42:AAA will conflict with newer version 48:BBB

While allowing powerful features
````````````````````````````````````````````````

* "Kill" changesets remotely.

* Track the resulting changeset when submitting a patch or pull request.

* Focus on what you do:

  I do not like the "all at once" model of history rewriting. I'm comfortable
  with instability, and obsolete markers offer all the tools needed to safely
  create and handle instability locally.