.. Copyright 2011 Pierre-Yves David <pierre-yves.david@ens-lyon.org>
..                Logilab SA        <contact@logilab.fr>

-----------------------------------------------------------
Why Do We Need a New Concept
-----------------------------------------------------------
       
Current DVCSes are great tools for forging a series of flawless
changesets on your own. But they perform poorly when it comes to
**sharing** some work in progress and **collaborating** on such work
in progress.

When people forge a new version of a changeset, they actually create a
new changeset and get rid of the original one. Difficulties in
collaborating mostly come from the way old content is *removed* from
a repository.
       
Mercurial Approach: Strip
-----------------------------------------------------

With the current version of Mercurial, every changeset that exists in
your repository is *visible* and *meaningful*. To delete old
(rewritten) changesets, Mercurial removes them from the repository
storage with an operation called *strip*. After the *stripping*, the
repository looks as if the changeset never existed.

This approach is simple and effective except for one big
drawback: you can remove changesets from **your repository only**. If
a stripped changeset still exists in another repository that yours
exchanges with, it will show up again. This is because a shared
changeset becomes part of a shared global history. Stripping a
changeset from all repositories is at best impractical and in most
cases impossible.

As a consequence, **you cannot rewrite something once you have
exchanged it with others**. The old version will still exist alongside
the new one [#]_.

Moreover, stripping changesets creates backup bundles. This allows
restoration of the deleted changesets, but the process is painful.

Finally, as the repository format is not optimized for deletion,
stripping a changeset may be slow in some situations.

To sum up, the strip approach is very simple but does not handle
interaction with the outer world, which is very unfortunate for a
*Distributed* VCS.

.. [#] Various workarounds exist, but they require their own workflows,
   which are distinct from the very elegant basic workflow of
   Mercurial.
       
Git Approach: Overwrite Reference
-----------------------------------------------------

The Git approach to repository structure is a bit more complex: there
can be any number of unrelated changesets in a repository, and **only
changesets referenced by a git branch** are *visible* and
*meaningful*.

.. figure:: ./figures/git.*

This simplifies the process of getting rid of old changesets. You can
just leave them in place and move the reference to the new one. You
can then propagate this change by moving the git-branch on the remote
host, with the newer version of the reference overwriting the older one.

This approach goes a bit further but still has a major drawback:

Because you **overwrite** the git-branch, you have no conflict
resolution. The last to act wins. This makes collaboration on multiple
changesets difficult because you can't merge concurrent updates on a
changeset.

Every overwrite is a forced operation where the operator says, "yes I
want this to replace that". In highly distributed environments, a user
may end up with conflicting references and no proper way to choose.

Because of this way of visualizing a repository, git-branches are a core
part of git, which makes the user interface more complicated and
constrains moving through history.

Finally, even if all older changesets still exist in the repository,
accessing them is still painful.
       
-----------------------------------------------------
The Obsolete Marker Concept
-----------------------------------------------------

As neither of these concepts is powerful enough to fulfill the need for
safely rewriting history, including easy sharing and collaboration on
mutable history, we needed another one.
       
Basic concept
-----------------------------------------------------

Every history rewriting operation records that an old, rewritten
changeset is replaced by a newer version, given as a set of changesets.

All basic history rewriting operations can create an appropriate
obsolete marker.

.. figure:: ./figures/example-1-update.*

    *Updating* a changeset

    Creates one obsolete marker: ``([A'] obsolete A)``

.. figure:: ./figures/example-2-split.*

    *Splitting* a changeset into multiple ones

    Creates one obsolete marker: ``([B1, B2] obsolete B)``

.. figure:: ./figures/simple-3-merge.*

    *Merging* multiple changesets into a single one

    Creates two obsolete markers: ``([C] obsolete A), ([C] obsolete B)``

.. figure:: ./figures/simple-4-reorder.*

    *Moving* changesets around

    Reordering those two changesets needs two obsolete markers:
    ``([A'] obsolete A), ([B'] obsolete B)``

.. figure:: ./figures/simple-5-delete.*

    *Removing* a changeset

    Creates one obsolete marker: ``([] obsolete B)``

To conclude, a single obsolete marker expresses a relation from **0..n** new
changesets to **1** old changeset.
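
The five cases above can be written down as plain data. The following is a
minimal sketch in Python, not Mercurial's actual storage format: the
``Marker`` type and the ``obsolete_set`` helper are names made up for this
illustration, and the changeset names are the labels used in the figures
rather than real hashes.

```python
from typing import NamedTuple, Tuple


class Marker(NamedTuple):
    """Hypothetical model of one obsolete marker."""
    new: Tuple[str, ...]  # 0..n changesets that replace the old one
    old: str              # exactly one replaced changeset


# One entry per figure; names are figure labels, not real changeset ids.
update = [Marker(("A'",), "A")]
split = [Marker(("B1", "B2"), "B")]
merge = [Marker(("C",), "A"), Marker(("C",), "B")]
reorder = [Marker(("A'",), "A"), Marker(("B'",), "B")]
delete = [Marker((), "B")]


def obsolete_set(markers):
    """Return every changeset that appears as the old side of a marker."""
    return {m.old for m in markers}
```

Under this model a split and a merge differ only in where the multiplicity
sits: a split is one marker with several ``new`` entries, a merge is several
markers sharing the same ``new`` entry.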
       
Basic Usage
-----------------------------------------------------

Obsolete markers create a perpendicular history: **a versioned
changeset graph**. This means they offer the same features we have
for versioned files, but applied to changesets:

First, we can display a **coherent view** of the history graph in which only a
single version of your changesets is displayed by the UI.

Second, because the content of obsolete changesets is still **available**,
you can

    * **browse** the content of your obsolete commits,

    * **compare** newer and older versions of a changeset,

    * **restore** content of previously obsolete changesets.

Finally, obsolete markers can be **exchanged between
repositories**. You are able to share the result of your history
rewriting operations with other people and **collaborate on the
mutable part of the history**.

Conflicting history rewriting operations can be detected and
**resolved** as easily as conflicting changes on a file.
       
Detecting and solving tricky situations
-----------------------------------------------------

History rewriting can lead to complex situations. The obsolete marker
introduces a simple representation for this complex reality. But
people using complex workflows will one day or another have to face
the intrinsic complexity of some real-world situations.

This section describes possible situations, defines the precise sets of
changesets involved in such situations and explains how the error
cases can be resolved automatically using the available information.
       
Obsolete changesets
````````````````````

Old changesets left behind by a history rewriting operation are called
**obsolete**.

With the current version of Mercurial, this *obsolete* part is stripped from the
repository before the end of every rewriting operation.

.. figure:: ./figures/error-obsolete.*

    Rebasing `B` and `C` on `A` (as `B'`, `C'`)

    This rebase operation added two obsolete markers from new
    changesets to old changesets. These two old changesets are now
    part of the *obsolete* part of the history.

In most cases, the obsolete set will be fully hidden from both the UI and
discovery, hence users do not have to care about it unless they want to
audit history rewriting operations.
       
Unstable changesets
```````````````````

While exploring the possibilities of the obsolete marker a bit
further, you may end up with *obsolete* changesets which have
*non-obsolete* children. There are two common ways to achieve this:

* Pull a changeset based on an old version of a changeset [#]_.

* Use a partial rewriting operation, for example an amend on a changeset
  with children.

*Non-obsolete* changesets based on *obsolete* ones are called **unstable**.

.. figure:: ./figures/error-unstable.*

    Amend `A` into `A'` leaving `B` behind.

    In this situation we cannot consider `B` as *obsolete*. But we
    have all the necessary data to detect `B` as an *unstable* branch
    of the history because its parent `A` is *obsolete*. In addition,
    we have enough data to automatically resolve this instability: we
    know that the new version of the parent of `B` (`A`) is `A'`. We can
    deduce that we should rebase `B` on `A'` to get a stable history
    again.
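
The detection described in the figure can be sketched in a few lines of
Python. This is an illustration under assumptions, not Mercurial code: the
history is a hypothetical ``parents`` dictionary, markers are plain
``(new, old)`` tuples, and "based on an obsolete changeset" is read here as
"has at least one obsolete ancestor".

```python
def unstable_set(parents, markers):
    """Return non-obsolete changesets that have an obsolete ancestor.

    parents: dict mapping each changeset to the list of its parents.
    markers: iterable of (new, old) pairs, `new` being a tuple of names.
    Both shapes are assumptions made for this sketch.
    """
    obsolete = {old for _new, old in markers}
    memo = {}

    def has_obsolete_ancestor(node):
        # Memoized walk towards the roots of the graph.
        if node not in memo:
            memo[node] = any(
                parent in obsolete or has_obsolete_ancestor(parent)
                for parent in parents.get(node, ())
            )
        return memo[node]

    return {
        node for node in parents
        if node not in obsolete and has_obsolete_ancestor(node)
    }


# The situation of the figure: A amended into A', B left behind on A.
parents = {"O": [], "A": ["O"], "A'": ["O"], "B": ["A"]}
markers = [(("A'",), "A")]
print(unstable_set(parents, markers))  # {'B'}
```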
       
Proper warnings should be issued when part of the history becomes
unstable. The UI will be able to use the obsolete markers to
automatically suggest a resolution to the user, or even carry it out
for them.
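
As a rough illustration of how such a suggestion could be computed, the
sketch below follows markers from an obsolete parent to its newest version
and proposes it as a rebase destination. The function names and data shapes
are hypothetical, and only the simple one-to-one case is handled: deletions,
splits and conflicting rewrites make the sketch give up and return ``None``.

```python
def latest_successor(node, markers):
    """Follow one-to-one markers from `node`; return None when ambiguous."""
    successors = {}
    for new, old in markers:
        successors.setdefault(old, []).append(tuple(new))
    seen = set()
    while node in successors:
        if node in seen:  # guard against marker cycles
            return None
        seen.add(node)
        candidates = successors[node]
        if len(candidates) != 1 or len(candidates[0]) != 1:
            return None  # deletion, split or conflict: not handled here
        node = candidates[0][0]
    return node


def suggest_rebase(child, parents, markers):
    """Return (changeset, destination) for an unstable child, else None."""
    obsolete = {old for _new, old in markers}
    for parent in parents.get(child, ()):
        if parent in obsolete:
            target = latest_successor(parent, markers)
            if target is not None:
                return (child, target)
    return None


# A amended into A', B left behind: the suggestion is to rebase B on A'.
parents = {"O": [], "A": ["O"], "A'": ["O"], "B": ["A"]}
markers = [(("A'",), "A")]
suggestion = suggest_rebase("B", parents, markers)
```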
       
XXX details on automatic resolution for

* movement

* handling deletion

* handling split on multiple heads

.. [#] For this to happen one needs to explicitly enable exchange of draft
       changesets. See phase help for details.
       
The two parts of the obsolete set
``````````````````````````````````````

The previous section shows that there can be two kinds of *obsolete*
changesets:

* an *obsolete* changeset with no descendants, or with only *obsolete*
  descendants, is called **extinct**.

* an *obsolete* changeset with *unstable* descendants is called **suspended**.

.. figure:: ./figures/error-extinct.*

    Amend `A` and `C` leaving `B` behind.

    In this example we have two *obsolete* changesets: `C`, with no *unstable*
    children, is *extinct*. `A`, with an *unstable* descendant (`B`), is
    *suspended*. `B` is *unstable* as before.

Because nothing outside the obsolete set depends on *extinct*
changesets, they can be safely hidden in the UI and even garbage
collected. *Suspended* changesets have to stay visible and available
until their unstable descendants are rewritten into stable versions.
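
The split between *extinct* and *suspended* can be sketched as a small
graph walk. The Python below is only an illustration: ``classify_obsolete``
and its argument shapes are made up for this example, and the sample graph
is one plausible reading of the figure (``C'`` is assumed to stay on ``B``).

```python
def classify_obsolete(parents, markers):
    """Split the obsolete set into (extinct, suspended).

    parents: dict mapping each changeset to the list of its parents.
    markers: iterable of (new, old) pairs. Both shapes are assumptions.
    """
    obsolete = {old for _new, old in markers}
    children = {}
    for node, node_parents in parents.items():
        for parent in node_parents:
            children.setdefault(parent, []).append(node)

    def has_live_descendant(node, seen):
        # Depth-first search for any non-obsolete descendant.
        for child in children.get(node, ()):
            if child in seen:
                continue
            seen.add(child)
            if child not in obsolete or has_live_descendant(child, seen):
                return True
        return False

    extinct, suspended = set(), set()
    for node in obsolete:
        if has_live_descendant(node, set()):
            suspended.add(node)
        else:
            extinct.add(node)
    return extinct, suspended


# Amend A and C, leaving B behind (assumed shape of the figure).
parents = {
    "O": [], "A": ["O"], "A'": ["O"],
    "B": ["A"], "C": ["B"], "C'": ["B"],
}
markers = [(("A'",), "A"), (("C'",), "C")]
extinct, suspended = classify_obsolete(parents, markers)
```

With these inputs ``C`` ends up in ``extinct`` and ``A`` in ``suspended``,
matching the caption of the figure.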
       
Conflicting rewrites
````````````````````

If people start to concurrently edit the same part of the history, they will
likely run into conflicts where a changeset has been rewritten in two
different ways.

.. figure:: ./figures/error-conflicting.*

    Conflicting rewrite of `A` into `A'` and `A''`

This kind of conflict is easy to detect with obsolete markers
because an obsolete changeset can have more than one new version. It
may be seen as the multiple heads case. Mercurial warns you about this
on pull. It is resolved the same way: by a merge of `A'` and `A''` that
keeps the same parent as `A'` and `A''`, with two obsolete
markers pointing to both `A'` and `A''`.

.. figure:: ./figures/explain-troubles-concurrent-10-solution.*

Allowing multiple new changesets to obsolete a single one in a single
marker makes it possible to distinguish a split changeset from a
history rewriting conflict.
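
The distinction between a split and a conflict can be made concrete with a
short Python sketch. It is deliberately simplified and is not how Mercurial
implements the check: markers are hypothetical ``(new, old)`` tuples, and a
conflict is reported whenever the same old changeset is the target of more
than one distinct marker, ignoring chains of successive rewrites.

```python
from collections import defaultdict


def conflicting_rewrites(markers):
    """Map each old changeset to its rewrites when there is more than one.

    markers: iterable of (new, old) pairs, `new` being a tuple of names.
    """
    rewrites = defaultdict(set)
    for new, old in markers:
        rewrites[old].add(tuple(new))
    return {
        old: sorted(versions)
        for old, versions in rewrites.items()
        if len(versions) > 1
    }


# One marker with two new changesets: a split, not a conflict.
split = [(("B1", "B2"), "B")]
# Two markers on the same old changeset: a conflicting rewrite.
conflict = [(("A'",), "A"), (("A''",), "A")]

print(conflicting_rewrites(split))     # {}
print(conflicting_rewrites(conflict))  # {'A': [("A'",), ("A''",)]}
```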
       
Reliable history
``````````````````````

Obsolete markers help to smooth the rewriting process. However,
they do not change the fact that **you should only rewrite the mutable
part of the history**. The phase concept enforces this rule by
explicitly defining a public immutable set of changesets. Rewriting
operations refuse to work on public changesets, but there are still
some corner cases where previously rewritten changesets are made
public.

Special rules apply for obsolete markers pointing to public changesets:

* Public changesets are excluded from the obsolete set (public
  changesets are never hidden or candidates for garbage collection).

* *Newer* versions of a public changeset are called **bumped** and
  highlighted as an error case.

.. figure:: ./figures/explain-troubles-concurrent-10-sumup.*

Solving such an error is easy. Because we know which changeset a
*bumped* changeset tries to rewrite, we can easily compute a smaller
changeset containing only the change from the old *public* changeset
to the new *bumped* one.

.. figure:: ./figures/explain-troubles-concurrent-15-solution.*
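
The two special rules can be expressed as set operations. The sketch below
is illustrative only: ``effective_obsolete`` and ``bumped_set`` are
hypothetical helper names, markers are ``(new, old)`` tuples, and ``public``
is assumed to be the set of changesets in the public phase.

```python
def effective_obsolete(markers, public):
    """Obsolete set with public changesets excluded (first rule)."""
    return {old for _new, old in markers} - set(public)


def bumped_set(markers, public):
    """Non-public changesets that try to replace a public one (second rule)."""
    return {
        name
        for new, old in markers
        if old in public
        for name in new
        if name not in public
    }


# A was rewritten into A', but A has been made public in the meantime.
markers = [(("A'",), "A")]
public = {"O", "A"}

print(effective_obsolete(markers, public))  # set()
print(bumped_set(markers, public))          # {"A'"}
```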
       
Conclusion
----------------

The obsolete marker is a powerful concept that allows Mercurial to safely handle
history rewriting operations. It is a new type of relation between Mercurial
changesets which tracks the result of history rewriting operations.

This concept is simple to define and provides a very solid base for:

- very fast history rewriting operations,

- an auditable and reversible history rewriting process,

- a clean final history,

- sharing and collaborating on the mutable part of the history,

- gracefully handling history rewriting conflicts,

- various history rewriting UIs collaborating through an underlying common API.
       
.. list-table:: Comparison of solutions [#]_
   :header-rows: 1

   * - Solution
     - Remove changeset locally
     - Works on any point of your history
     - Propagation
     - Collaboration
     - Speed
     - Access to older version

   * - Strip
     - `+`
     - `+`
     - \
     - \ 
     - \ 
     - `- -`

   * - Reference
     - `+`
     - \ 
     - `+`
     - \ 
     - `+`
     - `-`

   * - Obsolete
     - `+`
     - `+`
     - `++`
     - `++`
     - `+`
     - `+`

.. [#] To preserve the good tradition of comparison tables, an overwhelming
       advantage goes to the defended solution.