Mercurial: The Definitive Guide - Bryan O'Sullivan

to reconstruct a revision from a snapshot and a small number of deltas.

* * *

Identification and Strong Integrity

Along with delta or snapshot information, a revlog entry contains a cryptographic hash of the data that it represents. This makes it difficult to forge the contents of a revision, and easy to detect accidental corruption.

Hashes provide more than a mere check against corruption; they are used as the identifiers for revisions. The changeset identification hashes that you see as an end user are from revisions of the changelog. Although filelogs and the manifest also use hashes, Mercurial only uses these behind the scenes.
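As a rough illustration of how a hash can double as an identifier, the sketch below derives a 20-byte SHA-1 ID from a revision's data together with its two parent IDs (parents are described in the next section). The function name `node_id` and the sample data are hypothetical, and the way the inputs are combined is an assumption made for illustration rather than something the text above specifies.

```python
import hashlib

# Assumption for this sketch: IDs are 20-byte SHA-1 digests, and the
# "no parent" marker is 20 zero bytes.
NULL_ID = b"\x00" * 20


def node_id(data: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Return an identifier derived from the data and both parent IDs."""
    first, second = sorted([p1, p2])  # parent order does not affect the ID
    h = hashlib.sha1()
    h.update(first)
    h.update(second)
    h.update(data)
    return h.digest()


root = node_id(b"hello\n")
child = node_id(b"hello, world\n", p1=root)
print(root.hex())
print(child.hex())
```

Because the ID depends on the content, changing even one byte of the data yields a different identifier, which is what makes forgery difficult and accidental corruption easy to spot.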

Mercurial verifies that hashes are correct when it retrieves file revisions and when it pulls changes from another repository. If it encounters an integrity problem, it will complain and stop whatever it’s doing.
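The "complain and stop" behavior can be pictured with a small check-on-read sketch. Everything here is hypothetical (the `IntegrityError` class, the `put` and `checked_read` helpers, and the in-memory `store`); it only shows the general pattern of recomputing a hash at retrieval time and refusing to continue on a mismatch.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when stored data no longer matches its recorded hash."""


def put(store: dict, key: str, data: bytes) -> None:
    # Record the hash alongside the data at write time.
    store[key] = (hashlib.sha1(data).hexdigest(), data)


def checked_read(store: dict, key: str) -> bytes:
    # Recompute the hash at read time and stop on any mismatch.
    expected, data = store[key]
    actual = hashlib.sha1(data).hexdigest()
    if actual != expected:
        raise IntegrityError(f"{key}: expected {expected}, got {actual}")
    return data


store = {}
put(store, "a.txt", b"first version\n")
intact = checked_read(store, "a.txt")

# Simulate corruption: the data changes but the recorded hash does not.
digest, _ = store["a.txt"]
store["a.txt"] = (digest, b"first versiom\n")
try:
    checked_read(store, "a.txt")
    detected = False
except IntegrityError as exc:
    detected = True
    print("stopped:", exc)
```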

In addition to the effect it has on retrieval efficiency, Mercurial’s use of periodic snapshots makes it more robust against partial data corruption. If a revlog becomes partly corrupted due to a hardware error or system bug, it’s often possible to reconstruct some or most revisions from the uncorrupted sections of the revlog, both before and after the corrupted section. This would not be possible with a delta-only storage model.
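The robustness argument can be made concrete with a toy log. In this sketch, snapshots are taken at a fixed interval and each delta is just the text appended since the previous revision; both are simplifying assumptions chosen for brevity, not a description of how Mercurial decides when to snapshot or how it encodes deltas. A corrupted entry is modeled as `None`.

```python
SNAPSHOT, DELTA = "snapshot", "delta"


def build_log(revisions, snapshot_every=3):
    """Store each revision as a full snapshot or as text appended since the last one."""
    log = []
    for i, text in enumerate(revisions):
        if i % snapshot_every == 0:
            log.append((SNAPSHOT, text))
        else:
            prev = revisions[i - 1]
            if not text.startswith(prev):
                raise ValueError("this toy model only supports append-only edits")
            log.append((DELTA, text[len(prev):]))
    return log


def reconstruct(log, rev):
    """Rebuild revision `rev`, or return None if its chain touches a corrupted entry."""
    chain = []
    i = rev
    while True:
        entry = log[i]
        if entry is None:  # corrupted entry
            return None
        chain.append(entry)
        if entry[0] == SNAPSHOT:
            break
        i -= 1
    text = ""
    for kind, payload in reversed(chain):
        text = payload if kind == SNAPSHOT else text + payload
    return text


revisions = ["a", "ab", "abc", "abcd", "abcde", "abcdef", "abcdefg"]
log = build_log(revisions)
log[1] = None  # simulate damage to one delta

recoverable = [rev for rev in range(len(log)) if reconstruct(log, rev) is not None]
print(recoverable)
```

Revisions 1 and 2 are lost because their delta chain passes through the damaged entry, but revision 0 (before the damage) and revisions 3 onward (which start from a later snapshot) can still be rebuilt. With a delta-only log, everything after the damaged entry would be unrecoverable.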

Revision History, Branching, and Merging

Every entry in a Mercurial revlog knows the identity of its immediate ancestor revision, usually referred to as its parent. In fact, a revision contains room for not one parent, but two. Mercurial uses a special hash, called the null ID, to represent the idea “there is no parent here.” This hash is simply a string of zeros.

In Figure 4-4, you can see an example of the conceptual structure of a revlog. Filelogs, manifests, and changelogs all have this same structure; they differ only in the kind of data stored in each delta or snapshot.

The first revision in a revlog (at the bottom of the image) has the null ID in both of its parent slots. For a “normal” revision, its first parent slot contains the ID of its parent revision, and its second contains the null ID, indicating that the revision has only one real parent. Any two revisions that have the same parent ID are branches. A revision that represents a merge between branches has two normal revision IDs in its parent slots.

Figure 4-4. The conceptual structure of a revlog
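The parent-slot rules described above can be sketched in a few lines. The revision names (`r0` through `r4`), the dictionary layout, and the helper functions are all hypothetical; real IDs are hashes, and the null ID is written here as 40 zero hex digits.

```python
from collections import defaultdict

NULL_ID = "0" * 40  # "there is no parent here"

# Hypothetical revlog: revision ID -> (first parent, second parent)
revlog = {
    "r0": (NULL_ID, NULL_ID),  # first revision
    "r1": ("r0", NULL_ID),     # normal revision
    "r2": ("r1", NULL_ID),     # branch from r1
    "r3": ("r1", NULL_ID),     # another branch from r1
    "r4": ("r2", "r3"),        # merge of the two branches
}


def classify(p1, p2):
    """Classify a revision by how many real parents it has."""
    if p1 == NULL_ID and p2 == NULL_ID:
        return "first"
    if p2 == NULL_ID:
        return "normal"
    return "merge"


def branch_points(log):
    """Return parents that have more than one child, with their children."""
    children = defaultdict(list)
    for rev, parents in log.items():
        for parent in parents:
            if parent != NULL_ID:
                children[parent].append(rev)
    return {p: kids for p, kids in children.items() if len(kids) > 1}


kinds = {rev: classify(*parents) for rev, parents in revlog.items()}
print(kinds)
print(branch_points(revlog))
```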

The Working Directory

In the working directory, Mercurial stores a snapshot of the files from the repository as of a particular changeset.

The working directory “knows” which changeset it contains. When you update the working directory to contain a particular changeset, Mercurial looks up the appropriate revision of the manifest to find out which files it was tracking at the time that changeset was committed, and which revision of each file was then current. It then recreates a copy of each of those files, with the same contents it had when the changeset was committed.
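The lookup chain behind an update (changeset, then manifest, then file revisions) can be sketched with plain dictionaries. The IDs, file names, and the `update` function below are hypothetical stand-ins, and the working directory is modeled as an in-memory mapping rather than real files on disk.

```python
# Hypothetical toy repository.
changelog = {"c1": "m1", "c2": "m2"}  # changeset -> manifest revision
manifests = {
    "m1": {"README": 0},                # path -> file revision number
    "m2": {"README": 1, "main.py": 0},
}
filelogs = {
    "README": ["first draft\n", "second draft\n"],
    "main.py": ["print('hi')\n"],
}


def update(changeset):
    """Return the working-directory contents for a changeset."""
    manifest = manifests[changelog[changeset]]
    # Recreate each tracked file with the contents it had at commit time.
    return {path: filelogs[path][rev] for path, rev in manifest.items()}


working_dir = update("c1")
print(sorted(working_dir))
working_dir = update("c2")
print(sorted(working_dir))
```

Updating to `c1` yields only `README`, because `main.py` was not yet tracked when that changeset was committed.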

The dirstate is a special structure that contains Mercurial’s knowledge of the working directory. It is maintained as a file named .hg/dirstate inside a repository. The dirstate details which changeset the working directory is updated to, and all of the files that Mercurial is tracking in the working directory. It also lets Mercurial quickly notice changed files, by recording each file's size and its modification time as of checkout.
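To show why recording a size and a timestamp per file makes change detection cheap, here is a minimal sketch that compares a file's current `os.stat` values with previously recorded ones. The helper names are hypothetical, and this is not the on-disk dirstate format; it only illustrates that a mismatch can be noticed without reading file contents.

```python
import os
import tempfile


def snapshot_stat(path):
    """Record the size and modification time of a file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def looks_changed(path, recorded):
    """Cheap check: compare current stat values with the recorded ones."""
    return snapshot_stat(path) != recorded


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "notes.txt")
    with open(path, "w") as f:
        f.write("hello\n")
    recorded = snapshot_stat(path)  # what a dirstate-like record would hold

    unchanged_result = looks_changed(path, recorded)

    with open(path, "w") as f:
        f.write("hello, world\n")  # size changes, so the check must notice
    changed_result = looks_changed(path, recorded)

print(unchanged_result, changed_result)
```

A check like this cannot prove a file is unmodified when the size is the same and the timestamp resolution is coarse; in that case the contents would still have to be compared.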

Just as a revision of a revlog has room for two parents, so that it can represent either a normal revision (with one parent) or a merge of two earlier revisions, the dirstate also has slots for two parents. When you use the hg update command, the changeset that you update to is stored in the “first parent” slot, and the null ID in the second. When you hg merge with another changeset, the first parent remains unchanged, and the second parent is filled in with the changeset you’re merging with. The hg parents command tells you what the parents of the dirstate are.
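The two parent slots of the dirstate behave like a tiny state machine. The class below is a toy model with hypothetical names and changeset IDs, not Mercurial's API; its methods mirror what the text says hg update, hg merge, and hg parents do to the slots.

```python
NULL_ID = "0" * 40  # "there is no parent here"


class DirstateParents:
    """Toy model of the dirstate's two parent slots."""

    def __init__(self):
        self.p1 = NULL_ID
        self.p2 = NULL_ID

    def update(self, changeset):
        # Updating sets the first parent and clears the second.
        self.p1, self.p2 = changeset, NULL_ID

    def merge(self, other):
        # Merging leaves the first parent alone and fills in the second.
        self.p2 = other

    def parents(self):
        # Report only the real (non-null) parents.
        return [p for p in (self.p1, self.p2) if p != NULL_ID]


ds = DirstateParents()
ds.update("c5")
after_update = ds.parents()
ds.merge("c7")
after_merge = ds.parents()
print(after_update, after_merge)
```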

What Happens When You Commit

The dirstate stores parent information for more than just book-keeping purposes. Mercurial
