Online Book Reader


Mercurial: The Definitive Guide - Bryan O'Sullivan [107]

some of the other extensions that are available for Mercurial, and briefly touch on some of the machinery you’ll need to know about if you want to write an extension of your own.

In “Improve Performance with the inotify Extension,” we’ll discuss the possibility of huge performance improvements using the inotify extension.

Improve Performance with the inotify Extension

Are you interested in having some of the most common Mercurial operations run as much as a hundred times faster? Read on!

Mercurial has great performance under normal circumstances. For example, when you run the hg status command, Mercurial has to scan almost every directory and file in your repository so that it can display file status. Many other Mercurial commands need to do the same work behind the scenes; for example, the hg diff command uses the status machinery to avoid doing an expensive comparison operation on files that obviously haven’t changed.

Because obtaining file status is crucial to good performance, the authors of Mercurial have optimized this code to within an inch of its life. However, there’s no avoiding the fact that when you run hg status, Mercurial is going to have to perform at least one expensive system call for each managed file to determine whether it’s changed since the last time Mercurial checked. For a sufficiently large repository, this can take a long time.
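To make the per-file cost concrete, here is a minimal sketch of a status scan that issues one lstat call per file and compares the result with a previously recorded snapshot. It illustrates the general technique only; it is not Mercurial's actual implementation, and the function names `snapshot` and `changed_files` are invented for this example.

```python
import os

def snapshot(root):
    """Record (size, mtime) for every non-directory entry under root.

    This costs one lstat system call per file, which is the expense
    the text describes.
    """
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            state[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return state

def changed_files(root, recorded):
    """Return paths that are new, missing, or differ in size or mtime."""
    current = snapshot(root)
    changed = {p for p, sig in current.items() if recorded.get(p) != sig}
    changed |= recorded.keys() - current.keys()  # files that disappeared
    return sorted(changed)
```

Even when nothing has changed, `changed_files` still has to walk the whole tree and stat every file, so its running time grows with the size of the repository rather than with the number of changes.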

To put a number on the magnitude of this effect, I created a repository containing 150,000 managed files. I timed hg status as taking ten seconds to run, even when none of those files had been modified.

Many modern operating systems contain a file notification facility. If a program signs up to an appropriate service, the operating system will notify it every time a file of interest is created, modified, or deleted. On Linux systems, the kernel component that does this is called inotify.

Mercurial’s inotify extension talks to the kernel’s inotify component to optimize hg status commands. The extension has two components. A daemon sits in the background and receives notifications from the inotify subsystem. It also listens for connections from a regular Mercurial command. The extension modifies Mercurial’s behavior so that instead of scanning the filesystem, it queries the daemon. Since the daemon has perfect information about the state of the repository, it can respond with a result instantaneously, avoiding the need to scan every directory and file in the repository.
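The division of labor between the daemon and the command can be sketched with a toy model. This is not the extension's code: the class `StatusDaemon`, the event names, and the status categories are assumptions made for illustration, and the kernel notifications are simulated by plain method calls rather than read from inotify.

```python
class StatusDaemon:
    """Toy stand-in for the background daemon (hypothetical, simplified)."""

    def __init__(self, tracked):
        self.tracked = set(tracked)  # files under version control
        self.modified = set()        # tracked files that were written to
        self.missing = set()         # tracked files that were deleted
        self.unknown = set()         # untracked files that appeared

    def on_event(self, kind, path):
        """Apply one notification: 'create', 'modify', or 'delete'."""
        if kind == "delete":
            self.modified.discard(path)
            self.unknown.discard(path)
            if path in self.tracked:
                self.missing.add(path)
        elif path in self.tracked:
            # A real tool would still compare contents; a write does
            # not always mean the data changed.
            self.missing.discard(path)
            self.modified.add(path)
        else:
            self.unknown.add(path)

    def status(self):
        """Answer a status query from memory, without scanning any files."""
        return {
            "modified": sorted(self.modified),
            "missing": sorted(self.missing),
            "unknown": sorted(self.unknown),
        }
```

The point of the design is that the work happens incrementally as events arrive, so answering a query costs time proportional to the number of changed files, not to the total number of files.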

Recall that I measured plain Mercurial as taking ten seconds to run hg status on a 150,000-file repository. With the inotify extension enabled, the time dropped to 0.1 seconds, a factor of one hundred faster.

Before we continue, please pay attention to some caveats:

The inotify extension is Linux-specific. Because it interfaces directly to the Linux kernel’s inotify subsystem, it does not work on other operating systems.

It should work on any Linux distribution that was released after early 2005. Older distributions are likely to have a kernel that lacks inotify, or a version of glibc that does not have the necessary interfacing support.

Not all filesystems are suitable for use with the inotify extension. Network filesystems such as NFS are a non-starter, for example, particularly if you’re running Mercurial on several systems, all mounting the same network filesystem. The kernel’s inotify system has no way of knowing about changes made on another system. Most local filesystems (e.g., ext3, XFS, ReiserFS) should work fine.
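One way to check the filesystem caveat before enabling the extension is to look up which mount a repository lives on. The sketch below parses text in the format of Linux's /proc/mounts; the helper names and the set of network filesystem types are assumptions for illustration, and the parsing ignores the escaping that /proc/mounts applies to mount points containing spaces.

```python
# Illustrative, not exhaustive: filesystem types that live on another machine.
NETWORK_FILESYSTEMS = {"nfs", "nfs4", "cifs", "smbfs"}

def filesystem_type(path, mounts_text):
    """Return the type of the most specific mount containing an absolute path."""
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount, fstype = fields[1], fields[2]
        inside = mount == "/" or path == mount or path.startswith(mount + "/")
        # ">=" lets a later entry for the same mount point win.
        if inside and len(mount) >= len(best_mount):
            best_mount, best_type = mount, fstype
    return best_type

def looks_like_network_filesystem(path, mounts_text):
    """True if the path appears to sit on a network filesystem."""
    return filesystem_type(path, mounts_text) in NETWORK_FILESYSTEMS
```

On a Linux system, the mount table could be read with `open("/proc/mounts").read()` and passed in as `mounts_text`, together with the absolute path of the repository.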

The inotify extension is not yet shipped with Mercurial as of May 2009, so it’s a little more involved to set up than other extensions. But the performance improvement is worth it!

The extension currently comes in two parts: a set of patches to the Mercurial source code, and a library of Python bindings to the inotify subsystem.

* * *

Note


There are in fact two Python inotify binding libraries. One of them is called pyinotify, and is packaged by some Linux distributions as python-inotify. This is not the one you’ll need, as it is too buggy and inefficient
