Beautiful Code [118]
JDOM developers had considered this optimization, but got hung up on poor package design. In JDOM, the SAXBuilder class that creates a new Document object from a SAX parser is in the org.jdom.input package. The Element, Document, Attribute, and other node classes are in the org.jdom package. This means that all verifying and nonverifying constructors called by the builder must be public. Consequently, other clients can also call those constructors—clients that aren't making the appropriate checks. This enables JDOM to produce malformed XML. Later, in JDOM 1.0, the developers reversed themselves and decided to bundle a special factory class that accepted unverified input. This factory class is faster, but opens up a potentially troublesome backdoor in the verification system. The problem was just an artifact of separating the JDOM builder into input and core packages.
* * *
Note: Excessive package subdivision is a common anti-pattern in Java code. It often leaves developers faced with the unappealing choice of either making things public that shouldn't be, or limiting functionality.
Do not use packages merely to organize a class structure. Each package should be essentially independent of the internals of all other packages. If two classes in your program or library have to access each other more than they have to access other, nonrelated classes, they should be placed together in one package.
In C++, friend functions solve this problem neatly. Although Java does not currently have friend functions, Java 7 may make it possible to grant more access to subpackages that members of the general public do not have.
* * *
When I commenced work on XOM, I had the example of JDOM to learn from, so I kept the input classes in the same package as the core node classes. This meant I could provide package-protected, nonverifying methods that were available to the parser, but not to client classes from other packages.
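The arrangement can be sketched in a few lines. This is a minimal illustration, not XOM's code: the class name CheckedNode and its simplified name check are assumptions made up for the example. The public constructor verifies every name, while the factory method with no access modifier skips the check and is visible only to classes in the same package.

```java
// Hypothetical node class used only for illustration; this is not XOM's Element.
class CheckedNode {
    private String name;

    // Private no-args constructor: only code in this class can create a blank node.
    private CheckedNode() {}

    // Public path for arbitrary clients: the name is always verified.
    public CheckedNode(String name) {
        checkName(name);
        this.name = name;
    }

    // Package-private path (no access modifier): reachable only from classes in
    // the same package, such as a builder fed by a parser that already checked names.
    static CheckedNode build(String name) {
        CheckedNode result = new CheckedNode();
        result.name = name; // deliberately unverified
        return result;
    }

    // Deliberately simplified stand-in for real XML name checking.
    private static void checkName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Names cannot be empty");
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean allowed = Character.isLetterOrDigit(c)
                || c == '_' || c == '-' || c == '.' || c == ':';
            if (!allowed) {
                throw new IllegalArgumentException("Illegal character in name: " + name);
            }
        }
    }

    String getName() {
        return name;
    }
}
```

A class in another package can reach only the verifying constructor, so it has no way to create a node with an unchecked name.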
The mechanics of XOM are straightforward. Each node class has a private no-args constructor, along with a package-protected factory method named build that invokes this constructor and sets up the fields without checking the names. Example 5-6 demonstrates this with the relevant code from the Element class. XOM is actually a little pickier than most parsers about namespaces, so it does have to check those. Still, it can omit a lot of redundant checks.
Example 5-6. Package-protected, nonverifying factory method in the Element class
private Element() {}

static Element build(String name, String uri, String localName) {
    Element result = new Element();
    String prefix = "";
    int colon = name.indexOf(':');
    if (colon >= 0) {
        prefix = name.substring(0, colon);
    }
    result.prefix = prefix;
    result.localName = localName;
    // We do need to verify the URI here because parsers are
    // allowing relative URIs which XOM forbids, for reasons
    // of canonical XML if nothing else. But we only have to verify
    // that it's an absolute base URI. I don't have to verify
    // no conflicts.
    if (! "".equals(uri)) Verifier.checkAbsoluteURIReference(uri);
    result.URI = uri;
    return result;
}
This approach dramatically and measurably sped up parsing performance, since it didn't require the same large amount of work as its predecessors.
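To show where such a method gets called, here is a hedged sketch of the caller's side using the standard SAX API. ParsedNode and ParsedNodeBuilder are hypothetical names invented for this illustration, not XOM classes, and the sketch records only element names and namespace URIs rather than building a real tree. The point is that the parser has already rejected illegal names by the time startElement fires, so the handler can pass them straight to the nonverifying factory method.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical node class for illustration; not XOM's Element.
class ParsedNode {
    private String qualifiedName;
    private String namespaceURI;

    private ParsedNode() {}

    // Package-private and nonverifying: intended only for the builder below.
    static ParsedNode build(String qualifiedName, String namespaceURI) {
        ParsedNode result = new ParsedNode();
        result.qualifiedName = qualifiedName;
        result.namespaceURI = namespaceURI;
        return result;
    }

    String getQualifiedName() { return qualifiedName; }
    String getNamespaceURI() { return namespaceURI; }
}

// Hypothetical builder that lives in the same package as ParsedNode.
class ParsedNodeBuilder extends DefaultHandler {
    private final List<ParsedNode> nodes = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // The parser has already checked qName, so no second check is made here.
        nodes.add(ParsedNode.build(qName, uri));
    }

    // Parses the document and returns one node per start-tag, in document order.
    static List<ParsedNode> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            ParsedNodeBuilder handler = new ParsedNodeBuilder();
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)), handler);
            return handler.nodes;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }
}
```

A document containing an illegal name such as `<1bad/>` fails inside the parser, so the factory method is never handed that name. This is why the redundant check can be dropped on this path without weakening the guarantee.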
5.7. Version 5: Third Optimization O(1)
After I implemented the constructor detailed in the previous section and added some additional optimizations, XOM was fast enough for anything I needed to do. Read performance was essentially limited only by parser speed and there were very few bottlenecks left in the document-building process.
However, other users with different use cases were encountering different problems. In particular, some users were writing custom builders that read non-XML formats into a XOM tree. They were not using an XML parser, and therefore were not able