Apache Security - Ivan Ristic [45]
One-Way Encryption
One-way encryption is the process performed by certain mathematical functions that generate seemingly random output from the data they are given as input. These functions are called hash functions or message digest functions. The word hash is used to refer to the output produced by a hash function. Hash functions have the following attributes:
The output has a fixed size, regardless of the size of the input, and is usually much smaller than the input.
The output is always identical when the inputs are identical.
The output seems random (i.e., a small variation of the input data results in a large variation in the output).
It is computationally infeasible to reconstruct the input, given the output (hence the term one-way).
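The attributes above can be observed directly. The following is a minimal sketch using Python's standard hashlib module; the choice of SHA-256 and the helper names sha256_hex and differing_bits are assumptions made for illustration and do not come from the text.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


short_input = b"a"
long_input = b"a" * 1_000_000

# Fixed-size output: the digest is 32 bytes (256 bits) for both inputs.
print(len(hashlib.sha256(short_input).digest()))
print(len(hashlib.sha256(long_input).digest()))

# Identical inputs always give identical output.
print(sha256_hex(b"apache") == sha256_hex(b"apache"))

# A one-character change in the input changes a large share of the output
# bits (typically about half of them).
d1 = hashlib.sha256(b"apache").digest()
d2 = hashlib.sha256(b"apachf").digest()
print(differing_bits(d1, d2), "of 256 bits differ")
```

The one-way property cannot be demonstrated in the same way: hashlib offers no inverse operation, and the only generic approach is to guess inputs and hash each guess.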
Hash functions have two common uses. One is to store some information without storing the data itself. For example, hash functions are frequently used for safe password storage. Instead of storing passwords in plaintext, where they can be read by anyone who has access to the system, it is better to store only password hashes. Since the same password always produces the same hash, the system can still perform its main function, password verification, while the damage from a compromise of the user password database is greatly reduced. The risk is not eliminated: an attacker who obtains the hashes can still try to guess weak passwords by hashing candidate values and comparing the results.
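A minimal sketch of hash-based password verification follows. It goes slightly beyond the text: it assumes a random per-user salt and the PBKDF2 key-stretching function (hashlib.pbkdf2_hmac), which are common practice for making guessing attacks slower. The function names, the iteration count, and the sample passwords are illustrative assumptions.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; real deployments tune this value.
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return (salt, digest) for a password; a fresh random salt is used if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the candidate password with the stored salt and compare the digests."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(candidate, stored_digest)


# Only the salt and the digest are stored; the password itself is not.
salt, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, stored))  # True
print(verify_password("guess", salt, stored))   # False
```

The system never needs the original password again: verification repeats the same computation on the submitted value and compares the result with what was stored.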
The other common use is to quickly verify data integrity. (You may have done this, as shown in Chapter 2, when you verified the integrity of the downloaded Apache distribution.) If a hash output is provided for a file, the recipient can calculate the hash independently and compare the result with the provided value. A difference in values means the file was changed or corrupted.
Hash functions are generally free of usage, export, and patent restrictions, which has contributed to their popularity and widespread use.
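The integrity check described above can be sketched as follows. The function names digest_file and verify_file, the default of SHA-256, and the temporary demo file are assumptions made for illustration; in practice the published value would come from the distributor of the file.

```python
import hashlib
import os
import tempfile


def digest_file(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so that large files need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path, published_hex, algorithm="sha256"):
    """Return True if the file's digest matches the published hexadecimal value."""
    return digest_file(path, algorithm) == published_hex.strip().lower()


# Demo with a throwaway file standing in for a downloaded distribution.
content = b"pretend this is a release tarball"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    demo_path = tmp.name

published = hashlib.sha256(content).hexdigest()
print(verify_file(demo_path, published))  # True: contents match

with open(demo_path, "ab") as handle:
    handle.write(b"!")  # simulate corruption or tampering
print(verify_file(demo_path, published))  # False: digest no longer matches

os.remove(demo_path)
```

A matching digest only shows that the file equals the one that was hashed; the published value itself must come from a source the recipient trusts.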
Here are three popular hash functions:
Message Digest algorithm 5 (MD5)
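Produces 128-bit output from input of any length. Released as RFC 1321 in 1992. Still widely encountered, but practical collision attacks are known, so it should not be relied on where collision resistance matters.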
Secure Hash Algorithm 1 (SHA-1)
Produces 160-bit output from input of any length. A U.S. government standard, with a design that builds on the same principles as MD5. A practical collision was demonstrated in 2017.
SHA-256, SHA-384, and SHA-512
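Members of the SHA-2 family, producing 256-bit, 384-bit, and 512-bit output, respectively. They are related to SHA-1 in design but are distinct algorithms.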
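The output sizes of the functions listed above can be checked with Python's hashlib module. This is a small sketch; the helper name output_bits is an assumption, and some restricted Python builds disable MD5, which the loop tolerates.

```python
import hashlib


def output_bits(name):
    """Return the digest length, in bits, of the named hash algorithm."""
    return hashlib.new(name).digest_size * 8


for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    try:
        print(f"{name}: {output_bits(name)} bits")
    except ValueError:
        # Raised when the algorithm is unavailable in this Python build.
        print(f"{name}: not available in this build")
```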
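At one time, 160 bits of output was considered sufficient, and SHA-1 was the usual recommendation. Practical collision attacks have since been demonstrated against both MD5 and SHA-1, so neither is recommended for new applications. A member of the SHA-2 family, such as SHA-256, is a safer choice.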
Public-Key Infrastructure
Encryption algorithms alone are insufficient to verify someone's identity in the digital world. This is especially true if you need to verify the identity of someone you have never met. Public-key infrastructure (PKI) is a concept that allows identities to be bound to certificates and provides a way to verify that certificates are genuine. It uses public-key encryption, digital certificates, and certificate authorities to do this.
Digital certificates
A digital certificate is an electronic document used to identify an organization, an individual, or a computer system. It is similar to documents issued by governments, which are designed to prove one thing or the other (such as your identity, or the fact that you have passed a driving test). Unlike hardcopy documents, however, digital certificates can have an additional function: they can be used to sign other digital certificates.
Each certificate contains information about a subject (the person or organization whose identity is being certified), as well as the subject's public key and a digital signature made by the authority issuing the certificate. Several standards exist for digital certificates, but X.509 v3 is almost universally used (the popular PGP encryption software, which uses its own certificate format, being a notable exception).
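The certificate contents described above can be pictured as a simple record. The sketch below is a toy model, not X.509: the Certificate class, its field names, the issuer_chain helper, and the example names are all hypothetical, and no signature is actually verified. It only illustrates how each certificate names the authority that signed it, so that a chain can be followed up to a self-signed certificate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str       # who the certificate identifies
    public_key: bytes  # the subject's public key (opaque bytes in this toy model)
    issuer: str        # subject name of the authority that signed the certificate
    signature: bytes   # the issuer's signature (a placeholder; never checked here)


def issuer_chain(cert, known):
    """Follow issuer names until a self-signed certificate is reached."""
    chain = [cert.subject]
    while cert.issuer != cert.subject:
        cert = known[cert.issuer]  # KeyError if the issuer is unknown
        if cert.subject in chain:
            raise ValueError("issuer loop detected")
        chain.append(cert.subject)
    return chain


# Hypothetical example data.
root = Certificate("Example Root CA", b"root-key", "Example Root CA", b"sig-1")
inter = Certificate("Example Intermediate CA", b"inter-key", "Example Root CA", b"sig-2")
site = Certificate("www.example.com", b"site-key", "Example Intermediate CA", b"sig-3")
known = {c.subject: c for c in (root, inter)}

print(issuer_chain(site, known))
```

A real implementation would also verify each signature with the issuer's public key and check validity periods and revocation status; those steps are omitted here.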
A digital certificate is your ID in the digital world. Unlike the real world, no organization has exclusive rights to issue "official" certificates at this time (although