Wednesday, November 30, 2011

The Art & Science of Storing Passwords


Sample Image - StoringPasswords.gif

Introduction

Almost every website nowadays needs to maintain a list of users and passwords. Many multi-user applications require a way to authenticate users, and passwords seem like a natural.
You don�t necessarily have to provide your own username/password authentication solution. Alternatives include using Active Directory (if you are in a Microsoft domain), Microsoft Passport, LDAP (Lightweight Directory Access Protocol) for non-Microsoft environments, and the membership framework provided in .NET 2.0. All four alternatives have their place, but if you want to understand the logic behind storing passwords and understand how to correctly implement a password management scheme, this article is for you.
This article assumes a web application � slightly different rules apply to a distributed multi-user application. For distributed multi-user applications, the authentication exchange follows slightly different rules.

The trivial approach � just store it!

The simplest approach to manage user names and passwords is to store everything in plaintext (no encryption or scrambling) in a file or database. The result would be something like this:
User ID
User Name
Password
101
Bob
Snake
102
David
Rainbow6
103
George
DarkTower
104
Eve
Snake

Advantages of storing passwords �in the clear�

  1. Authenticating (checking that the username and password pair matches the pair in the table) is very simple � just compare the strings!
  2. Forgotten passwords can be retrieved � the password is easily accessible, given the user name.

Disadvantages of storing passwords �in the clear�

  1. First, anyone with access to the file (or able to SELECT from the table) gains immediate access to all passwords! An employee with legitimate access to the file might print the file or email out the information, and Voila! all the passwords are compromised.
    But SQL 2005 supports encryption � isn�t that good enough? No, SQL encryption is not good enough. The built-in encryption only protects the information on disk. If a user is allowed to access the data (perform a SELECT), SQL will automatically decrypt the information. If your web application is allowed to access the data (and how else would you compare the user name and password?) then a hacker hacking your application can access the data as well, gaining the same access as your web application.
  2. The second problem is that during the authentication exchange, the password is visible on the network. Unless secure communication is used throughout, the password can be seen while traveling on the network. For example, even if the web application uses SSL to submit the password, the password is still visible when the web application server SELECTs the information from the remote database. The results of the query are transferred unencrypted over the network.

Is storing password �in the clear� an acceptable solution?

  • If your data store is encrypted (using SQL Server 2005, for example),
  • and your internal communication network is highly secure (uses only IP-SEC or a VPN tunnel for communication between servers),
  • and you only use secure communication between the application client and the application server,
  • and you trust all employees with database access to never make mistakes (such as printing the password information or storing to file),
  • and you are sure nobody else has physical access to any of the servers used.
Then, yes, storing passwords �in the clear� is fine.

Encrypting passwords � a little more secure

A better approach for storing passwords (and the only viable alternative if users need to be able to recover passwords) is scrambling the passwords before storing them.
This approach relies on having a secret. The secret is either the scrambling algorithm, or the key used in conjunction with a modern encryption algorithm.
Scrambling (or encrypting) passwords is a reversible operation. The secret is used to garble the password, and the same secret can be used to retrieve the original password. When a user supplies a password, the stored password is de-scrambled using the secret, and the passwords are compared. An alternative approach is to scramble the provided password using the secret and compare the two garbled versions � a match indicates the provided password was good.
If a user needs to retrieve a password, the stored password is de-scrambled and provided to the user (usually via email).
User ID
User Name
Password
101
Bob
k468dD8F
102
David
56lkV#p6
103
George
8Fk4lVQ0
104
Eve
k468dD8F

Advantages of scrambling using a secret

  1. A forgotten or lost password can be retrieved.
  2. Only a single secret needs to be stored securely (either an algorithm or a key).
  3. For multi-user distributed applications, when using encryption, either the clear text password needs to be communicated (to be checked), or the secret must be communicated to perform the authentication on the front end.

Disadvantages of scrambling passwords

  1. If the secret is compromised, all passwords are compromised. If somebody has access to the secret and to the password store, all passwords can be unscrambled!
  2. Access to the password store alone is sufficient to provide information about the passwords. Since all passwords are scrambled using the same algorithm, if two users share the same scrambled password, they must have the same password as well. Crafty hackers with access to the password store can create users with known passwords and check for other users with the same password. This type of attack is a variation of the �known plaintext� attack. Attacks such as this can be stopped using a SALT (see below).
  3. When using a block cipher, the password length must be stored as part of the scrambled password. The length must be stored because block ciphers always produce a fixed-size block of scrambled text. If the password length is not scrambled (for example, if stored as a column in a table), the information is very useful to password crackers. Knowing the exact length of the password makes it much easier to try to guess a password.

Is this an acceptable password storing solution?

If lost password retrieval is a must, then yes, this is the only acceptable solution. A few guidelines though:
  • Store your secret in a secure place. Hard-coding the secret in the application code is not a good solution. In my opinion, storing the secret in a file (even if it is the web.config file) is a terrible idea. If you must use a secret, store it in a database (limiting access by requiring an authenticated connection) or in an access-restricted registry key.
  • Use a good cryptographic algorithm (examples below), don�t create your own or use a trivial scrambling algorithm. Unless you know exactly what you are doing, your own algorithm might be very easy to crack.
  • Use a SALT (see below) to prevent two users with the same password from having the same scrambled password.
  • Encryption output is binary, and must be encoded if stored as text. If storing binary data is OK, no encoding is necessary, but if the encrypted passwords are to be stored as text, consider using RADIX64 to convert binary information to text. RADIX64 uses a character for every 6 bits, expanding the output by 33%.

Storing hashed passwords � the one-way solution

A cryptographic hash is also known as a �one-way function�. A hash function takes input of any length, and produces a unique output of constant length. For example, if we hash a password (of any length) with the MD5 cryptographic hash, the result would be a 128 bit number that uniquely corresponds to the password. Cryptographic hashes work on more than passwords � if the cryptographic hash of two files is identical, then the two files are identical as well.
In recent years, as computing power increased, some cryptographic hash functions are no longer recommended for use (MD4, MD5, SHA1). In my opinion, if used just for hashing passwords, you are probably OK. As for me, I modified my code to use SHA2.
When storing hashed passwords, the password is hashed (run through the hashing algorithm) and the resulting hash is stored instead of the password. To compare passwords, just hash the given password using the same hash function and compare the results. If the hashes are the same, the passwords match.
The beauty of a one way function is that there is no way to compute the password based on the hash. Hashed passwords are not immune to brute force attacks � given a dictionary and the password hash, a hacker can compute the hashes of all the words in the dictionary, compare the words with the password hash, and discover the password. This is where strong passwords (containing letters, numbers, and special characters) help us defend against brute-force attacks.

Advantages of storing hashed passwords

  1. The original �clear text� password is never stored. Even if the password store is compromised, only the hashes become public.
  2. The password length is not stored and cannot be estimated, making password cracking that much harder.
  3. There is no need for a secret as none is used to hash the password.
  4. For multi-user distributed applications, the password hash can be used for authentication. When using encryption, either the clear text password needs to be communicated (to be checked), or the secret must be communicated to perform the authentication on the front end.

Disadvantages of storing hashed passwords

  1. Lost passwords cannot be recovered (except using brute-force methods). A new password must be created and communicated to the user.
  2. Like encrypted passwords, if a SALT (see below) is not used, users with the same password will have the same password hash.

Is this an acceptable password storing solution?

Yes, but please follow these guidelines:
  • Use a good cryptographic hash. MD5 is no longer recommended, and neither is SHA1. SHA2 is the current favorite. If you must use MD5 or SHA1, use a stronger salting algorithm.
  • Use a SALT (see below) to prevent two users with the same password from having the same scrambled password.
  • Hash output is binary, and must be encoded if stored as text. If storing binary data is OK, no encoding is necessary, but if the hashed passwords are to be stored as text, consider using RADIX64 to convert binary information to text. RADIX64 uses a character for every 6 bits, expanding the output by 33%.

Storing password length � why and how

Why do I need to store the password length?

Password length is only needed when encrypting and decrypting the password using a block cipher. Many block ciphers have a block size of 64 bits - anything encrypted will end up with a multiple of 8 bytes for size. Encrypting a 12 byte password would result in a 16 byte output. When the 16 bytes are decrypted, the result would be a 16 byte string with garbage for the last 4 bytes. If the password length is stored, the extra �padding� can be trimmed when decrypting the password. Without the trimming, any na�ve attempts to compare the strings will fail � the 12 character password is not the same as the 16 character decrypted password. Also, attempting to compare the encrypted password to check for a match might fail because the extra 4 bytes of padding might be different from encryption to encryption.
Hashing does not require a password length � the hash of a 12 character password using SHA2 will always result in a 320 bit output, and hashing the same password will always result in the same 320 bit output.
When using a scrambling algorithm such as ROT13 (which I do not recommend), the scrambled password has the same length as the original password. Anyone with access to the password store can then obtain the length of the password. If you decide to use a scrambling algorithm or a stream cipher (as opposed to a block cipher), please make sure to pad the output to hide the password length.

How can I store the password length?

One solution is to store the password length as a column in your table. The problem with this approach is that if a hacker gains access to the information, the length of a password is very useful when attempting a brute-force attack. Knowing that a password has only 5 letters allows the hacker to limit the number of password guesses needed to crack the password.
A better solution is to store the password length as part of the encrypted string. The password length can be pre-pended to the password string (for example, as the first two characters). When the password is decrypted, the first two characters are used to reconstruct the length, and the password can be safely trimmed. Storing the length encrypted with the password makes sure no-one can access the password length without knowing the secret used to encrypt the password.

Example � Storing length and trimming passwords

Storing a message with the length:
// Encode length as first 4 bytes

// Data is in the �message� string

byte[] length = new byte[4];
length[0] = (byte)(message.Length & 0xFF);
length[1] = (byte)((message.Length >> 8) & 0xFF);
length[2] = (byte)((message.Length >> 16) & 0xFF);
length[3] = (byte)((message.Length >> 24) & 0xFF);
csEncrypt.Write(length, 0, 4);
csEncrypt.Write(toEncrypt, 0, toEncrypt.Length);
Retrieving the message from a binary array:
// Holder for the unencrypted message

byte[] fromEncrypt = new byte[encrypted.Length-4];

// Used to convert a byte array to a string

// with a specific encoding

UTF8Encoding textConverter = new UTF8Encoding();

//Read the data out of the crypto stream.

// The first four bytes are the actual length

// The rest is the message + padding

byte[] length = new byte[4];
// Read the data from the decrypted stream

csDecrypt.Read(length, 0, 4);
csDecrypt.Read(fromEncrypt, 0, fromEncrypt.Length);
int len = (int)length[0] | (length[1] << 8) | 
          (length[2] << 16) | (length[3] << 24);

//Convert the byte array back into a string.

// Trim the string to the specified length to remove

// the padding

// My default textConverter is UTF8

return textConverter.GetString(fromEncrypt).Substring(0, len);

Salting � everything tastes better with a little salt

Hashing (or encrypting) the same password for two users results in the same output. The repeated information can be used maliciously to obtain passwords. If I am a user with access to the password store, I can set my own password to a dictionary word, and then scan the password store for somebody else with the same hashed or encrypted password. If I find a match, I cracked that user�s password.

Simple salting � use an additional piece of data

To prevent the above problem, we can inject some variation into the hashing (or encrypting scheme). For example: if we prefix the password with the user name, the hash or encrypted output will no longer match.
User: Bob
Password: Snake
Hashed password: hash(�Snake�) -> k468dD8F
User: Eve
Password: Snake
Hashed password: hash(�Snake�) -> k468dD8F
Obviously, Bob and Eve have the same password. Even worse, if a hacker obtains our password store, the hacker can pre-compute the hashes for an entire dictionary and look for matches in our password store, greatly accelerating the cracking process.
If we throw the user name into the mix:
User: Bob
Password: Snake
Hashed password: hash(�Bob.Snake�) -> 4Fgja93Q
User: Eve
Password: Snake
Hashed password: hash(�Eve.Snake�) -> k468dD8F
Bob and Eve now have different password hashes. If a hacker gets hold of our password store, the hacker now needs to compute each password hash, specifically for each user. The hacker needs to pre-compute the dictionary hashes with the �Bob.� prefix for Bob and with the �Eve.� prefix for Eve � no free lunch here.

Advanced salting � use random information

Using a random salt significantly improves the strength of encrypting passwords, and makes brute-force cracking much more costly.
To use a random salt, compute a random number, and use the random number as a component when calculating the hash or when encrypting. Store the random number in a database column so the number can be available later when checking the password.
For example:
// Generate a six-byte salt

public static byte[] GenerateSALT()
{
  byte[] data = new byte[6];
  new System.Security.Cryptography.RNGCryptoServiceProvider().GetBytes(data);
  return data;
}
When initially storing the password for Bob:
  1. Compute a random number (use a good cryptographic random generator like the ones inSystem.Security.Cryptography).
  2. Add the random number to the password string.
  3. Compute the hash, or encrypt the resulting string.
  4. Store both the hash/encryption output and the random number in the password store.
When comparing passwords, follow the same algorithm:
  1. Obtain the random number from the password store.
  2. Add the random number to the password string.
  3. Compute the hash, or encrypt the resulting string.
  4. Compare the result with the stored hash or encrypted password, a match indicates a password match.
Using a random number is even better than using the user name. User names are not very random - they follow very strict rules. Random numbers (when coming from a good cryptographic random number generator) inject more randomness into the resulting hash or encrypted output. Note that if you decide to use the user name as a salt, the user name can never change. If the user name changes, the hash is no longer valid.

Encoding binary output as text

The output from hash functions and encryption algorithms is binary. To store the binary information as textual strings, the information can be encoded.
Two popular encoding schemes are UUENCODE (popular in the Unix world) and Base64 (popular everywhere). Base64 is the encoding scheme used to convert binary attachments for sending over SMTP email.

Example � encoding a binary array to a string

//Get encrypted array of bytes.

byte[] encrypted = msEncrypt.ToArray();
// Convert to Base64 string

string b64 = Convert.ToBase64String(encrypted);

Example � decoding a string to a binary array

// Base64 decode

// the data is in the �b64� string

byte[] encrypted = Convert.FromBase64String(b64);

Scrambling algorithms and examples

The .NET framework (System.Security.Cryptography) comes with built-in support for several encryption algorithms:
  • DES � old, somewhat obsolete.
  • TripleDES � old, but still strong.
  • RC2 � old, but still useful.
  • Rijndael (AES) - modern.
Example - encoding a string:
// Encode the string �message�

// ScrambleKey and ScambleIV are randomly generated

// RC2CryptoServiceProvider rc2 = new RC2CryptoServiceProvider();

// rc2.GenerateIV();

// ScrambleIV = rc2.IV;

// rc2.GenerateKey();

// ScrambleKey = rc2.Key

UTF8Encoding textConverter = new UTF8Encoding();
RC2CryptoServiceProvider rc2CSP = 
                 new RC2CryptoServiceProvider();

//Convert the data to a byte array.

byte[] toEncrypt = textConverter.GetBytes(message);

//Get an encryptor.

ICryptoTransform encryptor = 
   rc2CSP.CreateEncryptor(ScrambleKey, ScrambleIV);
            
//Encrypt the data.

MemoryStream msEncrypt = new MemoryStream();
CryptoStream csEncrypt = new CryptoStream(msEncrypt, 
                         encryptor, CryptoStreamMode.Write);

//Write all data to the crypto stream and flush it.

// Encode length as first 4 bytes

byte[] length = new byte[4];
length[0] = (byte)(message.Length & 0xFF);
length[1] = (byte)((message.Length >> 8) & 0xFF);
length[2] = (byte)((message.Length >> 16) & 0xFF);
length[3] = (byte)((message.Length >> 24) & 0xFF);
csEncrypt.Write(length, 0, 4);
csEncrypt.Write(toEncrypt, 0, toEncrypt.Length);
csEncrypt.FlushFinalBlock();

//Get encrypted array of bytes.

byte[] encrypted = msEncrypt.ToArray();
Example � decoding a string:
// Decode the �encrypted� byte[]

UTF8Encoding textConverter = new UTF8Encoding();
RC2CryptoServiceProvider rc2CSP = new RC2CryptoServiceProvider();
//Get a decryptor that uses the same key and IV as the encryptor.

ICryptoTransform decryptor = 
       rc2CSP.CreateDecryptor(ScrambleKey, ScrambleIV);

//Now decrypt the previously encrypted message using the decryptor

// obtained in the above step.

MemoryStream msDecrypt = new MemoryStream(encrypted);
CryptoStream csDecrypt = new CryptoStream(msDecrypt, 
                         decryptor, CryptoStreamMode.Read);

byte[] fromEncrypt = new byte[encrypted.Length-4];

//Read the data out of the crypto stream.

byte[] length = new byte[4];
csDecrypt.Read(length, 0, 4);
csDecrypt.Read(fromEncrypt, 0, fromEncrypt.Length);
int len = (int)length[0] | (length[1] << 8) | 
          (length[2] << 16) | (length[3] << 24);

//Convert the byte array back into a string.

return textConverter.GetString(fromEncrypt).Substring(0, len);

Hashing algorithms and examples

The .NET framework (System.Security.Cryptography) comes with built-in support for several cryptographic hashes:
  • MD5 � Broken, attempt to avoid.
  • RIPEMD160 � Acceptable, never popular in the US.
  • SHA1 � Acceptable, but a 160 bit output is considered too short.
  • SHA256 � The recommended minimum.
  • SHA384 (SHA2) - Recommended.
  • SHA512 � Very good, but does not provide any additional protection compared to SHA384.
Example � hashing a string:
public byte[] EncryptPassword(string userName, string password, 
       int encryptionVersion, byte[] salt1, byte[] salt2)
{
    string tmpPassword = null;

    switch(encryptionVersion)
    {
        case 2: // password + lots of salt

            tmpPassword = Convert.ToBase64String(salt1)
             + Convert.ToBase64String(salt2)
             + userName.ToLower() + password;
            break;
        case 1: // user name as salt

            tmpPassword  = userName.ToLower() + password;
            break;
        case 0: // no salt

        default:
            tmpPassword = password;
            break;
    }

    //Convert the password string into an Array of bytes.

    UTF8Encoding textConverter = new UTF8Encoding();
    byte[] passBytes = textConverter.GetBytes(tmpPassword);

    //Return the encrypted bytes

    if (encryptionVersion == 2)
        return new SHA384Managed().ComputeHash(passBytes);
    else
        return new MD5CryptoServiceProvider().ComputeHash(passBytes);
}
Example � comparing two hashes for equality:
// Just compare to two arrays for equality

// You can add a length comparison, but normally

// all hashes are the same size.

private bool PasswordsMatch(byte[] psswd1, byte[] psswd2)
{
    try
    {
        for(int i = 0; i < psswd1.Length; i++)
        {
            if(psswd1[i] != psswd2[i])
                return false;
        }
        return true;
    }
    catch(IndexOutOfRangeException)
    {
        return false;
    }
}

No comments: