steelcity_ballin wrote:One MD5s biggest problems is the fact that two completely different passwords can potentially generate the same exact hash. To account for this, you salt the password with some other fields further lessening the likelihood. Some salts can be a combination of fields for the same user such as the Id in the database, really any field(s). So when the user enters their password "Phoenix123", perhaps search the database for their username and password, salting the password as MD5(MyUserName+Phoenix123) and find the match, for example.
This is a problem with any hash function, it is not unique to MD5. Collisions are NOT the reason salts are used.
Salts are used so that you cannot build what are called rainbow tables. A rainbow table is simply a precomputed table of all possible hashes given some input rule. So I can generate a rainbow table of say all MD5 values of 8 digit passwords build from alphanumeric input, if your password is 8 digits or less and doesn't use special characters it's MD5 value will be in my database. Depending on your computing resources you can generate rainbow tables of varying input types and lengths.
If you use salts then instead of storing MD5("password") you store MD5("password"+"random salt"). Now if an attacker acquires your list of MD5 hashes, and they also know the salt, they will not be able to use their precomputed rainbow table. This increases the computational resources the attacker has to throw at the problem.
steelcity_ballin wrote:The real trick is to make the operation too expensive to be feasible via brute force. I read an article the other day that stated that you can easily buy/rent cloud computer power in the thousands of CPUs for pretty cheap. Consider the Gawker incident recently; they were not able to, in their own time table, to break any password over 8 characters because it was too expensive. They did manage to get a ridiculous number of accounts that were
A) not salted
B) hashed with DES (old, out dated, inappropriate)
C) weak passwords [less than 8 characters]
Some of this is right, some not so. The problem with gawker was that they were using an inappropriate implementation that only looked at the first 8 digits of a users entered password. So even if a user had a 20 digit password, only the first 8 were being hashed and compared. Obviously this is a broken approach regardless of your hashing algorithm (though there are good reasons why MD5 should not be used anymore).
Remember that we are discussing here a worst case scenario (i.e an attacker has compromised your user account database), the aim of salting and hashing is to prevent an attacker being able to do anything useful with that information; it would be a lot better to prevent that compromise in the first place! Per user salts and a hash function such as bcrypt that takes a lot of computational time is a solution to *ONE* specific problem, "How do I stop an attacker who has acquired my user database from discovering my user's passwords?", there are still many other security related questions that need to be addressed in an application like this.
To understand recursion, you must first understand recursion.