Hashing Explained
When it comes to security, the weakest link in the chain is usually the good old-fashioned password. Attackers who break into systems often publish the usernames and passwords they find, so it’s important to keep stored password data secure. Techniques such as ‘hashing’ and ‘salting’ are commonly used to keep these passwords safe, so let’s take a look at how they work.
What is hashing?
When hashing data, a hash function is used to create a fixed-length value based on the original data. This means that, regardless of the amount of data input, the output is always the same size. The output is commonly referred to as a hash value, digest or simply hash. A hash function must always produce the same output for the same input, and different inputs should, in practice, produce different outputs. Two different inputs that produce the same output are referred to as a collision. Because the output has a fixed length while the input can be of any length, collisions must exist in principle; a good hash function makes them infeasible to find, and is therefore referred to as being collision resistant. A cryptographic hash function is also a one-way process, meaning that the hash is easy to compute from the input data, but it is computationally infeasible to recover the input data from the hash.
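As a minimal sketch of these properties, the following uses SHA-256 from Python’s standard hashlib module. The choice of SHA-256 and the helper name sha256_hex are assumptions made for illustration; the text above does not name a particular hash function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Fixed length: one byte or a million bytes of input both give
# a 256-bit digest, printed here as 64 hexadecimal characters.
print(len(short_digest), len(long_digest))

# Deterministic: the same input always gives the same digest.
print(sha256_hex(b"password") == sha256_hex(b"password"))

# Changing a single character of the input gives a different digest.
print(sha256_hex(b"password") == sha256_hex(b"passwore"))
```

The three print calls illustrate, in order, the fixed output size, the repeatability of the function, and the sensitivity of the output to small changes in the input.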
Hashing for security
The one-way nature of hash functions is why hashing is recommended for password storage. When passwords are stored as hashes, anyone who steals the stored hashes can’t simply reverse them to recover the passwords. When verifying a password at user login, the submitted password is hashed in the same way as the stored one, and the two hashes are compared to see if they match. An attacker with a list of hashed passwords can, however, hash candidate passwords one after another until one of them produces a match.
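Both the login check and the attacker’s guessing loop can be sketched in a few lines. This is an illustration only: the function names, the sample password and the tiny word list are hypothetical, and plain unsalted SHA-256 is used just to keep the example short.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Hash a password with SHA-256 (unsalted, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the submitted password the same way and compare the hashes."""
    # compare_digest compares in constant time, which avoids leaking
    # how many leading characters of the two hashes matched.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


def guess_password(stored_hash: str, candidates):
    """Attacker's view: hash each candidate until one matches the stolen hash."""
    for candidate in candidates:
        if hash_password(candidate) == stored_hash:
            return candidate
    return None


# Only the hash is stored, never the password itself.
stored = hash_password("letmein")

print(verify_password("letmein", stored))   # the correct password
print(verify_password("letmeout", stored))  # a wrong password

# A weak password is found as soon as it appears in the attacker's list.
wordlist = ["123456", "password", "letmein", "qwerty"]
print(guess_password(stored, wordlist))
```

The same hash_password function serves both the defender and the attacker, which is why a weak password remains exposed even though the hash itself cannot be reversed.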
Adding a dash of salt
While this guessing process usually takes time, it reveals matches for weaker passwords much faster. To make password discovery harder, stored passwords are usually salted. The salt is a random string of characters and is usually fairly long. Each user has their own salt value, which is added to their password before it is hashed. The salt needs to be stored so that it can be used again when the password is verified, and it isn’t designed to be kept secret. The purpose of salting is that an attacker with a database of password hashes can’t just build one list of hashes to reveal passwords, but would need to generate a list for each user, using that user’s unique salt and each potential password. Salting and hashing are intended to make the process of recovering passwords so time-consuming that it is no longer worth the effort.
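A minimal sketch of per-user salting might look like the following. The function names, the 16-byte salt length and the use of SHA-256 are assumptions made for illustration; production systems generally rely on dedicated, deliberately slow password hashing functions rather than a single fast hash.

```python
import hashlib
import hmac
import secrets


def new_salted_hash(password: str):
    """Create a random salt and return (salt, hash of salt + password)."""
    salt = secrets.token_bytes(16)  # assumed length; stored next to the hash
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_salted(password: str, salt: bytes, stored_digest: str) -> bool:
    """Re-hash the submitted password with the user's stored salt and compare."""
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, stored_digest)


# Two users who happen to choose the same password...
alice_salt, alice_hash = new_salted_hash("hunter2")
bob_salt, bob_hash = new_salted_hash("hunter2")

# ...end up with different stored hashes, so a single precomputed
# list of hashes cannot reveal both passwords at once.
print(alice_hash != bob_hash)

# Verification still works because each salt is stored with its hash.
print(verify_salted("hunter2", alice_salt, alice_hash))
print(verify_salted("hunter2", bob_salt, bob_hash))
```

Because each salt is stored alongside its hash, an attacker can still guess passwords for one user at a time, but the work done for one user cannot be reused for another.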
Additional hash uses
Hashes are also used to verify data integrity. By taking a hash of the data, the supplier can provide the hash as an identifier of the legitimate data. The end user can create a hash of the data that they have, and check whether the hashes match. If they do match, that can be taken as evidence that the data has not been tampered with. This technique is commonly used when providing downloads of software from mirrors or other places outside of the creator’s control. In a similar manner, hashes can be used to verify the integrity of messages and other data that you suspect may have been tampered with. For this purpose, it’s important that you know that the recorded hash itself hasn’t been tampered with.
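The download check described above can be sketched as follows. The function names and the temporary file standing in for a downloaded release are hypothetical, and SHA-256 is assumed as the published hash type.

```python
import hashlib
import hmac
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's hash with the hash published by the supplier."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())


# A temporary file stands in for a download from a mirror.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"release contents")
    download_path = tmp.name

# The hash the supplier would publish for the legitimate data.
published = hashlib.sha256(b"release contents").hexdigest()
intact = matches_published_hash(download_path, published)

# Simulate tampering by appending a single byte to the file.
with open(download_path, "ab") as f:
    f.write(b"!")
tampered = matches_published_hash(download_path, published)

os.remove(download_path)
print(intact, tampered)
```

The check is only as trustworthy as the published hash, so that hash should be obtained from a source the attacker could not have altered.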
As with the password lookup, hashes are also used to search for large data entries in databases and similar systems. While comparing large data entries directly may be slow and computationally expensive, taking a hash of the new data and comparing it to the hashes of the existing data can be far faster. The hash can also be used as a fingerprint for a message or other data in order to identify it later without needing to see the whole thing.
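As a rough sketch of this kind of lookup, the following keeps a dictionary keyed by each entry’s hash. The FingerprintIndex class and its method names are hypothetical and only meant to illustrate the idea; real databases implement this in their own ways.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a short, fixed-size identifier for arbitrarily large data."""
    return hashlib.sha256(data).hexdigest()


class FingerprintIndex:
    """Finds stored entries by their hash instead of comparing full contents."""

    def __init__(self):
        self._by_hash = {}

    def add(self, data: bytes) -> str:
        """Store the data under its fingerprint and return the fingerprint."""
        fp = fingerprint(data)
        self._by_hash.setdefault(fp, data)
        return fp

    def contains(self, data: bytes) -> bool:
        """Check for the data by hashing it once and looking up the hash."""
        return fingerprint(data) in self._by_hash

    def get(self, fp: str):
        """Retrieve the stored data later using only its fingerprint."""
        return self._by_hash.get(fp)


index = FingerprintIndex()
large_entry = b"x" * 5_000_000
entry_id = index.add(large_entry)

print(index.contains(large_entry))         # found via its hash
print(index.contains(b"something else"))   # not in the index
print(index.get(entry_id) == large_entry)  # identified by fingerprint alone
```

Each lookup hashes the new data once and then compares short fixed-size values, rather than comparing the new data against every stored entry in full.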
There are many uses for hashes, and quite a number of hash functions, each better suited to some use cases than to others.