Hashing Blockchain

Guide to Hashing Functions

So what is Hashing and what is it used for?

Let’s start with the basic conceptual model of the hashing process. It has three parts: a value (the input) goes into a hashing algorithm (the process), which generates a message digest (the output). In other words, a hash function takes a value and applies a mathematical operation to it to produce an output known as a hash or a digest.

The hashing process takes in an input, uses a hashing algorithm, and produces an output
Figure 1 A conceptual model of the Hash Function. Image from Wikipedia.  

The Hash Table: A novel data structure

Hash tables fundamentally store key-value pairs. To store information in a simple hash table, you pass a key as input to a hashing algorithm. The algorithm produces an integer as its output, and that integer is used as the index at which the value is stored in the table.

A hash function maps a key to an index in an underlying array, and the value is the data stored at that index. In JavaScript and elsewhere, hash tables are also known as hash maps, maps, or dictionaries. More on this here.

Hash Tables: A practical example

Let’s say you want to look up a person’s telephone number. You can provide the person’s name into a hash function and use the output of the hash function (an integer) to know exactly at which index in the hash table the number can be found.
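As a rough sketch of the phone book idea, here is how it might look with JavaScript's built-in Map. JavaScript engines typically back Map with a hash table, so the hashing step happens behind the scenes. The names and numbers below are made up purely for illustration.

```javascript
// Hypothetical phone book; names and numbers are invented for this example.
const phoneBook = new Map();
phoneBook.set('Alice Smith', '555-0101');
phoneBook.set('Bob Jones', '555-0102');

// Look up a number by name. The Map finds the entry from the key
// without us scanning every entry ourselves.
const aliceNumber = phoneBook.get('Alice Smith');

// A key that was never stored simply returns undefined.
const missingNumber = phoneBook.get('Carol White');

console.log(aliceNumber);
console.log(phoneBook.has('Bob Jones'));
```

The point is that you go from a name straight to a number in one step, instead of reading through the whole list.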

Figure 2 Using Hashing Functions to retrieve encrypted user data. Image from Wikipedia.

Hashing is also used in file version control systems. To check whether two files differ, you can pass each file as input to a hashing algorithm. If both files produce the same output, we can be almost certain that they are identical.
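Here is a minimal sketch of that file comparison in Node.js. The helper names fileDigest and filesMatch, the choice of SHA-256, and the temporary demo files are all assumptions made for this example; real version control systems do considerably more than this.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Return the SHA-256 digest of a file's contents as a hex string.
function fileDigest(filePath) {
  const contents = fs.readFileSync(filePath);
  return crypto.createHash('sha256').update(contents).digest('hex');
}

// Treat two files as identical when their digests match.
function filesMatch(pathA, pathB) {
  return fileDigest(pathA) === fileDigest(pathB);
}

// Demo using throwaway files in the system temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hash-demo-'));
const fileA = path.join(dir, 'a.txt');
const fileB = path.join(dir, 'b.txt');
const fileC = path.join(dir, 'c.txt');
fs.writeFileSync(fileA, 'version 1 of the file\n');
fs.writeFileSync(fileB, 'version 1 of the file\n');
fs.writeFileSync(fileC, 'version 2 of the file\n');

console.log(filesMatch(fileA, fileB)); // true: same contents, same digest
console.log(filesMatch(fileA, fileC)); // false: one character changed
```

Comparing two short digests is much cheaper than comparing two large files byte by byte, which is why this trick is so handy.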

Version control is integral to the success of a software project, so being able to compare file versions using hash functions can be very useful. Hash functions have also taken off in other industries. One is the financial sector, where we have seen the rise of cryptocurrencies such as Bitcoin, which are built on blockchain technology and rely on hash functions.

Hashing vs Encryption

Encryption is a two-way function; what is encrypted can be decrypted with the proper key. Hashing, however, is a one-way function that scrambles plain text to produce a fixed-length message digest. With a properly designed algorithm, there is no practical way to reverse the hashing process and reveal the original input, such as a password.
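To make the difference concrete, here is a small sketch using Node's crypto module. The choice of AES-256-CBC for encryption and SHA-256 for hashing is an assumption for illustration, and the message is made up.

```javascript
const crypto = require('crypto');

const message = 'my secret message';

// Encryption is two-way: with the key and IV we can get the message back.
const key = crypto.randomBytes(32); // 256-bit key
const iv = crypto.randomBytes(16);  // 128-bit initialization vector

const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
const encrypted = cipher.update(message, 'utf8', 'hex') + cipher.final('hex');

const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
const decrypted = decipher.update(encrypted, 'hex', 'utf8') + decipher.final('utf8');

console.log(decrypted === message); // true: the original text is recovered

// Hashing is one-way: there is no key and no matching "unhash" call.
const digest = crypto.createHash('sha256').update(message).digest('hex');
console.log(digest.length); // 64 hex characters, whatever the input length
```

Notice that the crypto module offers createDecipheriv to undo createCipheriv, but nothing that undoes createHash.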

For those of you interested in cybersecurity, here is a link to an interesting online tool for generating SHA-3 hashes: https://www.browserling.com/tools/sha3-hash

Using JavaScript to implement Hash Maps

Think of a hash map as a “hack” on top of an array. All we need is a function to convert a key into an array index (an integer). That function is called a hashing function.
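As a hedged illustration, here is one toy way to turn a string key into an array index. The function name hashKey and the multiply-by-31 scheme are assumptions chosen for simplicity; this is not what any particular JavaScript engine does, and it is not suitable for security purposes.

```javascript
// Toy hash function: converts a string key into an index
// between 0 and tableSize - 1.
function hashKey(key, tableSize) {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    // Mix in each character code, keeping the running value
    // inside the bounds of the table.
    hash = (hash * 31 + key.charCodeAt(i)) % tableSize;
  }
  return hash;
}

console.log(hashKey('a', 16)); // 1, because 97 % 16 === 1
console.log(hashKey('Alice Smith', 16) === hashKey('Alice Smith', 16)); // true
```

The same key always gives the same index, which is exactly what lets us find the value again later.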

There are many important reasons for using hashing. One application of hashing is using it to store passwords in databases. You may need to store your user’s password so that you know whether they have entered a valid password or not.

However, it would be very insecure to store all passwords as plain text in the database. Rather, it would be more secure if you applied a hashing function to the password and stored the hash/digest in the database instead.
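Here is a minimal sketch of storing and checking a password hash in Node.js. One caveat: fast general-purpose hashes are generally considered too weak for passwords on their own, so this sketch swaps in scrypt (a deliberately slow, salted key derivation function available in the crypto module). The helper names and the "salt:hash" storage format are assumptions for this example.

```javascript
const crypto = require('crypto');

// Hash a password with a fresh random salt.
// Returns "salt:hash" (both hex) so the salt can be stored with the hash.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const derived = crypto.scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${derived.toString('hex')}`;
}

// Re-derive the hash from the attempt and the stored salt, then compare.
function verifyPassword(attempt, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(attempt, Buffer.from(saltHex, 'hex'), expected.length);
  // timingSafeEqual avoids leaking information through comparison time.
  return crypto.timingSafeEqual(actual, expected);
}

const record = hashPassword('correct horse battery staple');
console.log(verifyPassword('correct horse battery staple', record)); // true
console.log(verifyPassword('wrong guess', record)); // false
```

The database only ever sees the value in record, never the password itself.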

Below is an example of how a value can be hashed in Node.js using the crypto module.

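const crypto = require('crypto'); // line 1: load Node's built-in crypto module
const hash = crypto.createHash('sha256'); // line 2: choose the hashing algorithm
hash.update('some data to hash'); // feed in the data to be hashed
console.log(hash.digest('hex')); // compute the digest and print it as hex

// Console Output
// 6a2da20943931e9834fc12cfe5bb47bbd9ae43489a30726962b576f4e3993e50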
Source of code: https://nodejs.org/api/crypto.html#crypto_class_hash

In line 2 of the code above, we specify which hashing algorithm to use to calculate the hash. In this example, we specify the SHA-256 algorithm. Another well-known hashing algorithm is MD5, although it is no longer considered secure.
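To see what changing the algorithm name does, here is a short sketch that hashes the same string with two algorithms. The helper digestOf is a hypothetical name, and the sketch assumes your Node.js build has MD5 enabled, which is usually the case.

```javascript
const crypto = require('crypto');

// Hash the given data with the named algorithm and return a hex digest.
function digestOf(algorithm, data) {
  return crypto.createHash(algorithm).update(data).digest('hex');
}

const data = 'some data to hash';
const sha256Hex = digestOf('sha256', data);
const md5Hex = digestOf('md5', data);

console.log(sha256Hex.length); // 64 hex characters (256 bits)
console.log(md5Hex.length);    // 32 hex characters (128 bits)
```

If you want to see which algorithm names your installation accepts, crypto.getHashes() returns the list.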

Back to hash maps: to look up the value for a given key, we just run the key through our hashing function to get the index in the underlying array, then grab the value stored there.
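Putting store and lookup together, a hash map on top of an array might be sketched like this. The class name SimpleHashMap, the toy string hash, and the use of small buckets to handle keys that land on the same index are all assumptions for illustration, not how the built-in Map is implemented.

```javascript
class SimpleHashMap {
  constructor(size = 16) {
    this.size = size;
    // Each array slot holds a bucket: a list of [key, value] pairs.
    this.buckets = Array.from({ length: size }, () => []);
  }

  // Toy string hash: turns a key into an index from 0 to size - 1.
  _index(key) {
    let hash = 0;
    for (let i = 0; i < key.length; i++) {
      hash = (hash * 31 + key.charCodeAt(i)) % this.size;
    }
    return hash;
  }

  // Store a value: hash the key, go to that bucket, add or overwrite.
  set(key, value) {
    const bucket = this.buckets[this._index(key)];
    const entry = bucket.find(([k]) => k === key);
    if (entry) {
      entry[1] = value;
    } else {
      bucket.push([key, value]);
    }
  }

  // Look up a value: hash the key, go to that bucket, find the pair.
  get(key) {
    const bucket = this.buckets[this._index(key)];
    const entry = bucket.find(([k]) => k === key);
    return entry ? entry[1] : undefined;
  }
}

const contacts = new SimpleHashMap();
contacts.set('a', 'first');
contacts.set('ab', 'second'); // lands on the same index as 'a' in a 16-slot table

console.log(contacts.get('a'));  // first
console.log(contacts.get('ab')); // second
```

Two different keys can hash to the same index (a collision), which is why each slot here holds a small list rather than a single value.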

How does that hashing method work? There are a few different approaches, and they can get pretty complicated. But here’s a simple proof of concept: https://www.interviewcake.com/concept/java/hash-map

Conclusion

Have a look at the various code repos you have published on GitHub as well as your private projects. How have you used hashing functions in past projects?

If you are interested in learning more about hashing functions, check out this cool YouTube video lecture where Professor Devadas covers the basics of cryptography and its applications to security:

I would encourage you to look at the various projects in your repos and consider the security weak points of each repo. One of those issues might be an opportunity to use hashing functions to protect users.

Have you ever had a bad actor hack into your system? How did you regain control of the situation? Let me hear what you have to say in the comment section below.