Hashset internal implementation in java

In this post we will see, how the internal working of HashSet in java or How HashSet stores only unique values. It most common and important question of core java. We have read in recent posts, A HashSet contains only unique values and can contain only one null value. So, let’s find the answer and also discuss the HashSet internal implementation in java.

Here is the table content of the article will we will cover this topic.
1. Internal working of HashSet in java?
2. Why duplicate not allowed in HashSet?
3. Why HashSet can contain only one null?

1. Internal working of HashSet in java?

The HashSet internally uses the HashMap to store the objects. Before a deep dive, we recommend you should be familiar with HashMap. Whenever we create an object HashSet, internally creates the object of HashMap also. Let’s see the code of each constructor of HashSet in JDK

public HashSet() 
{
    map = new HashMap<>();
}

public HashSet(int initialCapacity) 
{
    map = new HashMap<>(initialCapacity);
}
	
public HashSet(int initialCapacity, float loadFactor)
{
     map = new HashMap<>(initialCapacity, loadFactor);
}
	
public HashSet(Collection<? extends E> c) 
{
       map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
       addAll(c);
}

As we can see the constructor of HashSet creates the object of HashMap. The object of HashMap stores all the elements that we enter in the HashSet. Now the question arise HashMap always stores the data based on key and value pair. But in the HashSet, we just provide the value. Let’s have a look at the add(E e) method of HashSet.

public boolean add(E e) 
{
        return map.put(e, PRESENT)==null;
 }

The element that we add into HashSet stores as keys of this HashMap object and a constant store as value.

2. Why duplicate not allowed in HashSet?

Everyone knows a HashSet doesn’t contain duplicate values. We can add values in HashSet by use of add(E e) method, it returns true if the element is not already present in HashSet otherwise false. The add(E e) method uses the object of HashMap and put the element as a key and a constant called “PRESENT” as a value. This “PRESENT” is defined in the HashSet class as below.

// Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();
	
public boolean add(E e)
{
    return map.put(e, PRESENT)==null;
}

The “PRESENT” is a static final object of the Object class that is used to put as a value in HashMap. We already know a HashMap can’t contain duplicate keys.
Let’s take java hashset example, how does it work internally in memory and HashSet implementation in java.

import java.util.HashSet;
import java.util.Set;

public class HashSetExample
{
    public static void main(String[] args) 
    {
    	Set<String> hashSet = new HashSet<String>();
    	hashSet.add("Hello");
    }
}

The add(E e) method invokes the put(e, PRESNT) method, it returns null if element e is not existing in HashMap. If element e exists, then it returns the element.

Hashset internal implementation in java

Let’s try to add duplicate element in HashSet.

import java.util.HashSet;
import java.util.Set;

public class HashSetExample
{
    public static void main(String[] args) 
    {
    	Set<String> hashSet = new HashSet<String>();
    	hashSet.add("Hello");
    	hashSet.add("Hello");
    	System.out.println("Size of HashSet: "+hashSet.size());
    }
}

It is not adding the duplicate value because a HashMap can’t contains duplicate key. So the put(e, PRESENT) method returning an object that is not equal to null.

import java.util.HashSet;
import java.util.Set;

public class HashSetExample
{
    public static void main(String[] args) 
    {
    	Set<String> hashSet = new HashSet<String>();
    	hashSet.add("Hello");
    	hashSet.add("Java");
    	hashSet.add("Goal");
    	System.out.println("Size of HashSet: "+hashSet.size());
    }
}
hashset internal implementation in java

3. Why HashSet can contain only one null?

As we have discussed the HashSet internally use the HashMap. So, when we add the element in HashSet, it internally adds the element in object of HashMap as we have discussed in above section.

So, when we add the null value more then one time, it considered as duplicate in HashSet because internal HashMap can contain only unique key.  

Leave a Comment