Class CompactHashSet<E>

java.lang.Object
java.util.AbstractCollection<E>
java.util.AbstractSet<E>
com.google.common.collect.CompactHashSet<E>
All Implemented Interfaces:
Serializable, Iterable<E>, Collection<E>, Set<E>
Direct Known Subclasses:
CompactLinkedHashSet

class CompactHashSet<E> extends AbstractSet<E> implements Serializable
CompactHashSet is an implementation of a Set. All optional operations (adding and removing) are supported. The elements can be any objects.

contains(x), add(x), and remove(x) are all (expected and amortized) constant-time operations. "Expected" is meant in the hash table sense (it depends on the hash function distributing the elements across the buckets in a way that is not far from uniform), and "amortized" because some operations can trigger a hash table resize.

Unlike java.util.HashSet, iteration time is proportional only to the actual size(), which is optimal, and not to the size of the internal hashtable, which could be much larger than size(). Furthermore, this structure depends only on a fixed number of arrays; add(x) operations do not create objects for the garbage collector to deal with, and for every element added, the garbage collector has to traverse 1.5 references on average in the marking phase, not 5.0 as in java.util.HashSet.

If there are no removals, then iteration order is the same as insertion order. Any removal invalidates any ordering guarantees.

This class should not be assumed to be universally superior to java.util.HashSet. Generally speaking, this class reduces object allocation and memory consumption at the price of moderately increased constant factors of CPU. Only use this class when there is a specific reason to prioritize memory over CPU.

  • Field Details

    • HASH_FLOODING_FPP

      static final double HASH_FLOODING_FPP
      Maximum allowed false positive probability of detecting a hash flooding attack given random input.
    • MAX_HASH_BUCKET_LENGTH

      private static final int MAX_HASH_BUCKET_LENGTH
      Maximum allowed length of a hash table bucket before falling back to a java.util.LinkedHashSet-based implementation. Experimentally determined.
    • table

      @CheckForNull private transient Object table
      The hashtable object. This can be either:
      • a byte[], short[], or int[], with size a power of two, created by CompactHashing.createTable, whose values are either
        • UNSET, meaning "null pointer"
        • one plus an index into the entries and elements arrays
      • another java.util.Set delegate implementation. In most modern JDKs, normal java.util hash collections intelligently fall back to a binary search tree if hash table collisions are detected. Rather than going to all the trouble of reimplementing this ourselves, we simply switch over to use the JDK implementation wholesale if probable hash flooding is detected, sacrificing the compactness guarantee in very rare cases in exchange for much more reliable worst-case behavior.
      • null, if no entries have yet been added to the set
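
      The array form of the table can be illustrated with a minimal, standalone sketch. It is not the Guava implementation: the class name CompactTableSketch is hypothetical, UNSET is assumed to be 0, the "next" pointers live in their own int[] instead of being packed into entries, new entries are prepended to the bucket chain for brevity, the hash is not smeared, and there is no resizing or hash flooding fallback.

```java
import java.util.Objects;

// Hypothetical illustration only; not the Guava class.
final class CompactTableSketch {
    // Assumption: 0 plays the role of the "null pointer", so stored slots are index + 1.
    static final int UNSET = 0;

    private final int[] table;       // bucket -> (entry index + 1), or UNSET
    private final int[] next;        // entry index -> (next entry index + 1), or UNSET
    private final Object[] elements; // entry index -> element
    private int size;

    CompactTableSketch(int tableSize, int capacity) {
        if (Integer.bitCount(tableSize) != 1) {
            throw new IllegalArgumentException("tableSize must be a power of two");
        }
        table = new int[tableSize];  // int arrays start out as all zeros, i.e. all UNSET
        next = new int[capacity];
        elements = new Object[capacity];
    }

    boolean contains(Object o) {
        int slot = table[Objects.hashCode(o) & (table.length - 1)];
        while (slot != UNSET) {
            int index = slot - 1;    // undo the "one plus an index" encoding
            if (Objects.equals(elements[index], o)) {
                return true;
            }
            slot = next[index];      // follow the bucket chain
        }
        return false;
    }

    boolean add(Object o) {
        if (contains(o)) {
            return false;
        }
        if (size == elements.length) {
            throw new IllegalStateException("this sketch does not resize");
        }
        int bucket = Objects.hashCode(o) & (table.length - 1);
        elements[size] = o;
        next[size] = table[bucket];  // old chain head becomes our successor
        table[bucket] = size + 1;    // store index + 1 so that 0 stays UNSET
        size++;
        return true;
    }

    int size() {
        return size;
    }
}
```

      Note that add allocates no per-element node objects; every element is described by slots in the three pre-allocated arrays.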
    • entries

      @CheckForNull private transient int[] entries
      Contains the logical entries, in the range of [0, size()). The high bits of each int are the part of the smeared hash of the element not covered by the hashtable mask, whereas the low bits are the "next" pointer (pointing to the next entry in the bucket chain), which will always be less than or equal to the hashtable mask.
       hash  = aaaaaaaa
       mask  = 00000fff
       next  = 00000bbb
       entry = aaaaabbb
       

      The pointers in [size(), entries.length) are all "null" (UNSET).
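
      The packing shown in the diagram above can be reproduced with plain bit operations. The following is a small sketch under stated assumptions: the class EntryPackingSketch and its method names are hypothetical and are not part of CompactHashSet.

```java
// Hypothetical illustration only; names are not taken from Guava.
final class EntryPackingSketch {
    // Keep the hash bits outside the mask and the next pointer inside it.
    static int pack(int smearedHash, int next, int mask) {
        return (smearedHash & ~mask) | (next & mask);
    }

    // The part of the hash stored in the entry (bits not covered by the mask).
    static int hashPrefix(int entry, int mask) {
        return entry & ~mask;
    }

    // The "next" pointer stored in the low bits of the entry.
    static int next(int entry, int mask) {
        return entry & mask;
    }

    public static void main(String[] args) {
        int hash = 0xaaaaaaaa;
        int mask = 0x00000fff;
        int next = 0x00000bbb;
        int entry = pack(hash, next, mask);
        System.out.println(Integer.toHexString(entry)); // prints aaaaabbb
    }
}
```

      The bits of the hash covered by the mask do not need to be stored, because they are already implied by the bucket the entry is chained from.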

    • elements

      @CheckForNull transient Object[] elements
      The elements contained in the set, in the range of [0, size()). The elements in [size(), elements.length) are all null.
    • metadata

      private transient int metadata
      Keeps track of metadata like the number of hash table bits and modifications of this data structure (to make it possible to throw ConcurrentModificationException in the iterator). Note that we choose not to make this volatile, so we do less of a "best effort" to track such errors, for better performance.
    • size

      private transient int size
      The number of elements contained in the set.
  • Constructor Details

    • CompactHashSet

      CompactHashSet()
      Constructs a new empty instance of CompactHashSet.
    • CompactHashSet

      CompactHashSet(int expectedSize)
      Constructs a new instance of CompactHashSet with the specified capacity.
      Parameters:
      expectedSize - the initial capacity of this CompactHashSet.
  • Method Details

    • create

      public static <E> CompactHashSet<E> create()
      Creates an empty CompactHashSet instance.
    • create

      public static <E> CompactHashSet<E> create(Collection<? extends E> collection)
      Creates a mutable CompactHashSet instance containing the elements of the given collection in unspecified order.
      Parameters:
      collection - the elements that the set should contain
      Returns:
      a new CompactHashSet containing those elements (minus duplicates)
    • create

      @SafeVarargs public static <E> CompactHashSet<E> create(E... elements)
      Creates a mutable CompactHashSet instance containing the given elements in unspecified order.
      Parameters:
      elements - the elements that the set should contain
      Returns:
      a new CompactHashSet containing those elements (minus duplicates)
    • createWithExpectedSize

      public static <E> CompactHashSet<E> createWithExpectedSize(int expectedSize)
      Creates a CompactHashSet instance, with a high enough "initial capacity" that it should hold expectedSize elements without growth.
      Parameters:
      expectedSize - the number of elements you expect to add to the returned set
      Returns:
      a new, empty CompactHashSet with enough capacity to hold expectedSize elements without resizing
      Throws:
      IllegalArgumentException - if expectedSize is negative
    • init

      void init(int expectedSize)
      Pseudoconstructor for serialization support.
    • needsAllocArrays

      boolean needsAllocArrays()
      Returns whether arrays need to be allocated.
    • allocArrays

      int allocArrays()
      Handle lazy allocation of arrays.
    • delegateOrNull

      @CheckForNull Set<E> delegateOrNull()
    • createHashFloodingResistantDelegate

      private Set<E> createHashFloodingResistantDelegate(int tableSize)
    • convertToHashFloodingResistantImplementation

      Set<E> convertToHashFloodingResistantImplementation()
    • isUsingHashFloodingResistance

      boolean isUsingHashFloodingResistance()
    • setHashTableMask

      private void setHashTableMask(int mask)
      Stores the hash table mask as the number of bits needed to represent an index.
    • hashTableMask

      private int hashTableMask()
      Gets the hash table mask using the stored number of hash table bits.
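
      The round trip between a mask and its bit count, as described for setHashTableMask and hashTableMask, might look like the sketch below. The class MaskBitsSketch is hypothetical, and the assumption that the bit count occupies the low 5 bits of the metadata word is made for illustration only.

```java
// Hypothetical illustration only; the field layout is an assumption.
final class MaskBitsSketch {
    // Assumption: the low 5 bits of metadata hold the number of hash table bits.
    static final int BITS_FIELD = 0x1f;

    // Returns metadata with the bit count of the given mask stored in the low bits.
    static int storeMask(int metadata, int mask) {
        int bits = Integer.SIZE - Integer.numberOfLeadingZeros(mask);
        return (metadata & ~BITS_FIELD) | (bits & BITS_FIELD);
    }

    // Rebuilds the mask (2^bits - 1) from the stored bit count.
    static int loadMask(int metadata) {
        return (1 << (metadata & BITS_FIELD)) - 1;
    }
}
```

      Because the table size is a power of two, the mask is always of the form 2^k - 1, so storing k loses no information and leaves the remaining metadata bits free for other uses such as the modification count.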
    • incrementModCount

      void incrementModCount()
    • add

      public boolean add(E object)
      Specified by:
      add in interface Collection<E>
      Specified by:
      add in interface Set<E>
      Overrides:
      add in class AbstractCollection<E>
    • insertEntry

      void insertEntry(int entryIndex, E object, int hash, int mask)
      Creates a fresh entry with the specified object at the specified position in the entry arrays.
    • resizeMeMaybe

      private void resizeMeMaybe(int newSize)
      Resizes the entries storage if necessary.
    • resizeEntries

      void resizeEntries(int newCapacity)
      Resizes the internal entries array to the specified capacity, which may be greater or less than the current capacity.
    • resizeTable

      private int resizeTable(int oldMask, int newCapacity, int targetHash, int targetEntryIndex)
    • contains

      public boolean contains(@CheckForNull Object object)
      Specified by:
      contains in interface Collection<E>
      Specified by:
      contains in interface Set<E>
      Overrides:
      contains in class AbstractCollection<E>
    • remove

      public boolean remove(@CheckForNull Object object)
      Specified by:
      remove in interface Collection<E>
      Specified by:
      remove in interface Set<E>
      Overrides:
      remove in class AbstractCollection<E>
    • moveLastEntry

      void moveLastEntry(int dstIndex, int mask)
      Moves the last entry in the entry array into dstIndex, and nulls out its old position.
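
      The effect on the elements array can be sketched as follows. This is a simplified illustration, not the Guava code: the class SwapRemoveSketch and the method removeAt are hypothetical, and the bucket chain fix-up that a real implementation also needs (repointing whichever slot referred to the moved entry) is deliberately omitted.

```java
import java.util.Arrays;

// Hypothetical illustration only; the hash table fix-up is omitted.
final class SwapRemoveSketch {
    // Removes the element at dstIndex by moving the last element into its place.
    // Returns the new size.
    static int removeAt(Object[] elements, int size, int dstIndex) {
        int last = size - 1;
        if (dstIndex < last) {
            elements[dstIndex] = elements[last]; // fill the hole with the last element
        }
        elements[last] = null;                   // null out the old position
        return last;
    }

    public static void main(String[] args) {
        Object[] elements = {"a", "b", "c", "d"};
        removeAt(elements, 4, 1);
        System.out.println(Arrays.toString(elements)); // prints [a, d, c, null]
    }
}
```

      Keeping the live entries dense in [0, size()) this way is also why a removal invalidates insertion order: "d" now precedes "c".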
    • firstEntryIndex

      int firstEntryIndex()
    • getSuccessor

      int getSuccessor(int entryIndex)
    • adjustAfterRemove

      int adjustAfterRemove(int indexBeforeRemove, int indexRemoved)
      Updates the index an iterator is pointing to after a call to remove: returns the index of the entry that should be looked at after a removal on indexRemoved, with indexBeforeRemove as the index that *was* the next entry that would be looked at.
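
      To see why the iterator index needs adjusting, consider a dense array where removal moves the last entry into the removed slot: that slot now holds an element the iterator has not visited yet. The sketch below assumes that layout and assumes that stepping back by one is the right adjustment for it; the class IteratorAdjustSketch and the method removeIf are hypothetical.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration only; assumes a dense array with swap-with-last removal.
final class IteratorAdjustSketch {
    // Assumption for this layout: revisit the slot that was just refilled.
    static int adjustAfterRemove(int indexBeforeRemove, int indexRemoved) {
        return indexBeforeRemove - 1;
    }

    // Removes every value matching the predicate and returns the new size.
    static int removeIf(int[] values, int size, IntPredicate shouldRemove) {
        int current = 0;
        while (current < size) {
            int returned = current;   // index of the value handed out by "next()"
            current = returned + 1;   // successor index before any removal
            if (shouldRemove.test(values[returned])) {
                size--;
                values[returned] = values[size]; // last entry moves into the hole
                current = adjustAfterRemove(current, returned);
            }
        }
        return size;
    }
}
```

      For example, removing the even values from {1, 2, 3, 4, 6} leaves {1, 3} in the first two slots; without the adjustment, the values moved into the freed slot would be skipped.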
    • iterator

      public Iterator<E> iterator()
      Specified by:
      iterator in interface Collection<E>
      Specified by:
      iterator in interface Iterable<E>
      Specified by:
      iterator in interface Set<E>
      Specified by:
      iterator in class AbstractCollection<E>
    • spliterator

      public Spliterator<E> spliterator()
      Specified by:
      spliterator in interface Collection<E>
      Specified by:
      spliterator in interface Iterable<E>
      Specified by:
      spliterator in interface Set<E>
    • forEach

      public void forEach(Consumer<? super E> action)
      Specified by:
      forEach in interface Iterable<E>
    • size

      public int size()
      Specified by:
      size in interface Collection<E>
      Specified by:
      size in interface Set<E>
      Specified by:
      size in class AbstractCollection<E>
    • isEmpty

      public boolean isEmpty()
      Specified by:
      isEmpty in interface Collection<E>
      Specified by:
      isEmpty in interface Set<E>
      Overrides:
      isEmpty in class AbstractCollection<E>
    • toArray

      public Object[] toArray()
      Specified by:
      toArray in interface Collection<E>
      Specified by:
      toArray in interface Set<E>
      Overrides:
      toArray in class AbstractCollection<E>
    • toArray

      public <T> T[] toArray(T[] a)
      Specified by:
      toArray in interface Collection<E>
      Specified by:
      toArray in interface Set<E>
      Overrides:
      toArray in class AbstractCollection<E>
    • trimToSize

      public void trimToSize()
      Ensures that this CompactHashSet has the smallest representation in memory, given its current size.
    • clear

      public void clear()
      Specified by:
      clear in interface Collection<E>
      Specified by:
      clear in interface Set<E>
      Overrides:
      clear in class AbstractCollection<E>
    • writeObject

      private void writeObject(ObjectOutputStream stream) throws IOException
      Throws:
      IOException
    • readObject

      private void readObject(ObjectInputStream stream) throws IOException, ClassNotFoundException
      Throws:
      IOException
      ClassNotFoundException
    • requireTable

      private Object requireTable()
    • requireEntries

      private int[] requireEntries()
    • requireElements

      private Object[] requireElements()
    • element

      private E element(int i)
    • entry

      private int entry(int i)
    • setElement

      private void setElement(int i, E value)
    • setEntry

      private void setEntry(int i, int value)