[Note: this document is formatted similarly to the SGI STL implementation documentation pages, and refers to concepts and classes defined there. However, neither this document nor the code it describes is associated with SGI, nor is it necessary to have SGI's STL implementation installed in order to use this class.]

dense_hash_map<Key, Data, HashFcn, EqualKey, Alloc>

dense_hash_map is a Hashed Associative Container that associates objects of type Key with objects of type Data. dense_hash_map is a Pair Associative Container, meaning that its value type is pair<const Key, Data>. It is also a Unique Associative Container, meaning that no two elements have keys that compare equal using EqualKey.

Looking up an element in a dense_hash_map by its key is efficient, so dense_hash_map is useful for "dictionaries" where the order of elements is irrelevant. If it is important for the elements to be in a particular order, however, then map is more appropriate.

dense_hash_map is distinguished from other hash-map implementations by its speed and by the ability to save and restore contents to disk. On the other hand, this hash-map implementation can use significantly more space than other hash-map implementations, and it also has requirements -- for instance, for a distinguished "empty key" -- that may not be easy for all applications to satisfy.

This class is appropriate for applications that need speedy access to relatively small "dictionaries" stored in memory, or for applications that need these dictionaries to be persistent. [implementation note]

Example

(Note: this example uses SGI semantics for hash<> -- the kind used by gcc and most Unix compiler suites -- and not Dinkumware semantics -- the kind used by Microsoft Visual Studio. If you are using MSVC, this example will not compile as-is: you'll need to change hash to hash_compare, and you won't use eqstr at all. See the MSVC documentation for hash_map and hash_compare for more details.)
#include <iostream>
#include <cstring>   // for strcmp, used by eqstr below
#include <sparsehash/dense_hash_map>

using google::dense_hash_map;      // namespace where class lives by default
using std::cout;
using std::endl;
using ext::hash;  // or __gnu_cxx::hash, or maybe tr1::hash, depending on your OS

struct eqstr
{
  bool operator()(const char* s1, const char* s2) const
  {
    return (s1 == s2) || (s1 && s2 && strcmp(s1, s2) == 0);
  }
};

int main()
{
  dense_hash_map<const char*, int, hash<const char*>, eqstr> months;
  
  months.set_empty_key(NULL);
  months["january"] = 31;
  months["february"] = 28;
  months["march"] = 31;
  months["april"] = 30;
  months["may"] = 31;
  months["june"] = 30;
  months["july"] = 31;
  months["august"] = 31;
  months["september"] = 30;
  months["october"] = 31;
  months["november"] = 30;
  months["december"] = 31;
  
  cout << "september -> " << months["september"] << endl;
  cout << "april     -> " << months["april"] << endl;
  cout << "june      -> " << months["june"] << endl;
  cout << "november  -> " << months["november"] << endl;
}

Definition

Defined in the header dense_hash_map. This class is not part of the C++ standard, though it is mostly compatible with the tr1 class unordered_map.

Template parameters

Parameter    Description    Default
Key The hash_map's key type. This is also defined as dense_hash_map::key_type.  
Data The hash_map's data type. This is also defined as dense_hash_map::data_type. [7]  
HashFcn The hash function used by the hash_map. This is also defined as dense_hash_map::hasher.
Note: Hashtable performance depends heavily on the choice of hash function. See the performance page for more information.
hash<Key>
EqualKey The hash_map key equality function: a binary predicate that determines whether two keys are equal. This is also defined as dense_hash_map::key_equal. equal_to<Key>
Alloc The STL allocator to use. By default, uses the provided allocator libc_allocator_with_realloc, which likely gives better performance than other STL allocators due to its built-in support for realloc, which this container takes advantage of. If you use an allocator other than the default, note that this container imposes an additional requirement on the STL allocator type beyond those in [lib.allocator.requirements]: it does not support allocators that define alternate memory models. That is, it assumes that pointer, const_pointer, size_type, and difference_type are just T*, const T*, size_t, and ptrdiff_t, respectively. This is also defined as dense_hash_map::allocator_type.

Model of

Unique Hashed Associative Container, Pair Associative Container

Type requirements

Public base classes

None.

Members

Member    Where defined    Description
key_type Associative Container The dense_hash_map's key type, Key.
data_type Pair Associative Container The type of object associated with the keys.
value_type Pair Associative Container The type of object, pair<const key_type, data_type>, stored in the hash_map.
hasher Hashed Associative Container The dense_hash_map's hash function.
key_equal Hashed Associative Container Function object that compares keys for equality.
allocator_type Unordered Associative Container (tr1) The type of the Allocator given as a template parameter.
pointer Container Pointer to T.
reference Container Reference to T
const_reference Container Const reference to T
size_type Container An unsigned integral type.
difference_type Container A signed integral type.
iterator Container Iterator used to iterate through a dense_hash_map. [1]
const_iterator Container Const iterator used to iterate through a dense_hash_map.
local_iterator Unordered Associative Container (tr1) Iterator used to iterate through a subset of dense_hash_map. [1]
const_local_iterator Unordered Associative Container (tr1) Const iterator used to iterate through a subset of dense_hash_map.
iterator begin() Container Returns an iterator pointing to the beginning of the dense_hash_map.
iterator end() Container Returns an iterator pointing to the end of the dense_hash_map.
const_iterator begin() const Container Returns a const_iterator pointing to the beginning of the dense_hash_map.
const_iterator end() const Container Returns a const_iterator pointing to the end of the dense_hash_map.
local_iterator begin(size_type i) Unordered Associative Container (tr1) Returns a local_iterator pointing to the beginning of bucket i in the dense_hash_map.
local_iterator end(size_type i) Unordered Associative Container (tr1) Returns a local_iterator pointing to the end of bucket i in the dense_hash_map. For dense_hash_map, each bucket contains either 0 or 1 item.
const_local_iterator begin(size_type i) const Unordered Associative Container (tr1) Returns a const_local_iterator pointing to the beginning of bucket i in the dense_hash_map.
const_local_iterator end(size_type i) const Unordered Associative Container (tr1) Returns a const_local_iterator pointing to the end of bucket i in the dense_hash_map. For dense_hash_map, each bucket contains either 0 or 1 item.
size_type size() const Container Returns the size of the dense_hash_map.
size_type max_size() const Container Returns the largest possible size of the dense_hash_map.
bool empty() const Container true if the dense_hash_map's size is 0.
size_type bucket_count() const Hashed Associative Container Returns the number of buckets used by the dense_hash_map.
size_type max_bucket_count() const Hashed Associative Container Returns the largest possible number of buckets used by the dense_hash_map.
size_type bucket_size(size_type i) const Unordered Associative Container (tr1) Returns the number of elements in bucket i. For dense_hash_map, this will be either 0 or 1.
size_type bucket(const key_type& key) const Unordered Associative Container (tr1) If the key exists in the map, returns the index of the bucket containing it; otherwise, returns the bucket the key would be inserted into. This value may be passed to begin(size_type) and end(size_type).
float load_factor() const Unordered Associative Container (tr1) The number of elements in the dense_hash_map divided by the number of buckets.
float max_load_factor() const Unordered Associative Container (tr1) The maximum load factor before increasing the number of buckets in the dense_hash_map.
void max_load_factor(float new_grow) Unordered Associative Container (tr1) Sets the maximum load factor before increasing the number of buckets in the dense_hash_map.
float min_load_factor() const dense_hash_map The minimum load factor before decreasing the number of buckets in the dense_hash_map.
void min_load_factor(float new_grow) dense_hash_map Sets the minimum load factor before decreasing the number of buckets in the dense_hash_map.
void set_resizing_parameters(float shrink, float grow) dense_hash_map DEPRECATED. See below.
void resize(size_type n) Hashed Associative Container Increases the bucket count to hold at least n items. [4] [5]
void rehash(size_type n) Unordered Associative Container (tr1) Increases the bucket count to hold at least n items. This is identical to resize. [4] [5]
hasher hash_funct() const Hashed Associative Container Returns the hasher object used by the dense_hash_map.
hasher hash_function() const Unordered Associative Container (tr1) Returns the hasher object used by the dense_hash_map. This is identical to hash_funct.
key_equal key_eq() const Hashed Associative Container Returns the key_equal object used by the dense_hash_map.
allocator_type get_allocator() const Unordered Associative Container (tr1) Returns the allocator_type object used by the dense_hash_map: either the one passed in to the constructor, or a default Alloc instance.
dense_hash_map() Container Creates an empty dense_hash_map.
dense_hash_map(size_type n) Hashed Associative Container Creates an empty dense_hash_map that's optimized for holding up to n items. [5]
dense_hash_map(size_type n, const hasher& h) Hashed Associative Container Creates an empty dense_hash_map that's optimized for up to n items, using h as the hash function.
dense_hash_map(size_type n, const hasher& h, const key_equal& k) Hashed Associative Container Creates an empty dense_hash_map that's optimized for up to n items, using h as the hash function and k as the key equal function.
dense_hash_map(size_type n, const hasher& h, const key_equal& k, const allocator_type& a) Unordered Associative Container (tr1) Creates an empty dense_hash_map that's optimized for up to n items, using h as the hash function, k as the key equal function, and a as the allocator object.
template <class InputIterator>
dense_hash_map(InputIterator f, InputIterator l) 
[2]
Unique Hashed Associative Container Creates a dense_hash_map with a copy of a range.
template <class InputIterator>
dense_hash_map(InputIterator f, InputIterator l, size_type n) 
[2]
Unique Hashed Associative Container Creates a hash_map with a copy of a range that's optimized to hold up to n items.
template <class InputIterator>
dense_hash_map(InputIterator f, InputIterator l, size_type n, const
hasher& h) 
[2]
Unique Hashed Associative Container Creates a hash_map with a copy of a range that's optimized to hold up to n items, using h as the hash function.
template <class InputIterator>
dense_hash_map(InputIterator f, InputIterator l, size_type n, const
hasher& h, const key_equal& k) 
[2]
Unique Hashed Associative Container Creates a hash_map with a copy of a range that's optimized for holding up to n items, using h as the hash function and k as the key equal function.
template <class InputIterator>
dense_hash_map(InputIterator f, InputIterator l, size_type n, const
hasher& h, const key_equal& k, const allocator_type& a) 
[2]
Unordered Associative Container (tr1) Creates a hash_map with a copy of a range that's optimized for holding up to n items, using h as the hash function, k as the key equal function, and a as the allocator object.
dense_hash_map(const hash_map&) Container The copy constructor.
dense_hash_map& operator=(const hash_map&) Container The assignment operator
void swap(hash_map&) Container Swaps the contents of two hash_maps.
pair<iterator, bool> insert(const value_type& x)
Unique Associative Container Inserts x into the dense_hash_map.
template <class InputIterator>
void insert(InputIterator f, InputIterator l) 
[2]
Unique Associative Container Inserts a range into the dense_hash_map.
void set_empty_key(const key_type& key) [6] dense_hash_map See below.
void set_deleted_key(const key_type& key) [6] dense_hash_map See below.
void clear_deleted_key() [6] dense_hash_map See below.
void erase(iterator pos) Associative Container Erases the element pointed to by pos. [6]
size_type erase(const key_type& k) Associative Container Erases the element whose key is k. [6]
void erase(iterator first, iterator last) Associative Container Erases all elements in a range. [6]
void clear() Associative Container Erases all of the elements.
void clear_no_resize() dense_hash_map See below.
const_iterator find(const key_type& k) const Associative Container Finds an element whose key is k.
iterator find(const key_type& k) Associative Container Finds an element whose key is k.
size_type count(const key_type& k) const Unique Associative Container Counts the number of elements whose key is k.
pair<const_iterator, const_iterator> equal_range(const
key_type& k) const 
Associative Container Finds a range containing all elements whose key is k.
pair<iterator, iterator> equal_range(const
key_type& k) 
Associative Container Finds a range containing all elements whose key is k.
data_type& operator[](const key_type& k) [3] 
dense_hash_map See below.
template <typename ValueSerializer, typename OUTPUT> bool serialize(ValueSerializer serializer, OUTPUT *fp) dense_hash_map See below.
template <typename ValueSerializer, typename INPUT> bool unserialize(ValueSerializer serializer, INPUT *fp) dense_hash_map See below.
NopointerSerializer dense_hash_map See below.
bool write_metadata(FILE *fp) dense_hash_map DEPRECATED. See below.
bool read_metadata(FILE *fp) dense_hash_map DEPRECATED. See below.
bool write_nopointer_data(FILE *fp) dense_hash_map DEPRECATED. See below.
bool read_nopointer_data(FILE *fp) dense_hash_map DEPRECATED. See below.
bool operator==(const hash_map&, const hash_map&)
Hashed Associative Container Tests two hash_maps for equality. This is a global function, not a member function.

New members

These members are not defined in the Unique Hashed Associative Container, Pair Associative Container, or tr1's Unordered Associative Container requirements, but are specific to dense_hash_map.
Member    Description
void set_empty_key(const key_type& key) Sets the distinguished "empty" key to key. This must be called immediately after construction, before any other dense_hash_map operation. [6]
void set_deleted_key(const key_type& key) Sets the distinguished "deleted" key to key. This must be called before any calls to erase(). [6]
void clear_deleted_key() Clears the distinguished "deleted" key. After this is called, calls to erase() are not valid on this object. [6]
void clear_no_resize() Clears the hashtable like clear() does, but does not recover the memory used for hashtable buckets. (The memory used by the items in the hashtable is still recovered.) This can save time for applications that want to reuse a dense_hash_map many times, each time with a similar number of objects.
data_type& 
operator[](const key_type& k) [3]
Returns a reference to the object that is associated with a particular key. If the dense_hash_map does not already contain such an object, operator[] inserts the default object data_type(). [3]
void set_resizing_parameters(float shrink, float grow) This function is DEPRECATED. It is equivalent to calling min_load_factor(shrink); max_load_factor(grow).
template <typename ValueSerializer, typename OUTPUT> bool serialize(ValueSerializer serializer, OUTPUT *fp) Emits a serialization of the hash_map to a stream. See below.
template <typename ValueSerializer, typename INPUT> bool unserialize(ValueSerializer serializer, INPUT *fp) Reads in a serialization of a hash_map from a stream, replacing the existing hash_map contents with the serialized contents. See below.
bool write_metadata(FILE *fp) This function is DEPRECATED. See below.
bool read_metadata(FILE *fp) This function is DEPRECATED. See below.
bool write_nopointer_data(FILE *fp) This function is DEPRECATED. See below.
bool read_nopointer_data(FILE *fp) This function is DEPRECATED. See below.

Notes

[1] dense_hash_map::iterator is not a mutable iterator, because dense_hash_map::value_type is not Assignable. That is, if i is of type dense_hash_map::iterator and p is of type dense_hash_map::value_type, then *i = p is not a valid expression. However, dense_hash_map::iterator isn't a constant iterator either, because it can be used to modify the object that it points to. Using the same notation as above, (*i).second = p is a valid expression.

[2] This member function relies on member template functions, which may not be supported by all compilers. If your compiler supports member templates, you can call this function with any type of input iterator. If your compiler does not yet support member templates, though, then the arguments must either be of type const value_type* or of type dense_hash_map::const_iterator.

[3] Since operator[] might insert a new element into the dense_hash_map, it can't possibly be a const member function. Note that the definition of operator[] is extremely simple: m[k] is equivalent to (*((m.insert(value_type(k, data_type()))).first)).second. Strictly speaking, this member function is unnecessary: it exists only for convenience.

[4] In order to preserve iterators, erasing hashtable elements does not cause a hashtable to resize. This means that after a string of erase() calls, the hashtable will use more space than is required. At a cost of invalidating all current iterators, you can call resize() to manually compact the hashtable. The hashtable promotes too-small resize() arguments to the smallest legal value, so to compact a hashtable, it's sufficient to call resize(0).

[5] Unlike some other hashtable implementations, the optional n in the calls to the constructor, resize, and rehash indicates not the desired number of buckets that should be allocated, but instead the expected number of items to be inserted. The class then sizes the hash-map appropriately for the number of items specified. It's not an error to actually insert more or fewer items into the hashtable, but the implementation is most efficient -- does the fewest hashtable resizes -- if the number of inserted items is n or slightly less.

[6] dense_hash_map requires you call set_empty_key() immediately after constructing the hash-map, and before calling any other dense_hash_map method. (This is the largest difference between the dense_hash_map API and other hash-map APIs. See implementation.html for why this is necessary.) The argument to set_empty_key() should be a key-value that is never used for legitimate hash-map entries. If you have no such key value, you will be unable to use dense_hash_map. It is an error to call insert() with an item whose key is the "empty key."

dense_hash_map also requires you call set_deleted_key() before calling erase(). The argument to set_deleted_key() should be a key-value that is never used for legitimate hash-map entries. It must be different from the key-value used for set_empty_key(). It is an error to call erase() without first calling set_deleted_key(), and it is also an error to call insert() with an item whose key is the "deleted key."

There is no need to call set_deleted_key if you do not wish to call erase() on the hash-map.

It is acceptable to change the deleted-key at any time by calling set_deleted_key() with a new argument. You can also call clear_deleted_key(), at which point all keys become valid for insertion but no hashtable entries can be deleted until set_deleted_key() is called again.

[7] dense_hash_map requires that data_type has a zero-argument default constructor. This is because dense_hash_map uses the special value pair(empty_key, data_type()) to denote empty buckets, and thus needs to be able to create data_type using a zero-argument constructor.

If your data_type does not have a zero-argument default constructor, there are several workarounds: store a data_type* (or smart pointer) in the map instead of data_type itself; wrap data_type in a small struct that provides a default constructor and holds the real value; or add a default constructor to data_type that leaves the object in a benign, assignable state.

Input/Output

It is possible to save and restore dense_hash_map objects to an arbitrary stream (such as a disk file) using the serialize() and unserialize() methods.

Each of these methods takes two arguments: a serializer, which says how to write hashtable items to disk, and a stream, which can be a C++ stream (istream or its subclasses for input, ostream or its subclasses for output), a FILE*, or a user-defined type (as described below).

The serializer is a functor that takes a stream and a single hashtable element (a value_type, which is a pair of the key and data) and either copies the hashtable element to the stream (for serialize()) or fills the hashtable element's contents from the stream (for unserialize()), returning true on success or false on error. The copy-in and copy-out functions can be provided in a single functor. Here is a sample serializer that reads/writes a hashtable element for an int-to-string hash_map to a FILE*:

struct StringToIntSerializer {
  bool operator()(FILE* fp, const std::pair<const int, std::string>& value) const {
    // Write the key.  We ignore endianness for this example.
    if (fwrite(&value.first, sizeof(value.first), 1, fp) != 1)
      return false;
    // Write the value.
    assert(value.second.length() <= 255);   // we only support writing small strings
    const unsigned char size = value.second.length();
    if (fwrite(&size, 1, 1, fp) != 1)
      return false;
    if (size > 0 && fwrite(value.second.data(), size, 1, fp) != 1)
      return false;
    return true;
  }
  bool operator()(FILE* fp, std::pair<const int, std::string>* value) const {
    // Read the key.  Note the need for const_cast to get around
    // the fact hash_map keys are always const.
    if (fread(const_cast<int*>(&value->first), sizeof(value->first), 1, fp) != 1)
      return false;
    // Read the value.
    unsigned char size;    // all strings are <= 255 chars long
    if (fread(&size, 1, 1, fp) != 1)
      return false;
    char* buf = new char[size];
    if (size > 0 && fread(buf, size, 1, fp) != 1) {
      delete[] buf;
      return false;
    }
    value->second.assign(buf, size);
    delete[] buf;
    return true;
  }
};

Here is the functor being used in code (error checking omitted):

   dense_hash_map<int, string> mymap = CreateMap();   // CreateMap() also calls set_empty_key()
   FILE* fp = fopen("hashtable.data", "w");
   mymap.serialize(StringToIntSerializer(), fp);
   fclose(fp);

   dense_hash_map<int, string> mymap2;
   mymap2.set_empty_key(0);    // required before any other operation; must never be a stored key
   FILE* fp_in = fopen("hashtable.data", "r");
   mymap2.unserialize(StringToIntSerializer(), fp_in);
   fclose(fp_in);
   assert(mymap == mymap2);

Note that this example serializer can only serialize to a FILE*. If you also want to use this serializer with C++ streams, you will need to write two more overloads of operator(): one that reads from an istream, and one that writes to an ostream. Likewise if you want to support serializing to a custom class.

If both the key and data are "simple" enough, you can use the pre-supplied functor NopointerSerializer. This copies the hashtable data using the equivalent of a memcpy(). Native C data types can be serialized this way, as can structs of native C data types. Pointers and STL objects cannot.

Note that NopointerSerializer() does not do any endian conversion. Thus, it is only appropriate when you intend to read the data on the same endian architecture as you write the data.

If you wish to serialize to your own stream type, you can do so by creating an object which supports two methods:

   bool Write(const void* data, size_t length);
   bool Read(void* data, size_t length);

Write() writes length bytes of data to a stream (presumably a stream owned by the object), while Read() reads data bytes from the stream into data. Both return true on success or false on error.

To unserialize a hashtable from a stream, you will typically create a new dense_hash_map object (calling set_empty_key() on it first, as always), then call unserialize() on it. unserialize() destroys the old contents of the object. You must pass in the appropriate ValueSerializer for the data being read in.

Both serialize() and unserialize() return true on success, or false if there was an error streaming the data.

Note that serialize() is not a const method, since it purges deleted elements before serializing. It is not safe to serialize from two threads at once, without synchronization.

NOTE: older versions of dense_hash_map provided a different API, consisting of read_metadata(), read_nopointer_data(), write_metadata(), write_nopointer_data(). These methods were never implemented and always did nothing but return false. You should exclusively use the new API for serialization.

Validity of Iterators

erase() is guaranteed not to invalidate any iterators -- except for any iterators pointing to the item being erased, of course. insert() invalidates all iterators, as does resize().

This is implemented by making erase() not resize the hashtable. If you desire maximum space efficiency, you can call resize(0) after a string of erase() calls, to force the hashtable to resize to the smallest possible size.

In addition to invalidating iterators, insert() and resize() invalidate all pointers into the hashtable. If you want to store a pointer to an object held in a dense_hash_map, either do so after finishing hashtable inserts, or store the object on the heap and a pointer to it in the dense_hash_map.

See also

The following are SGI STL, and some Google STL, concepts and classes related to dense_hash_map.

hash_map, Associative Container, Hashed Associative Container, Pair Associative Container, Unique Hashed Associative Container, set, map, multiset, multimap, hash_set, hash_multiset, hash_multimap, sparse_hash_map, sparse_hash_set, dense_hash_set