Hash2: An Extensible Hashing Framework

Overview

This library provides an extensible framework for hashing algorithms that supports user-defined types. Its structure is largely based on the paper "Types don’t know #" by Howard Hinnant, Vinnie Falco and John Bytheway.

The key feature of the design is the clean separation between the hash algorithm, which takes an untyped stream of bytes (a message) and produces a hash value (a message digest), and the hash_append function, which takes a value of a given type and is responsible for turning it into a sequence of bytes and feeding them to a hash algorithm.

This allows hashing support for user-defined types to be written once and then used automatically with any hash algorithm, even one that wasn’t yet available when the type was defined.

The following popular hashing algorithms are provided:

FNV-1a
xxHash
SipHash
MD5
SHA-1
SHA-2
RIPEMD-160, RIPEMD-128
HMAC

but it’s also possible for users to write their own; as long as the hash algorithm conforms to the concept, hash_append will work with it, and so will all user-defined types that support hash_append.

Hashing Byte Sequences

This library addresses two major use cases: hashing an untyped sequence of bytes, and hashing C++ objects.

Untyped byte sequences (also called messages) are hashed by passing them to a hash algorithm, which then produces a hash value (or a message digest).

The same hash algorithm, when passed the same message, will always produce the same digest. (Published algorithms provide message and corresponding digest pairs, called test vectors, to enable verification of independent implementations.)

(To hash a C++ object, it’s first converted (serialized) to a sequence of bytes, then passed to a hash algorithm.)

Hash Algorithm Requirements

A hash algorithm must have the following structure, and meet the following minimum requirements:

struct HashAlgorithm
{
    using result_type = /*integral or array-like*/;

    static constexpr int block_size = /*...*/; // optional

    HashAlgorithm();
    explicit HashAlgorithm( std::uint64_t seed );
    HashAlgorithm( unsigned char const* seed, std::size_t n );

    HashAlgorithm( HashAlgorithm const& r );
    HashAlgorithm& operator=( HashAlgorithm const& r );

    void update( void const* data, std::size_t n );

    result_type result();
};

result_type

The nested type result_type is the type of the produced hash value. It can be an unsigned integer type (that is not bool), typically std::uint32_t or std::uint64_t, or a std::array-like type with a value type of unsigned char.

Normally, non-cryptographic hash functions have an integer result_type, and cryptographic hash functions have an array-like result_type, but that’s not required.

The provided utility function get_integral_result can be used to obtain an integer hash value from any valid result_type.

block_size

Cryptographic hash functions provide a block_size value, which is their block size (e.g. 64 for MD5, 128 for SHA2-512) and is required in order to implement the corresponding HMAC.

block_size is an optional requirement.

Default Constructor

All hash algorithms must be default constructible. The default constructor initializes the internal state of the hash algorithm to its initial values, as published in its specification.

For example, the default constructor of md5_128 corresponds to calling the function MD5_Init of the reference implementation.

Constructor Taking an Integer Seed

All hash algorithms must be constructible from a value of type std::uint64_t, which serves as a seed.

Using a seed value of 0 is equivalent to default construction.

Distinct seed values cause the internal state to be initialized differently, and therefore, instances of the hash algorithm initialized by different seeds produce different hash values when passed the same message.

Seeding using random (unobservable from the outside) values is useful for preventing hash flooding attacks.

Constructor Taking a Byte Sequence Seed

All hash algorithms can be constructed from a seed sequence of unsigned char values (this makes all hash algorithms keyed hash functions.)

A null sequence (one with length 0) produces a default-constructed instance.

Different seed sequences produce differently initialized instances.

While this requirement makes all hash algorithms usable as MACs (Message Authentication Codes), you should as a general rule prefer an established MAC algorithm, such as HMAC. (An HMAC implementation is provided.)

Copy Constructor, Copy Assignment

Hash algorithms are copy constructible and copy assignable, providing the usual guarantees for these operations. That is, a copy is equivalent to the original.

update

The function update is the mechanism by which the input message is provided to the hash algorithm.

Calling update several times is equivalent to calling it once with the concatenated byte sequences from the individual calls. That is, the input message may be provided in parts, and the way it’s split into parts does not matter and does not affect the final hash value.

Given

Hash hash; // some hash algorithm
unsigned char message[6] = { /*...*/ }; // some input message

the following update call

hash.update( message, 6 );

is equivalent to

hash.update( message, 4 );
hash.update( message + 4, 2 );

and to

for( int i = 0; i < 6; ++i ) hash.update( &message[i], 1 );
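
The equivalence can be checked with a minimal sketch of a conforming algorithm. The toy_fnv1a_32 type and the hash_in_chunks helper below are hypothetical and are not the library's fnv1a_32; the seed constructors are omitted and result simply returns the state, both simplifications made here for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy algorithm for illustration; not the library's fnv1a_32
struct toy_fnv1a_32
{
    using result_type = std::uint32_t;

    // FNV-1a 32 bit offset basis
    std::uint32_t state = 0x811c9dc5u;

    void update( void const* data, std::size_t n )
    {
        unsigned char const* p = static_cast<unsigned char const*>( data );

        for( std::size_t i = 0; i < n; ++i )
        {
            // state = (state ^ ch) * fnv_prime
            state = ( state ^ p[i] ) * 0x01000193u;
        }
    }

    // Simplification: no finalization step, the state is returned as is
    result_type result()
    {
        return state;
    }
};

// Hashes the n bytes at p, feeding them in pieces of at most `chunk` bytes
inline std::uint32_t hash_in_chunks( unsigned char const* p, std::size_t n, std::size_t chunk )
{
    assert( chunk > 0 );

    toy_fnv1a_32 hash;

    while( n > 0 )
    {
        std::size_t m = n < chunk? n: chunk;

        hash.update( p, m );

        p += m;
        n -= m;
    }

    return hash.result();
}
```

With a chunk size of 4, a six byte message is fed as 4 bytes followed by 2 bytes, matching the second form above; a chunk size of 1 matches the loop.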

result

After the entire input message has been provided via calls to update, the final hash value can be obtained by calling result.

The call to result finalizes the internal state, by padding the message as per the concrete algorithm specification, by optionally incorporating the length of the message into the state, and by performing finalization operations on the state, again as specified by the concrete algorithm.

A final hash value is then obtained by transforming the internal state, and returned.

Note that result is non-const, because it changes the internal state. It’s allowed for result to be called more than once; subsequent calls perform the state finalization again and as a result produce a pseudorandom sequence of result_type values. This can be used to effectively extend the output of the hash function. For example, a 256 bit result can be obtained from a hash algorithm whose result_type is 64 bit, by calling result four times.

As a toy example, not intended for production use, this is how one could write a random number generator on top of the FNV-1a implementation provided by the library:

std::uint64_t random()
{
    static boost::hash2::fnv1a_64 hash;
    return hash.result();
}

Compile Time Hashing

Under C++14, it’s possible to invoke some hash algorithms at compile time. These algorithms provide the following interface:

struct HashAlgorithm
{
    using result_type = /*integral or array-like*/;

    static constexpr int block_size = /*...*/; // optional

    constexpr HashAlgorithm();
    explicit constexpr HashAlgorithm( std::uint64_t seed );
    constexpr HashAlgorithm( unsigned char const* seed, std::size_t n );

    constexpr HashAlgorithm( HashAlgorithm const& r );
    constexpr HashAlgorithm& operator=( HashAlgorithm const& r );

    void update( void const* data, std::size_t n );
    constexpr void update( unsigned char const* data, std::size_t n );

    constexpr result_type result();
};

Apart from the added constexpr qualifiers, the only difference is that update has a second overload that takes unsigned char const* instead of void const*. (Pointers to void cannot be used in constexpr functions before C++26.)
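
What compile time evaluation over unsigned char data looks like can be illustrated without the library. The sketch below uses a hypothetical stand-alone function, fnv1a_64_bytes, which is not part of the library interface; it computes a 64 bit FNV-1a hash in a C++14 constexpr function so that the result can be checked with static_assert.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration; not a library function.
// C++14 permits loops and local mutation in constexpr functions.
constexpr std::uint64_t fnv1a_64_bytes( unsigned char const* p, std::size_t n )
{
    std::uint64_t state = 0xcbf29ce484222325ull; // FNV-1a 64 bit offset basis

    for( std::size_t i = 0; i < n; ++i )
    {
        state = ( state ^ p[i] ) * 0x100000001b3ull; // FNV 64 bit prime
    }

    return state;
}

constexpr unsigned char letter_a[] = { 'a' };

// Evaluated entirely by the compiler
constexpr std::uint64_t digest_a = fnv1a_64_bytes( letter_a, 1 );

static_assert( digest_a == 0xaf63dc4c8601ec8cull, "FNV-1a 64 value for the message \"a\"" );
```

The parameter is unsigned char const* rather than void const* for the reason given above: the bytes can then be read directly, without a cast that is not allowed in a constant expression.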

Provided Hash Algorithms

FNV-1a

The Fowler-Noll-Vo hash function is provided as a representative of the class of hash functions that process their input one byte at a time. The 32 or 64 bit state is updated for each input character ch by using the operation state = (state ^ ch) * fnv_prime.

FNV-1a is non-cryptographic, relatively weak compared to state of the art hash functions (although good for its class), but fast when the input strings are short.

xxHash

xxHash is a fast non-cryptographic hashing algorithm by Yann Collet.

Its speed (~5GB/s for xxhash_32, ~10GB/s for xxhash_64 on a Xeon E5-2683 v4 @ 2.10GHz) makes it well suited for quick generation of file or data integrity checksums.

SipHash

SipHash by Jean-Philippe Aumasson and Daniel J. Bernstein (paper) has been designed to thwart hash flooding attacks against hash tables that receive external untrusted input (e.g. HTTP message headers, or JSON objects.)

It’s not a cryptographic hash function (even though its design is similar to one), because it does not provide collision resistance when the initial seed is known.

It is, however, a cryptographically strong keyed hash function (or a pseudorandom function, PRF). If the initial seed is unknown to the attacker, it’s computationally difficult to engineer a collision, or to recover the seed by observing the output.

SipHash has been adopted as the de-facto standard hash function for hash tables that can be exposed to external input, and is used in Python, Perl, Ruby, Rust, and other languages.

SipHash is the recommended hash function for hash tables exposed to external input. As a best practice, it should be seeded with a random value that varies per connection, and not a fixed one per process.

MD5

Designed in 1991 by Ron Rivest, MD5 used to be the best known and the most widely used cryptographic hash function, but has been broken and is no longer considered cryptographic for any purposes. It produces a 128 bit digest.

MD5 should no longer be used in new code when cryptographic strength is required, except when implementing an existing specification or protocol that calls for its use.

Prefer SHA2-512/256 (or SHA2-256 in 32 bit code) instead.

If you require a digest of exactly 128 bits, use RIPEMD-128 instead. Do note that 128 bit digests are no longer considered cryptographic, because attacks with a complexity of 2⁶⁴ are within the capabilities of well-funded attackers.

SHA-1

SHA-1 is a cryptographic hash function that was designed by NSA and published in 1995 by NIST as a Federal Information Processing Standard (FIPS). It produces a 160 bit digest.

SHA-1 is now considered insecure against a well-funded attacker, and should no longer be used in new code. Prefer SHA2-512/256, SHA2-256 in 32 bit code, or, if you require a digest of exactly 160 bits, RIPEMD-160 instead.

SHA-2

SHA-2 is a family of cryptographic hash functions, also designed by NSA, initially published by NIST in 2002, and updated in 2015. It includes SHA2-224, SHA2-256, SHA2-384, SHA2-512, SHA2-512/224, and SHA2-512/256, each producing a digest with the corresponding bit length.

Of these, SHA2-256 and SHA2-512 are the base functions, and the rest are variants with a truncated digest.

The SHA-2 functions haven’t been broken and are in wide use, despite the existence of a newer standard (SHA-3).

SHA2-256 and its truncated variant SHA2-224 use 32 bit operations and therefore do not lose performance on a 32 bit platform.

SHA2-512 and its truncated variants SHA2-384, SHA2-512/224, and SHA2-512/256 use 64 bit operations and are approximately 1.5 times as fast as SHA2-256 on a 64 bit platform, but twice as slow in 32 bit code.

On 64 bit platforms, SHA2-512/256 and SHA2-512/224 should be preferred over SHA2-256 and SHA2-224 not just because of speed, but because they are resistant to length extension attacks as they don’t expose all of the bits of their internal state in the final digest.

RIPEMD-160, RIPEMD-128

Designed in 1996, RIPEMD-160 is a cryptographic hash function that was less well known than MD5 and SHA-1, but that has recently become popular because of its use in Bitcoin and other cryptocurrencies.

Even though it hasn’t been broken, there’s no reason to prefer its use in new code over SHA-2.

RIPEMD-128 is a truncated variant of RIPEMD-160. (Do note that 128 bit digests are no longer considered cryptographic, because attacks with a complexity of 2⁶⁴ are within the capabilities of well-funded attackers.)

HMAC

HMAC (Hash-based Message Authentication Code) is an algorithm for deriving a message authentication code by using a cryptographic hash function. It’s described in RFC 2104.

A message authentication code differs from a digest by the fact that it depends on both the contents of the message and on a secret key; in contrast, a message digest only depends on the contents of the message.

Even though all hash algorithms provided by the library can be used to produce message authentication codes, by means of seeding the hash algorithm initially with a secret key by calling the constructor taking a byte sequence, hash algorithms have usually not been designed to be used in this manner, and such use hasn’t been cryptographically analyzed and vetted. (SipHash is an exception; it has specifically been designed as a MAC.)

The HMAC algorithm is provided in the form of a class template hmac<H> that adapts a cryptographic hash algorithm H. hmac<H> satisfies the requirements of a cryptographic hash algorithm.

Convenience aliases of common HMAC instantiations are provided. For example, the md5.hpp header defining md5_128 also defines hmac_md5_128 as an alias to hmac<md5_128>.
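
The reason HMAC needs the block size of the underlying algorithm is visible in its definition in RFC 2104: the key is padded with zeros to one block, XORed with the byte 0x36 to form the inner key and with 0x5C to form the outer key, and the result is H(outer key, H(inner key, message)). The following sketch shows only the key preparation step; hmac_padded_key is a hypothetical helper, not the library's hmac<H>, and it assumes the key is no longer than one block (RFC 2104 hashes longer keys first).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the library.
// Pads the key with zeros to block_size bytes and XORs every byte with
// `pad` (0x36 for the inner key, 0x5C for the outer key, per RFC 2104).
inline std::vector<unsigned char> hmac_padded_key(
    std::vector<unsigned char> const& key, std::size_t block_size, unsigned char pad )
{
    // Assumption: longer keys would have to be hashed first
    assert( key.size() <= block_size );

    // Zero padding XORed with pad is just pad
    std::vector<unsigned char> r( block_size, pad );

    for( std::size_t i = 0; i < key.size(); ++i )
    {
        r[i] = static_cast<unsigned char>( key[i] ^ pad );
    }

    return r;
}
```

For MD5, for example, block_size would be 64, so both prepared keys are exactly 64 bytes long regardless of the length of the original key.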

Choosing a Hash Algorithm

If your use case requires cryptographic strength, use SHA2-512/256 (or SHA2-256 in 32 bit code) for digests, and the corresponding HMAC for message authentication codes.

Note	Digests of fewer than 256 bits in length are no longer recommended when cryptographic security is required or desired.

For computing file or content checksums, when speed is of the essence and externally induced collisions aren’t a concern, use xxHash-64.

Note	xxHash-32 will be faster in 32 bit code, but since it only produces a 32 bit result, collisions will become an issue when the number of items reaches tens of thousands, which is usually unacceptable.

For a large number of items (many millions), 64 bits may not be enough; in that case, use either MD5, or xxHash-64, extended to 128 bits.

Note	Even though MD5 is no longer cryptographically secure, it can still be used when cryptographic strength is not a requirement.
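
The item counts mentioned for 32 and 64 bit results can be estimated with the standard birthday approximation. The sketch below assumes that hash values are uniformly distributed; collision_probability is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of n items collide when
// hashed to uniformly distributed values of the given width in bits:
// p = 1 - exp( -n * (n - 1) / 2^(bits + 1) )
inline double collision_probability( double n, int bits )
{
    double x = n * ( n - 1 ) / std::ldexp( 2.0, bits ); // 2.0 * 2^bits

    // -expm1(-x) is 1 - exp(-x), computed accurately for small x
    return -std::expm1( -x );
}
```

Under this approximation, 65536 items with a 32 bit result collide with a probability of roughly 39%, whereas a million items with a 64 bit result collide with a probability below one in a million.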

For hash tables, use SipHash by default, with a random (unpredictable from the outside) seed that varies per connection or per container. Avoid using a fixed processwide seed. Never use SipHash without a seed.

For hash tables with very short keys (three to five bytes), unexposed to external input, you can use FNV-1a, although the default hash function of e.g. boost::unordered_flat_map will typically perform better.

Hashing C++ Objects

The traditional approach to hashing C++ objects is to make them responsible for providing a hash value. The standard, for instance, follows this by making it the responsibility of each type T to implement a specialization of std::hash<T>, which when invoked with a value returns its size_t hash.

This, of course, means that the specific hash algorithm varies per type and is, in the general case, completely opaque.

This library takes a different approach; the hash algorithm is known and chosen by the user. A C++ object is hashed by first being converted to a sequence of bytes representing its value (a message) which is then passed to the hash algorithm.

The conversion must obey the following requirements:

Equal objects must produce the same message;
Different objects should produce different messages;
An object should always produce a non-empty message.

The first two requirements follow directly from the hash value requirements, whereas the third one is a bit more subtle and is intended to prevent things like the distinct sequences [[1], [], []] and [[], [1], []] producing the same message. (This is similar to the requirement that all C++ objects have sizeof that is not zero, including empty ones.)
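
The effect of the third requirement can be seen in a small stand-alone sketch. The two conversion functions below are hypothetical and much simpler than hash_append (they use a 32 bit size for brevity); the first emits only the elements, the second also emits the size of every sequence.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using bytes = std::vector<unsigned char>;
using nested = std::vector< std::vector<std::uint32_t> >;

// Appends x as four bytes, least significant first
inline void append_u32( bytes& out, std::uint32_t x )
{
    for( int i = 0; i < 4; ++i )
    {
        out.push_back( static_cast<unsigned char>( ( x >> ( 8 * i ) ) & 0xFF ) );
    }
}

// Naive conversion: only the elements are emitted, so an empty
// inner sequence contributes nothing to the message
inline bytes naive_message( nested const& v )
{
    bytes out;

    for( auto const& inner: v )
    {
        for( std::uint32_t x: inner ) append_u32( out, x );
    }

    return out;
}

// Each sequence emits its elements followed by its size, so every
// object, including an empty one, contributes a non-empty message
inline bytes sized_message( nested const& v )
{
    bytes out;

    for( auto const& inner: v )
    {
        for( std::uint32_t x: inner ) append_u32( out, x );
        append_u32( out, static_cast<std::uint32_t>( inner.size() ) );
    }

    append_u32( out, static_cast<std::uint32_t>( v.size() ) );
    return out;
}
```

For [[1], [], []] and [[], [1], []], naive_message produces the same four bytes, while sized_message produces two different 20 byte messages.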

In this library, the conversion is performed by the function hash_append. It’s declared as follows:

template<class Hash, class Flavor = default_flavor, class T>
constexpr void hash_append( Hash& h, Flavor const& f, T const& v );

and the effect of invoking hash_append(h, f, v) is to call h.update(p, n) one or more times (but never zero times.) The combined result of these calls forms the message corresponding to v.

hash_append handles natively the following types T:

Integral types (signed and unsigned integers, character types, bool);
Floating point types (float and double);
Enumeration types;
Pointer types (object and function, but not pointer to member types);
C arrays;
Containers and ranges (types that provide begin() and end());
Unordered containers and ranges;
Constant size containers (std::array, boost::array);
Tuple-like types (std::pair, std::tuple);
Described classes (using Boost.Describe).

User-defined types that aren’t in the above categories can provide support for hash_append by declaring an overload of the tag_invoke function with the appropriate parameters.

The second argument to hash_append, the flavor, is used to control the serialization process in cases where more than one behavior is possible and desirable. It currently contains the following members:

static constexpr endian byte_order; // native, little, or big
using size_type = std::uint64_t; // or std::uint32_t

The byte_order member of the flavor affects how scalar C++ objects are serialized into bytes. For example, the uint32_t integer 0x01020304 can be serialized into { 0x01, 0x02, 0x03, 0x04 } when byte_order is endian::big, and into { 0x04, 0x03, 0x02, 0x01 } when byte_order is endian::little.

The value endian::native means to use the byte order of the current platform. This typically results in higher performance, because it allows hash_append to pass the underlying object bytes directly to the hash algorithm, without any processing.

The size_type member type of the flavor affects how container and range sizes (typically of type size_t) are serialized. Since the size of size_t in bytes can vary, serializing the type directly results in different hash values when the code is compiled for 64 bit or for 32 bit. Using a fixed width type avoids this.
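
Both flavor members can be illustrated with a few stand-alone helpers. The functions below are hypothetical and are not library functions; they show the byte sequences that the two explicit byte orders produce for a std::uint32_t, and how widening a size to a fixed 64 bit type yields the same number of bytes on every platform.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not library functions

// Most significant byte first (the endian::big order)
inline std::array<unsigned char, 4> to_big_endian( std::uint32_t x )
{
    return {{
        static_cast<unsigned char>( x >> 24 ),
        static_cast<unsigned char>( x >> 16 ),
        static_cast<unsigned char>( x >> 8 ),
        static_cast<unsigned char>( x )
    }};
}

// Least significant byte first (the endian::little order)
inline std::array<unsigned char, 4> to_little_endian( std::uint32_t x )
{
    return {{
        static_cast<unsigned char>( x ),
        static_cast<unsigned char>( x >> 8 ),
        static_cast<unsigned char>( x >> 16 ),
        static_cast<unsigned char>( x >> 24 )
    }};
}

// A size widened to std::uint64_t always occupies eight bytes,
// whether std::size_t is 32 or 64 bits wide
inline std::array<unsigned char, 8> size_to_little_endian( std::size_t n )
{
    std::uint64_t v = n;
    std::array<unsigned char, 8> r = {};

    for( int i = 0; i < 8; ++i )
    {
        r[i] = static_cast<unsigned char>( v >> ( 8 * i ) );
    }

    return r;
}
```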

There are three predefined flavors, defined in boost/hash2/flavor.hpp:

struct default_flavor
{
    using size_type = std::uint64_t;
    static constexpr auto byte_order = endian::native;
};

struct little_endian_flavor
{
    using size_type = std::uint64_t;
    static constexpr auto byte_order = endian::little;
};

struct big_endian_flavor
{
    using size_type = std::uint64_t;
    static constexpr auto byte_order = endian::big;
};

The default one is used when hash_append is invoked with an empty flavor argument, as in hash_append(h, {}, v). It results in higher performance, but the hash values depend on the endianness of the platform.

Contiguously Hashable Types

The first thing hash_append(h, f, v) does is to check whether the type is contiguously hashable under the requested byte order, by testing is_contiguously_hashable<T, Flavor::byte_order>::value. When that’s true, it invokes h.update(&v, sizeof(v)).

Integral Types

When T is an integral type (bool, a signed or unsigned integer type like int or unsigned long, or a character type like char8_t or char32_t), v is converted into its byte representation (an array of unsigned char and a size of sizeof(T)) under the requested byte order.

hash_append then calls h.update(p, n), where p is the address of this representation, and n is its size.

For example, the value 0x01020304 of type std::uint32_t, when Flavor::byte_order is endian::little, is converted into the array { 0x04, 0x03, 0x02, 0x01 }.

int main()
{
    boost::hash2::fnv1a_32 h1;
    std::uint32_t v1 = 0x01020304;
    boost::hash2::hash_append( h1, boost::hash2::little_endian_flavor(), v1 );

    boost::hash2::fnv1a_32 h2;
    unsigned char v2[] = { 0x04, 0x03, 0x02, 0x01 };
    h2.update( v2, sizeof(v2) );

    assert( h1.result() == h2.result() );
}

Floating Point Types

When T is a floating point type (only float and double are supported at the moment), v is converted into an unsigned integer type of the same size using the equivalent of std::bit_cast, then hash_append is invoked with that converted value.

int main()
{
    boost::hash2::fnv1a_32 h1;
    float v1 = 3.14f;
    boost::hash2::hash_append( h1, {}, v1 );

    boost::hash2::fnv1a_32 h2;
    std::uint32_t v2 = 0x4048f5c3;
    boost::hash2::hash_append( h2, {}, v2 );

    assert( h1.result() == h2.result() );
}

However, there’s a subtlety here. The requirements for a hash function H say that if x == y, then H(x) == H(y). But +0.0 == -0.0, even though the bit representations of these two values differ.

So, in order to meet the requirement, if v is negative zero, it’s first replaced with a positive zero of the same type, before bit_cast to an integer.

int main()
{
    boost::hash2::fnv1a_32 h1;
    boost::hash2::hash_append( h1, {}, +0.0 );

    boost::hash2::fnv1a_32 h2;
    boost::hash2::hash_append( h2, {}, -0.0 );

    assert( h1.result() == h2.result() );
}
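
The underlying issue can be reproduced without the library. The sketch below assumes that float is a 32 bit IEEE 754 type; float_bits and normalized_float_bits are hypothetical helpers that mirror the description above and are not the library's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers; float is assumed to be a 32 bit IEEE 754 type

// Equivalent of std::bit_cast<std::uint32_t>( v ) for pre-C++20 code
inline std::uint32_t float_bits( float v )
{
    static_assert( sizeof( float ) == sizeof( std::uint32_t ), "32 bit float expected" );

    std::uint32_t r;
    std::memcpy( &r, &v, sizeof( r ) );
    return r;
}

// Replaces negative zero with positive zero before taking the bits,
// so that values that compare equal yield identical bytes
inline std::uint32_t normalized_float_bits( float v )
{
    if( v == 0 ) v = 0; // the condition holds for both +0.0f and -0.0f

    return float_bits( v );
}
```

Here 0.0f == -0.0f holds, float_bits returns different values for the two, and normalized_float_bits returns the same value for both.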

Enumeration Types

When T is an enumeration type, v is converted to the underlying type of T, then the converted value is passed to hash_append.

enum E: int
{
    v1 = 123
};

int main()
{
    boost::hash2::fnv1a_32 h1;
    boost::hash2::hash_append( h1, {}, v1 );

    boost::hash2::fnv1a_32 h2;
    int v2 = 123;
    boost::hash2::hash_append( h2, {}, v2 );

    assert( h1.result() == h2.result() );
}

Pointers

When T is a pointer type, it’s converted to std::uintptr_t using reinterpret_cast, and the converted value is passed to hash_append.

int x1 = 0;

int main()
{
    boost::hash2::fnv1a_32 h1;
    boost::hash2::hash_append( h1, {}, &x1 );

    boost::hash2::fnv1a_32 h2;
    boost::hash2::hash_append( h2, {}, reinterpret_cast<std::uintptr_t>(&x1) );

    assert( h1.result() == h2.result() );
}

Arrays

When T is an array type U[N], the elements of v are passed to hash_append in sequence.

This is accomplished by calling hash_append_range(h, f, v + 0, v + N).

int main()
{
    boost::hash2::fnv1a_32 h1;
    int v1[4] = { 1, 2, 3, 4 };
    boost::hash2::hash_append( h1, {}, v1 );

    boost::hash2::fnv1a_32 h2;
    boost::hash2::hash_append_range( h2, {}, v1 + 0, v1 + 4 );

    assert( h1.result() == h2.result() );

    boost::hash2::fnv1a_32 h3;
    boost::hash2::hash_append( h3, {}, v1[0] );
    boost::hash2::hash_append( h3, {}, v1[1] );
    boost::hash2::hash_append( h3, {}, v1[2] );
    boost::hash2::hash_append( h3, {}, v1[3] );

    assert( h1.result() == h3.result() );
}

Ranges

When T is a range (boost::container_hash::is_range<T>::value is true), its elements are passed to hash_append as follows:

When T is an unordered range (boost::container_hash::is_unordered_range<T>::value is true), hash_append invokes hash_append_unordered_range(h, f, v.begin(), v.end()). hash_append_unordered_range derives a hash value from the range elements in such a way that their order doesn’t affect the hash value.
When T is a contiguous range (boost::container_hash::is_contiguous_range<T>::value is true), hash_append first invokes hash_append_range(h, f, v.data(), v.data() + v.size()), then, if has_constant_size<T>::value is false, it invokes hash_append_size(h, f, v.size()).
Otherwise, hash_append first invokes hash_append_range(h, f, v.begin(), v.end()), then, if has_constant_size<T>::value is false, it invokes hash_append_size(h, f, m), where m is std::distance(v.begin(), v.end()).

As a special case, in order to meet the requirement that a call to hash_append must always result in at least one call to Hash::update, for ranges of constant size 0, hash_append(h, f, '\x00') is called.

int main()
{
    boost::hash2::fnv1a_32 h1;
    std::vector<int> v1 = { 1, 2, 3, 4 };
    boost::hash2::hash_append( h1, {}, v1 );

    boost::hash2::fnv1a_32 h2;
    std::list<int> v2 = { 1, 2, 3, 4 };
    boost::hash2::hash_append( h2, {}, v2 );

    assert( h1.result() == h2.result() );

    boost::hash2::fnv1a_32 h3;
    boost::hash2::hash_append_range( h3, {}, v1.data(), v1.data() + v1.size() );
    boost::hash2::hash_append_size( h3, {}, v1.size() );

    assert( h1.result() == h3.result() );

    boost::hash2::fnv1a_32 h4;
    boost::hash2::hash_append_range( h4, {}, v2.begin(), v2.end() );
    boost::hash2::hash_append_size( h4, {}, std::distance(v2.begin(), v2.end()) );

    assert( h2.result() == h4.result() );
}
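
The text above doesn't say how hash_append_unordered_range makes the result independent of element order. One common technique, sketched here as an assumption and not as the library's actual method, is to hash every element separately and combine the results with a commutative operation such as addition. Both functions below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy FNV-1a 32 over the four bytes of x, least significant first
inline std::uint32_t toy_hash( std::uint32_t x )
{
    std::uint32_t state = 0x811c9dc5u;

    for( int i = 0; i < 4; ++i )
    {
        state = ( state ^ ( ( x >> ( 8 * i ) ) & 0xFF ) ) * 0x01000193u;
    }

    return state;
}

// Hashes every element independently and adds the results; since
// addition is commutative, the order of the elements doesn't matter
inline std::uint64_t order_independent_hash( std::vector<std::uint32_t> const& v )
{
    std::uint64_t sum = 0;

    for( std::uint32_t x: v )
    {
        sum += toy_hash( x );
    }

    return sum;
}
```

With this scheme, { 1, 2, 3 } and { 3, 1, 2 } produce the same value, which is what an unordered container needs, because its iteration order is unspecified.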

Tuples

When T is a tuple (boost::container_hash::is_tuple_like<T>::value is true), its elements as obtained by get<I>(v) for I in [0, std::tuple_size<T>::value) are passed to hash_append, in sequence.

As a special case, in order to meet the requirement that a call to hash_append must always result in at least one call to Hash::update, for tuples of size 0, hash_append(h, f, '\x00') is called.

int main()
{
    boost::hash2::fnv1a_32 h1;
    std::tuple<int, int, int> v1 = { 1, 2, 3 };
    boost::hash2::hash_append( h1, {}, v1 );

    boost::hash2::fnv1a_32 h2;
    boost::hash2::hash_append( h2, {}, get<0>(v1) );
    boost::hash2::hash_append( h2, {}, get<1>(v1) );
    boost::hash2::hash_append( h2, {}, get<2>(v1) );

    assert( h1.result() == h2.result() );
}

Described Classes

When T is a described class (boost::container_hash::is_described_class<T>::value is true), Boost.Describe primitives are used to enumerate its bases and members, and then, for each base class subobject b of v, hash_append(h, f, b) is called, then for each member subobject m of v, hash_append(h, f, m) is called.

struct X
{
    int a;
};

BOOST_DESCRIBE_STRUCT(X, (), (a))

struct Y: public X
{
    int b;
};

BOOST_DESCRIBE_STRUCT(Y, (X), (b))

int main()
{
    boost::hash2::fnv1a_32 h1;
    Y v1 = { { 1 }, 2 };
    boost::hash2::hash_append( h1, {}, v1 );

    boost::hash2::fnv1a_32 h2;
    boost::hash2::hash_append( h2, {}, v1.a );
    boost::hash2::hash_append( h2, {}, v1.b );

    assert( h1.result() == h2.result() );
}

As a special case, in order to meet the requirement that a call to hash_append must always result in at least one call to Hash::update, for classes without any bases or members, hash_append(h, f, '\x00') is called.

User Defined Types

When T is a user defined type that does not fall into one of the above categories, it needs to provide its own hashing support, by defining an appropriate tag_invoke overload.

This tag_invoke overload needs to have the following form:

template<class Hash, class Flavor>
void tag_invoke( boost::hash2::hash_append_tag const&, Hash& h, Flavor const& f, X const& v );

where X is the user-defined type.

It can be defined as a separate free function in the namespace of X, but the recommended approach is to define it as an inline friend in the definition of X:

#include <boost/hash2/hash_append_fwd.hpp>
#include <string>

class X
{
private:

    std::string a;
    int b;

    // not part of the salient state
    void const* c;

public:

    friend bool operator==( X const& x1, X const& x2 )
    {
        return x1.a == x2.a && x1.b == x2.b;
    }

    template<class Hash, class Flavor>
    friend void tag_invoke( boost::hash2::hash_append_tag const&,
        Hash& h, Flavor const& f, X const& v )
    {
        boost::hash2::hash_append(h, f, v.a);
        boost::hash2::hash_append(h, f, v.b);
    }
};

This overload needs to meet the three requirements for a hash function. In practice, this means that the definitions of equality (operator==) and hashing (tag_invoke) must agree on what members need to be included.

In the example above, the member c is not part of the salient state of the object, so it’s neither compared in operator==, nor included in the object message in tag_invoke.

The particular implementation of tag_invoke is type-specific. In general, it needs to include all salient parts of the object’s value in the resultant message, but the exact way to do so is type-dependent.

As another example, here’s how one might implement tag_invoke for an "inline string" type (a string that stores its characters, up to some maximum count, in the type itself):

#include <boost/hash2/hash_append_fwd.hpp>
#include <algorithm>
#include <cstddef>
#include <cstdint>

class Str
{
private:

    static constexpr std::size_t N = 32;

    std::uint8_t size_ = 0;
    char data_[ N ] = {};

public:

    friend constexpr bool operator==( Str const& x1, Str const& x2 )
    {
        return x1.size_ == x2.size_ && std::equal( x1.data_, x1.data_ + x1.size_, x2.data_ );
    }

    template<class Hash, class Flavor>
    friend constexpr void tag_invoke( boost::hash2::hash_append_tag const&,
        Hash& h, Flavor const& f, Str const& v )
    {
        boost::hash2::hash_append_range( h, f, v.data_, v.data_ + v.size_ );
        boost::hash2::hash_append_size( h, f, v.size_ );
    }
};

Note	This example is illustrative; in practice, the above type will likely provide `begin()`, `end()`, `data()`, and `size()` member functions, which will make it a contiguous range and the built-in support will do the right thing.

Usage Examples

md5sum

A command line utility that prints the MD5 digests of a list of files passed as arguments.

#include <boost/hash2/md5.hpp>
#include <array>
#include <string>
#include <cerrno>
#include <cstdio>
#include <cstring>

static void md5sum( std::FILE* f, char const* fn )
{
    boost::hash2::md5_128 hash;

    int const N = 4096;
    unsigned char buffer[ N ];

    for( ;; )
    {
        std::size_t n = std::fread( buffer, 1, N, f );

        if( std::ferror( f ) )
        {
            std::fprintf( stderr, "'%s': read error: %s\n", fn, std::strerror( errno ) );
            return;
        }

        if( n == 0 ) break;

        hash.update( buffer, n );
    }

    std::string digest = to_string( hash.result() );

    std::printf( "%s *%s\n", digest.c_str(), fn );
}

int main( int argc, char const* argv[] )
{
    for( int i = 1; i < argc; ++i )
    {
        std::FILE* f = std::fopen( argv[i], "rb" );

        if( f == 0 )
        {
            std::fprintf( stderr, "'%s': open error: %s\n", argv[i], std::strerror( errno ) );
            continue;
        }

        md5sum( f, argv[i] );

        std::fclose( f );
    }
}

Sample command:

md5sum apache_builds.json canada.json citm_catalog.json twitter.json

Sample output:

7dc25b5fd9eb2217ed648dad23b311da *apache_builds.json
8767d618bff99552b4946078d3a90c0c *canada.json
b4391581160654374bee934a3b91255e *citm_catalog.json
bf7d37451840af4e8873b65763315cbf *twitter.json

hash2sum

A command line utility that prints the digests of a list of files, using a specified hash algorithm.

The hash algorithm is passed as the first command line argument.

This example requires C++14.

#include <boost/hash2/md5.hpp>
#include <boost/hash2/sha1.hpp>
#include <boost/hash2/sha2.hpp>
#include <boost/hash2/ripemd.hpp>
#include <boost/mp11.hpp>
#include <array>
#include <string>
#include <cerrno>
#include <cstdio>
#include <cstring>

template<class Hash> void hash2sum( std::FILE* f, char const* fn )
{
    Hash hash;

    int const N = 4096;
    unsigned char buffer[ N ];

    for( ;; )
    {
        std::size_t n = std::fread( buffer, 1, N, f );

        if( std::ferror( f ) )
        {
            std::fprintf( stderr, "'%s': read error: %s\n", fn, std::strerror( errno ) );
            return;
        }

        if( n == 0 ) break;

        hash.update( buffer, n );
    }

    std::string digest = to_string( hash.result() );

    std::printf( "%s *%s\n", digest.c_str(), fn );
}

template<class Hash> void hash2sum( char const* fn )
{
    std::FILE* f = std::fopen( fn, "rb" );

    if( f == 0 )
    {
        std::fprintf( stderr, "'%s': open error: %s\n", fn, std::strerror( errno ) );
    }
    else
    {
        hash2sum<Hash>( f, fn );
        std::fclose( f );
    }
}

using namespace boost::mp11;
using namespace boost::hash2;

using hashes = mp_list<

    md5_128,
    sha1_160,
    sha2_256,
    sha2_224,
    sha2_512,
    sha2_384,
    sha2_512_256,
    sha2_512_224,
    ripemd_160,
    ripemd_128

>;

constexpr char const* names[] = {

    "md5_128",
    "sha1_160",
    "sha2_256",
    "sha2_224",
    "sha2_512",
    "sha2_384",
    "sha2_512_256",
    "sha2_512_224",
    "ripemd_160",
    "ripemd_128"

};

int main( int argc, char const* argv[] )
{
    if( argc < 2 )
    {
        std::fputs( "usage: hash2sum <hash> <files...>\n", stderr );
        return 2;
    }

    std::string hash( argv[1] );
    bool found = false;

    mp_for_each< mp_iota<mp_size<hashes>> >([&](auto I){

        if( hash == names[I] )
        {
            using Hash = mp_at_c<hashes, I>;

            for( int i = 2; i < argc; ++i )
            {
                hash2sum<Hash>( argv[i] );
            }

            found = true;
        }

    });

    if( !found )
    {
        std::fprintf( stderr, "hash2sum: unknown hash algorithm name '%s'; use one of the following:\n\n", hash.c_str() );

        for( char const* name: names )
        {
            std::fprintf( stderr, "   %s\n", name );
        }

        return 1;
    }
}

Sample command:

hash2sum sha2_512_224 apache_builds.json canada.json citm_catalog.json twitter.json

Sample output:

a95d7fde785fe24f9507fd1709014567bbc595867f1abaad96f50dbc *apache_builds.json
b07e42587d10ec323a25fd8fc3eef2213fb0997beb7950350f4e8a4b *canada.json
4ceee5a83ad320fedb0dfddfb6f80af50b99677e87158e2d039aa168 *citm_catalog.json
854ebe0da98cadd426ea0fa3218d60bb52cf6494e435d2f385a37d48 *twitter.json
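
The digests in the sample output are the result bytes rendered as lowercase hexadecimal, two characters per byte. The following standalone sketch shows one way such a rendering can be produced; to_hex is a hypothetical helper introduced here for illustration, and is not the library's to_string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the library: renders the n bytes
// starting at p as lowercase hexadecimal, two characters per byte.
std::string to_hex( unsigned char const* p, std::size_t n )
{
    static char const digits[] = "0123456789abcdef";

    std::string r;
    r.reserve( n * 2 );

    for( std::size_t i = 0; i < n; ++i )
    {
        r += digits[ p[ i ] >> 4 ];   // high nibble
        r += digits[ p[ i ] & 0x0F ]; // low nibble
    }

    return r;
}
```

For a 28 byte result such as that of sha2_512_224, this yields the 56 character strings seen above.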

Compile Time Hashing

This example demonstrates calculating the MD5 digest of a data array, embedded in the program source, at compile time. It requires C++14.

#include <boost/hash2/md5.hpp>
#include <iostream>

// xxd -i resource
constexpr unsigned char resource[] = {
  0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x03, 0x6d, 0x90,
  0xcf, 0x6e, 0x83, 0x30, 0x0c, 0xc6, 0xcf, 0x45, 0xea, 0x3b, 0x78, 0x9c,
  0x4b, 0x02, 0x3d, 0x6e, 0xd0, 0x43, 0xff, 0x1c, 0x26, 0x55, 0x3b, 0x14,
  0x6d, 0xd7, 0x2a, 0x04, 0x43, 0x22, 0x95, 0x84, 0x25, 0x66, 0x8c, 0x47,
  0xda, 0x5b, 0x2e, 0x91, 0xd6, 0xcb, 0xd4, 0x93, 0x2d, 0xdb, 0xdf, 0xef,
  0xb3, 0x5d, 0x2a, 0x1a, 0x6e, 0xbb, 0x75, 0x52, 0x2a, 0x14, 0x6d, 0x8c,
  0x03, 0x92, 0x00, 0x45, 0x34, 0x66, 0xf8, 0x39, 0xe9, 0xaf, 0x2a, 0x75,
  0xd8, 0x39, 0xf4, 0x2a, 0x05, 0x69, 0x0d, 0xa1, 0xa1, 0x2a, 0xcd, 0x5f,
  0xe0, 0xfd, 0x72, 0xae, 0x5a, 0x2b, 0x79, 0x54, 0x73, 0x25, 0xbc, 0xda,
  0xb2, 0x98, 0xa6, 0x91, 0xc0, 0xef, 0xa8, 0xc6, 0xb6, 0x4b, 0x88, 0x17,
  0x6c, 0xb5, 0x43, 0x49, 0xda, 0xf4, 0x40, 0x16, 0xca, 0x80, 0x0f, 0xcc,
  0x2a, 0x7d, 0xa8, 0x7f, 0x50, 0x2c, 0xb9, 0xd8, 0x31, 0xc6, 0x22, 0xf9,
  0x8f, 0x58, 0xf2, 0xfb, 0xd6, 0x4f, 0x59, 0xb6, 0x4e, 0x56, 0x3f, 0x70,
  0xb0, 0xe3, 0xe2, 0x74, 0xaf, 0x08, 0xf6, 0x38, 0x08, 0x03, 0x47, 0x31,
  0xa3, 0xdf, 0xc0, 0x36, 0xcf, 0x8b, 0xd0, 0x3f, 0x6a, 0x4f, 0x4e, 0x37,
  0x13, 0x61, 0x0b, 0x93, 0x69, 0xd1, 0x01, 0x29, 0x84, 0xbd, 0xb5, 0x9e,
  0xa0, 0xb6, 0x1d, 0xcd, 0xc2, 0x21, 0x9c, 0xb5, 0x44, 0xe3, 0x71, 0x03,
  0x1f, 0xe8, 0xbc, 0xb6, 0x06, 0x0a, 0x96, 0x07, 0xd3, 0x55, 0x8d, 0x08,
  0x42, 0x4a, 0x3b, 0x8c, 0xc2, 0x2c, 0xf1, 0x86, 0x4e, 0xdf, 0xc2, 0xf4,
  0xeb, 0xe1, 0xf4, 0x56, 0x9f, 0xae, 0xc5, 0x35, 0x67, 0xf4, 0x4d, 0x60,
  0x5d, 0xf8, 0xcf, 0xb8, 0x80, 0xa0, 0x20, 0x89, 0xef, 0x7b, 0xe6, 0x7c,
  0x9e, 0x67, 0xd6, 0x44, 0x13, 0x66, 0x5d, 0xcf, 0xff, 0x29, 0xd6, 0x49,
  0x96, 0x85, 0x0b, 0x7e, 0x01, 0x36, 0x66, 0x95, 0x6b, 0x80, 0x01, 0x00,
  0x00
};

template<std::size_t N> constexpr auto md5( unsigned char const(&a)[ N ] )
{
    boost::hash2::md5_128 hash;
    hash.update( a, N );
    return hash.result();
}

constexpr auto resource_digest = md5( resource );

int main()
{
    std::cout << "Resource digest: " << resource_digest << std::endl;
}

Since the constexpr overload of update takes unsigned char const* (casting from void const* is not permitted in constant expressions before C++26), passing a character array of type char const[] directly to update will not compile in a constant expression. In that case, we can use hash_append_range instead of calling update, as in the following example.

#include <boost/hash2/sha2.hpp>
#include <boost/hash2/hash_append.hpp>
#include <iostream>

extern constexpr char const license[] =

"Boost Software License - Version 1.0 - August 17th, 2003\n"
"\n"
"Permission is hereby granted, free of charge, to any person or organization\n"
"obtaining a copy of the software and accompanying documentation covered by\n"
"this license (the \"Software\") to use, reproduce, display, distribute,\n"
"execute, and transmit the Software, and to prepare derivative works of the\n"
"Software, and to permit third-parties to whom the Software is furnished to\n"
"do so, all subject to the following:\n"
"\n"
"The copyright notices in the Software and this entire statement, including\n"
"the above license grant, this restriction and the following disclaimer,\n"
"must be included in all copies of the Software, in whole or in part, and\n"
"all derivative works of the Software, unless such copies or derivative\n"
"works are solely in the form of machine-executable object code generated by\n"
"a source language processor.\n"
"\n"
"THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n"
"IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n"
"FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT\n"
"SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE\n"
"FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,\n"
"ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER\n"
"DEALINGS IN THE SOFTWARE.\n"

;

constexpr unsigned char secret[] = {
    0xA4, 0x80, 0x0E, 0xE8, 0x20, 0x0B, 0x7C, 0x9A,
    0xF1, 0x3E, 0x3D, 0xEC, 0x64, 0x4F, 0x64, 0xCA,
    0x33, 0xCC, 0x84, 0xC8, 0x34, 0xE3, 0x08, 0xAE,
    0x92, 0x89, 0xEB, 0xD0, 0x47, 0x39, 0x87, 0xD8,
};

template<std::size_t N> constexpr auto hmac_sha2_256( char const(&s)[ N ] )
{
    boost::hash2::hmac_sha2_256 hmac( secret, sizeof(secret) );

    // N-1, in order to not include the null terminator
    boost::hash2::hash_append_range( hmac, {}, s, s + N - 1 );

    return hmac.result();
}

constexpr auto license_mac = hmac_sha2_256( license );

int main()
{
    std::cout << "License authentication code: " << license_mac << std::endl;
}
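
To illustrate why a char const[] needs separate handling in constexpr code, the sketch below hashes a string literal at compile time by converting each char to unsigned char individually. It is a standalone FNV-1a loop written for this illustration (fnv1a_32_of is a hypothetical name), requires C++14, and makes no claim about how hash_append_range is implemented.

```cpp
#include <cstddef>
#include <cstdint>

// Standalone sketch, not library code: FNV-1a (32 bit) over a char array,
// usable in constant expressions. Each char is converted to unsigned char
// on its own, because reinterpreting the whole array as bytes is not
// possible during constant evaluation.
template<std::size_t N>
constexpr std::uint32_t fnv1a_32_of( char const(&s)[ N ] )
{
    std::uint32_t state = 0x811c9dc5u;

    // stop at N-1, in order to not include the null terminator
    for( std::size_t i = 0; i + 1 < N; ++i )
    {
        state = ( state ^ static_cast<unsigned char>( s[ i ] ) ) * 0x01000193u;
    }

    return state;
}

static_assert( fnv1a_32_of( "" ) == 0x811c9dc5u, "empty input leaves the initial state" );
static_assert( fnv1a_32_of( "a" ) == 0xe40c292cu, "published FNV-1a test vector for \"a\"" );
```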

Use with Unordered Containers

To use one of our hash algorithms (such as fnv1a_64) with an unordered container (such as boost::unordered_flat_map), we need to create an adaptor class that exposes an interface compatible with std::hash<T>.

To do that, in the operator()(T const& v) member function of our adaptor, we need to create an instance h of the hash algorithm, use hash_append(h, {}, v) to send v to it, and then extract the result using h.result() and return it as std::size_t.

The minimal working example below illustrates this approach.

#include <boost/hash2/fnv1a.hpp>
#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/get_integral_result.hpp>
#include <boost/unordered/unordered_flat_map.hpp>
#include <string>

template<class T, class H> class hash
{
public:

    std::size_t operator()( T const& v ) const
    {
        H h;
        boost::hash2::hash_append( h, {}, v );
        return boost::hash2::get_integral_result<std::size_t>( h.result() );
    }
};

int main()
{
    using hasher = hash<std::string, boost::hash2::fnv1a_64>;

    boost::unordered_flat_map<std::string, int, hasher> map;

    map[ "foo" ] = 1;
    map[ "bar" ] = 2;
}
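
The adaptor shape does not depend on the container; it also works with std::unordered_map. The sketch below shows this without any library dependencies, so it makes two simplifying assumptions: toy_fnv1a_64 is a hypothetical stand-in for a library hash algorithm, and the adaptor is specific to std::string because it passes the string bytes to update directly instead of going through hash_append.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a library hash algorithm (FNV-1a, 64 bit),
// reduced to the two members this sketch needs.
struct toy_fnv1a_64
{
    std::uint64_t state_ = 0xcbf29ce484222325ull;

    void update( void const* p, std::size_t n )
    {
        unsigned char const* q = static_cast<unsigned char const*>( p );

        for( std::size_t i = 0; i < n; ++i )
        {
            state_ = ( state_ ^ q[ i ] ) * 0x100000001b3ull;
        }
    }

    std::uint64_t result() const { return state_; }
};

// Adaptor with a std::hash-compatible interface. It handles std::string
// only, because it calls update directly rather than hash_append.
template<class H> struct string_hash
{
    std::size_t operator()( std::string const& v ) const
    {
        H h;
        h.update( v.data(), v.size() );
        return static_cast<std::size_t>( h.result() );
    }
};

using string_map = std::unordered_map<std::string, int, string_hash<toy_fnv1a_64>>;
```

A string_map can then be used like any other std::unordered_map, for example map[ "foo" ] = 1.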

Since hash<T, H> is templated not just on the key type T, but also on the hash algorithm type H, we can easily switch from fnv1a_64 to another hash algorithm, for example siphash_64, by only changing the line

    using hasher = hash<std::string, boost::hash2::fnv1a_64>;

to

    using hasher = hash<std::string, boost::hash2::siphash_64>;

This will work, but SipHash is not intended to be used without an initial random seed, and we don’t pass any. To rectify this, let’s modify hash<T, H> to have a constructor taking a seed of type uint64_t:

#include <boost/hash2/siphash.hpp>
#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/get_integral_result.hpp>
#include <boost/unordered/unordered_flat_map.hpp>
#include <string>

template<class T, class H> class hash
{
private:

    std::uint64_t seed_;

public:

    explicit hash( std::uint64_t seed ): seed_( seed )
    {
    }

    std::size_t operator()( T const& v ) const
    {
        H h( seed_ );
        boost::hash2::hash_append( h, {}, v );
        return boost::hash2::get_integral_result<std::size_t>( h.result() );
    }
};

int main()
{
    std::uint64_t seed = 0x0102030405060708ull;

    using hasher = hash<std::string, boost::hash2::siphash_64>;

    boost::unordered_flat_map<std::string, int, hasher> map( 0, hasher( seed ) );

    map[ "foo" ] = 1;
    map[ "bar" ] = 2;
}

Note: In real code, the seed will not be a hardcoded constant; ideally, every unordered container instance will have its own random and unpredictable seed.
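
One possible way to obtain such a seed is sketched below, under the assumption that std::random_device is a suitable entropy source on the target platform. make_random_seed is a hypothetical helper introduced for illustration; it is not part of the library.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: builds a 64 bit seed from two 32 bit draws of
// std::random_device. How unpredictable std::random_device is depends
// on the implementation, so treat this as a starting point only.
std::uint64_t make_random_seed()
{
    std::random_device rd;

    std::uint64_t const hi = static_cast<std::uint32_t>( rd() );
    std::uint64_t const lo = static_cast<std::uint32_t>( rd() );

    return ( hi << 32 ) | lo;
}
```

The returned value can then be passed to the adaptor, as in hasher( make_random_seed() ).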

Since all hash algorithms that conform to our library requirements are constructible with an initial seed of type uint64_t, the above will work with any of them.

This is good enough for all practical purposes, but in principle, SipHash (in its 64 bit variant) takes a 16 byte key per its specification, and we (effectively) only pass 8 bytes. We could modify our hash yet again, this time using a constructor that takes a sequence of bytes as the seed:

#include <boost/hash2/siphash.hpp>
#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/get_integral_result.hpp>
#include <boost/unordered/unordered_flat_map.hpp>
#include <string>

template<class T, class H> class hash
{
private:

    H h_;

public:

    hash( unsigned char const* p, std::size_t n ): h_( p, n )
    {
    }

    std::size_t operator()( T const& v ) const
    {
        H h( h_ );
        boost::hash2::hash_append( h, {}, v );
        return boost::hash2::get_integral_result<std::size_t>( h.result() );
    }
};

int main()
{
    unsigned char const seed[ 16 ] =
    {
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
        0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x10
    };

    using hasher = hash<std::string, boost::hash2::siphash_64>;

    boost::unordered_flat_map<std::string, int, hasher> map( 0, hasher( seed, sizeof(seed) ) );

    map[ "foo" ] = 1;
    map[ "bar" ] = 2;
}

As before, construction from a byte sequence is a required part of the hash algorithm interface, so the above will work with any conforming hash algorithm.

Storing the seed itself, as we did in the uint64_t case, would require an allocation: n can be arbitrary, so the seed would have to be kept in a std::vector<unsigned char>. To avoid this, we construct an instance h_ of the hash algorithm, passing it the seed, to capture the initial seeded state, and then copy this seeded instance in operator().

But once we’ve done that, we might notice that we can construct this initial instance h_ using any of the three supported constructors, not just the one taking a byte sequence:

#include <boost/hash2/siphash.hpp>
#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/get_integral_result.hpp>
#include <boost/unordered/unordered_flat_map.hpp>
#include <string>

template<class T, class H> class hash
{
private:

    H h_;

public:

    hash(): h_()
    {
    }

    explicit hash( std::uint64_t seed ): h_( seed )
    {
    }

    hash( unsigned char const* p, std::size_t n ): h_( p, n )
    {
    }

    std::size_t operator()( T const& v ) const
    {
        H h( h_ );
        boost::hash2::hash_append( h, {}, v );
        return boost::hash2::get_integral_result<std::size_t>( h.result() );
    }
};

int main()
{
    using hasher = hash<std::string, boost::hash2::siphash_64>;

    {
        boost::unordered_flat_map<std::string, int, hasher> map;

        map[ "foo" ] = 1;
        map[ "bar" ] = 2;
    }

    {
        std::uint64_t seed = 0x0102030405060708ull;

        boost::unordered_flat_map<std::string, int, hasher> map( 0, hasher( seed ) );

        map[ "foo" ] = 1;
        map[ "bar" ] = 2;
    }

    {
        unsigned char const seed[ 16 ] =
        {
            0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
            0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x10
        };

        boost::unordered_flat_map<std::string, int, hasher> map( 0, hasher(seed, sizeof(seed)) );

        map[ "foo" ] = 1;
        map[ "bar" ] = 2;
    }
}

This variation of hash<T, H> is universal; it can be used without a seed, with an unsigned integer seed, and with a byte sequence seed.

Note: In real code, you might want to omit the default constructor, to avoid the possibility of accidentally using an unseeded hash algorithm.

There’s one final modification we could make to hash. In the examples above, we unconditionally use the 64 bit variant of SipHash, even though we only need a result of type std::size_t because that’s what std::hash mandates.

It would be better for performance if we used siphash_32 when std::size_t is 32 bit, and siphash_64 when it’s 64 bit.

For that, we can make hash take two hash algorithms, one 32 bit and one 64 bit, and have it pick the appropriate one automatically:

#include <boost/hash2/siphash.hpp>
#include <boost/hash2/md5.hpp>
#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/get_integral_result.hpp>
#include <boost/unordered/unordered_flat_map.hpp>
#include <boost/core/type_name.hpp>
#include <type_traits>
#include <string>
#include <iostream>

template<class T, class H1, class H2 = H1> class hash
{
public:

    using hash_type = typename std::conditional<
        sizeof(typename H1::result_type) == sizeof(std::size_t), H1, H2
    >::type;

private:

    hash_type h_;

public:

    hash(): h_()
    {
    }

    explicit hash( std::uint64_t seed ): h_( seed )
    {
    }

    hash( unsigned char const* p, std::size_t n ): h_( p, n )
    {
    }

    std::size_t operator()( T const& v ) const
    {
        hash_type h( h_ );
        boost::hash2::hash_append( h, {}, v );
        return boost::hash2::get_integral_result<std::size_t>( h.result() );
    }
};

int main()
{
    {
        using hasher = hash<std::string, boost::hash2::siphash_32, boost::hash2::siphash_64>;

        std::cout << boost::core::type_name<hasher>() << " uses "
            << boost::core::type_name<hasher::hash_type>() << std::endl;

        boost::unordered_flat_map<std::string, int, hasher> map;

        map[ "foo" ] = 1;
        map[ "bar" ] = 2;
    }

    {
        using hasher = hash<std::string, boost::hash2::md5_128>;

        std::cout << boost::core::type_name<hasher>() << " uses "
            << boost::core::type_name<hasher::hash_type>() << std::endl;

        boost::unordered_flat_map<std::string, int, hasher> map;

        map[ "foo" ] = 1;
        map[ "bar" ] = 2;
    }
}

To keep the case where we only pass one hash algorithm working, we default the second template parameter to the first one, so that if only one hash algorithm is passed, it will always be used.
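
The selection rule can be checked in isolation. The following sketch restates it as an alias template and verifies both cases at compile time; algo_32, algo_64, select_hash and expected are hypothetical names used only here, and the two structs are stand-ins that carry nothing but a result_type.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for a 32 bit and a 64 bit hash algorithm;
// only result_type matters for the selection.
struct algo_32 { using result_type = std::uint32_t; };
struct algo_64 { using result_type = std::uint64_t; };

// Same rule as in hash<T, H1, H2>: use H1 if its result has the size
// of std::size_t, otherwise fall back to H2.
template<class H1, class H2 = H1>
using select_hash = typename std::conditional<
    sizeof(typename H1::result_type) == sizeof(std::size_t), H1, H2
>::type;

// The expected choice on this platform, computed independently.
using expected = typename std::conditional<
    sizeof(std::size_t) == 4, algo_32, algo_64
>::type;

static_assert( std::is_same<select_hash<algo_32, algo_64>, expected>::value,
    "picks the algorithm whose result matches std::size_t" );

static_assert( std::is_same<select_hash<algo_64>, algo_64>::value,
    "a single algorithm is always selected" );
```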

Result Extension

Some of our hash algorithms, such as xxhash_64 and siphash_64, have more than 64 bits of internal state, but only produce a 64 bit result.

If we’re using one of these algorithms to produce file or content checksums, do not tolerate collisions, and operate on a large number of files or items (many millions), it might be better to use a 128 bit digest instead.

Since the algorithms maintain more than 64 bits of state, we can call result() twice and obtain a meaningful 128 bit result.

The following example demonstrates how. It defines an algorithm xxhash_128 which is implemented by wrapping xxhash_64 and redefining its result_type and result members appropriately:

#include <boost/hash2/xxhash.hpp>
#include <boost/hash2/digest.hpp>
#include <boost/endian/conversion.hpp>

class xxhash_128: private boost::hash2::xxhash_64
{
public:

    using result_type = boost::hash2::digest<16>;

    using xxhash_64::xxhash_64;
    using xxhash_64::update;

    result_type result()
    {
        std::uint64_t r1 = xxhash_64::result();
        std::uint64_t r2 = xxhash_64::result();

        result_type r = {};

        boost::endian::store_little_u64( r.data() + 0, r1 );
        boost::endian::store_little_u64( r.data() + 8, r2 );

        return r;
    }
};

#include <string>
#include <iostream>

int main()
{
    std::string tv( "The quick brown fox jumps over the lazy dog" );

    xxhash_128 hash( 43 );
    hash.update( tv.data(), tv.size() );

    std::cout << hash.result() << std::endl;
}
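
The same idea extends to any number of result() calls. The sketch below is a hypothetical generalization, assuming an algorithm whose result() returns std::uint64_t; extended_result and counting_algorithm are names introduced here for illustration. As noted above, the extended output is only meaningful for algorithms that keep more state than a single result reveals.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Calls h.result() K times and stores the 64 bit values back to back,
// each in little-endian byte order. H is any type whose result()
// returns std::uint64_t.
template<std::size_t K, class H>
std::array<unsigned char, 8 * K> extended_result( H& h )
{
    std::array<unsigned char, 8 * K> r = {};

    for( std::size_t i = 0; i < K; ++i )
    {
        std::uint64_t const v = h.result();

        for( std::size_t j = 0; j < 8; ++j )
        {
            r[ 8 * i + j ] = static_cast<unsigned char>( v >> ( 8 * j ) );
        }
    }

    return r;
}

// Hypothetical test double, not a hash algorithm: result() returns
// 1, 2, 3, ... so that the byte layout is easy to inspect.
struct counting_algorithm
{
    std::uint64_t next_ = 1;
    std::uint64_t result() { return next_++; }
};
```

With K equal to 2, the layout matches the two store_little_u64 calls in xxhash_128::result().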

Implementation Features

Supported Compilers

The library requires C++11. The following compilers:

g++ 4.8 or later
clang++ 3.9 or later
Visual Studio 2015 and above

are being tested on GitHub Actions and AppVeyor.

Reference

Hash Algorithms

<boost/hash2/fnv1a.hpp>

namespace boost {
namespace hash2 {

class fnv1a_32;
class fnv1a_64;

} // namespace hash2
} // namespace boost

This header implements the FNV-1a algorithm, in 32 and 64 bit variants.

fnv1a_32

class fnv1a_32
{
private:

    std::uint32_t state_; // exposition only

public:

    using result_type = std::uint32_t;

    constexpr fnv1a_32();
    explicit constexpr fnv1a_32( std::uint64_t seed );
    constexpr fnv1a_32( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr fnv1a_32();

Default constructor.

Effects:: Initializes state_ to 0x811c9dc5.

explicit constexpr fnv1a_32( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8) where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr fnv1a_32( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, and then, if n is not zero, performs update(p, n).
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: For each unsigned char value ch in the range [p, p+n) performs state_ = (state_ ^ ch) * 0x01000193.

result

constexpr result_type result();

Effects:: Updates state_ to (state_ ^ 0xFF) * 0x01000193.
Returns:: The value of state_ before the update.
Remarks:: The state is updated to allow repeated calls to result() to return a pseudorandom sequence of result_type values, effectively extending the output.
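
The Effects clauses above translate almost directly into code. The following standalone sketch follows them step by step; fnv1a_32_sketch is a hypothetical name, the constexpr qualifiers are omitted for brevity, and the class is an illustration of the specified behavior rather than the library's implementation.

```cpp
#include <cstddef>
#include <cstdint>

class fnv1a_32_sketch
{
private:

    std::uint32_t state_;

public:

    using result_type = std::uint32_t;

    fnv1a_32_sketch(): state_( 0x811c9dc5u )
    {
    }

    explicit fnv1a_32_sketch( std::uint64_t seed ): state_( 0x811c9dc5u )
    {
        if( seed != 0 )
        {
            // little-endian representation of the value of seed
            unsigned char tmp[ 8 ];

            for( int i = 0; i < 8; ++i )
            {
                tmp[ i ] = static_cast<unsigned char>( seed >> ( 8 * i ) );
            }

            update( tmp, 8 );
        }
    }

    fnv1a_32_sketch( unsigned char const* p, std::size_t n ): state_( 0x811c9dc5u )
    {
        if( n != 0 ) update( p, n );
    }

    void update( void const* p, std::size_t n )
    {
        unsigned char const* q = static_cast<unsigned char const*>( p );

        for( std::size_t i = 0; i < n; ++i )
        {
            state_ = ( state_ ^ q[ i ] ) * 0x01000193u;
        }
    }

    result_type result()
    {
        // return the state before the update, then advance it
        result_type r = state_;
        state_ = ( state_ ^ 0xFFu ) * 0x01000193u;
        return r;
    }
};
```

Because update processes one byte at a time, splitting a message across several update calls gives the same result as a single call over the whole message.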

fnv1a_64

class fnv1a_64
{
private:

    std::uint64_t state_; // exposition only

public:

    using result_type = std::uint64_t;

    constexpr fnv1a_64();
    explicit constexpr fnv1a_64( std::uint64_t seed );
    constexpr fnv1a_64( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr fnv1a_64();

Default constructor.

Effects:: Initializes state_ to 0xcbf29ce484222325.

explicit constexpr fnv1a_64( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8) where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr fnv1a_64( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, and then, if n is not zero, performs update(p, n).
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: For each unsigned char value ch in the range [p, p+n) performs state_ = (state_ ^ ch) * 0x100000001b3.

result

constexpr result_type result();

Effects:: Updates state_ to (state_ ^ 0xFF) * 0x100000001b3.
Returns:: The value of state_ before the update.
Remarks:: The state is updated to allow repeated calls to result() to return a pseudorandom sequence of result_type values, effectively extending the output.

<boost/hash2/xxhash.hpp>

namespace boost {
namespace hash2 {

class xxhash_32;
class xxhash_64;

} // namespace hash2
} // namespace boost

This header implements the XXH32 and XXH64 algorithms.

xxhash_32

class xxhash_32
{
public:

    using result_type = std::uint32_t;

    constexpr xxhash_32();
    explicit constexpr xxhash_32( std::uint64_t seed );
    constexpr xxhash_32( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr xxhash_32();

Default constructor.

Effects:: Initializes the internal state of the XXH32 algorithm to its initial values.

explicit constexpr xxhash_32( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the internal state of the XXH32 algorithm using the low 32 bits of seed as the seed, then if the high 32 bits of seed aren’t zero, mixes them into the state.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr xxhash_32( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the XXH32 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Obtains a 32 bit hash value from the state as specified by XXH32, then updates the state.
Returns:: The obtained hash value.
Remarks:: The state is updated to allow repeated calls to result() to return a pseudorandom sequence of result_type values, effectively extending the output.

xxhash_64

class xxhash_64
{
public:

    using result_type = std::uint64_t;

    constexpr xxhash_64();
    explicit constexpr xxhash_64( std::uint64_t seed );
    constexpr xxhash_64( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr xxhash_64();

Default constructor.

Effects:: Initializes the internal state of the XXH64 algorithm to its initial values.

explicit constexpr xxhash_64( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the internal state of the XXH64 algorithm using seed as the seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr xxhash_64( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the XXH64 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Obtains a 64 bit hash value from the state as specified by XXH64, then updates the state.
Returns:: The obtained hash value.
Remarks:: The state is updated to allow repeated calls to result() to return a pseudorandom sequence of result_type values, effectively extending the output.

<boost/hash2/siphash.hpp>

namespace boost {
namespace hash2 {

class siphash_32;
class siphash_64;

} // namespace hash2
} // namespace boost

This header implements the SipHash and HalfSipHash algorithms.

siphash_32

class siphash_32
{
public:

    using result_type = std::uint32_t;

    constexpr siphash_32();
    explicit constexpr siphash_32( std::uint64_t seed );
    constexpr siphash_32( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr siphash_32();

Default constructor.

Effects:: Initializes the internal state of the HalfSipHash algorithm as if using a sequence of 8 zero bytes as the key.

explicit constexpr siphash_32( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the internal state of the HalfSipHash algorithm using seed as the key, as if it were a sequence of its 8 constituent bytes, in little-endian order.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr siphash_32( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: If n is 8, initializes the state as specified by the algorithm; otherwise, initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the HalfSipHash algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Obtains a 32 bit hash value from the state as specified by HalfSipHash, then updates the state.
Returns:: The obtained hash value.
Remarks:: The state is updated, which allows repeated calls to result() to return a pseudorandom sequence of result_type values, effectively extending the output.

siphash_64

class siphash_64
{
public:

    using result_type = std::uint64_t;

    constexpr siphash_64();
    explicit constexpr siphash_64( std::uint64_t seed );
    constexpr siphash_64( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr siphash_64();

Default constructor.

Effects:: Initializes the internal state of the SipHash algorithm as if using a sequence of 16 zero bytes as the key.

explicit constexpr siphash_64( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the internal state of the SipHash algorithm using seed as the key, as if it were a sequence of its 8 constituent bytes, in little-endian order, followed by 8 zero bytes.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr siphash_64( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: If n is 16, initializes the state as specified by the algorithm; otherwise, initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the SipHash algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Obtains a 64 bit hash value from the state as specified by SipHash, then updates the state.
Returns:: The obtained hash value.
Remarks:: The state is updated, which allows repeated calls to result() to return a pseudorandom sequence of result_type values, effectively extending the output.

<boost/hash2/hmac.hpp>

namespace boost {
namespace hash2 {

template<class H> class hmac;

} // namespace hash2
} // namespace boost

This header implements the HMAC algorithm.

hmac

template<class H> class hmac
{
public:

    using result_type = typename H::result_type;

    static constexpr int block_size = H::block_size;

    constexpr hmac();
    explicit constexpr hmac( std::uint64_t seed );
    constexpr hmac( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

The class template hmac takes as a parameter a cryptographic hash algorithm H and implements the corresponding hash-based message authentication code (HMAC) algorithm.

For example, HMAC-SHA2-256 is implemented by hmac<sha2_256>.

Constructors

constexpr hmac();

Default constructor.

Effects:: Initializes the internal state using an empty byte sequence as the secret key.

explicit constexpr hmac( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: If seed is zero, initializes the state as if by default construction, otherwise, initializes it using the 8 bytes of the little-endian representation of seed as the secret key.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr hmac( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as specified by the HMAC algorithm using [p, p+n) as the secret key.
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the HMAC algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Pads the accumulated message and finalizes the HMAC digest.
Returns:: The HMAC digest of the message formed from the byte sequences of the preceding calls to update.
Remarks:: Repeated calls to result() return a pseudorandom sequence of result_type values, effectively extending the output.

<boost/hash2/md5.hpp>

#include <boost/hash2/hmac.hpp>
#include <boost/hash2/digest.hpp>

namespace boost {
namespace hash2 {

class md5_128;

using hmac_md5_128 = hmac<md5_128>;

} // namespace hash2
} // namespace boost

This header implements the MD5 algorithm.

md5_128

class md5_128
{
public:

    using result_type = digest<16>;

    static constexpr int block_size = 64;

    constexpr md5_128();
    explicit constexpr md5_128( std::uint64_t seed );
    constexpr md5_128( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr md5_128();

Default constructor.

Effects:: Initializes the internal state of the MD5 algorithm to its initial values.

explicit constexpr md5_128( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8); result(); where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.
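
The little-endian representation mentioned above stores the least significant byte of seed first. A minimal sketch of how such a representation can be built without depending on the platform byte order follows; seed_to_little_endian is a hypothetical helper used for illustration, not a library function.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: the 8 bytes of `seed`, least significant byte first.
inline std::array<unsigned char, 8> seed_to_little_endian( std::uint64_t seed )
{
    std::array<unsigned char, 8> r{};

    for( int i = 0; i < 8; ++i )
    {
        // Shift the i-th byte into the low position and keep only that byte.
        r[ i ] = static_cast<unsigned char>( ( seed >> ( 8 * i ) ) & 0xFF );
    }

    return r;
}
```

The resulting eight bytes are what the constructor is described as passing to update(p, 8).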

constexpr md5_128( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the MD5 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.
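
The concatenation property can be illustrated with a toy algorithm. The sketch below uses 32-bit FNV-1a as a stand-in, because it is short enough to read in full; toy_fnv1a, hash_whole and hash_in_pieces are hypothetical names and are not part of the library. The same property is what the remark above states for MD5.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a hash algorithm (32-bit FNV-1a), not part of the library.
struct toy_fnv1a
{
    using result_type = std::uint32_t;

    void update( void const* p, std::size_t n )
    {
        unsigned char const* b = static_cast<unsigned char const*>( p );

        for( std::size_t i = 0; i < n; ++i )
        {
            state_ ^= b[ i ];
            state_ *= 16777619u; // FNV prime
        }
    }

    result_type result() const { return state_; }

private:

    std::uint32_t state_ = 2166136261u; // FNV offset basis
};

// Feeds "hello world" in a single call.
inline std::uint32_t hash_whole()
{
    toy_fnv1a h;
    h.update( "hello world", 11 );
    return h.result();
}

// Feeds the same bytes split across two calls.
inline std::uint32_t hash_in_pieces()
{
    toy_fnv1a h;
    h.update( "hello ", 6 );
    h.update( "world", 5 );
    return h.result();
}
```

Both functions see the same byte stream, so they return the same value.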

result

constexpr result_type result();

Effects:: Pads the accumulated message and finalizes the MD5 digest.
Returns:: The MD5 digest of the message formed from the byte sequences of the preceding calls to update.
Remarks:: Repeated calls to result() return a pseudorandom sequence of result_type values, effectively extending the output.

<boost/hash2/sha1.hpp>

#include <boost/hash2/hmac.hpp>
#include <boost/hash2/digest.hpp>

namespace boost {
namespace hash2 {

class sha1_160;

using hmac_sha1_160 = hmac<sha1_160>;

} // namespace hash2
} // namespace boost

This header implements the SHA-1 algorithm.

sha1_160

class sha1_160
{
public:

    using result_type = digest<20>;

    static constexpr int block_size = 64;

    constexpr sha1_160();
    explicit constexpr sha1_160( std::uint64_t seed );
    constexpr sha1_160( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr sha1_160();

Default constructor.

Effects:: Initializes the internal state of the SHA-1 algorithm to its initial values.

explicit constexpr sha1_160( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8); result(); where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr sha1_160( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the SHA-1 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Pads the accumulated message and finalizes the SHA-1 digest.
Returns:: The SHA-1 digest of the message formed from the byte sequences of the preceding calls to update.
Remarks:: Repeated calls to result() return a pseudorandom sequence of result_type values, effectively extending the output.

<boost/hash2/sha2.hpp>

#include <boost/hash2/hmac.hpp>
#include <boost/hash2/digest.hpp>

namespace boost {
namespace hash2 {

class sha2_256;
class sha2_224;
class sha2_512;
class sha2_384;
class sha2_512_256;
class sha2_512_224;

using hmac_sha2_256 = hmac<sha2_256>;
using hmac_sha2_224 = hmac<sha2_224>;
using hmac_sha2_512 = hmac<sha2_512>;
using hmac_sha2_384 = hmac<sha2_384>;
using hmac_sha2_512_256 = hmac<sha2_512_256>;
using hmac_sha2_512_224 = hmac<sha2_512_224>;

} // namespace hash2
} // namespace boost

This header implements the SHA-2 family of functions.

sha2_256

class sha2_256
{
public:

    using result_type = digest<32>;

    static constexpr int block_size = 64;

    constexpr sha2_256();
    constexpr explicit sha2_256( std::uint64_t seed );
    constexpr sha2_256( unsigned char const * p, std::size_t n );

    void update( void const * p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr sha2_256();

Default constructor.

Effects:: Initializes the internal state of the SHA-256 algorithm to its initial values.

constexpr explicit sha2_256( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8); result(); where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr sha2_256( unsigned char const * p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const * p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the SHA-256 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Pads the accumulated message and finalizes the SHA-256 digest.
Returns:: The SHA-256 digest of the message formed from the byte sequences of the preceding calls to update.
Remarks:: Repeated calls to result() return a pseudorandom sequence of result_type values, effectively extending the output.

sha2_224

The SHA-224 algorithm is the same as the SHA-256 algorithm described above, except for the internal state’s initial values and the size of the message digest, which is:

using result_type = digest<28>;

All other operations and constants are identical.

The message digest is obtained by truncating the final state, computed as in SHA-256 but starting from the SHA-224 initial values, to its leftmost 224 bits.
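
Truncation to the leftmost bits simply keeps the leading bytes of the final value. A minimal sketch, assuming the full value is available as a byte array, is given below; leftmost is a hypothetical helper, not a library function. Because the initial values differ, applying it to a SHA-256 digest does not yield a SHA-224 digest.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: keeps the leftmost M bytes of an N-byte value.
template<std::size_t M, std::size_t N>
std::array<unsigned char, M> leftmost( std::array<unsigned char, N> const& full )
{
    static_assert( M <= N, "cannot keep more bytes than are available" );

    std::array<unsigned char, M> r{};

    for( std::size_t i = 0; i < M; ++i )
    {
        r[ i ] = full[ i ];
    }

    return r;
}
```

For SHA-224, M would be 28 (224 bits) and N would be 32 (256 bits).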

sha2_512

class sha2_512
{
public:

    using result_type = digest<64>;

    static constexpr int block_size = 128;

    constexpr sha2_512();
    constexpr explicit sha2_512( std::uint64_t seed );
    constexpr sha2_512( unsigned char const * p, std::size_t n );

    void update( void const * p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr sha2_512();

Default constructor.

Effects:: Initializes the internal state of the SHA-512 algorithm to its initial values.

constexpr explicit sha2_512( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8); result(); where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr sha2_512( unsigned char const * p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const * p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the SHA-512 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Pads the accumulated message and finalizes the SHA-512 digest.
Returns:: The SHA-512 digest of the message formed from the byte sequences of the preceding calls to update.
Remarks:: Repeated calls to result() return a pseudorandom sequence of result_type values, effectively extending the output.

sha2_384

The SHA-384 algorithm is the same as the SHA-512 algorithm described above, except for the internal state’s initial values and the size of the message digest, which is:

using result_type = digest<48>;

All other operations and constants are identical.

The message digest is obtained by truncating the final state, computed as in SHA-512 but starting from the SHA-384 initial values, to its leftmost 384 bits.

sha2_512_224

The SHA-512/224 algorithm is the same as the SHA-512 algorithm described above, except for the internal state’s initial values and the size of the message digest, which is:

using result_type = digest<28>;

All other operations and constants are identical.

The message digest is obtained by truncating the final state, computed as in SHA-512 but starting from the SHA-512/224 initial values, to its leftmost 224 bits.

sha2_512_256

The SHA-512/256 algorithm is the same as the SHA-512 algorithm described above, except for the internal state’s initial values and the size of the message digest, which is:

using result_type = digest<32>;

All other operations and constants are identical.

The message digest is obtained by truncating the final state, computed as in SHA-512 but starting from the SHA-512/256 initial values, to its leftmost 256 bits.

<boost/hash2/ripemd.hpp>

#include <boost/hash2/hmac.hpp>
#include <boost/hash2/digest.hpp>

namespace boost {
namespace hash2 {

class ripemd_160;
class ripemd_128;

using hmac_ripemd_160 = hmac<ripemd_160>;
using hmac_ripemd_128 = hmac<ripemd_128>;

} // namespace hash2
} // namespace boost

This header implements the RIPEMD-160 and RIPEMD-128 algorithms.

ripemd_160

class ripemd_160
{
public:

    using result_type = digest<20>;

    static constexpr int block_size = 64;

    constexpr ripemd_160();
    explicit constexpr ripemd_160( std::uint64_t seed );
    constexpr ripemd_160( unsigned char const* p, std::size_t n );

    void update( void const* p, std::size_t n );
    constexpr void update( unsigned char const* p, std::size_t n );

    constexpr result_type result();
};

Constructors

constexpr ripemd_160();

Default constructor.

Effects:: Initializes the internal state of the RIPEMD-160 algorithm to its initial values.

explicit constexpr ripemd_160( std::uint64_t seed );

Constructor taking an integer seed value.

Effects:: Initializes the state as if by default construction, then if seed is not zero, performs update(p, 8); result(); where p points to a little-endian representation of the value of seed.
Remarks:: By convention, if seed is zero, the effect of this constructor is the same as default construction.

constexpr ripemd_160( unsigned char const* p, std::size_t n );

Constructor taking a byte sequence seed.

Effects:: Initializes the state as if by default construction, then if n is not zero, performs update(p, n); result().
Remarks:: By convention, if n is zero, the effect of this constructor is the same as default construction.

update

void update( void const* p, std::size_t n );
constexpr void update( unsigned char const* p, std::size_t n );

Effects:: Updates the internal state of the RIPEMD-160 algorithm from the byte sequence [p, p+n).
Remarks:: Consecutive calls to update are equivalent to a single call with the concatenated byte sequences of the individual calls.

result

constexpr result_type result();

Effects:: Pads the accumulated message and finalizes the RIPEMD-160 digest.
Returns:: The RIPEMD-160 digest of the message formed from the byte sequences of the preceding calls to update.
Remarks:: Repeated calls to result() return a pseudorandom sequence of result_type values, effectively extending the output.

ripemd_128

The RIPEMD-128 algorithm is the same as the RIPEMD-160 algorithm described above, except for the number of rounds used and the size of the message digest, which is:

using result_type = digest<16>;

All other operations and constants are identical.

Utilities and Traits

<boost/hash2/digest.hpp>

namespace boost {
namespace hash2 {

template<std::size_t N> class digest
{
private: // exposition only

    unsigned char data_[ N ] = {};

public:

    using value_type = unsigned char;
    using reference = unsigned char&;
    using const_reference = unsigned char const&;
    using iterator = unsigned char*;
    using const_iterator = unsigned char const*;
    using size_type = std::size_t;
    using difference_type = std::ptrdiff_t;

    // constructors

    constexpr digest() = default;
    constexpr digest( unsigned char const (&v)[ N ] ) noexcept;

    // copy

    constexpr digest( digest const& ) = default;
    constexpr digest& operator=( digest const& ) = default;

    // iteration

    constexpr iterator begin() noexcept;
    constexpr const_iterator begin() const noexcept;

    constexpr iterator end() noexcept;
    constexpr const_iterator end() const noexcept;

    // data, size

    constexpr unsigned char* data() noexcept;
    constexpr unsigned char const* data() const noexcept;

    constexpr size_type size() const noexcept;
    constexpr size_type max_size() const noexcept;

    // element access

    constexpr reference operator[]( std::size_t i );
    constexpr const_reference operator[]( std::size_t i ) const;

    constexpr reference front() noexcept;
    constexpr const_reference front() const noexcept;

    constexpr reference back() noexcept;
    constexpr const_reference back() const noexcept;
};

// comparisons

template<std::size_t N>
constexpr bool operator==( digest<N> const& a, digest<N> const& b ) noexcept;

template<std::size_t N>
constexpr bool operator!=( digest<N> const& a, digest<N> const& b ) noexcept;

// to_chars

template<std::size_t N>
constexpr char* to_chars( digest<N> const& v, char* first, char* last ) noexcept;

template<std::size_t N, std::size_t M>
constexpr void to_chars( digest<N> const& v, char (&w)[ M ] ) noexcept;

// operator<<

template<std::size_t N>
std::ostream& operator<<( std::ostream& os, digest<N> const& v );

// to_string

template<std::size_t N>
std::string to_string( digest<N> const& v );

} // namespace hash2
} // namespace boost

digest

digest<N> is a constexpr-friendly class template similar to std::array<unsigned char, N>. It is used to store the resulting message digest of hash algorithms such as SHA-256 or RIPEMD-160.

Constructors

constexpr digest() = default;

Effects:: Zero-initializes data_.

constexpr digest( unsigned char const (&v)[ N ] ) noexcept;

Effects:: Initializes the elements of data_ from the corresponding elements of v.

Iteration

constexpr iterator begin() noexcept;
constexpr const_iterator begin() const noexcept;

Returns:: data_.

constexpr iterator end() noexcept;
constexpr const_iterator end() const noexcept;

Returns:: data_ + N.

Accessors

constexpr unsigned char* data() noexcept;
constexpr unsigned char const* data() const noexcept;

Returns:: data_.

constexpr size_type size() const noexcept;
constexpr size_type max_size() const noexcept;

Returns:: N.

Element Access

constexpr reference operator[]( std::size_t i );
constexpr const_reference operator[]( std::size_t i ) const;

Requires:: i < size().
Returns:: data_[ i ].

constexpr reference front() noexcept;
constexpr const_reference front() const noexcept;

Returns:: data_[ 0 ].

constexpr reference back() noexcept;
constexpr const_reference back() const noexcept;

Returns:: data_[ N-1 ].

Comparisons

template<std::size_t N>
constexpr bool operator==( digest<N> const& a, digest<N> const& b ) noexcept;

Returns:: true when the elements of a.data_ are equal to the corresponding elements of b.data_, false otherwise.

template<std::size_t N>
constexpr bool operator!=( digest<N> const& a, digest<N> const& b ) noexcept;

Returns:: !(a == b).

Formatting

template<std::size_t N>
constexpr char* to_chars( digest<N> const& v, char* first, char* last ) noexcept;

Effects:: Writes the contents of v.data_ as a hexadecimal string to the provided output range [first, last).
Returns:: A pointer one past the end of the generated output, or nullptr if [first, last) is not large enough.

template<std::size_t N, std::size_t M>
constexpr void to_chars( digest<N> const& v, char (&w)[ M ] ) noexcept;

Requires:: M >= N*2 + 1.
Effects:: Writes the contents of v.data_ as a hexadecimal string, then a null terminator, to the provided output buffer w.

template<std::size_t N>
std::ostream& operator<<( std::ostream& os, digest<N> const& v );

Effects:: Writes the contents of v.data_ as a hexadecimal string to os.
Returns:: os.

template<std::size_t N>
std::string to_string( digest<N> const& v );

Returns:: A string containing the contents of v.data_ in hexadecimal format.
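
As an illustration of hexadecimal formatting, each byte becomes two characters, which is why a digest<N> needs N*2 characters plus one for a terminator. The sketch below is a standalone approximation; bytes_to_hex is a hypothetical helper, and the choice of lowercase digits is an assumption of the sketch rather than something stated above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: formats n bytes as 2*n hexadecimal characters.
// Lowercase digits are an assumption of this sketch.
inline std::string bytes_to_hex( unsigned char const* p, std::size_t n )
{
    static char const digits[] = "0123456789abcdef";

    std::string r;
    r.reserve( n * 2 );

    for( std::size_t i = 0; i < n; ++i )
    {
        r += digits[ p[ i ] >> 4 ];   // high nibble first
        r += digits[ p[ i ] & 0x0F ]; // then low nibble
    }

    return r;
}
```

Formatting the three bytes 0x00, 0xAB, 0x7F this way gives a six-character string.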

<boost/hash2/endian.hpp>

namespace boost {
namespace hash2 {

enum class endian;

} // namespace hash2
} // namespace boost

endian

enum class endian
{
    little = /*...*/,
    big = /*...*/,
    native = /*little or big*/
};

The enumeration type endian corresponds to the standard std::endian type from C++20. Its values are little, signifying little-endian byte order, big, signifying big-endian byte order, and native, which equals either little or big depending on whether the current platform is little- or big-endian.

Unlike std::endian, platforms where little equals big, or where native equals neither little nor big, aren’t supported.
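
To make the meaning of native concrete, the byte order of the current platform can be observed at run time by inspecting the first stored byte of a known integer. The sketch below does not show how the library defines endian::native; byte_order and detect_native_order are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the two supported byte orders.
enum class byte_order { little, big };

// Looks at the first byte in memory of a known 32-bit value.
inline byte_order detect_native_order()
{
    std::uint32_t const probe = 0x01020304u;

    unsigned char first = 0;
    std::memcpy( &first, &probe, 1 );

    // Little-endian platforms store the least significant byte (0x04) first.
    return first == 0x04 ? byte_order::little : byte_order::big;
}
```

The sketch only distinguishes the two byte orders that the text above describes as supported.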

<boost/hash2/flavor.hpp>

#include <boost/hash2/endian.hpp>

namespace boost {
namespace hash2 {

struct default_flavor;
struct little_endian_flavor;
struct big_endian_flavor;

} // namespace hash2
} // namespace boost

The header boost/hash2/flavor.hpp contains the predefined flavor types.

(A flavor is passed as the second argument to hash_append in order to influence its behavior.)

Flavor types have two members, a type size_type and a value byte_order of type boost::hash2::endian.

size_type controls how the argument of hash_append_size is treated (it’s converted to size_type before hashing).

byte_order controls the endianness that is used to hash scalar types.

default_flavor

struct default_flavor
{
    using size_type = std::uint64_t;
    static constexpr auto byte_order = endian::native;
};

default_flavor requests native, endian-dependent, hashing of scalar types. This makes the hash values dependent on the endianness of the current platform, but has the potential of being substantially faster if large arrays of scalar types are being passed to hash_append.

There is rarely a need to use default_flavor explicitly, because it’s the default when {} is passed in place of a flavor, as in hash_append( h, {}, v );

little_endian_flavor

struct little_endian_flavor
{
    using size_type = std::uint64_t;
    static constexpr auto byte_order = endian::little;
};

little_endian_flavor requests little-endian hashing of scalar types. This makes the hash values independent of the endianness of the underlying platform. However, if the platform is big-endian, hash_append will be slower because it will need to convert scalar types to little-endian.

big_endian_flavor

struct big_endian_flavor
{
    using size_type = std::uint64_t;
    static constexpr auto byte_order = endian::big;
};

big_endian_flavor requests big-endian hashing of scalar types. This makes the hash values independent of the endianness of the underlying platform. However, if the platform is little-endian, which is very likely, hash_append will be slower because it will need to convert scalar types to big-endian.
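
The difference between the two fixed byte orders can be seen by writing out the bytes that a 32-bit value contributes to the message in each case. The sketch below is independent of the library; little_endian_bytes and big_endian_bytes are hypothetical helpers.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: bytes of v, least significant byte first.
inline std::array<unsigned char, 4> little_endian_bytes( std::uint32_t v )
{
    return {{
        static_cast<unsigned char>( v & 0xFF ),
        static_cast<unsigned char>( ( v >> 8 ) & 0xFF ),
        static_cast<unsigned char>( ( v >> 16 ) & 0xFF ),
        static_cast<unsigned char>( ( v >> 24 ) & 0xFF )
    }};
}

// Hypothetical helper: bytes of v, most significant byte first.
inline std::array<unsigned char, 4> big_endian_bytes( std::uint32_t v )
{
    return {{
        static_cast<unsigned char>( ( v >> 24 ) & 0xFF ),
        static_cast<unsigned char>( ( v >> 16 ) & 0xFF ),
        static_cast<unsigned char>( ( v >> 8 ) & 0xFF ),
        static_cast<unsigned char>( v & 0xFF )
    }};
}
```

Neither function depends on the platform, which is why a fixed byte order gives portable hash values; the cost is the per-value conversion on platforms whose native order differs.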

<boost/hash2/get_integral_result.hpp>

namespace boost {
namespace hash2 {

template<class T, class R> constexpr T get_integral_result( R const& r ) noexcept;

} // namespace hash2
} // namespace boost

get_integral_result

template<class T, class R> constexpr T get_integral_result( R const& r ) noexcept;

Requires:

T must be an integral type that is not bool. R must be a valid hash algorithm result type; that is, it must be an unsigned integer type, or an array-like type with a value_type of unsigned char (std::array<unsigned char, N> or digest<N>) and size of at least 8.

Returns:

A value that is derived from r in a way that is approximately uniformly distributed over the possible values of T. r is assumed to have been produced by a result() invocation of a hash algorithm.

Remarks:

When R is an array-like type, get_integral_result is allowed to assume that r has been produced by a high quality hash algorithm and that therefore its values are uniformly distributed over the entire domain of R.

Example:

template<class T, class Hash> struct my_hash
{
    std::size_t operator()( T const& v ) const noexcept
    {
        Hash hash;
        boost::hash2::hash_append( hash, {}, v );
        return boost::hash2::get_integral_result<std::size_t>( hash.result() );
    }
};

<boost/hash2/is_trivially_equality_comparable.hpp>

namespace boost {
namespace hash2 {

template<class T> struct is_trivially_equality_comparable;

} // namespace hash2
} // namespace boost

is_trivially_equality_comparable

template<class T> struct is_trivially_equality_comparable:
  std::integral_constant< bool,
    std::is_integral<T>::value || std::is_enum<T>::value || std::is_pointer<T>::value >
{
};

template<class T> struct is_trivially_equality_comparable<T const>:
  is_trivially_equality_comparable<T>
{
};

The trait is_trivially_equality_comparable is used by the library to detect types that are trivially equality comparable.

A type is trivially equality comparable if, for two values x and y of that type, x == y is equivalent to std::memcmp( &x, &y, sizeof(x) ) == 0.

That is, for trivially equality comparable types, comparing their values is the same as comparing their storage byte representations.

This allows hash_append to assume that the message identifying an object’s value is the same as the storage bytes of the object.

is_trivially_equality_comparable can be specialized for user-defined types if the default implementation does not give the correct result.

For example, for the following type

struct X
{
    int v;
};

bool operator==( X const& x1, X const& x2 )
{
    return x1.v == x2.v;
}

(under the assumption that it has no padding bytes, that is, sizeof(X) == sizeof(int)) is_trivially_equality_comparable<X>::value will be false by default, but the type meets the requirements for being trivially equality comparable, so a specialization can be added:

template<> struct boost::hash2::is_trivially_equality_comparable<X>: std::true_type {};

or, if you want to be on the safe side,

template<> struct boost::hash2::is_trivially_equality_comparable<X>:
  std::integral_constant<bool, sizeof(X) == sizeof(int)> {};

On the other hand, the following type

enum E: unsigned {};

bool operator==( E e1, E e2 )
{
    return e1 % 256 == e2 % 256;
}

is not trivially equality comparable (because (E)0x204 == (E)0x704, but memcmp will give a nonzero result), but is_trivially_equality_comparable<E>::value will be true by default.

In this (quite rare) case, a specialization can be added to report false:

template<> struct boost::hash2::is_trivially_equality_comparable<E>: std::false_type {};
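
The mismatch for E can be checked directly. The sketch below repeats the type E from the text and adds a hypothetical helper, same_storage, that performs the memcmp comparison from the definition; it does not use the library.

```cpp
#include <cstring>

enum E: unsigned {};

// The user-defined equality from the text: only the low 8 bits matter.
inline bool operator==( E e1, E e2 )
{
    return e1 % 256 == e2 % 256;
}

// Hypothetical helper: compares the storage bytes of two values.
template<class T> bool same_storage( T const& x, T const& y )
{
    return std::memcmp( &x, &y, sizeof( T ) ) == 0;
}
```

For the values 0x204 and 0x704, operator== reports equality while same_storage reports a difference, so hashing the storage bytes would give equal values different messages.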

<boost/hash2/is_endian_independent.hpp>

namespace boost {
namespace hash2 {

template<class T> struct is_endian_independent;

} // namespace hash2
} // namespace boost

is_endian_independent

template<class T> struct is_endian_independent:
    std::integral_constant< bool, sizeof(T) == 1 >
{
};

template<class T> struct is_endian_independent<T const>:
    is_endian_independent<T>
{
};

The trait is_endian_independent is used by the library to detect endian-independent types.

A type is endian-independent if its memory representation is the same on little-endian and big-endian platforms.

This includes all single byte types (with a sizeof of 1) and all types whose constituent members are also endian-independent.

The default implementation of the trait only reports true for single byte types. It can be specialized for endian independent user-defined types.

For example, the following type

struct X
{
    unsigned char a;
    unsigned char b;
    unsigned char c;
};

is endian-independent, and is_endian_independent can be specialized appropriately for it:

template<> struct boost::hash2::is_endian_independent<X>: std::true_type {};

<boost/hash2/is_contiguously_hashable.hpp>

#include <boost/hash2/endian.hpp>
#include <boost/hash2/is_trivially_equality_comparable.hpp>
#include <boost/hash2/is_endian_independent.hpp>

namespace boost {
namespace hash2 {

template<class T, endian E> struct is_contiguously_hashable;

} // namespace hash2
} // namespace boost

is_contiguously_hashable

template<class T, endian E> struct is_contiguously_hashable:
  std::integral_constant<bool,
    is_trivially_equality_comparable<T>::value &&
      (E == endian::native || is_endian_independent<T>::value)>
{
};

template<class T, std::size_t N, endian E> struct is_contiguously_hashable<T[N], E>:
  is_contiguously_hashable<T, E>
{
};

The trait is_contiguously_hashable is used by the library to detect contiguously hashable types.

A type is contiguously hashable under a particular byte order E if the message that hash_append would otherwise produce for a value of the type, element by element, is identical to the value’s underlying storage bytes.

hash_append(hash, flavor, value), when the type of value is contiguously hashable under the byte order requested by flavor (decltype(flavor)::byte_order), issues a single call to hash.update(&value, sizeof(value)) as an optimization.

is_contiguously_hashable is not intended to be specialized for user-defined types. Its implementation relies on is_trivially_equality_comparable and is_endian_independent, and is correct as long as those underlying traits are correct.

<boost/hash2/has_constant_size.hpp>

namespace boost {
namespace hash2 {

template<class T> struct has_constant_size;

} // namespace hash2
} // namespace boost

has_constant_size

template<class T> struct has_constant_size: std::integral_constant<bool, /*see below*/>
{
};

template<class T> struct has_constant_size<T const>: has_constant_size<T>
{
};

The trait has_constant_size is used by the library to detect container and range types that have a constant size. This allows hash_append to not include the size in the message, as it doesn’t contribute to the object state.

A container or range type has constant size if for all values v of that type, v.size() has the same value.

The default implementation reports true for tuple-like types (those for which std::tuple_size is specialized), such as std::array, for boost::array, and for digest.

has_constant_size can be specialized for user-defined container and range types that have constant size.

For example, boost::uuids::uuid has a constant size of 16, so a specialization can be added appropriately:

template<> struct boost::hash2::has_constant_size<boost::uuids::uuid>: std::true_type {};

Hashing C++ Objects

<boost/hash2/hash_append_fwd.hpp>

Synopsis

namespace boost {
namespace hash2 {

template<class Hash, class Flavor, class T>
constexpr void hash_append( Hash& h, Flavor const& f, T const& v );

template<class Hash, class Flavor, class It>
constexpr void hash_append_range( Hash& h, Flavor const& f, It first, It last );

template<class Hash, class Flavor, class T>
constexpr void hash_append_size( Hash& h, Flavor const& f, T const& v );

template<class Hash, class Flavor, class It>
constexpr void hash_append_sized_range( Hash& h, Flavor const& f, It first, It last );

template<class Hash, class Flavor, class It>
constexpr void hash_append_unordered_range( Hash& h, Flavor const& f, It first, It last );

struct hash_append_tag;

} // namespace hash2
} // namespace boost

The header boost/hash2/hash_append_fwd.hpp declares the functions implemented in boost/hash2/hash_append.hpp.

It can be used to implement hash_append support for a user-defined type without physically depending on the implementation of hash_append.

Example

X.hpp

#include <boost/hash2/hash_append_fwd.hpp>
#include <vector>

class X
{
private:

    int a = -1;
    std::vector<int> b{ 1, 2, 3 };

    template<class Hash, class Flavor>
    friend void tag_invoke(
        boost::hash2::hash_append_tag const&, Hash& h, Flavor const& f, X const& v )
    {
        boost::hash2::hash_append( h, f, v.a );
        boost::hash2::hash_append( h, f, v.b );
    }

public:

    X() = default;
};

main.cpp

#include "X.hpp"
#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/md5.hpp>
#include <iostream>

int main()
{
    X x;

    boost::hash2::md5_128 hash;
    boost::hash2::hash_append( hash, {}, x );

    std::cout << "MD5 digest of x: " << hash.result() << std::endl;
}

<boost/hash2/hash_append.hpp>

#include <boost/hash2/flavor.hpp>

namespace boost {
namespace hash2 {

template<class Hash, class Flavor = default_flavor, class T>
constexpr void hash_append( Hash& h, Flavor const& f, T const& v );

template<class Hash, class Flavor = default_flavor, class It>
constexpr void hash_append_range( Hash& h, Flavor const& f, It first, It last );

template<class Hash, class Flavor = default_flavor, class T>
constexpr void hash_append_size( Hash& h, Flavor const& f, T const& v );

template<class Hash, class Flavor = default_flavor, class It>
constexpr void hash_append_sized_range( Hash& h, Flavor const& f, It first, It last );

template<class Hash, class Flavor = default_flavor, class It>
constexpr void hash_append_unordered_range( Hash& h, Flavor const& f, It first, It last );

struct hash_append_tag;

} // namespace hash2
} // namespace boost

hash_append

template<class Hash, class Flavor = default_flavor, class T>
constexpr void hash_append( Hash& h, Flavor const& f, T const& v );

Appends the representation of v to the message stored in h.

Effects:

If is_contiguously_hashable<T, Flavor::byte_order>::value is true, calls h.update(&v, sizeof(v));
If std::is_integral<T>::value is true, obtains a byte representation of v in the byte order requested by Flavor::byte_order, then calls h.update(p, n) where p is the address of that representation and n is sizeof(v);
If std::is_floating_point<T>::value is true, first replaces v with positive zero if it’s negative zero, then calls hash_append(h, f, std::bit_cast<U>(v)), where U is an unsigned integer type with the same size as T;
If std::is_pointer<T>::value is true, calls hash_append(h, f, reinterpret_cast<std::uintptr_t>(v));
If T is std::nullptr_t, calls hash_append(h, f, static_cast<void*>(v));
If T is an array type U[N], calls hash_append_range(h, f, v + 0, v + N);
If a suitable overload of tag_invoke exists for T, calls (unqualified) tag_invoke(hash_append_tag(), h, f, v);
If std::is_enum<T>::value is true, calls hash_append(h, f, w), where w is v converted to the underlying type of T;
If boost::container_hash::is_unordered_range<T>::value is true, calls hash_append_unordered_range(h, f, v.begin(), v.end());
If boost::container_hash::is_contiguous_range<T>::value is true and
- has_constant_size<T>::value is true, calls hash_append_range(h, f, v.data(), v.data() + v.size());
- has_constant_size<T>::value is false, calls hash_append_sized_range(h, f, v.data(), v.data() + v.size());
If boost::container_hash::is_range<T>::value is true and
- has_constant_size<T>::value is true, calls hash_append_range(h, f, v.begin(), v.end());
- has_constant_size<T>::value is false, calls hash_append_sized_range(h, f, v.begin(), v.end());
If boost::container_hash::is_tuple_like<T>::value is true, calls hash_append(h, f, w) for each tuple element w;
If boost::container_hash::is_described_class<T>::value is true, calls hash_append(h, f, b) for each base class subobject b of v, then hash_append(h, f, m) for each member subobject m of v;
Otherwise, the result is a compile-time error.

Remarks:

If the rules above would result in no calls being made (e.g. for a range of constant size zero, or a described struct with no bases and no members), a call to hash_append(h, f, '\x00') is made to satisfy the requirement that hash_append always results in at least one call to Hash::update.
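The integral and floating-point rules in the Effects list can be illustrated without the library. The following is a minimal sketch, not the library's implementation: to_little_endian and normalized_bits are hypothetical helpers, and the sketch assumes a 64-bit IEEE 754 double. It shows a byte representation that does not depend on the host byte order, and the replacement of negative zero by positive zero before the bits are taken.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert( sizeof( double ) == sizeof( std::uint64_t ), "sketch assumes a 64-bit double" );

// Hypothetical helper: the bytes of a 32-bit integer in little-endian order,
// computed with shifts so the result does not depend on the host byte order.
std::array<unsigned char, 4> to_little_endian( std::uint32_t v )
{
    return {{
        static_cast<unsigned char>( v & 0xFF ),
        static_cast<unsigned char>( ( v >> 8 ) & 0xFF ),
        static_cast<unsigned char>( ( v >> 16 ) & 0xFF ),
        static_cast<unsigned char>( ( v >> 24 ) & 0xFF )
    }};
}

// Hypothetical helper: the bit pattern of a double after negative zero
// has been replaced with positive zero.
std::uint64_t normalized_bits( double v )
{
    if( v == 0 ) v = 0.0; // -0.0 compares equal to 0, so both zeros become +0.0

    std::uint64_t u;
    std::memcpy( &u, &v, sizeof( u ) ); // portable alternative to std::bit_cast
    return u;
}

void demo_scalar_rules()
{
    std::array<unsigned char, 4> b = to_little_endian( 0x01020304u );
    assert( b[0] == 0x04 && b[1] == 0x03 && b[2] == 0x02 && b[3] == 0x01 );

    // Values that compare equal produce the same bits, hence the same message
    assert( normalized_bits( -0.0 ) == normalized_bits( +0.0 ) );

    // Other values are left untouched
    assert( normalized_bits( 1.0 ) != normalized_bits( -1.0 ) );
}
```

Without the normalization step, -0.0 and +0.0 would compare equal yet contribute different bytes to the message.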

hash_append_range

template<class Hash, class Flavor = default_flavor, class It>
constexpr void hash_append_range( Hash& h, Flavor const& f, It first, It last );

Appends the representations of the elements of the range [first, last) to the message stored in h.

Requires:

It must be an iterator type. [first, last) must be a valid iterator range.

Effects:

If It is T* and is_contiguously_hashable<T, Flavor::byte_order>::value is true, calls h.update(first, (last - first) * sizeof(T)).
Otherwise, for each element v in the range denoted by [first, last), calls hash_append(h, f, v).

Remarks:

If hash_append_range is called in a constant expression, the contiguously hashable optimization is only applied for unsigned char* and unsigned char const*.
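The two paths described under Effects are meant to produce the same message; the contiguous path only reduces the number of update calls. The sketch below illustrates this with a toy recorder type that stores every byte it receives. The names recorder, append_each and append_bulk are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a hash algorithm: records every byte passed to update()
struct recorder
{
    std::vector<unsigned char> message;
    int calls = 0;

    void update( void const* p, std::size_t n )
    {
        unsigned char const* q = static_cast<unsigned char const*>( p );
        message.insert( message.end(), q, q + n );
        ++calls;
    }
};

// Element-wise path: one update call per element
void append_each( recorder& h, unsigned char const* first, unsigned char const* last )
{
    for( ; first != last; ++first ) h.update( first, 1 );
}

// Contiguous path: a single update call covering the whole range
void append_bulk( recorder& h, unsigned char const* first, unsigned char const* last )
{
    h.update( first, static_cast<std::size_t>( last - first ) * sizeof( unsigned char ) );
}

void demo_range_paths()
{
    unsigned char const data[] = { 1, 2, 3, 4 };

    recorder a, b;
    append_each( a, data, data + 4 );
    append_bulk( b, data, data + 4 );

    assert( a.message == b.message ); // same message either way
    assert( a.calls == 4 );
    assert( b.calls == 1 );
}
```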

hash_append_size

template<class Hash, class Flavor = default_flavor, class T>
constexpr void hash_append_size( Hash& h, Flavor const& f, T const& v );

Appends the representation of v, converted to Flavor::size_type, to the message stored in h.

Requires:

T must be an integral type.

Effects:

Equivalent to hash_append(h, f, static_cast<typename Flavor::size_type>(v));
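Converting to Flavor::size_type means the bytes appended for a size do not depend on the integer type the caller happens to use. The following sketch illustrates the idea with mock types. The names recorder, flavor32, append_integral and append_size are hypothetical, and the little-endian encoding is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a hash algorithm: records every byte passed to update()
struct recorder
{
    std::vector<unsigned char> message;

    void update( void const* p, std::size_t n )
    {
        unsigned char const* q = static_cast<unsigned char const*>( p );
        message.insert( message.end(), q, q + n );
    }
};

// Hypothetical flavor: only the size_type member matters for this sketch
struct flavor32
{
    using size_type = std::uint32_t;
};

// Simplified integral append: always emits little-endian bytes
template<class T> void append_integral( recorder& h, T v )
{
    unsigned char buf[ sizeof( T ) ];

    for( std::size_t i = 0; i < sizeof( T ); ++i )
    {
        buf[ i ] = static_cast<unsigned char>( ( v >> ( 8 * i ) ) & 0xFF );
    }

    h.update( buf, sizeof( T ) );
}

// Mirrors the documented effect: convert to Flavor::size_type, then append
template<class Flavor, class T> void append_size( recorder& h, Flavor const&, T const& v )
{
    append_integral( h, static_cast<typename Flavor::size_type>( v ) );
}

void demo_append_size()
{
    recorder a, b;

    append_size( a, flavor32(), static_cast<unsigned long long>( 3 ) );
    append_size( b, flavor32(), static_cast<unsigned short>( 3 ) );

    assert( a.message == b.message );  // same bytes regardless of the source type
    assert( a.message.size() == 4 );   // sizeof( flavor32::size_type )
}
```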

hash_append_sized_range

template<class Hash, class Flavor = default_flavor, class It>
constexpr void hash_append_sized_range( Hash& h, Flavor const& f, It first, It last );

Appends the representations of the elements of the range [first, last), followed by the size of the range, to the message stored in h.

Requires:

It must be an iterator type. [first, last) must be a valid iterator range.

Effects:

Equivalent to hash_append_range(h, f, first, last); hash_append(h, f, m);, where m is std::distance(first, last).
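Appending the size matters for ranges whose length can vary: without it, adjacent ranges can run together and produce identical messages. The sketch below is an illustration with hypothetical helper names, and it assumes a 4-byte little-endian size. It compares the messages for { {1, 2}, {3} } and { {1}, {2, 3} } with and without the size suffix.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using bytes = std::vector<unsigned char>;

// Appends only the elements
void append_range( bytes& msg, bytes const& r )
{
    msg.insert( msg.end(), r.begin(), r.end() );
}

// Appends the elements, then the element count as 4 little-endian bytes
void append_sized_range( bytes& msg, bytes const& r )
{
    append_range( msg, r );

    std::uint32_t m = static_cast<std::uint32_t>( r.size() );

    for( int i = 0; i < 4; ++i )
    {
        msg.push_back( static_cast<unsigned char>( ( m >> ( 8 * i ) ) & 0xFF ) );
    }
}

// Builds the message for a sequence of inner ranges
bytes message_of( std::vector<bytes> const& v, bool sized )
{
    bytes msg;

    for( bytes const& r: v )
    {
        if( sized ) append_sized_range( msg, r ); else append_range( msg, r );
    }

    return msg;
}

void demo_sized_range()
{
    std::vector<bytes> a = { bytes{ 1, 2 }, bytes{ 3 } };
    std::vector<bytes> b = { bytes{ 1 }, bytes{ 2, 3 } };

    // Without sizes, both flatten to the bytes 1 2 3
    assert( message_of( a, false ) == message_of( b, false ) );

    // With sizes, the boundaries between inner ranges become part of the message
    assert( message_of( a, true ) != message_of( b, true ) );
}
```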

hash_append_unordered_range

template<class Hash, class Flavor = default_flavor, class It>
constexpr void hash_append_unordered_range( Hash& h, Flavor const& f, It first, It last );

Constructs a value from the representations of the elements of the range [first, last), in such a way that their order doesn’t affect the result, then appends that value, followed by the size of the range, to the message stored in h.

Requires:

It must be an iterator type. [first, last) must be a valid iterator range.

Effects:

For each element v in the range denoted by [first, last), obtains a hash value r by doing

Hash h2(h);
hash_append(h2, f, v);
auto r = h2.result();

and then combines the resulting r values in a way that is not sensitive to their order, producing a combined value q. Calls hash_append(h, f, q), followed by hash_append(h, f, m), where m is std::distance(first, last).
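The procedure above can be sketched with a toy hash algorithm. In the example below, fnv1a_64 is a simplified 64-bit FNV-1a written only for illustration. Summation modulo 2^64 is an assumed choice of order-insensitive combination, since this section does not specify the one the library uses. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy hash algorithm (64-bit FNV-1a), used only for illustration
struct fnv1a_64
{
    using result_type = std::uint64_t;

    std::uint64_t state = 0xcbf29ce484222325ull; // FNV offset basis

    void update( void const* p, std::size_t n )
    {
        unsigned char const* q = static_cast<unsigned char const*>( p );

        for( std::size_t i = 0; i < n; ++i )
        {
            state ^= q[ i ];
            state *= 0x100000001b3ull; // FNV prime
        }
    }

    result_type result() const { return state; }
};

// Simplified stand-in for hash_append on a 64-bit integer (little-endian bytes)
void append_u64( fnv1a_64& h, std::uint64_t v )
{
    unsigned char buf[ 8 ];
    for( int i = 0; i < 8; ++i ) buf[ i ] = static_cast<unsigned char>( ( v >> ( 8 * i ) ) & 0xFF );
    h.update( buf, 8 );
}

// Hashes each element into a copy of h and adds up the results
std::uint64_t combine_unordered( fnv1a_64 const& h, std::vector<unsigned char> const& v )
{
    std::uint64_t q = 0;

    for( unsigned char e: v )
    {
        fnv1a_64 h2( h );    // copy of the current state
        h2.update( &e, 1 );  // append one element to the copy
        q += h2.result();    // addition is commutative, so order is irrelevant
    }

    return q;
}

// Appends the combined value, then the size, and returns the final digest
std::uint64_t hash_unordered( std::vector<unsigned char> const& v )
{
    fnv1a_64 h;
    append_u64( h, combine_unordered( h, v ) );
    append_u64( h, v.size() );
    return h.result();
}

void demo_unordered_range()
{
    std::vector<unsigned char> a = { 1, 2, 3 };
    std::vector<unsigned char> b = { 3, 1, 2 };

    assert( hash_unordered( a ) == hash_unordered( b ) ); // same elements, different order
}
```

Copying h rather than starting from a default-constructed algorithm means that everything already appended to h influences each per-element result.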

hash_append_tag

struct hash_append_tag
{
};

hash_append_tag is a tag type used as the first argument of a tag_invoke overload to identify the hash_append operation.
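As an illustration of the dispatch, the sketch below uses a mock namespace in place of the library, so that it is self-contained. It follows the call form given in the hash_append Effects, tag_invoke(hash_append_tag(), h, f, v). Everything in namespace mock, as well as the user::rgb type, is hypothetical. The real library headers and overload set differ, so treat this as a picture of the mechanism rather than the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace mock // simplified stand-in for the library, for illustration only
{

struct hash_append_tag {};
struct default_flavor {};

// Toy hash algorithm: records every byte passed to update()
struct recorder
{
    std::vector<unsigned char> message;

    void update( void const* p, std::size_t n )
    {
        unsigned char const* q = static_cast<unsigned char const*>( p );
        message.insert( message.end(), q, q + n );
    }
};

namespace detail
{

// unsigned char stands in for the "contiguously hashable" case
template<class Hash, class Flavor, class T>
void append( Hash& h, Flavor const&, T const& v, std::true_type )
{
    h.update( &v, sizeof( v ) );
}

// Everything else is forwarded to a user-supplied tag_invoke, found by
// argument-dependent lookup because the call is unqualified
template<class Hash, class Flavor, class T>
void append( Hash& h, Flavor const& f, T const& v, std::false_type )
{
    tag_invoke( hash_append_tag(), h, f, v );
}

} // namespace detail

template<class Hash, class Flavor, class T>
void hash_append( Hash& h, Flavor const& f, T const& v )
{
    detail::append( h, f, v, std::is_same<T, unsigned char>() );
}

} // namespace mock

namespace user
{

struct rgb
{
    unsigned char r, g, b;

    // The tag type as first parameter identifies the operation being customized
    template<class Hash, class Flavor>
    friend void tag_invoke( mock::hash_append_tag const&, Hash& h, Flavor const& f, rgb const& v )
    {
        mock::hash_append( h, f, v.r );
        mock::hash_append( h, f, v.g );
        mock::hash_append( h, f, v.b );
    }
};

} // namespace user

void demo_tag_invoke()
{
    mock::recorder h;
    user::rgb c = { 10, 20, 30 };

    mock::hash_append( h, mock::default_flavor(), c );

    assert( h.message.size() == 3 );
    assert( h.message[0] == 10 && h.message[1] == 20 && h.message[2] == 30 );
}
```

Because the overload is a template on Hash and Flavor, the same definition works with any hash algorithm and flavor.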