FNV-1a Hash
FNV-1a (Fowler / Noll / Vo, "1a" variant) is a non-cryptographic multiplicative hash that processes one input byte at a time. It is small enough to memorise, has universally agreed constants across nearly every programming language and standard library, and is appropriate for hash tables and fingerprints on inputs that are not adversarially controlled. It is not a strong hash, and it should not be used for anything where security or robustness against crafted input matters. Its const fn nature makes it convenient for compile-time data lookup tables in embedded firmware, command dispatch in serial protocol parsers, and keyword tables in Forth-like interpreters where the hash result is fixed at build time and the runtime cost is zero.
Algorithm
In pseudocode:
hash = offset_basis
for each byte b in input:
hash = hash XOR b
hash = hash * fnv_prime (mod 2^n)
return hashThere is no setup phase and no finalisation phase. The first byte mixes into the hash on iteration 1. The function returns whatever the last iteration produced.
Constants
| width | offset basis | FNV prime |
|---|---|---|
| 32 | 0x811c9dc5 |
0x01000193 (224 + 28 + 0x93) |
| 64 | 0xcbf29ce484222325 |
0x100000001b3 (240 + 28 + 0xB3) |
| 128 | 0x6c62272e07bb014262b821756295c58d |
288 + 28 + 0x3b |
For widths above 128, see RFC 9923. For widths that are not a power of two, compute the next-larger FNV and XOR-fold. A 24-bit hash is computed as (h >> 24) ^ (h & 0xffffff) of a 32-bit FNV-1a result.
When FNV-1a is acceptable
FNV-1a is acceptable when several of the following hold. Inputs are short, roughly eight bytes or fewer on a modern x86-64 core, longer on platforms with weaker SIMD. Inputs are not under adversarial control. Cross-language interoperability matters, since the same input must produce the same hash from Rust, Go, Python, C, and so on. The implementation must be tiny, fitting in a const fn with no dependencies. A predictable inner loop is preferable to higher raw throughput. The result is consumed at compile time for static dispatch or table generation.
When to use something else
- Untrusted input in a hash table. Use SipHash-1-3 or SipHash-2-4 with a per-process random key. Hash flooding against FNV-1a is trivial, and seeding the offset basis with random data does not produce a keyed hash; it just mitigates the most naive attacks.
- Cryptographic preimage or collision resistance. Use BLAKE3 for general fingerprinting where speed matters, SHA-256 where ecosystem compatibility matters. FNV-1a has no resistance to either preimage or collision attacks.
- High throughput on long inputs. Use xxh3 or xxHash64 for general use, FarmHash or t1ha if the platform suits them. FNV-1a is roughly an order of magnitude slower on inputs over a few hundred bytes.
- Key derivation, password hashing, or anything in a security envelope. Use Argon2id, scrypt, or a proper KDF. A hash function on its own is the wrong primitive regardless of which hash you pick.
- Single-hash parallelism on a GPU. Use BLAKE3 or another tree-structured hash. FNV-1a's serial recurrence cannot be parallelised across input chunks.
- Very large hash tables where collision probability matters. Use xxh3 or another well-mixed hash. FNV-1a's structural biases become visible at table sizes where the birthday-bound collision probability is non-trivial.
Performance
The core loop is a single XOR followed by a single multiplication. The crossover point compared to something like FarmHash or xxh3 is roughly 8 bytes of hashed content natively, and roughly 20 bytes on JavaScript engines. Below that threshold, FNV-1a's lack of setup and finalisation overhead tends to beat the higher steady-state throughput of modern hashes.
When the metric is end-to-end hash-table performance rather than raw hashing throughput, simple multiplicative hashes (FNV, DJB2, x17) frequently outperform "better" hashes because the smaller instruction-cache footprint matters more than the cycles saved per hash.
Embedded considerations
FNV-1a uses no lookup tables, so it adds tens of bytes of code rather than the kilobyte-class footprint of table-based CRC32. The state is one machine word. There is no allocation, no input-dependent branching, and the running time is linear in input length with no variance, which makes it easy to fit into a real-time control loop without disturbing worst case execution analysis.
The width choice tracks the multiplier. The 32-bit FNV prime fits a single-cycle UMULL on Cortex-M3 and M4. On Cortex-M0 there is no hardware multiplier, so even 32-bit FNV-1a involves software emulation, but the result remains small and predictable. The 64-bit variant is a poor choice below a 32-bit-with-MUL core, because the 64x64 multiplication is several hundred cycles in software.
The compile-time evaluation pays off. A const fn implementation produces ROM-resident hash values at build time, enabling perfect or near-perfect hash tables for keyword dispatch in serial protocol parsers, command lookup in embedded shells, and identifier interning in Forth-like interpreters with zero runtime hashing cost. The hash exists only to select the dispatch entry; comparison against the actual key string remains the verification step.
GPU considerations
The serial recurrence prevents parallelising a single hash across input chunks. This is the fundamental limitation versus tree-structured hashes like BLAKE3 or block-parallel hashes like xxh3 with its four parallel lanes. For hashing a single large input fast on a GPU, FNV-1a is the wrong tool.
The pattern that does work is batch-parallel hashing with one independent input per thread. Spatial hash grids in compute shaders, vertex deduplication during mesh processing, particle bin assignment, and similar pipelines all fit. The inner loop has no branches, no thread divergence, and predictable sequential reads from the input buffer.
The 64-bit variant is awkward in shader languages. WGSL has no native u64 type, so a 64x64 multiply requires manual decomposition into 32-bit halves. HLSL and GLSL have uint64_t extensions, but the multiply is emulated on most older GPU generations and slower than the 32-bit equivalent on the rest. The 32-bit FNV-1a is the natural choice for GPU batch hashing, accepting that its structural weaknesses are more visible at 32 bits than at 64.
Implementation gotchas
The recurrence runs modulo 2^n. In languages with checked arithmetic by default (Rust technically only in debug mode, Swift, and others) the multiplication must use wrapping semantics.
When using the result to index a hash table, prefer prime or odd bucket counts, or index from the high bits via (hash * n) >> k (Lemire's fast range reduction). Indexing the low bits of an FNV-1a output via hash & mask on a power-of-two table can collide pathologically on structured input. The low bits of an FNV-1a output are not actually well-mixed. The least significant bit of the output is the XOR of the offset basis LSB with the LSBs of all input bytes, provably and exactly:
This function returns the same value as (fnv1a_64(data) & 1) as u8 for every input. Multiplication by an odd prime preserves the LSB, so the multiply step is a no-op at bit position 0, and only the XOR step contributes. The result is that a 2-bucket table indexed by hash & 1 carries exactly one bit of information per input byte (the LSB), and any pair of inputs whose byte-LSBs have the same XOR parity collides regardless of how different they look. Wider masks generalise the problem, with progressively more complex but still low-entropy finite-state-machine relationships between input bytes and low output bits. Structured input (sequential identifiers, prefixed keys, regular field formats) produces structured collisions in the bucket distribution.
XOR-folding to widths smaller than the native FNV variant is the canonical way to produce non-power-of-two-width outputs. It does not eliminate the linear XOR-identity weakness, but it does spread the bias.
If the hash table is exposed to untrusted input, switch to a keyed hash such as SipHash rather than seeding FNV-1a with a per-process random offset basis. The latter mitigates the most naive flooding attacks but does not provide keyed security.
Reference Rust implementation
The 32-bit variant differs only in the constants and the integer type:
Both functions are const-compatible if the loop is rewritten as a recursive or index-based form, which makes them suitable for compile-time string hashing and ROM-resident lookup table generation:
The only structural change versus the runtime version is replacing for &byte in data with an indexed while loop. This is necessary because for loops in Rust desugar to IntoIterator::into_iter().next(), and trait method calls in const context are still gated behind feature flags. while, indexing, and arithmetic on integer types have been stable in const since Rust 1.46 (I think this is the right version). Other languages may be able to get away with their form of iterators.
Alternatives
| use case | recommended hash |
|---|---|
| Trusted short keys, cross-language | FNV-1a |
| Compile-time string hashing | FNV-1a |
| Embedded dispatch tables | FNV-1a (32-bit) |
| Trusted long inputs, x86-64 | xxh3, xxHash64, FarmHash |
| Untrusted input in hash tables | SipHash-1-3 or SipHash-2-4 |
| GPU batch hashing | FNV-1a (32-bit) |
| GPU single-hash throughput | BLAKE3 |
| Cryptographic fingerprinting | BLAKE3, SHA-256 |
| Bloom filters, sketches | xxh3, MurmurHash3 |
| Distributed shard keys | xxh3, FNV-1a if cross-language |
References
Gut Check
Creating Crypt Style SHA512 Passwords With Ruby
I needed to generate crypt-style SHA512 passwords in ruby for an /etc/shadow file. After a bunch of Googling and messing around with the OpenSSL library I …
Security Through Obesity
Jeremy Spilman recently proposed changes to how user's hashes are stored in website's and companies databases. This post was originally going to look at some of …
Quick Thoughts on Hash Cash
This is an interesting proof of work concept. The first example I have found of this in the wild is to prevent abuse for anonymous account registration on an …