1. The attack in 30 seconds
A length-extension attack lets an attacker who knows
tag = SHA-256(secret || message) and the byte-length of
secret compute SHA-256(secret || message || pad ||
extra) for any extra they choose — without
ever learning secret.
This works because Merkle-Damgård hashes such as SHA-1, SHA-256, and SHA-512
output their entire final internal state as the digest. The attacker treats the
public tag as the chaining value at position len(secret) +
len(message) + len(pad), then continues hashing from there.
2. Why this is NOT a Bitcoin vulnerability
Bitcoin uses SHA-256d — H(H(x)) —
not the H(secret || x) construction. The outer hash
absorbs exactly 32 bytes of input (the inner output), and an attacker
cannot extend the outer chaining value without first inverting the
inner SHA-256, which is itself a preimage attack. So Bitcoin's
block-id and txid hashing is safe.
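A minimal sketch of the double-hash construction, using only Python's standard hashlib. The helper name sha256d is chosen here for illustration and is not taken from any Bitcoin codebase.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, H(H(x)): hash the input, then hash the 32-byte digest."""
    inner = hashlib.sha256(data).digest()   # 32 bytes, never published on its own
    return hashlib.sha256(inner).digest()   # outer hash absorbs only those 32 bytes


digest = sha256d(b"example payload")
print(digest.hex())
```

The published value is the outer digest. Resuming from it would extend the outer hash's 32-byte input, not the original data, which is why the attack from section 1 does not carry over.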
The point of this demo is the underlying construction: a naive user of SHA-256 (e.g. someone implementing a custom MAC for a new protocol) can fall into the length-extension (LE) trap. A naive user of BLAKE3 cannot. Bitcoin Core has been bitten by this thinking error twice in early proposals (BIP143 sighash design, original Lightning HTLC), both caught in review before activation.
3. How to run
cd b3chain
pip3 install blake3
python3 contrib/testing/compare/compare-length-extension.py
4. Sample output
Length-extension demo: SHA-256 (vulnerable) vs BLAKE3 (immune)
[1] SHA-256 H(secret || msg) MAC
secret length: 16 bytes (known to attacker, value secret)
message: b'amount=100&to=alice'
appended: b'&to=mallory&override=true'
attack time: 97.5 us
forged tag: 82f688f4...8ae17ed5
verifier says: ACCEPTED (forgery succeeded)
[2] BLAKE3 H(secret || msg) MAC
attack time: 1.0 us
verifier says: rejected (expected)
| algorithm | attack succeeded? | time |
|-----------|-------------------|---------:|
| sha256 | YES (forgery) | 97.5 us |
| blake3 | no | 1.0 us |
5. The trick, step by step
- The verifier computes tag = SHA-256(secret || msg) with secret = ab × 16 and msg = "amount=100&to=alice".
- The attacker does not know secret but knows its length. They split the public tag back into the eight 32-bit SHA-256 internal state words.
- They compute pad, the SHA-256 padding that the verifier appended to secret || msg to round it up to a multiple of 64 bytes.
- They feed append through SHA-256's compression function, but starting from the recovered state and pretending the previously absorbed length is len(secret) + len(msg) + len(pad).
- They send the verifier (msg || pad || append) with the new tag. The verifier independently computes SHA-256(secret || msg || pad || append) and gets the same tag, because that is exactly what the attacker computed.
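The steps above can be sketched in pure Python. This is an illustrative reimplementation, not the code in compare-length-extension.py: the function names (compress, md_pad, extend) are made up for this example, and the secret is assumed to be the byte 0xab repeated 16 times. The standard hashlib cannot resume from a chosen state, so the sketch carries its own SHA-256 compression function and uses hashlib only to play the verifier.

```python
import hashlib
import struct

# SHA-256 round constants (FIPS 180-4, section 4.2.2).
K = [
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
]
MASK = 0xFFFFFFFF


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """Fold one 64-byte block into an 8-word SHA-256 chaining value."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def md_pad(total_len):
    """SHA-256 padding for total_len bytes: 0x80, zeros, 64-bit big-endian bit length."""
    return b"\x80" + b"\x00" * ((55 - total_len) % 64) + struct.pack(">Q", total_len * 8)


def extend(tag, secret_len, msg, extra):
    """Forge (message, tag) for secret || msg || pad || extra without knowing secret."""
    glue = md_pad(secret_len + len(msg))            # padding the verifier added
    state = list(struct.unpack(">8I", tag))         # tag split into eight state words
    absorbed = secret_len + len(msg) + len(glue)    # bytes already hashed (multiple of 64)
    tail = extra + md_pad(absorbed + len(extra))    # new data plus its own final padding
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return msg + glue + extra, struct.pack(">8I", *state)


secret = b"\xab" * 16                               # assumed value; the attacker only uses its length
msg = b"amount=100&to=alice"
extra = b"&to=mallory&override=true"
tag = hashlib.sha256(secret + msg).digest()         # the honest tag the attacker observes

forged_msg, forged_tag = extend(tag, len(secret), msg, extra)
accepted = hashlib.sha256(secret + forged_msg).digest() == forged_tag
print("verifier accepts forgery:", accepted)
```

Note that extend never touches secret, only its length. The verifier's own hashlib computation agrees with the forged tag because both sides process exactly the same sequence of 64-byte blocks from the same chaining value.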
6. Why BLAKE3 is immune
- BLAKE3 is a tree hash. The output is the root of a Merkle tree over 1 KiB chunks, not the last value of a single chain of chaining values. The root compression also carries its own ROOT flag, so the published digest cannot be reused as an intermediate state. There is no "internal state at position N" that the attacker can resume from.
- BLAKE3 has a built-in keyed mode (blake3.keyed_hash(key, msg); the Python package exposes it as blake3(msg, key=key)) that uses the key as an IV rather than a prefix. This is the construction you should use for a MAC.
- BLAKE3 has a built-in derive_key mode for KDFs, avoiding the need for HKDF. Each mode has its own domain-separation flag bits in the compression function.
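A short sketch of the keyed and derive_key modes, assuming the third-party blake3 package from PyPI (the one installed in section 3) and its blake3(data, key=...) and blake3(data, derive_key_context=...) constructor arguments. The wrapper names below are hypothetical, and the import sits inside each function only so the sketch can be loaded on a machine without the package.

```python
import hmac


def blake3_mac(key: bytes, msg: bytes) -> bytes:
    """MAC a message with BLAKE3 keyed mode (key replaces the IV, not a prefix)."""
    from blake3 import blake3  # third-party package: pip3 install blake3
    if len(key) != 32:
        raise ValueError("BLAKE3 keyed mode requires a 32-byte key")
    return blake3(msg, key=key).digest()


def blake3_verify(key: bytes, msg: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(blake3_mac(key, msg), tag)


def blake3_subkey(context: str, key_material: bytes) -> bytes:
    """Derive a 32-byte subkey with derive_key mode; context should be a fixed, unique string."""
    from blake3 import blake3
    return blake3(key_material, derive_key_context=context).digest()
```

A caller would compute tag = blake3_mac(key, msg) on the sending side and check blake3_verify(key, msg, tag) on the receiving side. Because the key enters as the IV and the mode sets its own flag bits, the tag differs from a plain BLAKE3 hash of key || msg.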
7. The right way to MAC with SHA-256
Use HMAC:
import hmac, hashlib
tag = hmac.new(secret, msg, hashlib.sha256).digest()
HMAC explicitly defends against length extension by hashing twice, with the
key XORed into two distinct pads (ipad for the inner hash, opad for the
outer one). Don't roll your own; use the library.
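For completeness, a minimal sketch of both sides of an HMAC-SHA-256 check using only the standard library. The function names make_tag and verify_tag are illustrative, and the secret reuses the assumed demo value from section 5.

```python
import hashlib
import hmac


def make_tag(secret: bytes, msg: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over msg."""
    return hmac.new(secret, msg, hashlib.sha256).digest()


def verify_tag(secret: bytes, msg: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    return hmac.compare_digest(make_tag(secret, msg), tag)


secret = b"\xab" * 16                      # assumed demo secret
msg = b"amount=100&to=alice"
tag = make_tag(secret, msg)

print(verify_tag(secret, msg, tag))                                   # genuine message
print(verify_tag(secret, msg + b"&to=mallory&override=true", tag))    # tampered message
```

The comparison goes through hmac.compare_digest rather than ==, since a plain equality check can leak how many leading bytes of a guessed tag were correct.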
8. Source files
- contrib/testing/compare/compare-length-extension.py
- BLAKE3 specification (paper) — section on tree mode and domain separation