Internet-Draft: draft-vso-cpp-core-00 (CPP Core)
Workgroup: Independent Submission
Published: January 2026
Intended Status: Experimental
Expires: 2 August 2026
Author: T. Kamimura (VeritasChain Co., Ltd.)

Content Provenance Profile (CPP) Core

Abstract

The Content Provenance Profile (CPP) is an open specification for cryptographically verifiable media capture provenance. This document defines the core data model, hashing conventions, Merkle tree construction rules, RFC 3161 Time-Stamp Authority (TSA) anchoring protocol, and offline verification procedures for CPP.

CPP enables capture devices to produce tamper-evident provenance records that bind media content to external timestamps via trusted third parties. Unlike self-attestation models, CPP requires independent timestamp verification through RFC 3161 TSA services, providing externally verifiable proof of when media was captured.

This specification focuses on the interoperable core of CPP: the data structures, cryptographic operations, and verification algorithms necessary for independent third-party verification. Application-specific features such as depth analysis are defined as optional extensions.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 5 July 2026.

1. Introduction

1.1. Problem Statement

Digital media authenticity faces several fundamental challenges:

  • Self-Attestation Weakness: Systems where creators sign their own claims provide no independent verification. A verifier must trust that the creator's claimed timestamp is accurate.
  • Metadata Stripping: Social media platforms and messaging applications routinely strip embedded metadata, breaking provenance chains that depend on file-level embedding.
  • Omission Attacks: Systems that prove individual media authenticity cannot detect when unfavorable evidence has been selectively deleted from a collection.
  • Terminology Confusion: Terms like "verified" mislead users into believing content truthfulness has been established, when only provenance has been recorded.

1.2. Design Goals

CPP addresses these challenges through the following design principles:

  1. External Timestamp Verification: Timestamps MUST be anchored to independent RFC 3161 Time-Stamp Authorities, enabling third-party verification without trusting the capture device.
  2. Omission Detection: The Completeness Invariant mechanism enables detection of deleted events within a collection.
  3. Offline Verification: All data necessary for verification is included in the Evidence Pack, enabling verification without network access.
  4. Provenance ≠ Truth: CPP proves when and by what device media was captured. It does NOT prove content truthfulness or scene authenticity.

1.3. Scope

This document specifies:

  • The CPP event data model
  • Hash computation and canonicalization rules
  • Merkle tree construction and proof verification
  • RFC 3161 TSA anchoring requirements
  • Verification procedures for implementers

This document does NOT specify:

  • Application user interface requirements
  • Network protocols for proof distribution
  • Key management or certificate policies
  • Content authenticity claims

1.4. Relationship to Other Specifications

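CPP defines its own Merkle tree construction that is NOT compatible with Certificate Transparency [RFC6962]. CPP uses the same single-byte domain separation prefixes as RFC 6962 (0x00 for leaves, 0x01 for internal nodes), but it pads the leaf array to a power of 2 by duplicating the last leaf, whereas RFC 6962 builds its tree without padding. Roots and proofs therefore generally differ whenever TreeSize is not a power of 2. Implementations MUST NOT assume RFC 6962 compatibility.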

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Additionally, this document uses the following terms:

Event
A discrete, signed record representing a provenance action (capture, export, deletion).
EventHash
A SHA-256 hash of the canonicalized event data, formatted as sha256:<64_hex_chars>.
LeafHash
SHA-256(0x00 || EventHash_bytes), used as input to the Merkle tree. The 0x00 prefix provides domain separation from internal nodes.
MerkleRoot
The root hash of a binary Merkle tree containing one or more LeafHashes.
AnchorDigest
The 32-byte value submitted to the TSA. Represented as a 64-character lowercase hexadecimal string without prefix. MUST equal the MerkleRoot value (after stripping the sha256: prefix).
TreeSize
The count of original leaves in the Merkle tree before any padding. An unsigned integer. MUST be >= 1.
PaddedSize
The count of leaves after padding to a power of 2. Computed as the smallest power of 2 greater than or equal to TreeSize.
Evidence Pack
A self-contained data structure containing all information necessary for offline verification.
Tombstone
An event that records the legitimate deletion of a previous event.
Provenance Available
The recommended terminology for UI display, indicating capture provenance is recorded without implying content truth verification.
Genesis PrevHash
The constant value used as PrevHash for the first event in a chain: sha256:0000000000000000000000000000000000000000000000000000000000000000 (64 zeros).

3. Threat Model

3.1. Addressed Threats

Table 1
Threat              Mitigation
------------------  -----------------------------------------------------
Timestamp forgery   RFC 3161 TSA provides independent timestamp
Evidence tampering  EventHash binds content; Merkle proof binds to anchor
Selective deletion  Completeness Invariant detects missing events
TSA token swapping  messageImprint must match AnchorDigest

3.2. Explicitly Not Addressed

  • Content truthfulness (scene staging, deepfakes)
  • Device compromise before capture
  • Key extraction from secure hardware
  • TSA collusion with adversary

4. Data Model

4.1. Events

An Event is the fundamental unit of provenance in CPP. Events are signed records that capture discrete provenance actions.

4.1.1. Event Types

Table 2
Type       Description
---------  ---------------------------------------------
INGEST     Media captured from device sensor
SEAL       Collection sealed with Completeness Invariant
EXPORT     Proof shared externally
TOMBSTONE  Legitimate deletion record

4.1.2. Event Structure

The following fields are REQUIRED for all events:

Table 3
Field      Type    Required  Description
---------  ------  --------  ---------------------------------------------
EventID    string  REQUIRED  Unique identifier (UUID)
ChainID    string  REQUIRED  Identifier linking events in a sequence
PrevHash   string  REQUIRED  Hash of previous event in chain
Timestamp  string  REQUIRED  ISO 8601 timestamp with millisecond precision
EventType  string  REQUIRED  One of: INGEST, SEAL, EXPORT, TOMBSTONE
HashAlgo   string  REQUIRED  Always "SHA256"
SignAlgo   string  REQUIRED  "ES256" or "Ed25519"
EventHash  string  REQUIRED  SHA-256 hash of canonicalized event
Signature  string  REQUIRED  Raw Base64-encoded signature (no prefix)

4.1.3. Encoding Requirements

All Base64-encoded fields (Signature, TSA.Token, public_key) MUST conform to:

  • [RFC4648] Section 4 (standard base64 alphabet, NOT base64url)
  • No whitespace characters (no line breaks, spaces, or tabs)
  • Padding characters (=) MUST be included when required

PROHIBITED:

  • base64:MEUCIQDx... - Prefixes not allowed
  • data:application/octet-stream;base64,... - Data URIs not allowed
  • Base64url alphabet (- and _ instead of + and /)
  • Line wrapping or embedded whitespace
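
The encoding rules above can be checked mechanically. The following is a minimal Python sketch, not part of this specification; the function name is_strict_base64 is hypothetical. It relies on the standard library's base64.b64decode with validate=True. As an extra choice of this sketch, it also rejects non-canonical encodings through a round-trip comparison, which is stricter than the requirements listed above.

```python
import base64
import binascii


def is_strict_base64(value: str) -> bool:
    """Return True for unwrapped, padded, standard-alphabet Base64 text."""
    try:
        # validate=True rejects characters outside the RFC 4648 Section 4
        # alphabet, which covers whitespace, ':' prefixes, and the
        # base64url characters '-' and '_'.
        decoded = base64.b64decode(value.encode("ascii"), validate=True)
    except (binascii.Error, UnicodeEncodeError):
        # Raised for bad characters, missing padding, or non-ASCII input.
        return False
    # Assumption of this sketch: only the canonical encoding is accepted,
    # so re-encoding the decoded bytes must reproduce the input exactly.
    return base64.b64encode(decoded).decode("ascii") == value
```

A producer could run such a check on Signature, TSA.Token, and public_key before emitting an Evidence Pack, and a verifier could run it before decoding.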

4.1.4. INGEST Event

INGEST events MUST include:

Table 4
Field            Type     Required  Description
---------------  -------  --------  ---------------------------
Asset.AssetHash  string   REQUIRED  SHA-256 hash of media bytes
Asset.AssetType  string   REQUIRED  IMAGE or VIDEO
Asset.MimeType   string   REQUIRED  MIME type of asset
Asset.AssetID    string   OPTIONAL  Unique asset identifier
Asset.AssetName  string   OPTIONAL  Original filename
Asset.AssetSize  integer  OPTIONAL  File size in bytes

4.1.5. SEAL Event

SEAL events finalize a collection and commit the Completeness Invariant. A SEAL event MUST include:

Table 5
Field                  Type     Required  Description
---------------------  -------  --------  -----------------------------------------------
CollectionID           string   REQUIRED  Identifier for the sealed collection
EventCount             integer  REQUIRED  Number of events in collection (excluding SEAL)
CompletenessInvariant  object   REQUIRED  Completeness verification data
MerkleRoot             string   REQUIRED  Root hash of events in collection

4.1.5.1. CompletenessInvariant Object

Table 6
Field           Type     Required  Description
--------------  -------  --------  --------------------------------------------
ExpectedCount   integer  REQUIRED  Number of events that MUST be present
HashSum         string   REQUIRED  XOR of all EventHash values (sha256: format)
FirstTimestamp  string   REQUIRED  ISO 8601 timestamp of first event
LastTimestamp   string   REQUIRED  ISO 8601 timestamp of last event

The HashSum is computed as:

HashSum = EventHash[0] XOR EventHash[1] XOR ... XOR EventHash[n-1]

Where XOR operates on the 32-byte binary values of each EventHash.
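
As an illustration of the formula above, the following Python sketch computes a HashSum from a list of EventHash strings. The function name hash_sum is hypothetical, and returning the all-zero value for an empty list is an assumption of this sketch rather than a rule stated here.

```python
def hash_sum(event_hashes):
    """XOR the 32-byte values of 'sha256:<hex>' strings into a HashSum."""
    acc = bytes(32)  # XOR identity: 32 zero bytes
    for event_hash in event_hashes:
        if not event_hash.startswith("sha256:"):
            raise ValueError("EventHash must carry the sha256: prefix")
        raw = bytes.fromhex(event_hash[7:])
        if len(raw) != 32:
            raise ValueError("EventHash must decode to 32 bytes")
        # Byte-wise XOR of the running value with this event's hash.
        acc = bytes(a ^ b for a, b in zip(acc, raw))
    return "sha256:" + acc.hex()
```

Because XOR is commutative, the result does not depend on the order in which the events are supplied.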

4.1.5.2. SEAL Anchoring Pattern

The recommended anchoring pattern for SEAL events:

  1. Compute the Merkle tree over all INGEST events in the collection
  2. Create the SEAL event with MerkleRoot referencing this tree
  3. Include the SEAL event's EventHash as an additional leaf in a NEW Merkle tree
  4. Anchor this new tree (containing the SEAL event) to a TSA

This pattern ensures the Completeness Invariant itself is bound to an external timestamp. The SEAL event's EventHash covers all CI fields, so the TSA anchor proves the CI existed at GenTime.

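Prohibited Pattern: A SEAL event MUST NOT be included in the same Merkle tree that its MerkleRoot field references. Doing so creates a circular dependency, because the SEAL event's EventHash covers the MerkleRoot field, and the root would in turn depend on that EventHash.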

4.1.6. TOMBSTONE Event

TOMBSTONE events MUST additionally include:

Table 7
Field           Type    Required  Description
--------------  ------  --------  ---------------------------
DeletedEventId  string  REQUIRED  EventID being invalidated
Reason          string  REQUIRED  Deletion reason code
DeletedAt       string  REQUIRED  ISO 8601 deletion timestamp

4.2. Hash Chain

Events form a hash chain through the PrevHash field:

Event 1: PrevHash = sha256:0000...0000 (genesis - 64 zeros)
Event 2: PrevHash = EventHash(Event 1)
Event 3: PrevHash = EventHash(Event 2)

Verification of chain integrity:

  1. For each event after the first, the verifier MUST compute EventHash of the previous event.
  2. The computed hash MUST match the current event's PrevHash field.
  3. If any mismatch is detected, verification MUST fail with status CHAIN_INTEGRITY_VIOLATION.

4.3. Merkle Tree Structure

CPP defines its own binary Merkle tree construction optimized for media provenance. This construction uses domain separation prefixes to prevent attacks where leaf values could be confused with internal node values.

Important: CPP Merkle trees are NOT compatible with RFC 6962 (Certificate Transparency). Implementations MUST use the exact algorithms specified in this section.

4.3.1. Domain Separation

CPP uses single-byte prefixes to separate domains:

Table 8
Domain    Prefix Byte  Description
--------  -----------  ------------------------------------
Leaf      0x00         Applied to EventHash bytes
Internal  0x01         Applied to concatenated child hashes

4.3.2. Leaf Nodes

LeafHash = SHA256(0x00 || EventHash_bytes)

Where:

  • 0x00 is a single byte with value zero
  • EventHash_bytes is the 32-byte binary representation of EventHash (after stripping the sha256: prefix)
  • || denotes byte concatenation

Rationale: The 0x00 prefix ensures leaf hashes cannot collide with internal node hashes, preventing second preimage attacks on the tree structure.

4.3.3. Internal Nodes

InternalHash = SHA256(0x01 || Left_bytes || Right_bytes)

Where:

  • 0x01 is a single byte with value one
  • Left_bytes is the 32-byte hash of the left child
  • Right_bytes is the 32-byte hash of the right child
  • || denotes byte concatenation

4.3.4. Tree Construction

Step 1: Compute Leaf Hashes

For each event, compute LeafHash = SHA256(0x00 || EventHash_bytes).

Step 2: Determine Padding

PaddedSize is the smallest power of 2 >= TreeSize:

function computePaddedSize(treeSize):
    if treeSize == 0:
        return 0  // Invalid - TreeSize MUST be >= 1
    paddedSize = 1
    while paddedSize < treeSize:
        paddedSize = paddedSize * 2
    return paddedSize

Step 3: Pad Leaf Array

If TreeSize < PaddedSize, duplicate the last leaf hash until the array length equals PaddedSize.

Step 4: Build Tree

function buildTree(paddedLeaves):
    levels = [paddedLeaves]
    current = paddedLeaves

    while current.length > 1:
        nextLevel = []
        for i in range(0, current.length, 2):
            left = current[i]
            right = current[i + 1]
            parent = SHA256(0x01 || left || right)
            nextLevel.append(parent)
        levels.append(nextLevel)
        current = nextLevel

    return levels  // levels[0] = leaves, levels[-1] = [root]
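
Steps 1 through 4 can be combined into a short runnable form. The Python sketch below works on raw 32-byte EventHash values; the function names are hypothetical, and raising an exception for TreeSize = 0 (instead of returning 0 as in the pseudocode) is a choice of this sketch.

```python
import hashlib


def leaf_hash(event_hash_bytes):
    # Step 1: leaf domain separation with the 0x00 prefix.
    return hashlib.sha256(b"\x00" + event_hash_bytes).digest()


def padded_size(tree_size):
    # Step 2: smallest power of 2 that is >= tree_size.
    if tree_size < 1:
        raise ValueError("TreeSize MUST be >= 1")
    return 1 << (tree_size - 1).bit_length()


def build_levels(event_hashes):
    size = padded_size(len(event_hashes))
    leaves = [leaf_hash(h) for h in event_hashes]
    # Step 3: duplicate the last leaf hash until the array is full.
    leaves += [leaves[-1]] * (size - len(leaves))
    # Step 4: hash adjacent pairs with the 0x01 prefix up to the root.
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([
            hashlib.sha256(b"\x01" + cur[i] + cur[i + 1]).digest()
            for i in range(0, len(cur), 2)
        ])
    return levels  # levels[0] = padded leaves, levels[-1] = [root]


def merkle_root(event_hashes):
    return "sha256:" + build_levels(event_hashes)[-1][0].hex()
```

For three events, build_levels returns levels of length 4, 2, and 1, and the fourth leaf is a copy of the third.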

4.3.5. Merkle Proof Structure

Table 9
Field           Type     Description
--------------  -------  --------------------------------------------------------------
TreeSize        integer  Original leaf count (before padding), unsigned, MUST be >= 1
LeafHashMethod  string   MUST be exactly SHA256(0x00||EventHash) (23 ASCII characters)
LeafHash        string   Computed LeafHash for this event with sha256: prefix
LeafIndex       integer  0-based position in tree, range [0, TreeSize-1]
Proof           array    Sibling hashes from bottom to top, each with sha256: prefix
Root            string   MerkleRoot with sha256: prefix

TreeSize Constraint: An empty Merkle tree (TreeSize = 0) is not permitted. Verifiers MUST reject proofs where TreeSize < 1.

4.4. Anchor Structure

Table 10
Field                  Type    Description
---------------------  ------  -------------------------------------------------
AnchorID               string  Unique anchor identifier
AnchorType             string  MUST be "RFC3161"
AnchorDigest           string  MerkleRoot without prefix, 64 lowercase hex chars
AnchorDigestAlgorithm  string  MUST be "sha-256"
Merkle                 object  Merkle proof structure
TSA                    object  TSA response data

5. Canonicalization and Hashing

5.1. JSON Canonicalization

Events MUST be canonicalized using [RFC8785] (JSON Canonicalization Scheme) before hashing.

The following fields MUST be excluded from canonicalization:

  • EventHash
  • Signature

All other fields MUST be included. Field names in the canonical event object use PascalCase (e.g., EventID, ChainID, PrevHash).

5.2. EventHash Computation

function computeEventHash(event):
    eventCopy = copy(event)
    delete eventCopy.EventHash
    delete eventCopy.Signature
    canonical = JCS_canonicalize(eventCopy)  // RFC 8785
    hashBytes = SHA256(canonical)
    return "sha256:" + lowercase_hex(hashBytes)

The resulting EventHash is a 71-character string: the prefix "sha256:" followed by 64 lowercase hexadecimal characters.
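
The following Python sketch mirrors computeEventHash. It is an approximation, not a conforming implementation: json.dumps with sorted keys and compact separators matches RFC 8785 output only for events whose values are strings, integers, booleans, and nested objects with simple keys. A production implementation would need a complete JCS library, especially for floating-point numbers. The function name is hypothetical.

```python
import hashlib
import json


def compute_event_hash(event):
    # Exclude the two fields that are not part of the canonical form.
    body = {k: v for k, v in event.items()
            if k not in ("EventHash", "Signature")}
    # ASSUMPTION: this approximates RFC 8785 for simple values only.
    # Keys are sorted, no insignificant whitespace is emitted, and
    # non-ASCII characters are written as UTF-8 rather than escaped.
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()
```

Since EventHash and Signature are removed before hashing, recomputing the hash on a stored event yields the same value whether or not those fields are already populated.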

5.3. LeafHash Computation

function computeLeafHash(eventHash):
    hexStr = eventHash.substring(7)  // Remove "sha256:" prefix
    eventHashBytes = hexDecode(hexStr)  // 32 bytes
    prefixedData = [0x00] + eventHashBytes  // 33 bytes
    leafHashBytes = SHA256(prefixedData)
    return "sha256:" + lowercase_hex(leafHashBytes)

The 0x00 prefix byte provides domain separation from internal nodes.

5.4. Internal Node Hash Computation

function computeInternalHash(left, right):
    leftBytes = hexDecode(left.substring(7))  // Remove prefix, decode
    rightBytes = hexDecode(right.substring(7))
    prefixedData = [0x01] + leftBytes + rightBytes  // 65 bytes
    hashBytes = SHA256(prefixedData)
    return "sha256:" + lowercase_hex(hashBytes)

The 0x01 prefix byte distinguishes internal nodes from leaves.

5.5. AnchorDigest Computation

AnchorDigest is the MerkleRoot value WITHOUT the sha256: prefix, represented as 64 lowercase hexadecimal characters.

function computeAnchorDigest(merkleRoot):
    return lowercase(merkleRoot.substring(7))

PROHIBITED:

  • SHA256(merkleRoot) - This would create double hashing
  • SHA256(stringEncode(merkleRoot)) - This would hash the string representation
  • Any transformation other than prefix removal
  • Mixed case output - MUST be lowercase
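
To make the distinction concrete, the Python sketch below derives both the AnchorDigest string and the 32 bytes that go into the TSA request. The function names are hypothetical. The point is that the bytes are obtained by hex-decoding, never by hashing the MerkleRoot string again.

```python
import re

# A MerkleRoot is the literal prefix "sha256:" plus 64 hex characters.
_MERKLE_ROOT = re.compile(r"sha256:[0-9a-fA-F]{64}")


def anchor_digest(merkle_root):
    """Strip the prefix and lowercase; no other transformation."""
    if not _MERKLE_ROOT.fullmatch(merkle_root):
        raise ValueError("MerkleRoot must be sha256:<64 hex chars>")
    return merkle_root[7:].lower()


def hashed_message(merkle_root):
    """The 32 raw bytes placed in messageImprint.hashedMessage."""
    return bytes.fromhex(anchor_digest(merkle_root))
```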

6. Anchoring Protocol

6.1. TSA Request Construction

The messageImprint in TimeStampReq MUST contain:

  • hashAlgorithm: SHA-256 (OID 2.16.840.1.101.3.4.2.1)
  • hashedMessage: AnchorDigest as 32-byte OCTET STRING

TimeStampReq ::= SEQUENCE {
   version         INTEGER { v1(1) },
   messageImprint  MessageImprint,
   reqPolicy       OBJECT IDENTIFIER OPTIONAL,
   nonce           INTEGER OPTIONAL,
   certReq         BOOLEAN DEFAULT FALSE,
   extensions      [0] IMPLICIT Extensions OPTIONAL
}

MessageImprint ::= SEQUENCE {
   hashAlgorithm   AlgorithmIdentifier,  -- SHA-256
   hashedMessage   OCTET STRING          -- AnchorDigest (32 bytes)
}

6.1.1. certReq Recommendation

Producers SHOULD set certReq to TRUE to request the TSA's signing certificate be included in the response. This enables:

  • Offline verification without fetching certificates separately
  • Long-term verification even if TSA infrastructure changes
  • Self-contained Evidence Packs

If certReq is FALSE and the TSA certificate is not included in the response, verifiers MUST attempt to obtain the certificate through other means (e.g., AIA extension, local cache) or return VALID_WARNING.
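
For illustration, a minimal TimeStampReq with certReq set to TRUE can be DER-encoded by hand, since every length in it fits in a single byte. The Python sketch below is an assumption-laden example, not a reference encoder: the function name is hypothetical, the optional reqPolicy, nonce, and extensions fields are omitted, and the SHA-256 AlgorithmIdentifier is encoded with explicit NULL parameters.

```python
# AlgorithmIdentifier for SHA-256 (OID 2.16.840.1.101.3.4.2.1) with
# NULL parameters: SEQUENCE { OID, NULL }.
SHA256_ALGORITHM_ID = bytes.fromhex("300d06096086480165030402010500")


def build_timestamp_req(anchor_digest, cert_req=True):
    digest = bytes.fromhex(anchor_digest)
    if len(digest) != 32:
        raise ValueError("AnchorDigest must decode to 32 bytes")
    version = b"\x02\x01\x01"  # INTEGER 1 (v1)
    # MessageImprint ::= SEQUENCE { hashAlgorithm, hashedMessage }
    imprint_body = SHA256_ALGORITHM_ID + b"\x04\x20" + digest
    message_imprint = b"\x30" + bytes([len(imprint_body)]) + imprint_body
    # DER omits a BOOLEAN that equals its DEFAULT (FALSE); TRUE is 0xFF.
    cert_req_field = b"\x01\x01\xff" if cert_req else b""
    body = version + message_imprint + cert_req_field
    # All lengths here are below 128, so short-form lengths suffice.
    return b"\x30" + bytes([len(body)]) + body
```

With certReq TRUE the request is 59 bytes long. Producers that add a nonce or policy OID would extend the body accordingly.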

6.2. TSA Response Processing

Upon receiving TimeStampResp, the producer:

  1. MUST verify the response status is granted (0) or grantedWithMods (1)
  2. MUST extract the TimeStampToken from the response
  3. MUST store the complete DER-encoded TimeStampToken
  4. MUST extract and store the messageImprint from TSTInfo
  5. MUST extract and store GenTime from TSTInfo

6.3. Single-Leaf Tree Rules

When TreeSize equals 1, the following invariants MUST hold:

  • LeafIndex MUST equal 0
  • Proof MUST be an empty array
  • Root MUST equal LeafHash
  • LeafHash MUST equal SHA256(0x00 || EventHash_bytes)

If any of these conditions fail, verification MUST return INVALID.

6.4. Multi-Leaf Tree Rules

For TreeSize greater than 1:

  • LeafIndex MUST be in range [0, TreeSize-1]
  • PaddedSize = smallest power of 2 >= TreeSize
  • Proof length MUST NOT exceed log2(PaddedSize)
  • Sibling hashes are ordered from bottom (leaf level) to top (root level)
  • Index parity determines pairing order: even=left, odd=right
  • All internal nodes use SHA256(0x01 || left || right)

7. Verification Procedures

7.1. Verification Result Codes

Table 11
Code                       Meaning
-------------------------  ----------------------------------------------------------------------------------
VALID                      All checks passed, including TSA signature verification
VALID_WARNING              Cryptographic checks passed, but TSA certificate chain could not be fully validated
INVALID                    Cryptographic verification failed
CHAIN_INTEGRITY_VIOLATION  Hash chain is broken
COMPLETENESS_VIOLATION     Completeness Invariant mismatch

Note: VALID_WARNING indicates the proof is cryptographically sound but the TSA's identity could not be independently verified. Applications SHOULD display this distinction to users.

7.2. Event Verification

function verifyEvent(event, publicKey):
    // Step 1: Recompute EventHash
    computedHash = computeEventHash(event)
    if computedHash != event.EventHash:
        return INVALID("EventHash mismatch")

    // Step 2: Verify signature
    hashBytes = hexDecode(event.EventHash.substring(7))
    sigBytes = base64Decode(event.Signature)
    if not verifySignature(publicKey, hashBytes, sigBytes):
        return INVALID("Signature verification failed")

    return VALID

7.3. Merkle Proof Verification

function verifyMerkleProof(eventHash, leafIndex, proof,
                           expectedRoot, treeSize):
    // Step 1: Validate inputs
    if treeSize < 1:
        return INVALID("TreeSize must be >= 1")
    if leafIndex < 0 or leafIndex >= treeSize:
        return INVALID("LeafIndex out of range")

    paddedSize = computePaddedSize(treeSize)
    maxProofLength = log2(paddedSize)
    if proof.length > maxProofLength:
        return INVALID("Proof too long")

    // Step 2: Compute leaf hash with domain separation
    currentHash = computeLeafHash(eventHash)  // SHA256(0x00 || bytes)

    // Step 3: Handle single-leaf case
    if treeSize == 1:
        if leafIndex != 0:
            return INVALID("LeafIndex must be 0 for single-leaf")
        if proof.length != 0:
            return INVALID("Proof must be empty for single-leaf")
        if lowercase(currentHash) != lowercase(expectedRoot):
            return INVALID("Root != LeafHash for single-leaf")
        return VALID

    // Step 4: Traverse proof from bottom to top
    index = leafIndex
    for siblingHash in proof:
        if index % 2 == 0:
            // Current is left child
            currentHash = computeInternalHash(currentHash, siblingHash)
        else:
            // Current is right child
            currentHash = computeInternalHash(siblingHash, currentHash)
        index = floor(index / 2)

    // Step 5: Compare with expected root (case-insensitive)
    if lowercase(currentHash) != lowercase(expectedRoot):
        return INVALID("Computed root != expected root")

    return VALID

7.4. TSA Verification

TSA verification ensures the timestamp token was legitimately issued by a Time-Stamp Authority and binds the correct digest.

function verifyTSAAnchor(eventHash, anchor):
    // Step 1: Verify Merkle structure
    merkle = anchor.Merkle
    result = verifyMerkleProof(
        eventHash,
        merkle.LeafIndex,
        merkle.Proof,
        merkle.Root,
        merkle.TreeSize
    )
    if result != VALID:
        return result

    // Step 2: Verify LeafHashMethod
    if merkle.LeafHashMethod != "SHA256(0x00||EventHash)":
        return INVALID("Unsupported LeafHashMethod")

    // Step 3: Verify AnchorDigest == MerkleRoot
    expectedDigest = lowercase(merkle.Root.substring(7))
    if lowercase(anchor.AnchorDigest) != expectedDigest:
        return INVALID("AnchorDigest != MerkleRoot")

    // Step 4: Parse TSA Token (RFC 5652 ContentInfo)
    tsaToken = base64Decode(anchor.TSA.Token)
    contentInfo = parseContentInfo(tsaToken)  // RFC 5652
    signedData = parseSignedData(contentInfo.content)
    tstInfo = parseTSTInfo(signedData.encapContentInfo.eContent)

    // Step 5: Verify hash algorithm is SHA-256
    if tstInfo.messageImprint.hashAlgorithm != SHA256_OID:
        return INVALID("Unsupported TSA hash algorithm")

    // Step 6: Verify messageImprint == AnchorDigest (MUST)
    tstImprint = lowercase_hex(tstInfo.messageImprint.hashedMessage)
    if tstImprint != lowercase(anchor.AnchorDigest):
        return INVALID("TSA messageImprint != AnchorDigest")

    // Step 7: Verify CMS signature over TSTInfo (MUST per RFC 5652)
    signerInfo = signedData.signerInfos[0]
    signatureValid = verifyCMSSignature(
        signedData.encapContentInfo.eContent,
        signerInfo.signature,
        signerInfo.signatureAlgorithm,
        extractSignerCert(signedData.certificates, signerInfo.sid)
    )
    if not signatureValid:
        return INVALID("TSA signature verification failed")

    // Step 8: Verify certificate chain (SHOULD)
    certValid = verifyCertificateChain(
        signedData.certificates,
        signerInfo.sid,
        trustAnchors
    )

    if certValid:
        return VALID(genTime = tstInfo.genTime)
    else:
        return VALID_WARNING(
            genTime = tstInfo.genTime,
            warning = "TSA certificate chain could not be verified"
        )

7.4.1. CMS Signature Verification Requirements

Per [RFC5652], verifiers MUST:

  1. Parse the TimeStampToken as a ContentInfo structure
  2. Extract the SignedData from the content field
  3. Locate the SignerInfo corresponding to the TSA
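  4. Verify the signature over the encapsulated TSTInfo (when signedAttrs is present, as it is in RFC 3161 tokens, the signature covers the signed attributes, whose message-digest attribute in turn has to equal the digest of the TSTInfo)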
  5. Verify the signer's certificate was valid at signing time

Verifiers SHOULD:

  1. Build and validate the certificate chain to a trust anchor
  2. Verify the TSA certificate contains the id-kp-timeStamping extended key usage
  3. Check for certificate revocation

7.5. Chain Integrity Verification

GENESIS_PREV_HASH = "sha256:00000000000000000000000000000000" +
                    "00000000000000000000000000000000"

function verifyChainIntegrity(events):
    if events.length == 0:
        return VALID

    // First event must have Genesis PrevHash (64 zeros)
    if events[0].PrevHash != GENESIS_PREV_HASH:
        return CHAIN_INTEGRITY_VIOLATION("Invalid genesis PrevHash")

    for i in range(1, events.length):
        expectedPrevHash = events[i-1].EventHash
        if events[i].PrevHash != expectedPrevHash:
            return CHAIN_INTEGRITY_VIOLATION(
                "Break at event " + i +
                ": expected " + expectedPrevHash +
                ", found " + events[i].PrevHash)

    return VALID
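
The same walk can be written in a few lines of Python. In this sketch the function name is hypothetical and events are plain dictionaries; it reports the index of the first broken link instead of a result code. Like the pseudocode, it compares stored EventHash values and therefore assumes each event has already passed the check in Section 7.2.

```python
GENESIS_PREV_HASH = "sha256:" + "0" * 64


def first_chain_break(events):
    """Return the index of the first event whose PrevHash is wrong,
    or None if the chain is intact (an empty chain counts as intact)."""
    expected = GENESIS_PREV_HASH
    for i, event in enumerate(events):
        if event["PrevHash"] != expected:
            return i
        # The next event must point at this event's EventHash.
        expected = event["EventHash"]
    return None
```

A return value of 0 corresponds to an invalid genesis PrevHash; any other index corresponds to CHAIN_INTEGRITY_VIOLATION at that event.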

7.6. Completeness Invariant Verification

The Completeness Invariant is verified against a SEAL event. The SEAL event MUST be anchored to a TSA to provide external timestamp binding for the entire collection.

function verifyCompleteness(events, sealEvent):
    ci = sealEvent.CompletenessInvariant

    // Step 1: Verify count matches
    if events.length != ci.ExpectedCount:
        return COMPLETENESS_VIOLATION(
            "Count mismatch: expected " + ci.ExpectedCount +
            ", found " + events.length)

    // Step 2: Compute XOR hash sum
    computed = bytes(32)  // Initialize to all zeros
    for event in events:
        eventHashBytes = hexDecode(event.EventHash.substring(7))
        computed = XOR(computed, eventHashBytes)

    // Step 3: Compare with sealed value
    expectedHashSum = hexDecode(ci.HashSum.substring(7))
    if computed != expectedHashSum:
        return COMPLETENESS_VIOLATION(
            "Hash sum mismatch - events may be missing or added")

    // Step 4: Verify timestamp bounds
    for event in events:
        if event.Timestamp < ci.FirstTimestamp:
            return COMPLETENESS_VIOLATION(
                "Event timestamp before collection start")
        if event.Timestamp > ci.LastTimestamp:
            return COMPLETENESS_VIOLATION(
                "Event timestamp after collection end")

    return VALID
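
A Python sketch of the same checks follows. It differs from the pseudocode in one deliberate way: it collects every failed check instead of returning at the first one, which can help when diagnosing a damaged collection. The function name and the returned labels are hypothetical. As in the pseudocode, timestamps are compared as strings, which is only sound if all of them use the same ISO 8601 form (for example UTC with a trailing Z and millisecond precision).

```python
def check_completeness(events, seal_event):
    """Return a list of failed checks; an empty list means complete."""
    ci = seal_event["CompletenessInvariant"]
    problems = []

    # Check 1: event count.
    if len(events) != ci["ExpectedCount"]:
        problems.append("count")

    # Check 2: XOR of all EventHash values, using 256-bit integers.
    acc = 0
    for event in events:
        acc ^= int(event["EventHash"][7:], 16)
    if acc != int(ci["HashSum"][7:], 16):
        problems.append("hash_sum")

    # Check 3: every timestamp lies within the sealed bounds.
    first, last = ci["FirstTimestamp"], ci["LastTimestamp"]
    if any(not (first <= e["Timestamp"] <= last) for e in events):
        problems.append("timestamp_bounds")

    return problems
```

Removing one event from a two-event collection produces both "count" and "hash_sum", matching the first row of the attack detection table that follows.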

7.6.1. Attack Detection

Table 12
Attack          Detection
--------------  ---------------------------------------------
Delete event    Hash sum mismatch and/or count mismatch
Add fake event  Count mismatch and/or hash sum mismatch
Reorder events  Chain integrity violation (PrevHash mismatch)
Modify event    EventHash mismatch in chain

8. Privacy Considerations

8.1. Location Data

Location collection SHOULD be disabled by default. When enabled, implementations SHOULD:

  • Clearly indicate when location is being recorded
  • Allow users to delete location from individual events
  • Consider privacy implications of location precision

8.2. Biometric Data

Implementations MUST NOT store raw biometric data (fingerprints, face images). Human presence verification, if implemented, SHOULD:

  • Process biometrics locally on-device
  • Store only verification results (boolean flags)
  • Never transmit biometric data to external services

8.3. Tombstone Privacy

When events are deleted via TOMBSTONE:

  • Original event content is removed
  • TOMBSTONE preserves chain integrity
  • Reason codes allow selective disclosure

8.4. Shareable vs Forensic Proofs

Evidence Packs may be created with different privacy levels:

Table 13
Level      Includes                            Use Case
---------  ----------------------------------  -----------------
Shareable  Timestamp, device info, asset hash  Social sharing
Forensic   All metadata including location     Legal proceedings

9. Security Considerations

9.1. Hash Algorithm Agility

This specification mandates SHA-256 for all hash computations. Future versions MAY define additional algorithms via the HashAlgo field. Verifiers MUST reject unknown hash algorithms.

9.2. Signature Algorithm Requirements

Implementations MUST support ES256 (ECDSA with P-256 and SHA-256) for mobile device compatibility. Ed25519 MAY be supported for non-mobile implementations.

Private keys SHOULD be stored in hardware security modules (Secure Enclave, StrongBox, TPM) where available.

9.3. TSA Trust

Security of timestamp proofs depends on TSA trustworthiness. Implementations:

  • MUST verify the CMS signature in the TimeStampToken per [RFC5652]
  • SHOULD validate the certificate chain to a configured trust anchor
  • SHOULD use TSAs with published certificate policies
  • MAY support multiple TSA services for redundancy

If certificate chain validation fails but CMS signature verification succeeds, the result SHOULD be VALID_WARNING rather than INVALID, as the timestamp binding is cryptographically sound even if the TSA's identity cannot be fully verified.

9.4. Merkle Tree Security

9.4.1. Domain Separation

The 0x00/0x01 prefix bytes ensure:

  • Leaf hashes cannot equal internal node hashes for any input
  • An attacker cannot construct a valid proof by substituting internal nodes for leaves
  • The tree structure is unambiguous given a root hash

This construction is not compatible with Certificate Transparency [RFC6962]: RFC 6962 uses the same 0x00/0x01 prefix bytes but builds its tree without padding the leaf array.

9.4.2. Padding Security

Duplicating the last leaf for padding:

  • Is deterministic (no randomness)
  • Produces the same tree for the same inputs
  • Does not leak information about padding count (TreeSize is explicitly stored)

9.5. Clock Accuracy

Device timestamps (Timestamp field) are self-attested and may be inaccurate. The authoritative timestamp is GenTime from the TSA response.

Implementations SHOULD warn users when device time differs significantly from TSA GenTime (e.g., more than 5 minutes).
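
One possible form of this check is sketched below in Python. The function names are hypothetical, the 5-minute threshold is simply the example value from the text, and gen_time is assumed to be a timezone-aware datetime already extracted from TSTInfo.

```python
from datetime import datetime, timedelta, timezone

# Example threshold from the text; applications may choose another value.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(device_timestamp, gen_time):
    """Absolute difference between the event Timestamp and TSA GenTime."""
    # fromisoformat() does not accept a trailing "Z" before Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    device = datetime.fromisoformat(device_timestamp.replace("Z", "+00:00"))
    return abs(gen_time - device)


def should_warn(device_timestamp, gen_time):
    return clock_skew(device_timestamp, gen_time) > MAX_SKEW
```

Some skew is expected in normal operation, because GenTime reflects when the anchor was requested, which can be later than the capture itself.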

9.6. Deletion Detection Limitations

The Completeness Invariant detects deletions within a sealed collection. It does NOT detect:

  • Events never created (adversary captured but never recorded)
  • Events in other collections
  • Deletions before sealing

9.6.1. XOR Hash Sum Limitations

The Completeness Invariant uses XOR for omission detection, NOT for cryptographic commitment. Important limitations:

  • Collision by design: XOR is commutative and self-inverse. An attacker who can forge TWO events with EventHashes that XOR to zero can delete both without detection.
  • Not a commitment scheme: Unlike Merkle roots, the XOR hash sum does not cryptographically bind to a specific set of events.
  • Complementary mechanism: The CI is designed to work WITH the Merkle tree anchor, not replace it. The TSA-anchored Merkle root provides the cryptographic commitment; the CI provides additional omission detection for collections.

Threat model: The CI protects against accidental deletion or deletion by parties who cannot forge events (e.g., device owners deleting their own legitimately-captured evidence). It does NOT protect against adversaries who control event creation.
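
The algebraic properties behind these limitations are easy to observe. The Python sketch below (hypothetical function name, raw 32-byte inputs) shows that the XOR sum ignores ordering and that any pair of equal values cancels out, which is why the HashSum is useful only in combination with ExpectedCount, the hash chain, and the anchored Merkle root.

```python
import hashlib


def xor_sum(hashes):
    """XOR a list of 32-byte values into one 32-byte value."""
    acc = 0
    for h in hashes:
        acc ^= int.from_bytes(h, "big")
    return acc.to_bytes(32, "big")


a = hashlib.sha256(b"event-a").digest()
b = hashlib.sha256(b"event-b").digest()
c = hashlib.sha256(b"event-c").digest()

# Commutative: reordering the events leaves the sum unchanged.
same_when_reordered = xor_sum([a, b]) == xor_sum([b, a])
# Self-inverse: a value that appears twice contributes nothing.
pair_cancels = xor_sum([a, b, c, c]) == xor_sum([a, b])
```

Both flags evaluate to True, since each property follows directly from the definition of XOR.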

9.7. Canonicalization Attacks

JSON canonicalization per [RFC8785] prevents ordering and whitespace attacks. However, implementations MUST ensure:

  • Field names exactly match the specification (PascalCase for events)
  • No additional fields are introduced before hashing
  • Unicode normalization is handled consistently

10. IANA Considerations

This document has no IANA actions.

11. Implementation Experience

11.1. VeraSnap (Non-Normative)

VeraSnap is a consumer iOS application implementing CPP. It demonstrates:

  • Secure Enclave key storage with ES256 signatures
  • RFC 3161 TSA integration with multiple providers
  • Merkle tree construction per this specification
  • Offline proof verification

Implementation validated that:

  • Single-leaf Merkle proofs verify correctly
  • TSA messageImprint extraction works across TSA providers
  • Evidence Packs enable verification without network access

Deployment experience informed the explicit specification of:

  • AnchorDigest computation (avoiding double-hashing)
  • Single-leaf tree invariants
  • LeafHashMethod field for algorithm agility

12. References

12.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC3161]
Adams, C., Cain, P., Pinkas, D., and R. Zuccherato, "Internet X.509 Public Key Infrastructure Time-Stamp Protocol (TSP)", RFC 3161, DOI 10.17487/RFC3161, <https://www.rfc-editor.org/info/rfc3161>.
[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, <https://www.rfc-editor.org/info/rfc4648>.
[RFC5652]
Housley, R., "Cryptographic Message Syntax (CMS)", STD 70, RFC 5652, DOI 10.17487/RFC5652, <https://www.rfc-editor.org/info/rfc5652>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8785]
Rundgren, A., Jordan, B., and S. Erdtman, "JSON Canonicalization Scheme (JCS)", RFC 8785, DOI 10.17487/RFC8785, <https://www.rfc-editor.org/info/rfc8785>.

12.2. Informative References

[RFC6962]
Laurie, B., Langley, A., and E. Kasper, "Certificate Transparency", RFC 6962, DOI 10.17487/RFC6962, June 2013, <https://www.rfc-editor.org/info/rfc6962>.
[C2PA]
Coalition for Content Provenance and Authenticity, "C2PA Specification", <https://c2pa.org/specifications/>.

Appendix A. JSON Examples

A.1. Canonical Event (Normative)

The canonical event structure uses PascalCase field names. This is the structure that MUST be used for EventHash computation.

{
  "EventID": "550e8400-e29b-41d4-a716-446655440001",
  "ChainID": "urn:uuid:550e8400-e29b-41d4-a716-446655440000",
  "PrevHash": "sha256:00000000000000000000000000000000000000000000000000000000000000",
  "Timestamp": "2026-01-27T10:30:00.000Z",
  "EventType": "INGEST",
  "HashAlgo": "SHA256",
  "SignAlgo": "ES256",
  "Asset": {
    "AssetID": "asset-001",
    "AssetType": "IMAGE",
    "AssetHash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "AssetName": "IMG_0001.HEIC",
    "MimeType": "image/heic"
  },
  "EventHash": "sha256:7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730",
  "Signature": "MEUCIQDKsRwMv..."
}

A.2. SEAL Event (Normative)

{
  "EventID": "550e8400-e29b-41d4-a716-446655440010",
  "ChainID": "urn:uuid:550e8400-e29b-41d4-a716-446655440000",
  "PrevHash": "sha256:7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730",
  "Timestamp": "2026-01-27T18:00:00.000Z",
  "EventType": "SEAL",
  "HashAlgo": "SHA256",
  "SignAlgo": "ES256",
  "CollectionID": "collection-2026-01-27",
  "EventCount": 5,
  "CompletenessInvariant": {
    "ExpectedCount": 5,
    "HashSum": "sha256:1a2b3c4d5e6f7890abcdef1234567890abcdef1234567890abcdef1234567890",
    "FirstTimestamp": "2026-01-27T10:30:00.000Z",
    "LastTimestamp": "2026-01-27T17:45:00.000Z"
  },
  "MerkleRoot": "sha256:03938e2c8f758e6cae443d499b41c899c373eb0c0198bae61796a069f2b05904",
  "EventHash": "sha256:abcd1234567890abcdef1234567890abcdef1234567890abcdef1234567890ab",
  "Signature": "MEYCIQCx..."
}

A.3. Anchor Structure (Normative)

{
  "Anchor": {
    "AnchorID": "anchor-001",
    "AnchorType": "RFC3161",
    "AnchorDigest": "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929",
    "AnchorDigestAlgorithm": "sha-256",
    "Merkle": {
      "TreeSize": 1,
      "LeafHashMethod": "SHA256(0x00||EventHash)",
      "LeafHash": "sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929",
      "LeafIndex": 0,
      "Proof": [],
      "Root": "sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"
    },
    "TSA": {
      "Token": "MIIEzAYJKoZIhvcNAQcCoIIEvTCCBLkCAQMx...",
      "MessageImprint": {
        "HashAlgorithm": "sha-256",
        "HashedMessage": "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"
      },
      "GenTime": "2026-01-27T10:31:00.000Z",
      "Service": "https://freetsa.org/tsr"
    }
  }
}
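
Non-normative illustration: the relationships between the fields of a single-leaf anchor can be checked mechanically. The sketch below assumes the anchor has already been parsed from JSON into a dictionary holding the contents of the "Anchor" object, and check_single_leaf_anchor is a hypothetical helper name. It tests only the equalities visible in this example and in Appendix B.1 (Root equals LeafHash, Proof is empty, LeafIndex is 0, AnchorDigest equals the Root without its "sha256:" prefix, and HashedMessage equals AnchorDigest). It does not parse or validate the TSA token.

```python
def strip_prefix(value):
    """Drop a leading "sha256:" label, if present, and lowercase the hex."""
    return value.split(":", 1)[-1].lower()


def check_single_leaf_anchor(anchor):
    """Return True if a TreeSize=1 anchor is internally consistent."""
    merkle = anchor["Merkle"]
    digest = anchor["AnchorDigest"].lower()
    imprint = anchor["TSA"]["MessageImprint"]["HashedMessage"].lower()
    return (
        merkle["TreeSize"] == 1
        and merkle["LeafIndex"] == 0
        and merkle["Proof"] == []
        and merkle["Root"] == merkle["LeafHash"]
        # AnchorDigest is the Root itself, not a hash of the Root.
        and strip_prefix(merkle["Root"]) == digest
        and imprint == digest
    )
```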

A.4. Evidence Pack (Non-Normative)

The Evidence Pack is a distribution format. Field names use snake_case for compatibility with common JSON conventions in web APIs. This format is non-normative; implementations MAY use alternative formats.

{
  "proof_version": "1.3",
  "proof_type": "CPP_INGEST_PROOF",
  "proof_id": "proof-001",
  "event": {
    "event_id": "550e8400-e29b-41d4-a716-446655440001",
    "event_type": "INGEST",
    "timestamp": "2026-01-27T10:30:00.000Z",
    "asset_hash": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "asset_type": "IMAGE"
  },
  "event_hash": "sha256:7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730",
  "signature": {
    "algo": "ES256",
    "value": "MEUCIQDKsRwMv..."
  },
  "public_key": "MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...",
  "timestamp_proof": {
    "type": "RFC3161",
    "anchor_digest": "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929",
    "digest_algorithm": "sha-256",
    "merkle": {
      "tree_size": 1,
      "leaf_hash_method": "SHA256(0x00||EventHash)",
      "leaf_index": 0,
      "proof": [],
      "root": "sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"
    },
    "tsa": {
      "token": "MIIEzAYJKoZIhvcNAQcCoIIEvTCCBLkCAQMx...",
      "message_imprint": "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929",
      "gen_time": "2026-01-27T10:31:00.000Z",
      "service": "https://freetsa.org/tsr"
    }
  }
}

Note: When verifying an Evidence Pack, implementations MUST reconstruct the canonical event structure (PascalCase) from the evidence pack fields before computing EventHash.
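
Non-normative illustration: the reconstruction step in the note above can be sketched with an explicit name table. The mapping below is hypothetical and is derived only from the field names shown in A.1 and A.4; to_canonical_event is an assumed helper name. The example pack does not carry every field of the A.1 event (for instance ChainID and PrevHash), so this sketch covers only the fields that appear in the pack's "event" object.

```python
# Hypothetical mapping from Evidence Pack names to canonical names.
TOP_LEVEL = {
    "event_id": "EventID",
    "event_type": "EventType",
    "timestamp": "Timestamp",
}
ASSET = {
    "asset_hash": "AssetHash",
    "asset_type": "AssetType",
}


def to_canonical_event(pack_event):
    """Rebuild a PascalCase event from the pack's snake_case "event" object."""
    event = {}
    asset = {}
    for key, value in pack_event.items():
        if key in TOP_LEVEL:
            event[TOP_LEVEL[key]] = value
        elif key in ASSET:
            asset[ASSET[key]] = value
        else:
            # Fail loudly instead of guessing a canonical name.
            raise KeyError(f"no canonical mapping for field: {key}")
    if asset:
        event["Asset"] = asset
    return event
```

An explicit table is used instead of mechanical case conversion because a generic snake_case to PascalCase rule would turn "event_id" into "EventId", which does not match the canonical name "EventID".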

Appendix B. Test Vectors

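All test vectors in this section use the domain-separated hash construction defined in this specification: leaf hashes are computed over a 0x00 prefix followed by the EventHash bytes, and internal node hashes over a 0x01 prefix followed by the two child hashes.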

B.1. Test Vector 1: Single-Leaf Tree

Input:

EventHash = "sha256:7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730"

Computation:

EventHash_bytes = 0x7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730

LeafHash = SHA256(0x00 || EventHash_bytes)
         = SHA256(0x007d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730)
         = sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929

For TreeSize=1:
  Root = LeafHash = sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929
  LeafIndex = 0
  Proof = []
  AnchorDigest = 719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929

Expected Anchor:

{
  "Merkle": {
    "TreeSize": 1,
    "LeafHashMethod": "SHA256(0x00||EventHash)",
    "LeafHash": "sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929",
    "LeafIndex": 0,
    "Proof": [],
    "Root": "sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"
  },
  "AnchorDigest": "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"
}

Verification:

verifyMerkleProof(
    "sha256:7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730",
    0, [],
    "sha256:719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929",
    1
) = VALID
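
Non-normative illustration: the computation in this test vector can be reproduced with a few lines of code. This is a minimal sketch using Python's standard hashlib module; the function names parse_hash, leaf_hash, and single_leaf_anchor are assumptions made for this example and are not defined by this specification.

```python
import hashlib


def parse_hash(value):
    """Convert "sha256:<64 hex digits>" into 32 raw bytes."""
    algo, _, hex_digits = value.partition(":")
    if algo != "sha256" or len(hex_digits) != 64:
        raise ValueError(f"not a sha256 hash string: {value!r}")
    return bytes.fromhex(hex_digits)


def leaf_hash(event_hash):
    """LeafHash = SHA256(0x00 || EventHash_bytes), returned with its prefix."""
    digest = hashlib.sha256(b"\x00" + parse_hash(event_hash)).hexdigest()
    return "sha256:" + digest


def single_leaf_anchor(event_hash):
    """Build the Merkle fields and AnchorDigest for a TreeSize=1 tree."""
    leaf = leaf_hash(event_hash)
    return {
        "Merkle": {
            "TreeSize": 1,
            "LeafHashMethod": "SHA256(0x00||EventHash)",
            "LeafHash": leaf,
            "LeafIndex": 0,
            "Proof": [],
            "Root": leaf,  # single-leaf tree: the root is the leaf hash
        },
        # AnchorDigest is the root's hex digits, not a second hash of the root.
        "AnchorDigest": leaf.split(":", 1)[1],
    }
```

Calling single_leaf_anchor with the EventHash from the Input above and comparing the result with the Expected Anchor is one way to check an implementation against this vector.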

B.2. Test Vector 2: Two-Leaf Tree

Input:

EventHash[0] = "sha256:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
EventHash[1] = "sha256:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"

Computation:

L0 = SHA256(0x00 || 0xaa...aa)
   = sha256:e0bb82791bae3c50bd9c20fa4ccdcb8064a56e5c12bc69b07e6712ac9b4429e6

L1 = SHA256(0x00 || 0xbb...bb)
   = sha256:4f16119d36ccd0da91102f57692d73934fd0ad2494280df88449accedbbfb7ea

Root = SHA256(0x01 || L0_bytes || L1_bytes)
     = SHA256(0x01 || 0xe0bb82...e6 || 0x4f1611...ea)
     = sha256:03938e2c8f758e6cae443d499b41c899c373eb0c0198bae61796a069f2b05904

TreeSize = 2
PaddedSize = 2 (no padding needed)

For index 0: Proof = ["sha256:4f16119d36ccd0da91102f57692d73934fd0ad2494280df88449accedbbfb7ea"]
For index 1: Proof = ["sha256:e0bb82791bae3c50bd9c20fa4ccdcb8064a56e5c12bc69b07e6712ac9b4429e6"]

Verification of Index 0:

1. currentHash = SHA256(0x00 || EventHash[0]_bytes)
              = sha256:e0bb82791bae3c50bd9c20fa4ccdcb8064a56e5c12bc69b07e6712ac9b4429e6

2. index = 0, which is EVEN -> current is LEFT child

3. siblingHash = sha256:4f16119d36ccd0da91102f57692d73934fd0ad2494280df88449accedbbfb7ea

4. currentHash = SHA256(0x01 || currentHash_bytes || siblingHash_bytes)
              = sha256:03938e2c8f758e6cae443d499b41c899c373eb0c0198bae61796a069f2b05904

5. Compare with expected Root: MATCH

Result: VALID
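
Non-normative illustration: the verification steps above generalize to a loop over the proof. The sketch below mirrors the argument order of verifyMerkleProof as used in B.1, but the Python name verify_merkle_proof and the helper _raw are assumptions of this example. It uses tree_size only for a bounds check and does not implement the padding rules for trees whose size is not a power of two, which are defined elsewhere in this specification.

```python
import hashlib


def _raw(value):
    """Decode a hash string, with or without a "sha256:" prefix, to bytes."""
    return bytes.fromhex(value.split(":", 1)[-1])


def verify_merkle_proof(event_hash, leaf_index, proof, root, tree_size):
    """Return True if the proof links event_hash at leaf_index to root."""
    if not 0 <= leaf_index < tree_size:
        return False
    # Step 1: domain-separated leaf hash.
    current = hashlib.sha256(b"\x00" + _raw(event_hash)).digest()
    index = leaf_index
    for sibling in proof:
        if index % 2 == 0:
            # Even index: current node is the LEFT child.
            data = b"\x01" + current + _raw(sibling)
        else:
            # Odd index: current node is the RIGHT child.
            data = b"\x01" + _raw(sibling) + current
        current = hashlib.sha256(data).digest()
        index //= 2  # move up one level
    return current == _raw(root)
```

With an empty proof the loop body never runs, so the function reduces to comparing the leaf hash with the root, which is the single-leaf case of B.1.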

B.3. Test Vector 3: TSA messageImprint Verification

Input:

AnchorDigest = "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"

TSA Token TSTInfo contains:
  messageImprint.hashAlgorithm = 2.16.840.1.101.3.4.2.1 (SHA-256)
  messageImprint.hashedMessage = 0x719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929

Verification:

1. Extract messageImprint.hashedMessage from TSTInfo
   hashedMessage = 0x719f871f...1e929

2. Convert to lowercase hex string
   tstImprint = "719f871f1018a17ebe199d4f0db27e3a4929f8ab3e46f5c0d30054f4b331e929"

3. Compare with AnchorDigest (case-insensitive)
   tstImprint == lowercase(AnchorDigest) ?
   "719f871f...1e929" == "719f871f...1e929" ? YES

Result: VALID
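
Non-normative illustration: the comparison in steps 2 and 3 can be sketched as follows. The sketch assumes that messageImprint.hashAlgorithm (as a dotted OID string) and messageImprint.hashedMessage (as raw bytes) have already been extracted from the TSTInfo with an ASN.1 library; that extraction is not shown, and imprint_matches is a hypothetical helper name.

```python
SHA256_OID = "2.16.840.1.101.3.4.2.1"  # OID shown in the TSTInfo above


def imprint_matches(hash_algorithm_oid, hashed_message, anchor_digest):
    """Return True if the TSA imprint equals AnchorDigest (case-insensitive)."""
    if hash_algorithm_oid != SHA256_OID:
        return False
    # bytes.hex() always produces lowercase hex digits (step 2).
    tst_imprint = hashed_message.hex()
    # Step 3: lowercase AnchorDigest before comparing.
    return tst_imprint == anchor_digest.lower()
```

The function compares the imprint with AnchorDigest directly and does not hash AnchorDigest again, which is the double-hashing mistake noted in Section 11.1.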

Acknowledgements

The authors thank:

Author's Address

Tokachi Kamimura
VeritasChain Co., Ltd.