| Internet-Draft | SCITT-Refusal-Events | January 2026 |
| Kamimura | Expires 3 August 2026 | [Page] |
This document defines a claim set for recording AI content refusal events. The claim set specifies the semantic content and correlation rules for refusal audit trails, independent of any particular serialization format. The claims are designed to be carried within SCITT Signed Statements and verified using SCITT Receipts.¶
This specification addresses claim semantics and verification requirements; it does not mandate a specific encoding. A CDDL definition is provided for CBOR-based implementations, and equivalent JSON representations are shown in an appendix for illustration.¶
This specification provides auditability of logged refusal decisions. It does not define content moderation policies, classification criteria, or what AI systems should refuse.¶
This note is to be removed before publishing as an RFC.¶
The latest version of this document, along with implementation resources and examples, can be found at <https://github.com/veritaschain/cap-spec>.¶
Discussion of this document takes place on the SCITT Working Group mailing list (scitt@ietf.org).¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 3 August 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
AI systems capable of generating content increasingly implement safety mechanisms to refuse requests deemed harmful, illegal, or policy-violating. However, these refusal decisions typically leave no verifiable audit trail. When a system refuses to generate content, the event vanishes: there is no receipt, no log entry accessible to external parties, and no mechanism for third-party verification.¶
This document defines a claim set for recording such refusal events. The claim set specifies:¶
The claims are designed to be carried within SCITT Signed Statements [I-D.ietf-scitt-architecture] and verified using SCITT Receipts. This document does not mandate a specific serialization; implementations may use CBOR, JSON, or other formats as appropriate for their deployment context.¶
The January 2026 Grok incident demonstrated the need for verifiable refusal records. The AI system generated millions of harmful images while the provider claimed moderation systems were functioning. External parties had no mechanism to verify whether refusals actually occurred or whether claimed safeguards were enforced.¶
This creates several problems:¶
SCITT provides the infrastructure (Signed Statements, Transparency Services, and Receipts) to address this gap. This document defines the claim semantics for AI refusal events that can be carried within that infrastructure.¶
This document defines:¶
This document does NOT define:¶
This specification is an application profile for SCITT, defining domain-specific claim semantics. It does not extend or modify SCITT architecture.¶
This specification provides auditability of refusal decisions that are logged, not cryptographic proof that no unlogged generation occurred. An AI system that bypasses logging entirely cannot be detected by this mechanism alone.¶
Transparency Services do not enforce the completeness invariant defined in this document; such checks are performed by verifiers at the application level using the claim semantics defined herein.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document uses terminology from [I-D.ietf-scitt-architecture]. The following terms are specific to this specification:¶
This specification maps directly to SCITT primitives:¶
| This Document | SCITT Primitive |
|---|---|
| ATTEMPT claim set | Signed Statement payload |
| Outcome claim set | Signed Statement payload |
| AI System | Issuer |
| Inclusion proof | Receipt |
This section defines the claims for refusal events. Claims are specified by their semantic meaning; encoding is deployment-specific. Section 4 provides a CDDL grammar for CBOR representation.¶
The following claims appear in all event types:¶
An ATTEMPT records receipt of a generation request. It MUST be created before any safety evaluation begins.¶
The requirement to create ATTEMPT before safety evaluation prevents selective logging where only "safe" requests are recorded.¶
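The ordering requirement can be illustrated with a short, non-normative Python sketch. It assumes an in-memory list as the log, a caller-supplied `evaluate` callable standing in for the safety check, random UUIDs for event-id, and the "sha256:" prefixed hash form; `handle_request` and `sha256_prefixed` are hypothetical names, and actual content generation is omitted.

```python
import hashlib
import time
import uuid


def sha256_prefixed(data: bytes) -> str:
    """Return a 'sha256:<hex>' string (prefixed form is an assumption)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def handle_request(prompt: str, issuer: str, log: list, evaluate) -> dict:
    """Log ATTEMPT first, then evaluate, then log exactly one Outcome."""
    attempt = {
        "event-type": "ATTEMPT",
        "event-id": str(uuid.uuid4()),  # assumption: random UUID in string form
        "timestamp": int(time.time()),
        "issuer": issuer,
        # Only the hash of the prompt is recorded, never the prompt itself.
        "prompt-hash": sha256_prefixed(prompt.encode("utf-8")),
        "input-type": "text",
    }
    log.append(attempt)  # recorded before the safety evaluation runs

    try:
        allowed = evaluate(prompt)  # hypothetical safety classifier
        outcome_type = "GENERATE" if allowed else "DENY"
    except Exception:
        outcome_type = "ERROR"  # system failure, not a policy decision

    outcome = {
        "event-type": outcome_type,
        "event-id": str(uuid.uuid4()),
        "timestamp": int(time.time()),
        "issuer": issuer,
        "attempt-id": attempt["event-id"],  # correlates Outcome to ATTEMPT
    }
    log.append(outcome)
    return outcome
```

Because the ATTEMPT is appended before `evaluate` is called, a request that is later refused or that crashes the classifier still leaves a record that an Outcome must answer.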
A DENY records refusal of a generation request.¶
A GENERATE records successful content generation.¶
An ERROR records system failure, not a policy decision.¶
ERROR does not indicate policy enforcement. A high ERROR rate may indicate operational issues or adversarial inputs.¶
The completeness invariant is the core integrity property:¶
For every ATTEMPT with event-id A that is registered with a Transparency Service, there MUST exist exactly one Outcome (DENY, GENERATE, or ERROR) with attempt-id equal to A.¶
Formally:¶
|{ATTEMPT}| = |{DENY}| + |{GENERATE}| + |{ERROR}|
where each Outcome references exactly one ATTEMPT, and no ATTEMPT is referenced by multiple Outcomes.¶
Violations indicate:¶
Verifiers SHOULD check this invariant when auditing event streams. Transparency Services do not enforce this invariant; it is an application-level semantic constraint.¶
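A verifier-side check of the invariant might look like the following non-normative Python sketch. It assumes events have already been decoded into dictionaries keyed by the claim names used in this document; the function name and the three result labels are illustrative and not defined by this specification.

```python
from collections import defaultdict

OUTCOME_TYPES = {"DENY", "GENERATE", "ERROR"}


def check_completeness(events):
    """Return violations of the completeness invariant for an event stream."""
    attempt_ids = set()
    outcomes_by_attempt = defaultdict(list)

    for ev in events:
        if ev["event-type"] == "ATTEMPT":
            attempt_ids.add(ev["event-id"])
        elif ev["event-type"] in OUTCOME_TYPES:
            outcomes_by_attempt[ev["attempt-id"]].append(ev["event-id"])

    referenced = set(outcomes_by_attempt)
    return {
        # ATTEMPTs that no Outcome references
        "missing_outcome": sorted(attempt_ids - referenced),
        # ATTEMPTs referenced by more than one Outcome
        "multiple_outcomes": sorted(
            a for a, outs in outcomes_by_attempt.items() if len(outs) > 1
        ),
        # Outcomes whose attempt-id matches no known ATTEMPT
        "orphan_outcome": sorted(referenced - attempt_ids),
    }
```

An event stream satisfies the invariant when all three lists are empty. Equal counts alone are not sufficient, since one duplicated Outcome can mask one missing Outcome.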
To maintain audit trail integrity:¶
Some scenarios require delayed outcomes (human review, system recovery). Implementations SHOULD document expected latency bounds.¶
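One way an auditor could apply a documented latency bound is sketched below in non-normative Python. It assumes integer epoch-second timestamps and a deployment-chosen bound; `classify_open_attempts` and its parameters are hypothetical names.

```python
def classify_open_attempts(events, now, max_latency_seconds):
    """Split ATTEMPTs lacking an Outcome into 'pending' and 'overdue'."""
    # attempt-ids that already have at least one Outcome
    answered = {
        ev["attempt-id"] for ev in events if ev["event-type"] != "ATTEMPT"
    }
    pending, overdue = [], []
    for ev in events:
        if ev["event-type"] != "ATTEMPT" or ev["event-id"] in answered:
            continue
        age = now - ev["timestamp"]
        if age > max_latency_seconds:
            overdue.append(ev["event-id"])  # bound exceeded: likely a violation
        else:
            pending.append(ev["event-id"])  # still within the documented bound
    return pending, overdue
```

Separating the two cases lets a verifier avoid reporting an ATTEMPT that is legitimately awaiting human review as a completeness violation.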
This section provides a CDDL [RFC8610] grammar for CBOR-based representation of the claim set. This grammar is RECOMMENDED for implementations using CBOR encoding within SCITT Signed Statements.¶
Implementations using other encodings (e.g., JSON) SHOULD maintain semantic equivalence with this definition.¶
; Refusal Event Claim Set
; draft-kamimura-scitt-refusal-events-02

refusal-event = attempt-event / outcome-event

; Common claims (appear in all events)
common-claims = (
  event-type: event-type-value,
  event-id: uuid,
  timestamp: time,
  issuer: tstr
)

event-type-value = "ATTEMPT" / "DENY" / "GENERATE" / "ERROR"

; ATTEMPT event
attempt-event = {
  common-claims,
  prompt-hash: hash-value,
  input-type: input-type-value,
  ? reference-input-hashes: [+ hash-value],
  ? session-id: uuid,
  ? actor-hash: hash-value,
  ? model-id: tstr,
  ? policy-id: tstr,
  * tstr => any  ; extension point
}

input-type-value = "text" / "image" / "text+image" /
                   "audio" / "video" / "multimodal"

; Outcome events
outcome-event = deny-event / generate-event / error-event

deny-event = {
  common-claims,
  attempt-id: uuid,
  ? risk-category: tstr,
  ? risk-score: float16 .le 1.0,
  ? refusal-reason: tstr,
  ? human-override: bool,
  * tstr => any
}

generate-event = {
  common-claims,
  attempt-id: uuid,
  ? output-hash: hash-value,
  * tstr => any
}

error-event = {
  common-claims,
  attempt-id: uuid,
  ? error-code: tstr,
  ? error-message: tstr,
  * tstr => any
}

; Supporting types
uuid = bstr .size 16 / tstr  ; binary or string representation
hash-value = bstr / tstr     ; raw bytes or prefixed string
time = #6.0(tstr) / #6.1(number) / uint  ; tag 0 (date-time string),
                                         ; tag 1 (epoch), or raw uint
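For implementations that decode events into dictionaries, a basic structural check against the grammar above could be sketched as follows. This non-normative Python example covers only required claims, the input-type values, and the upper bound on risk-score; it is not a full CDDL validator, and `validate_event` is a hypothetical name.

```python
COMMON = {"event-type", "event-id", "timestamp", "issuer"}
REQUIRED = {
    "ATTEMPT": COMMON | {"prompt-hash", "input-type"},
    "DENY": COMMON | {"attempt-id"},
    "GENERATE": COMMON | {"attempt-id"},
    "ERROR": COMMON | {"attempt-id"},
}
INPUT_TYPES = {"text", "image", "text+image", "audio", "video", "multimodal"}


def validate_event(event: dict) -> list:
    """Return a list of problems; an empty list means the basic shape is fine."""
    etype = event.get("event-type")
    if etype not in REQUIRED:
        return [f"unknown event-type: {etype!r}"]

    # Required claims for this event type that are absent
    problems = [
        f"missing claim: {name}" for name in sorted(REQUIRED[etype] - set(event))
    ]

    input_type = event.get("input-type")
    if etype == "ATTEMPT" and input_type is not None and input_type not in INPUT_TYPES:
        problems.append(f"invalid input-type: {input_type!r}")

    # The grammar bounds risk-score from above only (.le 1.0)
    score = event.get("risk-score")
    if etype == "DENY" and score is not None:
        if not isinstance(score, float) or score > 1.0:
            problems.append("risk-score must be a float no greater than 1.0")

    return problems
```

Unknown extra keys are accepted without complaint, mirroring the `* tstr => any` extension point in each event map.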
Notes on the CDDL definition:¶
The claim names defined in this specification (e.g., "event-type", "prompt-hash") represent semantic labels. Implementations MAY use alternative key representations (such as short integer labels for CBOR efficiency) provided the semantic meaning is preserved and documented.¶
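As a non-normative illustration of alternative key representations, the Python sketch below maps claim names to short integer labels and back. The label numbers are hypothetical; this specification does not assign any, so a deployment using this approach would need to document its own mapping.

```python
# Hypothetical label assignments; the specification does not define these numbers.
LABELS = {
    "event-type": 1,
    "event-id": 2,
    "timestamp": 3,
    "issuer": 4,
    "attempt-id": 5,
}
NAMES = {label: name for name, label in LABELS.items()}


def to_compact(event: dict) -> dict:
    """Replace known claim names with integer labels; keep other keys as-is."""
    return {LABELS.get(key, key): value for key, value in event.items()}


def from_compact(compact: dict) -> dict:
    """Restore semantic claim names from integer labels."""
    return {NAMES.get(key, key): value for key, value in compact.items()}
```

A round trip through `to_compact` and `from_compact` returns the original dictionary, which is one simple way to test that semantic meaning is preserved.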
This section describes how refusal event claim sets integrate with SCITT infrastructure.¶
A refusal event claim set is carried as the payload of a SCITT Signed Statement:¶
For CBOR-encoded claim sets, the Content-Type "application/cbor" or a more specific media type MAY be used.¶
After creating a Signed Statement, the Issuer SHOULD register it with a SCITT Transparency Service via SCRAPI [I-D.ietf-scitt-scrapi].¶
The Transparency Service returns a Receipt proving inclusion. Issuers SHOULD store Receipts for future verification requests.¶
Registration timing affects audit trail integrity:¶
A Verifiable Refusal Record consists of:¶
Verifiers confirm:¶
This demonstrates that a refusal was logged. It does not prove that no unlogged generation occurred.¶
A Transparency Service MAY apply a Registration Policy that validates:¶
Transparency Services SHOULD NOT attempt to enforce the completeness invariant; this is an application-level verification performed by auditors.¶
This document has no IANA actions at this time.¶
Future revisions may request:¶
The working group should consider whether standardized registries would benefit interoperability, or whether deployment-specific vocabularies are sufficient.¶
This specification assumes:¶
This specification does NOT protect against:¶
An adversary controlling the AI system might omit events to hide policy violations. The completeness invariant provides detection for logged events: auditors can identify ATTEMPTs without Outcomes.¶
However, if an ATTEMPT is never logged, this specification cannot detect the omission. Complete prevention requires external mechanisms (trusted execution, attestation [RFC9334]) outside this specification's scope.¶
A malicious Transparency Service might present different views to different parties. Mitigations include:¶
Replay of old events is mitigated by:¶
Refusal events may be triggered by harmful content. To prevent the audit log from becoming a repository of harmful content:¶
actor-hash provides pseudonymization. Implementations SHOULD:¶
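One common pseudonymization technique is a keyed hash, sketched here in non-normative Python using HMAC-SHA-256 from the standard library. The choice of HMAC, the deployment secret, and the "hmac-sha256:" prefix are assumptions made for illustration; this specification does not mandate a particular construction.

```python
import hashlib
import hmac


def actor_hash(actor_id: str, secret_key: bytes) -> str:
    """Derive a pseudonymous actor-hash under a deployment-held secret key."""
    digest = hmac.new(
        secret_key, actor_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Prefix is hypothetical; it signals that a keyed hash was used.
    return "hmac-sha256:" + digest
```

Unlike a plain hash of the identifier, a keyed hash cannot be reversed by enumerating likely identifiers unless the key is also known, and changing the key unlinks older records from newer ones.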
This appendix provides an example taxonomy for risk-category values. This taxonomy is non-normative; deployments MAY use different values appropriate to their context.¶
| Category | Description |
|---|---|
| CSAM_RISK | Child sexual abuse material |
| NCII_RISK | Non-consensual intimate imagery |
| MINOR_SEXUALIZATION | Sexualization of minors |
| REAL_PERSON_DEEPFAKE | Unauthorized realistic depiction |
| VIOLENCE_EXTREME | Graphic violence |
| HATE_CONTENT | Discriminatory content |
| TERRORIST_CONTENT | Terrorism-related material |
| SELF_HARM_PROMOTION | Self-harm encouragement |
| COPYRIGHT_VIOLATION | Clear IP infringement |
| COPYRIGHT_STYLE_MIMICRY | Artist style imitation |
| OTHER | Other policy violations |
A future revision may define an IANA registry for risk categories if interoperability requires standardized values.¶
This appendix provides guidance for deployments with varying assurance requirements. These are not conformance tiers; all deployments MUST satisfy the normative requirements in Section 3.¶
Suitable for voluntary transparency:¶
Suitable for regulatory compliance (e.g., EU AI Act):¶
Suitable for maximum accountability:¶
This appendix shows equivalent JSON representations of the claim sets defined in Section 3. These examples are for illustration; the normative definition is the CDDL in Section 4.¶
{
  "event-type": "ATTEMPT",
  "event-id": "019467a1-0001-7000-0000-000000000001",
  "timestamp": 1738159425,
  "issuer": "urn:example:ai-service:img-gen-prod",
  "prompt-hash": "sha256:7f83b1657ff1fc53b92dc18148a1d65d...",
  "input-type": "text+image",
  "reference-input-hashes": [
    "sha256:9f86d081884c7d659a2feaa0c55ad015..."
  ],
  "session-id": "019467a1-0001-7000-0000-000000000000",
  "actor-hash": "sha256:e3b0c44298fc1c149afbf4c8996fb924...",
  "model-id": "img-gen-v4.2.1",
  "policy-id": "content-safety-v2"
}
{
  "event-type": "DENY",
  "event-id": "019467a1-0001-7000-0000-000000000002",
  "timestamp": 1738159426,
  "issuer": "urn:example:ai-service:img-gen-prod",
  "attempt-id": "019467a1-0001-7000-0000-000000000001",
  "risk-category": "NCII_RISK",
  "risk-score": 0.94,
  "refusal-reason": "Content policy violation detected"
}
An Evidence Pack is a self-contained collection of events suitable for regulatory submission or third-party audit. This format is deployment-specific and not required for conformance.¶
A typical Evidence Pack contains:¶
The CAP-SRP specification [CAP-SRP] defines a detailed Evidence Pack format for content AI applications.¶
The authors thank the SCITT Working Group for feedback on earlier revisions, particularly regarding SCITT charter scope and the distinction between claim semantics and payload formats.¶
This work builds upon transparency log concepts from Certificate Transparency [RFC6962] and the SCITT architecture.¶