How We Managed a DNSSEC Failure at the .de TLD: A Step-by-Step Response Guide

Introduction

When a top-level domain like .de—one of the largest country-code TLDs globally—suffers a DNSSEC misconfiguration, the impact can cascade across the internet. On May 5, 2026, at roughly 19:30 UTC, DENIC, the registry for .de, began publishing incorrect DNSSEC signatures. Any validating resolver, including Cloudflare's 1.1.1.1, rejected those signatures per the DNSSEC specification and returned SERVFAIL to clients, making millions of domains unreachable. This guide walks through how we responded, step by step, so you can handle a similar TLD-wide DNSSEC failure if it ever occurs.

Source: blog.cloudflare.com

What You Need

Step-by-Step Response

Step 1: Detect the Anomaly

Monitor SERVFAIL spikes. In our case, alerts on Cloudflare Radar showed a sharp increase in SERVFAIL responses for .de domains. Validating resolvers like 1.1.1.1 returned failures because the signatures from DENIC couldn't be verified. Set up thresholds: e.g., >10% increase in SERVFAIL for a specific TLD compared to baseline. Use tools like Prometheus, Grafana, or built-in resolver logs to catch this early.
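
As a rough illustration of the threshold idea above, the following sketch flags TLDs whose SERVFAIL rate in the current window exceeds a stored baseline. It is not taken from any Cloudflare tooling: the TldWindow type, the spiking_tlds function, and the reading of ">10%" as ten percentage points above baseline are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TldWindow:
    """Query counts for one TLD in one monitoring window (hypothetical shape)."""
    tld: str
    total: int
    servfail: int


def servfail_rate(window: TldWindow) -> float:
    """Fraction of responses in the window that were SERVFAIL."""
    return window.servfail / window.total if window.total else 0.0


def spiking_tlds(windows, baseline, threshold=0.10):
    """Return TLDs whose SERVFAIL rate exceeds baseline by more than threshold.

    baseline maps TLD -> normal SERVFAIL rate; threshold is an absolute
    difference in rate (0.10 = ten percentage points), which is an assumption.
    """
    alerts = []
    for window in windows:
        if servfail_rate(window) - baseline.get(window.tld, 0.0) > threshold:
            alerts.append(window.tld)
    return alerts
```

In practice the same comparison would more likely live in an alerting rule inside the monitoring system; the function only shows the per-TLD comparison against a baseline.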

Step 2: Isolate the Validation Failure

Confirm the issue is DNSSEC-related. Check resolver logs for messages like “validation failure” or “no valid RRSIG”. Query the affected zone manually through a validating resolver using dig +dnssec de. SOA, then repeat the query with checking disabled using dig +dnssec +cd de. SOA. If the first query returns SERVFAIL but the +cd query succeeds, the problem is with the signatures rather than with reaching the name servers. In our incident, the .de zone published signatures that did not match the published DNSKEY records, breaking the chain of trust.
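
One way to make this check repeatable is to compare the response code of a normal query with the response code of the same query sent with the CD (checking disabled) bit, which dig sets via +cd. The sketch below assumes you already have both response codes as strings; the function names and the category labels are hypothetical.

```python
def dig_args(resolver: str, name: str, rrtype: str, checking_disabled: bool):
    """Build a dig command line; +cd asks the resolver to skip validation."""
    args = ["dig", f"@{resolver}", "+dnssec"]
    if checking_disabled:
        args.append("+cd")
    return args + [name, rrtype]


def classify(rcode_validating: str, rcode_cd: str) -> str:
    """Classify a failure from two response codes for the same question.

    rcode_validating: rcode from a normal query to a validating resolver.
    rcode_cd: rcode from the same query with the CD bit set.
    """
    if rcode_validating != "SERVFAIL":
        return "no-failure"
    if rcode_cd != "SERVFAIL":
        # Data is retrievable, only validation rejects it.
        return "dnssec-validation-failure"
    # Fails even without validation: likely unreachable or broken servers.
    return "resolution-failure"
```

For example, classify("SERVFAIL", "NOERROR") points at DNSSEC, while classify("SERVFAIL", "SERVFAIL") suggests looking at the authoritative servers or the network instead.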

Step 3: Identify the Root Cause

Determine which key is involved. DNSSEC uses Zone Signing Keys (ZSKs), which sign the records in the zone, and Key Signing Keys (KSKs), which sign the DNSKEY set and are anchored by the DS record in the parent zone. A common cause is a failed key rotation: the old key is removed before new signatures and DNSKEY records have propagated. In the .de case, DENIC had likely started a KSK rotation but the DS record in the root zone still pointed to the old key, or the new signatures were made with a key not yet anchored. Check the DS records at the parent (the root zone for TLDs) and compare them with the DNSKEY records in the child zone. Use delv or a chain visualizer such as DNSViz to trace the chain from the root down.
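
The DS-to-DNSKEY comparison can be sketched in a few lines. A DS record carries a key tag and a digest; for digest type 2 the digest is SHA-256 over the owner name in wire format followed by the DNSKEY RDATA, and the key tag is the checksum from RFC 4034 Appendix B. The helper names below are hypothetical, and the sketch only handles SHA-256 digests.

```python
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Encode a domain name in lowercase DNS wire format."""
    labels = [l for l in name.lower().rstrip(".").split(".") if l]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """Assemble DNSKEY RDATA: flags (2 bytes), protocol, algorithm, key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """Key tag checksum per RFC 4034 Appendix B (not valid for algorithm 1)."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += byte << 8 if i % 2 == 0 else byte
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_digest_sha256(owner: str, rdata: bytes) -> str:
    """DS digest (type 2): SHA-256 over owner name wire format + DNSKEY RDATA."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest()


def anchored_keys(owner, dnskeys, ds_records):
    """Return key tags of DNSKEYs that match some (key_tag, digest_hex) DS pair."""
    wanted = {(tag, digest.lower()) for tag, digest in ds_records}
    return [key_tag(k) for k in dnskeys
            if (key_tag(k), ds_digest_sha256(owner, k)) in wanted]
```

If anchored_keys returns an empty list for the child zone's DNSKEY set, no published key is covered by the parent's DS records, which is one way the chain of trust can break during a rotation.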

Step 4: Implement a Temporary Mitigation

Disable DNSSEC validation for the affected TLD. This is a last resort but necessary to restore service while the registry fixes the issue. On Cloudflare's 1.1.1.1, we added an exception to bypass validation for the .de zone. This kind of exception is known as a negative trust anchor (RFC 7646). In BIND, you would add one at runtime with rndc nta de, or list the zone in the validate-except option in named.conf; avoid dnssec-validation no;, which turns validation off for every zone. In Unbound, use domain-insecure: "de." in unbound.conf; val-permissive-mode: yes also stops the failures but applies to all zones. Document the change and set a timer to revert once fixed.

Important: This makes queries for .de domains insecure—they lose integrity verification. Communicate this to internal teams and potentially to users via status pages.
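
To keep the bypass and its rollback together, it can help to generate both commands at the same time. The sketch below only builds argument lists and does not run anything. It assumes BIND's rndc nta and Unbound's unbound-control insecure_add and insecure_remove subcommands; verify both against the versions you run. The bypass_commands function and the lifetime default are hypothetical.

```python
def bypass_commands(resolver: str, zone: str, lifetime_seconds: int = 3600):
    """Return the apply and revert commands for a temporary validation bypass.

    resolver: "bind" or "unbound". Nothing is executed here; the caller
    decides when (and whether) to run the returned argument lists.
    """
    zone = zone.rstrip(".")
    if resolver == "bind":
        # A negative trust anchor expires on its own after the lifetime.
        return {
            "apply": ["rndc", "nta", "-lifetime", str(lifetime_seconds), zone],
            "revert": ["rndc", "nta", "-remove", zone],
        }
    if resolver == "unbound":
        # Runtime equivalent of domain-insecure; must be removed by hand.
        return {
            "apply": ["unbound-control", "insecure_add", zone + "."],
            "revert": ["unbound-control", "insecure_remove", zone + "."],
        }
    raise ValueError(f"unsupported resolver: {resolver}")
```

Recording the revert command in the incident ticket when the bypass is applied makes it harder to forget the exception once the registry has fixed the zone.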

Step 5: Monitor and Communicate

Watch for resolution recovery. After applying the bypass, confirm that SERVFAIL rates for .de drop back toward baseline. At the same time, contact the registry (DENIC) to report the issue and get an ETA. Use established escalation paths: emails, phone calls, or registry forums. Provide them with specific error details: which signatures failed, timestamps, and the affected DNSKEY records. Keep internal stakeholders updated via incident channels.
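
A simple way to decide that resolution has recovered is to require several consecutive monitoring windows near the baseline, rather than reacting to a single good sample. The has_recovered function, the tolerance, and the window count below are assumptions chosen for illustration.

```python
def has_recovered(rates, baseline, tolerance=0.01, consecutive=3):
    """True if the last `consecutive` SERVFAIL rates are all near baseline.

    rates: SERVFAIL rates per window, oldest first.
    tolerance: allowed absolute excess over baseline (assumed value).
    """
    if len(rates) < consecutive:
        return False
    return all(rate <= baseline + tolerance for rate in rates[-consecutive:])
```

The consecutive-window requirement guards against declaring recovery while caches are still draining and rates are fluctuating.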

Step 6: Re-enable DNSSEC Once Fixed

Remove the bypass after the registry confirms the fix. DENIC corrected the signatures by republishing the .de zone with valid RRSIG records. Validate the fix with a few test domains (e.g., example.de) using a non-bypassed resolver. Verify that dig +dnssec returns the AD flag and no validation errors. Once confident, revert the configuration change. For Cloudflare, we removed the .de exception and confirmed that 1.1.1.1 began validating again without SERVFAIL spikes.
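
The AD-flag check can be scripted by reading the header lines that dig prints. The sketch below parses captured dig output as text; it assumes the usual header layout (a "status:" field and a ";; flags:" line), and the function names are hypothetical.

```python
import re


def parse_dig_header(output: str):
    """Extract the response status and the set of header flags from dig output."""
    status = re.search(r"status: (\w+)", output)
    flags = re.search(r";; flags: ([a-z ]*);", output)
    return (
        status.group(1) if status else None,
        set(flags.group(1).split()) if flags else set(),
    )


def is_validated(output: str) -> bool:
    """True if the answer succeeded and the resolver marked it authenticated."""
    status, flags = parse_dig_header(output)
    return status == "NOERROR" and "ad" in flags
```

A resolver that still has the bypass in place returns answers without the AD flag, so is_validated should only turn true for signed domains once the exception is gone and validation succeeds again.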

Tips for Future Preparedness

By following these steps, you can minimize disruption during a TLD-level DNSSEC outage. The .de incident taught us that even well-maintained registries can fail, but a rapid, structured response keeps the internet running.
