Drupal log noise — filtering watchdog to find real problems

1. Problem

You went to /admin/reports/dblog to investigate a single 500 error a user reported, and the page gave you 47,000 entries from the last 24 hours. Drupal's recent log messages page is overwhelming. Most of the entries are cron, page not found, php, access denied, and a wall of notice-level messages from contrib modules that no one asked to be told about. Somewhere in that flood is the real problem — but you cannot see it.

You filter by severity. Still thousands. You filter by type. The dropdown has 60+ values. You sort by date. The 404s drown everything.

This is the classic Drupal watchdog too-many-entries problem: the database log table (watchdog) collects every event indiscriminately, contrib modules log freely, and by the time you need the log, it is no longer useful. You suspect a real fatal is in there — buried under noise — but the dblog UI gives you no way to separate signal from noise, and grep on a database table is awkward at best.

If you are reading this in panic because production is misbehaving and dblog is a wall of green, this guide is about getting the noise out of the way and surfacing the entries that actually represent failure.

2. Impact

A noisy log is a broken log. There are three real consequences.

Real errors get missed. When watchdog shows 47,000 entries and 46,800 are 404s from a misconfigured CDN probe, the three php.fatal entries that explain why checkout is failing are statistically invisible. You will miss them. Engineers learn to ignore dblog entirely once it crosses the noise threshold, which means the only durable record of failure on the site is being treated as background radiation.

The database becomes the bottleneck. The watchdog table on a busy Drupal site can grow by hundreds of thousands of rows per day. Inserts compete with content writes. The cleanup cron (dblog.cron) deletes the oldest rows once you exceed dblog.settings:row_limit, but if your limit is set to 1,000 and you produce 50,000 rows/hour, you effectively have about a one-minute window of log history. Any incident older than that is gone forever.
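
That retention arithmetic is worth making explicit. A minimal sketch, using the illustrative figures from above (not measured values):

```shell
# Effective history window = row_limit / insert rate.
# Both figures below are the hypothetical numbers from the text.
row_limit=1000          # dblog.settings:row_limit
rows_per_hour=50000     # observed watchdog insert rate
window_seconds=$(( row_limit * 3600 / rows_per_hour ))
echo "effective log history: ${window_seconds} seconds"
```

At these rates the window is 72 seconds — anything older has already been deleted by dblog.cron.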

Severity loses meaning. Drupal has eight RFC 5424 severity levels (emergency through debug). When contrib modules log routine successes at notice and configuration warnings at warning, the levels lose their semantic meaning. error becomes the new notice. By the time something logs at critical, no one trusts the label.

This is not a cosmetic problem. It is a detection problem. You cannot run a Drupal site reliably if you cannot tell signal from noise in your own logs.

3. Why It’s Hard to Spot

Drupal does not surface log volume as a metric. The dblog UI shows you a paginated table; it does not show you the rate of entries per type, per hour, or the distribution of severities. There is no "this channel is emitting 3,000x its baseline" warning. You only see what someone clicked through to.

Uptime checks miss it entirely — the site responds with a 200, dblog is just a database table, and the database is healthy. Synthetic monitoring on the front end has no visibility into the admin reports. You can have a module logging error-level entries every second while your StatusCake check stays happily green.

The Drupal community's defaults make it worse. Core ships with dblog enabled and row_limit = 1000, which means high-traffic sites silently truncate their own history. Many contrib modules log success cases ("Cache cleared", "Index built", "Email sent to user 42") at notice or info, on the theory that more logging is better. Aggregated across 80 contrib modules on a typical mid-size Drupal site, this produces a constant background of low-value entries.

And dblog has no native regex filter, no rate display, no diff-from-yesterday view. The closest you can get in core is the type/severity dropdowns, which is exactly what fails when there are 60 types and the noisy ones are at the same severity as the real ones.

This is the silent-failure shape: real problems are emitted, recorded, and never read. The data exists; the detection does not.

4. Cause

Every entry in /admin/reports/dblog corresponds to a watchdog.event — a single call to \Drupal::logger($channel)->log($level, $message, $context). The dblog module persists these to the watchdog table; syslog module mirrors them to syslog. The Logystera Drupal agent ships these as watchdog.event signals with the original severity, channel (type), message template, and placeholders.

A watchdog.event signal carries:

  • severity: integer 0–7 (emergency=0, alert=1, critical=2, error=3, warning=4, notice=5, info=6, debug=7)
  • type: the channel — usually a module name (cron, page not found, php, user, system, myproject_payments)
  • message: the un-substituted template (e.g. %type: @message in %function (line %line of %file).)
  • variables: the placeholder values
  • request_uri, referer, uid, hostname, timestamp

The volume problem comes from the fact that the channel and severity are entirely at the discretion of whichever code emitted the log call. Core's "page not found" channel emits a warning for every 404, including bot traffic and asset misses. The "php" channel emits a notice for E_NOTICE-level PHP warnings, and an error for runtime exceptions. There is no governance — the log is whatever the modules say it is.

Logystera's signal-to-noise classification works on this data the same way you would by hand if you had time: it aggregates watchdog.event by (type, severity, message_template) and treats high-volume, low-severity, repetitive templates as noise; it treats low-volume, high-severity, anomalous templates as signal. The result is a derived view where a single new php.fatal-correlated watchdog.event of severity 2 (critical) is louder than 40,000 page-not-found warnings.

Two supporting signals matter here: php.warning (E_WARNING and E_NOTICE caught via Drupal's PHP error handler — these become "php" channel watchdog entries with severity 4 or 5) and php.fatal (E_ERROR, E_PARSE, uncaught exceptions — severity 3 in watchdog, but distinct in PHP-FPM's stderr because they kill the request). perf.hook_timing flags hooks that overrun expected duration — slow logging hooks themselves often contribute to the noise volume.

5. Solution

5.1 Diagnose (logs first)

Stop using the dblog UI. Go to the database and the disk.

Step 1 — Profile the noise. Find which (type, severity) pairs dominate.

-- Run via drush sql:cli (MySQL/MariaDB syntax; on PostgreSQL replace
-- UNIX_TIMESTAMP(NOW() - INTERVAL 1 HOUR) with EXTRACT(EPOCH FROM now() - interval '1 hour'))
SELECT type, severity, COUNT(*) AS n
FROM watchdog
WHERE timestamp > UNIX_TIMESTAMP(NOW() - INTERVAL 1 HOUR)
GROUP BY type, severity
ORDER BY n DESC
LIMIT 20;

This produces the volume distribution. Each row is a candidate watchdog.event cluster. Anything emitting more than ~100/hour at notice or info is almost certainly noise. Anything appearing at severity 0–3 with low count is the signal you want.
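
The heuristic above can be applied mechanically to step 1's output. A sketch, assuming you export the result as tab-separated type/severity/count rows — the sample rows and thresholds here are illustrative:

```shell
# Classify (type, severity, count) rows: severity 0-3 with low volume => signal;
# severity >= 5 with high volume => noise; everything else needs a human look.
result=$(printf 'page not found\t4\t41200\nphp\t3\t3\ncron\t6\t900\n' |
awk -F'\t' '{
  if ($2 <= 3 && $3 < 100)      verdict = "SIGNAL"
  else if ($2 >= 5 && $3 > 100) verdict = "noise"
  else                          verdict = "review"
  print verdict ": " $1
}')
echo "$result"
```

Here the three php errors surface as signal, the cron chatter is marked noise, and the 41,200 warning-level 404s fall into a review bucket rather than drowning the output.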

Step 2 — Pull the actual messages for a suspicious channel.

SELECT severity, message, COUNT(*) AS n,
       MAX(FROM_UNIXTIME(timestamp)) AS last_seen
FROM watchdog
WHERE type = 'php' AND timestamp > UNIX_TIMESTAMP(NOW() - INTERVAL 1 HOUR)
GROUP BY severity, message
ORDER BY n DESC;

The php channel is where the supporting signals live. Severity 3 entries here correspond to php.fatal signals; severity 4 entries map to php.warning. Group by message template, not the substituted text — otherwise every unique URL or user ID looks like a different error.
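
If you only have the substituted text (a plain export rather than the message column), you can approximate template grouping by masking the variable parts. A rough sketch on sample lines — the single sed rule here (mask digit runs) is illustrative, not exhaustive:

```shell
# Mask numbers so per-user and per-line variants collapse into one template.
result=$(printf '%s\n' \
  'Undefined index: mail in render (line 42 of page.inc)' \
  'Undefined index: mail in render (line 42 of page.inc)' \
  'Email sent to user 17' \
  'Email sent to user 93' |
sed -E 's/[0-9]+/N/g' | sort | uniq -c | sort -rn)
echo "$result"
```

The two "Email sent" lines now collapse into one bucket ("Email sent to user N") instead of looking like two distinct events.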

Step 3 — Cross-check with PHP-FPM's stderr. Drupal's PHP error handler catches and logs warnings to watchdog, but true fatals (uncaught exceptions, parse errors, allowed-memory-exhausted) often appear in PHP-FPM's log before — or instead of — watchdog, because the request died before the shutdown handler ran.

grep -E "PHP (Fatal error|Parse error|Uncaught)" /var/log/php-fpm/error.log | tail -50

This produces php.fatal signals. If you see fatals in PHP-FPM that are not in watchdog, your error handler is being bypassed — usually because the fatal happened during bootstrap.
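
The cross-check itself can be scripted with comm. A sketch on sample data — the two extracts below are hypothetical stand-ins for fatal messages pulled from PHP-FPM's stderr and from a watchdog export (e.g. via drush watchdog:show):

```shell
dir=$(mktemp -d)
# Hypothetical extracts: what PHP-FPM saw vs. what watchdog recorded.
printf '%s\n' \
  'Allowed memory size of 134217728 bytes exhausted' \
  'Uncaught Error: Class "App\Gateway" not found' > "$dir/fpm"
printf '%s\n' \
  'Allowed memory size of 134217728 bytes exhausted' > "$dir/watchdog"
sort -o "$dir/fpm" "$dir/fpm"          # comm requires sorted input
sort -o "$dir/watchdog" "$dir/watchdog"
# Lines only in the PHP-FPM extract: fatals that bypassed Drupal's error handler.
missing=$(comm -23 "$dir/fpm" "$dir/watchdog")
echo "$missing"
```

Anything this prints is a fatal that killed the request before Drupal's shutdown handler could log it.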

Step 4 — Check the web server log for 404 floods feeding the "page not found" channel.

awk '$9 == 404 {print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20

If a small set of paths is generating most of the 404s — /wp-login.php, /.env, /sites/default/files/private/, scanner probes — that is the noise feeding watchdog.event type=page not found. These are real watchdog.event signals but they are not failure signals; they are bot signals.
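
To quantify how much of the 404 volume is probe traffic, match the paths against known scanner patterns. A sketch on sample paths — the pattern list is illustrative and nowhere near complete:

```shell
# Count 404 paths that look like scanner probes rather than real misses.
probes=$(printf '%s\n' '/wp-login.php' '/.env' '/node/123' '/phpmyadmin/index.php' |
grep -cE '^/(wp-|xmlrpc\.php|\.env|\.git|phpmyadmin)')
echo "probe-like 404 paths: $probes"
```

In practice you would feed this the path column from the awk one-liner above; a high probe ratio means edge blocking will remove most of the page not found volume.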

Step 5 — Look for slow hooks contributing to log volume. A hook that runs on every page load and logs on success will generate 1 entry per request.

SELECT type, message, COUNT(*) AS n
FROM watchdog
WHERE severity >= 5  -- notice/info/debug
  AND timestamp > UNIX_TIMESTAMP(NOW() - INTERVAL 10 MINUTE)
GROUP BY type, message
HAVING n > 50
ORDER BY n DESC;

If you see an entry with thousands of occurrences in 10 minutes, you have found a logging-on-every-request bug. Cross-reference with perf.hook_timing — the hook that emits it is also probably running slowly, which is why someone added the logging in the first place.

By the end of step 5 you have: the noisy channels, the actual failures hiding in them, the bot-driven 404 floods, and the modules to either silence or fix.

5.2 Root Causes

(root causes are paired inline with their fixes in section 5.3 below)

5.3 Fix

Each fix maps to a specific cause and the signal it changes.

Cause 1: A contrib or custom module logs success cases at notice.

  • Signal pattern: high-volume watchdog.event with severity ≥ 5, single type, repetitive message template.
  • Fix: downgrade to debug, or remove. If it is contrib, file an issue on drupal.org and patch locally. If it is your code, either delete the log call or move it behind if (\Drupal::config('system.logging')->get('error_level') === 'verbose').

Cause 2: The "page not found" channel is dominated by bot traffic.

  • Signal pattern: watchdog.event type=page not found, severity 4, request_uri matching scanner patterns (.env, wp-*, phpmyadmin, .git/config).
  • Fix: block at the edge (Cloudflare WAF, fail2ban), or short-circuit these requests before full bootstrap using core's fast 404 settings ($config['system.performance']['fast_404'] in settings.php) or the contrib Fast 404 module. Suppressing these entries is fine — they are not failures, and you still have the web server access logs for forensics.
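
For the edge-blocking option, a minimal nginx sketch, placed inside the site's server block — the path list is illustrative; derive yours from the access-log analysis in step 4:

```nginx
# Close the connection for common scanner probes before they reach Drupal,
# so they never become "page not found" watchdog entries.
location ~* ^/(wp-login\.php|wp-admin|xmlrpc\.php|\.env|\.git) {
    access_log off;   # optional: keep probes out of the access log too
    return 444;       # nginx-specific: drop the connection with no response
}
```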

Cause 3: Real php.warning entries from a buggy module are flooding the "php" channel.

  • Signal pattern: watchdog.event type=php, severity 4–5, message template referencing the same %function and %file.
  • Fix: trace to the source file and line. Common cases: undefined-index access in render arrays, deprecated function usage after a core update, type-juggling warnings under PHP 8.x. Fix the underlying code; the warnings stop.

Cause 4: A genuine php.fatal is being swallowed by the noise.

  • Signal pattern: watchdog.event type=php, severity 3, low count, correlated with PHP-FPM stderr PHP Fatal error entries.
  • Fix: this is the reason you started filtering. Read the message. Fix the code. Common cases: missing class after a composer.json change, undefined service in DI, typed-property access on NULL. The fatal is a real failure regardless of how loud the log is around it.

Cause 5: dblog row_limit is too low and useful history is being truncated.

  • Signal pattern: you cannot find an event you saw an hour ago.
  • Fix: drush config:set dblog.settings row_limit 1000000 — or, better, install the syslog module, ship to Logystera, and set dblog row_limit low (1,000) since you no longer rely on it for retention. Watchdog becomes a debug convenience; the durable record lives elsewhere.
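
To check how much history the table actually holds before and after changing the limit, a quick query (MySQL syntax, like the diagnostics above):

```sql
-- If oldest_entry is only minutes in the past, row_limit is truncating
-- history faster than you can investigate incidents.
SELECT COUNT(*) AS rows_now,
       FROM_UNIXTIME(MIN(timestamp)) AS oldest_entry
FROM watchdog;
```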

Cause 6: A slow hook (perf.hook_timing flagged) is also a logging hook.

  • Signal pattern: perf.hook_timing for hook_X correlated with high watchdog.event volume from the same module.
  • Fix: the hook is doing too much per request. Move work into a queue. Stop logging on the hot path.

The order matters: fix the noise first (causes 1, 2, 6), then the real warnings (3), then the fatals (4), then retention (5). Filtering before fixing is what turns 47,000 entries into 47.

5.4 Verify

Re-run step 1 from section 5.1. The volume distribution should look fundamentally different.

SELECT type, severity, COUNT(*) AS n
FROM watchdog
WHERE timestamp > UNIX_TIMESTAMP(NOW() - INTERVAL 1 HOUR)
GROUP BY type, severity
ORDER BY n DESC
LIMIT 20;

Healthy looks like:

  • No single (type, severity) pair exceeds ~500/hour for a mid-size site
  • The highest-volume entries are warnings (severity 4), not notice/info chatter (severity 5–6)
  • The "page not found" channel is either disabled or producing reasonable volume
  • The "php" channel is empty or near-empty at severity 3 (no recurring fatals)
  • Total watchdog row count over 1 hour is in the low thousands, not tens of thousands

For the supporting signals, check PHP-FPM:

grep -c "PHP Fatal error" /var/log/php-fpm/error.log

If this count does not increase over 30 minutes of normal traffic, php.fatal is no longer firing and the fix held. If it climbs again, you missed a code path.
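
Since grep -c counts the entire file, old fatals never age out of the number; a baseline-diff sketch tracks only new ones. The paths and log contents below are temporary sample data, not your real log:

```shell
dir=$(mktemp -d)
log="$dir/error.log"; baseline="$dir/count.baseline"
printf 'PHP Fatal error: one\nPHP Warning: x\nPHP Fatal error: two\n' > "$log"
current=$(grep -c "PHP Fatal error" "$log")
previous=$(cat "$baseline" 2>/dev/null || echo 0)   # no baseline yet => 0
echo "$current" > "$baseline"                       # save for the next run
echo "new fatals since last check: $(( current - previous ))"
```

Run something like this from cron against the real log; a non-zero delta is the alert condition.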

For php.warning:

grep "PHP Warning\|PHP Notice" /var/log/php-fpm/error.log | tail -100

The same warning recurring is unfixed code. New warnings are new bugs.

The signal that should stop appearing: high-volume, repetitive watchdog.event clusters at low severity. The signal that should remain visible: low-volume, high-severity watchdog.event entries from real problems. If both hold, the dblog UI is usable again — and more importantly, filtering dblog for real errors is now actual filtering, not scrolling.

Give it a full traffic cycle (24 hours) before declaring victory. Bot traffic and cron-driven log volume are not constant; the spike you cleared at 10 AM may return at 2 AM when the report module runs.

6. How to Catch This Early

Fixing it is straightforward once you know the cause. The hard part is knowing it happened at all.

The fix above is reactive — you waited until dblog was unusable, then dug in. The real problem is not fixing watchdog noise once. It is knowing, in real time, when the volume profile changes.

Drupal does not alert you when the php channel suddenly starts emitting 100 entries/minute. Cron-driven log truncation does not alert you when it deletes the entry that explained the outage. There is no built-in baseline for "what does normal log volume look like for this site." Every Drupal team rediscovers this the hard way.

This type of issue surfaces as watchdog.event, which Logystera ingests directly from the dblog/syslog stream and classifies by (type, severity, message_template) — the same grouping you would do by hand at 2 AM. The signal-to-noise classification suppresses high-volume routine entries and elevates anomalous ones, so a single new php.fatal-class watchdog.event is alertable even when 40,000 page-not-found warnings are also flowing through. Supporting signals (php.warning, php.fatal, perf.hook_timing) are correlated automatically — when a slow hook starts logging, you see one alert, not three.

The point is not the dashboard. The point is that the failure became audible the moment it started, instead of the next time someone happened to open /admin/reports/dblog.

7. Related Silent Failures

  • Drupal cron silent failure — cron.run heartbeat absence; queue processing stalls; watchdog.event type=cron severity 3 entries lost in noise
  • Drupal page not found bot floods — watchdog.event type=page not found cardinality explosion; scanner probes; consider disabling 404 logging entirely
  • Drupal PHP fatal after composer update — php.fatal from missing class; appears in PHP-FPM stderr before watchdog catches it
  • Drupal slow hook execution — perf.hook_timing correlated with watchdog.event from the same module; logging-on-hot-path bugs
  • Drupal dblog table growth and DB write contention — when row_limit is high and noise is uncontrolled; insert latency on shared tables

See what's actually happening in your Drupal system

Connect your site. Logystera starts monitoring within minutes.
