Drupal site down — database connection refused (PDOException 2002 / 1045)
1. Problem
Your Drupal site is gone. Every page returns the same generic error and the homepage, /user/login, /admin, and any JSON:API endpoint all serve the same wall of text:
PDOException: SQLSTATE[HY000] [2002] Connection refused
Or, if the credentials are the issue and not the host:
PDOException: SQLSTATE[HY000] [1045] Access denied for user 'drupal'@'10.0.4.21' (using password: YES)
The browser shows a stack trace if $config['system.logging']['error_level'] is verbose, or just "The website encountered an unexpected error" if not. drush status hangs and then reports the database connection as failed. Five minutes ago everything worked. This is the textbook "site down, database connection refused" scenario, and it surfaces as a db.connection_failed signal in your error log before the first user complains.
The standard advice — restart MySQL, clear the cache — is wrong half the time. Two different failure modes hide behind the same blank page: the database is unreachable (2002) or the database rejects the credentials (1045). They look identical in the browser and need different fixes.
2. Impact
A Drupal site that can't open a PDO connection is fully offline. The router, entity API, form system, and the cache backend (if you use the DB cache) all fail at bootstrap. Anonymous traffic can keep flowing for a few minutes if you have a reverse-proxy cache (Varnish, Cloudflare APO), but anything authenticated — Commerce checkout, member portals, /admin/content — is broken from the first request.
For a Drupal Commerce store, abandoned carts and failed commerce_payment redirects can corrupt order state — Drupal writes the order row and then the payment row, so a DB blip mid-checkout produces orphaned orders that surface two days later as "I was charged but didn't get the receipt." For a publisher with paywalled Drupal 10 content, every minute is refund tickets and churn risk.
There's a quieter cost: watchdog (the dblog module) cannot write its own log entry when the database is the failure mode — your usual diagnostics live in the thing that broke. drupal_php_error_total and drupal_server_errors_total stack up unread, and by the time someone tails /var/log/php-fpm/error.log, the original cause may already be overwritten by the symptom cascade.
3. Why It’s Hard to Spot
Drupal's failure mode here is uniquely opaque. When PDO fails at bootstrap, the Symfony kernel catches the PDOException and renders a generic error page. The user-facing output contains nothing diagnostic — by design, to avoid leaking settings.php in a 500 response. And Drupal's structured logger (dblog) also lives in the database that just failed, so watchdog cannot record the outage.
Standard uptime monitors miss this for three compounding reasons:
- The web server (nginx + PHP-FPM) is still healthy. Only the body is wrong.
- Drupal's exception handler returns 500, but a CDN page cache keeps serving the cached homepage at 200 long after /admin is broken.
- drush status prints partial output and exits non-zero, but most cron health checks only grep for "Successful".
Hosting dashboards make it worse. cPanel, Pantheon, Acquia, Platform.sh all surface "MySQL service: running" as the health metric. None check that this site's username@host combination can authenticate against the right schema. If a DBA rotated the password during off-hours, the dashboard stays green while every Drupal install on the box is dead.
The result is a silent failure: visitors see a broken page, monitoring sees green, and the only place the truth lives is in PHP's stderr and MySQL's error log.
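The grep-for-"Successful" trap has a cheap fix: make the cron health check exercise the database through Drupal's own credentials instead of trusting the web server. A minimal sketch (the drush invocation in the usage comment is an assumption about a standard docroot path):

```shell
# DB-aware health check: green only if Drupal's own credentials can run a query.
# Wrapped in a function so the probe command is swappable; the drush path in
# the usage comment is an assumption about your install.
db_check() {
  # $@ = a command that exits 0 iff "SELECT 1" succeeded through Drupal's stack
  if "$@" >/dev/null 2>&1; then
    echo "OK"
  else
    echo "CRITICAL: Drupal cannot query its database" >&2
    return 2
  fi
}

# Usage from cron (exit code 2 is what your monitoring should page on):
# db_check drush -r /var/www/drupal/web sql:query "SELECT 1;"
```

Paging on the exit code instead of grepping stdout closes the gap described above: a 2002 and a 1045 both fail the query, both return non-zero, and both alert.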
4. Cause
Drupal's \Drupal\Core\Database\Connection::__construct() calls into PDO at bootstrap, immediately after settings.php loads. PDO opens a TCP (or Unix) socket to the host in $databases['default']['default']['host'], then sends auth with the configured user and password.
If the TCP connection fails — host unreachable, port closed, MySQL daemon down, firewall blocking — PDO throws PDOException with SQLSTATE HY000 and driver code 2002. The Logystera Drupal agent's DatabaseErrorHandler matches the message against /gone away|lost connection|connection refused|can.t connect/i and emits a db.connection_failed signal with the SQLSTATE attached.
If TCP connects but credentials fail, MySQL returns error 1045 and PDO throws PDOException with SQLSTATE 28000. The same handler matches /access denied for user/i and emits db.access_denied — distinct from db.connection_failed because the fix is different.
Because dblog depends on PDO, watchdog never records the failure. The Logystera agent emits db.connection_failed directly to its in-process signal buffer, which flushes to the gateway out-of-band — independent of dblog — so the signal survives even when the database is the thing that broke.
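The handler's two regexes can be mirrored in grep to classify a raw message by hand. This is an illustrative sketch of the classification logic, not the agent's actual code; emit_signal is a hypothetical stand-in for the in-process buffer:

```shell
# Classify a PDO error message the way the DatabaseErrorHandler is described
# to: connection-style messages -> db.connection_failed, auth failures ->
# db.access_denied. Anything else falls through as unclassified.
emit_signal() {
  msg="$1"
  if printf '%s' "$msg" | grep -qiE 'gone away|lost connection|connection refused|can.t connect'; then
    echo "db.connection_failed"
  elif printf '%s' "$msg" | grep -qi 'access denied for user'; then
    echo "db.access_denied"
  else
    echo "unclassified"
  fi
}

emit_signal "SQLSTATE[HY000] [2002] Connection refused"   # prints: db.connection_failed
```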
5. Solution
5.1 Diagnose (logs first)
The diagnosis path is mechanical: confirm the failure mode in PHP's log, separate "host unreachable" from "auth rejected," and time-correlate with the most recent change window.
1. PHP error log — confirms Drupal's PDO failed and tells you the SQLSTATE.
# Drupal 9/10 with PHP-FPM (Debian/Ubuntu)
tail -n 500 /var/log/php-fpm/error.log | grep -iE "PDOException|SQLSTATE\[HY000\]|SQLSTATE\[28000\]"
The line you want looks like this — the [2002] or [1045] is the diagnostic key:
PHP Fatal error: Uncaught PDOException: SQLSTATE[HY000] [2002] Connection refused
in /var/www/drupal/web/core/lib/Drupal/Core/Database/Driver/mysql/Connection.php:99
That stack frame in Driver/mysql/Connection.php is the literal bootstrap PDO call. This is what produces db.connection_failed in the Logystera agent.
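Extracting the driver code mechanically saves re-reading the log under pressure. A small sketch using the sample line above; for a live log you would feed `grep -m1 PDOException` output into the same filter:

```shell
# Pull the bracketed 4-digit driver code out of a PDOException line.
# 2002 -> host unreachable, 1045 -> auth rejected.
line='PHP Fatal error: Uncaught PDOException: SQLSTATE[HY000] [2002] Connection refused'
code=$(printf '%s\n' "$line" | grep -oE '\[[0-9]{4}\]' | head -n 1 | tr -d '[]')
echo "driver code: $code"   # prints: driver code: 2002
```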
2. MySQL error log — confirms whether MySQL even saw the connection.
# Default locations: Debian/Ubuntu and RHEL/CentOS
grep -iE "aborted connection|access denied|too many connections" \
/var/log/mysql/error.log /var/log/mysqld.log 2>/dev/null | tail -n 20
What you see decides everything:
- Nothing for the time window → MySQL never received the TCP attempt → produces db.connection_failed with code 2002. The problem is between Drupal and MySQL: hostname, firewall, or MySQL is down.
- Access denied for user 'drupal'@'10.0.4.21' → produces db.access_denied with code 1045. Auth failure.
- Aborted connection ... Got timeout reading communication packets → an intermittent network or wait_timeout issue; surfaces later as db.connection_failed mid-request.
3. Test from Drupal's perspective with drush sql:query.
This is the single most useful command in this guide:
drush -r /var/www/drupal/web sql:query "SELECT 1;" 2>&1
# or, if drush can't fully bootstrap, run the client command that sql:connect prints:
$(drush -r /var/www/drupal/web sql:connect) -e 'SELECT 1;'
If drush sql:query returns 1, your settings.php credentials are valid and the failure is intermittent — likely connection-pool exhaustion (max_connections) under load, not a static config bug. Watch the MySQL error log live with tail -f /var/log/mysql/error.log and reproduce.
If it errors with SQLSTATE[HY000] [2002], the host in $databases is unreachable. If it errors with [1045], your settings.php password is wrong for the user/host pair.
4. Time-correlate with the most recent change window.
db.connection_failed events almost never happen in isolation. They cluster around a specific real-world event: a deploy, a database password rotation, a composer update that re-wrote settings.php, an off-hours backup window, a security-group change in your VPC. Find the change.
# Did anything touch settings.php in the last 24h?
ls -la /var/www/drupal/web/sites/default/settings.php
ls -la /var/www/drupal/web/sites/default/settings.local.php 2>/dev/null
# Was Drupal's deployment touched? (composer / drush deploy timestamps)
stat -c '%y %n' /var/www/drupal/composer.lock /var/www/drupal/web/sites/default/services.yml
# What did the deploy log say at the time of the first failure?
journalctl --since "30 minutes ago" -u php8.3-fpm | grep -iE "PDO|SQLSTATE"
If the first PDOException in php-fpm/error.log lines up with the mtime on settings.php or with a composer update window, you have your answer: somebody's deploy rewrote the database stanza or the password is now stale. That correlation is what turns "the site is down" into "the site has been emitting db.connection_failed since 02:14 UTC, immediately after the nightly DB password rotation cron at 02:13 UTC."
5.2 Root Causes
Each cause maps to a specific Drupal-side fix and a specific signal. Prioritized by frequency.
- settings.php host wrong or stale — host, port, or unix_socket in the $databases array points to a host that is no longer reachable. Common after a DB migration, a managed-hosting upgrade (Acquia/Pantheon shuffle), or a Docker Compose change. Produces db.connection_failed with SQLSTATE[HY000] [2002] and no MySQL log entry (MySQL never sees the connection).
- MySQL/MariaDB daemon down — the service crashed (OOM kill, disk full, corrupt InnoDB log) or never started after a host reboot. Produces db.connection_failed with SQLSTATE[HY000] [2002], and journalctl -u mysql will show the crash reason.
- DB user password rotated, settings.php not updated — a DBA, a secrets-manager rotation policy, or a managed hosting "auto-rotate" changed the password, but the deploy that updates settings.php hasn't run. Produces db.access_denied with SQLSTATE[28000] [1045] and "Access denied for user" in the MySQL error log.
- DB user dropped or its host grant changed — Drupal moved to a new instance with a new private IP, but the MySQL grant is still 'drupal'@'old.ip'. Produces db.access_denied (1045) with Access denied for user 'drupal'@'NEW_IP' in MySQL's log — the host in the error message is the smoking gun.
- Firewall / security group blocking — a security-group revision (AWS), an iptables change, or a VPC route change cut Drupal off from RDS. Produces db.connection_failed (2002) with "Connection timed out" rather than "Connection refused" (refused = the host is reachable but nothing is listening on that port; timed out = packets are being dropped).
- Connection pool exhausted — under load, MySQL's max_connections is hit and Drupal's PDO call times out waiting for a slot. Produces db.connection_limit (MySQL error 1040, "Too many connections") intermixed with db.connection_failed, cluster-correlated with drupal_request_errors_total 5xx spikes during traffic peaks.
5.3 Fix
Match the fix to what db.connection_failed told you, not to a guess.
Cause A — settings.php host stale: confirm the stanza, then update the host. If settings.php is generated (Pantheon, Lando, ddev), regenerate rather than hand-edit.
grep -A 8 "\$databases\['default'\]\['default'\]" \
/var/www/drupal/web/sites/default/settings.php
drush cr && drush sql:query "SELECT 1;"
Cause B — MySQL daemon down: read the journal before restarting. If OOM killed it, raise instance memory or tune innodb_buffer_pool_size first — restarting a memory-starved DB without fixing the cause guarantees the same outage in 20 minutes.
systemctl status mariadb
journalctl -u mariadb --since "1 hour ago" | tail -n 100
Cause C — Password rotated: update both sides — MySQL and settings.php (or your secrets-injection mechanism: env vars, settings.local.php, Pantheon's secret API). Run drush cr.
ALTER USER 'drupal'@'%' IDENTIFIED BY 'new_password_from_secrets_manager';
FLUSH PRIVILEGES;
Cause D — Grant host wrong: the host in MySQL's Access denied log line is the smoking gun.
SELECT user, host FROM mysql.user WHERE user = 'drupal';
RENAME USER 'drupal'@'old.ip' TO 'drupal'@'new.ip';
FLUSH PRIVILEGES;
Cause E — Firewall blocking: nc -zv $DB_HOST 3306 — connection refused means the daemon isn't listening; timeout means packets are being dropped (security group / iptables). Drupal won't recover until the TCP handshake completes.
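If nc isn't installed on the web host, bash's /dev/tcp answers the same refused-versus-timed-out question. A sketch; the host and port you pass should be the values from your $databases stanza:

```shell
# TCP-layer probe: separates the three outcomes that matter for Cause E.
probe_db_port() {
  # $1 = host, $2 = port
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
  case $? in
    0)   echo "open: something is listening (auth is the next question)" ;;
    124) echo "timed out: packets are being dropped (security group / iptables)" ;;
    *)   echo "refused: host reachable but nothing listening on that port" ;;
  esac
}

# probe_db_port db.internal 3306   # db.internal is a placeholder hostname
```

"timed out" points you at network ACLs; "refused" points you back at the daemon and its bind/port configuration.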
Cause F — max_connections exhausted: kill long-running queries with KILL <processlist_id> (the id comes from SHOW PROCESSLIST), raise max_connections if RAM allows, and find the slow query in the slow-query log. A common Drupal pattern: a misbehaving Views block runs an unindexed node__field_* join on every page render and holds connections through traffic spikes.
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';
SHOW PROCESSLIST;
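To turn those two numbers into a single headroom figure, a quick awk ratio; 147 and 151 below are illustrative sample values, not output from a real server:

```shell
# Headroom check: how much of the max_connections pool is in use.
# threads/maxconn are illustrative; substitute the values from
# SHOW STATUS LIKE 'Threads_connected' and SHOW VARIABLES LIKE 'max_connections'.
threads=147
maxconn=151
awk -v t="$threads" -v m="$maxconn" \
  'BEGIN { printf "using %d/%d connections (%.0f%% of the pool)\n", t, m, 100 * t / m }'
# prints: using 147/151 connections (97% of the pool)
```

A sustained ratio near the limit during normal traffic means the next spike produces 1040s; fix the query holding the slots before (or alongside) raising the limit.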
5.4 Verify
You're looking for two things to hold simultaneously: db.connection_failed events stop appearing, and drupal_server_errors_total returns to baseline.
# Should be empty for at least 15 minutes under normal traffic:
grep -E "PDOException.*SQLSTATE\[HY000\] \[2002\]" /var/log/php-fpm/error.log | tail -n 5
grep -i "access denied" /var/log/mysql/error.log | tail -n 5
# Should return 1 instantly on three consecutive runs:
for i in 1 2 3; do drush -r /var/www/drupal/web sql:query "SELECT 1;"; sleep 5; done
In Logystera's entity view, healthy state for a Drupal site looks like: zero db.connection_failed events for 30 minutes, drupal_server_errors_total 5xx rate under 0.5% of requests, and drupal_php_error_total back to its normal background of 0–3/hour (deprecation noise from contrib modules). If drupal_php_error_total settles but db.connection_failed is still firing 1–2/minute, you fixed the symptom (e.g., bumped max_connections) without fixing the cause (the slow query holding the slots).
The baseline matters: a healthy production Drupal site emits roughly 0 db.connection_failed per day. Unlike PHP deprecations, this signal has no expected baseline noise — any non-zero rate over 5 minutes is anomalous. If the rate stays at 0 for an hour under your normal traffic peak, the issue is resolved.
If db.connection_failed reappears within an hour, you addressed a symptom, not the cause. Go back to the MySQL error log.
6. How to Catch This Early
Fixing it is straightforward once you know the cause. The hard part is knowing it happened at all.
This issue surfaces as db.connection_failed.
Everything you just did manually — grep php-fpm/error.log for PDOException, separate [2002] from [1045], time-correlate with the deploy window, confirm with drush sql:query — Logystera does automatically. The Drupal agent's DatabaseErrorHandler registers as a logger backend and emits db.connection_failed to the gateway out-of-band the instant PDO throws — independent of dblog, which would otherwise be unable to write to the database that just failed.
[Screenshot: Logystera dashboard — db.connection_failed rate, last 24h. Spike at 02:14 UTC, immediately after the nightly DB password rotation window.]
The rule that fires is id 432 — Drupal database connection failure, severity critical, threshold 1 event in 60 seconds. No smoothing: a single db.connection_failed in production triggers. Because the healthy baseline is zero, a threshold of 1 has no false positives in practice.
[Screenshot: Logystera alert — Drupal database connection failure. Critical alert fires within 60s of the first db.connection_failed event, including the SQLSTATE and the affected entity.]
The alert payload includes the timestamp, the SQLSTATE (so you know [2002] vs [1045] before you open a terminal), the affected entity name, and the truncated message excerpt that the agent sanitizes (with VALUES (...) substitution to scrub PII). That's enough to decide which of the six root causes in §5.2 you have, from the alert body alone.
Logystera turns this kind of failure from a customer-reported emergency at 02:30 into a 60-second notification carrying the SQLSTATE that proves which failure mode you have.
7. Related Silent Failures
- db.deadlock and db.lock_timeout — same signal family, but the DB is reachable. Surfaces during heavy commerce_order writes or batch imports.
- db.connection_limit (1040 — too many connections) — distinct from 2002. DB is up, the password works, but max_connections is exhausted. Often precedes db.connection_failed during traffic spikes.
- db.access_denied (1045) — the paired signal for the auth-failure side of this guide. Same blank-page symptom; the fix is in mysql.user grants.
- DB cache backend unreachable — cache_default and cache_render live in MySQL, so Drupal can't read its own router cache, which is why drush cr itself fails during the outage. Surfaces as drupal_php_error_total with DatabaseExceptionWrapper.
- dblog writes failing silently — when the DB is the failure mode, watchdog cannot record its own failure. This is why log monitoring that depends on dblog always misses DB outages.
See what's actually happening in your Drupal system
Connect your site. Logystera starts monitoring within minutes.