WordPress cron reliability — using max vs average delay to find the blocking job

1. Problem

Your WordPress site looks healthy. Posts publish, logins work, the homepage is fast. But the average delay between a scheduled WP-Cron event's intended run time and its actual run time has been creeping up for days. Yesterday the average was 12 seconds. Today it's 4 minutes. wp cron event list now shows (now - 14 minutes) against half the registered hooks.

This is the "wordpress cron jobs not running overdue" scenario, and on a low-traffic site it surfaces silently as a creeping wp_cron_overdue_count over hours, not minutes. Nothing has crashed. WP-Cron is firing. But every time it runs, one specific hook eats the entire request budget, and the rest of the queue shifts to the next request that nobody is making. That's the symptom users finally report:

"Our WooCommerce abandoned cart email went out four hours late. The transactional ones too. And the daily Akismet recheck hasn't run since Tuesday."

This is not scheduled posts not publishing (acute, fixable in 5 minutes), not WP-Cron being disabled (visible immediately), not a single zombie hook stuck for years. It's the operational case where cron IS firing, but one slow job is starving every other hook in the same request.

2. Impact

Once wp_cron_overdue_count climbs, every time-sensitive feature begins to lie. WooCommerce's woocommerce_cleanup_personal_data runs eight hours late, putting GDPR deletion past compliance. action_scheduler_run_queue falls behind, so failed-payment retries don't fire and customer notifications don't send. Newsletter plugins (MailPoet, FluentCRM) queue thousands of emails that send hours later, by which time the campaign link is dead or the recipient already converted on a competitor.

For a 100k-pageview/month membership site, a 4-hour cron lag compounds into ~200 missed renewal-reminder emails per day, 5–12% of which would have converted to retained subscribers. At an LTV of $84, that's $840–$2,000/day in silent churn. The CFO never sees a refund spike — just a slow bleed in retention numbers two months later.

The quietest cost is observability. Many monitoring stacks report cron health as yes/no based on whether wp-cron.php returns 200. It does. Uptime is green. The only place the truth lives is in the cron option array and plugin queue tables. No human is grepping those at 2 a.m.

3. Why It’s Hard to Spot

WordPress's cron model is uniquely deceptive on this failure. WP-Cron is a pseudo-cron: it runs on visitor traffic, not a system clock. When a page is requested, WordPress checks if any events are due, then spawns a non-blocking request to wp-cron.php that iterates through due events serially in a single PHP process. That run holds a lock for WP_CRON_LOCK_TIMEOUT (default 60s); once the lock lapses, a later request can claim it, and the original run stops after the event it is currently executing.

That model has three failure modes that compound:

Traffic starvation. On a low-traffic site (5 visitors/hour overnight), there may be no request to trigger WP-Cron between 02:00 and 07:00. Every hook scheduled in that window queues up. By morning, wp_cron_overdue_count is 40+ deep, and the first visitor at 07:03 fires a single run that processes maybe two of them before lock timeout.
One slow job blocks the queue. Events run serially. If woocommerce_run_update_callback takes 38 seconds (realistic ALTER TABLE on a multi-million-row order table), and WP_CRON_LOCK_TIMEOUT is 60, only one or two other hooks fit in the remaining window. Everything else postpones to the next trigger — hours away on a low-traffic site.
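The dashboard lies. WP Admin > Tools shows nothing about cron. Site Health includes a scheduled-events check, but it only flags that an event is late right now; it keeps no history and does not show which hook is holding up the queue. wp cron event list requires SSH. By the time someone runs it, the queue is 3 hours late and the question is "since when?", which the WordPress UI cannot answer.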
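The way a single slow job crowds out the rest can be sketched as a toy model. This is a simplification, not WordPress's actual scheduler: it assumes a run keeps starting new events until elapsed time reaches the budget, and the hook names and durations below are illustrative.

```shell
# Toy model: events run serially; once elapsed time reaches the budget,
# the remaining events are deferred to the next trigger.
# Input lines: "<hook> <duration_seconds>", in queue order.
simulate_run() {
  awk -v budget="$1" '
    BEGIN { elapsed = 0; budget += 0 }
    elapsed < budget { elapsed += $2; print "ran      " $1; next }
                     { print "deferred " $1 }
  '
}

printf '%s\n' \
  "woocommerce_run_update_callback 38" \
  "action_scheduler_run_queue 15" \
  "wp_version_check 9" \
  "akismet_scheduled_delete 4" \
  "woocommerce_cleanup_logs 6" | simulate_run 60
```

With the 38-second job at the head of a 60-second budget, two more hooks run and the remaining two are deferred. Move the slow job out of the list and all four short hooks fit.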

That's why cron drift is the most-common, latest-detected silent failure on production WordPress. Uptime monitors miss it. Plugin error logs miss it (no errors — just lateness). Standard APM misses it (the gap between scheduled and actual is a derived metric nobody computes).

4. Cause

The Logystera WordPress agent samples the cron array on every wp_loaded and emits wp_cron_overdue_count — the number of scheduled events whose time is in the past at sample time.

Healthy wp_cron_overdue_count is bursty but bounded: it spikes briefly as events become due, then drops to near zero as WP-Cron drains them. Unhealthy wp_cron_overdue_count ratchets up — each sample shows a higher floor than the previous, and the queue never fully drains.
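One way to tell the two shapes apart mechanically is to compare the floor (minimum) of consecutive sample windows. The sketch below is a hypothetical heuristic, not the agent's own logic; the function name, the window size, and the sample values are all assumptions.

```shell
# floor_trend WINDOW_SIZE
# Reads one overdue-count sample per line on stdin, takes the minimum of each
# full window, and reports whether every window floor is higher than the last.
floor_trend() {
  awk -v w="$1" '
    {
      v = $1 + 0
      if ((NR - 1) % w == 0 || v < min) min = v   # new window, or new low in this window
      if (NR % w == 0) floors[++n] = min           # window complete: record its floor
    }
    END {
      if (n < 2) { print "insufficient data"; exit }
      rising = 1
      for (i = 2; i <= n; i++) if (floors[i] <= floors[i - 1]) rising = 0
      print (rising ? "ratcheting" : "bounded")
    }'
}

printf '%s\n' 0 8 1 0 12 0 0 9 2 | floor_trend 3        # spikes that drain back to zero
printf '%s\n' 5 12 6 9 15 10 14 22 16 | floor_trend 3   # floor rises window over window
```

The first series drains to zero in every window, so its floor never moves; the second has floors of 5, 9, and 14, which is the ratchet.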

The diagnostic key is the gap between wp_cron_avg_delay_seconds and wp_cron_max_delay_seconds, read as the ratio max/avg (both are now − scheduled_time, taken across overdue events). When max ≈ avg (avg=8s, max=12s), the queue is uniformly behind: traffic-starvation drift, not a blocking job. When max is dramatically larger than avg (avg=45s, max=14400s), one or two hooks are sitting at the front of the queue refusing to run while everything else is barely late. That divergence is the blocking-job fingerprint.

The supporting signal wp_cron_duration_ms_sum records cumulative execution time per hook. A hook with duration_ms_sum an order of magnitude above peers is the suspect — eating the request budget while WP_CRON_LOCK_TIMEOUT bounces everything else.
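"An order of magnitude above peers" can be made concrete by comparing each hook's total against the median. The sketch below assumes you have exported per-hook totals as "<hook> <duration_ms_sum>" lines; that export format, the function name, and the 10x-median cutoff are assumptions for illustration, not the agent's documented behavior.

```shell
# duration_suspects: reads "<hook> <duration_ms_sum>" lines on stdin and prints
# every hook whose total is at least 10x the median of all hooks.
duration_suspects() {
  sort -k2,2n | awk '
    { hook[NR] = $1; ms[NR] = $2 + 0 }
    END {
      if (NR == 0) exit
      median = (NR % 2) ? ms[(NR + 1) / 2] : (ms[NR / 2] + ms[NR / 2 + 1]) / 2
      for (i = 1; i <= NR; i++)
        if (median > 0 && ms[i] >= 10 * median) print hook[i]
    }'
}

# Illustrative totals in milliseconds.
printf '%s\n' \
  "wp_version_check 900" \
  "akismet_scheduled_delete 400" \
  "woocommerce_cleanup_logs 1200" \
  "action_scheduler_run_queue 2500" \
  "woocommerce_run_update_callback 38412" | duration_suspects
```

The median is used instead of the mean so that the outlier itself does not drag the comparison point upward.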

5. Solution

5.1 Diagnose (logs first)

The diagnosis path is mechanical: read the cron option, compute max-vs-avg, identify the blocking hook by duration, time-correlate with the recent low-traffic window or plugin update.

1. Dump the cron queue and confirm overdue events.

wp --path=/var/www/html cron event list --fields=hook,next_run_relative,recurrence \
   --format=table | head -n 30

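The column to read is next_run_relative. Some WP-CLI versions print overdue events simply as now; if so, add next_run_gmt to --fields and compare it against the current UTC time. Anything with a (now -) prefix is overdue and contributes to wp_cron_overdue_count: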

+------------------------------------+---------------------+------------+
| hook                               | next_run_relative   | recurrence |
+------------------------------------+---------------------+------------+
| wp_version_check                   | (now - 4 hours)     | twicedaily |
| wp_update_themes                   | (now - 4 hours)     | twicedaily |
| woocommerce_cleanup_logs           | (now - 3 hours)     | daily      |
| action_scheduler_run_queue         | (now - 9 seconds)   | every 1min |
| akismet_scheduled_delete           | (now - 12 minutes)  | daily      |
+------------------------------------+---------------------+------------+

If the same hooks are still overdue, with deltas that keep growing, on three consecutive wp cron event list calls 60 seconds apart, the queue is not draining. That's wp_cron_overdue_count ratcheting.
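The three-snapshot comparison can be scripted instead of eyeballed. This sketch assumes you have already saved each snapshot as a file with one overdue hook name per line; how you extract those names from the listing is left to you, and the function name is hypothetical.

```shell
# stuck_hooks FILE...
# Prints the hooks that appear in every snapshot file passed in.
stuck_hooks() {
  for f in "$@"; do sort -u "$f"; done | sort | uniq -c |
    awk -v n="$#" '$1 == n { print $2 }'
}

# Usage, with three snapshots taken 60 seconds apart (file names are assumptions):
#   stuck_hooks overdue-t0.txt overdue-t60.txt overdue-t120.txt
```

A hook that drains between snapshots drops out of the result; whatever remains has been overdue for the whole window.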

2. Compute the max-vs-avg delta directly from the option table.

wp --path=/var/www/html eval '
// Keep the worst (largest) delay per hook across all overdue timestamps.
$cron = _get_cron_array() ?: []; $now = time(); $delays = [];
foreach ($cron as $ts => $hooks) {
    if ($ts >= $now) continue; // not due yet
    foreach ($hooks as $h => $_) { $delays[$h] = max($delays[$h] ?? 0, $now - $ts); }
}
if (!$delays) { echo "queue clean\n"; exit; }
echo "overdue: " . count($delays) . "\n";
echo "avg: " . round(array_sum($delays) / count($delays), 1) . "s\n";
echo "max: " . max($delays) . "s\n";
echo "blocker: " . array_search(max($delays), $delays) . "\n";'

Read the ratio. max/avg < 3 = traffic-starvation drift, queue uniformly behind because nothing triggers WP-Cron often enough. max/avg > 10 = one specific hook is the blocker, and blocker: names it. That hook is the one you investigate next. The same arithmetic is what the agent runs on every emit to compute wp_cron_max_delay_seconds and wp_cron_avg_delay_seconds.
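The ratio thresholds above translate into a small classifier you can feed with the avg and max values from the snippet. The function name is hypothetical, and the "inconclusive" label for ratios between 3 and 10 is an assumption: the guide only defines the two ends.

```shell
# classify_drift AVG_SECONDS MAX_SECONDS
classify_drift() {
  awk -v avg="$1" -v max="$2" 'BEGIN {
    if (avg <= 0) { print "queue clean"; exit }
    r = max / avg
    if (r < 3)       print "traffic-starvation drift"
    else if (r > 10) print "blocking job"
    else             print "inconclusive"   # assumption: band not defined in this guide
  }'
}

classify_drift 8 12       # uniform lateness
classify_drift 45 14400   # one hook far behind the rest
```

For a ratio in the middle band, take another sample a few minutes later before committing to a diagnosis.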

3. Identify the slow job from wp_cron_duration_ms_sum.

If the agent has been emitting for any meaningful window, the blocker has a wp_cron_duration_ms_sum an order of magnitude above the rest. Without the agent:

time wp --path=/var/www/html cron event run woocommerce_run_update_callback
# 38.412s real time → this is the queue eater

Anything over 30 seconds in a default install will starve the rest of the queue under WP_CRON_LOCK_TIMEOUT=60.

4. Time-correlate with the traffic-starvation window or last plugin update.

# Did the avg-delay creep start after a plugin auto-update?
grep -iE "auto.*update|plugin.*activated" /var/www/html/wp-content/debug.log | tail -n 20

# Cross-reference with overnight traffic in the access log
awk '$4 ~ /\[..\/...\/....:0[2-6]/ {n++} END {print "low-traffic hits:", n+0}' \
   /var/log/nginx/access.log

The real-site pattern: wp_cron_avg_delay_seconds flat at 5–10s during the day, balloons to 1800s+ between 02:00 and 07:00 (traffic starvation), then partially recovers when the first morning visitor triggers a run. wp_cron_overdue_count shows this as a daily sawtooth with a slowly rising floor: the tail of the queue, stuck behind the blocker, that never gets to run.
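To see the starvation window itself, not just a single count, bucket the access log by hour. A minimal sketch, assuming the default nginx combined log format, where field 4 looks like [10/Oct/2024:02:14:07; the function name and the sample lines are illustrative.

```shell
# hits_per_hour: reads access-log lines on stdin and prints "<HH> <count>"
# for all 24 hours, including hours with zero hits.
hits_per_hour() {
  awk '
    { split($4, t, ":"); count[t[2]]++ }   # t[2] is the two-digit hour
    END {
      for (h = 0; h < 24; h++) {
        k = sprintf("%02d", h)
        printf "%s %d\n", k, count[k]
      }
    }'
}

hits_per_hour <<'EOF'
203.0.113.7 - - [10/Oct/2024:02:14:07 +0000] "GET / HTTP/1.1" 200 512
203.0.113.9 - - [10/Oct/2024:14:01:55 +0000] "GET /shop/ HTTP/1.1" 200 2048
EOF
```

On a real host, run hits_per_hour < /var/log/nginx/access.log. A run of zero-hit hours that lines up with the avg-delay balloon points at traffic starvation.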

5.2 Root Causes

Each cause maps to a specific signal pattern. Prioritized by frequency.

Traffic starvation (most common). Site has < 1 request every 5 minutes during off-hours. WP-Cron has nothing to fire from. Produces wp_cron_overdue_count rising linearly during low-traffic windows, with max ≈ avg (uniform drift). wp_http_requests_total shows the gap. Fix is system cron.
One slow hook blocks the queue (this guide's case). A plugin hook (Action Scheduler, MailPoet queue, broken cron handler) takes 30+ seconds. Produces a huge max/avg ratio (avg=20s, max=14400s). wp_cron_duration_ms_sum for that one hook dominates the chart.
WP_CRON_LOCK_TIMEOUT exhaustion. Cron runs hit the 60s wall, the lock isn't released cleanly, the next run can't start until the transient expires. Produces wp_cron_errors_total with reason lock_timeout alongside rising wp_cron_overdue_count.
DISABLE_WP_CRON set without system cron. define('DISABLE_WP_CRON', true) added to wp-config.php but no matching * * * * * curl /wp-cron.php crontab entry. Produces wp_cron_overdue_count rising monotonically forever, wp_http_requests_total to /wp-cron.php flat at zero.
Plugin update broke a hook. A cron callback fatals. Produces wp_cron_errors_total per run; the hook reappears immediately because the next-scheduled time was set before the failure. Time-correlates with wp.state_change.
DB lock contention. Row-level lock on the cron option during long INSERTs elsewhere. Hooks miss their slot, push to the next cycle, miss again. Produces wp_cron_overdue_count with a sawtooth aligned to backup windows.

5.3 Fix

Match the fix to what the max-vs-avg delta told you, not to a guess.

Cause A — Traffic starvation: disable WP-Cron, use system cron. Canonical fix for any site under ~10k pageviews/day.

# wp-config.php
define('DISABLE_WP_CRON', true);

# crontab -e (web user)
* * * * * curl -s "https://example.com/wp-cron.php?doing_wp_cron" > /dev/null 2>&1

After this, wp_http_requests_total to /wp-cron.php holds steady at 1/minute regardless of human traffic, and wp_cron_avg_delay_seconds stops creeping during off-hours.
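The two halves of this fix fail independently, so it is worth checking them together. The sketch below is a hypothetical helper: it only pattern-matches the config file and a saved crontab -l dump, and does not prove that the cron entry actually reaches the site.

```shell
# check_system_cron WP_CONFIG_FILE CRONTAB_DUMP_FILE
check_system_cron() {
  config="$1"; crontab_dump="$2"
  disabled=no; entry=no
  # An uncommented define(...DISABLE_WP_CRON..., true) in wp-config.php?
  if grep -Ev '^[[:space:]]*(//|#)' "$config" |
     grep -Eqi "define\([[:space:]]*['\"]DISABLE_WP_CRON['\"][[:space:]]*,[[:space:]]*true"; then
    disabled=yes
  fi
  # An uncommented crontab line that mentions wp-cron.php?
  if grep -Ev '^[[:space:]]*#' "$crontab_dump" | grep -q 'wp-cron\.php'; then
    entry=yes
  fi
  case "$disabled/$entry" in
    yes/yes) echo "ok: system cron drives WP-Cron" ;;
    yes/no)  echo "broken: WP-Cron disabled but no crontab entry" ;;
    no/yes)  echo "redundant: both traffic and system cron trigger WP-Cron" ;;
    no/no)   echo "default: traffic-triggered WP-Cron only" ;;
  esac
}

# Usage on a real host (paths are assumptions):
#   crontab -l > /tmp/crontab.txt
#   check_system_cron /var/www/html/wp-config.php /tmp/crontab.txt
```

The "broken" result is the DISABLE_WP_CRON-without-system-cron root cause from §5.2; the other three states are all survivable.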

Cause B — One slow hook blocking the queue: move it off WP-Cron entirely. Action Scheduler, a real queue, or a separate system cron entry that runs only that hook out-of-band.

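wp cron event delete woocommerce_run_update_callback
# system cron, runs out-of-band. Once the event is deleted, "wp cron event run"
# has nothing scheduled to run, so fire the hook directly instead:
0 3 * * * cd /var/www/html && wp eval 'do_action("woocommerce_run_update_callback");'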

Cause C — WP_CRON_LOCK_TIMEOUT exhaustion: raise the limit only as temporary mitigation while implementing Cause B's fix. Permanently raising it makes things worse on large queues.

define('WP_CRON_LOCK_TIMEOUT', 300); // temporary

Cause D — Plugin update broke a hook: inspect, roll back the plugin if needed, clear the orphaned schedule.

wp cron event list --hook=broken_hook_name --format=json
wp plugin update broken-plugin --version=2.4.1
wp cron event delete broken_hook_name && wp cron event schedule broken_hook_name now hourly

Cause E — DB lock contention: check SHOW PROCESSLIST during cron runs for wp_options waits. Move heavy INSERT-driven plugins (analytics, audit) into a custom table.

5.4 Verify

You're looking for two things to hold simultaneously: wp_cron_overdue_count returns to its bursty-but-bounded pattern, and the max/avg ratio collapses back to under 3.

# Run the diagnostic snippet from §5.1 step 2 every minute for 15 minutes
for i in $(seq 1 15); do wp eval '...'; sleep 60; done
# Expected: avg < 30s, max < 60s, overdue count < 5 throughout

The healthy baseline for a Logystera-monitored site: wp_cron_overdue_count hovers between 0 and 5 with brief spikes to 10–15 every few minutes (events become due, then run within the next pass). wp_cron_avg_delay_seconds stays under 30. wp_cron_max_delay_seconds stays under 90. The ratio max/avg stays under 3.
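Those baseline numbers can be checked against the samples collected during verification. A minimal sketch, assuming one "<overdue> <avg_s> <max_s>" line per sample; the function name and input format are hypothetical, and the limits are taken directly from the paragraph above (spikes up to 15, avg under 30, max under 90, ratio under 3).

```shell
# baseline_check: reads "<overdue> <avg_s> <max_s>" samples on stdin,
# lists any sample outside the healthy baseline, then prints PASS or FAIL.
baseline_check() {
  awk '
    {
      bad = ($1 > 15) || ($2 >= 30) || ($3 >= 90) || ($2 > 0 && $3 / $2 >= 3)
      if (bad) { fails++; print "sample " NR " outside baseline: " $0 }
    }
    END { print (fails ? "FAIL" : "PASS") }'
}

printf '%s\n' "3 8 12" "0 0 0" "12 20 45" | baseline_check
```

Note that the ratio test is noisy when avg is only a second or two, so treat an isolated ratio failure at tiny delays as a prompt to resample, not as a regression.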

Unhealthy baseline: wp_cron_overdue_count floor never dips below 20, avg is 60s, max is 14400s. If you fixed Cause B but max/avg is still 50, you removed the wrong hook — wp_cron_duration_ms_sum will tell you which one is now dominant.

If wp_cron_overdue_count returns to bursty-but-bounded for an hour under your normal off-hours traffic, the issue is resolved. If it sneaks back up overnight, the system-cron entry didn't install: run crontab -l to verify.

6. How to Catch This Early

Fixing it is straightforward once you know the cause. The hard part is knowing it happened at all.

This issue surfaces as wp_cron_overdue_count.

Everything you just did manually (dump the cron option, compute max-vs-avg, identify the blocking hook by duration, time-correlate with the traffic gap) Logystera does automatically. The WordPress agent samples the cron array on every wp_loaded and emits wp_cron_overdue_count, wp_cron_avg_delay_seconds, wp_cron_max_delay_seconds, and per-hook wp_cron_duration_ms_sum. The dashboard panel charts the first three on a shared time axis, the visualization that makes the max-vs-avg delta jump out at a glance.

[Figure: Logystera dashboard, wp_cron_overdue_count over time.] wp_cron_overdue_count last 24h, with avg_delay and max_delay overlaid. The divergence at 02:14 (avg flat, max climbing to 4h) is the blocking-job fingerprint during the traffic-starvation window.

The rule that fires is id 511 — WordPress cron queue stalled, severity warning at max/avg > 10 for 5 minutes, escalating to critical when wp_cron_overdue_count > 30 for 15 minutes. The two-tier threshold is deliberate: a brief max/avg spike during one long task is normal; a sustained ratio means a hook is structurally blocking the queue.
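The two-tier logic is easy to replay against your own samples. This sketch is not Logystera's rule engine: it assumes one "<max/avg ratio> <overdue count>" sample per minute and treats "for N minutes" as N consecutive samples over the threshold.

```shell
# evaluate_rule: reads "<ratio> <overdue>" per-minute samples on stdin and
# prints the highest severity reached: ok, warning, or critical.
evaluate_rule() {
  awk '
    {
      warn_run = ($1 > 10) ? warn_run + 1 : 0   # consecutive minutes with ratio > 10
      crit_run = ($2 > 30) ? crit_run + 1 : 0   # consecutive minutes with overdue > 30
      if (warn_run >= 5)  warning = 1
      if (crit_run >= 15) critical = 1
    }
    END { print (critical ? "critical" : (warning ? "warning" : "ok")) }'
}

# Six minutes of a high ratio with a small backlog.
printf '%s\n' "48 6" "52 7" "61 7" "75 9" "90 9" "120 11" | evaluate_rule
```

Because the counters reset on any sample under the threshold, a single long task that spikes the ratio for a minute or two never reaches the warning tier.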

[Figure: Logystera alert, WordPress cron queue stalled.] Warning alert fires within 5 minutes of max/avg > 10, naming the specific hook from wp_cron_duration_ms_sum eating the request budget.

The alert payload includes the timestamp of first divergence, current wp_cron_overdue_count, the max/avg ratio, the top three hooks by wp_cron_duration_ms_sum, and a histogram of which hours the drift accumulates. That's enough to decide between Cause A (traffic starvation — histogram clusters in off-hours) and Cause B (one slow hook — one hook dominates duration) from the alert body, before opening a terminal.

Logystera turns a slow-bleed cron drift, the kind that costs a renewal campaign two weeks before the CFO sees the retention number, into a 5-minute notification with the hook name that proves which one is blocking the queue.

7. Related Silent Failures

wp.cron type=missed_schedule — acute version of this signal. Triggers when a single scheduled post or cron event misses its slot. Surfaces as a customer-reported failure ("my post didn't publish"); this guide's drift is the slow-bleed precursor.
DISABLE_WP_CRON without system cron — absolute version of cron drift. wp_cron_overdue_count rises monotonically forever, never recovers. Same crontab fix as Cause A, but the graph is a straight line, not a sawtooth.
wp_cron_errors_total with lock_timeout — direct evidence of WP_CRON_LOCK_TIMEOUT exhaustion. Often appears alongside max/avg divergence; raising the timeout is a workaround, not a fix.
Action Scheduler queue depth — WooCommerce uses Action Scheduler as a parallel queue layered on WP-Cron. When action_scheduler_run_queue is the blocker, WP-Cron drifts AND Action Scheduler stalls — doubly hidden.
wp_http_requests_total flat to /wp-cron.php — smoking-gun supporting signal for traffic starvation. If the rate is zero for a 4-hour overnight window, you've identified Cause A without looking at the queue.
