99.9% uptime tells you nothing about whether your site is working
Most teams who ask "is my WordPress site up?" mean two very different things at once. They mean "can a visitor load it" and "is it doing the work I expect it to do." Uptime monitoring answers the first question. It does not, and cannot, answer the second. The gap between those two definitions is where every silent failure in WordPress lives.
An uptime monitor pings https://yoursite.com/ every five minutes and looks for a 200 OK response. If it gets one, the dashboard turns green. If you have configured it well, it might also check that a specific string appears on the page, or that response time is under some bound. That's the entire mechanism. The site responded — uptime: 100%.
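The whole mechanism fits in a few lines of shell. The following is a minimal sketch, assuming `curl` is installed; the function names, the marker string, and the time bound are hypothetical choices for illustration, not how any particular uptime product is built.

```shell
# Hypothetical uptime check: status code, marker string, response-time bound.

# Pure decision logic: classify_probe STATUS SECONDS MAX_SECONDS BODY MARKER
classify_probe() (
  status=$1; seconds=$2; max_seconds=$3; body=$4; marker=$5
  if [ "$status" != "200" ]; then
    echo "DOWN: status $status"
  elif ! printf '%s' "$body" | grep -qF -- "$marker"; then
    echo "DOWN: marker missing"
  # awk does the decimal comparison that shell arithmetic cannot.
  elif awk -v t="$seconds" -v m="$max_seconds" 'BEGIN { exit !(t > m) }'; then
    echo "SLOW: ${seconds}s"
  else
    echo "UP"
  fi
)

# Network wrapper: probe URL MARKER MAX_SECONDS
probe() (
  url=$1; marker=$2; max_seconds=$3
  body_file=$(mktemp)
  # -w prints "status seconds" once the body has been written to body_file.
  meta=$(curl -s -o "$body_file" -w '%{http_code} %{time_total}' "$url")
  body=$(cat "$body_file"); rm -f "$body_file"
  classify_probe "${meta% *}" "${meta#* }" "$max_seconds" "$body" "$marker"
)
```

Calling `probe https://yoursite.com/ "Welcome" 2` would print one of the four verdicts. Nothing in it looks past the response to the one URL it requests, which is the limitation the rest of this guide is about.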
Meanwhile, behind that 200 OK: cron has not run in eleven months, wp_mail returned false on every attempt yesterday, the contact form is silently dropping submissions, three plugins are throwing PHP fatals on the admin dashboard, and a botnet has tried five thousand passwords against wp-login.php in the past hour. None of that is visible to an uptime ping. The site is "up." The site is also broken. This guide is about why those are not the same thing, and how to monitor the parts that uptime cannot see.
Why uptime monitoring became the default — and why it stopped being enough
Uptime monitoring was born in an era when the only thing that could go wrong with a website was the web server crashing. If Apache was running, the site worked. If Apache was down, the site didn't. A simple HTTP probe against the homepage was a perfectly reasonable proxy for "is the site working." The mental model fit the failure mode of the era.
That mental model is now twenty years stale. WordPress is not a static document served by Apache; it is an application with a plugin ecosystem of tens of thousands of moving parts. The application can be wrong while the web server is fine. The web server can be fine while the database is degraded. The database can be fine while the email subsystem has been broken for a month. Each of those subsystems can fail independently, silently, and invisibly to a probe at the front door.
What modern WordPress failure actually looks like
A representative sampling of what teams discover the first week they install application-level monitoring on sites they thought were "fine":
- Cron has not executed since the last hosting migration — months ago. Scheduled posts in draft, plugin update checks not running, transient cleanup never happening. Cron not running is one of the most common silent failures we see.
- wp_mail has been returning false on every call for weeks because an SMTP credential rotated and nobody updated it. The contact form looks like it submits — there is no error to the user. Emails not sending is invisible without monitoring.
- A plugin started throwing PHP notices a few weeks ago. With WP_DEBUG_DISPLAY off (correctly, for production), the notices end up in debug.log if WP_DEBUG_LOG is enabled, or in whatever server error log PHP is configured to use (possibly none) if it isn't. Silent PHP errors happen constantly.
- Bots are hammering wp-login.php at hundreds of attempts per hour. The security plugin blocks the obvious ones but doesn't tell you the volume, the targeted usernames, or the secondary attack on xmlrpc that is still wide open.
- The admin dashboard takes six seconds to load because three plugins are running expensive queries on every admin request. Editors are wasting hours every week and have stopped complaining because they assume "it's just slow."
What uptime monitoring cannot see, ever, by design
An external probe lives outside the application. It can only observe what is visible from the front door. The list of things the front door does not show is long and important.
| What's invisible | Why uptime can't see it |
|---|---|
| Email delivery | Forms submit, the page returns 200, wp_mail returns false internally. The user sees a thank-you page. You see uptime green. The email never sent. |
| Cron execution | Background tasks live entirely server-side. Whether they run is invisible from the outside. The probe pinging the homepage cannot tell you that the daily backup hook hasn't fired in six weeks. |
| PHP errors | A fatal in a plugin's admin code does not surface on the public homepage. A notice on every request does not change the HTTP status code. The probe sees 200 and moves on. |
| Authentication attacks | 500 failed logins per hour at wp-login.php do not affect the homepage's response. Uptime is healthy. Your security posture is not. |
| Admin performance | The admin and the frontend are different surfaces. Uptime monitors check the homepage. The admin can be unusable for editors while the homepage stays fast. |
| Database health | An options table at 2 GB or a slow query taking eight seconds does not break the front page — until it does. Uptime gives no warning. |
| REST API and webhooks | A flood of requests at /wp-json/... can overwhelm PHP-FPM workers without ever touching the homepage path the uptime monitor probes. |
| Plugin state changes | A critical plugin auto-disabled itself due to a fatal during an update. Uptime is fine. The site is fine. The functionality the plugin provided is gone. |
How to manually check application health beyond uptime
If you don't have monitoring in place yet, here is a periodic checklist you can run by hand. None of this is sustainable past one site, and none of it gives you trends — but it will tell you the current state.
1. Check that cron is alive
wp cron event list --due-now
wp cron test
If due-now returns hundreds of overdue events, cron is not running.
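To turn that check into a number, you can count overdue events yourself. This is a sketch under two assumptions: that `wp cron event list` offers the optional `time` field holding each event's Unix timestamp, and that a ten-minute grace period is a reasonable allowance. The function name `count_overdue` is hypothetical.

```shell
# count_overdue NOW GRACE_SECONDS
# Reads "hook,time" CSV on stdin (header row included) and prints how many
# events were scheduled more than GRACE_SECONDS before NOW.
count_overdue() (
  now=$1; grace=$2
  awk -F, -v now="$now" -v grace="$grace" '
    NR == 1 { next }                    # skip the CSV header
    $NF + grace < now { overdue++ }     # last column is the scheduled time
    END { print overdue + 0 }
  '
)
```

Usage on a live site would look like `wp cron event list --fields=hook,time --format=csv | count_overdue "$(date +%s)" 600`. A handful of overdue events on a low-traffic site can be normal, because WP-Cron only fires on page loads; a count that keeps growing is the signal.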
2. Send a test email
wp eval 'var_dump(wp_mail("you@example.com", "WP test", "ping"));'
If it returns false, mail is broken. If it returns true and you don't receive it, mail is broken differently — check SMTP and inbox filtering.
3. Inspect debug.log
tail -n 200 wp-content/debug.log
grep -c "PHP Fatal" wp-content/debug.log
A high fatal count, or recent fatals, is a red flag.
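A single total does not tell you whether the fatals are recent. The sketch below groups them by day, assuming debug.log uses PHP's usual `[DD-Mon-YYYY HH:MM:SS UTC]` line prefix; the function name `fatals_per_day` is hypothetical.

```shell
# fatals_per_day FILE
# Prints "DD-Mon-YYYY count" for each day that has at least one PHP fatal,
# in the order the days first appear in the log.
fatals_per_day() (
  grep -F 'PHP Fatal' "$1" |
    awk '{
           day = substr($1, 2)                  # drop the leading "["
           if (!(day in count)) order[++n] = day
           count[day]++
         }
         END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }'
)
```

Run it as `fatals_per_day wp-content/debug.log`. A day that appears near the bottom with a large count is the one to investigate first.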
4. Time the admin
curl -w "%{time_total}\n" -o /dev/null -s -b cookies.txt https://yoursite.com/wp-admin/
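cookies.txt must hold the cookies of a logged-in session; without them the request is redirected to the login page and you end up timing the redirect. Because logged-in admin requests bypass page caching, anything over two seconds is a sign of trouble.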
5. Look for auth pressure
grep "wp-login.php" /var/log/nginx/access.log | wc -l
grep "xmlrpc.php" /var/log/nginx/access.log | wc -l
These counts cover the whole log file, so divide by the period the file spans. Hundreds of hits per hour is bot traffic, regardless of whether the security plugin is blocking it.
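To get an hourly rate instead of a file-wide total, bucket the hits by hour. This sketch assumes nginx's default "combined" log format, where the fourth whitespace-separated field is the bracketed timestamp; the function name `hits_per_hour` is hypothetical.

```shell
# hits_per_hour PATTERN
# Reads an access log on stdin and prints "DD/Mon/YYYY:HH count" for every
# hour in which a line containing PATTERN appears.
hits_per_hour() (
  grep -F -- "$1" |
    awk '{
           hour = substr($4, 2, 14)             # "[10/Oct/2023:13:55:36" -> "10/Oct/2023:13"
           if (!(hour in count)) order[++n] = hour
           count[hour]++
         }
         END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }'
)
```

Usage: `hits_per_hour wp-login.php < /var/log/nginx/access.log`, and the same again with `xmlrpc.php`.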
How to fix the visibility gap
"Fixing" uptime-vs-health is a bit of a misnomer — there's nothing broken about uptime monitoring per se. It does what it says on the tin. The fix is to layer application-level monitoring on top, so you have visibility into the things uptime can't see.
Step 1: Keep your uptime monitor
It's not useless — it tells you when the server is genuinely down. Don't replace it. Layer on top.
Step 2: Add application-level signal capture
Install a plugin that captures structured events from inside WordPress: cron, mail, auth, errors, REST traffic, plugin lifecycle, performance. This is the layer your uptime monitor cannot reach.
Step 3: Derive metrics, not just events
Raw events are useful for forensics. Metrics are what tell you "the rate is going up" or "this hasn't happened in 24 hours when it should have happened 600 times." A monitoring system that stores events without deriving metrics from them is half a system.
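As an illustration of the difference, here is a sketch that derives two metrics from a raw event stream: time since the last event of a type, and a count inside a sliding window. The one-event-per-line `EPOCH TYPE` format and both function names are assumptions made for this example, not Logystera's storage format.

```shell
# seconds_since_last TYPE NOW
# Reads "EPOCH TYPE" lines on stdin; prints the age of the newest TYPE event,
# or "never" if no such event exists.
seconds_since_last() (
  awk -v type="$1" -v now="$2" '
    $2 == type && $1 > last { last = $1; found = 1 }
    END { if (found) print now - last; else print "never" }
  '
)

# count_in_window TYPE NOW WINDOW_SECONDS
# Prints how many TYPE events fall in the WINDOW_SECONDS ending at NOW.
count_in_window() (
  awk -v type="$1" -v now="$2" -v win="$3" '
    $2 == type && $1 > now - win && $1 <= now { n++ }
    END { print n + 0 }
  '
)
```

The first metric answers "this hasn't happened in 24 hours"; the second, sampled repeatedly, answers "the rate is going up". Neither question can be answered by reading individual events.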
Step 4: Define detection rules with thresholds
"Alert when cron has not run in 24 hours." "Alert when wp_mail failure rate exceeds 50% over 15 minutes." "Alert when failed logins exceed 100 per hour." These are pre-built and tuned in Logystera; if you build your own, expect to spend time on the calibration.
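If you do build your own, each of the three example rules reduces to a comparison between a metric and a threshold. The sketch below shows only that comparison; the function names are hypothetical, the inputs are assumed to be integers already computed elsewhere, and none of this reflects how Logystera's rules are implemented.

```shell
# mail_failure_rule FAILED TOTAL THRESHOLD_PCT (counts from one time window)
mail_failure_rule() (
  failed=$1; total=$2; threshold=$3
  if [ "$total" -eq 0 ]; then
    echo "OK: no mail attempts in window"
  # Cross-multiply so the comparison stays in integer arithmetic.
  elif [ $((failed * 100)) -gt $((total * threshold)) ]; then
    echo "ALERT: wp_mail failure rate $((failed * 100 / total))% exceeds ${threshold}%"
  else
    echo "OK"
  fi
)

# cron_silence_rule SECONDS_SINCE_LAST_RUN MAX_SILENCE_SECONDS
cron_silence_rule() (
  if [ "$1" -gt "$2" ]; then
    echo "ALERT: cron silent for $(( $1 / 3600 ))h"
  else
    echo "OK"
  fi
)

# failed_login_rule COUNT_LAST_HOUR MAX_PER_HOUR
failed_login_rule() (
  if [ "$1" -gt "$2" ]; then
    echo "ALERT: $1 failed logins in the last hour"
  else
    echo "OK"
  fi
)
```

The comparisons are trivial; the calibration is not. A threshold of 100 failed logins per hour may be noise on one site and a serious incident on another.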
Step 5: Route alerts somewhere humans actually look
Email is fine for low-volume teams. Webhook into Slack, PagerDuty, or your incident tool for higher-stakes work. Don't route to a folder nobody opens.
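For the webhook route, a minimal sketch of posting an alert to a Slack incoming webhook follows. It assumes `curl` is available and that an environment variable named `SLACK_WEBHOOK_URL` (a name chosen for this example) holds your webhook URL; the function names are hypothetical.

```shell
# json_escape STRING
# Escapes backslashes and double quotes for use inside a JSON string.
# Newlines and control characters are not handled in this sketch.
json_escape() (
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
)

# slack_payload MESSAGE: builds the JSON body Slack incoming webhooks accept.
slack_payload() (
  printf '{"text":"%s"}' "$(json_escape "$1")"
)

# send_alert MESSAGE: posts the payload to the assumed SLACK_WEBHOOK_URL.
send_alert() (
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "$1")" "$SLACK_WEBHOOK_URL"
)
```

Usage: `send_alert "cron silent for 25h on shop.example.com"`. Keeping payload construction separate from the network call makes the message format easy to check without sending anything.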
The gap between uptime and health, in one table
| What you want to know | Uptime monitor | Log-based monitoring |
|---|---|---|
| Is the server responding? | Yes | Yes |
| Is the homepage rendering? | Yes | Yes |
| Are emails being delivered? | No | Yes |
| Is cron running on schedule? | No | Yes |
| Are there PHP errors? | No | Yes |
| Are we under brute-force attack? | No | Yes |
| Is the admin slow for editors? | No | Yes |
| Did a plugin auto-disable? | No | Yes |
| Are REST endpoints saturating? | No | Yes |
| Did config silently change? | No | Yes |
Both have a place. Uptime tells you about server availability. Logs tell you about application correctness. Run both.
How Logystera closes the gap
Logystera is the application-health layer designed to sit alongside your uptime monitor, not to replace it. The WordPress plugin captures structured events from thirty-plus signal sources inside the application. The processor derives metrics from those events. Pre-built detection rules — calibrated on real production failure modes — fire when something looks wrong.
What you see in the Logystera dashboard for each site:
- Cron health: time since last successful run, hooks fired vs expected, average duration trend
- Mail health: success vs failure rate, SMTP error categories, daily send volume
- Auth health: failed login rates by endpoint, targeted usernames, post-failure success rate (compromise indicator)
- Error health: PHP fatal/error/warning counts by source plugin, rolling baselines
- Performance: response-time percentiles for admin and frontend, slow-request log
- Configuration: plugin activation/deactivation timeline, theme switches, option changes
When something crosses a threshold — cron silent for 24 hours, mail failure rate above 50%, login attempts spiked — you get an alert with the exact events that triggered it and links to drill into the metric history. Uptime stays green; Logystera tells you what uptime can't.
Real example: 99.99% uptime, several broken sites
An agency managed fifteen WordPress sites for various clients. Their uptime tool reported 99.99% uptime across the portfolio for the previous quarter. The client newsletter mentioned the number proudly. When they connected Logystera as a sanity check, the first 72 hours surfaced: one site that had not sent an email in eleven days (an SMTP password rotated during a host migration), two sites whose cron had been stopped for three months (the migration disabled wp-cron and the system cron was never set up), and one site running 200+ PHP warnings per hour from a plugin conflict that had appeared after a routine update.
Every one of those sites was "up" by every external definition. None of them was healthy. The uptime number was correct and useless.
Related guides
WordPress Cron Not Running
A specific instance of the uptime-vs-health gap.
WordPress Emails Not Sending
Mail failure is invisible to uptime; visible to logs.
WordPress Log Monitoring
The full picture: what to capture and how.
Log-Based Monitoring
Why logs beat synthetic checks for application health.
Silent PHP Errors
Errors WordPress hides; logs reveal.
WordPress Admin Slow
A health problem the homepage probe never sees.
Frequently asked questions
Should I stop using my uptime monitor?
No. Uptime monitoring catches genuine outages — when the server crashes, when DNS breaks, when the SSL certificate expires, when the host has a network event. Those are real problems and uptime monitors are good at detecting them. The point isn't to replace uptime; it's to add the layer of visibility that uptime can't provide.
Can't I just configure my uptime monitor to check more URLs?
You can, and you should — checking the contact-form page or a logged-in admin URL is better than only checking the homepage. But it doesn't get you to "is wp_mail actually delivering" or "did cron run today" or "are we being attacked." Those questions can only be answered from inside the application, with structured event capture.
What about real-user monitoring (RUM) tools?
RUM tools (like Pingdom RUM, SpeedCurve, or browser-side telemetry from APM vendors) measure what real visitors experience — page load time, Core Web Vitals, JS errors. They are great for performance insight and complement application-level monitoring. They do not see server-side events: cron, mail, auth attacks, plugin state. Different layer, different problem.
How do I explain "site is up but broken" to non-technical stakeholders?
The analogy that lands: a restaurant can have its lights on (uptime is up) while the kitchen is on strike (cron is down) and the phone line is dead (mail is broken) and the back door is being battered by burglars (auth attacks). The signs out front are unaffected. Customers walking by see "open." That is uptime vs health.
What's the smallest first step I can take?
Check three things this week: (1) does wp cron event list --due-now return a sane number on each site? (2) does wp_mail() actually deliver from each site? (3) is debug.log growing or full of recent fatals? Spending an hour on those three checks across your portfolio will turn up problems on at least some of the sites — that's the gap uptime missed.