Guide
Why is my Drupal site getting WordPress scanner traffic? (and what to do about it)
1. Problem
You run a Drupal site. You don't run WordPress. You've never run WordPress. And yet your nginx access log is full of this:
185.220.101.42 - - [27/Apr/2026:14:03:11 +0000] "GET /wp-login.php HTTP/1.1" 404 162
185.220.101.42 - - [27/Apr/2026:14:03:11 +0000] "GET /wp-admin/ HTTP/1.1" 404 162
185.220.101.42 - - [27/Apr/2026:14:03:12 +0000] "POST /xmlrpc.php HTTP/1.1" 404 162
185.220.101.42 - - [27/Apr/2026:14:03:12 +0000] "GET /wordpress/wp-login.php HTTP/1.1" 404 162
192.241.220.18 - - [27/Apr/2026:14:03:14 +0000] "GET /wp-content/plugins/revslider/temp/update_extract/revslider/admin.php HTTP/1.1" 404 162
You search "drupal site getting wp-login.php requests" and find forum threads from 2016 telling you to ignore it. Your CDN bill ticked up this week. The 404 graph in Drupal's reports is a wall.
This surfaces as a drupal_wp_scanner_hits_total counter in Logystera the moment your site enters the IP ranges that mass-scanners are sweeping. The question — "is this noise or do I block it?" — has a concrete answer that depends on volume, source distribution, and what the same IPs hit after the WP probes fail.
2. Impact
The first-order cost is bandwidth and CPU. A scanner pool can hit a single Drupal site with 200–800 requests per minute across wp-login.php, xmlrpc.php, wp-admin/, and a long tail of plugin-exploit paths (revslider, gravityforms, duplicator, wp-file-manager). At Drupal-typical TTFB, that's 5–15% of a single PHP-FPM worker plus the full nginx 404 path. On a small VPS, two or three concurrent scanner pools can saturate workers and starve real traffic.
The second cost is SOC fatigue. If you alert on 404 spikes, scanner sweeps drown the signal — drupal_404_total goes vertical for 20 minutes every few hours and your real 404s (broken internal links, missing media) get buried. Analysts learn to ignore the panel, and a week later when an attacker probes /admin or /user/register?destination=..., nobody notices.
The third cost is the one nobody talks about: scanner sweeps are reconnaissance. The same IPs that probe wp-login.php and get a 404 will, eight to forty seconds later, probe /?q=admin, /CHANGELOG.txt, /core/install.php, /sites/default/files/, and Drupal CVE paths (drupal_backdoor_probe_total). If you don't track the WP probes, you also don't see the Drupal probes that follow from the same IP — and those are the ones that matter. The WP probes are the cheap "is anything here" shot; the Drupal probes are the targeted follow-up.
3. Why It’s Hard to Spot
Drupal's dblog and Reports → Recent log messages don't surface this. Watchdog registers page-not-found events but aggregates them into a flat list with no break-out by attacker fingerprint. wp-login.php is listed alongside genuine missing pages, with no distinction between "user typo" and "automated scanner."
Web-server log analyzers (GoAccess, AWStats) treat every 404 as equal. A scanner emitting 800 requests across 40 unique WP paths from one IP looks the same as 800 unique users hitting one broken link. The shape of the traffic — burst, narrow path set, predictable user-agents — is the diagnostic, but it's invisible in flat aggregates.
CDN dashboards are worse. Cloudflare and Fastly count the requests but won't tell you 90% of your 404s are concentrated on twelve WordPress-specific paths. Their built-in WAF rules block some of this on paid plans, but the dashboard view is request count, not attack fingerprint. You see the volume, you don't see the pattern.
And the counterintuitive bit: a lot of Drupal admins assume scanners fingerprint the CMS first and then attack the right paths. They don't. Modern scanners — Mirai-variant botnets, mass-exploit frameworks, and credential stuffers — spray the top-N CMS paths against every IP they find on Shodan and Censys. The scanner doesn't care that you run Drupal; you're an IP with port 443 open. WP-targeted traffic on a Drupal site isn't a misidentification — it's the entire scanner business model.
4. Cause
Internet-scale scanning separates discovery from exploitation. A scanner pool buys (or compromises) a few thousand IPs, pulls a target list from Shodan/Censys output (every IPv4 with 80/443 open), and runs a fixed playbook against every one. The playbook is path-based, not CMS-fingerprint-based — one request tells you "WordPress installed" or "not installed" without parsing HTML.
The Logystera Drupal agent's request-classification middleware tests every nginx/Apache access log line against a curated regex of known WP endpoints: ^/wp-login\.php, ^/wp-admin(/|$), ^/wp-content/, ^/xmlrpc\.php, ^/wordpress/, ^/wp-includes/, plus plugin-exploit paths (revslider, duplicator, wp-file-manager, gravityforms). If the path matches and the response status is 404, the agent emits a drupal_wp_scanner_hits_total counter with labels entity_id, client_ip, and path_class (wp_auth, wp_admin, wp_xmlrpc, wp_plugin_exploit).
The 404 status is load-bearing. Without it, the metric would also count legitimate WP-on-Drupal multi-CMS deployments where Drupal proxies a /blog/ subtree to a WordPress install — there wp-login.php returns 200 or 302, not 404. A pure scanner sees 404 every time, because there's no WordPress to respond.
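As a rough illustration of the classification described above, here is a minimal sketch that reproduces it against a raw access log. It is not the agent's code: the function name classify_wp_hits, the path-to-class mapping, and the wp_other catch-all label are assumptions made for illustration, and it assumes nginx's combined log format (request path in field 7, status in field 9).

```shell
# classify_wp_hits FILE
# Prints "count path_class" for WP-pattern requests that returned 404.
# Assumes combined log format: $7 = request path, $9 = status.
# The path -> class mapping is a guess; "wp_other" is a hypothetical catch-all.
classify_wp_hits() {
  awk '
    $9 != "404" { next }                            # only 404s count as scanner hits
    {
      path = $7
      sub(/\?.*$/, "", path)                        # drop any query string
      if (path ~ /^\/wp-login\.php/)
        cls = "wp_auth"
      else if (path ~ /^\/wp-admin(\/|$)/)
        cls = "wp_admin"
      else if (path ~ /^\/xmlrpc\.php/)
        cls = "wp_xmlrpc"
      else if (path ~ /^\/wp-content\/plugins\//)
        cls = "wp_plugin_exploit"
      else if (path ~ /^\/(wp-content|wp-includes|wordpress)\//)
        cls = "wp_other"
      else
        next                                        # not a WP path at all
      hits[cls]++
    }
    END { for (c in hits) print hits[c], c }
  ' "$1" | sort -k1,1nr -k2,2
}
```

Run it as classify_wp_hits /var/log/nginx/access.log. Requests that return 200 or 302 are skipped, which mirrors the 404 condition: a real WordPress install behind a proxy rule never shows up in the counts.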
The active rule drupal_wp_scanner_spike fires at severity warning when a single entity sees more than 100 drupal_wp_scanner_hits_total events in any 10-minute window. The threshold is calibrated against fleet baseline — a typical Drupal site sees 5–40 WP-scanner hits per day from background noise; 100 in 10 minutes is a clear sweep.
5. Solution
5.1 Diagnose (logs first)
The diagnosis path is: confirm the pattern in nginx, separate scanner sweeps from background noise, and time-correlate with what else the same IPs are touching.
1. Nginx access log — confirm the WP-path 404 cluster.
# Count WP-pattern 404s by path (scans the whole file; pre-filter by timestamp for a one-hour view)
awk '$9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc|wp-content|wordpress)/ {print $7}' \
/var/log/nginx/access.log | sort | uniq -c | sort -rn | head -n 20
What you want to see is a narrow cluster — five to fifteen distinct paths repeated hundreds of times each. That's a scanner sweep, and it's exactly what surfaces as drupal_wp_scanner_hits_total in the Logystera agent.
A typical sweep distribution looks like:
847 /wp-login.php
621 /wp-admin/
598 /xmlrpc.php
412 /wordpress/wp-login.php
389 /wp-content/plugins/revslider/temp/update_extract/revslider/admin.php
201 /wp-content/plugins/wp-file-manager/lib/php/connector.minimal.php
If the distribution is flat (each path hit 1–3 times) you have background noise, not a sweep.
2. Pivot from path to source IP — separate one scanner from many.
# Top source IPs hitting WP paths (scans the whole file; pre-filter by timestamp for a one-hour view)
awk '$9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc|wp-content)/ {print $1}' \
/var/log/nginx/access.log | sort | uniq -c | sort -rn | head -n 10
If 80% of the volume comes from a handful of IPs, you're looking at one or two scanner pools — drupal_top_attack_ips would show the same shape. If volume is spread across hundreds of IPs each contributing 10–40 hits, you're inside an internet-wide sweep that thousands of other sites are also receiving simultaneously. The fix differs.
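The "80% from a handful of IPs" test can be turned into a single number. The following is a minimal sketch, assuming the same combined log format as the commands above; the function name top_ip_share and the default of five IPs are illustrative choices, not Logystera settings.

```shell
# top_ip_share FILE [N]
# Prints the percentage (truncated to an integer) of WP-pattern 404s
# that came from the N busiest source IPs. N defaults to 5.
top_ip_share() {
  awk '$9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc|wp-content)/ {print $1}' "$1" |
    sort | uniq -c | sort -rn |
    awk -v n="${2:-5}" '
      { total += $1; if (NR <= n) top += $1 }       # rows arrive busiest-first
      END { if (total) printf "%d\n", (100 * top) / total; else print 0 }
    '
}
```

A result near 80 or above points to one or two scanner pools (a head); a result in the low teens with hundreds of distinct IPs points to an internet-wide sweep (a long tail).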
3. Cross-reference WP probes with Drupal-targeted probes from the same IPs — the load-bearing step.
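# Busiest source IP among the WP-path 404s
TOP_IP=$(awk '$9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc)/ {print $1}' \
/var/log/nginx/access.log | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
# Everything that IP requested, by path and status (exact match on field 1,
# so 185.220.101.4 does not also match 185.220.101.42)
awk -v ip="$TOP_IP" '$1 == ip {print $7, $9}' /var/log/nginx/access.log | \
sort | uniq -c | sort -rn | head -n 30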
If the same IP that hit /wp-login.php 200 times also hit /CHANGELOG.txt, /core/install.php, /?q=admin, or any node/*/edit endpoint, this is multi-CMS recon — WP probes are the cheap discovery shot, the Drupal probes (drupal_backdoor_probe_total) are the real recon. That changes the priority in §5.3.
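The check above looks at one IP at a time. A minimal sketch of the same cross-reference for every IP in the log follows. The function name multi_cms_ips is hypothetical, and the Drupal path list is just the examples named in this guide, not the rule behind drupal_backdoor_probe_total.

```shell
# multi_cms_ips FILE
# Prints "ip wp_hits drupal_hits" for every IP that sent at least one
# WP-path 404 AND at least one request to a Drupal recon path.
# The Drupal path list is illustrative and deliberately short.
multi_cms_ips() {
  awk '
    $9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc)/ { wp[$1]++ }
    $7 ~ /^\/(CHANGELOG\.txt|core\/install\.php|user\/register)/ || $7 ~ /^\/\?q=admin/ { drupal[$1]++ }
    END { for (ip in wp) if (ip in drupal) print ip, wp[ip], drupal[ip] }
  ' "$1" | sort
}
```

Drupal-path requests are counted regardless of status, because a probe of /CHANGELOG.txt or /user/register can legitimately return 200. Any IP this prints is a candidate for the Cause C treatment in §5.3.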
4. Time-correlate with the Logystera dashboard or your own CDN logs.
# Bucket WP-scanner hits by 10-minute window
awk '$9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc)/ {
split($4, t, ":"); print t[2] ":" substr(t[3],1,1)"0"
}' /var/log/nginx/access.log | sort | uniq -c | tail -n 10
A scanner sweep is bursty — 0 → 600/min → 0 over 15–30 minutes, then nothing for hours. Background noise is flat. The bursty shape trips the drupal_wp_scanner_spike rule (>100 in 10 minutes). Buckets like 200, 500, 350, 80, 0, 0 are the signature, and the moment Logystera fires the alert.
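To flag only the buckets that would cross the spike threshold, the bucketing command can be extended as in the sketch below. It uses fixed 10-minute buckets, which only approximates an "any 10-minute window" rule: a sweep that straddles a bucket boundary can be split and missed. The function name sweep_windows is hypothetical, and the default of 100 is taken from the rule described in this guide.

```shell
# sweep_windows FILE [THRESHOLD]
# Prints "dd/Mon/yyyy:HH:M0 count" for each fixed 10-minute bucket whose
# WP-pattern 404 count exceeds THRESHOLD (default 100).
sweep_windows() {
  awk -v threshold="${2:-100}" '
    $9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc)/ {
      split($4, t, ":")                             # t[1]="[dd/Mon/yyyy", t[2]=HH, t[3]=MM
      bucket = substr(t[1], 2) ":" t[2] ":" substr(t[3], 1, 1) "0"
      n[bucket]++
    }
    END { for (b in n) if (n[b] > threshold) print b, n[b] }
  ' "$1" | sort
}
```

Including the date in the bucket key keeps 14:00 today separate from 14:00 yesterday when the log spans more than one day.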
5.2 Root Causes
Each cause maps to a different signal pattern and a different intervention. Prioritized by frequency.
- Generic internet-wide sweep: anonymous botnet hitting every IPv4 with port 443 open. Produces drupal_wp_scanner_hits_total from hundreds of distinct IPs, each a small share. drupal_top_attack_ips shows a long tail, no head. Benign 404 noise; the fix is "don't bother." Baseline: 5–40 hits/day.
- Targeted scanner pool: a smaller set of IPs (Tor exits, VPS providers, compromised residential proxies) hitting you repeatedly across waves. drupal_top_attack_ips shows a head, with the top 5 IPs above 50% of volume. Worth blocking; retriggers the spike rule on the same IPs daily.
- Multi-CMS reconnaissance: same IPs hit WP paths and Drupal paths (/CHANGELOG.txt, /core/install.php, /?q=admin, /user/register). Produces drupal_wp_scanner_hits_total and drupal_backdoor_probe_total from overlapping IP sets. Highest priority; block at the edge immediately.
- Plugin-CVE exploit attempt: scanner hitting a tail of WP-plugin paths (/wp-content/plugins/revslider/..., /wp-content/plugins/duplicator/...). Produces drupal_wp_scanner_hits_total with path_class=wp_plugin_exploit. Harmless against Drupal (the file doesn't exist), but the same scanner often follows with Drupal-module CVE paths.
- Misconfigured legitimate traffic (rare): multi-CMS host where Drupal serves / and WordPress serves /blog/, but the WP install was removed without removing the proxy rule. Low-volume, single-IP-cluster from your own users, distinguishable because the IPs are real visitors, not internet randoms.
5.3 Fix
Match the response to which root cause drupal_wp_scanner_hits_total plus drupal_top_attack_ips is showing you.
Cause A — Generic internet-wide sweep: do nothing. Suppress the rule for low-volume baselines (Logystera's default 100/10min threshold already accomplishes this) and ignore the noise. Blocking individual IPs is whack-a-mole; the scanner pool rotates IPs daily.
Cause B — Targeted pool, single sweep: block the top IPs at the edge for 24 hours. Use Cloudflare's IP Access Rules, AWS WAF, or fail2ban. Don't spend a worker cycle on a 404 from an IP that already hit you 800 times.
# fail2ban-style check: list any IP with more than 50 WP-scanner 404s in the access log
# (scans the whole file; pre-filter by timestamp for a one-hour window)
awk -v threshold=50 '$9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc)/ {ip[$1]++}
END {for (i in ip) if (ip[i] > threshold) print i}' \
/var/log/nginx/access.log
# Pipe the resulting IPs to your edge firewall API
Cause C — Multi-CMS recon (drupal_wp_scanner_hits_total + drupal_backdoor_probe_total from same IPs): block immediately at the edge. These IPs are high-confidence hostile. Also enable Drupal's flood control on /user/login and /user/register, and audit any recently-installed contrib modules for known CVEs — the scanner's next step is exploiting whatever the recon turned up.
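Cause D (plugin-CVE exploit attempts): add an nginx rule that returns 444 for the WordPress path prefixes, plugin paths included. With 444, nginx closes the connection without sending a response. This costs nothing on the Drupal side and saves the bandwidth of returning a full 404 page. Be aware that these requests are then logged with status 444 rather than 404, so anything keyed on status 404, such as the awk filters in §5.1, stops counting them.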
location ~* ^/(wp-content|wp-admin|wp-includes|xmlrpc\.php|wp-login\.php) {
return 444;
}
Cause E — Misconfigured legitimate traffic: find the dead proxy rule in your nginx or Apache config and remove it.
5.4 Verify
After applying an edge block, you're looking for two things to hold simultaneously: drupal_wp_scanner_hits_total drops back to baseline for the blocked IPs, and drupal_404_total returns to its real (non-scanner) shape — the broken-link tail you've been ignoring.
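# Recheck the WP-path 404 count for the last 15 minutes (GNU date syntax).
# The timestamp test is a plain string comparison, so it is reliable only
# while the log entries and the cutoff fall in the same month.
awk -v cutoff="$(date -d '15 minutes ago' '+%d/%b/%Y:%H:%M')" '
$4 > "["cutoff && $9 == "404" && $7 ~ /(wp-login|wp-admin|xmlrpc)/ {n++}
END {print n + 0}' /var/log/nginx/access.log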
The expected baseline for a Drupal site after the fix: drupal_wp_scanner_hits_total drops to 5–40 events per day total — not zero. That's background internet noise from random scanners that nobody has bothered to block, and it's the steady-state for any IPv4 with a public-facing web server. The drupal_wp_scanner_spike rule won't fire at this rate (its threshold is 100/10min). If you're still seeing >100/10min after the block, your edge rule isn't taking effect or you're being targeted by a fresh scanner pool — re-run the diagnose step against the new top IPs.
If drupal_wp_scanner_hits_total settles but drupal_backdoor_probe_total is still firing, you closed the WP-decoy door but left the Drupal door open. That's the actual attack surface; go back to §5.3 Cause C.
6. How to Catch This Early
Fixing it is straightforward once you know the cause. The hard part is knowing it happened at all.
This issue surfaces as drupal_wp_scanner_hits_total.
Everything you just did manually — pull WP-path 404s from nginx, count by path, pivot to source IP, cross-reference whether those IPs also hit Drupal-specific paths, time-bucket to find the sweep window — Logystera does automatically. The Drupal agent's request-classification middleware tags every WP-pattern 404 in real time, the metric increments per request with the source IP and path class as labels, and the drupal_wp_scanner_spike rule (>100 hits in 10 minutes) fires as soon as a sweep crosses the threshold.
[Figure: Logystera dashboard, drupal_wp_scanner_hits_total over time. Caption: drupal_wp_scanner_hits_total rate, last 24h; sweep at 14:03 UTC, 612 hits in 11 minutes from 4 source IPs.]
The rule that fires is drupal_wp_scanner_spike, severity warning, threshold 100 events in 10 minutes. Suppression is set to 1 hour per entity so a single multi-wave sweep doesn't page on every wave. The alert payload includes the top source IPs (so you can paste them straight into your edge firewall), the top path classes (wp_auth, wp_admin, wp_xmlrpc, wp_plugin_exploit), and a count of correlated drupal_backdoor_probe_total events from the same IPs — the single most useful field, because it tells you in one glance whether this is noise (Cause A) or recon (Cause C).
[Figure: Logystera alert, Drupal hit by WordPress scanner sweep. Caption: warning alert fires within 60s of crossing the 100/10min threshold, with top source IPs and correlated backdoor probes.]
Logystera turns this kind of failure from "why did our CDN bill double last week" into a 60-second notification with the top scanner IPs already extracted, ready to paste into your edge block list. More importantly, it gives you a clear "this was just noise" vs "this was recon, audit your modules" signal that doesn't depend on an analyst correlating logs by hand.
7. Related Silent Failures
- drupal_backdoor_probe_total: the Drupal-specific recon counter. When it correlates with drupal_wp_scanner_hits_total from the same IPs, you have multi-CMS reconnaissance and the priority changes from "ignore" to "audit modules."
- drupal_top_attack_ips: labelled top-N counter that distinguishes a single targeted pool (small head) from an internet-wide sweep (long tail). Drives the Cause A vs Cause B branching in §5.2.
- drupal_404_total: generic 404 rate. Without the WP-scanner subtraction, this metric is dominated by scanner noise and useless for finding real broken links. Logystera computes it both raw and wp_scanner-subtracted.
- drupal_login_attempts_total and Drupal flood control: when scanner sweeps escalate from path-probing to actual /user/login POST attempts against Drupal's auth, this is the next-stage signal.
- drupal_request_errors_total 5xx during sweeps: when scanner volume saturates PHP-FPM workers, the side effect is real users hitting 502/504. The drupal_wp_scanner_spike alert correlated with a 5xx burst means the sweep is now an availability problem, not just a noise problem.