Drupal site under backdoor probing — detecting webshell scanners and known-CMS probes

1. Problem

You opened the access log this morning and the screen would not stop scrolling. Hundreds of 404s a minute, all for files that have never existed on a Drupal site:

185.234.219.114 - - [27/Apr/2026:09:14:02 +0000] "GET /shell.php HTTP/1.1" 404 162
185.234.219.114 - - [27/Apr/2026:09:14:02 +0000] "GET /c99.php HTTP/1.1" 404 162
185.234.219.114 - - [27/Apr/2026:09:14:03 +0000] "GET /alfa.php HTTP/1.1" 404 162
185.234.219.114 - - [27/Apr/2026:09:14:03 +0000] "GET /wp-login.php HTTP/1.1" 404 162
185.234.219.114 - - [27/Apr/2026:09:14:04 +0000] "GET /wp-admin/admin-ajax.php HTTP/1.1" 404 162
185.234.219.114 - - [27/Apr/2026:09:14:04 +0000] "GET /vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php HTTP/1.1" 404 162
45.79.181.22 - - [27/Apr/2026:09:14:05 +0000] "GET /admin.php HTTP/1.1" 404 162
45.79.181.22 - - [27/Apr/2026:09:14:05 +0000] "GET /joomla/administrator/ HTTP/1.1" 404 162

Your Drupal site is getting hammered with .php 404s from a scanner. If you searched for "drupal /wp-admin /wp-login probe attack" or "drupal block backdoor scanner ips" and landed here mid-incident, this is the diagnostic playbook. The scanner is walking a fixed list of known backdoor names — shell.php, c99.php, alfa.php, r57.php, wso.php — plus CMS-specific probes for WordPress, Joomla, and old phpMyAdmin installs. None of those files exist on your Drupal site. Every request is a 404. The activity surfaces as the drupal_backdoor_probe_total signal, and it is not random noise.

2. Impact

A backdoor probe scanner is reconnaissance. The attacker is not trying to exploit your Drupal site directly — they are checking whether a previous attacker already compromised it. Webshells like shell.php, c99.php, and alfa.php are dropped by mass-exploitation campaigns and indexed by criminal scanners that come back later to use the access. If your site is hosting one — even one a previous owner installed and never noticed — this scan finds it.

The cost lands in three places.

Compromise discovery delay. If a probe gets a 200 instead of a 404, the attacker is in within minutes. Most webshells accept arbitrary command execution via a single GET parameter (?cmd=id). One successful probe leads to data exfiltration, lateral movement, and ransomware staging — typically before the next morning's standup.
Resource saturation. A sustained scan at 50 requests per second burns PHP-FPM workers on 404 page renders. Drupal's 404 handler is not free: it boots the kernel, runs the routing layer, dispatches the kernel.request event to its subscribers, and returns a themed page. At scale this exhausts the worker pool, so legitimate requests start failing with 502s, and the database connection count balloons.
Cover for real attacks. Operators tune out scanner noise. When the access log is 80% backdoor probes, the one real exploitation attempt — a ?destination=... open redirect, a cached views POST, an SA-CORE-2018-002 Drupalgeddon retry — disappears in the noise. Attackers know this and time real work to coincide with scan storms.

A single distributed scan can generate 500,000 404s in a day — 500,000 lines you have to skim without a signal-driven detection layer.

3. Why It’s Hard to Spot

Drupal does not flag this. The 404 handler returns its themed page and moves on. Watchdog logs each 404 as a "page not found" entry, which is useless when 50,000 of them arrive overnight and evict every other log entry under dblog's default 1000-row cap. By the time you look, the only thing in watchdog is 404s.

Uptime checks return 200, because the homepage is fine. The CDN does not block this traffic — most CDN bot rules whitelist GET requests under a kilobyte, which is exactly what these probes are. Fail2ban is not installed by default on managed Drupal hosts. The WAF, if there is one, blocks SQL injection and XSS but happily passes a request for /shell.php because the request itself is syntactically clean.

The biggest tell is the easiest to miss: the URIs being probed are CMS-specific paths that should never appear on a Drupal site. /wp-admin/admin-ajax.php, /wp-login.php, /joomla/administrator/, /vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php (CVE-2017-9841) — these are signatures of "I am scanning every host on the internet, regardless of platform." Drupal admins assume they are noise and stop reading. The pattern is the signal.

4. Cause

A scanner targets your IP address, not your site. It works from a fixed wordlist of webshells and CMS paths and walks every entry against every host in its list of IP blocks. The Logystera Drupal module emits an http.request event for every request hitting the kernel, including 404s. When the request URI matches a known-bad pattern (the regex \.(php|asp|aspx|jsp|cgi)$) and the response is a 404, that combination is counted as one increment of the drupal_backdoor_probe_total signal, scoped per entity, per source IP.

This is what Definition::Rule drupal_backdoor_probe evaluates. Its DSL conditions:

conditions:
  - type: equals
    key: event_type
    value: http.request
  - type: regex
    key: payload.path
    value: "\\.(php|asp|aspx|jsp|cgi)$"
  - type: equals
    key: payload.status_code
    value: 404
threshold:
  count: 20
  interval: 600
  group_by: [entity_id]
suppress:
  time: 1800
  group_by: [entity_id]

Twenty matching requests in 10 minutes against the same entity trip the rule. The rule does not need to know what shell.php is. It needs to know that 20 or more requests for .php / .asp / .aspx / .jsp / .cgi paths returned 404 inside ten minutes, which is a strong sign that something is scanning and is very unlikely to come from a real user browsing.
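The threshold and suppress semantics can be sketched in a few lines of Python. This is an illustrative in-memory evaluator, not Logystera's actual implementation: the class name ProbeRule, the feed method, and the event fields ts and entity_id are assumptions made for the example. Only the regex, the 20 / 600 threshold, and the 1800-second suppress come from the rule above.

```python
import re
from collections import defaultdict, deque

PATH_RE = re.compile(r"\.(php|asp|aspx|jsp|cgi)$")  # same pattern as the rule's regex condition
THRESHOLD_COUNT = 20       # threshold.count
THRESHOLD_INTERVAL = 600   # threshold.interval, in seconds
SUPPRESS_TIME = 1800       # suppress.time, in seconds


class ProbeRule:
    """Hypothetical evaluator for the conditions/threshold/suppress block."""

    def __init__(self):
        self.windows = defaultdict(deque)  # entity_id -> timestamps of matching events
        self.suppressed_until = {}         # entity_id -> time before which alerts are muted

    def matches(self, event):
        payload = event.get("payload", {})
        return (
            event.get("event_type") == "http.request"
            and PATH_RE.search(payload.get("path", "")) is not None
            and payload.get("status_code") == 404
        )

    def feed(self, event):
        """Return True when this event raises an alert for its entity."""
        if not self.matches(event):
            return False
        entity, now = event["entity_id"], event["ts"]  # assumed field names
        window = self.windows[entity]
        window.append(now)
        # Drop timestamps that have fallen out of the sliding interval.
        while window and window[0] <= now - THRESHOLD_INTERVAL:
            window.popleft()
        if len(window) < THRESHOLD_COUNT:
            return False
        if now < self.suppressed_until.get(entity, 0):
            return False  # threshold met, but still inside the suppress window
        self.suppressed_until[entity] = now + SUPPRESS_TIME
        return True
```

Feeding it twenty matching events one second apart for the same entity returns True on the twentieth and False on any further event until the suppress window has passed.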

5. Solution

5.1 Diagnose (logs first)

Start with the raw access log. Confirm volume, then identify offenders, then correlate with the suppression window so you understand what the alert pipeline saw.

# Confirm the volume of script-extension 404s in the most recent 200k log lines
tail -n 200000 /var/log/nginx/access.log \
  | awk '$9 == "404"' \
  | grep -E '\.(php|asp|aspx|jsp|cgi)(\?|"| )' \
  | wc -l

Each matched line is one http.request event with status_code=404 and a path matching the regex — exactly the conditions that increment drupal_backdoor_probe_total. A healthy Drupal site produces 0–10 of these per hour (the occasional /wp-login.php from random internet noise). If you see 500+, you are in an active scan.

# Top source IPs hammering script-extension 404s in the last 100k lines
tail -n 100000 /var/log/nginx/access.log \
  | awk '$9 == "404"' \
  | grep -E '\.(php|asp|aspx|jsp|cgi)(\?|"| )' \
  | awk '{print $1}' | sort | uniq -c | sort -rn | head -20

This reproduces what the supporting signal drupal_top_attack_ips shows on the dashboard: the IP distribution across your drupal_backdoor_probe_total events. One IP at 90%+ means a single rented VPS. A flat distribution across 50+ IPs means a botnet or proxy network — harder to block, more dangerous.
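The single-source versus distributed reading can be made explicit with a small Python sketch. The function name classify_sources and its parameter names are hypothetical; the 90% share and the 50-IP cut-off are the figures quoted above. It does not measure how flat the distribution is, only how many distinct sources there are.

```python
from collections import Counter


def classify_sources(ips, single_share=0.9, distributed_min_ips=50):
    """ips: one source IP per matching 404 probe line (for example, column 1 of the log)."""
    counts = Counter(ips)
    total = sum(counts.values())
    if total == 0:
        return "no probes"
    top_ip, top_count = counts.most_common(1)[0]
    if top_count / total >= single_share:
        return f"single source ({top_ip})"
    if len(counts) >= distributed_min_ips:
        return "distributed (botnet or proxy network)"
    return "mixed"
```

Anything that lands in "mixed" needs a human look at the top-20 listing before deciding how to block.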

# Most-probed nonexistent paths
tail -n 100000 /var/log/nginx/access.log \
  | awk '$9 == "404" {print $7}' \
  | grep -E '\.(php|asp|aspx|jsp|cgi)(\?| |$)' \
  | sort | uniq -c | sort -rn | head -30

This is the data behind the drupal_top_404_uris signal. Expect to see wp-login.php, xmlrpc.php, wp-admin/admin-ajax.php, vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php, and the webshell list (shell.php, c99.php, alfa.php, r57.php, wso.php, marijuana.php). The ratio of WordPress-specific paths to total probes is what drupal_wp_scanner_hits_total measures — when WP paths dominate, you know it is a generic CMS scanner, not a Drupal-targeted probe.
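To put a number on "WP paths dominate", a rough Python sketch can compute the WordPress share of the probed paths. The pattern below is an assumption: the exact set of paths that drupal_wp_scanner_hits_total counts as WordPress-specific is not stated here, so treat WP_RE and wp_share as illustrative names only.

```python
import re

# Assumed definition of "WordPress-specific": any /wp-* segment or xmlrpc.php.
WP_RE = re.compile(r"/(wp-[^/]*|xmlrpc\.php)", re.IGNORECASE)


def wp_share(paths):
    """Fraction of probed paths that look WordPress-specific (0.0 for empty input)."""
    paths = list(paths)
    if not paths:
        return 0.0
    wp_hits = sum(1 for p in paths if WP_RE.search(p))
    return wp_hits / len(paths)
```

Feed it the path column produced by the command above; a share well above one half suggests a generic CMS scanner rather than a Drupal-targeted probe.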

Time correlation. Tie the 404 spike to a real-world event. Scanners are usually triggered by a scheduled list run or a fresh dump of "online IPs" — both observable as a sudden onset.

# Bucket 404 .php probes by minute around the suspected onset
# (the log is already chronological, so uniq -c alone keeps the minutes in order)
grep -E ' 404 ' /var/log/nginx/access.log \
  | grep -E '\.(php|asp|aspx|jsp|cgi)(\?|"| )' \
  | awk '{print $4}' \
  | cut -d: -f1-3 \
  | uniq -c | tail -30

The output shows the minute-by-minute count. Minutes with no matching requests do not appear at all, so read gaps between timestamps as zeros. A clean attack signature looks like 0, 0, 0, 412, 488, 511, 503, 497, 0: flat baseline, sudden ramp, sustained plateau, abrupt stop when the scanner moves to the next IP block. That ramp time is what feeds the §6 alert caption: "drupal_backdoor_probe_total threshold breached at 09:14, 20 events in 47 seconds." If your ramp aligns with a public IP rotation window or with a known scanner schedule (Censys, Shodan, GreyNoise rescan windows), document it, because it tells you whether you are a target or collateral.
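A Python sketch of the same bucketing can fill in the empty minutes, which makes the baseline, ramp, plateau shape easier to read. It assumes the nginx combined log format shown in the sample lines at the top of this guide; probes_per_minute, LINE_RE and EXT_RE are names invented for the example.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Timestamp, request path and status from a combined-format access log line.
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')
EXT_RE = re.compile(r"\.(php|asp|aspx|jsp|cgi)$")


def probes_per_minute(lines):
    """Return [(minute, count), ...] for script-extension 404s, zero-filled."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m["status"] != "404":
            continue
        path = m["path"].split("?", 1)[0]  # ignore the query string
        if not EXT_RE.search(path):
            continue
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        counts[ts.replace(second=0)] += 1
    if not counts:
        return []
    minute, last = min(counts), max(counts)
    series = []
    while minute <= last:
        series.append((minute, counts.get(minute, 0)))
        minute += timedelta(minutes=1)
    return series
```

Pass it an open file handle on the access log; each tuple is one minute, so a run of zeros followed by a jump into the hundreds marks the onset.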

5.2 Root Causes

Each cause maps to which signal increments and how the log line appears.

Generic internet-wide CMS scanner. Increments drupal_backdoor_probe_total evenly across wp-*, vendor/phpunit/*, joomla/*, phpmyadmin/*. Logs show one IP requesting 30–80 distinct paths in under a minute. This is the most common case: you are one host on a list of millions. The WordPress paths counted by the supporting signal drupal_wp_scanner_hits_total will dominate drupal_top_404_uris.
Compromise revisit. A previous attacker dropped a webshell on a former hosting provider, sold the access list, and a buyer is now scanning for live shells. Increments drupal_backdoor_probe_total against very specific filenames (shell.php, c99.php, wp-login.php with a specific query string). Look for one IP requesting <5 distinct paths repeatedly — they have a target list, not a wordlist.
Drupalgeddon-class probing. Increments drupal_backdoor_probe_total plus http.request events with status_code=200 against /?q=user/password&name[%23post_render][]=.... If any 200 sneaks in among the 404 storm, escalate immediately — it means a probe found a working endpoint. Logs show mixed status codes from the same IP within seconds.
Compromised neighbor on shared hosting. Increments drupal_backdoor_probe_total for paths like /cgi-bin/test.cgi, /cgi-sys/defaultwebpage.cgi. Source IP is internal or your hosting provider's range. The scanner is on the same physical host. This is the worst-case scenario and warrants a hosting provider escalation.
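The four causes above can be turned into a first-pass triage over parsed log entries. The sketch below is a heuristic, not a detector: the function name triage, the labels, and the exact cut-offs (fewer than 5 paths, 30 or more paths) are assumptions derived from the descriptions above, and it only recognises private-range addresses, not your hosting provider's public range.

```python
import ipaddress
from collections import defaultdict


def triage(requests):
    """requests: iterable of (ip, path, status) tuples from one time window.
    Returns {ip: label} using the rough heuristics described in the text."""
    by_ip = defaultdict(list)
    for ip, path, status in requests:
        by_ip[ip].append((path, status))
    labels = {}
    for ip, hits in by_ip.items():
        paths = {p for p, _ in hits}
        statuses = {s for _, s in hits}
        if ipaddress.ip_address(ip).is_private:
            labels[ip] = "neighbor"         # internal source: escalate to the hosting provider
        elif 200 in statuses and 404 in statuses:
            labels[ip] = "possible-hit"     # mixed status codes: escalate immediately
        elif len(paths) < 5 and len(hits) > len(paths):
            labels[ip] = "revisit"          # short target list requested repeatedly
        elif len(paths) >= 30:
            labels[ip] = "generic-scanner"  # wide wordlist
        else:
            labels[ip] = "unclassified"
    return labels
```

A "possible-hit" label only says that a 200 appeared among the 404s from that IP; confirm against the actual request before treating it as a compromise.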

5.3 Fix

Order of operations: stop the bleeding, then root out compromise, then harden.

Block the top IPs at the edge. From the top-source-IP listing in §5.1, take the top 5 offending IPs. If you are on Cloudflare, add them to a WAF rule with action Block. If you are on bare nginx, append deny 185.234.219.114; (one line per IP) to /etc/nginx/conf.d/blocklist.conf and reload. Do not rely on Drupal-level blocking, because the kernel still boots for every request.
Drop CMS-specific noise at the web server. Most Drupal sites have zero legitimate traffic to /wp-*, /xmlrpc.php, /vendor/phpunit/*, /joomla/*. Return 444 (nginx connection close) or 403 directly from the webserver, bypassing PHP entirely:

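   # Place this before the generic PHP location: nginx uses the first regex location that matches.
   # The leading slash limits matches to whole path segments, so a slug such as
   # /blog/migrating-from-joomla is not dropped.
   location ~* /(wp-login\.php|wp-admin|wp-content|xmlrpc\.php|phpunit/|joomla/|administrator/index\.php) {
     return 444;
   }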

This zeroes out the PHP-FPM cost of the scan and keeps drupal_backdoor_probe_total from incrementing during the next wave.

Audit for actual compromise. Even with the scan blocked, you must rule out a hit. Walk the docroot for unexpected .php files outside core/contrib paths:

   find /var/www/html -name '*.php' -newer /etc/passwd \
     -not -path '*/core/*' -not -path '*/modules/*' -not -path '*/themes/*' \
     -not -path '*/vendor/*' -not -path '*/sites/*/files/private/*'

Any PHP file under sites/default/files/ or in the docroot that you did not put there is a webshell candidate. Also check sites/default/files/.htaccess for changes you did not make. Quarantine, do not delete: preserve for forensics.

Enable fail2ban or equivalent. A [drupal-404-probe] jail on \.(php|asp|aspx|jsp|cgi)$ 404s with findtime=600 and maxretry=20 mirrors the Logystera rule's threshold (counted per source IP rather than per entity) and gives you OS-level blocking while alerts route through your incident channel.
Rotate exposed secrets. If you found a webshell, assume database, settings.php credentials, S3 keys, and SMTP credentials are compromised. Rotate everything before bringing the site back.
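For the edge-block step, the deny lines for the nginx blocklist can be generated rather than typed by hand. This is a minimal Python sketch: deny_lines is a hypothetical helper, and skipping private and loopback addresses is a safety assumption so that a reverse proxy or load balancer address that shows up in column 1 of the log is not blocked by accident.

```python
import ipaddress


def deny_lines(ips):
    """Turn a list of offending IPs into nginx 'deny' directives.
    Invalid entries, duplicates, and private or loopback addresses are skipped."""
    seen = set()
    lines = []
    for raw in ips:
        try:
            addr = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # not an IP address, ignore
        if addr.is_private or addr.is_loopback or addr in seen:
            continue
        seen.add(addr)
        lines.append(f"deny {addr};")
    return lines
```

Write the result to the blocklist file yourself, check it with nginx -t, and then reload.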

5.4 Verify

After applying blocks and the nginx rule, watch for the signal to stop firing.

Signal disappearance. No new drupal_backdoor_probe_total increments for the same entity_id within the next 30 minutes. The rule's suppress: time: 1800 means even an active scanner will not re-alert in that window, so verification has to extend past it. Wait at least 45 minutes from the last alert before declaring the incident closed.
Expected baseline. A healthy Drupal site sustains 0–5 increments of drupal_backdoor_probe_total per hour from background internet noise — random WP probes, the occasional /admin.php from a misconfigured monitoring tool. Anything under 5/hour is steady-state. If you are still seeing 50+/hour after edge blocks are in place, your block list is incomplete or the scanner has rotated to a fresh IP range.
What to grep:

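  # 30-min rolling count, should drop to single digits
  # (plain string comparison on the timestamp: assumes GNU date, UTC log times,
  #  and a window that does not cross midnight)
  grep -E ' 404 ' /var/log/nginx/access.log \
    | grep -E '\.(php|asp|aspx|jsp|cgi)(\?|"| )' \
    | awk -v cutoff="$(date -u -d '30 min ago' +%d/%b/%Y:%H:%M)" '$4 >= "["cutoff' \
    | wc -l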

Dashboard panel: the drupal_backdoor_probe_total time-series should flatline. The drupal_top_attack_ips panel should empty out within 30 minutes as the suppression window expires and the signal stops re-firing.

If drupal_backdoor_probe_total stays above 5/hour for 30 minutes after blocks are in place, the underlying source has not been mitigated — go back to §5.3 step 1 and expand the block list.
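The rolling count can also be computed by parsing timestamps instead of comparing them as strings, which keeps it correct across midnight and for logs written in a non-UTC offset. A minimal Python sketch, assuming the nginx combined log format used in the samples above; recent_probe_count and its parameters are names invented for the example.

```python
import re
from datetime import datetime, timedelta, timezone

LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')
EXT_RE = re.compile(r"\.(php|asp|aspx|jsp|cgi)$")


def recent_probe_count(lines, now=None, window=timedelta(minutes=30)):
    """Count script-extension 404s whose timestamp falls inside the last `window`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    count = 0
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m["status"] != "404":
            continue
        if not EXT_RE.search(m["path"].split("?", 1)[0]):
            continue
        # %z keeps the log's UTC offset, so the comparison is timezone-aware.
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        if ts >= cutoff:
            count += 1
    return count
```

Run it against the access log a few times during the 45-minute verification period; the count should fall to single digits and stay there.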

6. How to Catch This Early

Fixing it is straightforward once you know the cause. The hard part is knowing it happened at all.

This issue surfaces as drupal_backdoor_probe_total.

Everything you just did manually (tail the access log, count 404s by extension, group by source IP, isolate the WordPress-path subset, correlate the spike with the onset minute) Logystera does automatically. The same drupal_backdoor_probe_total signal you just searched for is detected, charted, and alerted in real time. The alert rule fires on 20 matching events in 600 seconds per entity, with an 1800-second suppress so a single persistent scanner does not flood your incident channel: you get at most one critical alert per entity per 30 minutes, with the evidence already attached.

drupal_backdoor_probe_total per entity, last 24h: sharp ramp at 09:14 immediately after a fresh scanner IP block went live, with re-fires suppressed by the 1800s window.

Critical alert fires within 60s of drupal_backdoor_probe_total crossing 20 events / 10 minutes, with top attacker IPs and probed URIs in the evidence section.

Logystera turns a backdoor probe scan from "buried under 500,000 noise lines in the access log" into a 60-second notification with the offending IP, the regex match that triggered it, and the suppression window already counting down.

7. Related Silent Failures

Backdoor probing rarely arrives alone. Adjacent signals worth watching, each linked to its own diagnostic guide:

Drupal REST/JSON:API enumeration attempts — api.access walks /jsonapi/node/...?page[offset]=... to scrape entity data. Different attack class, same scanner toolchain. See guide #28 in this cluster.
Drupal failed login attempts — auth.login_failed storm against /user/login and /user/login?_format=json. Often follows a successful backdoor probe when the attacker pivots from "who is online" to "who has weak credentials."
Drupal access-denied watchdog noise — watchdog.access_denied floods that overflow dblog and hide real signals. Direct downstream of any sustained scan.
Drupal Drupalgeddon-class exploit attempts — http.request 200s against form_alter render array injection paths. The dangerous outcome of a scan that finds an unpatched site.
Drupal role-change privilege escalation — user.role_change events. The terminal signal in the chain that starts with a backdoor probe and ends in admin takeover.
