The PolySwarm Blog


Using PolySwarm Threat Hunting and Metadata Searching for intel on 0-days

Jul 29, 2019 8:29:00 PM / by Katherine Yan and Javier Botella Fernandez

A deep dive into using PolySwarm’s hunting features to identify malware. This demonstration shows how Threat Hunting and Metadata Searching help analysts gather info on 0-day malware, using EvilGnome as an example.


With the release of PolySwarm’s new live and historical hunting features, participants can now use PolySwarm’s CLI and UI to proactively search through PolySwarm’s incoming sample stream (live hunting) or against PolySwarm’s entire archive of samples (historical hunting) to detect and isolate malware threats.

By using YARA rulesets to write signatures for real-world APT malware, users can now answer key questions about newly discovered malware:

  1. When was this malware first seen in the wild?
  2. How well have AV engines / EDR agents detected this threat over time?
  3. Are there variants of this malware?

To demonstrate the efficacy of Threat Hunting, PolySwarm Senior Security Engineer Javier Botella Fernandez provides a play-by-play below using a recent 0-day as an example.

 

Try out PolySwarm THREAT HUNTING for yourself, with a 15-day free trial.  

 

How to use PolySwarm’s Hunting tools to identify 0-day malware

Here’s a situation that might sound familiar: You see a news story come across Twitter about a new 0-day malware that is targeting some organization. Since it’s your job to protect your company’s network, you jump into action. Part of that action will likely involve using threat hunting tools. Hunting can be extremely valuable to glean real-time insights on a previously unknown piece of malware. Below is a detailed explanation of how you can use PolySwarm’s Hunting tools to identify any malware and variants.

Let’s take a real, recent example: EvilGnome, a Linux backdoor that targets desktop users, discovered as a 0-day by Intezer Labs.

 

How to find the 1st stage malware

We will consider two scenarios that vary based on the information you have available.

[ Scenario 1 ]

We don’t have the hash of the malware file. We just know a few details about the script source code from a snippet in the news, and we want to find the file in the PolySwarm network.

Create a YARA rule

To perform a hunt, we need to create a YARA rule file containing one or more rules. So, we build a YARA rule file with the following rule and name the file evilgnome_spy_agent.yar:

  rule EvilGnome
  {
      strings:
          $s = "targetdir=\"spy-agent\""
      condition:
          $s
  }

Run a Historical Hunt using PolySwarm CLI

Now that we have a rules file, let’s start a historical hunt using the PolySwarm CLI tool.

  $ polyswarm historical start evilgnome_spy_agent.yar 
Successfully submitted rules, hunt id: 34739757571065816

The historical hunt operation was successfully started, so now we need to wait for results to come in.

Check for results using PolySwarm CLI

Historical hunting can take up to several hours to return results, so we will use the PolySwarm CLI to periodically check for results, about every 30 minutes.

  $ polyswarm historical results
Scan status: Running
Found 1 sample in this hunt.
Match on rule EvilGnome
File e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
File type: mimetype: text/x-shellscript, extended_info: POSIX shell script executable (binary data)
SHA256: e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
SHA1: 264bd2b6d809a519b4348dbfc5791d3fc9342af8
MD5: 213c6443b2bd78c4e0aad54ec8338214
First seen: Fri, 19 Jul 2019 11:36:07 GMT
Observed countries: US,ES
Observed filenames: e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755,spy-agent-setup-linux.run
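The periodic check described above could be scripted. The following is a minimal sketch, not part of the PolySwarm tooling: the function names are hypothetical, it assumes the `polyswarm` command is on your PATH, and it assumes that any status other than "Scan status: Running" means the hunt has finished.

```python
import subprocess
import time


def run_results_command():
    """Run the same CLI command shown above and return its text output."""
    completed = subprocess.run(
        ["polyswarm", "historical", "results"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


def wait_for_hunt(fetch=run_results_command, interval_seconds=30 * 60,
                  max_checks=12, sleep=time.sleep):
    """Poll until the output no longer reports 'Scan status: Running'.

    Assumption: any status other than 'Running' means the hunt is over.
    Returns the last output seen, even if max_checks was exhausted.
    """
    output = ""
    for attempt in range(max_checks):
        output = fetch()
        if "Scan status: Running" not in output:
            break
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    return output
```

Calling `wait_for_hunt()` with its defaults checks every 30 minutes for up to 12 checks. The `fetch` and `sleep` parameters exist only so the loop can be exercised without the CLI installed.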

You can also get extended output for the historical results by adding the --fmt json argument. The JSON-formatted output includes a lot of additional information, and it is helpful for parsing and for other automated tasks, including sandboxing.

We will see an example below showing how to use the json output to extract interesting things in our searches.

Check for results using PolySwarm UI

We can also check for results using PolySwarm’s UI. Log into your account on https://polyswarm.network and go to the Search page, then click on the Historical Hunting tab. You can refresh the page to display any new results.

 

Review the results

Whether you look at the results in the PolySwarm CLI or the PolySwarm UI, you get the same information. Here we see that one result was found. Interestingly, we ran this search on July 31, but the file was first seen in PolySwarm’s network on July 19.
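The gap between those two dates can be computed directly from the "First seen" value in the CLI output. This is a small sketch using only the Python standard library; the function name is hypothetical, and it assumes the timestamp keeps the format shown in the output above.

```python
from datetime import date
from email.utils import parsedate_to_datetime


def days_before_search(first_seen_header, search_day):
    """Days between the 'First seen' timestamp and the day we searched."""
    first_seen = parsedate_to_datetime(first_seen_header)
    return (search_day - first_seen.date()).days


lead = days_before_search("Fri, 19 Jul 2019 11:36:07 GMT", date(2019, 7, 31))
print(lead)  # 12
```

In other words, the sample had been sitting in the archive for 12 days by the time we ran the hunt.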

[ Scenario 2 ]

Search for a known hash using PolySwarm CLI

Maybe we get lucky and one of the news articles provides a cryptographic hash of the malware. Then, we can run a query in PolySwarm’s network for that hash.

The news says this malware’s hash is:

  e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755

Let’s query PolySwarm using PolySwarm CLI and see if any results come back.

  $ polyswarm search hash e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
Found 1 matches to the search query.
Search results for sha256=e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
File e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
File type: mimetype: text/x-shellscript, extended_info: POSIX shell script executable (binary data)
SHA256: e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
SHA1: 264bd2b6d809a519b4348dbfc5791d3fc9342af8
MD5: 213c6443b2bd78c4e0aad54ec8338214
First seen: Fri, 19 Jul 2019 11:36:07 GMT
Observed countries: ES,US
Observed filenames: spy-agent-setup-linux.run,e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755

Indeed, we get a match on that hash, along with the high-level results. Remember, we can get additional information about this hash by passing the --fmt json argument to the polyswarm command.
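News articles do not always say which kind of hash they are quoting. As a rough sketch (the helper name is hypothetical and this is not a PolySwarm feature), a hex digest can be classified by its length before you paste it into a query, using the lengths of the three hash types that appear in the output above.

```python
import re

# Hex digest lengths for the three hash types in the CLI output above.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def classify_hash(value):
    """Return 'md5', 'sha1', or 'sha256', or None if it is not a hex digest."""
    candidate = value.strip().lower()
    if not re.fullmatch(r"[0-9a-f]+", candidate):
        return None
    return HASH_LENGTHS.get(len(candidate))


print(classify_hash("e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755"))  # sha256
```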

[ Analysis ]

So, at this point from Scenario 1 or 2 we have found the sample inside the PolySwarm network; let’s download it and have some fun….

Downloading an artifact using PolySwarm CLI

If you have a hash of an artifact you can download it using PolySwarm CLI and save it into a directory of your choice. The name of the file when it is downloaded is the sha256 hash.

  $ polyswarm download EvilGnome e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
Downloaded e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755: EvilGnome/e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
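Because the downloaded file is named after its SHA-256 hash, a quick integrity check is possible before any analysis. The sketch below is an illustration with a hypothetical function name; it simply recomputes the digest with Python's `hashlib` and compares it to the file name.

```python
import hashlib
from pathlib import Path


def matches_own_name(path):
    """Return True if the file's SHA-256 digest equals its file name."""
    path = Path(path)
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return digest == path.name.lower()
```

For the download above, you would call it as `matches_own_name("EvilGnome/e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755")`.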

Examine the artifact using the file command

Let’s do a first-level check on the artifact using the file command.

  $ file EvilGnome/e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755
EvilGnome/e9bd299eec7dbee7d4f5c97ccf8ab27a7b77388eaa649f353e41df8b7b1df755: POSIX shell script executable (binary data)

The file command tells us that the malware file is a shell script for a POSIX system, like Linux, and that it also contains binary data.

Run it in your local sandbox

So, let’s run it inside our Linux sandbox to determine all of the files the malware creates and uses, as well as any paths where the malware files get installed.

First, let’s see what the help output tells us about this Bash script.

  ubuntu@ubuntu:~$ ./spy-agent-setup-linux.run --help
Makeself version 2.3.0
1) Getting help or info about ./spy-agent-setup-linux.run :
./spy-agent-setup-linux.run --help Print this message
./spy-agent-setup-linux.run --info Print embedded info : title, default target directory, embedded script …
./spy-agent-setup-linux.run --lsm Print embedded lsm entry (or no LSM)
./spy-agent-setup-linux.run --list Print the list of files in the archive
./spy-agent-setup-linux.run --check Checks integrity of the archive
2) Running ./spy-agent-setup-linux.run :
./spy-agent-setup-linux.run [options] [ -- ] [additional arguments to embedded script] with following options (in that order)
--confirm Ask before running embedded script
--quiet Do not print anything except error messages
--noexec Do not run embedded script
--keep Do not erase target directory after running the embedded script
--noprogress Do not show the progress during the decompression
--nox11 Do not spawn an xterm
--nochown Do not give the extracted files to the current user
--target dir Extract directly to a target directory
directory path can be either absolute or relative
--tar arg1 [arg2 …] Access the contents of the archive through the tar command
-- Following arguments will be passed to the embedded script

The --noexec option seems interesting, so let’s try it.

  ubuntu@ubuntu:~$ ./spy-agent-setup-linux.run --noexec
Creating directory spy-agent
Verifying archive integrity… 100% All good.
Uncompressing setup files… 100%
ubuntu@ubuntu:~$

Running that command created some new files as we can see here:

  Files extracted:
spy-agent
├── gnome-shell-ext
├── gnome-shell-ext.sh
├── rtp.dat
└── setup.sh

Let’s again use the file command to get some base information about those new files:

  spy-agent $ file *
gnome-shell-ext: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=07f157388a2f56b3f800ea4d84c209c42a68c9fa, not stripped
gnome-shell-ext.sh: POSIX shell script, ASCII text executable
rtp.dat: data
setup.sh: POSIX shell script, ASCII text executable

By running it in the sandbox, we learned that the file we have is only the initial stage: a single-stage dropper whose purpose is to install and run the malware itself (the ELF 64-bit file) and to set up everything for persistence.

How to find the 2nd-stage malware via PolySwarm’s Metadata Search

[ Scenario 1 ]

Let’s say we don’t have the hash, but the news coverage lists some of the external functions that this specific malware uses:

 
Source: https://nakedsecurity.sophos.com/2019/07/25/evilgnome-linux-malware-aimed-at-your-laptop-not-your-servers/

Use PolySwarm CLI to search for metadata

The PolySwarm CLI allows us to run queries over the metadata of all artifacts. These metadata search queries are extremely fast. We should be able to get results in seconds, so let’s give it a try:

  $ polyswarm search metadata 'exiftool.FileType:"ELF executable" AND lief.exported_functions:/.*Shooter(Sound|Ping|Key|Image|File).*/'
Found 1 matches to the search query.
Search results for {'query': {'query_string': {'query': 'exiftool.FileType:"ELF executable" AND lief.exported_functions:/.*Shooter(Sound|Ping|Key|Image|File).*/'}}}
File 7ffab36b2fa68d0708c82f01a70c8d10614ca742d838b69007f5104337a4b869
File type: mimetype: application/x-executable, extended_info: ELF 64-bit LSB executable, x86-64, version 1 (SYSV)
SHA256: 7ffab36b2fa68d0708c82f01a70c8d10614ca742d838b69007f5104337a4b869
SHA1: d11582903173e14c4ce41a3d2edfebdf5bf324c5
MD5: dcfc3cb0ca5ea83d835af6979a9b85c1
First seen: Thu, 18 Jul 2019 10:57:34 GMT
Observed countries: ES,US
Observed filenames: 7ffab36b2fa68d0708c82f01a70c8d10614ca742d838b69007f5104337a4b869.bin,7ffab36b2fa68d0708c82f01a70c8d10614ca742d838b69007f5104337a4b869

From the results, you can see that we got it! If there were another sample from the same kind of malware, it would show up as well, but since this is a 0-day malware, we only get this new sample from the wild.
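To see what the regular expression in that query accepts, here is a small sketch that applies the same pattern in Python. It rests on two assumptions: that the search back end anchors the pattern to the whole field value (which would explain the leading and trailing `.*`), modeled here with `re.fullmatch`, and that the export names in the sample list are invented for illustration rather than taken from the real binary.

```python
import re

# Same pattern as in the metadata query above.
PATTERN = re.compile(r".*Shooter(Sound|Ping|Key|Image|File).*")


def matching_exports(names):
    """Keep only the names the query's regular expression would accept."""
    return [name for name in names if PATTERN.fullmatch(name)]


# Hypothetical export names, made up for this example.
sample = ["ShooterSound::run", "ShooterFile", "main", "ShooterVideo"]
print(matching_exports(sample))
# ['ShooterSound::run', 'ShooterFile']
```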

For the advanced user, let’s say that you want to extract specific fields from the search results in order to update your local database with, e.g., the ssdeep hash.

Here is an inline example to demonstrate:

  $ polyswarm --fmt json search metadata 'exiftool.FileType:"ELF executable" AND lief.exported_functions:/.*Shooter(Sound|Ping|Key|Image|File).*/' | python -c 'import json,sys;json_output=json.load(sys.stdin);print(json_output[0]["result"][0]["artifact_metadata"][0]["tool_metadata"]["ssdeep"])'
3072:6LvhrsSmJQZ2qF17QnOifLfD/eNKndEeIrVbv9KPc8FrEX5M0nN8r9AnGA:6LSSmJQQq7QOifGSdeRSc8Frc8JAnGA
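If you plan to reuse this extraction, the one-liner may be easier to maintain as a small function. The sketch below follows exactly the same key path as the inline command; the function name is hypothetical, and the JSON layout is assumed to be whatever the CLI produced for the command above.

```python
import json
import sys


def extract_ssdeep(search_output):
    """Follow the same key path as the inline one-liner above."""
    first_hit = search_output[0]["result"][0]
    return first_hit["artifact_metadata"][0]["tool_metadata"]["ssdeep"]
```

To use it at the end of the same pipe, save the function in a script and finish the script with `print(extract_ssdeep(json.load(sys.stdin)))`.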

How to keep track of this malware using Hunting

Now that we know more about this malware, let’s set up a Live Hunt to alert us if any similar malware shows up in the future.

To do that, let’s build a YARA rule designed to catch all the future samples of this kind of malware, so we can use it in Historical and Live Hunts.

  rule EvilGnome
  {
      strings:
          $spy_agent = "targetdir=\"spy-agent\""
          $evil_functions = /Shooter(Sound|Ping|Key|Image|File)/
          $config_agent_id = { 6c f7 51 3a 6b 01 00 00 }
      condition:
          $spy_agent or $evil_functions or $config_agent_id
  }
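Because the condition joins the three strings with `or`, a file only needs to contain one of them to match. The following plain-Python sketch mimics that logic so you can reason about what the rule will catch. It is an approximation for illustration only, not a replacement for running YARA, and the function name is hypothetical.

```python
import re

# The three strings from the rule above, expressed as Python byte patterns.
SPY_AGENT = b'targetdir="spy-agent"'
EVIL_FUNCTIONS = re.compile(rb"Shooter(Sound|Ping|Key|Image|File)")
CONFIG_AGENT_ID = bytes.fromhex("6cf7513a6b010000")


def rule_matches(data):
    """Mimic the rule's condition: any one of the three strings is enough."""
    return (
        SPY_AGENT in data
        or EVIL_FUNCTIONS.search(data) is not None
        or CONFIG_AGENT_ID in data
    )
```

For example, `rule_matches(b'targetdir="spy-agent"')` is true because of the first string alone, while data containing none of the three patterns does not match.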

Create a YARA Rule using PolySwarm UI

Log into your PolySwarm account and go to the Search page. There you can go to the Rulesets tab and create a new YARA rule.

 

Historical Hunting in PolySwarm UI

Let’s run that YARA rule as a one-time Historical Hunt. That will search through all existing artifacts in the PolySwarm network to find other artifacts that match our YARA rule.

 

Viewing Historical Hunting results in PolySwarm UI

When some results come in, we can view them on the Historical Hunting tab.

 

Live Hunting in PolySwarm UI

Now, let’s also run that same YARA rule as an active Live Hunt. That will examine all new artifacts as they are added to the PolySwarm network from now on. This is how we will track it going into the future.

 

Viewing Live Hunting results using PolySwarm UI

When you first start the Live Hunt, there will be no results. So, we have to wait until some point in the future when somebody submits a new artifact to the PolySwarm network that matches our ruleset.

 

When there is a match, we can view it on the Live Hunting tab:

 

When you click on any result in a Live or Historical Hunt, you can view all of the details of the artifact that are known by the PolySwarm network. This specific artifact did not have a lot of amplifying information, because it’s the raw data configuration file from the malware, but other files have extensive details.

 

The raw data configuration file contains the C&C IP, port, and other interesting things for the analyst to use in an endpoint IOC update. Those updates could even be automated with scripts and other software tools.

We can run two commands to get an IP address that we will want to add to an endpoint IOC. So, you can see how this could be scripted.

  $ polyswarm download Auto_Config_Extraction 3b08ce97c512c695c0258c2d0fce86648a28cceb1ce98e0456413e339c7908e8
Downloaded 3b08ce97c512c695c0258c2d0fce86648a28cceb1ce98e0456413e339c7908e8: Auto_Config_Extraction/3b08ce97c512c695c0258c2d0fce86648a28cceb1ce98e0456413e339c7908e8
$ od -An -t u1 -N 4 Auto_Config_Extraction/3b08ce97c512c695c0258c2d0fce86648a28cceb1ce98e0456413e339c7908e8
195 62 52 101

Here, 195.62.52.101 is the IP address of the malware’s C&C server.
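The same extraction can be done in Python if you would rather script it there. This sketch, with a hypothetical function name, reads the first four bytes the way the od call does and formats them as a dotted IPv4 address using the standard `ipaddress` module.

```python
import ipaddress


def c2_address(config_bytes):
    """Interpret the first four bytes as an IPv4 address, like the od call."""
    return str(ipaddress.IPv4Address(config_bytes[:4]))


print(c2_address(bytes([195, 62, 52, 101])))  # 195.62.52.101
```

For a downloaded configuration file, you would pass in its contents, for example `c2_address(open(path, "rb").read())`.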

So, using Live and Historical Hunting in the PolySwarm network, we can use the CLI to retrieve all the matching malware configuration files from new samples. And if we use the PolySwarm CLI in an automated system, we could automatically update endpoints for real-time protection. We did not show it in this demonstration, but the PolySwarm CLI has a companion PolySwarm API that can also be integrated with other software. Both are available on the PolySwarm GitHub.

In Conclusion

As we’ve demonstrated, PolySwarm’s threat hunting feature is an effective tool for discovering important information on known malware threats in the marketplace. Historical hunts can help identify the origins and subsequent detection of certain threats, while live hunts give real-time access to incoming PolySwarm submissions and can show the ongoing prevalence of newly discovered malware.

 

Threat hunting features are exclusively available to Enterprise (and above) customers.

 
