Tuesday, August 16, 2011

Inside an APT Covert Communications Channel

Note: I shortened the title of the post from "Inside an APT “Comment Crew” Covert Communications Channel" to "Inside an APT Covert Communications Channel". To be clear, multiple threat groups are using HTML comments as a means of COVCOM. Thus, this should be considered a general technique as opposed to attribution on a specific group. Both Shady RAT and "Comment Crew", as well as others with additional codenames, have been associated with the use of HTML comments as a means of COVCOM.

For many years, hackers operating out of China have been attacking a myriad of commercial and government systems here in the US and abroad. The term “APT” or Advanced Persistent Threat has often been used to describe these attackers. While HBGary is primarily a product company selling an enterprise incident response product, the team has been deep into APT analysis for over five years. Most of the analysis work is in direct support of Digital DNA – an automated system for detection of unknown malware and APT intrusions. I presented a technical description of how this attribution works, what it solves and what it doesn’t, at the BlackHat Conference last year. The work is about tracking threat groups – that is, tracking the humans and the human factors behind the digital artifacts we see. There are many hacking groups involved in these intrusions. One such group has often been called “Comment Crew” for their use of HTML comments as a means of command and control. This group has been associated with the recent “Shady RAT” intrusion revealed by McAfee. In this article I am going to give you an in-depth technical tour of how such a group operates.

For starters, the attackers will gain access to the network via spear-phishing. In almost all cases we have investigated, spear-phishing was the initial point of infection. These phishing emails are full of very specific project names, names of associates, official sounding documents, etc. It is very clear that the hacking group is using stolen email to learn about their targets before crafting a very convincing email. This underscores why the recent spate of SQLi attacks over the last few months poses a far greater threat than most people realize.


Exploit and Dropper


Once access is gained into the network, the hacking group places remote access tools into the environment. These are backdoor programs that are downloaded automatically by the exploit email – we call these “droppers”. In the diagram, point A shows the exploit email ‘detonating’ after being viewed by the victim, point B is a server where a ‘dropper’ is stored, and point C is the dropper backdoor being placed onto the compromised computer.

Once the dropper has established a beachhead into the network, a hacker will access the host and uninstall the original backdoor, replacing it with a new and more powerful backdoor. These backdoors, especially the secondary and more powerful one, are called “RAT”s – for Remote Access Tool. Many of these RATs are custom written and that can be the basis for a great deal of attribution, allowing us to detect the malware in physical memory.


Interaction with the Host


Remember that most networks are firewalled. This means the attacker can’t just make a TCP connection into the RAT program. The RAT program is within the internal network so it must first make an outbound connection to the attacker. The RAT is designed to connect outbound over port 80 or 443, a port that is allowed outbound by almost all firewall policies. Once the outbound connection is made, the attacker can use the established TCP session to interact with the host, download tools, run command line programs, and laterally move about the network. In the diagram, point A is where the RAT makes an outbound connection to a server on the Internet, point B is a server under the hacker’s control, and point C is where the hacker uses the established TCP connection to interact with the RAT program and subsequently the host environment, potentially exploiting additional machines nearby in the network.

One of the greatest challenges for an incident response team is discerning the difference between ‘normal’ malware and an APT attack. As we can see in this example, an APT attack involves a real human at the other end of the keyboard performing actions on the host. We call this ‘interaction with the host’ and we recommend that an IR team pull a timeline of last-access times from the MFT (master file table), browsing history from index.dat, event logs, and other sources to determine if such interaction is occurring. This is a fast and easy way to discern the difference between a non-targeted external threat (a category that covers over 80% of all adverse events) and an external targeted attack (a category that includes APT and probably accounts for less than 2% of all adverse events).
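To make the timeline idea a little more concrete, here is a minimal Python sketch that merges timestamped records from several sources into one chronological list. It assumes the MFT, browser history, and event log entries have already been parsed by other tools into simple records; the `Event` layout and the `build_timeline` name are hypothetical and are not part of any HBGary product.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """One already-parsed artifact record (hypothetical layout)."""
    when: datetime
    source: str   # for example "MFT", "index.dat", or "EventLog"
    detail: str


def build_timeline(*sources: Iterable[Event]) -> List[Event]:
    """Merge events from every source into one list sorted by time."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event.when)


def print_timeline(events: Iterable[Event]) -> None:
    """Print one line per event so clusters of activity are easy to spot."""
    for event in events:
        print(f"{event.when.isoformat()}  {event.source:<10} {event.detail}")
```

Once the records are interleaved like this, a burst of file accesses, browser activity, and logon events inside the same few minutes is much easier to see than it is when each source is reviewed separately.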

The RAT program doesn’t contain any fancy stealth or anti-forensics measures. In fact, we rarely even see packers in use (a packer is a method of obfuscating a program after compilation and is a low-cost way for a hacker to add anti-forensics to his malware). It seems that most of the covert methods are applied to the way the RAT communicates with the hacker. This makes sense. Consider that most of the intrusion detection capability lies at the perimeter of the network, and this is what the hacker is trying to defeat. Hence the HTML comment method of configuring and controlling the RAT programs.


Hidden Comments for Covert Communication (COVCOM)


Instead of letting the RAT connect directly to his personal server, the hacker will first exploit a webserver somewhere on the Internet. This exploited webserver will then be used as the ‘middleman’ to communicate with the RAT. The hacker will place a hidden comment on an otherwise normal webpage and have the RAT connect outbound to this page. Using the hidden comment, the hacker will be able to give commands to the RAT. The RAT will make periodic outbound connections, sometimes waiting days before checking the page. The hidden comment will contain an encoded message that the RAT knows how to decipher. In this case example, the hidden data is base64 encoded. In this diagram, point A is the RAT program making a periodic outbound connection, point B is a compromised webserver somewhere on the Internet, point C is the hidden comment on the webpage, and point D is where said comment is decoded into actual instructions for the RAT. An example of such a comment is shown in the next image. It is interesting to note that the hacker has attempted to make the page look like a 404 HTML error page if viewed in a normal web browser.


Example of BASE64 Encoded Hidden Comment
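For defenders who want to check a suspicious page for this kind of channel, the idea can be sketched in a few lines of Python. This is only an illustration using the standard library: it pulls every HTML comment out of a page and keeps the ones that decode cleanly as base64. The function and class names are my own, and the sketch assumes plain base64 with no additional encryption or custom alphabet, which a given RAT may not follow.

```python
import base64
import binascii
from html.parser import HTMLParser
from typing import List


class CommentCollector(HTMLParser):
    """Collect the text of every HTML comment in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.comments: List[str] = []

    def handle_comment(self, data: str) -> None:
        self.comments.append(data.strip())


def decode_hidden_comments(html: str) -> List[bytes]:
    """Return the decoded payload of each comment that is valid base64."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    decoded = []
    for text in collector.comments:
        if not text:
            continue  # skip empty comments
        try:
            decoded.append(base64.b64decode(text, validate=True))
        except (binascii.Error, ValueError):
            continue  # ordinary comment, not base64
    return decoded
```

A short ordinary comment that happens to use only base64 characters will also decode, so treat the output as leads to review rather than as proof of a covert channel.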


Once the RAT decodes the message, the data becomes a configuration file for the malware. The file has many features, such as the ability to specify which server addresses to use on the Internet (including backup servers), to configure the check-in times, and even to completely update the RAT binary in the field (shown in the diagram as a .bmp file – this is actually an ordinary executable with a normal PE header).


The Decoded Configuration File


All of the above technical information can be detected on a host after intrusion. The RAT program itself is near trivial to detect once you know what you are looking for. But beyond that, because the RAT program has certain outbound connection characteristics, sleep timers, and built-in “host interaction” capabilities, HBGary’s Digital DNA lights it up like a Christmas Tree (example shown in image).


Digital DNA Detects Unknown Malware


Even if you had no prior knowledge about this specific RAT, you would have detected it with HBGary. Beyond that, the decoded configuration file can also be found in physical memory – the primary search method used by Active Defense. Regardless of the configuration values, the option headers shown in the example above have a specific pattern that can be detected quite easily, even if fragmented over multiple buffers. This is exactly the kind of information I am referring to when I talk about “actionable threat intelligence”. Once you know about the attacker’s TTPs (tactics, techniques, and procedures) you can encode this into an enterprise-wide scan. We call it ‘continuous protection’ when you adopt continual scanning while also updating the threat intelligence as you learn more about the attacker. In essence, you are applying attrition against the attacker’s presence in your network. For example, if you know how to detect the above configuration file, then the attacker has to change the way that configuration file looks to defeat you – something that also requires them to recode their parser in the malware. Hence, you cost the attacker time and money. That is a Good Thing.
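As a rough sketch of what a pattern scan over a memory image can look like, the Python below reads a file in chunks and reports every offset where a marker string occurs, carrying a small overlap so that a marker split across a chunk boundary is still found. The marker values are hypothetical placeholders, since the real option headers are not reproduced here. This is not how Active Defense works internally, and it does not reassemble fragments that are non-contiguous in the image.

```python
from typing import BinaryIO, Iterable, Iterator, Tuple

# Hypothetical marker strings standing in for the real option headers.
MARKERS = (b"[OptionA]", b"[OptionB]")


def scan_stream(stream: BinaryIO, markers: Iterable[bytes],
                chunk_size: int = 1 << 20) -> Iterator[Tuple[int, bytes]]:
    """Yield (absolute_offset, marker) for each occurrence in the stream."""
    markers = [m for m in markers if m]
    if not markers:
        return
    overlap = max(len(m) for m in markers) - 1
    tail = b""   # bytes carried over from the previous window
    base = 0     # absolute offset of the first byte of tail + chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        for marker in markers:
            pos = window.find(marker)
            while pos >= 0:
                # A hit lying wholly inside the tail was already reported.
                if pos + len(marker) > len(tail):
                    yield base + pos, marker
                pos = window.find(marker, pos + 1)
        tail = window[-overlap:] if overlap else b""
        base += len(window) - len(tail)
```

Hits are yielded per marker within each window, so sort the results if you need them in strict offset order.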

I hope this gave you a somewhat concrete tour of how a real APT covert communication (COVCOM) channel works. Also, I hope it has illustrated some of the threat intelligence that you can access on the host. Using enterprise-wide scans, your IR or security team can put a severe dent in the APT presence in your network. As far as product solutions to enable you, obviously we build HBGary’s Active Defense. If you are interested in continuous protection and threat intelligence, we offer 50-node evaluations of Active Defense that can be installed on a laptop. We also offer a deploy-on-demand license for incident response teams (our 500-node pack has been quite popular), as well as the perpetual node model for full enterprise proactive deployments.

-Greg

Monday, August 15, 2011

Shady RAT is Serious Business

Ira Winkler makes some interesting points in his CIO article on Shady RAT. I tend to agree with his observation that security vendors spend too much energy infighting when we all should be facing a common enemy. It is true that Shady RAT is just one of many similar attacks. There is no harm in trying to draw attention to the elephant in the room - APT is a grave and serious threat to U.S. companies as well as national security. Shady RAT may appear to be 'sloppy' but it can still be APT. Within infosec the term APT has been debated - but we at HBGary have a very simple definition: if there is interaction with the host, we call it APT. Now, most of the attacks we deal with are targeting intellectual property and appear to have state sponsored underpinnings. The attackers usually leave tools behind, additional backdoors, etc., but none of these are very complex. The malware and techniques are mostly unsophisticated and sloppy, yet they succeed and remain persistent. Our assumption on this - APT does the minimum necessary to get the job done. If they don't need hard core boot sector viruses and kernel rootkits, they aren't going to use them. We as an industry have a responsibility to protect our customers from a very serious and evolving threat. Downplaying the seriousness of this threat undermines the reason we are here.

-Greg

Tuesday, August 9, 2011

Command Line Programming with Responder PRO

One little known feature of HBGary’s Responder product is that it ships with the full source code to a command-line version. This command-line version of the product can be customized for automated tools, batch processing, and statistical utilities. HBGary is still working to produce 'official' documentation on the SDK, but in the meantime I figured I would walk the more adventurous of you through some code.

First you need Microsoft Visual Studio. I use VS2008 Pro Edition with version 3.5 SP1 of .NET. In the SDK subdirectory of your Responder installation, you should find the ITHC directory. Just a backstory, but ITHC means Inspector Test Harness Client – it was originally a test harness used by our QA team that eventually proved so useful for batch processing that we included it for customers. The code is written in C#.

When I first opened the .sln file on my Responder install, I found that the project file needed some tweaking. Your mileage may vary, but here are some steps I had to take. First, the references to all the Responder DLLs were broken. By editing the .csproj file I was able to fix this. The trick is to use a HintPath element with a relative path to the main install directory, which is two folders above the ITHC directory (see image). I’m not sure why it shipped this way, but I was able to fix it.


Fixing the references


Now, in most cases, I like programming in Debug mode so I can single step, use breakpoints, inspect variables, etc. I ran into a snag with my debug build and had to get one of the HBGary engineers to take a look. Again, it was a configuration thing. When you make build settings, the platform will probably be set to AnyCPU. You will need to set the platform target to x86 (see image). This has something to do with mixed mode code and if you don’t set this to x86 you will get a binding error when you attempt to run the ITHC exe. Lastly, I set my output path so the ITHC.exe ended up in the main Responder install directory (see image).


Setting the platform target



Setting the output path


Running the tool requires some precise command line arguments (see image). The project path needs to take the form shown, path/projectname/projectname.proj, and the path to the memory image needs to be fully qualified. If you want to change any of that, you can edit the code in NewProject() and OpenProject() to parse the path differently. At this point I had a fully functional ITHC.exe that would analyze Windows physical memory snapshots.


Command line parameters to the tool
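If you plan to drive ITHC.exe from a script, the path rules above are easy to get wrong, so here is a small Python sketch that prepares them for a whole directory of snapshots. It only computes the two paths for each image: the project file in the path/projectname/projectname.proj layout and the fully qualified image path. The `*.bin` pattern and the `plan_batch` name are assumptions of mine, and the actual ITHC.exe argument syntax is deliberately left out because it is only shown in the image.

```python
from pathlib import Path
from typing import Iterator, Tuple


def plan_batch(snapshot_dir: str, projects_root: str,
               pattern: str = "*.bin") -> Iterator[Tuple[Path, Path]]:
    """Yield (project_file, image_file) pairs, one per snapshot found."""
    root = Path(projects_root).resolve()
    for image in sorted(Path(snapshot_dir).resolve().glob(pattern)):
        name = image.stem
        # Project path follows the path/projectname/projectname.proj layout.
        project_file = root / name / f"{name}.proj"
        # The image path is absolute because the directory was resolved first.
        yield project_file, image
```

Each pair can then be handed to whatever launcher you write around ITHC.exe, using the argument order from your own build.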


Most of the analysis magic happens in THCAnalyzeFile(). The project file ends with the .proj extension and this will be created or opened if it already exists. There is also a .tmp file that contains cached lookup data for Responder which only exists after an analysis. THCAnalyzeFile() will handle all of this.

At this point I need to explain packages and classes. In Responder, a package is any binary object. For example, the physical memory snapshot is a package. Every extracted livebin is also a package. If you import a file for static analysis, that file is considered a package.

Both packages and classes can have parent/child relationships. The difference is that a class is simply a container without any associated binary data. Think of it as just a folder. In fact, in the Responder GUI, classes are shown as folder icons. Just remember that packages can have child classes, classes can contain other classes, classes can contain packages – there is no restriction on the way you nest these objects.

Around line 249 in the ITHC example you will see the creation of the root package (see image). Every project has a single root package that everything else will reside under. Usually this package has no associated binary object and is simply a placeholder. We usually set this to the name of the forensic case – such as “Case 04321”. In Responder’s GUI, the root package is always shown with a safe icon. Depending on the project type, a class will be created directly under this root package. The name of this class is very important and affects the kinds of things Responder will let you do. So, for a physical memory analysis you need to name this first class "Physical Memory Snapshot". You will see this created around line 266.


root package, bulk update, named attributes
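If the package and class nesting is hard to picture, the following Python sketch models it as a plain tree. This is a toy model for illustration only and not the Responder object model: the `Node` type, its fields, and the snapshot file name are hypothetical. It builds the layout described above, with a root package named after the case and a "Physical Memory Snapshot" class directly under it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    """A package (may carry binary data) or a class (a plain folder)."""
    name: str
    kind: str                           # "package" or "class"
    binary_path: Optional[str] = None   # only meaningful for packages
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child of either kind; nesting is unrestricted."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Node"]]:
        """Yield (depth, node) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def example_case() -> Node:
    """Build a root package with the first class and a child package."""
    root = Node("Case 04321", "package")
    folder = root.add(Node("Physical Memory Snapshot", "class"))
    folder.add(Node("memdump", "package", binary_path="memdump.bin"))
    return root
```

Printing `"  " * depth + node.name` for each pair from `example_case().walk()` gives an indented view similar to the folder tree in the GUI.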


Now just a word on event management. Responder has a robust event alerting system that will post an event to your code whenever an object is modified. You could subscribe to these events and be notified if the user changed a property of an object anywhere in the GUI, for example. But, there is a flipside – if you make a large number of changes all at once you will flood the system with these messages. Most of the time if you are going to change a bunch of objects all at once, you want to disable events for a short time. To do this, you use the BeginBulkUpdate() and EndBulkUpdate() methods. You will see these in use around line 249 (see image).
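The general pattern behind bulk updates can be shown with a toy example. The Python below is not Responder code and does not use its API; it is a minimal sketch of an object store that notifies subscribers on every change unless a bulk update is in progress. All names here are hypothetical. In this toy version, changes made during a bulk update are simply not announced, and whether Responder posts a summary event afterwards is not something I am asserting.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List, Optional


class ObjectStore:
    """Toy store that notifies subscribers whenever a property changes."""

    def __init__(self) -> None:
        self._props: Dict[str, str] = {}
        self._subscribers: List[Callable[[str, str], None]] = []
        self._bulk_depth = 0

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        self._subscribers.append(callback)

    def get(self, key: str) -> Optional[str]:
        return self._props.get(key)

    def set(self, key: str, value: str) -> None:
        self._props[key] = value
        if self._bulk_depth == 0:      # events are muted during bulk updates
            for callback in self._subscribers:
                callback(key, value)

    def begin_bulk_update(self) -> None:
        self._bulk_depth += 1

    def end_bulk_update(self) -> None:
        if self._bulk_depth == 0:
            raise RuntimeError("end_bulk_update without begin_bulk_update")
        self._bulk_depth -= 1

    @contextmanager
    def bulk_update(self) -> Iterator[None]:
        """Guarantee the matching end call even if an exception is raised."""
        self.begin_bulk_update()
        try:
            yield
        finally:
            self.end_bulk_update()
```

The point to take away is the pairing: every begin needs a matching end, or events stay disabled for the rest of the session.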

Around this same section of code you will also see named attributes being set on the case. These attributes are being applied to the root package, the one that shows up as a safe icon when you view it in Responder’s GUI. Any object, including packages and classes, can have named attributes set. The attribute system is typed and the first letter of the name indicates the type. See my previous post on plugin development for a description of these.

Around line 293 you will see the creation of a second package. This package is the one associated with the physical memory snapshot. It is placed under the root node and folder. You will also see the creation of something called a snapshot that is then linked with the package. This is how you link a binary to the package – via the snapshot object. The snapshot is just a small header of metadata that is associated with the binary file – including the path to the file – and this is set as the “.InitialSnapshot” property of the package. After this step, the package and the binary are linked.


package and snapshot for the physical memory image


The most important function is then called – the AnalyzeMemory function (around line 329). This function performs the bulk of the memory analysis. It returns true or false depending on whether it understood the memory snapshot. Just a note: it will also return false if you don’t have a valid license. If you have the free version of Responder CE, you still have a license file that must be present or this call will bail out on you.

After analysis is complete, the analysis history is updated to include “WPMA”. This tells Responder that “WPMA” analysis has already completed, so it won’t attempt a second analysis later. Note: WPMA means Windows Physical Memory Analysis. Responder has other analysis types that can be added to this history. You can also add your own for reference later.

Now that analysis is complete you can parse the datastore and query all the found Windows objects, processes, modules, etc. You can also query the DDNA results if you are using the Pro version. Some object types, such as control flow, disassembly, dataflow, graph objects, and recon traces, are only available in the Pro version. However, the results of the Windows memory analysis are fully available in all versions, including the free CE version. See the THCDumpProject() function for more information on parsing the project’s object tree.


Package: ws2_32.dll
Parent Package: svchost.exe
Length: 0 bytes.
Class: Symbols
Class: Strings
Class: Report Items
Class: Global
Strings:
Package: vmwaretray.exe
Parent Package: VMwareTray.exe
Length: 0 bytes.
Class: Strings
Class: Global
Class: Report Items
Class: Symbols
Strings:
Package: msctf.dll
Parent Package: IEXPLORE.EXE
Length: 0 bytes.
Class: Strings
Class: Symbols
Class: Global
Class: Report Items


a short snippet of output from the THCDumpProject() function
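Text output like this is easy to post-process. As a sketch, the Python below groups the lines into one record per package. It assumes only the fields visible in the snippet above (Package, Parent Package, Length, Class); the real output may contain other lines, which this parser ignores. The `parse_dump` name and record layout are my own.

```python
import re
from typing import Dict, List


def parse_dump(text: str) -> List[Dict[str, object]]:
    """Group THCDumpProject-style lines into one dict per package."""
    records: List[Dict[str, object]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Package:"):
            current = {
                "package": line.split(":", 1)[1].strip(),
                "parent": None,
                "length": None,
                "classes": [],
            }
            records.append(current)
        elif current is None:
            continue  # ignore anything before the first package
        elif line.startswith("Parent Package:"):
            current["parent"] = line.split(":", 1)[1].strip()
        elif line.startswith("Length:"):
            match = re.search(r"\d+", line)
            current["length"] = int(match.group()) if match else None
        elif line.startswith("Class:"):
            current["classes"].append(line.split(":", 1)[1].strip())
    return records
```

From here it is a short step to loading the records into a database or counting which modules appear under which processes across many snapshots.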


For those of you using the Pro version, ITHC includes examples of not just physical memory analysis, but also extraction of livebins and code-level analysis of extracted livebins. If you made it this far, then take a look at AnalyzePackage(), AnalyzeExtractedPackage(), and ExtractPEImageFromMemory() to get more familiar with the code-level analysis features. I hope that I can write some more specific posts about these features in the near future.


ITHC.exe analyzing a memory snapshot


Because the ITHC utility is written in C# it’s very easy to interface to other systems. Microsoft has done a good job building a robust set of APIs that can be used for SQL database access, serializing files, communicating over the web or TCP/IP, regular expressions, etc. All of this is at your fingertips and can be interfaced with the results of physical memory assessments. I am partial to building bulk analysis tools for large directories of memory snapshots. You are only limited by your imagination.

The SDK directory should be in your Responder install directory. If you are using the free Community Edition you may not have the SDK directory. In this case you can download the SDK as a small but separate download from the free tools section on HBGary's support site. Visit www.hbgary.com for more information.