Monday, June 30, 2008

Whitelisting is the next snake oil

Most people experienced in computer security know that ‘signatures’ are the dominant technology used to combat malware. Signatures, short descriptions of otherwise large binaries, are extremely effective at detecting specific, known programs and documents. They are perfect for scanning the enterprise for known malware, known insecure software, known intellectual property. They are the cash cow of the anti-virus companies.

There are two approaches to signatures – blacklisting and whitelisting. The idea is simple – a list of signatures of known bad stuff is a blacklist, and a list of signatures of known good stuff is a whitelist. Blacklisting has been the preferred method for AV over the last decade. Blacklisting has the benefit of near-zero false positives – something customers expect. Blacklisting also keeps the customers coming back – new malware means new signatures – perfect for the recurring revenue models on a vendor’s balance sheet.
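As a rough illustration of the two list types, here is a minimal Python sketch. It assumes a signature is simply the MD5 of the whole file, and the sample byte strings and the `classify` helper are made up for the example; real AV signatures are more elaborate than this.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest used here as a stand-in 'signature'."""
    return hashlib.md5(data).hexdigest()


def classify(data: bytes, blacklist: set[str], whitelist: set[str]) -> str:
    """Look the signature up in both lists; anything else is unknown."""
    digest = md5_of(data)
    if digest in blacklist:
        return "known-bad"
    if digest in whitelist:
        return "known-good"
    return "unknown"


# Hypothetical samples, standing in for real binaries.
known_malware = b"hypothetical malware bytes"
known_app = b"hypothetical trusted application bytes"

blacklist = {md5_of(known_malware)}
whitelist = {md5_of(known_app)}

# A variant that differs by a single byte has a different signature.
new_variant = known_malware + b"\x00"

print(classify(known_malware, blacklist, whitelist))
print(classify(known_app, blacklist, whitelist))
print(classify(new_variant, blacklist, whitelist))
```

The one-byte variant falls into neither list. A blacklist lets it through, while a whitelist flags it along with everything else that is merely not yet known.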

Blacklisting sounds ideal, but it doesn’t work. New malware emerges daily that has no corresponding blacklist signature. The malware must first be detected, and then processed. There is always a time window where Enterprises have no defense. Recent figures suggest that the AV vendors are falling so far behind this curve that they will never catch up with the deluge of new malware arriving daily. It can take weeks for a signature to become available.

This deluge of new malware is due to several factors. First, there is more money behind malware development than ever before. Second, we weren’t really that good at capturing malware in the past. Today, new malware can be automatically collected, without human intervention. The slow trickle of malware turned into a flood as honeypot technology emerged. Sensor grids can obtain new malware samples efficiently - they automatically ‘drive by’ (aka spider) malicious websites to get infected, and leave open ports on the ‘Net so automated scanners will exploit them. In parallel to the automated collection efforts, cybercrime has risen to epic levels. Finally, the barrier to entry has dropped for the cyber criminal. Cyber weapon toolkits have become commonly available. Anti-detection technology is standard fare. New variants of a malware program can be auto-generated. A safe bet is to expect thousands of new malware samples to hit the Internet per day.

The flaw in blacklisting has been exposed – it cannot address new and unknown malware threats. Figures vary, but a safe claim is that 80% of all new malware goes undetected. This isn’t just a minor flaw; it’s a gross misstep in technology. Blacklisting is, and always has been, snake oil.

Enter the whitelist. The whitelist seems like a natural response to the “new and unknown malware” problem. Anything that is not known to be good will be considered suspicious, or possibly bad. Sound familiar? Whitelisting is not new, of course. Programs like “Tripwire” were on the market in the ’90s – and were proven not to work. I founded rootkit.com originally to disprove the entire concept of OS-based whitelisting.

I agree with the idea that “suspicious” is good enough to warrant a look. This is smart thinking. But whitelisting is not the solution.

There is a lot more “not-known-good” in the Enterprise than actual malware. Obviously the Enterprise cannot afford the additional workload caused by “false positives”. So, racing to catch up are the whitelist vendors – to remove all the “noise” so the staff can focus on the signal. Millions of dollars are already being invested into whitelisting files – and there are solid technical reasons this doesn’t work.

Whitelists are based upon files on disk. A whitelist, in current industry terms, means a list of the MD5 sums for files ON DISK. Please understand that files on disk are not the same as files in memory. And all that matters is memory. When a file is LOADED into memory, it CHANGES. This means on-disk MD5 sums do not map to memory. There are several reasons memory is different:

1) Memory contains much more data than the on disk file
2) Memory contains thread stacks
3) Memory contains allocated heaps
4) Memory contains data downloaded from the Internet
5) Memory contains secondary or tertiary files that were opened and read
6) Memory contains data that is calculated at runtime
7) Memory contains data that is entered by a user

None of the above is represented by the file on disk. So, none of the above is represented by the whitelist MD5 sum. Yet, when the file hash on disk passes as whitelisted, the running in-memory file is considered whitelisted by proxy. This is where the whole model breaks down. In memory, there are millions of bytes of information that are calculated at runtime – they are different every time the program is executed, the DLL is loaded, or the EXE is launched. These bytes are part of the process, but unlike the file on disk they change every time the program is executed. Therefore, they cannot be whitelisted or checksummed. This data can change every minute, every second of the program’s lifetime. None of this dynamic data can be hashed with MD5. None of this dynamic data is represented by the bytes on disk. So, none of it can be whitelisted.
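The gap between the on-disk hash and the running process can be sketched in a few lines of Python. This is a toy model under loud assumptions: `simulated_process_memory` is a hypothetical stand-in that just appends random bytes (playing the role of heaps and thread stacks) to the file image. A real loader also relocates and patches the image itself.

```python
import hashlib
import os


def md5_of(data: bytes) -> str:
    """Hex MD5 digest, the kind of checksum a file whitelist stores."""
    return hashlib.md5(data).hexdigest()


# Hypothetical executable image as it sits on disk.
file_on_disk = b"hypothetical executable image"
whitelist = {md5_of(file_on_disk)}


def simulated_process_memory(image: bytes) -> bytes:
    """Toy model of a running process: the image plus runtime state.

    The random bytes stand in for heap and stack contents, which
    differ on every run.
    """
    heap = os.urandom(32)
    stack = os.urandom(32)
    return image + heap + stack


run_a = simulated_process_memory(file_on_disk)
run_b = simulated_process_memory(file_on_disk)

disk_ok = md5_of(file_on_disk) in whitelist
mem_a_ok = md5_of(run_a) in whitelist
mem_b_ok = md5_of(run_b) in whitelist

print("disk image whitelisted:", disk_ok)
print("run A memory whitelisted:", mem_a_ok)
print("run B memory whitelisted:", mem_b_ok)
```

The file on disk matches the list, but neither simulated run does, and the two runs do not match each other either. There is no stable value to put on a list.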

While an executable file on disk can be whitelisted, well over 75% of that program cannot be whitelisted once it’s actually running in memory. This missing 75% can easily contain malicious code or data. It can contain injected code. It can contain booby-traps in the form of malicious data. It can represent an injected thread. The assumption that an on-disk whitelist match means that this dynamic data is ‘trusted by proxy’ is absurd. Yet, this is what the whitelisters want us to believe.

For malware authors, the whitelist is a boon. It means that a malware author only needs to inject subversive code into another process that is whitelisted. Since the whitelist doesn’t and cannot account for dynamic runtime data, the malware author knows his injected code is invisible to the whitelist. And, since the process is whitelisted on disk, he can be assured his malware code will also be whitelisted by proxy. So, in effect, whitelisting is actually WORSE than blacklisting. In the extreme, the malware may actually inject into the desktop firewall or resident virus scanner directly as a means of obtaining this blanket of trust.

The mindset that “suspicious is good enough to warrant a look” is a step in the right direction. But, whitelisting is not the correct approach. The only way to combat modern malware is to analyze the physical running memory of a system. In memory will be found the indicators of suspicion, and this is much more like a blacklist than a whitelist – except that it’s generic and based on the traits and behaviors of software, not hard signatures. For example, there are only so many ways you can build a keylogger. Once you can detect these traits in runtime memory, you are going to detect the keylogger regardless of who wrote it, what it was compiled with, what attack toolkit was used, or what it was packed with. As a security industry we need to stop climbing uphill with traditional approaches proven not to work. We need to change the fundamental way we do things if we are going to win.
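To make the idea of generic traits a little more concrete, here is a deliberately simple Python sketch. It is not how any particular product works. It assumes a trait can be approximated by the presence of certain Windows API names in a memory snapshot, and the trait table, threshold, and sample byte strings are all invented for illustration.

```python
# Illustrative trait table: trait description -> marker strings.
# The API names are real Windows functions; treating their presence
# as a "trait" is a simplifying assumption for this sketch.
TRAITS = {
    "installs a keyboard hook": [b"SetWindowsHookEx"],
    "polls key state": [b"GetAsyncKeyState", b"GetKeyState"],
    "writes captured data to a file": [b"WriteFile"],
}


def traits_found(memory: bytes) -> list[str]:
    """Return the names of all traits whose markers appear in memory."""
    return [
        name
        for name, markers in TRAITS.items()
        if any(marker in memory for marker in markers)
    ]


def is_suspicious(memory: bytes, threshold: int = 2) -> bool:
    """Flag a snapshot when enough independent traits are present."""
    return len(traits_found(memory)) >= threshold


# Two hypothetical keyloggers from different authors: the surrounding
# bytes differ, so their hashes differ, but the traits are the same.
sample_a = b"\x90\x90author-one\x00SetWindowsHookExA\x00WriteFile\x00"
sample_b = b"packed-by-tool-two\x00GetAsyncKeyState\x00\x00WriteFile"
benign = b"plain text editor\x00WriteFile\x00ReadFile\x00"

print(traits_found(sample_a))
print(traits_found(sample_b))
print(traits_found(benign))
```

Both hypothetical keyloggers are flagged even though they share no file hash, while the benign sample shows only one trait and stays under the threshold.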

Tuesday, June 24, 2008

Flypaper 1.0 Released

I'm happy to announce the release of a free tool from HBGary. It's something I put together to save me time when doing malware analysis for customers.

Most malware is designed for a two- or three-stage deployment. First, a dropper program will launch a second program, and then delete itself. The second program may take additional steps, such as injecting DLLs into other processes, loading a rootkit, etc. These steps are taken quickly, and it can be difficult for an analyst to capture all of the binaries used in the deployment. HBGary Flypaper solves this problem for the analyst.

HBGary Flypaper loads as a device driver and blocks all attempts to exit a process, end a thread, or delete memory. All components used by the malware will remain resident in the process list, and will remain present in physical memory. The entire execution chain is reported so you can follow each step. Then, once you dump physical memory for analysis, you have all the components 'frozen' in memory - nothing gets unloaded. All of the evidence is there for you.

HBGary Flypaper is designed to be used with a virtual machine. Once activated, Flypaper will also block network traffic to and from the machine. If you are using HBGary Responder with the virtual machine, only the traffic to and from Responder is allowed, effectively quarantining the malware for analysis. (Note, this blocking operation would not block NDIS level rootkit material, only malware that uses the existing TCP/IP stack.)

You can get it from the HBGary website. (www.hbgary.com)

Friday, June 20, 2008

Microsoft wipes out 700,000 - too late to the game

A very interesting post came out on the MMPC blog today – Microsoft added some sigs to capture Taterf and Frethog malware variants and captured waaaay more than they expected (http://blogs.technet.com/mmpc/). On the first day alone they detected 700,000 Taterf variants, and millions in the first week. What is interesting is the sheer volume of malware designed to steal online gaming credentials. This is equivalent to the threat faced by financial institutions every day in the form of keyloggers that steal financial credentials. Except, in this case, the money is stored in game servers. But, like all money – money is just a digit in a computer somewhere. This is no different. The target smells the same if you step back. Just like stolen banking accounts, these accounts are stored in a bad-guy SQL server somewhere and sold for cash based on whatever inventory the character happens to have. The Asia-Pac region is already full of companies that farm gold (aka ‘real cash economy’) – they already have existing relationships with real purchasers in the real-cash economy with set quotas. So, it’s not a stretch to imagine they can clear out and launder 50 million WoW gold in 90 days. At the scale of the malware infection described in Microsoft’s blog, this was a huge operation (with the sheer volume of Flash and QuickTime exploits over Q1 this doesn’t surprise me either). And, by the time these infections were cleaned by Microsoft, it was too late. The game was already over.

Monday, June 16, 2008

Welcome to Greg Hoglund's new Blog

Welcome to my new blog, Fast Horizon. I have retired my old blog on rootkit.com and opened up shop here at Blogger. I am the CEO of HBGary, Inc. (http://www.hbgary.com/) – a new company in the computer security industry. We released our first product this year (Responder, www.hbgary.com/responder_pro.html). HBGary is actually about five years old, but until now we have been a services company working primarily for the U.S. Dept. of Defense and Intelligence Community. I am excited to be part of the shift toward product development. This is my third startup. I am the author of three books and have been educating people about security threats – especially rootkits – for almost 10 years. I have great foresight for trends – thinking of ideas about 5 years too soon for the market - and an almost cynical edge to my observations. Most people know me as a hacker, but in truth I probably know more about business and product development than hacking at this point. All of my startups have been in software development. I have probably experienced every management nightmare that can be listed, and dealt with it. I like to take big bites - so HBGary is tackling the biggest threat in computer security today – malware. Unlike most companies, however, we aren’t selling snake oil. Instead, our philosophy is that it’s IMPOSSIBLE to keep the bad guys out. The billions of dollars spent on security since the millennium have been a complete waste. Instead, we assume the bad guys will succeed – and it’s our job to catch them once they get in.

I could describe our solution as a platform for analyzing physical memory. You see, if there truly is a cyberspace in the Enterprise, it’s represented by the ones and zeroes in physical RAM.

There are only three kinds of data in the enterprise:
- Data at rest, on hard drives
- Data in motion, over the network
- Data in execution, in RAM

For any data to be used, it has to exist in RAM. Everything that matters must exist in RAM. By being in RAM, you are the center of the universe. Yet for all its power, until now nobody has had a platform to analyze RAM. There are host-based IDS products, and AV, but all of these depend on the OS to query things about the OS – the age-old rootkit problem. Once the system is subverted, it’s game over. Our solution sidesteps the OS and analyzes the physical RAM snapshot offline, thus avoiding any malware trickery.

There is a high barrier to entry to this work. We open the RAM, look inside, and extract objects. We reverse engineered every version and service pack of Microsoft Windows to be able to do this. We can find every process, every driver, and every line of assembly code of every software component. And, we do it without using the operating system – we do it without executing the environment we analyze.
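For a sense of what extracting objects from a raw snapshot can look like at its very simplest, here is a Python sketch that carves candidate executable images out of a byte dump without asking any operating system. It relies only on two documented facts about the PE format: an image starts with the bytes "MZ", and the 4-byte field at offset 0x3C points to a "PE\0\0" signature. The `find_pe_images` and `fake_pe_header` helpers and the synthetic dump are assumptions made for the example. This is far short of reconstructing processes and drivers from kernel structures.

```python
import struct

E_LFANEW_OFFSET = 0x3C  # DOS header field holding the PE header offset
DOS_HEADER_SIZE = 0x40


def find_pe_images(dump: bytes) -> list[int]:
    """Return offsets in the dump that look like the start of a PE image."""
    offsets = []
    pos = dump.find(b"MZ")
    while pos != -1:
        if pos + DOS_HEADER_SIZE <= len(dump):
            (e_lfanew,) = struct.unpack_from("<I", dump, pos + E_LFANEW_OFFSET)
            sig_at = pos + e_lfanew
            # An out-of-range slice is simply empty, so this is safe.
            if dump[sig_at:sig_at + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = dump.find(b"MZ", pos + 1)
    return offsets


def fake_pe_header() -> bytes:
    """Build a minimal synthetic header: 'MZ', e_lfanew, 'PE\\0\\0'."""
    header = bytearray(0x80)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, E_LFANEW_OFFSET, 0x40)
    header[0x40:0x44] = b"PE\x00\x00"
    return bytes(header)


# Synthetic "memory dump": padding, a header, a decoy 'MZ' string
# that is not a real header, more padding, then a second header.
prefix = b"\x00" * 100
decoy = b"MZ but not a real header" + b"\x00" * 64
dump = prefix + fake_pe_header() + decoy + fake_pe_header()

first = len(prefix)
second = len(prefix) + 0x80 + len(decoy)

found = find_pe_images(dump)
print(found)
```

The decoy string is skipped because the field at offset 0x3C does not lead to a PE signature. Only the two synthetic headers are reported.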

In my grand vision we will build a picture of the true enterprise cyberspace. We have radical new technologies, like Digital DNA, that can be used to identify fragments of documents, strains of malware, intellectual property, fingerprints of email attachments, etc. Although we are tackling malware, our platform is generic and could be used for many other markets (IP asset tracking, E-Discovery, etc). As a company, we couldn’t ask to be in a better place in a market. We are set to explode.