Cover Page

Table of Contents

Cover

Title Page

Copyright

Dedication

About the Authors

Acknowledgments

Introduction

Who Should Read This Book

How This Book Is Organized

Setting Up Your Environment

Conventions

On The Book’s DVD

Chapter 1: Anonymizing Your Activities

The Onion Router (Tor)

Malware Research with Tor

Tor Pitfalls

Proxy Servers and Protocols

Web-Based Anonymizers

Alternate Ways to Stay Anonymous

Cellular Internet Connections

Virtual Private Networks

Being Unique and Not Getting Busted

Chapter 2: Honeypots

Nepenthes Honeypots

Working with Dionaea Honeypots

Chapter 3: Malware Classification

Classification with ClamAV

Classification with YARA

Putting It All Together

Chapter 4: Sandboxes and Multi-AV Scanners

Public Antivirus Scanners

Multi-Antivirus Scanner Comparison

Public Sandbox Analysis

Chapter 5: Researching Domains and IP Addresses

Researching Suspicious Domains

Researching IP Addresses

Researching with Passive DNS and Other Tools

Fast Flux Domains

Geo-Mapping IP Addresses

Chapter 6: Documents, Shellcode, and URLs

Analyzing JavaScript

Analyzing PDF Documents

Analyzing Malicious Office Documents

Analyzing Network Traffic

Chapter 7: Malware Labs

Networking

Physical Targets

Chapter 8: Automation

The Analysis Cycle

Automation with Python

Adding Analysis Modules

Miscellaneous Systems

Chapter 9: Dynamic Analysis

API Monitoring/Hooking

Data Preservation

Chapter 10: Malware Forensics

The Sleuth Kit (TSK)

Forensic/Incident Response Grab Bag

Registry Analysis

Chapter 11: Debugging Malware

Working with Debuggers

Immunity Debugger’s Python API

WinAppDbg Python Debugger

Chapter 12: De-obfuscation

Decoding Common Algorithms

Decryption

Unpacking Malware

Unpacking Resources

Debugger Scripting

Chapter 13: Working with DLLs

Chapter 14: Kernel Debugging

Remote Kernel Debugging

Local Kernel Debugging

Software Requirements

Chapter 15: Memory Forensics with Volatility

Memory Acquisition

Preparing a Volatility Install

Chapter 16: Memory Forensics: Code Injection and Extraction

Investigating DLLs

Code Injection and the VAD

Reconstructing Binaries

Chapter 17: Memory Forensics: Rootkits

Chapter 18: Memory Forensics: Network and Registry

Registry Analysis

Index

Wiley Publishing, Inc. End-User License Agreement

Title Page

To my family for helping me shape my life and to my wife Suzanne for always giving me something to look forward to.

—Michael Hale Ligh

To my new wife and love of my life Irene and my family. Without your support over the many years, I would not be where I am or who I am today.

—Steven Adair

About the Authors

Michael Hale Ligh is a Malicious Code Analyst at Verisign iDefense, where he specializes in developing tools to detect, decrypt, and investigate malware. In the past few years, he has taught malware analysis courses and trained hundreds of students in Rio de Janeiro, Shanghai, Kuala Lumpur, London, Washington, D.C., and New York City. Before iDefense, Michael worked as a vulnerability researcher, providing ethical hacking services to one of the nation’s largest healthcare providers. In that position, he gained a strong background in reverse-engineering and operating system internals. Before that, Michael defended networks and performed forensic investigations for financial institutions throughout New England. He is currently Chief of Special Projects at MNIN Security LLC.

Steven Adair is a security researcher with The Shadowserver Foundation and a Principal Architect at eTouch Federal Systems. At Shadowserver, Steven analyzes malware, tracks botnets, and investigates cyber-attacks of all kinds with an emphasis on those linked to cyber-espionage. Steven frequently presents on these topics at international conferences and co-authored the paper “Shadows in the Cloud: Investigating Cyber Espionage 2.0.” In his day job, he leads the Cyber Threat operations for a Federal Agency, proactively detecting, mitigating, and preventing cyber-intrusions. He has successfully implemented enterprise-wide anti-malware solutions across global networks by marrying best practices with new and innovative techniques. Steven is knee-deep in malware daily, whether he is supporting his company’s customers or spending his free time with Shadowserver.

Blake Hartstein is a Rapid Response Engineer at Verisign iDefense. He is responsible for analyzing and reporting on suspicious activity and malware. He is the author of the Jsunpack tool that aims to automatically analyze and detect web-based exploits, which he presented at Shmoocon 2009 and 2010. Blake has also authored and contributed Snort rules to the Emerging Threats project.

Matthew Richard is Malicious Code Operations Lead at Raytheon Corporation, where he is responsible for analyzing and reporting on malicious code. Matthew was previously Director of Rapid Response at iDefense. For 7 years before that, Matthew created and ran a managed security service used by 130 banks and credit unions. In addition, he has done independent forensic consulting for a number of national and global companies. Matthew currently holds the CISSP, GCIA, GCFA, and GREM certifications.

Acknowledgments

Michael would like to thank his current and past employers for providing an environment that encourages and stimulates creativity. He would like to thank his coworkers and everyone who has shared knowledge in the past. In particular, he thanks AAron Walters and Ryan Smith for never hesitating to engage and debate interesting new ideas and techniques. A special thanks goes out to the guys who took time out of their busy days to review our book: Lenny Zeltser, Tyler Hudak, and Ryan Olson.

Steven would like to extend his gratitude to those who spend countless hours behind the scenes investigating malware and fighting cyber-crime. He would also like to thank his fellow members of the Shadowserver Foundation for their hard work and dedication towards making the Internet a safer place for us all.

We would also like to thank the following:

—Michael, Steven, Blake, and Matthew

Introduction

Malware Analyst’s Cookbook is a collection of solutions and tutorials designed to enhance the skill set and analytical capabilities of anyone who works with, or against, malware. Whether you’re performing a forensic investigation, responding to an incident, or reverse-engineering malware for fun or as a profession, this book teaches you creative ways to accomplish your goals. The material for this book was designed with several objectives in mind. The first is that we wanted to convey our many years of experience in dealing with malicious code in a manner friendly enough for non-technical readers to understand, but complex enough so that technical readers won’t fall asleep. That being said, malware analysis requires a well-balanced combination of many different skills. We expect that our readers have at least a general familiarity with the following topics:

Our second objective is to teach you how various tools work, rather than just how to use the tools. If you understand what goes on when you click a button (or type a command) as opposed to just knowing which button to click, you’ll be better equipped to perform an analysis on the tool’s output instead of just collecting the output. We realize that not everyone can or wants to program, so we’ve included over 50 tools on the DVD that accompanies the book, and we discuss hundreds of others throughout the text. One thing we tried to avoid is providing links to every tool under the sun. We limit our discussions to tools that we’re familiar with and, as much as possible, tools that are freely available.

Lastly, this book is not a comprehensive guide to all tasks you should perform during examination of a malware sample or during a forensic investigation. We tried to include solutions to problems that are common enough to be most beneficial to you, but rare enough to not be covered in other books or websites. Furthermore, although malware can target many platforms such as Windows, Linux, Mac OS X, mobile devices, and hardware/firmware components, our book focuses primarily on analyzing Windows malware.

Who Should Read This Book

If you want to learn about malware, you should read this book. We expect our readers to be forensic investigators, incident responders, system administrators, security engineers, penetration testers, malware analysts (of course), vulnerability researchers, and anyone looking to be more involved in security. If you find yourself in any of the following situations, then you are within our target audience:

How This Book Is Organized

This book is organized as a set of recipes that solve specific problems, present new tools, or discuss how to detect and analyze malware in interesting ways. Some of the recipes are standalone, meaning the problem, discussion, and solution are presented in the same recipe. Other recipes flow together and describe a sequence of actions that you can use to solve a larger problem. The book covers a large array of topics and becomes continually more advanced and specialized as it goes on. Here is a preview of what you can find in each chapter:

Setting Up Your Environment

We performed most of the development and testing of Windows tools on 32-bit Windows XP and Windows 7 machines using Microsoft’s Visual Studio and Windows Driver Kit. If you need to recompile our tools for any reason (for example to fix a bug), or if you’re interested in building your own tools based on source code that we’ve provided, then you can download the development environments here:

As for the Python tools, we developed and tested them on Linux (mainly Ubuntu 9.04, 9.10, or 10.04) and Mac OS X 10.4 and 10.5. You’ll find that a majority of the Python tools are multi-platform and run wherever Python runs. If you need to install Python, you can get it from the website at http://python.org/download/. We recommend using Python version 2.6 or greater (but not 3.x), because it will be most compatible with the tools on the book’s DVD.
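If you are unsure which interpreter a given machine will run, a quick check like the following can confirm whether it falls in the recommended range. This is only a minimal sketch: the function name is_supported is our own illustrative choice and is not one of the tools on the DVD.

```python
import sys


def is_supported(version_info):
    """Return True for Python 2.6 or later within the 2.x series.

    `version_info` is any sequence shaped like sys.version_info,
    for example (2, 6, 5). The name `is_supported` is hypothetical.
    """
    major, minor = version_info[0], version_info[1]
    return major == 2 and minor >= 6


if __name__ == "__main__":
    # Report on the interpreter that is running this script.
    current = sys.version_info[:2]
    if is_supported(sys.version_info):
        print("Python %d.%d is in the recommended range" % current)
    else:
        print("Python %d.%d is outside the recommended range (2.6+, not 3.x)" % current)
```

The check takes the version as an argument rather than reading sys.version_info directly, so you can also use it to reason about a remote or virtual machine whose version you already know.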

Throughout the book, when we discuss how to install various tools on Linux, we assume you’re using Ubuntu. As long as you know your way around a Linux system, you’re comfortable compiling packages from source, and you know how to solve basic dependency issues, then you shouldn’t have a problem using any other Linux distribution. We chose Ubuntu because a majority of the tools (or libraries on which the tools depend) that we reference in the book are either preinstalled, available through the apt-get package manager, or the developers of the tools specifically say that their tools work on Ubuntu.

You have a few options for getting access to an Ubuntu machine:

We always try to provide a URL to the tools we mention in a recipe. However, we use some tools significantly more than others, so they appear in five to ten recipes. Rather than link to these tools every time, we list here the tools that you should have access to throughout all chapters:

You should note a few final things before you begin working with the material in the book. Many of the tools require administrative privileges to install and execute. Typically, mixing malicious code and administrative privileges isn’t a good idea, so you must be sure to properly secure your environment (see Chapter 7 for setting up a virtual machine if you do not already have one). You must also be aware of any laws that may prohibit you from collecting, analyzing, sharing, or reporting on malicious code. Just because we discuss a technique in the book does not mean it’s legal in the city or country in which you reside.

Conventions

To help you get the most from the text and keep track of what’s happening, we’ve used a number of conventions throughout the book.

Recipe X-X: Recipe Title

Boxes like this contain recipes, which solve specific problems, present new tools, or discuss how to detect and analyze malware in interesting ways. Recipes may contain helpful steps, supporting figures, and notes from the authors. They also may have supporting materials associated with them on the companion DVD. If they do have supporting DVD materials, you will see a DVD icon and descriptive text, as follows:

[DVD icon]

You can find supporting material for this recipe on the companion DVD.

For your further reading and research, recipes may also have endnotes¹ that cite Internet or other supporting sources. You will find endnote references at the end of the recipe. Endnotes are numbered sequentially throughout a chapter.

1 This is an endnote. This is the format for a website source

Note Tips, hints, tricks, and asides to the current discussion look like this.

As for other conventions in the text:

     This is an example of monofont type with a long \
            line of code that needed to be broken.

     This truncated line shows how [REMOVED]

     $ date ; typing into a Unix shell
     Wed Sep  1 14:30:20 EDT 2010

     C:\> date ; typing into a Windows shell
     Wed 09/01/2010

On The Book’s DVD

The book’s DVD contains evidence files, videos, source code, and programs that you can use to follow along with recipes or to conduct your own investigations and analysis. It also contains the full-size, original images and figures that you can view, since they appear in black and white in the book. The files are organized on the DVD in folders named according to the chapter and recipe number. Most of the tools on the DVD are written in C, Python, or Perl and carry a GPLv2 or GPLv3 license. You can use a majority of them as-is, but a few may require small modifications depending on your system’s configuration. Thus, even if you’re not a programmer, you should take a look at the top of the source file to see if there are any notes regarding dependencies, the platforms on which we tested the tools, and any variables that you may need to change according to your environment.
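One way to review those notes without opening each file in an editor is to print only the comment lines at the top of a script. The sketch below is our own illustration, not a tool from the DVD, and the name leading_comments is hypothetical. It assumes the notes are written as '#' comments, as is typical for Python and Perl; it does not handle C-style /* ... */ blocks or Python docstrings.

```python
def leading_comments(source_text):
    """Collect the '#' comment lines that appear before any code.

    Blank lines are skipped, and collection stops at the first line
    that is neither blank nor a '#' comment. The shebang line, if
    present, is included because it also starts with '#'.
    """
    notes = []
    for line in source_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines in the header
        if not stripped.startswith("#"):
            break  # first line of real code ends the header
        notes.append(stripped)
    return notes


if __name__ == "__main__":
    # A made-up script header used purely for demonstration.
    sample = (
        "#!/usr/bin/python\n"
        "# Requires: hypothetical-module\n"
        "\n"
        "import sys\n"
        "# a later comment that is not part of the header\n"
    )
    for note in leading_comments(sample):
        print(note)
```

To apply it to a real file, read the file's contents into a string and pass that string to the function.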

We do not guarantee that all programs are bug-free (who does?), so we welcome feature requests and bug reports addressed to malwarecookbook@gmail.com. If we do provide updates for the code in the future, you can always find the most recent versions at http://www.malwarecookbook.com.

The following table shows a summary of the tools that you can find on the DVD, including the corresponding recipe number, programming language, and intended platform.

Table 1
