VMware Incident Response: A Process
Basics of Incident Response
This report gives an overview of the steps taken during an incident response I assisted in. The write-up serves as a personal reference and as a request for feedback on the steps taken to respond to a compromised system. While each of these topics can be discussed and practiced at length, this is an overview of what occurs during an incident response, without the technical overhead.
Overview of the Incident Response Process
1. Contact with the Consultee
2. Acquisition of Evidence
3. Disk Forensics
4. Memory Forensics
5. Reverse Engineering of Collected Evidence
1. Contact with the Consultee
The initial meeting with whoever the incident response is being conducted for is critical: it is the chance to gather as much information about the incident as possible and to ensure that access to all files and evidence is granted. This should happen before any technical work begins.
In this phase it is important to:
Get access to the files needed for the investigation
- Disk Forensics (.vmdk)
- Memory Forensics (.vmsn)
- Extra files (Delta VMDK, Previous Snapshots)
Further research should be done to determine the best course of action, as there is debate on the best way to image a virtual machine. While the disk image is acquired by simply copying the .vmdk, memory is where things get complicated. The most forensically sound method of acquiring a memory image is to pause the virtual machine and then take a snapshot. This saves the output as a .vmsn, which can be used for memory analysis with a tool such as Volatility.
Overall, the more files that can be gathered, the better. The most recent snapshots and files are helpful, but if the attack happened earlier, older snapshots and files will come in handy.
Get contact information for all personnel involved with the compromised service
- Administrators of the compromised machines
- Support groups (Security Team, VMware Team, Customer Service, etc.)
If access is required or critical information needs to be communicated, it is best to have contact details for anyone who may need to be informed throughout the case. In my experience, working a case after 5 pm becomes difficult because everyone has left work. Having someone's phone number or personal e-mail can save time and, in an emergency, provide quick answers to questions the response team may have.
Background information & Questions to Ask
- When did you know the machine was compromised?
- What indicated that the machine was compromised?
- Who had access to the machine? What did the machine have access to / was networked with?
- What other interactions with the compromised machine took place?
2. Acquisition of Evidence
With the initial files located, it is important to get copies to the rest of the team and to ensure there are backups. All the files needed for the investigation should first be copied off the compromised virtual machine; backups and working copies of the evidence can then be made from those. Remember the rule "two is one and one is none".
At the same time, keeping interaction with the evidence to a minimum is key. When working with memory images, every interaction with the machine can have a large impact. As mentioned above, suspending the machine and then acquiring a snapshot is the most forensically sound process.
Depending on the environment, external hard drives and USB drives work well for storing .vmdk and .vmsn files. When working with malware extracted from any of these files, any drive holding the malware should be clearly labeled. Designated labs or sandboxes should be used when working with these files, and the general rules of malware analysis apply.
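One way to confirm that each backup really matches the file it was copied from is to compare cryptographic hashes. The sketch below is a minimal illustration, not part of the original investigation; the function names and any paths passed to them are assumptions.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so large images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original, copies):
    """Compare each copy's digest with the original's and return {copy_path: matches}."""
    expected = sha256_of(original)
    return {str(copy): sha256_of(copy) == expected for copy in copies}
```

Calling `verify_copies` with the acquired .vmdk or .vmsn and the paths of its backups returns `False` for any copy that differs from the original, which is a cue to re-copy before analysis begins.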
3. Disk Forensics
With the files acquired, analysis of the server's current hard drive can begin. A forensics toolkit like Autopsy or the FTK suite can be used to browse the disk. Depending on the size of the disk image, a few common areas of infection can be checked first to see what may have happened on the machine. The FTK suite was used in the recent investigation, following the approach below.
Basic Approach to Disk Forensics
- Sort the files by type to bring executables to the top
- Search common infection areas such as Temporary Files, SYSTEM32 (Windows), Downloads
- Search areas specific to the compromised server's role (e.g. web directories)
- Search for any indicators of compromise provided by the consultees
- Review logs
When searching for indicators of compromise, artifacts left over from the attack can often be found on disk. Looking for the low-hanging fruit is usually the best approach. Once one indicator is found, it often leads down a rabbit hole of further artifacts, and the story begins to appear.
When artifacts are found, their creation and modification times can be used to start building a timeline of the attack (see Section 6). If an executable or text file left by the attacker was created at a specific time, reviewing system logs and sorting files in the forensics suite around that date and time can lead to the discovery of other artifacts or actions left by the attacker.
Each new artifact provides new search terms and time windows, so repeating the process as information accumulates uncovers the greatest number of artifacts. Throughout the entire forensics process, repeating previous steps, time and resources permitting, allows anything missed in the first pass to be found.
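The basic approach above can be sketched in a few lines for cases where the image's file tree has been exported or mounted read-only. This is an illustrative sketch only, not how FTK works internally; the extension list, the directory hints, and the function name `triage` are all assumptions to adapt per case.

```python
import os
from datetime import datetime, timezone

# Assumed extensions and directory-name hints; adjust both to the case at hand.
EXECUTABLE_EXTENSIONS = {".exe", ".dll", ".bat", ".ps1", ".vbs", ".php", ".jsp"}
SUSPECT_DIR_HINTS = ("temp", "system32", "downloads", "wwwroot")


def triage(root):
    """Walk an exported file tree and return (modified_utc, path) for
    executable-type files in commonly abused directories, oldest first."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        # Match hints against the path relative to root, not the mount point itself.
        relative = os.path.relpath(folder, root).lower()
        if not any(hint in relative for hint in SUSPECT_DIR_HINTS):
            continue
        for name in files:
            if os.path.splitext(name)[1].lower() in EXECUTABLE_EXTENSIONS:
                path = os.path.join(folder, name)
                modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
                hits.append((modified, path))
    return sorted(hits)
```

Sorting by modification time puts files changed around the same moment next to each other, which is a convenient starting point for the timeline. Note that modification times on an exported copy are only meaningful if the export preserved them.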
4. Memory Forensics
Memory forensics offers a view of everything happening on the machine at a given point in time. However, it is only a snapshot: if the attack took place long before the memory image was captured, artifacts may be difficult to find. In short, memory forensics can either make an investigation incredibly short or have little to offer at all.
When working with virtual machines, snapshot files (.vmsn) can be used by memory forensics tools. Volatility is a great tool for memory forensics, and many plugins and add-ons can make the process easier. Below are some common artifacts along with the Volatility plugins that surface them.
Common artifacts found in memory include, but are not limited to:
- Running Processes (pslist, psxview, pstree)
- Connections (connscan, connections, sockets, netscan)
- Commandline Usage (cmdscan, consoles)
- DLLs used in memory (dlllist, dlldump)
- Injections & Malware (malfind, modscan)
The list above is brief; the full list of commands available in Volatility can be found in the Volatility Command Reference (see References & Tools). The amount of information that can be discerned from memory is vast. When memory forensics is used in conjunction with disk forensics as well as network and dynamic analysis, a clear picture of the attack can be developed.
With artifacts such as running processes, current connections, and command-line usage, what the attacker may have been doing on the machine becomes apparent. If there are any malicious programs or activity on the machine at the time the memory was captured, they can, with some training, be quickly identified using memory forensics.
Processes and executables found during disk forensics can be searched for in memory. Network connections can then be verified with bulk_extractor, which can carve network packets out of the memory image into a capture file; opening that file in Wireshark and filtering it can reveal other hosts and files of interest.
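As a rough illustration of how the plugins listed above are invoked, the sketch below builds Volatility 2 style command lines (`vol.py -f <image> --profile=<profile> <plugin>`) without running them. The image name and the `Win7SP1x64` profile are placeholder assumptions; the correct profile depends on the guest operating system.

```python
import shlex

# Plugins named in the list above, grouped by the artifact they surface.
# dlldump is left out because it additionally needs an output directory.
# connscan, connections and sockets target XP/2003-era images; netscan targets Vista and later.
PLUGINS = {
    "processes": ["pslist", "psxview", "pstree"],
    "connections": ["connscan", "connections", "sockets", "netscan"],
    "command line": ["cmdscan", "consoles"],
    "dlls": ["dlllist"],
    "injection": ["malfind", "modscan"],
}


def volatility_commands(image, profile, vol="vol.py"):
    """Return one Volatility 2 command line per plugin; nothing is executed."""
    commands = []
    for plugins in PLUGINS.values():
        for plugin in plugins:
            argv = [vol, "-f", image, f"--profile={profile}", plugin]
            commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


# Example with placeholder image and profile names.
for line in volatility_commands("server.vmsn", "Win7SP1x64"):
    print(line)
```

Generating the commands up front makes it easy to run the same set against every snapshot and to keep a record of exactly what was run for the report.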
5. Reverse Engineering of Collected Evidence
Many disk image forensics and memory analysis tools can export files and executables found within the evidence. When malicious executables are found, either in memory or in file systems, they can be exported and then examined to learn more about what the attacker was attempting to accomplish and whether or not the malware is communicating outbound to the attacker.
Both static and dynamic analysis can be performed. Static analysis can provide more information on what an executable does when it is run, whether it has been obfuscated or packed (to hinder reverse engineering), and where the program may have come from. Dynamic analysis is the process of running the program in a controlled environment to watch the changes made to the system in real time.
Overall, reverse engineering and analysis can provide:
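A first static pass can be done without ever executing the sample. The sketch below is a minimal example under stated assumptions: it hashes an exported file, computes its byte entropy (values approaching 8 bits per byte are a common hint, not proof, of packing or encryption), and pulls out printable strings. The function names and the minimum string length are illustrative choices.

```python
import hashlib
import math
import re
from collections import Counter


def shannon_entropy(data):
    """Bits of entropy per byte: 0.0 for uniform content, up to 8.0 for random-looking data."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def printable_strings(data, min_length=6):
    """Return runs of printable ASCII that are at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def static_summary(path):
    """Hash, entropy, and strings for one exported sample; the file is only read, never run."""
    with open(path, "rb") as handle:
        data = handle.read()
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "entropy": round(shannon_entropy(data), 2),
        "strings": printable_strings(data),
    }
```

The hash gives a stable identifier for the sample in notes and reports, and extracted strings such as URLs, IP addresses, or file paths often become new indicators to search for on disk and in memory.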
- What the malware does
- Where it came from
- What it injects itself into, what DLLs are called
- Changes made to the computer (Injection, DLLs, Registry)
- Communication to outside networks / attackers
- Functions of the malware
Best practices and other guides on these processes are available online and should be followed. Ideally, a sandbox lab environment should be built before executing malware on a machine. Tools can monitor the exact changes made to the system, and networking can be simulated and recorded. This allows for the best understanding of the malware but may be time-consuming.
6. Timeline
It is incredibly important, both to the consultee and to the incident response team, to build an overall timeline of the events that took place during the compromise. Using artifacts found through both disk and memory forensics, a timeline of events can be constructed. System logs and the timeline features of various forensics programs can help identify when the attack happened and, more importantly, whether pivoting occurred.
Incident response is all about information. With a timeline in place, it can be cross-referenced with company holidays and with the working hours of known countries or APT groups, and it can help indicate whether an insider threat was involved. For example, if the malware's code contains comments or characteristics of a known APT group from a particular country, comparing the time of infection against that country's working hours and holidays can support attribution.
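A simple version of this cross-referencing can be sketched as follows. The events, the 09:00 to 17:00 working hours, and the function name are hypothetical; the sketch also assumes all timestamps have already been normalized to one time zone, which real evidence sources often are not.

```python
from datetime import datetime, time

# Hypothetical events; in practice these come from file metadata, logs, and memory artifacts.
EVENTS = [
    ("2019-03-02 02:14:00", "disk", "new executable in web directory"),
    ("2019-03-01 16:40:00", "log", "failed logins against admin account"),
    ("2019-03-02 02:20:00", "memory", "outbound connection from unknown process"),
]


def build_timeline(events, work_start=time(9, 0), work_end=time(17, 0)):
    """Sort events chronologically and flag those on weekends or outside working hours."""
    timeline = []
    for stamp, source, description in events:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        off_hours = when.weekday() >= 5 or not (work_start <= when.time() < work_end)
        timeline.append((when, source, description, off_hours))
    return sorted(timeline)


for when, source, description, off_hours in build_timeline(EVENTS):
    flag = "off-hours" if off_hours else "working hours"
    print(f"{when:%Y-%m-%d %H:%M}  [{source}]  {description}  ({flag})")
```

Re-running the same timeline with the working hours of a different time zone shows at a glance whether the activity lines up with someone else's business day.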
7. Recommendations
After the timeline and the other aspects of analysis are complete, recommendations should be made to the consultee. A few main questions should be answered first; from there the analysis can be concluded.
Questions to consider:
- Does the attack vector used still exist on other servers or machines?
- Was the attacker able to pivot through the network? Does further analysis need to take place?
- What should be done with the compromised machine? Can it be cleaned or should it be removed entirely?
- What updates and patches must be installed?
Once this is known, the report can be built. There are resources on how best to frame recommendations for a consultee. The BTFM (Blue Team Field Manual) is a great reference for what to include in the recommendations. Communication with the consultee is very important, as they may also ask for specific advice or information related to the case.
Generally, the recommendation section should include the following:
Action items should include anything that must be done immediately to control the situation. This part of the report should cover password changes; changes to firewall, IPS, and IDS rules; and GPO and Active Directory changes: anything that is critical to the security of the network and to ensuring the attacker no longer has access to it.
Host-specific recommendations cover the actions to be taken against the compromised host(s): whether the machine should be taken offline permanently or can be cleaned, and any changes to logging or updates needed on that machine.
Organization-wide recommendations cover, on a larger scale, the actions the consultee's organization as a whole should take to better prepare against future attacks. Updating and patching systems in use, reviewing vulnerabilities in applications in use, and looking for vulnerabilities or attack vectors similar to those used in the compromise are all possible items to include.
Overall, the closing report should include as much information as possible on the above subjects. The timeline, the vectors of the attack, and the recommendations should be emphasized. Most consultees are going to want to know how the attacker got in, what they took, and whether they still have access. Each report will be unique to the consultee's requests but should generally follow the outline above.
References & Tools
Hirwani, Manish; Pan, Yin; Stackpole, Bill; and Johnson, Daryl, "Forensic Acquisition and Analysis of VMware Virtual Hard Disks" (2012). Accessed from http://scholarworks.rit.edu/other/297
BTFM (Blue Team Field Manual)
Volatility Command References
Forensic Toolkit (FTK)