Sunday, April 26, 2015

Timeline Analysis Process

Having discussed timeline analysis in a couple of blog posts so far (such as here...), I thought I'd take an opportunity to dig a bit deeper into the process I've been using for some time to actually conduct analysis of timelines that I create during engagements.

Before we get into the guts of the post, I'd like to start by saying that this is what I've found works for me, and it does not mean that this is all there is.  This is what I've done, and been doing, tweaking the process a bit here and there over the years.  I've looked at things like visualization to try to assist in the analysis of timelines, but I have yet to find something that works for me.  I've seen how others "do" timeline analysis, and for whatever reason, I keep coming back to this process.  However, that does not mean that I'm not open to other thoughts, or discussion of other processes.  It's quite the opposite, in fact...I'm always open to discussing ways to improve any analysis process.

Usually when I create a timeline, it's because I have something specific that I'm looking for that can be shown through a timeline; in short, I won't create a timeline (even a micro-timeline) without a reason to do so.  It may be a bit of malware, a specific Registry entry or Windows Event Log record, a time frame, etc.  As is very often the case, I'll have an indicator, such as a web shell file on a system, and find that in some cases, the time frames don't line up between systems, even though the artifact is the same across those systems.  It's this information that I can then use, in part, to go beyond the "information" of a particular case, and develop intelligence regarding an adversary's hours of operations, actions on objectives, etc.

When creating a timeline, I'll use different tools, depending upon what data I have access to.  As I mentioned before, there are times when all I'll have is a Registry hive file, or a couple of Windows Event Logs, usually provided by another analyst, but even with limited data sources, I can still often find data of interest, or of value, to a case.  When adding Registry data to a timeline, I'll start with regtime to add key Last Write times from a hive to the timeline.  This tool doesn't let me see Registry values, only key Last Write times, but its value is that it lets me see keys that were created or modified during a specific time frame, telling me where I need to take a closer look.  For example, when I'm looking at a timeline and I see where malware was installed as a Windows service, I'll usually see the creation of the Registry key for the service beneath the Services key (most often beneath both, or all, ControlSets).
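For anyone who wants to see what that step boils down to, here's a rough sketch...this is not regtime itself, and it assumes the third-party python-registry module; the hive file name, key path, and system name are just placeholders.  It walks a hive and prints each key's Last Write time as a pipe-delimited TLN-style event.

# Sketch only: emit Registry key Last Write times as TLN-style events,
# roughly what regtime produces.  Assumes the third-party python-registry
# module (pip install python-registry); file/key/system names are placeholders.
from datetime import timezone
from Registry import Registry

def walk(key):
    yield key
    for subkey in key.subkeys():
        yield from walk(subkey)

reg = Registry.Registry("SYSTEM")                      # exported hive file
for key in walk(reg.open("ControlSet001\\Services")):  # or reg.root() for the whole hive
    lastwrite = key.timestamp().replace(tzinfo=timezone.utc)
    # time|source|system|user|description, with the key path as the description
    print(f"{int(lastwrite.timestamp())}|REG|SERVER01|-|{key.path()}")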

When I want to add time stamped information from Registry value data to a timeline, I'll turn to RegRipper and use the plugins that end in *_tln.pl.  These plugins (generally speaking) will parse the data from Registry values for the time stamped information, and place it into the necessary format to include it in a timeline.  The key aspect to doing this is that the analyst must be aware of the context of the data that they're adding.  For example, many analysts seem to believe that the time stamp found in the AppCompatCache (or ShimCache) data is when the program was executed, when it is actually the file's last modification time, and in several cases (one announced publicly by the analyst), this misconception has been passed along to the customer.
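As a point of reference, the events these plugins emit follow the five-field, pipe-delimited TLN format (time|source|system|user|description).  Here's a minimal sketch of building one such line; the AppCompatCache entry below is purely hypothetical, and the description spells out that the time stamp is a file modification time rather than evidence of execution.

# Minimal sketch of a TLN event line: time|source|system|user|description.
# The entry below is hypothetical; for AppCompatCache/ShimCache data, the
# time stamp is the file's last modification time, NOT proof of execution,
# so the description should say so.
from datetime import datetime, timezone

def tln_line(when, source, system, user, description):
    # the time field is the Unix epoch
    return f"{int(when.timestamp())}|{source}|{system}|{user}|{description}"

ts = datetime(2015, 4, 1, 12, 0, 0, tzinfo=timezone.utc)
print(tln_line(ts, "AppCompatCache", "SERVER01", "-",
               "C:\\Windows\\Temp\\bad.exe [file last modified, not executed]"))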

I also use several tools that let me add just a few events...or even just one event...to the timeline.  For example, the RegRipper secrets_tln.pl plugin lets me add the Last Write time of the Policy/Secrets key from the Security hive to the timeline.  I can check the time stamp first with the secrets.pl plugin to see if it's relevant to the time frame I'm investigating, and if it is, add it to the timeline.  After all, why add something to the timeline that's not relevant to what I'm investigating?

If I want to add an event or two to the timeline, and I don't have a specific parser for the data source, I can use tln.exe (image to the left) to add that event.  I've found this GUI tool to be very useful, in that I can use it to quickly add just a few events to a timeline, particularly when looking at a data source and finding only one or two entries that are truly relevant to my analysis.  I can fire up tln.exe, add the time stamp information to the date and time fields, add a description with a suitable tag, and add it to the timeline.  For example, I've used this to add an event indicating the first time, according to the available web logs, that a web shell placed on the system was accessed.  I added the source IP address of the access in the description field in order to provide context to my timeline; at the same time, adding the event itself provided an additional (and significant) level of relative confidence in the data I was looking at, because the event corresponded exactly to file system artifacts indicating that the web shell had been accessed for the first time.  I chose this route because adding all of the web log data would've added significant volume to my timeline without adding any additional context or utility.
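Just to show what that ends up looking like, here's a small sketch of hand-rolling a single event like the one described above and appending it to the events file; the date, system name, file name, and IP address are all invented for illustration.

# Sketch: hand-roll one TLN event for the first observed access to a web
# shell, with the source IP in the description for context.  All specifics
# (date, system, file, IP) are invented for illustration.
from datetime import datetime, timezone

first_access = datetime(2015, 3, 10, 3, 22, 41, tzinfo=timezone.utc)
event = (f"{int(first_access.timestamp())}|WEBLOG|WEBSRV01|-|"
         "First access to /images/help.aspx from 203.0.113.45")

with open("events.txt", "a") as events:
    events.write(event + "\n")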

When creating a timeline, I start with the events file (usually events.txt), and use parse.exe to create a full timeline, or a partial timeline based on a specific date range.  So, after running parse.exe, I have the events.txt file that contains all of the events I've extracted from the different data sources using different tools (whichever apply at the time), and I have either a full timeline (tln.txt) or a shortened version based on a specific date range...or both.  To begin analysis, I'll usually open the timeline file in Notepad++, which I prefer because it allows me to search for various things, going up or down in the file, and it can give me a total count of how many times a specific search term appears in the file.
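Conceptually, that step boils down to sorting the events by time and (optionally) keeping only a specific date range; the sketch below illustrates the idea, but it is not parse.exe, and the file names and dates are placeholders.

# Sketch only (not parse.exe): sort TLN events and optionally keep just a
# date range, producing a pared-down timeline to open in an editor.
# File names and the date range are placeholders.
from datetime import datetime, timezone

start = datetime(2015, 4, 1, tzinfo=timezone.utc).timestamp()
end = datetime(2015, 4, 30, tzinfo=timezone.utc).timestamp()

with open("events.txt") as f:
    events = [line.rstrip("\n") for line in f if line.strip()]

# TLN lines begin with the epoch time, so sort on that field
events.sort(key=lambda line: int(line.split("|", 1)[0]))

with open("tln.txt", "w") as full, open("tln_range.txt", "w") as ranged:
    for line in events:
        full.write(line + "\n")
        if start <= int(line.split("|", 1)[0]) <= end:
            ranged.write(line + "\n")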


Once I begin my analysis, I'll open another tab in Notepad++, and call it "notes.txt".  This is where all of my analysis notes go while I'm doing timeline analysis.  As I start finding indicators within the timeline, I'll copy-and-paste them out of the timeline file (tln.txt) and into the notes file, keeping everything in the proper sequence.

Timeline analysis has been described as being an iterative process, often performed in "layers".  The first time going through the timeline, I'll usually find a bunch of stuff that requires my attention.  Some stuff will clearly be what I'm looking for; other stuff will be, "...hey, what is this...?"...but most of what's in the timeline will have little to do with the goals of my exam.  I'm usually not interested in things like software and application updates, etc., so having the notes file available lets me focus on what I need to see.  Also, I can easily revisit something in the timeline by copying the date from the notes file and doing a search in the timeline...this will take me right to that date.

Recently, while doing some timeline analysis, I pulled a series of indicators out of the timeline and pasted them into the notes file.  Once I'd followed that thread, I determined that what I was seeing was adware being installed.  The user actively used the browser, and the events were far enough back in time that I wasn't able to correlate the adware installation with the site(s) the user had visited, but I was able to complete that line of analysis, note what I'd found, remove the entries from the notes file, and move on.

As timeline analysis continues, I very often keep the data source(s) open and available, along with the timeline, as I may want to see something specific, such as the contents of a file, or the values beneath a Registry key.  Let's say that the timeline shows that the Last Write time of the HKLM/../Run key falls within a specific time frame that I'm interested in; I can take a look at the contents of the key, and add any notes I may have ("...there is only a single value named 'blah'...") to the notes file.
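If I want to script that quick look rather than open the hive in a viewer, something like the following sketch works; again, it assumes the third-party python-registry module, and the hive file name is a placeholder.

# Sketch: when the timeline shows the Run key's Last Write time in the
# window of interest, list its values for the notes file.  Assumes the
# third-party python-registry module; the hive file name is a placeholder.
from Registry import Registry

software = Registry.Registry("SOFTWARE")
run_key = software.open("Microsoft\\Windows\\CurrentVersion\\Run")

print(f"Run key Last Write: {run_key.timestamp()} UTC")
for value in run_key.values():
    print(f"  {value.name()} -> {value.value()}")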

Many times, I will have to do research online regarding some of the entries in the timeline.  Most often, this will have to do with Windows Event Log entries; I need to develop an understanding of what the source/ID pair refers to, so that I can fill in the strings extracted from the record and develop context around the event itself.  Sometimes I will find Microsoft-Windows-Security-Auditing/5156 events that contain specific process names or IP addresses of interest.  Many times, Windows Event Log record source/ID pairs that are of interest will get added to my eventmap.txt file with an appropriate tag, so that I have additional indicators that automatically get identified on future cases.
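I won't try to reproduce the exact layout of eventmap.txt here, but the idea is simple enough to sketch: a lookup of source/ID pairs to short tags that get added to the event description, so that known-interesting records stand out automatically in future timelines.  The mapping below is illustrative only, not a copy of the actual file.

# Sketch of the idea behind an event map: tag known source/ID pairs so they
# stand out in future timelines.  The mapping is illustrative, not a copy of
# eventmap.txt.
EVENT_TAGS = {
    ("Microsoft-Windows-Security-Auditing", 5156): "[Network Connection]",
    ("Microsoft-Windows-Security-Auditing", 4624): "[Logon]",
    ("Service Control Manager", 7045): "[Service Installed]",
}

def tag_event(source, event_id, description):
    tag = EVENT_TAGS.get((source, event_id), "")
    return f"{tag} {description}".strip()

print(tag_event("Service Control Manager", 7045, "A service was installed in the system: blah"))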

Not everything extracted from the timeline is going to be applicable to the goals of my analysis.  I've pulled data from a timeline and my research has determined that the events in question were adware being installed.  At that point, I can remove the data from my notes file.

By the time I've completed my timeline analysis, I have the events file (all of the original events), the timeline file, and the notes file.  The notes file is where I'll have the real guts of my analysis, and from where I'll pull things such as indicator clusters (several events that, when they appear together, provide context and a greater level of confidence in the data...) that I've validated from previous engagements, and will use in future engagements, as well as intel I can use in conjunction with other analysis (other systems, etc.) to develop a detailed picture of activity within the enterprise.

Again, this is just the process I use and have found effective...this is not to say that this is what will work for everyone.  And I have tried other processes.  I've had analysts send me automatically-created, colorized spreadsheets, and to be honest, I've never been very clear as to what they'd found or thought to be the issue.  That is not to say that this method of analysis isn't effective for some, as I'm sure it is...I simply don't find it effective.  The process I described has been effective for me at a number of levels...from having a single data source from a system (i.e., a single Registry hive...), to having an entire image, to analyzing a number of systems from the same infrastructure.  And again, I'm not completely stuck on this process...I'm very open to discussion of other processes, and if I do find something that proves to be effective, I have no problem adding it to what I do, or even changing my process altogether.

8 comments:

Anonymous said...

Do you use a multi-monitor PC configuration to keep many case data sources open at the same time, or do you just switch among windows on a single monitor?

What hardware configuration is your primary investigative PC built on? Is it an average "office PC" system, a laptop or a desktop, a high-end system with an SSD and plenty of RAM, or maybe not?

Do you make extensive use of VMs as DFIR workstations in your investigations (VirtualBox/VMware)?

Please give us a feel for what your workstation really looks like (if you're willing/able, of course).

Thank you

H. Carvey said...

Anonymous,

My primary analysis system is a Dell Latitude E6510 laptop that I purchased off of the Dell refurbished system shelf maybe 4 yrs ago, nothing high end at all.

I'm running Windows 7, and I have VirtualBox installed, but the only time I've really used it (beyond testing the Windows 10 TP) has been to convert VMware compressed images into .vmdk files I can access with FTK Imager.

Almost all of the tools I use are freeware, with the notable exception being UltraEdit. Many of the tools I use, I wrote myself.

I've seen high-end analysis systems before, and it's looked like using a Stinger missile to swat a fly. If you know what you're looking for, and you're familiar with your data sources, you don't need "high-end" systems, in most cases.

If there is something that I need to do that will take a while, such as scan the image with AV or carve for some records, I'll schedule that for the evening when I'm not actively working the analysis.

How about you? What does your analysis system look like?

Anonymous said...

I must admit I have a somewhat beefier environment, but not solely for DFIR tasks; it's also for other engagements I work on (e.g., pentesting).

As I prefer a command-line type of environment (yes, I come from unixland) and at the same time want constant insight into as much of the case data as possible, I usually have up to four displays with different case data sources spread across them, except for the main display, which I use for the core digging of the case and for correlating the data shown on the other displays. Using multiple displays, I work more quickly and have a better mental connection to the case data. But beware: having four to five displays is (IMHO) the upper limit in order not to introduce data overload.

My primary DFIR workstation is a SIFT 3 VM guest on a Linux host (reverting to a snapshot is a killer feature!), with a Windows 7 VM guest as a secondary workstation, primarily for Windows-only tools.

Also, in order to quickly test forensic scenarios/artifacts from a case, I have set up a virtual network with a test Windows domain and a few Linux machines, all as VMs on the same host system, isolated from the real wired network.

As for the hardware profile of my system, it's currently an HP ZBook G2 with a beefier config (32GB RAM/high-end CPU/the largest SSD available), and I will continue to use that system for (usually) another two to three years. As you can see, I tend to concentrate my whole "forensic lab" into a single machine, but that has its pros considering the environments I work in. The machine is (of course) FDE-encrypted and regularly backed up to safe storage.

Regarding DFIR software, SIFT 3 is already a great repository of forensic tools by itself (including yours - thanks!), which I supplement with TZWorks tools (no ads here - I find their tools great, considering that the majority are command line). When I need to use Windows (which I almost always do, to confirm findings already made inside SIFT), I use TZWorks (again), EnCase Forensic (primarily as a case container), and IEF from Magnet Forensics.

Regards

Mari DeGrazia said...

Just wanted to add - the nice thing about cutting and pasting the timeline "snippets" into a new document is that once I am ready for the report, I can save the file as a .csv, open it in Excel, add some borders, and throw it into the report fairly easily.

Oh - and I love having a multi monitor display :-)

Libby Baugher said...

I have a couple of questions about RegRipper. I'm doing USB analysis, and the output from RegRipper shows LastWrite:. Is this the last time it was written to? Or inserted?
Also, the time says Thu Apr 16 15:18:02 2015. Is this local or UTC?

Thanks!

H. Carvey said...

Libby,

I responded to your email...I hope it was helpful.

Unknown said...

@harlan
Hi, I'm a master's student working on my thesis, "how to detect malicious PDF files based on timeline analysis". My work is to check what events a file (PDF) creates when it is opened, what the PDF viewer (Adobe) does when it opens a file, etc., and then write a parser or something to filter those out of my supertimeline. From the system image, I have extracted:

1. a supertimeline, using the log2timeline tool (evt, software, prefetch, pdf, mft, ntuser, mactime)
2. the journal file
3. the MFT file

Now I want to know how to filter this huge number of events to achieve my goal (detecting malicious PDF files). I don't know where I should start.
Can you help me, please?

H. Carvey said...

Noor,

You sure picked an odd way to communicate...I tried checking your Google profile and there's no email address that I can find.

Question...why are you using the MFT twice?

I would start by considering how a PDF document gets on the system to start with, and look for artifacts of that. I'd then look for indications of the user opening the document.

Hope that helps.