Tuesday, March 30, 2010

Links

Timelines
Grayson finished off a malware/adware exam recently by creating a timeline, apparently using several different tools, including fls/mactime, Mandiant's Web Historian, regtime.pl from SIFT v2.0, etc. Not a bad way to get started, really...and I do think that Grayson's post, in a lot of ways, starts to demonstrate the usefulness of creating timelines and illustrating activity that you wouldn't necessarily see any other way. I mean, how else would you see that a file was created and modified at about the same time, and then see other files "nearby" that were created, modified, or accessed? Add to that logins visible via the Event Log, Prefetch files being created or modified, etc...all of which adds to our level of confidence in the data, as well as to the context of what we're looking at.

I've mentioned this before, but another timeline creation tool is AfterTime from NFILabs. In some ways, it looks like timeline creation is gaining attention as an analysis technique. More tools and techniques are coming out, but I still believe that considerable thought needs to go into visualization. I think that automatically parsing and adding every data source you have available to a timeline can easily overwhelm any analyst, particularly when malware and intrusion incidents remain the least frequency of occurrence on a system.

Event Log Parsing and Analysis
I wanted to point out Andreas' EvtxParser v1.0.4 tools again. I've seen folks get into positions where they need to parse Windows Event Log files, and Andreas has done a great deal of work in providing a means to do so for Vista systems and above, without having a like system available.

IR in the Cloud
Here's an interesting discussion about IR in the cloud that I found via TaoSecurity. While there are a number of views and thoughts in the thread, I would generally tend to stay away from discussions where folks start with, "...I'm not a lawyer nor expert in cloud computing or forensics..."...it's not that I feel that anyone needs to be an expert in any particular area, but that kind of statement seems to say, "I have no basis upon which to form an opinion...but I will anyway." The fact of the matter is that there're a lot of smart folks (even the one who admitted to not being a lawyer...something I'd do every day! ;-) ) in the thread...and sometimes the toughest question that can be asked is "why?"

Cloud computing is definitely a largely misunderstood concept at this point, and to be honest, it really depends on the implementation. By that, I mean that IR depends on the implementation...just as IR activities depend on whether the system I'm reacting to is right in front of me, or in another city.

Incident Preparedness
On the subject of IR, let's take a step back to incident preparedness. Ever seen the first Mission: Impossible movie? Remember when Ethan makes it back to the safe house, gets to the top of the stairs and removes a light bulb, crushes it in his jacket and lays out the shards in the darkened hallway as he backs toward his room? He's just installed rudimentary incident detection...anyone who steps into the now-dark hallway will step on shards of glass, alerting him to their presence.

Okay, so who should be worried about incidents? Well, anyone who uses a computer. Seriously. Companies like Verizon, TrustWave and Mandiant have released reports based on investigations they've been called in for, and Brian Krebs makes it pretty clear in his blog that EVERYONE is susceptible...read this.

Interestingly, in Brian's experience, folks hit with this situation have also been infected with Zbot or Zeus. The MMPC reported in Sept 2009 that Zbot was added to MRT; while it won't help those dentists now, I wonder what level of protection they had at the time. I also wonder how they feel now about spending $10K or less in setting up some kind of protection.

I can see the economics in this kind of attack...large organizations (TJX?) may not see $200K as an issue, but a small business will. It will be a huge issue, and may be the difference between staying open and filing for bankruptcy. So why take a little at a time from a big target when you can drain small targets all over, and then move on to the next one? If you don't think that this is an issue, keep an eye on Brian's blog.

Malware Recovery
Speaking of Brian, he also has an excellent blog post on removing viruses from systems that won't boot. He points to a number of bootable Linux CDs, any of which are good for recovery and IR ops. I've always recommended the use of multiple AV scanners as a means of detecting malware, because even in the face of new variants that aren't detected, using multiple tools is still preferable to using just one.

F-Response
For those of you who aren't aware, F-Response has Linux boot CD capability now, so you can access systems that have been shut off.

Dougee posted an article on using an F-Response boot CD from a remote location...something definitely worth checking out, regardless of whether you have F-Response or not. Something like this could be what gets your boss to say "yes"!

Extend your arsenal!

Browser Forensics
For anyone who deals with cases involving user browser activity on a system, you may want to take a look at BrowserForensics.org. There's a PDF (and PPT) of the browser forensics course available that looks to be pretty good, and well worth the read. There's enough specialization required in browser forensics alone...so much to know...that I could easily see a training course and reference materials dedicated to just that topic.

Bablodos
Speaking of malware, the folks over at Dasient have an interesting post on the "Anatomy of..." a bit of malware...this one called Bablodos. These are always good to read as they can give a view into trends, as well as specifics regarding a particular piece of malware.

Google has a safe browsing diagnostic page for Bablodos here.

Book Translations
I got word from the publisher recently that Windows Forensic Analysis is being translated into French, and will be available at some point in the future. Sorry, but that's all I have at the moment...hopefully, that will go well and other translations will be (have been, I hope) picked up.

Sunday, March 28, 2010

Thought of the Day

Don't be dependent upon tools; rather, focus on the goals of your exam, and let those guide you.

When starting an exam, what is the first question that comes to mind? If it's "...now where did I leave my dongle?", then maybe that's the wrong question. I'm a pretty big proponent for timeline creation and analysis, but I don't always start an exam by locating every data source and adding it to a timeline...because that just doesn't make sense.

For example, if I'm facing a question of the Trojan Defense, I may not even create a timeline...because for the most part, we already know that the system contains contraband images, and we may already know, or not be concerned with, how they actually got there. If the real question is whether or not the user was aware that the images were there, I'll pursue other avenues first.

Don't let your tools guide you. Don't try to fit your exam to whichever tool you have available or were trained in. You should be working on a copy of the data, so you're not going to destroy the original, and the data will be there. Focus on the goals of your exam and let those guide your analysis.

Saturday, March 27, 2010

Thought of the Day

Today's TotD is this...what are all of the legislative and regulatory requirements that have been published over the last...what is it...5 or more years?

By "legislative", I mean laws...state notification laws. By "regulatory", I mean stuff like HIPAA, PCI, NCUA, etc., requirements. When you really boil them down, what are they?

They're all someone's way of saying, if you're going to carry the egg, don't drop it. Think about it...for years (yes, years), auditors and all of us in the infosec consulting field have been talking about the things organizations can do to provide a modicum of information security within their organizations. Password policies...which include having a password (hello...yes, I'm talking to you, sa account on that SQL Server...) - does that sound familiar? Think about it...some auditor said it was necessary, and now there's some compliance or regulatory measure that says the same thing.

As a consultant, I have to read a lot of things. Depending upon the customer, I may have to read up on HIPAA one week, and then re-familiarize myself with the current PCI DSS the next. Okay, well, not so much anymore...but my point is that when I've done this, there's been an overwhelming sense of deja vu...not only have infosec folks said these things, but in many ways, under the hood, a lot of these things say the same thing.

With respect to IR, the PCI DSS specifically states, almost like "thou shalt...", that an organization must have an incident response capability (it's in Requirement 12). Ever read CA SB 1386? How would any organization comply with this state law (or any of the other...how many...state laws?) without having some sort of incident detection and response capability?

My point...and my thought...is that this is really no different from being a parent. For years, organizations have been told by auditors and consultants that they need to tighten up infosec, and that they can do so without impacting business ops. Now, regulatory organizations and even legislatures have gotten into the mix...and there are consequences. While a fine from not being in compliance may not amount to much, the act of having to tell someone what happened does have an impact.

Finally, please don't think that I'm trying to equate compliance to security. Compliance is a snapshot in time, security is a business process. But when you're working with organizations that have been around for 30 or so years and have not really had much in the way of a security infrastructure, compliance is a necessary first step.

Friday, March 26, 2010

Responding to Incidents

Lenny Zeltser has an excellent presentation on his web site that discusses how to respond to the unexpected. This presentation is well worth the time it takes to read it...and I mean that not only for C-suite executives, but also IT staff and responders, as well. Take a few minutes to read through the presentation...it speaks a lot of truth.

When discussing topics like this, in the past, I've thought it would be a good idea to split the presentation, discussing responding as (a) a consultant and (b) a full-time employee (FTE) staff member. My reasoning was that as a consultant, you're walking into a completely new environment, whereas as a member of FTE staff, you've been working in that environment for weeks, months, or even years, and would tend to be very familiar with it. However, as a consultant/responder, my experience has been that most response staff isn't familiar with the incident response aspect of their...well...incident response. FTE response staff is largely ad-hoc, as well as under-trained, in response...so while they may be very familiar with the day-to-day management of systems, most times they are not familiar with how to respond to incidents in a manner that's consistent with senior management's goals. If there are any. After all, if they were, folks like me wouldn't need to be there, right?

So, the reason I mention this at all is that when an incident occurs, the very first decisions made and actions that occur have a profound effect on how the incident plays out. If (not when) an incident is detected, what happens? Many times, an admin pulls the system, wipes the drive, and then reloads the OS and data, and puts the system back into service. While this is the most direct route to recovery, it does absolutely nothing to determine the root cause, or prevent the incident from happening again in the future.

The general attitude seems to be that the needs of infosec in general, and IR activities in particular, run counter to the needs of the business. IR is something of a "new" concept to most folks, and very often, the primary business goal is to keep systems running and functionality available, whereas IR generally wants to take systems offline. In short, security breaks stuff.

Well, this can be true, IF you go about your security and IR blindly. However, if you look at incident response specifically, and infosec in general, as a business process, and incorporate it along with your other business processes (i.e., marketing, sales, collections, payroll, etc.), then you can not only maintain your usability and productivity, but also save yourself a LOT of money and headaches. You're not only going to be in compliance (name your legislative or regulatory body/ies of choice with that one...) and avoid costly audits, re-audits and fines, but you're also likely going to save yourself when (notice the use of the word when, rather than if...) an incident happens.

I wanted to present a few scenarios that are culminations of my own experience performing incident response in various environments for over 10 years. I think these scenarios are important, because like other famous events in history, they can show us what we've done right or wrong.

Scenario 1: An incident, a malware infection, is detected and the local IT staff reacts quickly and efficiently, determining that the malware was on 12 different systems in the infrastructure and eradicating each instance. During the course of response, someone found a string and, without any indication that it applied directly to the incident, Googled the string and added a relationship with a keystroke logger to the incident notes. A week later at a director's meeting, the IT director described the incident and applauded his staff's reactions. Legal counsel, also responsible for compliance, took issue with the incident description, due to the possibility of, and the lack of information regarding, data exfiltration. Due to the location and use of the "cleaned" systems within the infrastructure, regulatory and compliance issues are raised, in part because of the malware's association with a keystroke logger, but questions cannot be answered, as the actual malware itself was never completely identified nor was a sample saved. Per legislative and regulatory requirements, the organization must now assume that any sensitive data that could have been exfiltrated was, in fact, compromised.

Scenario 2: An incident is detected involving several e-commerce servers. The local IT staff is not trained in IR, nor has any practical knowledge of it, and while their manager reports potential issues to his management, a couple of admins begin poking around on the servers, installing and running AV (nothing found), deleting some files, etc. Management decides to wait to see if the "problem" settles down. Two days later, one of the admins decides to connect a sniffer to the outbound portion of the network, and sees several files being moved off of the systems. Locating those files on the systems, the admin determines that the files contain PCI data; however, the servers themselves cannot be shut down. The admin reports this, but it takes 96 hrs to locate IR consultants, get them on-site, and have the IT staff familiarize the consultants with the environment. It takes even longer due to the fact that the one IT admin who knows how the systems interact and where they're actually located in the data center is on vacation.

Scenario 3: A company that provided remote shell-based access for their employees was in the process of transitioning to two-factor authentication when a regular log review detected that particular user credentials were being used to log in from a different location. IT immediately shut down all remote access, and changed all admin-level passwords. Examination of logs indicated that the intruder had accessed the infrastructure with one set of credentials, used those to transition to another set, but maintained shell-based access. The second account was immediately disabled, but not deleted. While IR consultants were on their way on-site, the local IT staff identified systems the intruder had accessed. A definitive list of files known to contain 'sensitive data' (already compiled) was provided to the consultants, who determined through several means that there were no indications that those files had been accessed by the intruder. The company was able to report this with confidence to regulatory oversight bodies, and while a small fine was imposed, a much larger fine was avoided, as were notification and disclosure costs and the follow-on costs (i.e., cost to change credit/debit cards, pay for credit monitoring, civil suits, etc.).

Remember, we're not talking about a small, single-owner storefront here...we're talking about companies that store and process data about you and me...PII, PHI, PCI/credit card data, etc. Massive amounts of data that someone wants because it means massive amounts of money to them.

So, in your next visit from an auditor, when they ask "Got IR?" what are you going to say?

Links

Evtx Parsing
Andreas has released an update to his Evtx Parser tools, bringing the version up to 1.0.4. A great big thanks to Andreas for providing these tools, and the capability for parsing this new format from MS.

F-Response Boot CD
As if F-Response wasn't an amazing enough tool as it is, Matt's now got a boot CD for F-Response! Pretty soon, Matt's going to hem everyone in and the only excuse you'll have for NOT having and using F-Response is that you live in a cave, don't have a computer, and don't get on the InterWebs...

Malware & Bot Detection for the IT Admin
I recently attended a presentation during and after which the statement was made that the Zeus bot is/was difficult to detect. What I took away from this was that the detection methodology was specific to network traffic, or in some cases, to banking transactions. Tracking and blocking constantly changing domains and IP addresses, changes in how data is exfiltrated, etc., can be very difficult for even teams of network administrators.

As most of us remember, there's been discussion about toolkits that allow someone, for about $700US, to create their very own Zeus. By its nature, this made the actual files themselves difficult to detect on a host system with AV. Again, detection is said to be difficult.

Remember when we talked about initial infection vectors of malware, and other characteristics? Another characteristic is the persistence mechanism...how malware or an intruder remains persistent on a system across reboots and user logins. These artifacts can often be very useful in identifying malware infections where other methods (i.e., network traffic analysis, AV, etc.) fail.

ZBot was also covered by the MMPC. A total of four variants are listed, but look at what they have in common...they all add data to a Registry value, specifically:

HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\UserInit

The same could be said for Conficker. According to the MMPC, there were two Registry artifacts that remained fairly consistent across the various families of Conficker: creating a new, randomly named value beneath the Run key that pointed to rundll32.exe and the malware parameters, as well as a Windows service set to run under svchost -k netsvcs.

That being the case, how can IT admins use this information? When I was in an FTE position with a financial services company, I wrote a script that would go out to each system in the infrastructure and grab all entries from a specific set of Registry keys. As I scanned the systems, I'd verify entries and remove them from my list. So, in short order, I would start the scan and head to lunch, and when I got back I'd have a nice little half page report on my desktop, giving me a list of systems with entries that weren't in my whitelist.

Admins can do something similar with a tool as simple as reg.exe, or with something more complex written in Perl. So while some staff members scan firewall logs or monitor network traffic, others can target specific artifacts to help identify infected systems.
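
Just as a rough sketch of the sort of script I'm talking about...and this is only a sketch, not a polished tool...here's what a Perl-based sweep of the Winlogon\Userinit value might look like. It assumes you have admin rights on the remote systems, that the Remote Registry service is available, that reg.exe is in the path, and that "hosts.txt" (a made-up name) contains one hostname per line; the whitelist entry shown is just the typical XP default, so adjust it for your environment.

#!/usr/bin/perl
# sweep.pl - rough sketch: query the Winlogon\Userinit value on a list of
# remote systems via reg.exe and report anything not in a known-good list.
# Assumes admin rights on the targets, the Remote Registry service running,
# and a hypothetical hosts.txt file with one hostname per line.
use strict;
use warnings;

my %whitelist = ("C:\\WINDOWS\\system32\\userinit.exe," => 1);  # adjust per environment

open(my $fh, "<", "hosts.txt") or die "Could not open host list: $!\n";
while (my $host = <$fh>) {
    chomp($host);
    next unless ($host);
    my $out = `reg query "\\\\$host\\HKLM\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Winlogon" /v Userinit 2>&1`;
    if ($out =~ /Userinit\s+REG_SZ\s+(.*)/i) {
        my $data = $1;
        $data =~ s/\s+$//;
        print "$host: $data\n" unless (exists $whitelist{$data});
    }
    else {
        print "$host: no response, or value not found\n";
    }
}
close($fh);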

SIFT 2.0
Rob Lee has released SIFT 2.0, an Ubuntu-based VMWare appliance that comes with about 200 tools, including log2timeline, Wireshark, ssdeep/md5deep, Autopsy, PyFlag, etc.

To get your copy, go here, click on the "Forensics Community" tab at the top of the page, and choose Downloads.

If you've taken the SEC 508 course with Rob...or now with Ovie, or Chris...you have probably seen the SIFT workstation in action.

Tuesday, March 23, 2010

Even More Thoughts on Timelines

I realize that I've talked a bit about timeline creation and analysis, and I know that others (Chris, as well as Rob) have covered this subject, as well.

I also realize that I may have something of a different way of going about creating timelines and conducting analysis. Of the approaches I've seen so far, I don't think there's a wrong way; I just think that we approach things a bit differently.

For example, I am not a fan of adding everything to a timeline, at least not as an initial step. IMHO, there's a lot of noise in just the file system metadata...many times, by the time I get notified of an incident, an admin has already logged into the system, installed and run some tools (including anti-virus), maybe even deleted files and accounts, etc. When interviewed about the incident and asked the specific question of "what actions have you performed on the system?", that admin most often says "nothing"...only because they tend to do these things every day and they are therefore trivial. Add to that all the other stuff...automatic updates to Windows or any of the applications, etc...and there's a great deal of data that needs to be culled through in order to find what you're looking for. Adding a lot of raw data to the timeline right up front may mean that you're adding a lot of noise into that timeline, and not a lot of signal.

Goals
Generally, the first step I take is to look at the goals of my exam, and try to figure out which data sources may provide me the best source of data. Sometimes it's not a matter of just automatically parsing file system metadata out of an image; in a recent exam involving "recurring software failures", my initial approach was to parse just the Application Event Log to see if I could determine the date range and relative frequency of such events, in order to get an idea of when those recurring failures would have occurred. In another exam, I needed to determine the first occurrence within the Application Event Log of a particular event ID generated by the installed AV software; to do so, I created a "nano-timeline" by parsing just the Application Event Log, and running the output through the find command to get the event records I was interested in. This provided me with a frame of reference for most of the rest of my follow-on analysis.

Selecting Data Sources
Again, I often select data sources to include in a timeline by closely examining the goals of my analysis. However, I am also aware that in many instances, the initial indicator of an incident is often only the latest indicator, albeit the first to be recognized as such. That being the case, when I start to create a timeline for analysis, I generally start off by creating a file of events from the file system metadata and the available Event Log records, and by running the .evt files through evtrpt.pl to get an idea of what I should expect to see from the Event Logs. I also run the auditpol.pl RegRipper plugin against the Security hive in order to see (a) what events are being logged, and (b) when the contents of that key were last modified. If this date is pertinent to the exam, I'll be sure to include an appropriate event in my events file.

Once my initial timeline has been established and analysis begins, I can go back to my image and select the appropriate data sources to add to the timeline. For example, during investigations involving SQL injection, the most useful data source is very likely going to be the web server logs...in many cases, these may provide an almost "doskey /history"-like timeline of commands. However, adding all of the web server logs to the timeline means that I'm going to end up inundating my timeline with a lot of normal and expected activity...if any of the requested pages contain images, then there will be a lot of additional, albeit uninteresting, information in the timeline. As such, I would narrow down the web log entries to those of interest, beginning with an iterative analysis of the web logs, and add the resulting events to my timeline.

That's what I ended up doing, and like I said, the results were almost like running "doskey /history" on a command prompt, except I also had the time at which the commands were entered as well as the IP address from which they originated. Having them in the timeline let me line up the incoming commands with their resulting artifacts quite nicely.

Registry Data
The same holds true with Registry data. Yes, the Registry can be a veritable gold mine of information (and intelligence) that is relevant to your examination, but like a gold mine, that data must be dug out and refined. While there is a great deal of interesting and relevant data in the system hives (SAM, Security, Software, and System), the real wealth of data will often come from the user hives, particularly for examinations that involve some sort of user activity.

Chris and Rob advocate using regtime.pl to add Registry data to the timeline, and that's fine. However, it's not something I do. IMHO, adding Registry data to a timeline by listing each key by its LastWrite time is way too much noise and not nearly enough signal. Again, that's just my opinion, and doesn't mean that either one of us is doing anything wrong. Using tools like RegRipper, MiTec's Registry File Viewer, and regslack, I'm able to go into a hive file and get the data I'm interested in. For examinations involving user activity, I may be most interested in the contents of the UserAssist\Count keys (log2timeline extracts this data, as well), but the really valuable information from these keys isn't the key LastWrite times; rather, it's the time stamps embedded in the binary data within the subkey values.

If you're parsing any of the MRU keys, these too can vary with respect to where the really valuable data resides. In the RecentDocs key, for example, the values (as well as those within the RecentDocs subkeys) are maintained in an MRU list; therefore, the LastWrite time of the RecentDocs key is of limited value in and of itself. The LastWrite time of the RecentDocs key has context when you determine what action caused the key to be modified; was a new subkey created, or was another file opened, or was a previously-opened file opened again, modifying the MRUListEx value?
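
To illustrate what I mean, here's a minimal sketch (using the Parse::Win32Registry module that RegRipper is built on) that dumps the RecentDocs key LastWrite time and the MRUListEx ordering from an NTUSER.DAT hive exported from an image. The MRUListEx value is simply a series of 4-byte little-endian indexes terminated by 0xFFFFFFFF, with the most recently used item listed first; the hive file name is just for illustration.

#!/usr/bin/perl
# recentdocs_order.pl - minimal sketch: print the RecentDocs key LastWrite
# time and the MRUListEx ordering from an exported NTUSER.DAT hive.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift || "ntuser.dat";   # exported hive file, name for illustration only
my $reg  = Parse::Win32Registry->new($hive) or die "Not a valid hive file: $hive\n";
my $root = $reg->get_root_key();

my $path = "Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\RecentDocs";
if (my $key = $root->get_subkey($path)) {
    print "RecentDocs LastWrite: ".gmtime($key->get_timestamp())." UTC\n";
    if (my $mru = $key->get_value("MRUListEx")) {
        # MRUListEx is a list of 4-byte little-endian indexes, 0xFFFFFFFF-terminated
        my @order = unpack("V*", $mru->get_data());
        pop(@order) if (@order && $order[-1] == 0xFFFFFFFF);
        print "MRU order (most recent first): ".join(",", @order)."\n";
    }
}
else {
    print "RecentDocs key not found.\n";
}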

Files opened via Adobe Reader are maintained in an MRU list, as well, but with a vastly different format from what's used by Microsoft; the most recently opened document is in the AVGeneral\cRecentFiles\c1 subkey, but the name of the actual file is embedded in a binary value named tDIText. On top of that, when a new file is opened, it becomes the value in the c1 subkey, so all of the other subkeys that refer to opened documents also get modified (what was c1 becomes c2, c2 becomes c3, etc.), and the key LastWrite times are updated accordingly, all to the time that the most recent file was opened.

Browser Stuff
Speaking of user activity, let's not forget browser cache and history files, as well as Favorites and Bookmarks lists. It's very telling to see a timeline based on file system metadata with a considerable number of files being created in browser cache directories, and then to add data from the browser history files and get that contextual information regarding where those new files came from. If a system was compromised by a browser drive-by, you may be able to discover the web site that served as the initial infection vector.

Identifying Other Data Sources
There may be times when you'll want to identify other data sources that may be of use to your examination. I do tend to check the contents of the Task Scheduler service log file (SchedLgU.txt) for indications of scheduled tasks being run, particularly during intrusion exams; however, this log file can also be of use if you're trying to determine whether the system was up and running during a specific timeframe. This may be much more pertinent to laptops and desktop systems than to servers, which may not be rebooted often, but if you look in the file, you'll see messages such as:

"Task Scheduler Service"
Started at 3/21/2010 6:18:19 AM
"Task Scheduler Service"
Exited at 3/21/2010 9:10:32 PM


In this case, the system was booted around 6:18am EST on 21 March, and shut down around 9:11pm that same day. These times are listed relative to the local system time, and may need to be adjusted based on the timezone, but the log does provide information regarding when the system was operating, particularly when combined with Registry data regarding the start type of the service, and can be valuable in the absence of, or when combined with, Event Log data.
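
Pulling those entries out of the file is trivial; here's a quick-and-dirty sketch that does nothing more than print the start/exit messages from an exported SchedLgU.txt. One thing to keep in mind (and an assumption in the code) is that on XP the file is typically Unicode (UTF-16LE), so you need to handle the encoding; the times, as noted above, are local system time.

#!/usr/bin/perl
# schedlgu.pl - quick-and-dirty sketch: print the Task Scheduler service
# start/exit times from an exported SchedLgU.txt file. Assumes the file is
# UTF-16LE, as it typically is on XP; times are local system time.
use strict;
use warnings;

my $file = shift || "SchedLgU.txt";
open(my $fh, "<:encoding(UTF-16LE)", $file) or die "Cannot open $file: $!\n";
while (my $line = <$fh>) {
    $line =~ s/[\r\n]+$//;
    if ($line =~ /(Started|Exited) at (.*)$/) {
        print "$1: $2\n";
    }
}
close($fh);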

There may be other data sources available, but not all of them may be of use, depending upon the goals of your examination. For example, some AV applications record scan and detection information in the Event Log as well as in their own log files. If the log file provides no indication that any malware had been identified, is it of value to include all scans (with "no threats found") in the timeline, or would it suffice to note that fact in your case notes and the report?

Summary
Adding all available data sources to a timeline can quickly make that timeline very unwieldy and difficult to manage and analyze. Through education, training, and practice, analysts can begin to understand what data and data sources would be of primary interest to an examination. Again, this all goes back to your examination goals...understand those, and the rest comes together quite nicely.

Finally, I'm not saying that it's wrong to incorporate all available data sources into a timeline...not at all. Personally, I like having the flexibility to create mini-, micro- or nano-timelines that show specific details, and then being able to go back and view what I learned in the context of the overall timeline.

Monday, March 22, 2010

Links

File Initialization
Eoghan had an excellent post on the issues/pitfalls of file initialization, particularly on Windows systems. After reading through this post, I can think of a number of exams I've done in the past where I wish I could have linked to it in the report. For instance, while performing data breach exams, I've found sensitive data "in" Registry hive files, and a closer look showed me that the search hits were actually in either the file slack or the uninitialized space within the file. During another exam, I was looking for indications of how an intruder ended up being able to log into a system via RDP. I ran my timeline tools across the Event Log (.evt) files extracted from the image, and found that I had RDP logins (event ID 528, type 10) in my timeline that fell outside of the date range (via evtrpt.pl) of the Security Event Log. Taking a closer look, I found that while the Event Logs had been cleared, and one of the .evt file headers (and Event Viewer) reported that there were NO event records in the file, the uninitialized portion of the file was composed of data from previously used sectors...which included entire event records!
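
For those interested in what "taking a closer look" amounts to, the approach is pretty simple...event records in the .evt format carry a "LfLe" signature four bytes in, so you can scan the raw file for that signature rather than trusting the header or walking the records from the start of the file. Here's a rough sketch (the file header itself carries the same signature, so expect one extra hit); a real parser would go on to validate and unpack each candidate record.

#!/usr/bin/perl
# evtscan.pl - rough sketch: scan a raw .evt file for the "LfLe" record
# signature instead of trusting the header. The DWORD immediately preceding
# the signature is the record length; note that the file header also carries
# the signature, so it will show up as one of the hits.
use strict;
use warnings;

my $file = shift || die "Usage: evtscan.pl <evt file>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $data = do { local $/; <$fh> };
close($fh);

my $count = 0;
my $pos   = 0;
while (($pos = index($data, "LfLe", $pos)) > -1) {
    if ($pos >= 4) {
        my $len = unpack("V", substr($data, $pos - 4, 4));
        printf("Candidate record at offset 0x%x, length %d bytes\n", $pos - 4, $len);
        $count++;
    }
    $pos += 4;
}
print "Total hits (including the file header): $count\n";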

More than anything else, this concept of uninitialized space really reinforces to me how important processes and procedures are, and how knowledgeable analysts really need to be. After all, say you have a list of keywords you're searching for, or a regex you're grep'ing for...when you get a hit, is it really in the file?

Unexpected Events
I ran across a link to Lenny Zeltser's site recently that pointed to a presentation entitled, How To Respond To An Unexpected Security Event. I'll have to put my thoughts on Lenny's presentation in another post, but for the most part, I think that the presentation has some very valid points that may often go unrecognized. More about that later, though.

Regtime
Following a recent SANS blog post (Digital Forensic SIFTing: SUPER Timeline Analysis and Creation), I received a couple of requests to release regtime.pl. This functionality was part of RegRipper before regtime.pl was included in SIFT. You can see this if you use rip.pl to list the plugins in .csv format, but instead of sending them to a file, pipe the output through the "find" command:

C:\Perl\forensics\rr>rip.pl -l -c | find "regtime" /i
regtime,20080324,All,Dumps entire hive - all keys sorted by LastWrite time


If you need the output to be in a different format, well, it's Perl and open source!

Timeline Creation
Chris has posted part 2 of his Timeline Analysis posts (part 1 is here), this time including Registry data by running regtime.pl from the SANS SIFT Toolkit v2.0 against the Registry hives. It's really good to see more folks bringing this topic up and mentioning it in blogs, etc. I hesitate to say "discussion" because I'm not really seeing much discussion of this topic in the public arena, although I have heard that there's been some discussion offline.

Looking at this, and based on some requests I've received recently, if you're looking for a copy of regtime.pl, you'll need to contact Rob about that. I did provide a copy of regtime.pl for him to use, but based on the output I saw in Chris's post, there appear to have been modifications to the script I provided. So if you're interested in a copy of that script, reach out to Rob.

Also, one other thought...I'm not a big proponent for adding information from the Registry in this manner. Now, I'm not saying it's wrong...I'm just saying that this isn't how I would go about doing it. The reason for that is that there's just way too much noise being added to the timeline for my taste...that's too much extra stuff that I need to wade through. This is why I'll use RegRipper and the Timeline Entry GUI (described here) to enter specific entries into the timeline. By "specific entries", I mean those from the UserAssist and RecentDocs keys, as well as any MRU list keys. My rationale for this is that I don't want to see all of the Registry key LastWrite times because (a) too many of the entries in the timeline will simply be irrelevant and unnecessary, and (b) too many of the entries will be without any context whatsoever.

More often, I find that the really valuable information is what occurred to cause the change to the key's LastWrite time, and in many cases, that has to do with a Registry value, such as one being added or removed. In other cases, the truly valuable data can come from the binary contents of a Registry value, such as those within the UserAssist\Count subkeys.

Again, I'm not saying that there's anything wrong with adding Registry data to a timeline in this manner; rather, I'm just sharing my own experiences, those being that more valuable data can be found in a more tactical approach to creating a timeline.

AfterTime
I received an email recently from the folks at NFILabs, out of the Netherlands, regarding a tool called AfterTime, which is a Java-based timeline tool based on Snorkel. This looks like a very interesting tool, with evaluation versions running on both Windows and Linux. Based on some of the screenshots from the NFILabs site regarding AfterTime, it looks like each event has a number of values that relate to that event and can provide the examiner with some context to that event.

AfterTime also uses color-coded graphics to show the numbers and types of events based on the date. I've said in the past that one of the problems with this approach (i.e., illustrating abundance) is that most times, an intrusion or malware incident will be the least frequency of occurrence (shoutz to Pete Silberman of Mandiant) on a system. As such, I'm not entirely sure that a graphical approach to analysis is necessarily the way to go, but this is definitely something worth looking at.

Wednesday, March 17, 2010

Timeline Creation and Analysis

I haven't really talked about timelines in a while, in part because I've been creating and analyzing them as part of every engagement I've worked. I do this because in most...well, in all cases...the analysis I need to do involves something that happened at a certain time. Sometimes it's a matter of determining what the event or events were, other times it's a matter of determining when the event(s) happened. The fact is that the analysis involves something that happened at some point in time...and that's the perfect time to create a timeline.

With respect to creating timelines, I'm not the only one using them. Chris posted recently on using the TSK tool fls to create a bodyfile from a live system, and Rob posted on creating timelines from Windows Volume Shadow Copies. Using Volume Shadow Copies to create timelines is a great way to get a view into the state of the system at some point in the past...something that can be extremely valuable in an investigation.

These are great places to start, but consider all that you could do if you took advantage of other data on the system. In order to get a more granular view into what happened when on a system, timelines need to incorporate other data sources. Incorporating Event Log (.evt, .evtx) records may show you who was logged on to the system, and how (i.e., locally, via RDP, etc.). Now, auditing isn't always enabled, or enabled enough to provide indications of what you're looking for, but many times, there's some information there that may be helpful.

Including user web browser activity in a timeline has been extremely useful in tracking down things like browser drive-bys, etc. For example, by including web browsing activity, you may see the site that the user visited just prior to a DLL being created on the system and a BHO being added to the Registry. Also, don't forget to check the user's Bookmarks or Favorites...there are timestamps in those files, as well.

When I was working at IBM and conducting data breach investigations, many times we'd see SQL injection being used in some manner. Parsing all of the web server logs for the necessary data required an iterative approach (i.e., search for SQL injection, collect IP addresses, re-run searches for the IP addresses, etc.), but adding those log entries to the timeline can provide a great deal of context to your analysis. Say, for example, that the MS SQL Server database is on the same system as the IIS web server...any commands run via SQLi would leave artifacts on that system, just as creating/modifying files would. If the database is on another system entirely, and you're using the five-field TLN format, you can easily correlate data from both systems in the same timeline (taking clock skew into account, of course). This works equally well for exams involving ASP or PHP web shells, as you can see where the command was sent (in the web server logs), as well as the artifacts (within the file system, other artifacts), all right there together.
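
As a rough illustration of what that looks like in practice, here's a sketch that converts selected IIS W3C log entries into five-field TLN events (time|source|host|user|description). It reads the field layout from the log's #Fields: line, assumes the date/time values are UTC (the IIS default), and only keeps requests from IP addresses already identified through the iterative review; the IP address and server name in the script are made up.

#!/usr/bin/perl
# weblog2tln.pl - rough sketch: convert selected IIS W3C log entries into
# five-field TLN events (time|source|host|user|description). Assumes the
# log's date/time values are UTC (the IIS default); the suspect IP and the
# server name below are hypothetical placeholders.
use strict;
use warnings;
use Time::Local;

my %suspect_ips = ("192.168.1.100" => 1);   # IPs identified via the iterative review
my $server      = "WEBSRV01";               # system name for the TLN host field

my @fields = ();
while (my $line = <>) {
    chomp($line);
    if ($line =~ /^#Fields:\s+(.*)/) {
        @fields = split(/\s+/, $1);
        next;
    }
    next if ($line =~ /^#/ || !@fields);
    my %rec;
    @rec{@fields} = split(/\s+/, $line);
    next unless (defined $rec{"c-ip"} && exists $suspect_ips{$rec{"c-ip"}});
    my ($yr, $mon, $day) = split(/-/, $rec{"date"});
    my ($hr, $min, $sec) = split(/:/, $rec{"time"});
    my $epoch = timegm($sec, $min, $hr, $day, $mon - 1, $yr);
    my $descr = $rec{"c-ip"}." - ".$rec{"cs-method"}." ".$rec{"cs-uri-stem"}."?".($rec{"cs-uri-query"} || "-");
    print $epoch."|IIS|".$server."||".$descr."\n";
}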

Consider all of the other sources of data on a system, such as other application (i.e., AV, etc.) logs. And don't get me started on the Registry...well, okay, there you go. There're also Task Scheduler Logs, Prefetch files, as well as metadata from other files (PDF, Office documents, etc.) that can be added to a timeline as necessary. Depending on the system, and what you're looking for, there can be quite a lot of data.

But what does this work get me? Well, a couple of things, actually. For one, there's context. Say you start with the file system metadata in your timeline, and you kind of have a date that you're interested in, when you think the incident may have happened. So, you add the contents of the Event Logs, and you see that the user "Kanye" logged in...event ID 528, type 10. Hey, wait a sec...since when does "Kanye" log in via RDP? Why would he, if he's in the office, sitting at his desk? So then we add the user Registry hive information, and we see "cmd/1" in the RunMRU key (the most recent entry) and shortly thereafter we notice that "Kany3" logged in via RDP. We can get the user information from the SAM hive, as well as any additional information from the newly-created user profile. So as we add data, we begin to also add context with respect to activity we're seeing on the system.

We can also use the timeline to provide an increasing or higher level of overall confidence in the data itself. Let's say that we start with the file system metadata...well, we know that this may not be entirely accurate, as file system MAC times can be easily manipulated. These times, as presented by most tools, are usually derived from the $STANDARD_INFORMATION attribute within the MFT. However, what if I add the creation date of the file from the $FILE_NAME attribute, or simply compare that value to the creation date from the $STANDARD_INFORMATION attribute? Okay, maybe now I've raised my relative confidence level with respect to the data. So now I add other sources of data, and rather than just seeing a file creation or modification event, I see other activity (within close temporal proximity) that leads up to that event.

Let's say that I start off with a timeline based just on file system metadata (Windows XP), and I see a file creation event for a Prefetch file. The Prefetch file is for an application accessed through the Windows shell, so I would want to perhaps see if the Event Log contained any login information so I could determine which user was logged in, as well as when they'd logged in; however, I find out that auditing of login events is not enabled. Okay, I check the ProfileList key in the Software hive against the user profile directories, and I find out that all users who've logged into the system are local users...so I can go into the SAM hive and get things like Last Login dates. I then parse the UserAssist key for each user, and I find that just prior to the Prefetch file I'm interested in being created, the user "Kanye" navigated through the Start button to launch that application. Now, the file system time may be easily changed, but I now have less mutable data (i.e., a timestamp embedded in a binary Registry value) that corroborates the file system time, which increases my relative level of confidence with respect to the data.
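
For reference, here's a minimal sketch of what pulling that UserAssist data looks like, again using Parse::Win32Registry against an NTUSER.DAT hive exported from the image. The value names are ROT-13 encoded, and on XP/2003 the 16-byte value data holds a run counter at offset 4 (which starts counting at 5) and a 64-bit FILETIME at offset 8; later versions of Windows use a different data format, so treat this as XP-only.

#!/usr/bin/perl
# userassist_sketch.pl - minimal sketch (XP/2003 only): decode the ROT-13
# UserAssist value names and pull the run count (offset 4, starts at 5) and
# FILETIME (offset 8) from the 16-byte value data in an exported NTUSER.DAT.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift || "ntuser.dat";
my $reg  = Parse::Win32Registry->new($hive) or die "Not a valid hive file: $hive\n";
my $root = $reg->get_root_key();
my $ua_path = "Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\UserAssist";

if (my $ua = $root->get_subkey($ua_path)) {
    foreach my $guid_key ($ua->get_list_of_subkeys()) {
        my $count_key = $guid_key->get_subkey("Count") or next;
        foreach my $val ($count_key->get_list_of_values()) {
            (my $name = $val->get_name()) =~ tr/A-Za-z/N-ZA-Mn-za-m/;   # ROT-13 decode
            my $data = $val->get_data();
            next unless (defined $data && length($data) == 16);
            my ($runs, $lo, $hi) = unpack("x4 V V V", $data);
            next unless ($lo || $hi);
            # FILETIME (100ns intervals since 1601) to Unix epoch
            my $epoch = int((($hi * 4294967296) + $lo) / 10000000) - 11644473600;
            $runs -= 5 if ($runs >= 5);   # XP counter starts at 5
            print gmtime($epoch)." UTC  runs=".$runs."  ".$name."\n";
        }
    }
}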

Now, jump ahead a couple of days in time...other things had gone on on the system prior to acquisition, and this time, I'm interested in the creation AND modification times of this Prefetch file. It's been a couple of days, and what I find at this point is that the UserAssist information tells me that the application referred to by the Prefetch file has actually been run several times between the creation and modification dates; now, my UserAssist information corresponds to the modification time of the file. So, now I add metadata from the Prefetch file, and I have data that supports the modification time (the last time the application was run, the timestamp for which is embedded in the Prefetch file, corresponds to when the Prefetch file was last modified), as well as the number of times the user launched the application.
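
If you want to see how easy that Prefetch metadata is to get to, here's a minimal sketch for XP/2003-format .pf files; for that format, the last run time is a FILETIME at offset 0x78 and the run count is a DWORD at offset 0x90 (Vista and later use different offsets, so don't apply this blindly).

#!/usr/bin/perl
# pref_sketch.pl - minimal sketch (XP/2003-format .pf files only): pull the
# last run time (FILETIME at offset 0x78) and run count (DWORD at offset
# 0x90) from a Prefetch file exported from an image.
use strict;
use warnings;

my $file = shift || die "Usage: pref_sketch.pl <.pf file>\n";
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);

my $buf;
seek($fh, 0x78, 0);
read($fh, $buf, 8);
my ($lo, $hi) = unpack("V V", $buf);
my $epoch = int((($hi * 4294967296) + $lo) / 10000000) - 11644473600;

seek($fh, 0x90, 0);
read($fh, $buf, 4);
my $runs = unpack("V", $buf);
close($fh);

print "Last run : ".gmtime($epoch)." UTC\n";
print "Run count: ".$runs."\n";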

Now, if the application of interest is something like MSWord, I might also be interested in things such as any documents recently accessed, particularly via common dialogs. The point is that most analysts understand that file system metadata may be easily modified, or perhaps misinterpreted; by adding additional information to the timeline, I not only add context, but by adding data sources that are less likely to be modified (timestamps embedded in files as metadata, Registry key LastWrite times, etc.), I can raise my relative level of confidence in the data itself.

One final point...incident responders increasingly face larger and larger data sets, requiring some sort of triage to identify, reduce, or simply prioritize the scope of an engagement. As such, having access to extra eyes and hands...quickly...can be extremely valuable. So consider this...which is faster? Imaging 50 systems and sitting down and going through them, or collecting specific data (file system metadata and selected files) and providing it to someone else to analyze, while on-site response activities continue? The off-site analyst gets the data, processes it and begins analysis, narrowing the scope...now we're down from 50 systems to 10...and most importantly, we're already starting to get answers.

Let's say that I have a system with a 120GB system partition, of which 50GB is used. Which is faster to collect...the overall image, or file system metadata? Which is smaller? Which can be more easily provided to someone off-site? Let's say that the file created when collecting file system metadata is 11MB. Okay...add Registry data, Event Logs, and maybe some specific files, and I'm up to...what...13MB. This is even smaller if I zip it...let's say 9MB. So now, while the next 120GB system is being acquired, I'm providing the data to an off-site analyst, and she's able to follow a process for creating a timeline, and begin analyzing the data. Uploading a 9MB file is much faster than shipping a 120GB image via FedEx.

As a responder, I've had customers in California. Call me after happy hour on the East Coast, and the first flight out will be sometime in the next 12 hrs. It's usually 4 1/2 hrs to the San Jose area, but 6 hrs to LA or Seattle, WA. Then depending on where the customer is actually located, it may be another 2 hrs for me to get to the gate, get a rental car and arrive on-site. However, if there are trained first responders on staff, I can begin analyzing data (and requesting additional data) within, say, 2 hours of the initial call.

So another way cool thing is that this can also be used in data breach cases. How's that? Well, if you're shipping compressed file system metadata to someone (and you've encrypted it), you're not shipping file contents...so you're not exposing sensitive data. Providing the necessary information may not answer the question, but it can definitely narrow down the answer and help to identify and reduce the overall scope of an incident.

Monday, March 15, 2010

Tidbits

Windows 7 XPMode
I was doing some Windows 7 testing not long ago, during which I installed a couple of applications in XPMode. The first thing I found was that you actually have to open the XP VPC virtual machine and install the application there; once you're done, the application should appear on your Windows 7 desktop.

What I found in reality is that not all applications installed in XPMode appear to Windows 7. I installed Skype and Google Chrome, and Chrome did not and would not appear on the Windows 7 desktop.

Now, the next step is to examine the Registry...for both the Windows 7 and XPMode sides...to see where artifacts reside.

When it comes to artifacts, there are other issues to consider, as well...particularly when the system you're examining may not have the hardware necessary to run XPMode, so the user may opt for something a bit easier, such as VirtualBox's Seamless Mode (thanks to Claus and the MakeUseOf blog for that link!).

Skype IMBot
Speaking of Skype, the guys at CERT.at have an excellent technical analysis and write-up on the Skype IMBot. Some of what this 'bot does is pretty interesting, in its simplicity...for example, disabling AV through 'net stop' commands.

I thought that the Registry persistence mechanism was pretty interesting, in that it used the ubiquitous Run key and the Image File Execution Options key. As much as I've read about the IFEO key, and used it in demos to show folks how the whole thing worked, I've only seen it used once in the wild.

The only thing I'd say that I really wasn't 100% on-board with about the report overall was on pg 7 where the authors refer to "very simple rootkit behavior"...hiding behavior, yes, but rootkit? Really?

ZBot
I found an update about ZBot over at the MMPC site. I'd actually seen variant #4 before.

Another interesting thing about this malware is something I'd noticed in the past with other malware, particularly Conficker. Even though there are multiple variants, and as each new variant comes out, everyone...victims, IR teams, and AV teams...is in react mode, there's usually something about the malware that remains consistent across the entire family.

In the case of ZBot, one artifact or characteristic that remains consistent across the family is the Registry persistence mechanism; that is, this one writes to the UserInit value. This can be extremely important in helping IT staff and IR teams locate other infected systems on the network, something that is very often the bane of IR: how to correctly and accurately scope an issue. I mean, which would you rather do...image all 500 systems, or locate the infected ones?

From the MMPC write-up, there appears to be another Registry value (i.e., the UID value mentioned in the write-up) that IT staff can use to identify potentially infected systems.

The reason I mention this is that responders can use this information to look for infected systems across their domain, using reg.exe in a simple batch file. Further, checks of these Registry keys can be added to tools like RegRipper, so that a forensic analyst can quickly...very quickly...check for the possibility of such an infection, either during analysis or even as part of the data in-processing. With respect to RegRipper, there are already plugins available that pull the UserInit value, and it took me about 5 minutes (quite literally) to write one for the UID value.
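
For the forensic analyst side of that, here's a standalone sketch (not the RegRipper plugin itself) that uses the same Parse::Win32Registry module RegRipper is built on to pull the Userinit value...along with the Winlogon key's LastWrite time, for timeline purposes...from a Software hive exported from an image.

#!/usr/bin/perl
# userinit_check.pl - standalone sketch (not the RegRipper plugin): pull the
# Winlogon\Userinit value and the key LastWrite time from a Software hive
# exported from an image.
use strict;
use warnings;
use Parse::Win32Registry;

my $hive = shift || "Software";
my $reg  = Parse::Win32Registry->new($hive) or die "Not a valid hive file: $hive\n";
my $root = $reg->get_root_key();

if (my $key = $root->get_subkey("Microsoft\\Windows NT\\CurrentVersion\\Winlogon")) {
    print "Winlogon LastWrite: ".gmtime($key->get_timestamp())." UTC\n";
    if (my $val = $key->get_value("Userinit")) {
        print "Userinit = ".$val->get_data()."\n";
    }
    else {
        print "Userinit value not found.\n";
    }
}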

Millennials
I was talking with a colleague recently about how pervasive technology is today for the younger set. Kids today have access to so much that many of us didn't have when we were growing up...like computers and cell phones. Some of us may use one email application, whereas our kids go through dozens, and sometimes use more than one email, IM, and/or social networking application at a time.

I've also seen where pictures of people have been posted to social networking sites without their knowledge. It seems that with the pervasiveness of technology comes an immediate need for gratification and a complete dismissal of the privacy and rights of others. While some people won't like it when it happens to them, they have no trouble taking pictures of others and posting them to social networking sites without their knowledge or permission. Some of the same people will freely browse social networking sites, making fun of what others have posted...but when it's done to them, suddenly it's creepin' and you have no right.

Combine these attitudes with the massive pervasiveness of technology, and you can see that there's a pretty significant security risk.

From a forensics perspective, I've seen people on two social networking sites, while simultaneously using three IM clients (none of which is AIM) on their computer (and another one on an iTouch), all while texting others from their cell phone. Needless to say, trying to answer a simple question like, "...was any data exfiltrated?" is going to be a bit of a challenge.

Accenture has released research involving these millennials, those younger folks for whom technology is so pervasive, and in many cases, for whom "privacy" means something totally different from what it means to you, me, and even our parents. In many ways, I think that this is something that lots of us need to read. Not too long ago, I noted that when I asked about examining a system and looking for browser activity, a number of folks responded that they started by looking at the TypedURLs key...and I then asked, what if IE isn't the default browser? Let's take that a step further...what if the computer isn't the default communications device? Many times LE will try to tie an IP address from log files to a specific device used by a particular person...but now the question is, which device? Not only will someone have a laptop and a cell phone, but now what happens when they tether the devices?

The next time you see some younger folks sitting around in a group, all of them texting other people, or you see someone using a laptop and a cell phone, think about the challenges inherent to answering the most basic questions.

USN Journal
Through a post and some links in the Win4n6 Yahoo Group, I came across some interesting links regarding the NTFS USN Journal file (thanks, Jimmy!). Jimmy pointed to Lance's EnScript for parsing the NTFS transaction log; Lance's page points to the MS USN_RECORD structure definition. Thanks to everyone who contributed to the thread.

An important point about the USN Journal...the change journal is not enabled by default on XP, but it is on Vista.

So, why's this important? Well, a lot of times you may find something of value by parsing files specifically associated with the file system, such as the MFT (using analyzeMFT.py, for example).

Another example is that I've found valuable bits in the $LogFile file, both during practice and while working real exams. I simply export the file and run it through strings or BinText, and in a couple of cases I've found information from files that didn't appear in the 'active' file system.

For more detailed (re: regarding structures, etc.) information about the NTFS file system, check out NTFS.com and the NTFS documentation on Sourceforge.

Resources
MFT Analysis/Data Structures
Missing MFT Entry

LNK Files
Speaking of files...Harry Parsonage has an interesting post on Windows shortcut/LNK files, including some information on using the available timestamps (there are 9 of them) to make heads or tails of what was happening on the system. Remember that many times, LNK files are created through some action taken by a user, so this analysis may help you build a better picture of user activity.

Monday, March 01, 2010

Thoughts on posing questions, and sharing

I ran across a question on a list recently that I responded to when I saw it, but as time has passed, I've reconsidered my response somewhat. And whatnot.

The question I saw had to do with RegRipper, specifically my thoughts on meeting the needs of the community and creating new plugins. Basically, all I've ever asked for in that regard is a concise description of the need or issue, and a sample hive file. The person asking the question wanted to know if I seriously expected folks to provide hive files from live cases. My initial reaction was no, there are other ways to provide the necessary data, such as setting up a test environment, replicating the issue, and sending me that hive file. However, I began to reconsider that response...if someone doesn't really know the difference between a Registry key and a value, and they have a question, how would they go about crafting the question? Once they do that, how would they go about discerning the responses they received, and figuring out which applied to what they were working on?

Seriously, there are a lot of things out there that require specific use of language, and specificity of language can be somewhat lacking in our community.

Taking that a step further, one of the problems I've seen for a number of years is that some questions that need to be asked simply don't get asked, because people in the community don't want to share information; apparently, "sharing information" has a number of different connotations. Some folks don't want it publicly known that they don't know something...even if asking the question means that they'll end up knowing the answer. I've seen this before...I didn't want to ask the question, because I didn't want to look dumb. To that, my response is usually along the lines of, so you don't ask the question, and we overcharge the customer for an inferior deliverable, our billing rate drops, AND you don't know the answer for the next time you need it. Really...which situation really makes you look dumb? Another one I see is that some folks don't ask questions publicly because they just don't want others to know that they had to ask...to which I usually suggest that if they had asked the question, they'd then know the answer, obviating the issue all together.

Others apparently don't ask questions because they're afraid that they'll have to give up sensitive information...information about a case that they're working on, etc. I understand that folks working CP cases don't want that stuff out...and to be honest, I don't either. I do want to help...and sometimes, due to the "cop-nerd language barrier", the best and fastest way to help is to get the actual Registry hive or Event Log file. And guess what? Hive files don't (usually) contain graphics.

Like many folks, my desire to help comes from just that...a desire to help. If my helping makes it easier for an LE to be prepared to address the Trojan Defense, or better yet, to do so in a manner that gets a plea agreement, then that's good. I do NOT want to see the images, and I and others can help without seeing them.

Another issue is that some folks don't ask questions because they don't know enough about the situation to ask the question. This can be a particular issue in digital forensics, because there are certain things that really make a difference in how the respondent answers...such as the file system, or even the version of the operating system. NTFS is different from FAT, which is different from ext2/3, and Windows XP has a number of differences from Windows 2000, as well as from Vista.

Here's an example...some folks will ask questions such as, "how do I tell when a file was first created on a system?", without really realizing that the system in question, and perhaps even the document type, can greatly affect the answer. So sometimes the initial question is asked, but there may not be any response to (repeated) requests for clarification to the original question.

Does the version of Windows really matter, generally speaking? When you're dealing with any kind of IR or forensic analysis, the answer is most often going to be "yes".

So the big question is, if you have a question, do you want an answer to it? Are you willing to provide the necessary information such that someone can provide a succinct response? I know some folks who will not even attempt to answer a question that requires an encyclopedic answer.

Before we go on, let me say that I completely understand and agree that we can't know everything. No one of us can know it all...that's where there's strength in a community of sharing. There's no way that you're going to know everything you need to know for every exam...there are going to be things that we don't remember (maybe from a training course a couple of years ago, or something we read once...), and there are going to be things that we just don't know.

So what can we, as a community, do? Well, one way to look at it is that the question I have...well, someone else in the room or on the board may have the same question; they may not know it yet. So if that question gets asked, then others will be able to see the answers and then ask the next question, expanding that information. The point is that no one of us is as smart as all of us together.

Find someone you can trust, someone you're willing to share information with. If you need to, establish an NDA. Have community meetings in local areas. If you don't feel comfortable sharing with some folks because you don't know them...get to know them.

The other option is that you learn to do it yourself...and that's not always going to work. You may spend 8 months examining MacOSX systems, and suddenly have to examine a Windows 7 system. What're you going to do then? Sure, spending all weekend gettin' giggy wit' Google will likely net (no pun intended) you something, but at what point do you reach overload?

Over the years, I've met a number of folks with skills and abilities for which I have a great deal of respect, and some of those folks I've reached out to for assistance when I've needed it. Conversely, I've done my best to respond to those folks who've reached out to me with questions regarding areas I'm specifically interested in.

Anyway, I'll bring this rambling to a close...

Addendum: Sometimes a really good place to start with questions is to seek answers at the ForensicsWiki. This is also a good place to post the answers once you get them.