Sunday, July 09, 2017

Tools

Today's post is a mish-mash of tools and techniques that I've seen or used recently...

Hindsight is a great free, open source tool for parsing a user's Chrome browser data.  I've used it a number of times to great effect; in one instance, I was able to show that a system became infected with ransomware when the user used Chrome to access their AOL email, where they downloaded and launched the malicious attachment.  The tool is very easy to use, and all you need to do is either point it at the user's "Default" folder (within the Chrome path), or extract the sqlite3 files and run it locally against the data.

Joe Gray over at AlienVault published an interesting article on data carving; this has always been an interesting DFIR topic, ranging from file carving to carving for individual records.  In the wake of the recent NotPetya attacks, Willi's EVTXtract might come in handy for some.  Another tool that I've run against decompressed hibernation files, pagefiles, and unallocated space is bulk_extractor, specifically when looking for indications of network communications.  My point is that if you're going to go carving, sometimes it's a good idea to first think about what it is you're carving for, and then seek a suitable approach to performing the carving.

Not "new" by any stretch, but Yogesh's research into Windows8/8.1 search history is still very relevant for a number of reasons.  For one, it illustrates the continued use of the LNK file format (which is actually pretty pervasive throughout the Windows platform...), telling us that not all of the stuff we learned from previous versions of Windows needs to be thrown out the door.  Second, Yogesh's finding that the retention mechanism for search terms changed between Windows 8 and 8.1 illustrates how quickly things can change on Windows systems.  I mean, look at what the Volatility folks have had to deal with!  ;-)

I ran across the Network Usage View tool from NIRSoft recently...that's a pretty interesting capability.  The write-up for the tool indicates that it gets its data by reading the SRUDB.DAT database on Win8 and Win10 systems.  This is potentially a pretty valuable data source for DFIR work and analysis. In case you haven't seen it, Yogesh has a pretty fascinating presentation available on SRUM Forensics that is worth checking out.

I saw on Twitter recently that there's a Python-based tool available now for diff'ing Registry hive files.  I completely agree with those who've commented that this is some great functionality to have available, and has a great deal of potential...but this functionality has been around for quite some time already, via other sources.  For example, James McFarlane's Parse::Win32Registry Perl module distribution includes a script that implements this functionality.  Another tool that allows you to diff two Registry hive files is RegShot.  I agree that this is great functionality to have available, particularly if you want to see what differences exist between a hive file extracted from a VSC or found in the RegBack folder, and one in the config folder.

Speaking of the Registry, I saw this paper from DFRWS 2008 that discusses recovering deleted data from Registry hive files.  My first real encounter with this sort of information was via Jolanta Thomassen's dissertation paper on the topic, and the regslack tool she provided to go along with it.  Since then, other tools (RegRipper plugins del.pl and del_tln.pl) have implemented similar functionality, largely due to the demonstrated value of this functionality.

Jason Hale posted a while back (2 yrs) on the DeviceContainers key on Windows systems, and I ran across his post again recently.  What he found is pretty interesting...I'll have to dig into it a bit more and see what else is available out there.  Jason's research seems to provide a pretty good idea of what can be derived from the key data, so this may be well worth developing a RegRipper plugin, even if just to research what's available in various hives.

I was working on some analysis recently, and was facing an issue where a good number of NTUSER.DAT files had been recovered from an image, all of which had been extracted from the image and placed in folder paths.  While there were a lot of these files, I was only interested in one Registry key (pertinent to the case), a key for which a RegRipper plugin did not exist.  So, I modified an existing plugin to give me information about the key in question, if it did exist, and then wrote a DOS batch file to iterate through all of the folders, running the new plugin (via rip.exe) against the hive file.  A few minutes of development and testing, and I had a repeatable, documented process in place and functioning, providing a capability that had not been in my hands just a few moments before.  My point in sharing this is to illustrate what can be achieved through simple problem definition, and the use of open sources to develop a solution.  I have the batch file I used, so it's pretty much self-documenting, and I pasted the command line from the batch file into my case notes.
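The batch-file approach above can be sketched in Python as well.  This is only an illustration of the idea, not the batch file I used: the root folder, the plugin name ("mykey") and the helper names are all hypothetical, and the only thing taken from RegRipper itself is the rip.exe "-r <hive> -p <plugin>" command line.

```python
import subprocess
from pathlib import Path


def find_hives(root, name="NTUSER.DAT"):
    """Return every file called `name` (case-insensitive) found under `root`."""
    wanted = name.lower()
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.name.lower() == wanted
    )


def build_command(hive, plugin, rip="rip.exe"):
    """Build the rip.exe command line for a single hive file."""
    return [rip, "-r", str(hive), "-p", plugin]


def run_plugin(root, plugin, rip="rip.exe"):
    """Run one plugin against each hive; yield (hive path, plugin output)."""
    for hive in find_hives(root):
        result = subprocess.run(
            build_command(hive, plugin, rip),
            capture_output=True,
            text=True,
        )
        yield hive, result.stdout
```

Calling something like run_plugin(r"D:\case\hives", "mykey") and printing each result gives the same repeatable process, and the command line returned by build_command() is what would go into the case notes.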

pundup - Python script from herrcore to extract contents of McAfee *.bup files.  Even in 2017, there are a great deal of systems (and infrastructures) without any real endpoint monitoring capability employed, and sometimes you need to dig around a bit to get some really useful information about an incident.  One place you can look is AV detections (via the logs), and as such, any available quarantined files may provide even greater insight into the incident.  Further, if the system is running an older version of Windows, and you don't have an Amcache.hve file collecting process execution artifacts (like SHA-1 hashes), having the actual EXE itself to document, hash, and analyze would be very beneficial.

AppCompatProcessor - I ran across this little open source gem recently (note: according to the readme, this does not currently run on Windows); this tool runs through either AppCompatCache or AmCache data and allows you to...well, do a LOT with the data. It's well worth a look; just reading through the main page, I can easily see that a lot of what I include in my own workflow is used as pivot points, and then to expand the data.  For example, I tend to look for things like "$Recycle",

SysMon View - this is a really interesting approach to filtering and visualizing data collected by Sysmon on a Windows system.  Unfortunately, the only time I see Sysmon in use is on my own test systems; it does not seem to have been widely adopted by members of the corporate community who call for IR assistance.  I do think that this is a great approach to making better use of the data, though.

LimaCharlie - from refractionPOINT, described as an open source, cross-platform endpoint sensor.  There isn't a great deal of information available via the web page, but there are a few tweets available.

SideNote:
Speaking of endpoint agents, SANS recently conducted a product review of CrowdStrike's Falcon platform...you can get the PDF report here.
/SideNote

Invoke-Phant0m - The description states that the script "walks thread stacks of Event Log Service process (spesific[sic] svchost.exe) and identify Event Log Threads to kill Event Log Service Threads. So the system will not be able to collect logs and at the same time the Event Log Service will appear to be running."  Given the recent release of tools that claim to be able to remove individual Windows Event Log records, this is an interesting approach.  However, the biggest issue with the released tools is the inability to validate findings; while some on Twitter (I've been pointed to tweets) have claimed success, the actual EXE and process used haven't been shared to the point of allowing others to validate the findings.  To say, "...just start with this DLL..." does not provide a means of validation.

Of course, if you're not able to remove individual records, Inslainity provided another approach, albeit one that can be validated.  I've tested another approach to removing specific ranges of event records from the Security Event Log, using a method that can be scaled to all logs, but is much more insidious if you don't.

The folks at Javelin Networks have come up with an in-memory PowerShell script that can peek into consoles and provide detailed information about what's being done.  The description states that the script "extracted the content of the following command-line shells: PowerShell, CMD, Python, Wscript, MySQL Client, and some custom shells such as Mimikatz console. In some cases, the tools might be helpful to extract encrypted shells like the one used in PowerShell Empire Agent."

News
Adam (Hexacorn) has published yet another article demonstrating means of persistence and EDR bypass.  If nothing else, this is an excellent example of why endpoints of all versions (not just Windows) need to be instrumented to monitor and record process creation events, including full command lines.

Pete James over at Precision Discovery has a fascinating blog post in which he discusses records left behind in an often-overlooked Windows Event Log file.  I can't say that I've ever had a case where I've needed to know which Office files had been accessed by a user, but if you're tracking such artifact categories, then this is a good one to include.

Speaking of Office products, Will Knowles at MWRLabs published a blog post on using Office add-ins for persistence.

Here are a couple of links I've had sitting around for a while that I really haven't dug into...
Javelin Networks - CLI Powershell
NViso - Hunting Malware with Metadata
FireEye - Shim Databases used for Persistence

Thursday, June 15, 2017

Analyzing Documents

I've noticed over time that a lot of the write-ups that get posted online regarding malware or downloaders delivered via email attachments (i.e., spear phishing campaigns) focus on what happens after the malicious payload is activated...the URL reached out to, the malware downloaded, etc.  However, few seem to dig into the document itself, and there's a great deal that can be gleaned from those documents that can add to the threat intel picture.  If you're not looking at everything involved in the incident, if you're not (as Jesse Kornblum said) using all the parts of the buffalo, then you're very likely missing critical elements of the threat intel picture.

Here's an example from MS...much of the information in the post focuses on the embedded macro and the subsequent decrypted Cerber executable file, but there's nothing available regarding the document itself.

Keep in mind that different file formats (LNK, OLE, etc.) will contain different information.  And what I'm referring to here isn't about running through analysis steps that take a great deal of time; rather, what I'm going to show you are a few simple steps you can use to derive even more information from the attachment/wrapper documents themselves.

I took a look at a couple of documents (Doc 1, Doc 2) recently, and wanted to share my process and see if others might find it useful.  Both of these OLE-format documents have hashes available (or you can download and compute the hashes yourself), and they were also found on VirusTotal:

VirusTotal analysis for Doc 1
VirusTotal analysis for Doc 2

The VT analysis for both files includes a comment that the file was used to deliver PupyRAT.

Tools
Tools I'll be using for this analysis include my own oledmp.pl and wmd.pl.

Doc 1 Analysis
Running oledmp.pl against the file, we see:

Fig. 1: Doc 1 oledmp.pl output

That's a lot of streams in this OLE file.  So, one of the first things we see is the dates for the Root Entry and the 'directories' (MS has referred to the OLE file format as "a file system within a file", and they're right), which is 1 Jan 2017.  According to VT, the first time this file was submitted was 1 Jan 2017, at approx. 20:29:43 UTC...so what that tells us is that it's likely that one of the first folks to receive the document submitted it less than 14 hrs after the file was modified.

Continuing with oledmp.pl, we can view the contents of any of the streams in a hex dump format, and we see that stream number 20 contains a macro.  Using oledmp.pl with the argument "-d 20", we can view the contents of that stream.  In the output we see what appear to be 2 base64-encoded Powershell commands.  Copying and decoding both of them gives us one command that downloads PupyRAT to the system, and a second command that appears to be some form of shell code.  Some of the variable names ($Qsc, $zw5) appear to be unique, so searching for those via Google leads us to this Hybrid-Analysis write-up, which provides some insight into what the shell code may do.
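As a rough illustration of the decoding step, here is a minimal Python sketch.  It assumes the strings were encoded the way Powershell's -EncodedCommand option expects (base64 over UTF-16LE text); nothing above confirms that for these samples, so if the output is gibberish, try another encoding such as "utf-8".  The function names are my own.

```python
import base64


def decode_powershell_b64(blob, encoding="utf-16-le"):
    """Decode a base64 string copied out of a macro into readable text.

    Assumption: the text was encoded as UTF-16LE before being base64'd,
    as is the case for PowerShell's -EncodedCommand option.
    """
    raw = base64.b64decode(blob)
    return raw.decode(encoding)


def encode_powershell_b64(command, encoding="utf-16-le"):
    """Inverse of the above; handy for building test strings."""
    return base64.b64encode(command.encode(encoding)).decode("ascii")
```

Round-tripping a harmless string through encode_powershell_b64() and decode_powershell_b64() is a quick way to check the approach before pointing it at data copied from a real sample.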

Interestingly enough, the same search reveals that, per this Reverse.IT link, both encoded Powershell commands were used in another document, as well.

Moving on, here's an excerpt of the output from wmd.pl, when run against this document:

--------------------
Summary Information
--------------------
Title        :
Subject      :
Authress     : Windows User
LastAuth     : Windows User
RevNum       : 2
AppName      : Microsoft Office Word
Created      : 01.01.2017, 06:51:00
Last Saved   : 01.01.2017, 06:51:00
Last Printed :

Notice the dates...they line up with the previously-identified dates (see Fig. 1).
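The "less than 14 hrs" figure mentioned earlier is simple arithmetic on these values.  A minimal sketch, assuming the wmd.pl times are UTC (the excerpt does not say so explicitly) and using the VT first-submission time quoted above; the helper name is made up for illustration:

```python
from datetime import datetime, timezone


def parse_wmd_time(text):
    """Parse a 'DD.MM.YYYY, HH:MM:SS' string as shown in the excerpt.

    Assumption: the value is UTC.
    """
    parsed = datetime.strptime(text, "%d.%m.%Y, %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


last_saved = parse_wmd_time("01.01.2017, 06:51:00")
first_submitted = datetime(2017, 1, 1, 20, 29, 43, tzinfo=timezone.utc)

# Time between the document being saved and first showing up on VT
gap = first_submitted - last_saved
print(gap)  # prints 13:38:43
```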


Doc 2 Analysis
Following the same process we did with doc 1, we can see very similar output from oledmp.pl with doc 2:

Fig. 2: Doc 2 oledmp.pl output

One of the first things we can see is that this document was created within about 24 hrs of doc 1.

In the case of doc 2, stream 16 contains the data we're looking for...extracting and decoding the base64-encoded Powershell commands, we see that the commands themselves (PupyRAT download, shell code) are different.  Conducting a Google search for the variables used in the shell code command, we find this Hybrid-Analysis write-up, as well as this one from Reverse.IT.

Here's an excerpt of the output from wmd.pl, when run against this document:

--------------------
Summary Information
--------------------
Title        : HealthSecure User Registration Form
Subject      : HealthSecure User Registration Form
Authress     : ArcherR
LastAuth     : Windows User
RevNum       : 2
AppName      : Microsoft Office Word
Created      : 02.01.2017, 06:49:00
Last Saved   : 02.01.2017, 06:49:00
Last Printed : 20.06.2013, 06:27:00

--------------------
Document Summary Information
--------------------
Organization : ACC

Remember, this is a sample pulled down from VirusTotal, so there's no telling what happened with the document between the time it was created and submitted to VT.  I made the 'authress' information bold, in order to highlight it.

Summary
While this analysis may not appear to be of significant value, it does form the basis for developing a better intelligence picture, as it goes beyond the more obvious aspects of what constitutes most analysis (i.e., the command to download PupyRAT, as well as the analysis of the PupyRAT malware itself) in phishing cases.  Valuable information can be derived from the document format used to deliver the malware itself, regardless of whether it's an MSOffice document, or a Windows shortcut/LNK file.  When developing an intel picture, we need to be sure to use all the parts of the buffalo.

Wednesday, May 31, 2017

Use of LNK Files...Again

I've discussed LNK files before...here, and more recently, here.  Windows shortcut files are an artifact on Windows systems that have been available for so long that there are likely a great many analysts who understand the category this artifact falls into, but may not know a great deal about the nature of the artifact itself.  That is to say that it's likely that a good many training and certification courses tend to gloss over some of the more esoteric aspects of the Windows shortcut file as an artifact.

Most analysts likely understand the Windows shortcut, or "LNK" file to be an artifact of "user knowledge of files", as the most commonly understood activity that leads to the automatic creation of these files is a user double-clicking on a file via the Windows Explorer shell.

We also know that understanding the format of these files can be very important, as well.  The LNK file format is used as the basis for individual streams within JumpLists, and there are several Registry keys that contain value data that follows the LNK file format.  In fact, LNK files are composed, at least in part, of shell items...so now we're getting into the format of the individual building blocks of many artifacts on Windows systems.

Let's take a step back for a moment, and consider the most common means of "producing" Windows shortcut files.  From a standpoint of developing and interpreting evidence, let's say that a user plugs a USB device into a laptop, opens the contents of the device in Windows Explorer, and then double-clicks an MS Word document.  At that point, a shortcut file is created in the user's Recent folder that contains not only a series of shell items that comprise the path to the file, but also a string that refers to the file, as well.  However, the LNK file also contains information about the system on which it was created, and it's this aspect of the file format that is missed or forgotten, due to the fact that in most cases, this embedded information isn't very interesting.  However, if an LNK file is not created as an artifact of, say, a malware infection, but is instead used by an attacker to infect other systems, then the LNK file will contain information about the system on which it was developed, which would NOT be the victim system.  This is when the contents of the LNK file become very interesting.

TrendMicro's TrendLabs recently published a blog article that gives a very good example of when these files can be extremely interesting.  The article describes the reported increase in use of LNK files as weaponized attachments by the group identified as "APT 10".  I find this absolutely fascinating, not because something that has been part of Windows from the very beginning has been turned against users and employed as an attack vector, but because of the information and metadata that seems to be left on the floor because these files simply are not parsed or dug into...at least not to any extent that is discussed publicly.  It seems that most organizations may miss or discount the value of Windows shortcut file contents, and not incorporate them into their overall intelligence picture.

This JPCERT blog post describes "Evidence of Attackers' Development Environment Left in Shortcut Files", and this NViso blog post discusses, "Tracking Threat Actors Through .LNK Files".  These are great at describing what can be done, providing illustration and example.

Sunday, April 09, 2017

Getting Started

Not long ago, I gave some presentations at a local high school on cybersecurity, and one of the questions that was asked was, "how do I get started in cybersecurity?"  Given that my alma mater will establish a minor in cybersecurity this coming fall, I thought that it might be interesting to put some thoughts down, in hopes of generating a discussion on the topic.

So, some are likely going to say that in today's day and age, you can simply Google the answer to the question, because this topic has been discussed many times previously.  That's true, but it's as much a blessing as it is a curse; there are many instances in which multiple opinions are shared, and at the end of the thread, there's no real answer to the question.  As such, I'm going to share my thoughts and experience here, in hopes that it will start a discussion that others can refer to.  I'm hoping to provide some insight to anyone looking to "get in" to cybersecurity, whether you're an upcoming high school or college graduate, or someone looking to make a career transition.

During my career, I've had the opportunity to be a "gate keeper", if you will.  As an incident responder, I was asked to vet resumes that had been submitted in hopes of filling a position on our team.  To some degree, it was my job to receive and filter the resumes, passing what I saw as the most qualified candidates on to the next phase.  I've also worked with a pretty good number of analysts and consultants over the years.

The world of cybersecurity is pretty big and there are a lot of roads you can follow; there's pen testing, malware reverse engineering, DFIR, policy, etc.  There are both proactive and reactive types of work.  The key is to pick a place to start.  This doesn't mean that you can't do more than one...it simply means that you need to decide where you want to start...and then start. Pick some place, and go from there.  You may find that you're absolutely fascinated by what you're learning, or you may decide that where you started simply is not for you.  Okay, no problem.  Pick a new place and start over.

When it comes to reviewing resumes, I tend to not focus on certifications, nor the actual degree that someone has.  Don't get me wrong, there are a lot of great certifications out there.  The issue I have with certifications is that when most folks return from the course(s) to obtain the certification, there's nothing that holds them accountable for using what they learned.  I've seen analysts go off to a 5 or 6 day training course in DFIR of Windows systems, which cost $5K - $6K (just for the course), and not know how to determine time stomping via the MFT (they compared the file system last modification time to the compile time in the PE header).

I am, however, interested to see that someone does have a degree.  This is due to the fact that having a degree pretty much guarantees a minimum level of education, and it also gives insight into your ability to complete tasks.  A four (or even two) year degree is not going to be a party every day, and you're likely going to end up having to do things you don't enjoy.

And why is this important?  Well, the (apparently) hidden secret of cybersecurity is that at some point, you're going to have to write.  That's right. No matter what level of proficiency you develop at something, it's pretty useless if you can't communicate and share it with others.  I'm not just talking about sharing your findings with your team mates and co-workers (hint, "LOL" doesn't count as "communication"), I'm also talking about sharing your work with clients.

Now, I have a good bit of experience with writing throughout my career.  I wrote in the military (performance reviews, reports, course materials, etc.), as part of my graduate education (to include my thesis), and I've been writing almost continually since I started in infosec.  So...you have to be able to write.  A great way to get experience writing is to...well...write.  Start a blog.  Write something up, and share it with someone you trust to actually read it with a critical eye, not just hand it back to you with a "looks good".  Accept that what you write is not going to be perfect, every time, and use that as a learning experience.

Writing helps me organize my thoughts...if I were to just start talking after I completed my analysis, what came out of my mouth would not be nearly as structured, nor as useful, as what I could produce in writing.  And writing does not have to be the sole source of communications; I very often find it extremely valuable to write something down first, and then use that as a reference for a conversation, or better yet, a conference presentation.

So, my recommendations for getting started in the cybersecurity field are pretty simple:
1. Pick some place to start.  If you have to, reach out to someone for advice/help.
2. Start. If you have to, reach out to someone for advice/help.
3. Write about what you're doing. If you have to, reach out to someone for advice/help.

There are plenty of free resources available that provide access to what you need to get started; online blog posts, podcasts/videos, presentations, books (yes, books online and in the library), etc.  There are free images available for download, as part of DFIR challenges (if that's what you're interested in doing).  There are places you can go to find out about malware, download samples, or even run samples in virtual environments and emulators.  In fact, if you're viewing this blog post online, then you very likely have everything you need to get started.  If you're interested in DFIR analysis or malware RE, you do not need to have access to big expensive commercial tools to conduct analysis...that's just an excuse for paralysis.

There is a significant reticence to sharing in this "community", and it's not simply isolated to folks who are new to the field.  There are a lot of folks who have worked in this industry for quite a while who will not share experiences or findings.  And there is no requirement to share something entirely new, that no one's seen before.  In fact, there's a good bit of value in sharing something that may have been discussed previously; it shows that you understand it (or are trying to), and it can offer visibility and insight to others ("oh, that thing that was happening five years ago is coming back...like bell bottoms...").

The take-away from all of this is that when you're ready to put your resume out there and apply for a position in cybersecurity, you're going to have some experience in the work, have visible experience writing that your potential employer can validate, and you're going to know people in the field.

Understanding What The Data Is Telling You

Not long ago, I was doing some analysis of a Windows 2012 system and ran across an interesting entry in the AppCompatCache data:

SYSVOL\Users\Admin\AppData\Roaming\badfile.exe  Sat Jun  1 11:34:21 2013 Z

Now, we all know that the time stamp associated with entries in the AppCompatCache is the file system last modification time, derived from the $STANDARD_INFORMATION attribute.  So, at this point, all I know about this file is that it existed on the system at some point, and given that it's now 2017 it is more than just a bit odd, albeit not impossible, that that is the correct file system modification date.

Next stop, the MFT...I parsed it and found the following:

71516      FILE Seq: 55847 Links: 1
[FILE],[BASE RECORD]
.\Users\Admin\AppData\Roaming\badfile.exe
    M: Sat Jun  1 11:34:21 2013 Z
    A: Mon Jan 13 20:12:31 2014 Z
    C: Thu Mar 30 11:40:09 2017 Z
    B: Mon Jan 13 20:12:31 2014 Z
  FN: msiexec.exe  Parent Ref: 860/48177
  Namespace: 3
    M: Thu Mar 30 11:40:09 2017 Z
    A: Thu Mar 30 11:40:09 2017 Z
    C: Thu Mar 30 11:40:09 2017 Z
    B: Thu Mar 30 11:40:09 2017 Z
[$DATA Attribute]
File Size = 1337856 bytes

So, this is what "time stomping" of a file looks like, and this also helps validate that the AppCompatCache time stamp is the file system last modification time, extracted from one of the MFT record attributes.  At this point, there's nothing to specifically indicate when the file was executed, but we now have a much better idea of when the file appeared on the system. The bad guy most likely used the GetFileTime() and SetFileTime() API calls to perform the time stomping, which we can see by going to the timeline:

Mon Jan 13 20:12:31 2014 Z
  FILE                       - .A.B [152] C:\Users\Admin\AppData\Roaming\
  FILE                       - .A.B [56] C:\Windows\explorer.exe\$TXF_DATA
  FILE                       - .A.B [1337856] C:\Users\Admin\AppData\Roaming\badfile.exe
  FILE                       - .A.B [2391280] C:\Windows\explorer.exe\
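The comparison I made by eye in the MFT record above ($STANDARD_INFORMATION times versus $FILE_NAME times) can be sketched in a few lines of Python.  This is a simplified check with made-up helper names, not a replacement for looking at the record: a modification time that predates the $FILE_NAME creation time is common for files that were copied or installed, so the $STANDARD_INFORMATION creation ("B") time falling before the $FILE_NAME creation time is the more telling indicator.

```python
from datetime import datetime, timezone

TS_FORMAT = "%a %b %d %H:%M:%S %Y Z"


def parse_ts(text):
    """Parse a timestamp such as 'Sat Jun  1 11:34:21 2013 Z' as UTC."""
    return datetime.strptime(text, TS_FORMAT).replace(tzinfo=timezone.utc)


def si_fields_before_fn_birth(si, fn):
    """Return the $STANDARD_INFORMATION fields (M, A, C, B) that are
    earlier than the $FILE_NAME creation (B) time."""
    fn_born = parse_ts(fn["B"])
    return [field for field in "MACB" if parse_ts(si[field]) < fn_born]


# Values copied from the MFT record for badfile.exe shown above
std_info = {
    "M": "Sat Jun  1 11:34:21 2013 Z",
    "A": "Mon Jan 13 20:12:31 2014 Z",
    "C": "Thu Mar 30 11:40:09 2017 Z",
    "B": "Mon Jan 13 20:12:31 2014 Z",
}
file_name = {
    "M": "Thu Mar 30 11:40:09 2017 Z",
    "A": "Thu Mar 30 11:40:09 2017 Z",
    "C": "Thu Mar 30 11:40:09 2017 Z",
    "B": "Thu Mar 30 11:40:09 2017 Z",
}

suspicious = si_fields_before_fn_birth(std_info, file_name)
print(suspicious)  # prints ['M', 'A', 'B']
```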

Fortunately, the system I was examining was Windows 2012, and as such, had a well-populated AmCache.hve file, from which I extracted the following:

File Reference: da2700001175c
LastWrite     : Thu Mar 30 11:40:09 2017 Z
Path          : C:\Users\Admin\AppData\Roaming\badfile.exe
Company Name  : Microsoft Corporation
Product Name  : Windows Installer - Unicode
File Descr    : Windows® installer
Lang Code     : 1033
SHA-1         : 0000b4c5e18f57b87f93ba601e3309ec01e60ccebee5f
Last Mod Time : Sat Jun  1 11:34:21 2013 Z
Last Mod Time2: Sat Jun  1 11:34:21 2013 Z
Create Time   : Mon Jan 13 20:12:31 2014 Z
Compile Time  : Thu Mar 30 09:28:13 2017 Z

From my timeline, as well as from previous experience, the LastWrite time for the key in the AmCache.hve corresponds to the first time that badfile.exe was executed on the system.

What's interesting is that the Compile Time value from the AmCache data is, in fact, the compile time extracted from the header of the PE file.  Yes, this value is easily modified, as it is simply a bunch of bytes in the file that do not affect the execution of the file itself, but it is telling in this case.
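If you have the EXE itself, the same value can be read straight from the file.  Here is a minimal sketch using only the Python standard library; it relies on the documented PE layout (the offset to the "PE\0\0" signature is stored at 0x3C, and TimeDateStamp sits 8 bytes past that signature), and the function name is my own.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data):
    """Return the COFF header TimeDateStamp of a PE file as a UTC datetime.

    `data` is the raw file content (bytes).  The value is trivially
    modified, so treat it as one data point among several.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Signature (4) + Machine (2) + NumberOfSections (2) = 8 bytes
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Reading a file with open(path, "rb").read() and passing the bytes to pe_compile_time() gives a value that can be compared against the Compile Time field in the AmCache data.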

So, on the surface, while it may first appear as if the badfile.exe had been on the system for four years, it turns out that by digging a bit deeper into the data, we can see that wasn't the case at all.

The take-aways from this are:
1.  Do not rely on a single data point (AppCompatCache) to support your findings.

2.  Do not rely on the misinterpretation of a single data point as the foundation of your findings.  Doing so is more akin to forcing the data to fit your theory of what happened.

3.  The key to analysis is to know the platform you're analyzing and know your data...not only what is available, but its context.

4.  During analysis, always look to artifact clusters.  There will be times when you do not have access to all of the artifacts in the cluster, so you'll want to validate the reliability and fidelity of the artifacts that you do have.

Saturday, April 08, 2017

Understanding File and Data Formats

When I started down my path of studying techniques and methods for computer forensic analysis, I'll admit that I didn't start out using a hex editor...that was a bit daunting and more than a little overwhelming at the time.  Sure, I'd heard and read about those folks who did, and could, conduct a modicum of analysis using a hex editor, but at that point, I wasn't seeing "blondes, brunettes, and redheads...".  Over time and with a LOT of practice, however, I found that I could pick out certain data types within hex data.  For example, within a hex dump of data, over the years my eyes have started picking out repeating patterns of data, as well as specific data types, such as FILETIME objects.
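For anyone who hasn't worked with them, a FILETIME object is 8 bytes: a little-endian 64-bit count of 100-nanosecond intervals since 1 Jan 1601 UTC.  A minimal sketch of the conversion (the function name is just for illustration):

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw8):
    """Convert 8 raw little-endian bytes holding a FILETIME to a UTC datetime."""
    (ticks,) = struct.unpack("<Q", raw8)
    # ticks are 100-nanosecond intervals; 10 ticks make one microsecond
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

Once you've converted a few of these by hand, values from recent years start to stand out in a hex dump, since the high-order bytes at the end of the 8-byte value change very slowly.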

Something that's come out of that is the understanding that knowing the structure or format of specific data types can provide valuable clues and even significant artifacts.  For example, understanding the structure of Event Log records (binary format used for Windows NT, 2000, XP, and 2003 Event Logs) has led to the ability to parse for records on a binary level and completely bypass limitations imposed by using the API.  The first time I did this, I found several valid records in a *.evt file that the API "said" shouldn't have been there.  From there, I have been able to carve unstructured blobs of data for such records.
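To make the carving idea concrete, here is a simplified Python sketch.  It relies on three properties of the *.evt record format: each record begins with a 4-byte length, followed by the "LfLe" signature, and ends with the same 4-byte length repeated.  The function name and the minimum-size constant are my own choices, and this is an illustration of the technique rather than the code I actually used.

```python
import struct

MAGIC = b"LfLe"
MIN_RECORD = 0x38  # size of the fixed portion of a record; also skips the 0x30-byte file header


def carve_evt_records(blob):
    """Scan a blob of data for *.evt records.

    Returns a list of (offset, length, record_number, time_generated)
    tuples, where time_generated is a Unix epoch value.
    """
    records = []
    pos = blob.find(MAGIC)
    while pos != -1:
        start = pos - 4  # the length field precedes the signature
        if start >= 0:
            (length,) = struct.unpack_from("<I", blob, start)
            end = start + length
            if length >= MIN_RECORD and end <= len(blob):
                # A valid record repeats its length in its last 4 bytes
                (trailer,) = struct.unpack_from("<I", blob, end - 4)
                if trailer == length:
                    rec_num, generated = struct.unpack_from("<II", blob, start + 8)
                    records.append((start, length, rec_num, generated))
        pos = blob.find(MAGIC, pos + 1)
    return records
```

Because the check depends only on the record's own structure, the same function works against a *.evt file, a pagefile, or a chunk of unallocated space.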

Back when I was part of the IBM ISS ERS Team, an understanding of the structure of Windows Registry hive files led us to being able to determine the difference between credit card numbers being stored "in" Registry keys and values, and being found in hive file slack space.  The distinction was (and still is) extremely important.

Developing an understanding of data structures and file formats has led to findings such as Willi Ballenthin's EVTXtract, as well as the ability to parse Registry hive files for deleted keys and values, both of which have proven to be extremely valuable during a wide variety of investigations.

Other excellent examples of this include parsing OLE file formats from Decalage, James Habben's parsing Prefetch files, and Mari's parsing of data deleted from SQLite databases.

Other examples of what an understanding of data structures has led to include parsing Windows shortcuts/LNK files that were sent to victims of phishing campaigns.  This NViso blog post discusses tracking threat actors through the .lnk file they sent their victims, and this JPCert blog post from 2016 discusses finding indications of an adversary's development environment through the same resource.

Now, I'm not suggesting that every analyst needs to be intimately familiar with file formats, and be able to parse them by hand using just a hex editor.  However, I am suggesting that analysts should at least become aware of what is available in various formats (or ask someone), and understand that many of the formats can provide a great deal of data that will assist you in your investigation.

Sunday, March 26, 2017

Links, Updates

LNK Attachments
Through my day job, we've seen a surge in spam campaigns lately where Windows shortcuts/LNK files were sent to the targets as email attachments.  A good bit of the focus has been the embedded commands within the LNK files, and how those commands have been obfuscated in order to avoid detection or analysis.  There is some great work being done in this area (discovery, analysis, etc.) but at the same time, a good bit of data and some potential intelligence is being left "on the floor", in that the LNK files themselves are not being parsed for embedded data.

DFIR analysts are probably most often familiar with LNK files being used to either maintain malware persistence when infected, or to indicate that a user opened a file on their system.  In instances such as these, the MS API for creating LNK files embeds information from the local system within the binary contents of the LNK file.  However, if a user is sent an LNK file, that file must have been created on another system altogether...which means that unless a specific script was used to create the LNK file on a non-Windows system, or to modify the embedded information, we can assume that the embedded information (MAC address, system NetBIOS name, volume serial number, SID) was from the system on which the LNK file was created.
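As an example of one of those embedded items, here's a minimal Python sketch of recovering a MAC address.  It assumes that you've already located one of the 16-byte "droid" GUIDs in the LNK file's tracker data block (per my reading of the MS-SHLLINK specification), and that the GUID is a version 1 UUID, which carries the MAC address of the creating system in its last six bytes.  The function name is hypothetical, and the sketch does not parse the LNK file itself.

```python
import uuid

def mac_from_droid(guid_bytes: bytes):
    """Return the MAC address from a 16-byte GUID if it is a version 1 UUID, else None."""
    # GUIDs in Microsoft structures store their first three fields little-endian.
    u = uuid.UUID(bytes_le=guid_bytes)
    if u.version != 1:
        return None  # only time-based (version 1) UUIDs embed a node/MAC value
    # u.node is the 48-bit node value; format it one byte at a time.
    return ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))
```

The version check matters, because a GUID that wasn't generated as a time-based UUID won't contain a MAC address at all, and the last six bytes would be meaningless.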

I'd blogged on this before (yes, eight years ago), and while researching that blog, had found this reference to %LnkGet% at Microsoft.

I recently ran across this fascinating write-up from NVISO Labs regarding the analysis of an LNK file shipped embedded within an old-style (re: OLE) MSWord document.  While the write-up contains hex excerpts of the LNK file, and focuses on the command used to launch bitsadmin.exe to download malware, what it does not do is extract embedded artifacts (SID, system NetBIOS name, MAC address, volume serial number) from the binary contents of the LNK file.

I know...so what?  Who cares?  How can you even use this information?  Well, if you're maintaining case notes in a collaboration portal, you can search for various findings across engagements, long after those engagements have closed out or analysts have left (retired, moved on, etc.), developing a "bigger picture" view of activity, as well as maintaining intelligence from those engagements.  For example, keeping case notes across engagements will allow a perhaps less experienced analyst to see what's been done on previous engagements, and illustrate what further work can be done (just-in-time learning).  Of course then there's correlating multiple engagements with marked similarities (intel gathering).  Or, something to consider is that there are Windows Event Log records that include the NetBIOS name of the remote system when a login occurs, and you might be able to correlate that information with what's seen embedded in LNK files (intel collection/development).

MS-SHLLINK: Shell Link Binary File Format

AutoRun - ServiceStartup
Adam's found yet another autostart location within the Windows Registry, this one the ServiceStartup key.  I haven't seen a system with this key populated, but it's definitely something to look out for, and checking for it would make a great RegRipper plugin, or a great addition to the malware.pl plugin.

However, while I was looking around at a Software hive to see if I could find a ServiceStartup key with any values, I ran across the following key that looked interesting:

HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\WSMAN\AutoRestartList

Digging into the key a bit, I found a couple of interesting articles (here, and here).  This is something I plan to keep a close eye on, as it looks as if it could be very interesting.

Analysis
The guys from Carbon Black recently shared some really fascinating analysis of a case in which "a malicious Excel document was used to create a PowerShell script, which then used the Domain Name System (DNS) to communicate with an Internet Command and Control (C2) server."

This is really fascinating stuff, not only to see stuff you've never seen before, but to also see how such things can be discovered (how would I instrument or improve my environment to detect this?), as well as analyzed.

Imposter Syndrome
I was catching up on blog post reading recently, and came across a blog post that centered on how the imposter syndrome caused one DFIR guy to sabotage himself.  Not long after reading that, I read James' post about his experiences at BSidesSLC, and his offer to help other analysts...and it sounded to me like James was offering a means for DFIR folks who are interested to overcome their own imposter syndrome.

On a similar note, not too long ago, I was asked to give some presentations on cyber-security at a local high school, and during one of the sessions, one of the students had an excellent question...how does someone with no experience apply for an entry-level position that requires experience?  To James' point, I recommended that the students with an interest in cyber-security start working on projects and blogging about them, sharing what they did, and what they found.  There are plenty of resources available, including system images that can be downloaded and analyzed, malware that can be downloaded and analyzed, etc.  Okay, I know folks are going to say, "yeah, but others have already done that..."...really?  I've looked at a bunch of the images that are available for download, and one of the things I don't see is responses...yes, there are some, but not many.  So, download the image, conduct and document your analysis, and write it up.  So what if others have already done this?  The point is you're sharing your experiences, findings, how you conducted your analysis, and most importantly, you're showing others your ability to communicate through the written word.  This is very important, because pretty much every job in the adult world includes some form of written communications, either through email, reports, etc.

Here's what I would look for if I were looking to fill a DFIR analysis position...even if the image had already been analyzed by others, I'd look for completeness of analysis based on currently available information, and how well it was communicated.  I'd also look for things like, was any part of the analysis taken a step or two further?  Did the "analyst" attempt to do all of the work themselves, or was there collaboration?  I'd want to see (or hear, during the interview) justification for the analysis steps along the way, not to second guess, but in order to understand thought processes.  I'd also want to see if there was a reliance on commercial tools, or if the analyst was able to incorporate (or better yet, modify) open source tools.

Not all of these aspects are required, but these are things I'd look for.  These are also things I'd look at when bringing new analysts onto a team, and mentoring them.

PowerShell
I posted recently about some interesting Powershell findings that pertain to Powershell version 5.  I found this while examining a Windows 10 system, but if someone has updated the Powershell installation on their Windows 7 and 8 systems, they should also be able to take advantage of the artifacts.

However, something popped up recently that simply reiterated the need to clearly identify and understand the version of Windows that you're examining...Powershell downgrade attacks.

This SecurityAffairs article provides an example of how Powershell has been used.

Wednesday, March 15, 2017

Incorporating AmCache data into Timeline Analysis

I've had an opportunity to examine some Windows 10 systems lately, and recently got a chance to examine a Windows 2012 server system.  While I was preparing to examine the Windows 2012 system, I extracted a number of files from the image, in order to incorporate the data in those files into a timeline for analysis.  I also grabbed the AmCache.hve file (I've blogged about this file previously...), and parsed it using the amcache.pl RegRipper plugin.  What I'm going to do in this post is walk through an example of something I found after the initial analysis.

From the ShimCache data from the system, I found the following reference:

SYSVOL\Users\Public\Downloads\badfile.exe  Fri Jan 13 11:16:40 2017 Z

Now, we all know that the time stamp associated with the entry in the ShimCache data is the file system last modification time of the file (NOT the execution time), and that if you create a timeline, this data would be best represented by an entry that includes "M..." to indicate the context of the information.
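To show what such an entry might look like, here's a minimal Python sketch that turns the ShimCache reference above into a single timeline line.  It assumes a five-field, pipe-delimited layout (time as a Unix epoch value, source, system, user, description); the "ShimCache" source tag, the SERVER01 system name, and the function names are all hypothetical values used for illustration.

```python
from datetime import datetime, timezone

def to_epoch(stamp: str) -> int:
    """Convert a time stamp such as 'Fri Jan 13 11:16:40 2017 Z' to a Unix epoch value."""
    dt = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y Z")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())

def shimcache_tln(stamp: str, path: str, system: str = "", user: str = "") -> str:
    """Build one pipe-delimited timeline line; 'M...' marks a last modification time."""
    return "|".join([str(to_epoch(stamp)), "ShimCache", system, user, f"M... {path}"])

print(shimcache_tln("Fri Jan 13 11:16:40 2017 Z",
                    r"SYSVOL\Users\Public\Downloads\badfile.exe",
                    system="SERVER01"))
```

Keeping the "M..." marker in the description preserves the context of the time stamp, so that nobody reading the timeline later mistakes it for an execution time.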

I then looked at the output of the amcache.pl plugin to see if there was an entry for this file, and I found the following:

File Reference    : 720000150e1
LastWrite           : Sun Jan 15 07:53:53 2017 Z
Path                    : C:\Users\Public\Downloads\badfile.exe
Company Name : FileManger
Product Name    : Fileppp
File Descr          : FileManger
Lang Code         : 0
SHA-1               : 00002861a7c280cfbb10af2d6a1167a5961cf41accea
Last Mod Time  : Fri Jan 13 11:16:40 2017 Z
Last Mod Time2: Fri Jan 13 11:16:40 2017 Z
Create Time       : Sun Jan 15 07:53:26 2017 Z
Compile Time    : Fri Jan 13 03:16:40 2017 Z

We know from Yogesh's research that the "File Reference" is the file reference number from the MFT; that is, the sequence number and the MFT record number.  In the above output, the "LastWrite" entry is the LastWrite time for the key with the name referenced in the "File Reference" entry.  You'll also notice some additional information that could be pretty useful...some of these (Lang Code, Product Name, File Descr) are values that I added to the plugin today (I also updated the plugin repository on GitHub).

You'll also notice that there are a few time stamps, in addition to the key LastWrite time.  I thought that it would be interesting to see what effect those time stamps would have on a timeline; so, I wrote a new plugin (amcache_tln.pl, also uploaded to the repository today) that would allow me to add data to my timeline.  After adding the AmCache.hve time stamp data to my timeline, I went looking for entries that referenced the file, and found the following:

Sun Jan 15 07:53:53 2017 Z
  AmCache       - Key LastWrite   - 720000150e1:C:\Users\Public\Downloads\badfile.exe
  REG                 User - [Program Execution] UserAssist - C:\Users\Public\Downloads\badfile.exe (1)


Sun Jan 15 07:53:26 2017 Z
  AmCache       - ...B  720000150e1:C:\Users\Public\Downloads\badfile.exe
  FILE              - .A.B [286208] C:\Users\Public\Downloads\badfile.exe


Fri Jan 13 11:16:40 2017 Z
  FILE               - M... [286208] C:\Users\Public\Downloads\badfile.exe
  AmCache       - M...  720000150e1:C:\Users\Public\Downloads\badfile.exe

Fri Jan 13 03:16:40 2017 Z
  AmCache       - PE Compile time - 720000150e1:C:\Users\Public\Downloads\badfile.exe

Clearly, a great deal more analysis and testing needs to be performed, but this timeline excerpt illustrates some very interesting findings.  For example, the AmCache entries for the M and B dates line up with those from the MFT.

Something else that's very interesting is that the AmCache key LastWrite time appears to correlate to when the file was executed by the user.

For the sake of being complete, let's take the parsed MFT entry for the file:

86241      FILE Seq: 114  Links: 1
[FILE],[BASE RECORD]
.\Users\Public\Downloads\badfile.exe
    M: Fri Jan 13 11:16:40 2017 Z
    A: Sun Jan 15 07:53:26 2017 Z
    C: Fri Feb 10 11:37:25 2017 Z
    B: Sun Jan 15 07:53:26 2017 Z
  FN: badfile.exe  Parent Ref: 292/1
  Namespace: 3
    M: Sun Jan 15 07:53:26 2017 Z
    A: Sun Jan 15 07:53:26 2017 Z
    C: Sun Jan 15 07:53:26 2017 Z
    B: Sun Jan 15 07:53:26 2017 Z
[$DATA Attribute]
File Size = 286208 bytes

We know we have the right file...if we convert the MFT record number (86241) to hex, and prepend it with the sequence number (also converted to hex), we get the file reference number from the AmCache.hve file.  We also see that the creation date for the file is the same in both the $STANDARD_INFORMATION and $FILE_NAME attributes from the MFT record, and they're also the same as the value extracted from the AmCache.hve file.
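The check described above is simple enough to sketch in a few lines of Python.  The function name is hypothetical, and the sketch deliberately makes no assumption about how many zeros pad the middle of the key name; it only confirms that the name starts with the sequence number in hex, ends with the record number in hex, and has nothing but zeros in between.

```python
def matches_file_reference(key_name: str, record_number: int, sequence: int) -> bool:
    """Check an AmCache key name against an MFT record number and sequence number."""
    seq_hex = format(sequence, "x")       # 114   -> "72"
    rec_hex = format(record_number, "x")  # 86241 -> "150e1"
    key = key_name.lower()
    if len(key) < len(seq_hex) + len(rec_hex):
        return False
    if not (key.startswith(seq_hex) and key.endswith(rec_hex)):
        return False
    # Anything between the two parts should be zero padding only.
    middle = key[len(seq_hex):len(key) - len(rec_hex)]
    return set(middle) <= {"0"}

print(matches_file_reference("720000150e1", 86241, 114))  # prints True
```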

There definitely needs to be more research and work done, but it appears that the AmCache data may be extremely valuable with respect to files that no longer exist on the system, particularly if (and I say "IF") the key LastWrite time corresponds to the first time that the file was executed.  Review of data extracted from a Windows 10 system illustrated similar findings, in that the key LastWrite time for a specific file reference number correlated to the same time that an "Application Popup/1000" event was recorded in the Application Event Log, indicating that the application had an issue; four seconds later, events (EVTX, file system) indicated an application crash.  I'd like to either work an engagement where process creation information is also available, or conduct testing and analysis of a Win2012 or Win10 system that has Sysmon installed, as it appears that this data may indicate/correlate to a program execution finding.

Now, clearly, the AmCache.hve file can contain a LOT of data, and you might not want it all.  You can minimize what's added to the timeline by using the reference to the "Public\Downloads" folder, for example, as a pivot point.  You can run the plugin and pipe the output through the find command to get just those entries that include files in the "Public\Downloads" folder in the following manner:

rip -r amcache.hve -p amcache_tln | find "public\downloads" /i

An alternative is to run the plugin, output all of the entries to a file, and then use the type and find commands to search for specific entries:

rip -r amcache.hve -p amcache_tln > amcache.txt
type amcache.txt | find "public\downloads" /i

Either one of these two methods will allow you to minimize the data that's incorporated into a timeline and create overlays, or simply create micro-timelines solely from data within the AmCache.hve file.
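If you'd rather do the filtering in a script, for instance to pivot on more than one folder at a time, the same idea can be sketched in Python.  This is only an illustration of the case-insensitive match that the find command performs above; the function name is hypothetical.

```python
def filter_lines(lines, pivots):
    """Yield lines that contain any of the pivot strings, ignoring case."""
    wanted = [p.lower() for p in pivots]
    for line in lines:
        lowered = line.lower()
        if any(p in lowered for p in wanted):
            yield line
```

Since the function accepts any iterable of strings, you can hand it an open file (such as the amcache.txt file created above) and write whatever it yields to a new file in order to build an overlay or micro-timeline.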

Oh, and hey...more information on language ID codes can be found here and here.

Addendum: Additional Sources
So, I'm not the first one to mention the use of AmCache.hve entries to illustrate program execution...others have previously mentioned this artifact category:
Digital Forensics Survival Podcast episode #020
Willi's amcache.py parser