Tuesday, July 29, 2014

More Fun With Python

Since I’m in the middle of trying to get a title completed on Python scripting for security professionals for Infinite Skills, I’ve been doing a lot of writing little scripts that do interesting things. So, in order to dump my head and also get something somewhat recent and potentially interesting up here, I thought I’d write up one of those scripts here. This could be a useful foundation for someone who wanted to do a little security testing using Python scripts. It is also useful for forensics professionals since you may want to write custom tools that are capable of parsing data in a way that makes sense to you rather than relying on tools that represent data in a way that made sense to someone else. Being able to parse simple data structures, for example, is a very useful skill for both network programming and also forensics programming. One case where there was a lot of parsing to do that came up both in terms of the video training I am doing but also in the next book I am writing is dealing with the information in the master boot record, including the partition table. 

So, let’s take a look at a program that I threw together to do some quick parsing of the partition table in a master boot record. 

#!/usr/bin/python3

#  (c) 2014, WasHere Consulting, Inc

import struct

 

f = open("mbr.dd", "rb")

 

mbr = bytearray()

try:

    mbr = f.read(512)

finally:

    f.close()

 

x = struct.unpack("<i", mbr[0x1B8:0x1BC])

print("Disk signature: ", x[0])

x = mbr[0x1BE]

if x == 0x80:

    print("Active flag: Active")

else:

    print("Active flag: Not active")

 

lbastart = struct.unpack("<i", mbr[0x1C6:0x1CA])

print("Partition Start (LBA): ", lbastart[0])

lbaend = struct.unpack("<i", mbr[0x1C9:0x1CD])

print("Partition End (LBA): ", lbaend[0])

For the purposes of this program, I have grabbed an image of the master boot record so I can get to the partition table. I did this by using the UNIX/Linux utility dd. You simply grab the first 512 byte block with dd if=/dev/sdb of=mbr.dd bs=512 count=1 and you end up with an image of the master boot record you can use various tools with. So, the program opens up the disk image called mbr.dd as a binary file then creates a byte array to store all of the bytes from that disk image into. Once I have a byte array, called mbr, I can start to pull the bytes out that I want as long as I know where the offsets are. 

Something to keep in mind, though, is that the master boot record is stored as little endian so if I have a file with an image or copy of that master boot record, all of the multi-byte values are going to be backwards. We need to use struct.unpack to get the bytes out and in the correct order. So, we tell struct.unpack that we have a little endian integer with the parameter <i and then we have to provide a range of bytes out of the byte array that struct.unpack should use to create that integer out of. The thing to keep in mind when you are providing a range is that the top end is not inclusive. For the disk signature, I am grabbing bytes 1B8, 1B9, 1BA, 1BB. Even though the last byte in the range indicated in the program is 1BC, we don’t get that last byte because it’s not included based on the Python syntax. 

Once we have the basics of pulling data out of the byte array, the rest is trivial. I can grab the single byte indicating whether a partition is active (bootable) or not and then compare that value with what I know about that flag. If it’s 0x80, I know the partition is active. If it’s not, then it’s not active so I can print out the results based on that byte. I can also get the starting logical block address and the ending logical block address by grabbing the bytes from my byte array and converting them into integers, again using the struct.unpack method. 

This is a simple technique that can then be applied to other binary data structures. Whether that data structure is the rest of the master boot record or if it’s the BIOS parameter block or the structures associated with a GUID Partition Table disk. 

 

Thursday, July 10, 2014

Finding Data In A Disk Image

Back to the basics now, especially since it seems as though there is a lot of emphasis on letting the tools do all the work for us that we forget how to do it ourselves when we are lacking the tools. This time we’re going to walk through the process of locating a deleted file on a hard drive. In this case, I’m going to be using a deleted file but, of course, this same process can be used for locating information for non-deleted files off a disk image as well. To save a little time, I’m going to be using some utilities from The Sleuth Kit, though we could also do the same thing by hand. I should also mention at this point that one of the reasons for writing this up is that I am working through a file systems chapter in my next book on Operating Systems Forensics, due to be published early next year by Syngress/Elsevier. It helps to understand the structure of the file system format so you can find data on the disk and know where to look for it.

The first thing I need to do is create a file I want to use on an NTFS partition. I’m going to be doing this from a Linux system, though the Sleuth Kit utilities work on other operating systems as well like Windows. Since I want to acquire a disk image, I am going to use Linux so I get dd. I am going to create the file with a word that isn’t likely to show up anywhere else on the file system in order to minimize the number of hits I get when I go looking. You can see the contents of the file below.

Screen Shot 2014 07 10 at 11 46 55 AM

I need to grab an image of the file and I may as well also capture a cryptographic hash at the same time since it’s just good practice. I’m going to use dd to capture the image. So, in my case, I’m using dd if=/dev/sdb1 of=ntfs.dd. I want to capture the whole disk so I am not setting a block size or setting any count. I want it to run until it runs out of disk to copy. Then I’m going to grab a cryptographic hash of the resulting image. When I’m all done, I get 2d59270e187217c3c222fc78851a1ebe91e3f8ec for my SHA1 hash on a disk image that is 150M in size. For the purposes of this exercise, I left the image small.

After deleting the file from the disk, I want to figure out where the data is. There are a couple of ways of accomplishing this using the Sleuth Kit tools. The first is to look it up by the file name. In reality, when you delete a file, it’s not gone from your drive or partition. As a result, there are still references to the file name sitting out there on my partition. One TSK tool I can use to find the file by the filename is ifind. ifind will search your file system for a reference to the filename or you can look up the metadata for a file based on the data unit you provide. In this example, I’m going to use ifind to look for the filename and then I’m going to use another TSK utility to pull the data out of the address on the disk that ifind has provided for me. You can see what that looks like below.


Screen Shot 2014 07 10 at 1 39 10 PM

That provides me with the contents of the file and you can compare the results with what we saw when I used cat to display the file. ifind looked in the image file ntfs.dd for a filename called wubble.txt and it returned a data block of 73, which I then used to pass into icat to get the contents of block 73. This assumes we know the name of the file or that the name of the file is still available to be searched on. What I may have is just a chunk of text that comes out of the file. I can use another utility to go looking for the file based on just a word search. Since I’m looking through a binary file with the image capture, I need to do something special with grep in order to figure out the offset where I can find the text I’m looking for. I am going to tell grep to give me a binary offset and look through the whole file, so I’ll be using grep -oba to search my image file for the word wubble. You can see the results below.


Screen Shot 2014 07 10 at 1 47 07 PM

It looks like we have several hits in the image. The number on the left hand side is the byte offset. Since I have that in bytes, I need to figure out what cluster that is going to be in so I can use the TSK tools, which are block or cluster-based. As a result, I need to do a little math but I need to know what my cluster size is first. Fortunately, I can use fsstat on my image and get my cluster size. Of course, I could also use a hex editor and do it the really old-fashioned away but this should be adequate for our purposes. If I divide 91551 by my cluster size of 4096, I end up with 22 and some change. That tells me that the data is going to be in cluster 22, so I can use the TSK tool blkcat to get the contents of block/cluster 22. You can see the results of that below.

Screen Shot 2014 07 10 at 1 51 11 PM

You can see a lot of funny looking characters. That’s because what we have here is an entry in the Master File Table so there is a lot of binary data and when you try to convert a byte that wasn’t meant to be a character into a character, you end up with values that don’t translate into something that looks right. You can, though, see the text of the file and that’s pretty common for an NTFS entry. Since it’s a very small file, the contents of the file was simply stored as an attribute of the file rather than taking up a data block somewhere else on the file system. You can also see the filename in the middle of all of that content. You can also see the value FILE0. This indicates that the drive was formatted with Windows XP or something newer. Since I used a Linux system to format it, the formatting utility just used conventions from a more recent version of Windows and NTFS.

We used a lot of TSK utilities to do this but we could just as easily stick with standard UNIX utilities to perform the work. Using the output from grep, we can see that we need to be 22 blocks into the file system in order to get the data we are looking for out. We can easily use dd to extract that information and then use xxd to view it in a hexadecimal dump. Using dd, we would set the block size to the block/cluster size of 4096 and then skip 22 blocks or clusters, grabbing only one. So, we could use dd if=ntfs.dd of=caught.dd bs=4096 count=1 skip=22 to grab a single block from the image file we have. Once we have the 4096 byte cluster, we can just use xxd caught.dd to view the results in a hexadecimal dump and see that we have the MFT entry for the file wubble.txt.