Tuesday, October 3, 2017

Chasing Data Using Sleuth Kit

Working on a new set of videos for O'Reilly Media -- basically updating one of the first video titles from back when it was Infinite Skills. In the process, I had to refresh my memory on a number of things. One of them was using The Sleuth Kit tools to run through a disk image to locate the contents of a file. Sure, you could just pop it open in a browser using some commercial tool but where's the fun in that? Autopsy you say? Yeah, but ultimately, Autopsy uses The Sleuth Kit tools to begin with even if you don't see it. Why not just learn what it is that Autopsy does so you can be ahead of the game? Having said that, let's step through how you might go about this.

We're going to be working with a disk image taken from a Linux system and the partition on the disk was formatted with ext4. However, the same steps will work for a Windows disk, particularly if the partition was formatted with NTFS. Since we have a disk image and not a partition image, the first thing we need to do is determine where the partition actually starts. In order to do that, we are going to use the program mmls, which lists all of the partitions in a disk or disk image. We could also use fdisk -l to do essentially the same thing.
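If you're curious what mmls is reading to come up with its answer, here is a minimal sketch in Python of pulling the primary partition entries out of a classic DOS/MBR partition table. It assumes an MBR-style disk (not GPT) and 512-byte sectors, and it runs against a synthetic sector built in memory rather than a real image; the function names are made up for illustration and this is not how mmls itself is implemented.

```python
import struct

SECTOR_SIZE = 512  # assumption: classic 512-byte sectors


def list_mbr_partitions(first_sector: bytes):
    """Return (slot, type, start_sector, sector_count) for each used primary entry."""
    if len(first_sector) < SECTOR_SIZE or first_sector[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for slot in range(4):
        # The four 16-byte partition entries start at byte 446 of the first sector.
        entry = first_sector[446 + slot * 16 : 446 + (slot + 1) * 16]
        part_type = entry[4]
        # Bytes 8-11: starting sector (LBA); bytes 12-15: length in sectors.
        start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
        if part_type != 0:  # type 0 marks an unused slot
            partitions.append((slot, part_type, start_lba, num_sectors))
    return partitions


def build_demo_mbr() -> bytes:
    """Synthetic first sector: one Linux (0x83) partition starting at sector 2048."""
    sector = bytearray(SECTOR_SIZE)
    sector[446:462] = struct.pack(
        "<B3sB3sII", 0x00, b"\x00\x00\x00", 0x83, b"\x00\x00\x00", 2048, 204800
    )
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    for slot, ptype, start, count in list_mbr_partitions(build_demo_mbr()):
        print(f"slot {slot}: type 0x{ptype:02x}, start sector {start}, {count} sectors")
```

To try it on a real image, you would read the first 512 bytes of the image file and hand them to list_mbr_partitions instead of the synthetic sector.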


What we discover here is that the partition we are looking for starts at sector 2048. Note that mmls reports offsets in sectors, not bytes; with the usual 512-byte sectors, that works out to byte 1,048,576. The other Sleuth Kit tools we will be using will need to be told what offset to start at because they are really looking for the start of the partition in order to parse the data structures that begin there. Once we know where the partition starts, we can get a list of the files that are in the partition. For this, I'm just going to get a listing of the root directory and not worry about doing a recursive listing down through all the directories (adding a -r). We also aren't going to ask specifically for deleted files (adding a -d, which restricts the listing to deleted entries). For our purposes, it doesn't much matter whether we have those or not. We are going to use fls and we need to add a -o 2048 to indicate that the partition starts 2048 sectors into the image.
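One thing worth keeping straight is units. The Sleuth Kit's -o option takes the offset in sectors, so if you ever need the byte position (say, to seek into the image yourself), you multiply by the sector size. Here is a small sketch of that arithmetic, along with a helper that only builds the fls argument list without running anything. The image name disk.img and both function names are hypothetical, and the 512-byte sector size is an assumption you should check against what mmls reports.

```python
def sector_to_byte_offset(start_sector: int, sector_size: int = 512) -> int:
    """Convert a partition start given in sectors to a byte offset in the image."""
    return start_sector * sector_size


def fls_command(image: str, start_sector: int, recursive: bool = False) -> list:
    """Build (but do not run) an fls command line for the given image."""
    args = ["fls"]
    if recursive:
        args.append("-r")  # descend into subdirectories
    args += ["-o", str(start_sector), image]  # -o is in sectors, not bytes
    return args


if __name__ == "__main__":
    print(sector_to_byte_offset(2048))           # 2048 * 512 = 1048576
    print(" ".join(fls_command("disk.img", 2048)))
```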


We now have a listing of the small number of files that are in the root directory of this partition. What we get from this listing is whether the entry is a directory (d/d) or a regular file (r/r). The second column is the number of the inode where the metadata for the file is located. The metadata for the file not only includes date information but also, more importantly, the data blocks that belong to the file. Those data blocks are where we can get access to the contents of the file. In order to get the data blocks, we are going to use the program istat. This will give us all of the information that the inode has related to the file. Keep in mind that while you think about the file in the context of the filename, on a UFS-based system (ext inherits a lot from UFS, the UNIX File System that goes back to the 70s and 80s with BSD, the Berkeley Software Distribution), a file is just a collection of related data blocks. We could have multiple filenames that all point to the same "file" on the disk.
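If you want to feed an fls listing into a script, each line breaks down into the type, the inode number, and the name. The sketch below is a rough parser for the ext-style lines described here (NTFS listings use a different address format, which this does not handle). The sample lines are made up for illustration and are not output from the image used in this post.

```python
import re
from typing import NamedTuple, Optional

# type/type, optional "*" for a deleted entry, inode number, colon, then the name
FLS_LINE = re.compile(
    r"^(?P<type>\S/\S)\s+(?P<deleted>\*\s+)?(?P<inode>\d+)(?:\(realloc\))?:\s+(?P<name>.*)$"
)


class FlsEntry(NamedTuple):
    type: str
    inode: int
    name: str
    deleted: bool


def parse_fls_line(line: str) -> Optional[FlsEntry]:
    """Parse one line of an fls listing; return None if it doesn't match."""
    match = FLS_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return FlsEntry(
        type=match.group("type"),
        inode=int(match.group("inode")),
        name=match.group("name"),
        deleted=match.group("deleted") is not None,
    )


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real image.
    for sample in ["d/d 11:\tlost+found", "r/r 12:\tnotes.txt", "r/r * 14:\told.txt"]:
        print(parse_fls_line(sample))
```

The inode number that comes out of this is what you would hand to istat in the next step.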

Running istat, we provide the offset to the start of the partition, just as we did with fls. Additionally, we provide the image that we are searching and also the inode that we want to interrogate. You can see the results of this below.


Among other things, we can see that the inode has been allocated. It's not free space because it refers to a file. You can see the date and time information. You can also see the permissions that are associated with the file. Additionally, as I mentioned above, different filenames can point to the same set of data blocks (the same inode). The "num of links" entry indicates the number of filenames that point to this inode, and by extension, the data that the inode points to. This is where the "Direct Blocks" entry is important. The direct blocks tell us where to get the contents of the file. For this, we use the blkcat command.
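The link count is easy to see for yourself on a live system. This sketch works on a mounted filesystem in a temporary directory rather than on a disk image, and it assumes that filesystem supports hard links: it creates a file, adds a second name for it, and shows that both names share one inode with a link count of 2. The file names are made up.

```python
import os
import tempfile


def demo_hard_link():
    """Create a file plus a hard link to it; return (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "original.txt")
        second_name = os.path.join(workdir, "second-name.txt")
        with open(original, "w") as handle:
            handle.write("same data blocks, two names\n")
        os.link(original, second_name)  # a second directory entry for the same inode
        first = os.stat(original)
        second = os.stat(second_name)
        return first.st_ino == second.st_ino, first.st_nlink


if __name__ == "__main__":
    same_inode, link_count = demo_hard_link()
    print(f"same inode: {same_inode}, link count: {link_count}")
```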


Again, we have to provide the offset because blkcat expects to start with the beginning of the partition, as fls and istat do. We provide the image name then the block number where the data is located. This is followed in this case by the number of blocks we want to extract. By default, we only pull one but since all of the blocks for the file are consecutive, we can pull all of them at once. Beneath that, you can see the contents of the file.
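Under the hood, pulling blocks out of an image is just seeking and reading. Here is a rough sketch of that arithmetic, not blkcat's actual implementation: the byte position is the partition start (in sectors) times the sector size, plus the block address times the filesystem block size. The block size is an assumption here; on a real image you would take it from what fsstat reports. The demo uses a small synthetic image in memory with 1024-byte blocks.

```python
import io


def read_blocks(image, partition_start_sector, block_addr, count=1,
                sector_size=512, block_size=4096):
    """Read `count` consecutive filesystem blocks from an open image file object."""
    offset = partition_start_sector * sector_size + block_addr * block_size
    image.seek(offset)
    return image.read(count * block_size)


def build_demo_image():
    """Synthetic image: partition starts at sector 4, 1024-byte blocks, text in block 3."""
    data = bytearray(4 * 512 + 8 * 1024)
    start = 4 * 512 + 3 * 1024  # byte 5120
    data[start:start + 5] = b"hello"
    return io.BytesIO(bytes(data))


if __name__ == "__main__":
    block = read_blocks(build_demo_image(), 4, 3, count=1, block_size=1024)
    print(block[:5])
```

Keep in mind that whole blocks come back, so the last block usually carries slack past the end of the file; trimming to the file size shown by istat gives you just the contents.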

It's several steps: mmls to get the partition start, fls to get a listing of files, istat to get the data block addresses and finally blkcat to extract the file contents. Still, walking through them helps to highlight how the filesystem is put together. Being able to follow this chain, no matter the file or the filesystem, means that whatever tool you are using, you understand the process it is carrying out for you.
