Tuesday, December 10, 2013

Data Carving Done Manually

Data carving, for those uninitiated in the arcane ways of forensic investigators or just technology geeks who like screwing around with things, is the process of extracting files out of a large pile of bits. You may want to do this to pull these files out of hidden areas on the disk or you may want to recover deleted files. You may also just want to see if you can do it just for the fun of it. There are a lot of different ways of carving data out of a disk and I’m going to walk through one way using only tools that you can find on your average Linux distribution. So, data carving the old fashioned way. 

The Setup

First, I’m using virtual machines which makes life a little easier when it comes to shuffling disks around and making them small for the purposes of imaging them. I’m going to be using a disk image, though you could also use a raw disk just as easily and the process would be the same. While I created the disk inside a Windows virtual machine, I imaged it from a Linux VM using dd. The first thing we want to do is find some files we want to carve out. Since I created the disk, I know there are JPEG images on it. Before you go digging for gold or data, you have to know what it is you are looking for. While I’m looking for a JPEG image, I have to know what that JPEG image looks like before I can go searching bits and bytes. It’s not like I can tell the system to go looking for a picture of my hot girlfriend in a bikini. You’d have to know some sort of digital pattern. 

Fortunately, when it comes to JPEGs, I happen to know that there are some key markers I should be looking for. While there are specific byte patterns that start and end the file, it’s a bit easier to start off looking for a string, and I know that many JPEGs, including the ones on this disk, have the string JFIF in their headers, so I have a starting point. (Photos straight off a camera often carry Exif there instead.) I can search the disk for the ASCII pattern JFIF. Once I find that pattern, I can isolate the file and extract it. Again, nothing up my sleeve other than the usual Linux command line suspects that you’d find in any distribution.

The Carving

The first thing is to go looking for the string JFIF since I know it will be in the header. I’m going to use the Linux/UNIX strings command to search for it but since I know it’s going to be there, searching for it isn’t enough. I also need to know where it’s going to be. As a result, I am going to have to tell strings I need to know the byte location within the file. To do that, I use the option -t with a parameter of d. -t says print the offset and d says print it in decimal. 
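
Here’s a rough sketch of that search as a small shell function. The function name and the file name disk.img are placeholders of mine, not the names I actually used. One thing to keep in mind is that strings reports where the printable run starts, which can be a few bytes ahead of the letters JFIF if other printable characters happen to come right before them.

```shell
# Print "offset string" for every printable string that contains JFIF.
# -t d tells strings to print each string's offset in decimal.
# Usage: find_jfif IMAGE
find_jfif() {
    strings -t d "$1" | grep JFIF
}

# Example (disk.img is a placeholder name):
#   find_jfif disk.img
```

Each line of output is a decimal offset followed by the string that was found there.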

[Screenshot: strings output listing the byte offsets of the JFIF matches]

Now I have some byte locations but I need to do a little math to help me figure out where I need to look. I could start at that byte and start grabbing but I actually need some bytes before it as well since that’s not actually the beginning of the file. As a result, I’m going to figure out what sector that byte is in. In order to do that, I have to divide by 512 since a sector is 512 bytes. When I divide 96236068 by 512 I get 187961, with 36 bytes left over. That’s the sector I’m in. The file system is actually logically organized into clusters that are larger than a single sector but I don’t need to worry about what cluster I’m in at this point. All I need to know is the sector. I can now use dd again to extract a chunk of the disk image that I think will correspond with the location of this file. I don’t know how big it is so I’m just going to grab a decent sized chunk of the image and then I can whittle from there once I find the end.
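
The arithmetic is easy to let the shell do. This is only a sketch: the variable names and the grab_chunk helper are mine, and the helper is just a thin wrapper around dd with a 512-byte block size.

```shell
offset=96236068        # byte offset of the string, as reported by strings
sector_size=512

sector=$((offset / sector_size))        # integer division
remainder=$((offset % sector_size))     # how far into that sector the string sits
sector_start=$((sector * sector_size))  # byte offset where that sector begins

echo "sector=$sector remainder=$remainder sector_start=$sector_start"

# Copy COUNT sectors from IMAGE into OUT, starting at sector SKIP.
# Usage: grab_chunk IMAGE OUT SKIP COUNT
grab_chunk() {
    dd if="$1" of="$2" bs="$sector_size" skip="$3" count="$4" 2>/dev/null
}
```

This prints sector=187961 remainder=36 sector_start=96236032. Something like grab_chunk disk.img chunk.bin 187960 2048 would then copy 1 MiB starting at sector 187960; disk.img, chunk.bin and the count of 2048 are placeholders I picked for illustration.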

[Screenshot: dd command extracting a chunk of the disk image, skipping 187960 sectors]

You’ll notice that I skipped 187960 blocks (sectors) before I started capturing my output, one fewer than the number I calculated. Skipping 187961 sectors would have put me at the start of the sector that holds the JFIF string, but the string doesn’t have to sit in the first sector of the file, and here it doesn’t. As the dump below shows, this file opens with a 540-byte ICC profile segment, which pushes JFIF into the file’s second sector. As a result, I reduce my number by 1 and use that instead. The very first bytes I should see at the beginning of the JPEG are FF D8. That marks the start of the image. If I use xxd to look at the resulting file I got from the dd capture, I can see that my first two bytes are in fact FF D8.

0000000: ffd8 ffe2 021c 4943 435f 5052 4f46 494c ......ICC_PROFIL 
0000010: 4500 0101 0000 020c 6c63 6d73 0210 0000 E.......lcms....
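
The dump also shows why the file starts a sector ahead of the one holding the JFIF string. Right after FF D8 comes an FF E2 segment whose two-byte length field is 021c. The sketch below works out where the next segment would begin. It assumes the JFIF segment follows the ICC profile directly, which I can’t see in these two lines of output, although the numbers line up with the 36 bytes left over from dividing 96236068 by 512.

```shell
soi=2                   # ff d8, the two bytes that open the file
marker=2                # ff e2, the marker that follows
seg_len=$((0x021c))     # length field after ffe2 (the length counts itself)

next_marker=$((soi + marker + seg_len))   # where the following segment begins
jfif_at=$((next_marker + 4))              # past that segment's marker and length

echo "next_marker=$next_marker jfif_at=$jfif_at"
echo "sector_in_file=$((jfif_at / 512)) byte_in_sector=$((jfif_at % 512))"
```

That puts the JFIF string 548 bytes into the file, which is 36 bytes into the file’s second sector.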

What I need to do now is locate the end of the file so I can figure out where I need to truncate it. I know at this point that I’m looking for the byte pattern FF D9 because that is the byte pair that indicates the end of a JPEG file. I’m going to use a hex editor to go looking for that byte pair so I can find the offset in the file where I need to truncate. In the image below, you can see the cursor indicating the beginning of the byte pattern. By counting over, I see the image ends at offset 1AD09. Now I know where to truncate the image. 
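
If you’d rather stay on the command line than open a hex editor, something along these lines should find the same offset. It’s a sketch, not what I actually ran: find_eoi_end is a name I made up, and it leans on od and awk to turn the file into one byte per line. It prints the decimal offset of the D9 byte for every FF D9 pair it sees. The first hit isn’t guaranteed to be the real end of the image, since an embedded thumbnail carries its own FF D9.

```shell
# Print the decimal offset of the d9 byte in every ff d9 pair.
# Usage: find_eoi_end FILE
find_eoi_end() {
    od -An -v -tx1 "$1" | tr -s ' ' '\n' |
        awk 'BEGIN { n = 0 }
             NF { if (prev == "ff" && $1 == "d9") print n; prev = $1; n++ }'
}

# Example (chunk.bin is a placeholder name):
#   find_eoi_end chunk.bin | head -n 1
```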

[Screenshot: hex editor with the cursor at the start of the FF D9 byte pair]

While I could truncate it in the editor, I can also use dd again to just extract those bytes that I want and write them out to a new file. First, though, I need to convert 1AD09 from hexadecimal to decimal. I can use a simple programmer’s calculator that’s included with my operating system and let it do the conversion for me. I end up with 109833. That is the offset of the last byte and offsets start at zero, so to make sure I get that position as well, I’m going to grab 109834 bytes from the beginning of the JPEG and write them out to a new file.
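
The shell can stand in for the calculator here too. In this sketch trim_to is a helper name I made up, and using dd with a block size of 1 is just one simple way to copy an exact number of bytes. It’s slow on big files but fine at this size.

```shell
last_offset=$((0x1AD09))      # offset of the final d9 byte, converted to decimal
length=$((last_offset + 1))   # offsets start at 0, so the byte count is one more

echo "last_offset=$last_offset length=$length"

# Copy the first LEN bytes of SRC into DST.
# Usage: trim_to SRC DST LEN
trim_to() {
    dd if="$1" of="$2" bs=1 count="$3" 2>/dev/null
}

# Example (file names are placeholders):
#   trim_to chunk.bin carved.jpg "$length"
```

The echo prints last_offset=109833 length=109834.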

[Screenshot: dd command writing the first 109834 bytes of the chunk to a new file]

When I look at the hex output from the file, again using xxd, I can see that the last two bytes are in fact FF D9.

001acf0: 3305 30cc 8cb1 1b94 cb7f 8e25 ccba f794 3.0........%.... 
001ad00: f32e 8d18 25c0 6b13 ffd9 ....%.k...
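
Both checks can be rolled into one quick test. This is a sketch with a made-up function name, and I’ve used od instead of xxd only because od is more likely to be installed everywhere. It confirms nothing beyond the first two and last two bytes, so a file that passes isn’t necessarily a valid image.

```shell
# Succeed only if FILE starts with ff d8 and ends with ff d9.
# Usage: looks_like_jpeg FILE
looks_like_jpeg() {
    head2=$(od -An -v -tx1 -N 2 "$1" | tr -d ' \n')
    tail2=$(tail -c 2 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$head2" = "ffd8" ] && [ "$tail2" = "ffd9" ]
}

# Example (carved.jpg is a placeholder name):
#   looks_like_jpeg carved.jpg && echo "markers look right"
```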

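I can now open the file up in an image editor or viewer and see the result. Of course, what we’ve done will work for any JPEG that carries the JFIF string and is stored in one contiguous run on the disk; a fragmented file would need its pieces tracked down separately. In order to carve out other file types, you would need to know the specific characteristics of that file type to be able to look for patterns in the disk.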

Conclusion

You’ll have noticed that I searched for a string rather than the byte pattern. The reason is that I can’t search directly for the byte pattern without doing something in the middle like converting the disk to a hexadecimal representation and then looking for the byte pattern. I could also load up the disk in a hex editor to find the pattern I was looking for. If you have a large disk, this can be time consuming and also memory consuming. Large files may take much longer to work with in that way. Strings is convenient because I can look for a string and also have strings print the offset in the file for me where the string was located. The offset is really the most important part since it indicates where in the disk I need to be looking. Obviously, it would have been easier to look for FF D8 but those are non-printable characters that I couldn’t represent by typing so that I could search for them. 

I could also, if I were in a programming frame of mind, write a program that would go looking for a hex pattern for me and I may have to do that if I can’t find a string pattern to look for first. Fortunately, there are tools that will go carving files out of a disk for you. In fact, there are a lot of them. Doing it manually, though, can give you an appreciation for what’s involved when those tools have to go grubbing around through a lot of bits and bytes looking for short byte patterns. 
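
For what it’s worth, such a program doesn’t have to be big. Here’s a minimal sketch in shell, assuming od and awk are available; find_hex is a name I made up. It streams the file as one byte per line and keeps a small sliding window, so it never loads the whole disk into memory, though it will be slow on a large image. The pattern has to be given as lowercase, space-separated hex bytes to match what od prints.

```shell
# Print the decimal offset at which each occurrence of a byte pattern starts.
# Usage: find_hex FILE "ff d8 ff"
find_hex() {
    od -An -v -tx1 "$1" | tr -s ' ' '\n' |
        awk -v pat="$2" '
            BEGIN { plen = split(pat, p, " "); n = 0 }
            NF {
                buf[n % plen] = $1      # ring buffer of the last plen bytes
                n++
                if (n < plen) next
                ok = 1
                for (i = 1; i <= plen; i++)
                    if (buf[(n - plen + i - 1) % plen] != p[i]) { ok = 0; break }
                if (ok) print n - plen
            }'
}

# Example (disk.img is a placeholder name):
#   find_hex disk.img "ff d8 ff"
```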

Sunday, December 8, 2013

Wearable Computing

As part of doing research for the next book I am writing, about the upcoming technologies that will help with collaboration, particularly business collaboration, I recently bought a Galaxy Gear smart watch. The idea was to see how effective wearable computing will be and, in part, because the Google Glass is so ridiculously expensive. Had I been able to get a Glass at a reasonable price, I more than likely would have. New technology interests me.

When it comes to the Gear, there have been a couple of concerns. First, is it anything more than a very expensive watch? Honestly, it’s not the only really expensive watch I’ve ever bought. The worst was a bad experience with a Suunto. A GPS watch seemed like a really good idea. I’d be able to really see how far I was walking or biking and I could map my walks and biking outings. The GPS had a really hard time ever finding the satellites and it ran the battery down. It was a terrible experience. The map app never seemed to work all that well either. Ridiculously expensive for what essentially turned out to be just a watch. And not even a very good watch. So, having said that, the Gear could only be up from there, right?

As it turns out it is. At the moment, there aren’t a lot of apps for it but the one thing I thought it would be really good for, it actually is perfect for. Sometimes I may be doing something, like lecturing or driving, and I don’t want to pull my phone out. It just may not be appropriate or safe. However, the Gear will display texts or calls or even social media messages. I can quickly glance at my watch just to see what or who it is, in case I’m expecting something important. I can also get notifications about e-mail messages, calendar reminders and alarms. You may think that this is obsessing too much about keeping attached to your digital communications but the reality is that there are times when you need to catch an important message but it’s just not convenient to whip out your phone. 

Or, even worse, how often do you see people in meetings that are supposed to be important sitting there with their phone, playing with it and checking messages and so forth? Isn’t it rude to sit there clearly not paying attention to the meeting and looking at your phone? If you don’t want to pay attention, don’t go to the meeting. You get no points for simply warming the seat at the table. At least this way, you can get messages sent to your watch. It’s hard to say if it would be considered any less rude to sit there glancing at your watch periodically than staring at your phone. You glance at your watch, you send the message that maybe you’re waiting for the meeting to be over but at least you may be engaged. You stare at your phone or play with it and you not only want the meeting to be over but you also don’t care about paying attention to what’s going on.

There aren’t a lot of apps. I know I said that. I hope there will be more. At the moment, Samsung hasn’t released a software developer kit for the Gear. Without the SDK, the Gear won’t be nearly as useful as it could be. However, there are some apps, including some third party apps. As with many third party apps, it’s hard to know whether to fully trust them or not or whether they are going to provide the functionality you are looking for. Here’s an example. Natively, you will get a notice that you have a Facebook message. Just a notice. Not the message. There is an app that says it will show you the message. However, it’s not from Facebook. Do I trust this third party app developer? Will the app interface with Facebook correctly? Who knows. At the moment, I’m not interested enough to figure it out. 

It does have a pedometer built into it. Boy, has the pedometer business taken off. Everything has a pedometer built into it now. I’ve been using a pedometer from Fitbit for the last few years. They work well. They claim to track your sleep, though not nearly as well as the Zeo sleep monitor I have. Sadly, Zeo went out of business. Anyway, now I have a phone (the S4) that will act as a pedometer. I have a watch that will act as a pedometer and integrate with the phone. And I have an actual pedometer. Having a device that I use regularly that will track my steps and activity is convenient. I don’t always have my phone with me, though. And I don’t always have my watch on. If I’m just puttering around the house/apartment, I probably don’t have my phone in my pocket and my watch can get in the way of doing things like typing, as I am now. The clasp is metal and it rubs on my laptop while I’m typing and metal against metal is just annoying. Having a little device I can toss in my pocket is convenient. Except that now I have a Fitbit, a phone, some keys, maybe some money and who knows what else.

There are two cool features, even if only in a very geeky way. One of them is the whole talking into your wrist thing. The speaker and microphone (yes, you can take calls on your watch, just like Dick Tracy, sort of) are in the clasp. If you want the best ability to hear and be heard, you put the clasp up toward your head. I can also dictate things like text messages using S Voice, again using the microphone in the clasp. The other cool geeky feature is the fact that it has a camera. Yes, all of a sudden you have a spy camera on your wrist. And a Samsung camera. The pictures even look really good. You can see one taken with the camera in my watch below. I did shrink it down so it’s not full resolution but the quality is really quite good.

[Photo: 20131129 153301, taken with the Galaxy Gear camera]

Finally, let me get back to one of the issues that seemed like it might be a concern. Before I bought it, I heard people suggesting that the battery wasn’t adequate. I remember reading someone claim that the battery wouldn’t last a day. As it turns out, my battery generally lasts about 4 days. This is far longer than my phone battery lasts. By about 3 days, generally. Is it ideal? No, but it’s not too bad. If you’re really concerned about it, charge it overnight while you’re in bed. It’s not like it will track your sleep. Who wears their watch to bed?

Overall, would I recommend the Galaxy Gear? I’d say it depends. I think you will see a lot of business people getting wearable computers like the Galaxy Gear because they feel like they want to be in constant contact. Well, I take that back. Maybe business people is too general. Many executives and sales people will want a device like this, I think. If you are just looking for a watch, it’s expensive. If you often carry your phone in a place that’s difficult to get to, the Gear is a nice way to get calls and notifications and I think it’s worth the cost. You want to wait for Apple to have half the functionality at twice the cost?

Tuesday, November 19, 2013

Start of a New Day

Sure, take on something new. Why not? This seems like a good outlet for stuff that doesn’t fall into the categories of other things I’m doing between teaching, writing and recording video training titles. I expect there will be a smattering of different concepts that show up here, including forensics, networking, security and so forth. Maybe more generically, technology discussions.

This first item arrived as a result of a homework assignment I was doing for a graduate Operating Systems Forensics course. We were supposed to take a look at a number of ways to gather artifacts under Windows and then translate the same sort of things to Linux, Mac OS or another operating system. I decided to take a look at the Volatility Framework and the things it could do under Windows and provide the same sort of guidance under Linux without using the framework. One of the issues was taking a look at raw system memory. Fortunately, there are ways to do that in Linux.

Linux makes use of pseudo filesystems to present information that is stored inside the kernel space. One of those filesystems is /proc, which provides a way of accessing information about the current processes. Even though it’s accessed from the filesystem, you aren’t accessing files in the sense of bits that are stored on a disk in your system. Instead, you are accessing an interface into kernel space. Below, you can see a list of all of the files in the /proc directory of the init process (process ID 1), which will always be the first process on a Linux system.

attr cpuset limits net sched syscall
autogroup cwd loginuid ns schedstat task
auxv environ map_files oom_adj sessionid timers
cgroup exe maps oom_score smaps wchan
clear_refs fd mem oom_score_adj stack
cmdline fdinfo mountinfo pagemap stat
comm io mounts personality statm
coredump_filter latency mountstats root status
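
A listing like that comes from running ls against the process’s directory under /proc. Here’s a small sketch that wraps the listing and also reads the process name out of the status file. The function names are mine, and the optional second argument, which swaps in a different root directory, is only there so the functions can be tried against a test directory. Some of the other entries need root to read.

```shell
# List the /proc entries for a process. PID 1 is init.
# Usage: proc_entries PID [ROOT]    (ROOT defaults to /proc)
proc_entries() {
    ls "${2:-/proc}/$1"
}

# Print the process name from the Name: line of its status file.
# Usage: proc_name PID [ROOT]
proc_name() {
    awk '/^Name:/ { print $2 }' "${2:-/proc}/$1/status"
}

# Example:
#   proc_entries 1
#   proc_name 1
```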

All of those files and directories are statistics and other pieces of data belonging to the init process. You can see the stack and pagemap, for example. While this can give you a lot of data about individual processes, at some point you may simply need to just get a dump of all of the memory on a system. This is especially true when it comes to a forensic investigation and you need to grab all of the information in memory for later analysis. We can do this pretty easily using another pseudo filesystem, /dev. In this case, we are going to use the memory device from the /dev filesystem in order to grab memory. If I just dump the /dev/mem device, though, it’s not going to look like anything I can use since there will be a lot of values that don’t have ASCII representations that can be printed. As a result, we need to dump the output into a utility that can convert the input into a hex dump output. Since I probably have a lot of memory and I want to look at it visually, I want to make use of a pager that can display a portion of memory at a time. Below, you can see the command string I used and a portion of the memory from my Linux system.

$ dd if=/dev/mem | xxd | less
0000000: a072 00f0 a072 00f0 a172 00f0 a072 00f0 .r...r...r...r..
0000010: a072 00f0 dc72 00f0 a072 00f0 a072 00f0 .r...r...r...r..
0000020: 91ad 00f0 c5a3 00f0 cc72 00f0 cc72 00f0 .........r...r..
0000030: cc72 00f0 cc72 00f0 1785 00f0 cc72 00f0 .r...r.......r..
0000040: 1001 00c0 ea72 00f0 f472 00f0 9788 00f0 .....r...r......
0000050: 62a0 00f0 fe72 00f0 24a5 00f0 59ab 00f0 b....r..$...Y...
0000060: 1779 00f0 ff7d 00f0 05ac 00f0 a072 00f0 .y...}.......r..
0000070: a072 00f0 3370 00f0 7283 00f0 3313 00c0 .r..3p..r...3...
0000080: a072 00f0 a072 00f0 a072 00f0 a072 00f0 .r...r...r...r..
0000090: a072 00f0 a072 00f0 a072 00f0 a072 00f0 .r...r...r...r..
00000a0: a072 00f0 a072 00f0 a072 00f0 a072 00f0 .r...r...r...r..
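
If you only want to look at the start of memory rather than stream all of it, you can cap how much dd reads. This is a sketch with a made-up function name, and it uses od in place of xxd, so the output layout is a little different from what’s shown above. Reading /dev/mem needs root, and many current kernels are built to restrict what the device will hand back, so your results may differ from mine.

```shell
# Hex-dump the first COUNT 4096-byte blocks of SRC.
# Usage: dump_blocks SRC COUNT
dump_blocks() {
    dd if="$1" bs=4096 count="$2" 2>/dev/null | od -A x -v -t x1
}

# Example, run as root:
#   dump_blocks /dev/mem 16 | less
```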

It’s important here to recognize that the addresses shown are physical addresses, which bear no resemblance to the addresses that would be in use by an application since Linux, like all other modern operating systems, uses virtual memory in order to support more allocated memory than is physically in the system. However, having a full dump of memory can be critical in some cases. Having access to a character-based device that is an interface to raw system memory is powerful, so it’s helpful to remember that with the /dev/mem device, you can not only read but also write, so it’s best to be careful so as to avoid forking up your system.