Tuesday, November 3, 2015

Policy and Compliance Are Not Enough

The information security business seems to have strayed a bit from its roots. Digital information security really began decades ago with the people who built and maintained systems. They may have wanted to either protect information or keep people out. While our nature as humans is more sharing and collaborative than it is secretive and isolationist, the reality is that we all have times when we need our secrets and spaces where we can store information that no one else can get to unless we specifically allow it. The problem comes when you start allowing a lot of users into the system or when you start connecting a lot of systems together into networks. Then we need additional protections in place to make sure everyone’s little corral of horses stays their own little corral of horses, unless they choose to set up a petting zoo.

Ultimately, there are competing priorities when it comes to information security. At one end, there is the pure security-focused priority: no one gets in unless they are specifically allowed in. At the other end, there is the perspective that the business owns everything and so it gets to set the rules. The problem that arises here is that rather than these two ends working together to find a middle, the money for the security end is tied up in the focus on the business priorities.

This is where we find a conundrum. Sure, you can say that without the business, there are no systems to protect. As a result, the business should always set the priorities. This has the potential to work well if the business truly owns its resources and has a stake in protecting them. The problem arises when the business has no stake in the resources that are under the control of the information technology and information security people. I’m losing you, right? Okay, let’s talk about case studies.

Let’s use a very simple scenario that can be extrapolated to much higher levels. Let’s say you are a company that wants to start up a loyalty card for your customers. This will allow you to learn a lot about the people who spend money with you and you can feed a little back in discounts or other goodies. Without the goodies, what is to entice people to sign up for your loyalty card so you can gather all of that data? Suddenly, though, your business has a resource that it has no stake in. You are storing names, addresses and phone numbers of a large number of your customers. What happens to your business if that information is stolen? It could be that absolutely nothing happens. If you aren’t storing credit cards or other financial data with those records, it could be you don’t even need to let anyone know, depending on the breach notification laws where you do business.

Even if you do have to notify someone, what has actually been lost? Some names and addresses. No big deal, right? If there is no downside to the business if that information is lost, what is the incentive to do everything possible to protect that information? This is where the problem of business-driven security comes in. The information that has been stolen doesn’t actually belong to the business. It belongs to the customers of the business. Since it doesn’t belong to the business and there is no actual impact to the business from its loss, the incentive to protect it is weak. Maybe you have to shut down your loyalty card program, which doesn’t lose sales. It just means less marketing information that you can make use of to be more effective.

Business-driven security starts with the security policy. This is a very high-level statement of expected outcomes. There is nothing at all about implementation in the policy. That comes in a set of standards that fall out of the policy. An acceptable use policy, which is common, may simply say that anyone making use of a company resource like a computer and the enterprise network will do so in a business-appropriate manner. That’s it. That’s the policy. There are countless ways to implement that policy. The standards that are defined underneath that policy will get into more detail but still won’t get into specific technologies and implementations. Instead, you get a more fine-grained set of requirements for what meeting that policy should look like.

The notion of making sure that you are achieving your policy goals is called compliance. This is also a word used in relation to meeting regulatory requirements. Some businesses may need to meet the requirements of the Payment Card Industry Data Security Standard (PCI DSS) if they deal with credit or debit cards. Others may have requirements set down by the Federal Deposit Insurance Corporation (FDIC). This would be common with banks and other similar financial companies. Making sure that you are meeting these requirements is also called compliance. As a result, compliance is big business, meaning there is a lot of money in auditors coming in to make sure you are following the appropriate rules.

Meeting a set of rules that are very high-level statements of expected outcomes, however, may not mean you are paying attention to the right things. Here’s an example. A business has a security awareness training program for all its users. Every user has to take this training. An auditor may come in and determine whether the business is really getting all of its users through security training. If the business has a goal of getting, say, 97% of users through training in a month and it hits 98%, it has achieved its objective.

Is this the right objective, though? Are the right topics being covered in the training? How is retention of the training being measured? Does this training actually help improve the overall security posture of an organization?

These are all questions that are not answered in this scenario but they are potentially far more important than the question of the percentage of users who have successfully made it through training. What it comes down to is clearly defining the problem. If you haven’t identified the problem well enough, your measurements are likely to be meaningless.

Large businesses are often driven by this compliance mentality, and auditors and security professionals are often driven by meeting objectives that bear no relation to improving the security posture of an organization. A business can meet all of its compliance objectives and still be breached. This happens all of the time. The large companies you have read about all have robust security policies and compliance programs in place. The problem is that the security policies are all about protecting the business.

So, back to the scenario from above. If the business is about protecting the business but the business is storing information about a third party (its customers), where does the third party get a say in protecting its information? Once the information is stolen, it’s too late to walk away from the business and research shows that in most cases, businesses are not impacted financially by these breaches. Certainly, their stock prices are not impacted over the long term. Where is the place at the table for the stakeholders who have the most to lose from a security breach?


Tuesday, June 30, 2015

Hiding Between Partitions

One of the challenges of digital forensics is the number of places that someone who knows what they are doing could hide data they didn’t want to be found. Fortunately, the vast majority of cases don’t fall into that category. Most of the time, this will be a case of files sitting in the Documents folder where they belong or maybe in the Recycle Bin or Trash, depending on the system you are looking at. This requires some experience with the different operating systems to know where to look for documents.
Of course, if someone wanted to be a little sneaky, they may create a folder somewhere else on the drive and hide their illicit documents, like their collection of Justin Bieber MP3s, in that folder. This would, of course, be outside of the standard document repositories for users. You might, for example, stuff a folder into the Windows directory structure. This would require administrative rights to the drive but on most personal desktop systems, the user would have those rights. However, you are not restricted to the filesystem itself. If you know a little about what you are doing, you can make use of the entire drive to store data.
Let’s say that you were to create a little slack space on your drive where you didn’t have a partition defined. Remember that a partition is a collection of consecutive blocks or sectors on a drive. A partition can then be formatted with a filesystem so the operating system can use it to store files. If you have a blank space before or after a partition, that’s space that the operating system can’t use within a filesystem. That makes it fair game to stuff data into. The figure below shows a drive that has two partitions defined on it with a large gap between the two partitions.



The end sector for the first partition is 1000000 but the beginning of the next partition is 1500000. This leaves roughly 500000 sectors unused. Each sector is 512 bytes. That gives us something around 256 MB (about 244 MiB) to store information in. This is a small drive and that value is close to a quarter of the drive. On larger drives, it may be a lot easier to carve a decent chunk out of the middle or the end of the drive and not have it be noticed.
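The arithmetic is easy to check. This is only a sketch using the sector numbers quoted above; the variable names are my own, and it counts only the sectors strictly between the two partitions, which is why it comes out one sector short of the round figure.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text

first_partition_end = 1_000_000     # last sector of the first partition
second_partition_start = 1_500_000  # first sector of the second partition

# Sectors strictly between the two partitions (neither endpoint is free).
gap_sectors = second_partition_start - first_partition_end - 1
gap_bytes = gap_sectors * SECTOR_SIZE

print(f"{gap_sectors} sectors, {gap_bytes} bytes")
print(f"{gap_bytes / 10**6:.1f} MB, {gap_bytes / 2**20:.1f} MiB")
```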
The problem with this space is that it’s unorganized. I can’t just copy files to it and expect those files to be placed nicely so they can be retrieved easily. If I have a few files that I want to put up there, I could manually place them one at a time and keep track of where I positioned them within that space. That requires figuring out how many blocks the files take up and remembering where they are. Instead, another way to do it is to simply concatenate the files together. There are a number of utilities you can use to do that. Under Linux, which is where I did this work, you can just use the cat utility. Perhaps one thing you want to do is to put a separator file between the files you want to store. That will allow you to extract the original files later on.
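Here is a rough sketch of the concatenate-with-a-separator idea. It is not what I ran (I used cat), and it works on bytes in memory rather than on files. The separator value and the function names are made up for illustration; any marker will do as long as it never appears inside the files themselves.

```python
# Hypothetical boundary marker; it must not occur inside any stored file.
SEPARATOR = b"\n--8<--HYPOTHETICAL-FILE-BOUNDARY--8<--\n"

def pack(blobs):
    """Join several file contents into one blob with a separator between them."""
    for blob in blobs:
        if SEPARATOR in blob:
            raise ValueError("separator occurs inside a file; pick another one")
    return SEPARATOR.join(blobs)

def unpack(packed):
    """Split a packed blob back into the original file contents."""
    return packed.split(SEPARATOR)

files = [b"first file\n", b"second file, a bit longer\n", b"third\n"]
packed = pack(files)
recovered = unpack(packed)
print(recovered == files)
```

The check for the separator inside a file matters. If the marker shows up in the data, the split on the way back out will cut a file in two.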
Once we have the concatenated file with all of the data we want to store in it, we can just dump it up to the slack space on the drive. In the figure below, you can see that I took my file as my input to the dd (disk dump) command and then wrote out to a sector in the slack space between the two partitions.

 

You will notice that when I was writing out to the slack space, I used the seek parameter. That’s because I needed to seek into the output file. If I want to go to a particular location in the input file, I would use skip, as you can see in the next invocation of dd where I extract the data. You will also notice in the output, I get 56+1 when I write the data out. That’s because I am writing 56 complete blocks plus 1 partial block. I didn’t fill the last 512-byte sector when I wrote the file out. When I read in, then, I need to read in all 57 blocks to get the complete file.
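To make the seek and skip behavior concrete, here is a small sketch that imitates it against a scratch file standing in for the drive. Everything in it is an assumption for illustration: the image size, the sector I write to and the payload size are invented (the payload is just sized to produce the same 56+1 result), and the function names are mine, not dd’s.

```python
import os
import tempfile

SECTOR = 512

def write_at_sector(image_path, data, seek):
    """Write data starting at sector `seek` of the output, like dd's seek."""
    with open(image_path, "r+b") as img:
        img.seek(seek * SECTOR)
        img.write(data)
    full, remainder = divmod(len(data), SECTOR)
    return full, (1 if remainder else 0)   # the "56+1" style record count

def read_sectors(image_path, skip, count):
    """Read `count` whole sectors starting at sector `skip`, like dd's skip."""
    with open(image_path, "rb") as img:
        img.seek(skip * SECTOR)
        return img.read(count * SECTOR)

# A zero-filled scratch file stands in for the drive (hypothetical size).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(4096 * SECTOR))
    image_path = tmp.name

payload = b"x" * (56 * SECTOR + 100)   # 56 full sectors plus a partial one
gap_start = 1200                       # hypothetical sector inside the gap

full, partial = write_at_sector(image_path, payload, seek=gap_start)
recovered = read_sectors(image_path, skip=gap_start, count=full + partial)
os.remove(image_path)

print(f"{full}+{partial} records")
print(recovered[:len(payload)] == payload, len(recovered) - len(payload))
```

Reading whole sectors back means the recovered data carries some extra bytes past the end of the original, which is what was already sitting in the rest of that last sector.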
Since I have the original file as well as the recovered file, I can compare one against the other. diff shows an insignificant difference regarding a newline at the end of the file. When I check the line count, I get the same line count between the two. These two comparisons tell me that the file I retrieved is substantially the same as the file that I put up into the disk.
Of course, I could also check to see if there is a specific piece of content in the entire file system using grep but on large disks, that can take a while. There are other tools that can be used to ferret out such things, especially if you can afford the commercial forensics tools.
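For what it’s worth, the brute-force search can be sketched in a few lines. This is not how grep or the commercial tools work internally; it is just a chunked scan of a raw image for a known byte string, keeping a small overlap so that a match straddling two chunks is not missed. The function name, marker and sizes are hypothetical.

```python
import os
import tempfile

def find_in_image(path, needle, chunk_size=1 << 20):
    """Return the byte offset of the first occurrence of needle, or -1."""
    overlap = len(needle) - 1
    tail = b""      # end of the previous chunk, kept for boundary matches
    offset = 0      # number of bytes read before the current chunk
    with open(path, "rb") as img:
        while True:
            chunk = img.read(chunk_size)
            if not chunk:
                return -1
            buf = tail + chunk
            idx = buf.find(needle)
            if idx != -1:
                return offset - len(tail) + idx
            tail = buf[-overlap:] if overlap else b""
            offset += len(chunk)

# Scratch image with a marker placed so that it straddles a 64-byte chunk.
marker = b"SECRET-MARKER"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(1020) + marker + bytes(500))
    scratch = tmp.name

hit = find_in_image(scratch, marker, chunk_size=64)
miss = find_in_image(scratch, b"NOT-THERE", chunk_size=64)
os.remove(scratch)
print(hit, miss)
```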
While this is all very manual, this could be performed programmatically as well. Perhaps another time.

Saturday, April 25, 2015

To Frag or Not To Frag, That's Hardly A Question

In the beginning, for there certainly was once a beginning, the gods created the Arpanet and saw that it was good. Skip ahead several years and they further begat TCP, then IP, then UDP and so on and so forth. Considering that they always had in mind the notion that data would be sent in small packages of a size that could be easily determined, they decided to call these packets. Because, sure, why not? However, not everything was called a packet. It all depended on where the chunking up took place as to what you called the end results. In the case of chunking at the Internet Protocol (IP) layer, the result was indeed called a packet. IP then needed to be able to handle this idea of packets and be able to put them back together again. This meant knowing the size and where they needed to be placed, much like putting a puzzle back together. If you try to simply jam all the little dangly bits into holes on other pieces, you run the risk of having gaps in the resulting puzzle where the dangly bits didn’t cleanly go into holes. Not to mention, the resulting image probably just won’t look right.

Let’s say, for example, that you have a chunk of data that’s 100 bytes, as you can see in the figure below. Maybe it gets broken up into chunks of 20, 30 and 50 bytes each. For ease of reference, let’s call them A, B and C. Say what you are sending starts with abcdefghijklmnopqrstuvwxyz, which is 26 bytes. Only 20 of them would fit into the first chunk of data, or packet. If I were to send the chunks to my friend Allan in separate packets (maybe think of them as envelopes), he would need to know which to open first and how to arrange them. Because of that, it’s helpful to have some sort of identifier associated with them. This puts us back to A, B and C.

If I were to receive C first, followed by A then B, I would know, because I know my alphabet, that A comes first, then B, then C. One of the problems with fragmentation is that I can’t tell what’s really going on by looking at a single fragment. Maybe I catch the one that says uvwxyz. I don’t know what that means. I suppose it could be the tail end of the alphabet, but that’s just guessing. What really comes before or after? It’s this uncertainty that opens the door to some bad behaviors. The utility fragroute, written by Dug Song, allows us to establish some rules that can actually cause fragmentation of messages. In the IP world, we have an expression for the names A, B and C that we used above. There is a field in the IP header called the IP identification field that ties a set of related fragments together. From there, we have a second field called the fragment offset that tells the receiving end where that particular part of the message slots in. You can see a sample IP header below showing the IP identification field (IPID) and the Fragment offset field. The receiving end needs both of these in order to pull the entire puzzle together with all the pieces in the right order.
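The A, B and C example can be sketched in code. This is a toy model, not a real IP stack: the Fragment class and both functions are invented for illustration, and it uses plain byte offsets to match the 20, 30 and 50 byte chunks from earlier, whereas real IP counts the fragment offset in 8-byte units.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    ident: int    # plays the role of the IP identification field
    offset: int   # where this piece starts within the original message
    more: bool    # True if more fragments follow this one
    data: bytes

def fragment(message, ident, sizes):
    """Cut message into consecutive pieces of the given sizes."""
    frags, offset = [], 0
    for size in sizes:
        chunk = message[offset:offset + size]
        frags.append(Fragment(ident, offset, offset + size < len(message), chunk))
        offset += size
    return frags

def reassemble(frags):
    """Put fragments back in order, refusing to guess if a piece is missing."""
    ordered = sorted(frags, key=lambda f: f.offset)
    expected = 0
    for f in ordered:
        if f.offset != expected:
            raise ValueError("gap or overlap in fragments")
        expected += len(f.data)
    if ordered[-1].more:
        raise ValueError("final fragment has not arrived")
    return b"".join(f.data for f in ordered)

message = (b"abcdefghijklmnopqrstuvwxyz" * 4)[:100]   # 100 bytes of data
A, B, C = fragment(message, ident=0x1234, sizes=[20, 30, 50])

arrival_order = [C, A, B]            # out of order, as in the text
rebuilt = reassemble(arrival_order)
print(rebuilt == message)
```

Notice that reassemble has nothing useful to say until every piece has arrived, which is exactly the position a device in the middle finds itself in.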


Let’s say, though, that you were sitting in the middle looking at all of the puzzles that were going through and you needed to determine whether they were bad puzzles or not. Maybe, in the case of me mailing letters to my friend Allan, I have put a single sheet of paper into each envelope and sent them to him. Maybe out of order, maybe one a day at a time. The complete thing is The Anarchist Cookbook with bomb making recipes and so forth in it. Once he has it, he can do bad things. You might want to stop him, and if you knew Allan, that might be a really wise idea. You’d have to know that what he is getting is really bad. You’d need the entire collection of papers, potentially, before you could make a determination. If we are talking about a network device, though, you need all of the fragments before you can determine whether to send it on. If fragments come in out of order or delayed, that means the receiving application is going to be delayed and that’s often unacceptable. So, maybe if it’s fragmented, you just send the fragments along because you don’t have enough information to make a decision from each individual fragment. Rather than holding up the train, which may cause users to be upset, you just push everything through.

We can take advantage of this with fragroute. Using fragroute, we can grab messages from applications and fragment and otherwise mangle them before they get sent on their merry way. Let’s try this as a ruleset.

delay random 10

dup random 20

ip_frag 48

ip_ttl 3

print

You may be able to figure most of this out on your own but let’s step through just to be sure. The first line says to delay random messages by 10 milliseconds. The next line says to duplicate random messages with a 20 percent chance to perform the duplication. The next line is the one that we’ve been talking about. Fragment each message at 48 bytes. Finally, set the IP time to live field to be 3 and then print out what has happened.

Below is one of the frames (another way of talking about messages that have been broken up, but this is each chunk of data that is seen on the wire) that results from running fragroute. You can see that the total length of the frame is 68 bytes. Of those 68 bytes, 20 are just the IP header. The other 48 bytes are the actual data. Based on looking at the fragment offset, it looks as though each fragment was 48 bytes, just as we had set in the fragroute rules. The offset is a multiple of 48, indicating that this is the third fragment (the first two covered bytes 0-47 and 48-95).
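The numbers in that frame can be worked out ahead of time. This sketch lays out where each 48-byte fragment would fall for a payload of a given size. The 200-byte payload is invented, the 20-byte header assumes no IP options, and the function is mine rather than anything fragroute provides. The one real-world detail it adds is that the header’s offset field counts in units of 8 bytes, which is why a fragment size needs to be a multiple of 8.

```python
IP_HEADER = 20   # bytes, assuming an IP header with no options
FRAG_SIZE = 48   # matches the ip_frag 48 rule

def fragment_layout(payload_len, frag_size=FRAG_SIZE):
    """Describe each fragment of a payload split every frag_size bytes."""
    if frag_size % 8:
        raise ValueError("fragment size must be a multiple of 8")
    layout = []
    for start in range(0, payload_len, frag_size):
        length = min(frag_size, payload_len - start)
        layout.append({
            "bytes": (start, start + length - 1),
            "offset_field": start // 8,            # header value, 8-byte units
            "total_length": IP_HEADER + length,
            "more_fragments": start + length < payload_len,
        })
    return layout

layout = fragment_layout(200)   # hypothetical 200-byte payload
for number, frag in enumerate(layout, start=1):
    print(number, frag)
```

The third entry covers bytes 96 through 143 with a total length of 68, which lines up with the frame described above.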


You may notice that there is no indication here what the data represents. Normally, you would have some indication of what protocol was being used. The problem is that we don’t have the TCP header in this fragment. You can see from the IP header that the next layer protocol is TCP but we don’t know what the protocol above that is. If you look at the very bottom of the Wireshark window above, you can see the actual data. This suggests a user agent from an HTTP request. But without the actual TCP header, we can’t determine for sure that it’s an HTTP request. Other protocols can carry a user agent and there is really nothing else to suggest that this is HTTP. We certainly don’t have a port number, because of the lack of TCP header.

You can see from this why fragmented packets are such a problem. We can pull it all together, of course. You can see the message in Wireshark that the entire message is reassembled in frame 3765. I can also follow the conversation by looking at frame 3765 and I find the following.


This is clearly an HTTP request. The section highlighted in red is the request that we were looking at a fragment from. The section highlighted in blue is the response. This doesn’t look at all like anything worth getting worried about. It’s just a standard HTTP request and response. But just from the fragment we saw, it was hard to say. In this case, the entire conversation was fractions of a second. We could easily have delayed the frames by quite a bit, which would have required a lot of hold up. This is why fragmentation attacks can be challenging when it comes to detection and certainly when it comes to prevention.



Tuesday, March 24, 2015

Math Lessons

While you might not think this really has anything to do with forensics or other security-related issues, the reality is that math is your friend. And my friend. And when you have to calculate the byte offset on a hard drive to locate the cluster where a particular file is located, you will really want to know a little about the basics of math. 

You may have guessed that the origination of this topic is all of the nonsense spreading around on social networking sites like Facebook. Based on the number of times variations on these math problems show up and the number of times I see wrong answers, it seems as though a large number of folks really could stand a brief math lesson and while I am neither a math instructor in real life, nor do I play one on TV, I am going to take this one on because it will make me feel better. 

The acronym to remember here, and it’s really quite simple, is PEMDAS. Make up whatever mnemonic you want to remember it; what it really means is parentheses, exponents, multiplication, division, addition and subtraction. This is the officially approved order of operations, with one wrinkle: multiplication and division share a level, as do addition and subtraction, so within a level you work left to right. When you see a very long chain of mathematical operations, you might think that you should just work left to right and as a general rule, that’s not a bad instinct. However, in order to come up with a consistent and mathematically accurate answer, you should apply the order of operations first. Then you can move on to left to right. You will also find that it’s generally easier to do a simple replacement. Let’s illustrate with an equation I’ve been seeing recently on Facebook.

7 + 7 / 7 + 7 * 7 - 7

For those of you unfamiliar with two of those symbols, the / is a division sign for cases where we don’t have the horizontal line with a dot above and below, as on a computer keyboard. The * is a multiplication symbol, which is commonly used in place of an X or an x because those might be confusing in algebraic equations. So, let’s apply the order of operations and then re-write the equation after substituting.

7 + 1 + 49 - 7

7 divided by 7 is 1, so I swapped in a 1 for the division operation I did. 7 multiplied by 7 is 49 so I swapped that value in. That leaves us with the equation above. There are a couple of ways to do this at this point. I could certainly go left to right and add the first three numbers then subtract the last but you may have noticed that two of them cancel each other out. If I were to re-write the equation above as follows, it quickly becomes a lot easier. 

7 - 7 + 1 + 49
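If you don’t trust me, you can let a computer check the rewrites. Python follows the same order of operations, so this little sketch evaluates the original, the substituted and the regrouped versions, and for contrast works the original strictly left to right. The variable names are mine.

```python
# The original expression; Python applies the order of operations for us.
original = 7 + 7 / 7 + 7 * 7 - 7

# After swapping in 1 for 7 / 7 and 49 for 7 * 7.
substituted = 7 + 1 + 49 - 7

# After moving the two 7s next to each other so they cancel.
regrouped = 7 - 7 + 1 + 49

# What you get if you ignore precedence and go strictly left to right.
naive = ((((7 + 7) / 7) + 7) * 7) - 7

print(original, substituted, regrouped, naive)
```

All three of the first forms agree, which is the whole point of the regrouping into 7 - 7 + 1 + 49, while the strictly left-to-right reading lands somewhere else entirely.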

This leaves me with adding 1 to 49 resulting in 50. See how easy that was? Keep in mind that the order of operations is really important. I suppose I could get into the history of why someone determined that multiplication and division were more important than addition and subtraction but it would likely bore you to tears. It would take far more of my time to come up with something coherent than I feel like putting in at the moment, so let’s skip it and move on to series. Let’s say you see the following:

10  =  50

9  =  38

8  =  27

7  =  17

5  =  ?

There are two things you should notice right away. The difference between 50 and 38 is 12. 38 to 27 is 11. 27 to 17 is 10. So, the next difference in the series should be 9 because the amount we subtract on the right hand side gets one smaller each time. Since the last difference was 10, the next difference will be 9. 17 - 9 is 8. This leads us to the next thing you really should notice. The value on the left skipped one. The value of 6 should be 8. I’m asking for the value of 5. Keep the series going. I decrease the difference on the right hand side by 1, meaning that as I decrease by one on the left, I will be decreasing by 8 on the right hand side. This means that 5 = 0. When you see a series like this, there is generally a trick. They have skipped a value out of the series. This doesn’t mean that you just assign the correct right hand value (the next one in the series) to the wrong left hand value. It means you apply the right hand series twice and assign that value to the left hand side.
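The series can be walked out in code as well. This sketch assumes the pattern described above, where the amount subtracted shrinks by one at each step, and fills in the skipped value for 6 on the way down to 5. The function name is made up.

```python
known = {10: 50, 9: 38, 8: 27, 7: 17}

def extend_series(known, target):
    """Continue the series downward until the left-hand value hits target."""
    values = dict(known)
    keys = sorted(values, reverse=True)
    diff = values[keys[-2]] - values[keys[-1]]   # last known difference (10)
    n = keys[-1]
    while n > target:
        diff -= 1                         # each step subtracts one less
        values[n - 1] = values[n] - diff
        n -= 1
    return values

series = extend_series(known, target=5)
for left in sorted(series, reverse=True):
    print(left, "=", series[left])
```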

A little bit of math, folks, will take you a very long way. I hope this has been a little bit of help. I know it’s made me feel better to share it with you.