Wednesday, August 18, 2010

Better Security Through Sacrificing Maidens


Contributed by: Pete Herzog


I began this as an answer to some questions, but then I realized I will never successfully explain the OSSTMM 3, the security metrics known as ravs, and trust metrics if I only answer the questions asked. I need to address this properly by explaining the background as well, because the OSSTMM 3 is apparently very different from what most people expect of a professional security model, or even from what they think security is.

I think the problem people have with the OSSTMM 3 is that they expect certain things to be required or necessary in security, and they just don't find them there. They think estimating attack frequency, attack types, and vulnerability impact are all needed to properly and successfully defend themselves. But those things aren't used in the OSSTMM to build "good enough" security (except in very special cases of physical and wireless security verification testing). This leads people to think the methodology is missing something or is just wrong.

We all see people who say that security is about process, and we see them fighting a losing battle. We see them just do more of what the compliance requirements, books, and blogs tell them to do, and it's not working, or at least it's not scaling. The problem is that we are being taught to build defenses like consumers, and it isn't working.

That's why we took a different direction with the OSSTMM 3. If we keep doing what we know doesn't work even "good enough", why keep doing it? It wasn't until we accepted that there are things we can never reliably know that we realized we had better find the limits of what we did know. Then at least we'd have that going for us. For example, we know that we can't reliably determine the impact of a particular vulnerability for everyone in some big database of vulnerabilities, because impact will always depend on the means of interaction and the functioning controls of the target being attacked. But we do know how a particular vulnerability works and where. That means we needed a way to categorize and rate vulnerabilities not on some arbitrary weight of potential impact but on what they do. Then, for anyone whose operations match where the vulnerability lives and who is missing the controls that would contain or stop it, we truly know the impact will be greater than zero. By focusing on operations, we can devise tactics to respond to them.
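To make that rule concrete, here is a minimal sketch in Python (not from the OSSTMM itself; all names are hypothetical) of the logic just described: impact is only possible when a vulnerability's interaction point exists in the target's operations and the controls that would contain it are absent.

```python
# Minimal sketch (hypothetical names, not an OSSTMM artifact): impact is
# non-zero only when the vulnerability's interaction point is present in
# the target's operations AND the containing controls are missing.

def impact_is_possible(target_operations: set[str],
                       target_controls: set[str],
                       vuln_operation: str,
                       containing_controls: set[str]) -> bool:
    """True only if the target exposes the operation the vulnerability
    needs and lacks every control that would contain or stop it."""
    exposed = vuln_operation in target_operations
    uncontained = not (containing_controls & target_controls)
    return exposed and uncontained

# Example: an exposed SSH service with no subjugation or continuity controls.
print(impact_is_possible({"ssh", "http"}, {"authentication"},
                         "ssh", {"subjugation", "continuity"}))  # True
```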

Next we realized we had to look for the security particle. What can we use to make security? Where is the security equivalent of materials science? How can we reliably build a strong defense if we don't know what it even means (or, more interestingly, how the hell are we selling it if we don't know what it means)? So we needed to do some serious fact finding. We needed ground rules that we know we can use as a solid foundation. For example, we know that there are only 10 types of operational controls which can be applied: 5 which protect through interaction with the threat and 5 which don't. We know that authentication will ALWAYS fail if either authorization or identification is stolen or misappropriated. We know that there are only 2 ways to take a physical asset: you either take it or you have it given to you. We know that operations require interactions with something, and that something can be malicious. So we designed a way to reliably verify what we know and organize the information into intelligence.
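For reference, here is a quick sketch of the ten controls by class; check your copy of the OSSTMM 3 for the formal definitions.

```python
# Sketch of the two OSSTMM 3 control classes (verify names against the
# manual). Class A controls work through interaction with the threat;
# Class B controls protect without interacting with it.

INTERACTIVE_CONTROLS = {   # Class A
    "authentication", "indemnification", "resilience",
    "subjugation", "continuity",
}

PROCESS_CONTROLS = {       # Class B
    "non-repudiation", "confidentiality", "privacy",
    "integrity", "alarm",
}

ALL_CONTROLS = INTERACTIVE_CONTROLS | PROCESS_CONTROLS
assert len(ALL_CONTROLS) == 10
```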

Now that we were fact-finding, we found that much of what was assumed to be fact, and turned out to be false, came from opinions handed down by authority. As a matter of fact, did you know that there's a huge, common body of security knowledge out there built mostly on anecdotal evidence and authoritative opinions passed around via transitive trust (X trusts Y, and I trust X, so I can trust Y) that is used as if it's all true? I know, I am shocked as well! All this led to a general hack and slash through OSSTMM 2, leaving it as hollow as a pun at a funeral. We needed to start over using only the facts.

As we built the new OSSTMM as version 3, we began presenting and teaching these facts. I won't lie to you and tell you it was as pretty as a royal wedding in June. There was, ummm, "resistance". The consensus was that you can't deny that some attacks are more persistent, more threatening, and more damaging than others. We didn't deny it. But the security industry wants you guessing how criminals are going to attack, which is often a psychological exercise of "thinking like a criminal" performed by people with nice homes, nice jobs, and a good night's sleep last night. Did you know you can even be certified as able to think like a hacker because you use the same tools as them? I know, I am shocked as well! They like to tell you that criminals follow a pattern, but they really don't (see the Hacker Profiling Project for evidence of that). What we were seeing were the inherently unqualified opinions present in Risk marketed as fact within the security industry. Risk is a real thing. It exists. However, the results of determining Risk are often made up.

Insurance companies use mountains of historical data to reduce risk. Wall Street uses mountains of trends, current to the most recent second, to reduce risk. Casinos use predetermined probabilities to reduce risk. As it turns out, the security industry uses quick response to reduce risk. Whether it comes from attacking our own software in vulnerability research, using AntiVirus to show us what's infecting us, or any of the hundreds of other ways we have to show us we've been hit, security is an industry that uses current losses to protect future investments. Not only is that pretty dangerous, it's a horrible case of tunnel vision, because it leads to defenses against specific attacks which have already happened. So the typical enterprise security today is properly prepared to sacrifice something to an attacker now so it will be 100% prepared against the same attack later.

For this backwards method we have to thank all those who think they should use Risk in the security industry. They don't realize it can't work the way it does in other industries. For example, in security, the types and areas of attack change with technology, so historical data of the kind Insurance companies rely on is just not relevant. Unlike Wall St., we can't watch all the current trends with enough insight to know for certain what the next attack will be, or with enough speed to react before it hits. That doesn't stop security from making it look like it can: researchers secretly tell software companies about holes, the companies release patches, security experts predict those patches will get reverse-engineered and turned into exploits, and then they are lauded for their predictive prowess by their followers when it inevitably happens (anywhere, on any scale). And you maybe wondered why some security researchers get ticked off when you do full disclosure: because you're SOC-blocking their moves!

Another valid point is that Wall St. races other people to jump on and off trends, whereas we need to race packets which travel at the speed of light. This also makes me wonder if the people who bought into "real time network monitoring" heard the fable of the tortoise and the hare so often as children that they took it literally (or never turned on a light switch). Finally, we also don't have the luxury of allowing some big losses like casinos do, where the odds are fixed and the house just hopes to survive the heavy hits because it will win in the long run (although it looks like some government departments are actually trying this).

Now some of the Risk analysts within the security industry tell us that the problem isn't that we can't predict attacks but that there are too many data points right now to reliably guess the future. Basically, we need to get better at guessing. They say we need better models so we can better forecast the problems. I see this approach in the other industries, and I don't need to tell you how poorly it prevents financial meltdowns on Wall St., how exclusionary the guidelines are for getting pay-outs from Insurance companies, or how many lives are destroyed through gambling addictions at casinos. The truth is that in every other industry using Risk there has to be a loser. And the loser, unfortunately, isn't the attacker. It's one of us. It's one of the ones we should be defending. It's like the story where the king feeds a maiden to the dragon every full moon to protect the rest. The dragon isn't losing. Sacrificing some of a town's denizens so the others can survive is not the way to keep the town safe. What happens when another dragon shows up? And werewolves? And then the people turn into zombies? Threats change and come from unexpected places. The worst way to handle threats is to try to estimate them out of existence with Risk, because it allows you to ignore some of the impact as inconsequential to the greater, or more selfish, number of beneficiaries. If you remember the story, the king didn't like it too well when it was his own daughter who was fed to the dragon.

When we look at why we need Compliance, it's because of selfishness. Businesses put their profits above defending their customers and business partners. Interestingly, the Compliance rules themselves are written for the greater good, which means some companies won't be able to afford the required products and therefore can't do business their way online. So the rules need to be lax enough that only an acceptable number of companies can't afford them. Still, some of those who can't afford them will try to circumvent the rules to stay in business. But the Risk estimates will have considered this and will make sure that only an acceptable number of people get hurt by those companies. What you have here is the use of Risk to further manage Risk, and it's not working. We're just feeding the dragon.

At ISECOM we saw that what we needed was a way to create security so that the only loser would be the attacker. That meant we had to do it without regard to the type of attacker, their motives, or the probability that they will only want to eat a maiden during the full moon. That's how we learned that you don't even need to know what the threats are, or might be, to defend against them reliably. That's the funny thing, because you are protecting against the unknown anyway. And if you don't need to know the threats, then you don't need to know the impact of a particular threat or the result of a particular vulnerability either. You just need to know what limits your controls have and which operations are interactive with which parties. This isn't us saying that Risk goes away, no, not at all, but what we are not doing is looking for acceptable or "good enough" security at the expense of our own. So we do not use Risk to build our security. Instead we suggest you use the facts we know about security and the facts that give us reason to trust.

To build and verify security without using Risk, you need to learn the three main tools in the OSSTMM 3. Without them, you won't be able to do it successfully. The good news is you won't have to rebuild from scratch: you just need to verify and categorize what you have and how it works.

The three tools are operational security metrics, trust metrics, and an OSSTMM 3 test. All three come from the same research, but each provides different intelligence. This leads people to get confused and find the whole thing overly complicated, apparently worse than guessing, which is easy to do but nearly impossible to do consistently right. (Then again, in security these days being right isn't as important as showing your work, because failing through status quo also counts as success in this screwed-up security culture, with its acceptable CYA phrases like "If an attacker wants in, they'll get in no matter what" and "There's no such thing as perfect security.")

The OSSTMM 3 test provides the following intelligence (a sketch of how it might be organized follows the list):

1. What the scope is and which targets were tested,
2. What the test type and vector are,
3. Classification and enumeration of interactive points (operations),
4. Classification and enumeration of operational controls,
5. What types of tests were NOT performed on the scope,
6. What the limitations of the controls are,
7. Which operations do not work as expected (these usually provide additional, unwanted, or unknown interactive points).
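To make that list tangible, here is a hypothetical container for those seven intelligence points; the field names are illustrative, not an official OSSTMM schema.

```python
from dataclasses import dataclass, field

# Hypothetical container for the seven intelligence points above.
# Field names are invented for illustration only.

@dataclass
class OsstmmTestResult:
    scope: str                      # 1. the scope tested
    targets: list[str]              # 1. the targets within that scope
    test_type: str                  # 2. e.g. "black box"
    vector: str                     # 2. direction the test came from
    operations: dict[str, int]      # 3. interactive points, classified and counted
    controls: dict[str, int]        # 4. operational controls, classified and counted
    untested: list[str] = field(default_factory=list)     # 5. tests NOT performed
    limitations: list[str] = field(default_factory=list)  # 6. control limitations
    anomalies: list[str] = field(default_factory=list)    # 7. operations behaving unexpectedly
```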

That is what you need to know in order to calculate the Attack Surface, for which we use the ravs: a measurement, like mass, which shows the balance between controls, limitations, and operations. The really good thing about ravs is that they are not weighted. The values of particular vulnerabilities do not come from someone's assumptions about impact but from which interactions you allow and which controls you have in place to mitigate damage. This means ravs can be compared regardless of target types or scope.
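The official rav calculation is more involved than this (it is logarithmic and distinguishes control classes and limitation types, so use the rav calc sheet for real numbers); the toy sketch below only illustrates the unweighted balance idea: every operation, control, and limitation counts by what it is, not by a guessed impact.

```python
# Toy illustration of the balance idea ONLY; this linear formula is
# invented and is NOT the OSSTMM 3 rav calculation. No weighting:
# counts of operations, controls, and limitations drive the result.

def toy_attack_surface_balance(operations: int, controls: int,
                               limitations: int) -> float:
    """Positive means coverage outweighs exposure; negative means the
    attack surface is larger than what the controls cover."""
    if operations == 0:
        return 0.0  # nothing interactive, nothing to balance
    return (controls - limitations - operations) / operations

print(toy_attack_surface_balance(operations=12, controls=10, limitations=4))  # -0.5
```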

The OSSTMM test results are designed to give the analyst a lot of different information, clearly. What they won't do is tell you which kinds of attacks are coming, how often, from where, or what the financial loss of an attack will be. But you now have much more exact information with which to calculate those things if you want to, because you know exactly how vulnerable each system is, alone and collectively: the points of failure, the only places where an attack can be made, the missing controls, and the redundant, useless controls. You also know what wasn't verified and is therefore unknown.

You might not know if an exploit will happen, but if it can, you'll know the paths it can take, which servers or services will succumb to which types of attack (and therefore the only types of attack you can expect to get through), and which will not, because they have the right controls.

Now, to better organize that information, we have the STAR and we have the rav calc sheet in OpenDocument format and in XLS format.

The STAR allows you to give your client a new type of overview which shows exactly what is deficient, where, and why. It shows which tests were not done and why. It allows for future comparisons with other tests from other consultants. It allows for continuous internal verification and measurement of change or improvement. It allows a business to manage security based on need instead of speculation. A business could therefore address Compliance by maintaining a particular rav percentage instead of deploying particular products. It would turn an enterprise's security culture from a reactive, consumer one into a preventative, resourceful one.

The rav calculation sheet is how the Analyst organizes the information from the OSSTMM 3 test. A security test may require multiple rav calc sheets, as a new one is suggested for each change in vector, channel (physical, wireless, data networks, etc.), or type of test (black box, gray box, reversal, etc.). These can later be combined in aggregate for a "big picture", but for analysis purposes it is easier to keep them separated. This sheet will let you easily see what needs controls, which controls are redundant, and which services should be closed. One of the more interesting things you'll see when you use it is how narrow the controls are in the modern "secure" network. Sure, it's defense in depth, but that doesn't help you when you're protected by the same type of control all the way to the core: bypass one and you bypass them all. Almost all modern security is focused on Authentication, which is interesting because the identification process everywhere is pretty bad, and on the Internet it's downright awful. Next, you'll see some Confidentiality, because of all the encryption being built into protocols by default. However, it's Alarm that is the most prevalent control, because modern network security is reactive. It's all about waiting for the dragon to show up and feed on one of your maidens before alerting the rest of the town that the dragon came back.
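A minimal sketch of that bookkeeping, with invented keys and counts: one sheet per (channel, vector, test type), aggregated only at the end for the big picture.

```python
from collections import defaultdict

# One "sheet" of counts per (channel, vector, test type); all keys and
# counts here are invented for illustration.
sheets = defaultdict(lambda: {"operations": 0, "controls": 0, "limitations": 0})

sheets[("data networks", "internet", "black box")]["operations"] += 12
sheets[("wireless", "on-site", "gray box")]["operations"] += 3

def aggregate(sheets):
    """Combine per-sheet counts only when the big picture is needed."""
    total = {"operations": 0, "controls": 0, "limitations": 0}
    for counts in sheets.values():
        for key in total:
            total[key] += counts[key]
    return total

print(aggregate(sheets))  # {'operations': 15, 'controls': 0, 'limitations': 0}
```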

The rav calc sheet can be as granular as you want: the SCARE project shows how to use it with source code, and the companies who use it to measure web app attack surfaces work by the interactive points in the web app itself. So you get the info you need at your fingertips to make bigger, better decisions. One of the handy things is placing monetary values on the server, service, app, or whatever, based on business process requirements. These give you historical business data from which to make forecasts and compare against what that server, service, or app cost to make and what it costs (perhaps annually) to keep running and controlled. This sheet can be your sandbox. Right on the sheet you can play war games: close services, add the results of products you haven't bought yet, see what happens when a particular service is compromised or denied, and watch how much the attack surface changes before you physically change a single thing on your servers. That rav delta can then be assigned a value based on the operating costs and income of the business processes it is a part of, to see if the new product gives enough bang for the buck.
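Here's a hypothetical war game in that spirit, using the same invented linear balance from the earlier sketch (a real analysis would use the official rav calc sheet); the figures are made up to show how a rav delta can be priced.

```python
# Hypothetical war game: close one porous service on paper, recompute the
# toy balance, and price the change. The linear formula and all figures
# are invented for illustration, not the real rav math.

def toy_balance(operations: int, controls: int, limitations: int) -> float:
    return (controls - limitations - operations) / operations if operations else 0.0

before = toy_balance(operations=12, controls=10, limitations=4)
after  = toy_balance(operations=11, controls=10, limitations=3)  # one service closed

delta = after - before
annual_cost_of_change = 2_000  # invented: cost to migrate users off the service
print(f"balance {before:+.2f} -> {after:+.2f} (delta {delta:+.2f}), "
      f"cost {annual_cost_of_change} to buy that delta")
```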

Now, trust metrics are almost a different beast. They relate to the OSSTMM in that the factual information you get from verification can be used in the trust rules you generate to make a decision. Trust metrics help fill the gap in the OSSTMM by helping you understand what cannot be verified or known, by having you examine what your reasons to trust something are. You apply trust metrics when you need to know how to approach the unknown. In that way they are similar to Risk, but the similarity stops there. They let you compare what you have and know against degrees of what you don't know in an even fashion. By looking only at what reasons you have to trust something new, you avoid false speculation, something human beings are notoriously bad at.

For example, you would use trust metrics to determine whether a new partner network should be connected to your own. Or how much access to give the visiting consultants. Or whether you can depend on that new cloud provider. You could get rav scores for each network, but that won't help you if they are secure against the world yet malicious to you. So you use the trust metrics to determine how much reason you have to trust them, and why. The properties you measure them against can be found here. The process has you evaluate reasons to trust against 10 non-fallacious rules and shows you which reasons to trust are strongest and which are weakest. Hopeless romantics beware: it may cause uncomfortable flashes of reality.
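A sketch of the mechanic only: the reason names and scores below are invented placeholders, not the published trust properties, which you should take from the actual list.

```python
# Invented reasons and scores for a prospective partner, each in [0, 1].
# These are placeholders, NOT the OSSTMM's trust properties.
reasons_to_trust_partner = {
    "verified_security_test": 1.0,  # we saw their test results ourselves
    "contractual_penalties": 0.7,   # offsets exist but are capped
    "shared_history": 0.4,          # only one prior engagement
    "reputation": 0.1,              # hearsay, transitive trust
}

def trust_summary(reasons: dict[str, float]) -> tuple[float, str, str]:
    """Average the scores and surface the strongest and weakest reasons."""
    avg = sum(reasons.values()) / len(reasons)
    strongest = max(reasons, key=reasons.get)
    weakest = min(reasons, key=reasons.get)
    return avg, strongest, weakest

avg, strongest, weakest = trust_summary(reasons_to_trust_partner)
print(f"avg={avg:.2f}, strongest={strongest}, weakest={weakest}")
```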

The end effect of trust metrics is that if you did this for each partner, you could create a framework contract that specifically addresses the weak trust areas to create greater assurance. Or you could say no and show them what you need before you say yes. Or you could make the financial rewards more substantial for yourself, or the penalties higher for them. Or you could just give them less access with greater controls, if you want to be politically correct about the whole thing. With trust metrics you act and protect to an acceptable level of interaction rather than an acceptable level of loss. What you definitely don't need to do is take a chance based on an estimate of the acceptable number of systems which could fall to a malicious attacker. Because that, again, would be just feeding the dragon.

Hopefully I've explained clearly why we did what we did with the OSSTMM 3. Combining the OSSTMM 3 verification results with ravs and trust metrics lets you build stronger infrastructures by showing you where you are strong against everything you have no reason to trust.

Now, whether or not you agree with what is said here, and some may have fundamental problems with our reasons for taking the OSSTMM 3 in the direction we have, you cannot dispute the value of the information provided by an OSSTMM 3 test. Some of you may be wondering what the Risk would be in giving up on Risk to try such a strange new method. Only you can answer that. Only you know whether your Risk method of security will scale indefinitely with you, whether the costs of speculation and of response products and processes are greater than your actual losses, and whether you have enough maidens in your organization to feed all the dragons who show up during the full moons.
