Wednesday, November 5, 2014

Triage Any Alert With These Five Weird Questions!

(OK, so I went all "BuzzFeed" on the title.  My alternate was going to be "What kind of alert are you?  Take this quiz and find out!" so be thankful.)

Introduction

There are few things more frustrating to users than a tool that doesn't support (or is even at odds with) their processes.  Tools should be designed to support our workflows, and the more often we perform a workflow, the more important it is that our tools support it.  As analysts, our most commonly-exercised workflow is probably alert triage: the process of going through all of your alerts, investigating them, and either closing them or escalating them to incidents.  Escalated incidents lead directly to incident responses (IRs), of course, and there's not always a distinct handoff where triage ends and response begins, so some of the basic IR tasks are part of this workflow as well.

About five years ago, I was tasked with training a group of entry-level security analysts to do alert triage.  Previously, I'd just "done it" without ever thinking much about how it worked.  After mulling it over for a while, though, I realized that the entire process boils down to a set of questions the analyst needs to answer.

  1. Was this an actual attack?
  2. Was the attack successful?
  3. What other assets were also compromised?
  4. What activities did the attacker carry out?
  5. How should my organization respond to this attack?

If you start from the assumption that the analyst sitting in front of an alert console is going to be answering these five questions over and over again, it seems pretty clear that the console should make it as easy and as quick as possible for the analyst to get those answers.  We may not need to answer every one of these questions every time, nor do we always tackle them in the same order.  In general, though, this is the process we start from, adapting it on the fly to meet our needs.

I thought it might be interesting to examine these questions in a little more detail.

Was this an actual attack?

Of course, this is the first answer we need.  You could restate this question as "Is this alert a false positive?" and it would mean the same thing.  As an industry, no one has yet figured out how to eliminate all the false positive (FP) alerts while still keeping all the true positive (TP) alerts.  Given that there will be FPs (probably a substantial percentage), we need to make it as easy as possible for the analyst to distinguish between FP and TP, and to do it quickly.

The keys here are:

  • Providing the user with context around the alert (what scenario is it intended to detect, what do actual examples of TPs look like, etc.)
  • Identifying what other information (beyond what's already in the alert) the analyst needs to see, and providing quick, easy access to it (e.g., pivoting to the PCAP for an alert; one possible pivot is sketched below)
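
To make that pivot concrete, here's a minimal Python sketch.  The alert fields, the capture path and the choice of tcpdump are all assumptions for illustration, not a description of any particular console; the point is that the tool should do something like this for the analyst in a single click:

    import subprocess

    def pcap_for_alert(alert, pcap_path="/nsm/capture.pcap"):
        """Carve the packets matching a (hypothetical) alert's addresses and
        port out of a full-packet capture so the analyst can inspect them."""
        bpf = "host %s and host %s and port %s" % (
            alert["src_ip"], alert["dst_ip"], alert["dst_port"])
        out = "/tmp/alert-%s.pcap" % alert["id"]
        # tcpdump: -n = no name resolution, -r = read capture, -w = write matches
        subprocess.check_call(["tcpdump", "-n", "-r", pcap_path, "-w", out, bpf])
        return out

    pcap_for_alert({"id": 1234, "src_ip": "10.1.2.3",
                    "dst_ip": "192.0.2.10", "dst_port": 80})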

Was the attack successful?

If you've ever monitored the IDS/NSM console of an Internet-facing asset, you know that the vast majority of exploit attempts fail.  These are mostly automated scans, probes or worms.  They attack targets indiscriminately, without regard to OS, software stack or version numbers.

Unfortunately, if you alert on exploit attempts, the sheer number of alerts becomes a substantial burden on the analysts.  This is one reason I tend to focus my attention on detections further down the Kill Chain: it cuts down on the number of unsuccessful attack attempts the analysts have to wade through.

Even though we try to avoid them as much as possible, there are still plenty of situations where we are alerting on something that may or may not have been successful.  For example, if we alert on a drive-by download or a watering hole attack, the analyst then has to see whether the browser actually reached back out to download the malicious payload (one possible check is sketched after this list).  The key information they need will be much the same as before:

  • Context
  • Quick & easy access to related information
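
As an illustration, here's a minimal sketch of the "did the browser fetch the payload?" check, assuming a hypothetical CSV proxy log with timestamp, client_ip, host and status columns (your log source and schema will almost certainly differ):

    import csv
    from datetime import datetime, timedelta

    def payload_fetched(proxy_log, client_ip, evil_host, alert_time,
                        window_mins=10):
        """Find requests from the victim to the malicious host shortly after
        the alert; a 2xx response suggests the download succeeded."""
        t_end = alert_time + timedelta(minutes=window_mins)
        hits = []
        with open(proxy_log) as f:
            for row in csv.DictReader(f):
                ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
                if (row["client_ip"] == client_ip and row["host"] == evil_host
                        and alert_time <= ts <= t_end):
                    hits.append(row)
        return [h for h in hits if h["status"].startswith("2")]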

If the answer to this question is "Yes, the attack was successful," the next step is usually to escalate the alert to a full-blown incident.

What other assets were also compromised?

This is where things start to get really interesting.  Assuming the alert indicates a successful attack, you have to start doing what we call scoping the incident.  That is, before you can take any actions, you need to gather some basic information with which to plan and make response decisions.

The first step in determining the scope of the incident is to assemble a list of the assets involved.  "Assets" in this case can be any object in your organization's IT space: computers, network devices, user accounts, email addresses, files or directories, etc.  Anything an attacker would want to attack, compromise or steal is an asset.

For example, you may see an intruder try a short list of compromised usernames and passwords across a long list of hosts, trying to find out which credentials work for which hosts.  In this case, your asset list would include every one of the compromised user accounts, but also all of the hosts they tried to use them on.  If they performed this action from within your corporate network, the source of those login attempts would also be on the list.

It's also worth noting that the asset list will almost certainly change over the course of the incident.  For smaller incidents, you may be able to both assemble the asset list and filter out the assets that weren't actually compromised in a single step.  For larger incidents, it's pretty common to create the list and then parcel it out to a group of incident responders to verify.  As the analysts determine that assets are not compromised, those may drop off the list.  Similarly, as new information comes to light during the course of the investigation, new assets will probably be added.  The asset list is one of the most volatile pieces of information associated with any active incident response, and keeping it current is absolutely critical.

The keys to helping analysts deal with asset tracking are:

  • Creating a template "asset list" that can be cloned for each alert/incident to track compromise status (a minimal sketch of such a template follows this list).  At a minimum, you probably need to track the following info for each asset:
    • Asset name (hostname, username, email address, etc.)
    • Date/time the attacker was first known to interact with that asset
    • Date/time the attacker was last known to interact with that asset
    • Date/time an analyst detected the activity
    • Date/time the asset was added to the list
    • Date/time the analyst investigated the asset and determined it to be compromised or not
    • The results of that investigation (often COMPROMISED, NOT COMPROMISED, PENDING ANALYSIS)
    • Brief notes about the investigative findings 
  • Sharing this list with other responders, in such a way as to make it easy to interact with (e.g., via a wiki)
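
Here's a minimal sketch of such a template as a plain CSV file.  The field names are just one possible encoding of the list above; a wiki table or shared spreadsheet works just as well:

    import csv

    # One column per field from the list above; the names are illustrative only
    ASSET_FIELDS = ["asset_name", "first_attacker_contact",
                    "last_attacker_contact", "detected", "added_to_list",
                    "investigated", "status", "notes"]

    def new_asset_list(path):
        """Clone a blank asset-tracking list for a new alert/incident."""
        with open(path, "w") as f:
            writer = csv.DictWriter(f, fieldnames=ASSET_FIELDS)
            writer.writeheader()
            # Every asset starts out pending until an analyst rules on it
            writer.writerow({"asset_name": "EXAMPLE-HOST-01",
                             "status": "PENDING ANALYSIS"})

    new_asset_list("incident-1234-assets.csv")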

What activities did the attacker carry out?

The next big piece of scoping the incident is trying to come up with what I call the "narrative" of the attack:  What did the attacker try to do, when, and to what?  The asset list answers the "to what" question, but even that needs to be put into context with the attacker's actions and their timing.

To help answer these questions, incident responders typically create timelines of the significant events that occur during the attack and the IR.  For example, you might start with a simple chart that calls out the highlights of the incident, with dates and times for:

  • The attacker's first exploit attempt
  • All the alerts generated by the attack
  • When the alerts were triaged and escalated to incidents
  • When the affected asset(s) were contained
  • When the affected assets were remediated
  • When the incident was closed

As your investigation gathers more details about what happened, the timeline grows.  As more confirmed malicious events are discovered, and incident response milestones achieved, these are added to the timeline.  Just like the asset list I mentioned before, the timeline typically changes a lot during the course of the investigation.

Once you have gathered the timeline data, you may want to display it in different ways, depending on what you're trying to accomplish or who you're sharing it with.  For example, a table view is common when you're editing or managing the events, but a graphical or interactive visualization is a much nicer way to show the narrative to others or to include in your incident reports.  There are timelining tools available, but I recommend starting with a plain spreadsheet (or similar format) and seeing how that works before getting more complicated.

The keys to helping the response team track activities are:

  • Creating a timeline template that can be cloned for each incident (a minimal sketch follows this list).  Key fields to track here might include:
    • A description of the entry (attacker action, milestone reached, etc.)
    • Date/time that entry occurred
  • Sharing this list with other responders, in such a way as to make it easy to interact with (e.g., via a wiki)
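
Again, a minimal sketch, this time of the timeline as a plain CSV file with the two fields above.  The event description is hypothetical; using ISO-format timestamps means the entries sort chronologically with a plain text sort:

    import csv
    from datetime import datetime

    def add_event(path, occurred, description):
        """Append an attacker action or IR milestone to the incident timeline."""
        with open(path, "a") as f:
            csv.writer(f).writerow([occurred.isoformat(), description])

    def read_timeline(path):
        """Return all timeline entries sorted chronologically for reporting."""
        with open(path) as f:
            rows = list(csv.reader(f))
        return sorted(rows, key=lambda r: r[0])  # ISO timestamps sort lexically

    add_event("incident-1234-timeline.csv", datetime(2014, 11, 3, 14, 25),
              "First exploit attempt observed against WEB-01")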

How should my organization respond to this attack?

This is the big question!  Once you have assembled the list of compromised assets and the timeline of events, you exit the scoping phase and are ready to begin planning your incident response strategy.  In most cases, a typical strategy involves containment (removing the attacker's access to the compromised assets) followed by remediation (bringing the compromised assets back into a production state).  The topic of incident response strategies is far too detailed to get into here, but it's worth noting that this is not a one-size-fits-all situation.  Different types of incidents require different plans (sometimes called playbooks).  A well-managed CIRT will have a number of standard IR plans ready to go, but even these often need to be tailored to the individual incident.  And it's still not uncommon to find incidents that don't fit any of the standard plans exactly, in which case the response team needs to create an entirely new plan on the fly (using pieces of existing plans, if they're lucky).

The creation and application of these plans still requires a high degree of skill and experience, something that a lot of organizations don't have enough of in-house.  If this is an issue, you may consider engaging a consultant with experience building CIRT teams to help guide you through this process.

The keys here are:

  1. Identifying who in your organization has the experience necessary to come up with good playbooks, or getting outside help.
  2. Creating playbooks that are specific to your organization's IT environment, policies and security goals (a minimal skeleton is sketched after this list).
  3. Sharing these playbooks among all the incident responders.
  4. Training with and testing the playbooks on a regular basis.
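
For illustration only, here's a minimal sketch of what a playbook skeleton might look like as a simple data structure.  The scenario and every step are hypothetical; a real playbook would be tailored to your environment and policies:

    # Hypothetical playbook skeleton; the steps below are examples, not
    # recommendations for any specific environment.
    PLAYBOOK_MALWARE_ON_WORKSTATION = {
        "name": "Malware on user workstation",
        "containment": [
            "Isolate the host from the network",
            "Disable credentials cached or used on the host",
        ],
        "remediation": [
            "Reimage the host from a known-good build",
            "Reset affected user passwords",
        ],
        "verification": [
            "Confirm no further suspicious traffic from the host",
        ],
    }

    for step in PLAYBOOK_MALWARE_ON_WORKSTATION["containment"]:
        print("CONTAIN:", step)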

What does this all mean?

To a large extent, an organization's ability to detect and respond to security events depends on the quality of its tools.  It's not enough to just go out and buy a bunch of "solutions" if they don't support the way you work.  We need to be able to leverage our tools to get our work done faster and better.  By identifying the common workflows we go through, we can design our toolset to make us more productive, which translates into faster, more effective responses and better protection.

I know that other very experienced incident responders will read this.  I'd love to hear your feedback, so please leave a comment!

Thursday, October 23, 2014

The Defense Chain

Intro

If you're reading my blog, you're probably already familiar with the Kill Chain (KC).  Briefly, it's a generalized model of the stages an adversary has to go through to carry out a targeted attack.  It's been around for several years now, and just seems to become more popular over time.  It's a great model that captures a complex subject and presents it in a simple way.  I'm a big fan.

I've been thinking recently about the process that we as defenders have to go through to protect our networks against attacks (targeted or otherwise).  Many papers and articles have been written on the subject, and most of us probably know the highlights: we make policies, we enforce the policies using both technical and non-technical means, we monitor our networks and we respond to incidents.  The process is actually quite complex, though, and I began to think of how I would create a model to show it visually.

After a bit of pondering, it struck me that I was essentially trying to create the defender's counterpart to the Kill Chain.  That is, we already know the stages an attacker has to go through to carry out an attack; now we need a model that shows the stages a defender must go through to protect against those attacks.  One of the reasons the KC is so popular is that it's very straightforward.  I mean, literally, it's a line.  So I thought, what would it look like if I applied that to defense?

The Defense Chain

The result is what I (rather clumsily) termed, "the Defense Chain" (DC).

[Figure: The Defense Chain: Plan → Build → Monitor → Detect → Respond → Report → Improve]
Like the Kill Chain, the Defense Chain has seven phases (pure coincidence, but I admire the symmetry).  I should probably mention that the DC is not just for detection and response.  It includes all of your management and protective controls as well.  Of course, detection and response each get their own separate phases, which I think underscores their importance.  You can't get to the end of the chain if you're missing a link!

Let's examine the phases in more detail.

Plan

Before you can begin to protect your network, you first must figure out some key things, like what exactly you wish to protect and what you're trying to protect it from.  In the Plan phase, you do things like identifying your assets and creating the security and incident response policies that help protect them.  This is also where you begin to decide what types of protective controls you will need (firewalls, endpoint protection, network proxies, etc.), how you will deploy them, how you will monitor the entire system (because prevention always fails), and who's going to do all this work.

If you've ever been involved in the creation of a security program before, you'll know there are a lot of things to plan here.  So many things, in fact, that I'm not even going to try to list them all.  Just know that the planning phase is probably the most important piece of the Defense Chain, because everything else depends on it. 

Build

Compared to planning, building is often fairly straightforward.  During this phase, you assemble teams, learn skills and create or acquire the technical tools necessary to carry out your plans.  

Did you catch what I just did there?  It's vitally important that you build teams and skills *before* you try to build the technical parts of the solution.  Not everyone needs to be an expert (though you certainly need a few of those to guide you), but everyone involved needs enough background to know what they're doing and why they're doing it.

It's also worth pointing out that the "Build" phase isn't something you do once and then forget about.  Rather, you should be constantly growing your teams' skills and experience.  You also need someone looking over your controls to make sure they're operating efficiently, and to update and improve them as needed.

Monitor

The Monitor phase is where you actually operate the technical solutions and perform periodic reviews and drills to exercise your policies and plans.  This could include making sure the endpoint security solution you chose is working well, ensuring that packet loss on the NSM/ESM systems is within acceptable levels, or running tabletop incident response exercises to make sure everyone knows what to do.

This phase is probably where you spend the majority of your time.

Detect

I probably don't need to explain this phase much to my readers.  The Detect phase is where you check the output of your NSM/ESM systems, validate the alerts or do some proactive hunting through the data to find evil.  This could also include fielding user queries about "weird" things on their computers or suspicious emails they received.

Respond

This is another phase I probably don't have to explain much here.  Once you have found evil, you need to exercise those incident response plans you developed in the Plan phase.  Investigate, contain and remediate!  Kick out the bad guys and bring the affected assets back into normal operation.

Report

The Report phase is all about gathering information about your successes and failures, analyzing it to make recommendations for improvement, and communicating all of this to the right people.  Typically, reporting is a followup to an incident response, but you might also do it for other reasons (e.g., to review a red team engagement or an auditor's findings).

Not everything is a formal report, either.  Sometimes your "report" might be a presentation, a post to an internal blog, or even an email.  The key is that you communicate findings and recommendations to the right people in whatever way makes it easiest for them to digest the information.

Improve

Security programs are not static!  You need to constantly improve your skills, your tools and your procedures to keep ahead of the bad guys.  That's what this phase is all about.  After the successes, failures and recommendations have been documented and reported, you need to make sure you act on them.  So many organizations skip this step, and although that might mean less work in the short term, it makes more work in the long term as they play catch-up with threats that have advanced beyond the organization's ability to protect itself.

Conclusion

There you have it: in 10 minutes or less, my thoughts on a model of how an organization successfully defends itself against attacks.  I don't claim that the Defense Chain model contains anything new.  Rather, I hope to provide a simple visual guide to all the things you need to do, in rough order, laid out in a way that makes it easy to see how all the phases flow and work together.

I know my readers deal with these sorts of things every day.  Please, leave a comment below to let me know what you think.  I'm eager to hear comments, questions and criticisms!

Saturday, March 1, 2014

Use of the term "Intelligence" in the RSA 2014 Expo

I attended RSA 2014 this week, and one of the things that struck me was the recurrence of the term "intelligence" on many vendor booths.  I decided it would be a fun exercise to go through the expo halls to ask the vendors to clarify what their uses of the term "intelligence" meant.

Methodology

First, the parameters of the exercise.  I chose an arbitrary starting expo hall (honestly, never once in the whole week was I able to accurately remember which was North and which was South) and walked the aisles from one side to the other, examining each booth to see if they referenced either "intel" (not the chip maker) or "intelligence".  I did not consider booths that only had related terms like "information" or "sharing", nor did I consider booths for vendors I know play in the intel space but neglected to include the term in their display (I did stop at a couple of non-vendor booths that fell into this category, though.  More on this below.).

At each booth, I explained who I was and what I was doing, and asked if there was anyone there who could answer a few quick questions about their use of the term "intelligence".  Some of the vendor representatives were better equipped to answer my questions than others, but in all cases I let them decide who I should talk to, to avoid polluting the results with my own biases about who would make a "good" representative.

After establishing contact, I then pointed out their use of "intelligence" on their display and asked, "Can you explain what you mean by that?"  If their answer seemed to roughly line up with the idea of "using information to detect malicious behavior," I asked the followup questions listed below; otherwise, I thanked them for their time and ended the interview.

  • What types of information do you consider to be "intelligence"?
  • Are some types more valuable than others, either inherently or in certain circumstances?
  • How can your customers know they're getting the maximum value out of their intelligence?

Vendors

In total, I visited 10 vendors and 2 non-vendors.  In fact, I had planned to visit more, but the expo halls are so large that I didn't even complete the tour of one hall during the short amount of time they were open on Thursday (the day I conducted this exercise).  

The vendors I visited were:
  1. Webroot
  2. LogRhythm
  3. R-sam
  4. IBM
  5. NetIQ
  6. Arbor
  7. Solutionary
  8. AlienVault
  9. Securonix
  10. BAE Systems

The non-vendors I visited were special cases, because although neither mentioned "intelligence" on their displays, I felt the organizations had enough expertise in the intelligence field that their perspectives might have been useful:
  1. Homeland Security
  2. National Security Agency

Unfortunately, neither of these entities was able to discuss their thoughts about intelligence, as that was not the purpose of their booths and neither brought any experts in that area, so I include them here only as an interesting side note.

Full disclosure:  I work for the Mandiant division of FireEye, both of which are well known for their threat intelligence.  I purposely left them out of this survey.  I did this not to throw criticism on other vendors, but because I have a much deeper knowledge of what FireEye considers to be "intelligence" and I couldn't effectively include them without biasing the results towards my employer.

Uses of the term "Intelligence"


Out of the 10 vendors, I found variations of 8 different uses of the term "intelligence".

  1. Threat intelligence (4 vendors)
  2. Security intelligence (2 vendors)
  3. Identity intelligence (2 vendors)
  4. File intelligence (2 vendors)
  5. Application intelligence (2 vendors)
  6. Risk intelligence (1 vendor)
  7. Applied intelligence (1 vendor)
  8. Insider threat intelligence (1 vendor)

Most vendors stuck with one of the above, or a close rewording that meant the same thing.   One vendor (Securonix) actually used several different variations in their display.  Their representative explained this by saying "we add 'intelligence' to the end of everything."  In fact, if I had taken additional uses of "intelligence" from our conversation, I would have added several more to the list above.  This would have been breaking my own rules, though, so I omitted them.

Different meanings of "Intelligence"

The definitions of "intelligence" broke down into three categories:

  1. "Intelligence" in the sense of "doing something smart with input data to achieve a result" (i.e., what is often referred to as "analytics", "anomaly detection" or just "correlation").  Terms used this way included "security intelligence" and "risk intelligence".
  2. Enriching input data to allow security decisions to be based on more organizational context than was originally present in the data set (e.g., adding user identity information to incoming log events).  Terms used this way included "identity intelligence", "file intelligence" and "application intelligence".
  3. Consuming information about adversaries, tools or techniques and applying this to incoming data to identify malicious activity.  The term "threat intelligence" was the most commonly used phrase in this category, although "insider threat intelligence" also applied.

Readers of my blog will note that definition #3 most closely matches what I consider to be "intelligence".  It's worth noting that Chris Sanders defines #2 as "Friendly Intelligence" in chapter 14 of his book, Applied Network Security Monitoring (full disclosure: I was a contributing author, though not of that chapter).


Types of "Intelligence"

For those vendors whose use of the term "intelligence" fell in line with definition #3 above, I then asked about the types of intelligence they deal with.  By far, the most common were IP addresses and domains, though URLs were also sometimes mentioned.  The recurring ideas of "file" and "application" intelligence strongly imply the existence of file hash values as well.  I will not attempt to summarize how many vendors mentioned each type, primarily because many of the vendor representatives I spoke to either weren't willing or weren't able to go into detail about the types of intel data they dealt with.

Applying the Pyramid of Pain model to the responses shows that the respondents' use of threat intelligence falls mainly into the bottom half of the pyramid.  The few vendors that mentioned URLs may be working at least partially at the artifacts level as well.  No vendor in my survey mentioned any type of indicator that would fall into the upper levels of the pyramid ("Tools" or "TTPs").

It's interesting to note that most vendors who use definition #3 cited primarily network-based data types (IPs, domains, URLs).  A few mentioned "file" or "application" intelligence, which implies more of a host-based orientation, but no one mentioned traditional host-based indicators such as file names, registry keys or process names.  (One vendor did mention file names, but in conjunction with definition #1.)  This may indicate a gap in our industry's thinking about what types of information can be useful in detecting malicious activity, it may be a function of the types of products the vendors in this survey are selling, or it may be a combination of the two.
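
To make that concrete, here's a rough sketch (my own mapping, not the vendors') of where the indicator types the respondents mentioned sit on the Pyramid of Pain:

    # Pyramid of Pain levels, from cheapest for the adversary to change
    # (bottom) to most expensive (top).
    PYRAMID = ["hash values", "IP addresses", "domain names",
               "network/host artifacts", "tools", "TTPs"]

    # Types mentioned by respondents in this small, informal survey
    MENTIONED = {"hash values", "IP addresses", "domain names",
                 "network/host artifacts"}  # URLs count, roughly, as artifacts

    for level in reversed(PYRAMID):  # print the top of the pyramid first
        print("[%s] %s" % ("x" if level in MENTIONED else " ", level))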

Other interesting things of note

I mentioned earlier that both Homeland Security and the National Security Agency had booths, but weren't able to comment on their ideas of intelligence.  In fact, I did get one quote from the NSA representative, which I thought was interesting:  "Information doesn't become intelligence until it is useful to someone."  I interpret this to mean that the information also has to be consumable (information buried in a PDF report isn't that useful; it needs to be put into detection mechanisms).  Since there is often a lot of confusion about the difference between information and intelligence, I think this is a nice way to phrase the difference so that people can understand.

Some of the individual vendor representatives also touched on a similar theme, drawing the distinction between "information" or "facts" and "intelligence".  For example, Webroot mentioned that they have databases of "facts" like IP or file reputation, but that they also have a process that combs through those databases to find connections and correlations and place them in context with other related facts.  The output of this process is what they consider "intelligence": facts in context with each other.

The representative from Solutionary also had an interesting point of view.  He described a hierarchy of "technical indicators", which are facts about the state of something, independent of possible security concerns (e.g., "the system is out of memory"); "threat indicators", which do have security implications; and "threat intelligence", which is a combination of the two with additional higher-level context.  The hierarchy goes something like:

technical indicators < threat indicators < threat intelligence

I've seen the "threat indicators < threat intelligence" distinction before, and I think there is broad agreement on it among actual intelligence analysts, but I was unfamiliar with the concept of lower-level technical indicators, although they seem pretty obvious in hindsight.

Conclusions

This turned out to be a pretty interesting exercise.  The sample set is by no means large enough to constitute a reliable study, but I do think it has some valid things to say about our industry's approach to "intelligence" in general.  I draw the following conclusions:

  1. "Intelligence" is a buzzword that can mean anything you want it to mean.  In my sample of 10 companies, there were 3 broadly different definitions (probably more if you scope the definitions more narrowly).  That's a lot of variance given the small number of respondents.  It'd be interesting to expand this to a much larger set of vendors to see how many other definitions we could collect.
  2. There is a valid case for the concept of "friendly intelligence".  Of the three definitions, two of them actually did refer to the use of some sort of information to make it easier to detect malicious activity.  Definition #2 is what Sanders calls "Friendly Intelligence", though none of the vendors I spoke to used this term.  It does a good job of disambiguating the term "intelligence" and clearly distinguishes intelligence based on information you generate about yourself from information about your adversaries.  This is an important concept, and by naming it, we make it easier to identify and understand.
  3. We are focused on the wrong types of threat intelligence. Most of the vendors' concepts of threat intel were solidly on the bottom half of the Pyramid of Pain, which suggests that the indicators we're focused on are the ones that are the least valuable to the adversaries.  This, in turn, means that our incident detection and response operations are purely following the adversaries' lead and playing to their strengths.  Instead, we should be developing tools and techniques to allow us to develop and apply intel near the top of the pyramid, where we can increase the attackers' costs of doing business against us and make them work harder (and expend more resources) to accomplish their missions.
  4. No one has any idea if we are using intelligence effectively.  Or even what "effectively" means in this context.  Although I had three followup questions prepared (listed above), I rarely got to ask the final one.  The questions were designed to follow each other in logical succession, so if the vendor couldn't or wouldn't answer one of them, I skipped the succeeding questions.  No vendor was able to answer the second followup question, "Are some types more valuable than others?"  I was definitely not expecting anyone to parrot back the Pyramid of Pain or anything like that, but I was hoping for some indication that certain types of indicators had different characteristics in terms of false or true positives, applied to broader or narrower classes of attacks, or at least that not all types of data were of exactly equal use in detection or response.  I didn't get that from any vendor, which leads me to believe that "throw it all at the wall and see what sticks" may still be the dominant paradigm in many of today's security solutions.

I wish I could say I was more surprised by these results.  I suspected that result #1 was probably true, which was the original reason for this exercise.  If you have read my previous blog articles, you will know that I have been saying #3 and #4 for some time, and these findings tend to confirm my views.  Result #2, though, I offer as a constructive finding.  The term "intelligence" itself is neutral; it is neither malicious nor benign.  We are used to "threat intelligence" as the kind that deals with malicious activity, and there is broad acceptance of this term in the industry.  However, there is no equivalent term for information about oneself that can be used to help identify malicious activity, even though many vendors are clearly expressing this concept in their own different ways.

I had a lot of fun doing this: I got to meet a lot of new people, have some interesting conversations and even walk off some of this fantastic San Francisco food!  My thanks to all the folks I talked to while doing this research.  Perhaps I will try this again next year and see how the results compare.

Saturday, January 18, 2014

BSidesAugusta

Update 2014-03-14
Here's a link to the video: https://www.youtube.com/watch?v=SVKcFhyGqcY.

Update 2014-01-17
Holy wow, I just found this in my drafts folder!  I obviously meant to publish it a few months ago, but somehow didn't.  Normally I'd just skip it and not bother, but it does have a link to my ESM presentation video, which I think some people might find useful.  It's better if I just fess up.  Yup, I was an idiot.



Yesterday, I was lucky enough to attend the inaugural BSidesAugusta in Augusta, GA.  This was a fantastic, high-energy event with a lot of great talks.  I spoke on the Blue Team track; my talk was entitled "Enterprise Security Monitoring", and it covers not only the themes I've blogged about here already, but also a lot of other stuff I've been working on that I haven't yet had a chance to write up.

I had several people ask me if I could make my slides available, so here they are.

The talk was also recorded, so I'll post a link to the YouTube video when it's available (see the update above).  You can also expect a blog post sometime in the next few days to explain this concept a bit more.  Finally, I'll also be giving an updated version of this talk next month at BSidesDC, so if you didn't catch me in Georgia, come to Our Nation's Capital and see me there!