It's been conference season for me lately. I've attended several in the last few months, and most of them have had at least a few sessions on "hunting". Since that's a research interest of mine, of course I attended all of them that I could possibly fit into my schedule. I've gotten some great ideas from these presentations, and have begun to notice some patterns in what's being presented. I've also spoken with a number of people at the conferences who have made some variation of the remark, "Sure, we'd like to hunt, but how do we figure out where to start?"
Recently, I had the opportunity to discuss with my colleagues the common types of activities that a security team does, and how they relate to hunting. I drew from the presentations I've seen and the conversations I've had to make some generalizations about the different levels of capability within hunting organizations I've encountered. These were literally just some bullet points on a whiteboard, but one of my colleagues said something along the lines of "That's kind of like a maturity model" and that got me to thinking. I realized that I had never actually seen anything like that, but if one existed, then it'd be a useful way to measure the state of a hunting program and to help drive improvement.
So I went away and thought about it for a few days. What follows is an expansion of the original idea, which I am making available in the hopes that the community will find it useful.
What is Hunting?
Before we can talk about hunting maturity, though, we need to discuss what, exactly, we mean when we say "hunting". I usually define hunting as the collective name for any manual or machine-assisted techniques used to detect security incidents. I refer to it as a "collective name" since there are many different techniques hunters might use to find the bad guys, and no single one of them is always "right". In fact, it's good to be familiar with many different methods so you can choose the best one depending on the type of activity you are trying to find.
I also refer to hunting as being "manual or machine-assisted" as opposed to being automated. I'm a big believer in automated alerting, but it can't be the only thing your detection program relies on. In fact, one of the chief goals of hunting should be to improve your automated detection capabilities by prototyping new ways to detect malicious activity and turning those prototypes into production detection capabilities.
The Hunting Maturity Model
With that definition of hunting in mind, let's consider what makes a good hunting program. There are three factors to consider when judging an organization's hunting ability: the quality of the data they collect for hunting, the tools they provide to access and analyze the data, and the skills of the analysts who actually use the data and the tools to find security incidents.
Of these factors, the analysts' skills are probably the most important, since those skills are what allow them to turn data into detections. That's why each level of the Hunting Maturity Model (HMM) starts with a statement of the level of analytic capability that is typically found at that level.
The quality of the data that an organization routinely collects from its IT environment is also a strong factor in determining the HMM level. The more data (and the more different types of data) you provide to an expert hunter, the more results they will find. The toolset for collecting and analyzing the data is a factor as well, but a less important one. Given a high amount of analyst skill and a large amount of good quality data, it's possible to compensate for toolset deficiencies, at least to a degree. For this reason, each HMM level discusses the quality and amount of data that's routinely collected across the enterprise, but does not directly address the analysis toolset.
The Hunting Maturity Model describes five levels of organizational hunting capability, ranging from HMM0 (the least capability) to HMM4 (the most). Let's examine each level in detail.
[Figure: The Hunting Maturity Model (HMM)]
HMM0 - Initial
At HMM0, an organization relies primarily on automated alerting tools such as IDS, SIEM or antivirus to detect malicious activity across the enterprise. They may incorporate feeds of signature updates or threat intelligence indicators, and they may even create their own signatures or indicators, but these are fed directly into the monitoring systems. The human effort at HMM0 is directed primarily toward alert resolution.
HMM0 organizations also do not collect much information from their IT systems beyond the bare minimum needed to drive their automated alerting. Thus, even if they were to somehow acquire hunting expertise (perhaps by hiring a consultant or by making a strategic hire), their ability to actually hunt would be severely limited.
Organizations at HMM0 are not considered to be capable of hunting.
HMM1 - Minimal
An organization at HMM1 still relies primarily on automated alerting to drive their incident response process, but they are actually doing at least some routine collection of IT data. These organizations often aspire to intel-driven detection (that is, they base their detection decisions in large part upon their available threat intelligence). They often track the latest threat reports from a combination of open and closed sources.
HMM1 organizations routinely collect at least a few types of data from around their enterprise, and some may actually collect a lot. Thus, when new threats come to their attention, analysts are able to extract the key indicators from these reports and search their historical data to find out whether those indicators have been seen, at least in the recent past.
Because of this search capability, HMM1 is the first level in which any type of hunting occurs, even though it is minimal.
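To give a rough idea of what that minimal hunting might look like, here is a short sketch of an indicator search against historical data. Everything specific in it is hypothetical: the indicator values, the CSV log format, and the column names (timestamp, src_ip, dest_ip, dest_host) are stand-ins for whatever log store and fields your environment actually has.

```python
import csv

# Hypothetical indicators extracted from a threat report (example values only)
indicators = {"203.0.113.45", "198.51.100.7", "evil-example.test"}

def search_proxy_logs(path, indicators):
    """Scan historical proxy logs (assumed CSV with 'timestamp', 'src_ip',
    'dest_ip' and 'dest_host' columns) for matches against known-bad indicators."""
    hits = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["dest_ip"] in indicators or row["dest_host"] in indicators:
                hits.append(row)
    return hits

if __name__ == "__main__":
    for hit in search_proxy_logs("proxy_logs.csv", indicators):
        print(hit["timestamp"], hit["src_ip"], "->", hit["dest_host"])
```

The point isn't the tooling (a SIEM query would do the same job); it's that the organization has enough historical data on hand that the search is possible at all.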
HMM2 - Procedural
If you search the Internet for hunting procedures, you will find several great ones. These procedures most often combine an expected type of input data with a specific analysis technique to discover a single type of malicious activity (e.g., detecting malware by gathering data about which programs are set to automatically start on hosts across the enterprise and using least-frequency analysis to find suspicious binaries). Organizations at HMM2 are able to learn and apply procedures developed by others, and may make minor changes, but are not yet capable of creating wholly new procedures themselves.
HMM2 organizations routinely apply these procedures, if not on a strict schedule, then at least on a somewhat regular basis.
Most of the commonly available procedures rely in some way on least-frequency analysis (as of this writing, anyway). This technique is only effective if there is data from many different hosts. Therefore, HMM2 organizations usually collect a large (sometimes very large) amount of data from across the enterprise.
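As a purely illustrative sketch of the autoruns procedure mentioned above, the snippet below does least-frequency analysis ("stack counting") over collected autorun entries. It assumes you've already gathered the data into a CSV with hypothetical 'hostname' and 'image_path' columns, and the rarity threshold is arbitrary.

```python
import csv

def least_frequent_autoruns(path, max_hosts=3):
    """Count how many distinct hosts each autorun binary appears on, then
    return the rare entries -- a binary that autostarts on only a handful
    of hosts across the enterprise is worth a closer look."""
    hosts_per_image = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            hosts_per_image.setdefault(row["image_path"], set()).add(row["hostname"])

    rare = [(image, len(hosts)) for image, hosts in hosts_per_image.items()
            if len(hosts) <= max_hosts]
    return sorted(rare, key=lambda pair: pair[1])

if __name__ == "__main__":
    for image, host_count in least_frequent_autoruns("autoruns.csv"):
        print(f"{host_count:4d} host(s): {image}")
```

This also shows why the data collection matters so much at this level: stack counting only works if the stack is built from many hosts' worth of data.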
HMM2 is the most common level of capability among organizations that have active hunting programs.
HMM3 - Innovative
HMM3 organizations have at least a few hunters who understand a variety of different types of data analysis techniques and are able to apply them to identify malicious activity. Instead of relying on procedures developed by others (as is the case with HMM2), these organizations are usually the ones who are creating and publishing the procedures. Analytic skills may be as simple as basic statistics or involve more advanced topics such as linked data analysis, data visualization or machine learning. The important point is that the analysts are able to apply the techniques to create repeatable procedures, which are documented and performed on a frequent basis.
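As one hypothetical example from the "basic statistics" end of that spectrum, the sketch below flags hosts whose outbound byte counts sit far above the rest of the population, a crude but repeatable data-exfiltration hunt. The field names, the sample numbers, and the three-standard-deviation threshold are assumptions for illustration, not part of the model.

```python
import statistics

def outbound_outliers(bytes_by_host, z_threshold=3.0):
    """Flag hosts whose total outbound bytes are unusually high relative to
    the population. `bytes_by_host` maps hostname -> total outbound bytes
    over the hunt window."""
    values = list(bytes_by_host.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values) or 1.0  # avoid divide-by-zero on uniform data
    return {host: round((b - mean) / stdev, 2)
            for host, b in bytes_by_host.items()
            if (b - mean) / stdev >= z_threshold}

if __name__ == "__main__":
    # Made-up example: 19 hosts with normal traffic plus one big talker
    sample = {f"wks-{i:03d}": 2_000_000 + i * 10_000 for i in range(19)}
    sample["wks-019"] = 60_000_000
    print(outbound_outliers(sample))  # flags wks-019
```

Whether the analytic is a z-score, a clustering pass, or a trained model matters less than the fact that the HMM3 team designed it themselves and wrote it down so it can be repeated.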
Data collection at HMM3 tends to be similar to that at HMM2, though perhaps more advanced, since the continual focus on developing new techniques tends to drive the analysts into new data sources over time.
HMM3 organizations can be quite effective at finding and combating threat actor activity. However, as the number of hunting processes they develop increases over time, they may face scalability problems trying to perform them all on a reasonable schedule unless they increase the number of available analysts to match.
HMM4 - Leading
An HMM4 organization is essentially the same as one at HMM3, with one important difference: automation. At HMM4, any successful hunting process will be operationalized and turned into automated detection. This frees the analysts from the burden of running the same processes over and over, and allows them instead to concentrate on improving existing processes or creating new ones.
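To make "operationalized" a little more concrete, here is a minimal, hypothetical sketch: the rare-autoruns hunt from the HMM2 example wrapped in a job that runs unattended and emits alert records for analysts to review. The module name, alert format, and scheduling mechanism are all assumptions; in practice this might be a SIEM rule, a SOAR playbook, or a cron job.

```python
import json
import time

# Hypothetical module containing the least_frequent_autoruns() hunt sketched earlier
from autorun_hunt import least_frequent_autoruns

def run_rare_autorun_detection(data_path="autoruns.csv", alert_path="alerts.jsonl"):
    """Run the day's collected autorun data through the hunt logic and write
    one alert record per rare binary, so analysts review alerts instead of
    re-running the procedure by hand."""
    findings = least_frequent_autoruns(data_path, max_hosts=3)
    with open(alert_path, "a") as out:
        for image_path, host_count in findings:
            out.write(json.dumps({
                "ts": time.time(),
                "rule": "rare-autorun-binary",
                "image_path": image_path,
                "host_count": host_count,
            }) + "\n")

# Example scheduling via cron (or whatever your environment prefers):
#   0 6 * * * /usr/bin/python3 /opt/hunts/rare_autorun_detection.py
if __name__ == "__main__":
    run_rare_autorun_detection()
```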
HMM4 organizations are extremely effective at resisting adversary actions. The high level of automation allows them to focus their efforts on creating a stream of new hunting processes, which results in constant improvement to the detection program as a whole.
Automation and the HMM
It may seem confusing at first that the descriptions for both HMM0 and HMM4 have a lot to say about automation. Indeed, an HMM4 organization always has automation in the front of their minds as they create new hunting techniques. The difference, though, is that HMM0 organizations rely entirely on their automated detection, whether it’s provided by a vendor or created in house. They may spend time improving their detection by creating new signatures or looking for new threat intel feeds to consume, but they are not fundamentally changing the way they find adversaries in their network. Even if they employ the most sophisticated security analytics tools available, if they are sitting back and waiting for alerts, they are not hunting.
HMM4 organizations, on the other hand, are actively trying new methods to find the threat actors in their systems. They try new ideas all the time, knowing that some won’t pan out but others will. They are inventive, curious and agile, qualities you can’t get from a purely automated detection product. Although a good hunting platform can certainly give your team a boost, you can’t buy your way to HMM4.
Using the HMM
A CISO who's heard that her organization needs to "get a hunt team" may legitimately be convinced that an active detection strategy is the right move, and yet still be confused about how to describe what the team's capability should actually be. It is my hope that by providing a simple maturity model, anyone thinking of getting into hunting will be able to get a good idea of what an appropriate initial capability would be (I recommend HMM2 to start with, BTW).
Perhaps more importantly, for organizations that already hunt, the HMM can be used both to measure their current maturity and to provide a roadmap for improvement. Hunt teams can match their current capabilities to those described in the model, then look ahead one step to see ideas for how they can develop their skills and/or data collection abilities in order to achieve the next level of maturity.
Although there's starting to be a lot of good hunting information available in the open literature, I believe that much of the confusion over what hunting is and how you should do it comes down not to processes but to the lack of a publicized hunting framework. In order to get anywhere, you must first know where you are and where you want to be. I hope the HMM will be a useful tool to help us all figure these things out.
Thank you for this. We're currently building hunt capability (we'd be in HMM1, prepping to go to HMM2, by your model), and were just sort of shooting in the dark regarding what we thought we needed/wanted. This is a great framework!
Excellent! I'm glad this was helpful to you. If you are able to discuss it, I'd love to hear more about your experiences building your capability!
Fantastic write-up. Thanks for sharing.
This is good, but more examples would help with understanding. For example, what other analytic approaches besides frequency analysis could be automated? Maybe you are talking about building a Spark ML pipeline to run analytics continually over logs. In that case, we'd need a description of the regression and classification models that fit cybersecurity data. I am not sure cybersecurity analysts have an understanding of these statistical techniques, so there is a need to pair up data scientists with cyber analysts.
Hi, Walker. Frequency analysis (AKA "stack counting") in hunting is usually used as a form of easy outlier detection, so from that we can point to other similar techniques that can also be automated. But whatever the implementation, it doesn't always have to be anything complicated. In fact, often the result of a successful data analysis hunt can be reduced to something very much like a signature ("If <this field> is within <this set of values> and <that field> is within <that set of values>, then create an alert").
I think the best question here is not *what* can be automated, but more like what's the *best way* to automate, given what you've found in your hunt.
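As a hypothetical illustration of what that kind of distilled, signature-like rule might look like in code (the field names and values below are made up for the example):

```python
def check_event(event):
    """Hypothetical hunt result reduced to a signature-like rule: if the
    process name is on a watch list AND the destination port falls in a
    suspicious range, raise an alert."""
    watched_processes = {"powershell.exe", "wscript.exe"}
    suspicious_ports = range(4000, 5000)
    if event["process_name"] in watched_processes and event["dest_port"] in suspicious_ports:
        return {"alert": "suspicious-process-network-activity", "event": event}
    return None
```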