Hi everyone.
1
What does a strategy for data analytics look like that can win in the often chaotic
reality of the business environment?
2
While I can’t promise this talk will provide a definitive answer, hopefully it will offer
some answers, and if not, at least some inspiration about what to think about, and a
snag list of things to avoid.
3
Security leaders and executives globally face 3 challenging questions:
1. What’s our business risk exposure from cyber?
2. What’s our security capability to manage that risk?
3. And based on the above, what do we prioritize?
4
By applying data analytics to security, we can provide the meaningful, timely, accurate
insights that security leadership need to do 2 things:
1) Support their colleagues in other teams, so they have the information to make
robust, defensible risk decisions; and
2) Gain the evidence we need to justify improvement where it matters most –
hopefully at the best possible cost
5
That’s easy to say, but harder to do.
Because data analytics is a multi-dimensional problem space with a lot of moving
parts.
At the data layer, we have technologies that provide the output we need for analytics.
But even for one type of technology, like anti-virus, we may have multiple vendors in
place, outputting data in different structures, with varying coverage. They can also be
swapped out from one year to the next (the sketch after these notes illustrates what
normalizing that output involves).
At the platform layer, how do we marshal our data so that we can deliver the
analytics that meet user need?
At the analysis layer, what techniques are available to us, and how repeatable are
they (or can they be) for the scale and frequency of analysis we want to run?
At the insight layer, how can we manage the fact that one question answered often
leads to many harder questions emerging, with an expectation for faster turn-
around?
6
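To make that data-layer point concrete, here is a minimal Python sketch of normalizing anti-virus output from two vendors into one schema. The vendor field names are entirely hypothetical; the point is simply that this mapping has to exist, be owned, and be maintained as products get swapped out.

from datetime import datetime, timezone

# Hypothetical field names for two AV vendors; real products will differ.
def normalize_vendor_a(event: dict) -> dict:
    return {
        "host": event["hostname"].lower(),
        "os": event["os_family"],
        "signature_age_days": event["days_since_signature_update"],
        "last_seen": datetime.fromtimestamp(event["last_checkin_epoch"], tz=timezone.utc),
        "source": "vendor_a",
    }

def normalize_vendor_b(event: dict) -> dict:
    return {
        "host": event["device"]["name"].lower(),
        "os": event["device"]["platform"],
        "signature_age_days": event["definitions"]["age"],
        "last_seen": datetime.fromisoformat(event["last_seen"]),
        "source": "vendor_b",
    }

# One normalized stream means downstream analytics don't care which vendor
# is deployed where, or whether one is replaced next year.
def normalize(events_a: list[dict], events_b: list[dict]) -> list[dict]:
    return [normalize_vendor_a(e) for e in events_a] + [normalize_vendor_b(e) for e in events_b]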
At the communication layer, how do we make insight relevant and accessible to our
multiple stakeholders, based on their decision making context and concerns? And
how do we make any caveats clear at the strategic, operational and
tactical levels?
And lastly, provability. How do we win trust when so much analysis that leaders have
seen is wrong?
6
Taking on multidimensional problems always has a high risk of resulting in a
resume-generating event.
So if we know all this, we may be tempted not to try.
7
Because if we do try – and precisely because the problem is complex – we will likely
fail forward on the way to success.
8
To our investors, the teams funding our projects, this is what failure looks like.
- An increasing amount of spend
- Very little visible value
- Someone making a heroic effort to save the situation
- Then getting frustrated with the politics they run into
- And leaving
9
In many cases, this happens because security data analytics efforts are “Death Star
projects”.
They are built on big visions (and equally big promises), which require large teams
to successfully do things they haven’t done before, coordinating lots of moving
parts over a long period of time.
And sometimes these visions aim to tackle problems that we don’t know we can
solve, or that, even if solved, may not be the most important problem we have.
10
This cartoon sums up a lot of ‘blue sky thinking’ threat detection projects, which
often turn into sunk costs that the business can’t bear to put an end to because of
the time and money they’ve ploughed into them.
11
With any Death Star project, careful thought is needed about the legacy that other
teams will have to pick up afterwards.
12
But the same is also true at the other end of the spectrum, where it’s easy to end up
spending a lot of money on ‘data science noodling’, which doesn’t provide
foundational building blocks that can be built upon.
We hire some data scientists, set them to work on some interesting things, but
eventually they end up helping us fight the latest fire (either in security operations or
answering this week’s knee-jerk question from the Executive Committee), rather than
doing what we hired them for.
And while artisanal data analytics programs definitely have short-term value, they also
have 2 core problems: 1) they aren’t scalable; and 2) their legacy doesn’t create the
operational engine for future success.
13
Although the actual amounts can differ, both the Death Star and artisanal model can
lead to this situation.
And in security, this isn't an unusual curve.
For many executives it represents their experience across most of the security
projects they’ve seen.
14
So how do we bend the cost-to-value curve to our will?
Not only in terms of security data analytics programs themselves, but also in terms of
how security data analytics can bend the overall cost curve for security by delivering
the meaningful, timely, accurate information that leadership need.
15
Ultimately, what we need to win in business is to win in a game of ratios.
This means:
- minimizing the time where value is less than spend
- maximizing the amount of value delivered for spend; and
- making value very visible to the people who are funding us
16
The conclusion of today’s talk is that we can only achieve this in the following way:
1) Start with a focus on sets of problems, not single use cases – and select initial
problem sets that give us building blocks of data that combine further down the line
to solve higher order problems with greater ease.
17
2) Take an approach to these problem sets that stacks Minimum Viable Products of
‘data plus analytics’ on top of each other.
18
And 3), approach problem sets in a sequence that sets us up to deliver greatest value
across multiple stakeholders with the fewest data sets possible.
19
In summary, this means that the battleground we select for our data analytics
program needs to be here.
20
So, onto Act 1 of this talk.
21
22
Let’s imagine we work at ACME Inc., and our data analytics journey so far looks like
this.
Years ago, we invested heavily in SIEM - and discovered that while it was great if you
had a small rule set and a narrow scope, it quickly became clear upon deployment
that the bigger dream would be unattainable.
As we moved from ‘rules’ to ‘search’, we invested in Splunk, only to run into cost
problems that inhibited our ability to ingest high-volume data sets.
To manage a few specific headaches, we purchased some point analytics solutions.
And then spun up our own Hadoop cluster, with the vision of feeding these other
technologies with only the data they needed from a data store that we could also use
to run more complex analytics.
23
In meta terms, we could describe our journey as an oscillation between specific and
general platforms …
24
… as we adapted to the changing scale, complexity and IT operating model of our
organisation.
25
Let’s zoom in on our latest project, and walk through some scenarios that we may
have experienced …
26
… as we moved from build …
27
… to ingest …
28
… to analysis …
29
… and finally, insight.
30
And let’s imagine, not unreasonably, that our ‘value delivered’ curve across our
scenarios looks like this.
31
Scenario 1.
32
We’ve built our data lake, ingested data, done analysis, and delivered some insight –
so we’re feeling good.
33
But now we’ve run into problems.
34
And we’re paid a visit by the CFO’s pet T-Rex.
What’s gone wrong?
35
Well, as the questions people asked us got harder over time, at some point our ability
to answer them ran into limits.
36
The first problem we had was the architecture of our data lake.
It’s centrally managed by IT, and it doesn't support the type of analysis we want to do.
Business units using the cluster care about one category of analytics problem, and
the components we need to answer our categories of analytics problems aren’t in IT’s
roadmap.
We put in a change request, but we’re always behind the business in the priority
queue – and change control is taking a long time to go through, as IT have to work
out if and how an upgrade to one component in the stack will affect all the others.
Meanwhile, the business is tapping its fingers impatiently, which means as a stop gap
we're putting stuff in Excel and analysis notebooks … which is exactly what we
wanted to avoid.
37
Fortunately we got that problem solved, but then we encountered another. Now that
we’re generating insights, we’re getting high demand for analysis from lots of
different people.
In essence, we’ve become a service. Everyone who has a question realizes we have a
toolset and team that can provide answers, and we’re getting a huge influx of requests
that are pulling us away from our core mission.
Half our team are now working on an effort to understand application dependencies
for a major server migration IT is doing. We're effectively DDoSed by our success, and
have to service the person who can shout the loudest.
We need to wrap a lot of process around servicing incoming requests, but while we’re
trying to do that, prioritization has run amok.
38
To try and get a handle on this, we called in a consultancy, who’ve convinced us that
what we need to do is set up a self-service library of recipes so people can answer
their own questions.
We’ve built an intuitive front-end interface to Hadoop, but we've quickly discovered
that with the same recipe, two people with different levels of skill can make dishes
that taste and look very different.
Now we're in a battle to scale the right knowledge for different people on how to do
analysis, to avoid inaccurate insights being presented.
39
We're also finding that, as we deal with more complex questions, what we thought
was insight is not proving that valuable to the people consuming it.
Our stakeholders don’t want statuses or facts; they want us to answer the question
‘What is my best action?’
While we’re used to producing stuff at a tactical or operational level for well-defined
problems, they are looking for strategic direction.
40
Scenario 2
41
Here, we’re pre-insight, and doing good stuff building analytics.
42
But it’s taking a lot longer to get to insights than we thought it would.
43
We didn’t understand the amount of work involved to:
- understand data sets, clean them, and prepare them; then
- work out the best analysis for the problem at hand, do that analysis, and
communicate the results in a way that's meaningful to the stakeholders receiving them
…
- all with appropriate caveats to communicate the precision and accuracy of the
information they’re looking at
44
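As a flavour of where that preparation time goes, here is a minimal pandas sketch of the kind of cleaning a single data set typically needs before any analysis starts. The column names are hypothetical; the chores are not.

import pandas as pd

def prepare_host_inventory(raw: pd.DataFrame) -> pd.DataFrame:
    """None of this is 'analysis'; all of it is work that has to happen first."""
    df = raw.copy()
    # Normalize identifiers so one host doesn't appear as three different strings.
    df["hostname"] = df["hostname"].str.strip().str.lower()
    # Parse timestamps that arrive in mixed formats; unparseable ones become NaT.
    df["last_seen"] = pd.to_datetime(df["last_seen"], errors="coerce", utc=True)
    # Keep only the most recent valid record per host.
    df = df.sort_values("last_seen", na_position="first").drop_duplicates(
        subset="hostname", keep="last"
    )
    # Make missing ownership explicit rather than silently empty.
    df["business_unit"] = df["business_unit"].fillna("UNKNOWN")
    return df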
We’re now having conversations like this, because someone has read a presentation
or a bit of marketing that suggests sciencing data happens auto-magically by throwing
it at a pre-built algorithm.
45
Specifically in the context of machine learning, a lot of the marketing we're seeing on
this today is dangerous.
First, there's a blurring of vocabulary, which doesn’t differentiate the disciplines of
data analytics and data science from the methods that data science and data analytics
use.
So when marketing pushes stories of automagic results from data analytics (which is
used wrongly as a synonym for ML) – and that later turns out to be an illusion - the
good work being done suffers by association.
46
Second, it speaks to us on an emotional level, when we don’t have a good framework
to assess if these ‘solutions’ will do what they claim in our environments.
As the CISO of a global bank said to me a few weeks ago, it is tempting and
comforting to think, when we face all the problems we do with headcount, expertise
and budget, that, yes, perhaps some unsupervised machine learning algo can solve
this thorny problem I have.
So we give it a try, and it makes our problems worse not better.
47
Now, this isn’t a new problem in security.
It’s summed up eloquently in a paper called ‘A Market For Silver Bullets’, which
describes the fundamental asymmetry of information we face, where both sellers and
buyers lack knowledge of what an effective solution looks like. (Of course, the threat
actors know, but unfortunately, they’re not telling us).
In the world of ML, algos lack the business context they need – and it’s the
enrichment of algos that makes the difference between lots of anomalies that are
interesting but not high value, vs output we can act on with confidence.
48
But often neither the vendor nor the buyer know exactly how to do that.
So what you end up with is ‘solutions’ that have user experiences like this.
Now, I don’t know if you know the application owners I know, but this is simply not
going to happen.
49
And it's definitely not going to happen if what vendors deliver is the equivalent of
‘false positive as a service’.
50
Because if the first 10 things you give to someone who is very busy with business
concerns are false positives, that’s going to be pretty much game over in
getting their time and buy-in for the future.
In the same way, Security Operations teams are already being fire hosed with alerts.
This means the tap may as well not be on if this is yet another pipe where there isn’t
time to do the necessary tuning.
51
In short, with ML and its promises, we face a classic fruit salad problem.
Knowledge is knowing a tomato is a fruit. Wisdom is not putting it in a fruit salad.
And while lots of vendors provide ML algos that have knowledge, it’s refining
those so that they have wisdom in the context of our business that makes them
valuable. Until that is possible (and easy), we’ll continue to be disappointed by results.
52
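Before moving on, here is a hedged sketch of what ‘enrichment with business context’ can mean in practice. The field names and the threshold are invented for illustration; the idea is simply that an anomaly only lands in someone’s queue if it is both statistically unusual and relevant to the business.

def triage(anomalies: list[dict], asset_context: dict[str, dict],
           score_threshold: float = 0.9) -> list[dict]:
    """Keep only anomalies that are both unusual and business-relevant."""
    actionable = []
    for a in anomalies:
        ctx = asset_context.get(a["host"], {})
        # Business context the algorithm alone doesn't have:
        critical = ctx.get("criticality") == "high"
        internet_facing = ctx.get("internet_facing", False)
        if a["score"] >= score_threshold and (critical or internet_facing):
            actionable.append({**a, **ctx})
    # Everything else still exists, but isn't pushed at a busy application owner.
    return sorted(actionable, key=lambda x: x["score"], reverse=True)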
Scenario 3
53
Here, we’ve built the lake and ingested data, but analysis has hit a wall.
54
We didn’t have mature processes around data analytics in security when we started
this effort, and what we've done is simply scale up the approach we were taking before.
This has created a data swamp, with loads of data pools that are poorly maintained.
55
We’re used to running a workflow in which an analyst runs to our tech frankenstack,
pulls any data they can on an ad hoc basis into a spreadsheet, runs some best effort
analysis, creates a pie chart, and sends off a PDF we hope no one will read.
56
By automating part of that mess, we now have … a faster mess.
57
Scenario 4
58
We run into trouble at the ingest stage
59
We’ve decided to ingest everything before starting with analysis.
And because this costs money and takes a lot of time, the business is sat for a long
time tapping their fingers waiting for insight.
Eventually, they get sick of waiting and cut the budget before we have enough in the
lake to do meaningful correlations and get some analysis going. We may try to
present some conclusions, but they’re flimsy and unpersuasive.
60
And finally, scenario 5.
61
In which we run into problems at the very first stage of building the lake.
We’ve been running a big data initiative for 6 months, and the business has come to
ask us how we were doing.
62
We said it would be done soon while wrestling with getting technology set up that
was stable and usable.
63
They checked back on us when the next budget cycle rolled round.
64
We said it would be done soon (while continuing to battle with the tech).
65
And then they decided they were done with a transformation program that was on a
trajectory to be anything but.
66
So, if these are the foreseeable problems to avoid …
67
… what does that mean as we consider our approach at strategic and operational
levels?
68
Let’s imagine at ACME, we understand all the problems we’ve just looked at, because
our team has lived through them in other firms.
And we want to take an MVP approach to solving a big problem, so that it has a good
chance of success.
69
The problem at the top of our agenda is how to deal with newly introduced DevOps
pipelines in several business units.
Our devs are creating code that’s ready to push into production in 2 weeks. Which is
great.
70
What’s not so great, is that security has a 3 month waterfall assurance process.
And at the end of this, multiple high risk findings are raised consistently.
71
So app dev asks the CIO for exceptions, which are now granted so frequently that
eventually security is pretty much ignored altogether.
72
Because of the pain involved in going through this risk management process, the
status quo is fast becoming: let’s not go find and manage risk.
73
We need to change this, so we can shift the timeline for getting risk under
management from months to weeks.
We know data analytics is critical to this, both to a) get the information we need to
make good, data-informed decisions, then b) automate off the back of that to manage
risk at the speed of the business and be as data-driven as possible.
74
This means moving from a policy based approach, where only a tiny bit of code meets
all requirements ...
75
… to a risk based approach, where we can understand risk holistically, and manage it
pragmatically.
76
This means bringing together lots of puzzle pieces across security, ops and dev
processes.
77
And turning those puzzle pieces into a picture, to show risk as a factor of
connectedness, dependencies and activity across the operational entities that
support and deliver business outcomes.
78
Our plan to do this is to understand where we should set thresholds in various
relevant metrics, so that when data analytics identifies toxic combinations (or shows
we’re getting close to them) we can jump on the problem.
79
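A minimal sketch of that idea, with invented metric names and thresholds: each metric has a limit, and certain combinations of breaches are treated as toxic and escalated.

# Hypothetical per-application metrics and thresholds, for illustration only.
THRESHOLDS = {
    "critical_vulns_open": 5,
    "days_since_last_scan": 30,
    "privileged_accounts_unreviewed": 10,
}

# Combinations of breached metrics we treat as toxic and jump on immediately.
TOXIC_COMBINATIONS = [
    {"critical_vulns_open", "days_since_last_scan"},
    {"critical_vulns_open", "privileged_accounts_unreviewed"},
]

def assess(app: str, metrics: dict[str, float]) -> dict:
    # Which individual thresholds has this application breached?
    breached = {m for m, limit in THRESHOLDS.items() if metrics.get(m, 0) > limit}
    # Is any toxic combination fully present in the breached set?
    toxic = any(combo <= breached for combo in TOXIC_COMBINATIONS)
    return {"app": app, "breached": sorted(breached), "toxic": toxic}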
In the long term, ideally we want to be able to do ‘what if’ analysis to address
problems before they arise, and shift thresholds dynamically as internal and external
factors relating to the threat, technology and business landscape change.
80
This means we can start measuring risk to business value and revenue across
business units, based on business asset exposure to compromise and impact.
81
To top it all off, we then want to automate action on our environment using ‘security
robotics’ - i.e. orchestration technologies.
82
If what we’re building towards is to stand a chance of doing that, we’re going to want
lots of optionality across the platform (or platforms!) that could eventually support
these outcomes.
83
We’ll need to tie in requirements from lots of stakeholders outside security.
84
And consider how this effort (and other security controls) are introducing friction or
issues into people’s ‘jobs to be done’.
85
Especially where we’ve deployed ‘award winning solutions’ that people talk about like
this in private.
86
If we start with the question ‘What’s the user need?’, we can – no doubt – come up
with a set of foundational insights, which will deliver value to the CIO, project
managers, devs, sec ops, the CISO and security risk functions.
87
And we can think about how to make information accessible to interrogate, so lots of
different people can self-serve.
88
The vision driving our MVP approach might look like this.
Which sounds convincing.
89
Except, what if we have 2000 developers in one Business Unit?
Or at least we think we do. We know we’ve got at least 2000, but it could be more.
And our code base is totally fragmented, so we don’t know where all our code is, and
how we‘d get good coverage on scanning it.
And we‘re about to move a load of infrastructure and operations to a Business
Process Outsourcer. Which will make it challenging to get some of the data we want.
And the available data that we can correlate in the short term, well ... to be honest, it
ain‘t great.
90
Perhaps we’ve chosen data analytics as a proxy for the problem that actually needs
solving, as analytics is very unlikely to be able to solve the problem we have.
91
All of which is to say, you can have the best strategy in the world, but if it isn’t
focused on a problem you actually have, and one that you know you can solve, then
we’re back to square one and the CFO’s pet T-Rex.
92
So onto our final act: Act 3.
93
How do we choose our battleground to solve problems we know we have, which we
know we can solve?
94
Simon Wardley is open sourcing really great thinking on strategy, and he talks a lot
about the primacy of ‘where’; i.e. we have to understand our landscape to choose
where to play, in order to win.
In our strawman DevOps example, the problem we had to solve was dictated to us.
- We had some great ideas and frameworks, but no understanding of our landscape
- We had no time to build that up
- We couldn’t choose a battleground where we had a good chance of winning
- And we had to jump on the problem that was right in front of us, because we were
firefighting
95
This massively limited our chance of success, because we lacked context about the
game, the landscape and the climate.
96
Let’s return to the concept we started out with.
We need to help our leaders demonstrate strong control over risk.
97
And to do that we need to pull lots of puzzle pieces together into a picture.
98
Measuring the probability of badness happening is a topic of great debate in security.
But very often, at a practical level, we can end up in endless meetings arguing about
how naked we need to be to catch a cold, when it would be more productive to just
put our clothes on.
Because if our cyber hygiene levels are low (or inconsistent), not only is the job of
detect and respond harder, but it’s harder to know if we’re in a defensible position
should the worst happen.
99
Starting with foundational building blocks that are both possible and highly palatable
to solve makes good sense.
100
As long as we can present outputs and results that people want to hang behind their
metaphorical desk on the office wall.
101
If this is our battleground …
102
We can now assess where we have problems that sit in that box.
This may be as simple as assuring that we have the AV coverage and operational
consistency we expect across our different host types (servers and workstations) and
OS types.
103
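As a sketch of what that first assurance question could look like in code (assuming, hypothetically, a host inventory and a normalized AV feed with the columns shown), coverage by host type and OS is essentially a join and a group-by.

import pandas as pd

def av_coverage(inventory: pd.DataFrame, av_events: pd.DataFrame) -> pd.DataFrame:
    """Percentage of known hosts with a recent AV check-in, by host type and OS."""
    merged = inventory.merge(
        av_events[["host", "last_seen"]], left_on="hostname", right_on="host", how="left"
    )
    # A host counts as covered if its agent has checked in within the last 7 days.
    recent = pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=7)
    merged["covered"] = merged["last_seen"].notna() & (merged["last_seen"] >= recent)
    return (
        merged.groupby(["host_type", "os"])["covered"]
        .mean()
        .mul(100)
        .round(1)
        .rename("coverage_pct")
        .reset_index()
    )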
The output should be relevant to various stakeholders, from the CIO to IT Ops to
security control managers.
104
And we should be able to track that we are moving from here …
105
… to here.
106
This is a model I call ‘the security cross fader’.
107
It expresses that investment in detect / respond becomes unsustainable at scale,
where that function is also picking up the side effects of poor cyber hygiene.
This sets up an investment trade-off: implementing preventative controls or change
processes to be secure by design where that makes financial sense, and having detect
/ respond pick up the slack where it’s not.
108
The goal (and challenge) is to find the right balance, so there’s less noise for detect /
respond to sift through, and an ability for Security Operations to control the scope
of what they need to worry about.
109
How does this shake out into our problem space for security data analytics?
110
At the data layer …
111
… we want to ensure we tackle problem sets in a way that delivers maximum value
for multiple stakeholders with minimal data sets.
112
With that constraint, we can ask, “If you could only choose 5 data sets to meet your
user needs, what would they be?”
Here is an example of an answer we might get back.
113
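One way to keep that conversation honest (the data sets and questions below are purely illustrative) is to write the mapping down, so the trade-off between adding a data set and the stakeholder questions it unlocks stays explicit.

# Illustrative only: which stakeholder questions each candidate data set helps answer.
DATA_SET_VALUE = {
    "asset_inventory":     ["What do we own?", "Who owns it?"],
    "av_telemetry":        ["Is endpoint protection deployed and current?"],
    "vulnerability_scans": ["Where is our exposure concentrated?"],
    "authentication_logs": ["Who is accessing what, and from where?"],
    "change_records":      ["What changed before this broke?"],
}

def coverage_score(chosen: list[str]) -> int:
    """Crude count of distinct stakeholder questions the chosen data sets can address."""
    return len({q for ds in chosen for q in DATA_SET_VALUE.get(ds, [])})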
The correlation we get in gold is a 1st order confirmation, and in blue, a 2nd order
confirmation of ‘facts’ about ‘stuff’.
114
We then have inferences we can draw based on our knowledge.
115
And finally, 3rd order signals …
116
That don’t give us strong confirmations, but which we can use to join dots.
117
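One simple way to encode that tiering (the weights here are arbitrary placeholders, not a recommendation) is to score a hypothesis by the strength of the evidence behind it, so first order confirmations dominate and third order signals only ever add colour.

# Arbitrary illustrative weights: 1st order evidence dominates, 3rd order only nudges.
WEIGHTS = {"first_order": 1.0, "second_order": 0.5, "third_order": 0.1}

def confidence(evidence: list[dict]) -> float:
    """Combine tiered evidence into a bounded confidence score (0..1)."""
    score = sum(WEIGHTS.get(e["order"], 0.0) * e.get("strength", 1.0) for e in evidence)
    return min(score, 1.0)

# Example: one strong confirmation plus two weak signals that merely join the dots.
confidence([
    {"order": "first_order", "strength": 0.8},
    {"order": "third_order", "strength": 0.5},
    {"order": "third_order", "strength": 0.7},
])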
Now that we know what we’re aiming for, we might not start with Netflow, but we can
target data sets for collection and analysis that get us on the ladder we eventually
need to climb.
118
Next level: platform.
119
If this is nirvana …
120
… the journey can start with a user need that is far narrower.
121
As Mark Madsen said in 2011, if you procrastinate long enough, most problems solve
themselves.
122
And when it comes to building data lakes that can handle data volume, velocity and
diversity at scale, this is certainly where the market is heading.
So before investing lots of money to try and get there ourselves (with all the inter-
dependencies and challenges that entails) the best advice may be to wait a while.
123
Finally, onto analysis and insight.
124
If this is the approach we take to iterate quickly …
125
Then what we are setting up is a phased approach to quickly understand our data, the
value we can get from it currently, and the extent of the value we’ll be able to get in
future.
126
Like a musical canon, we want to solve early problems that make harmonies more
pleasing over time as we add data sources and build analytics.
127
For example, we can use these data sets (at the bottom in grey) to address the
hygiene factors in green above.
128
Over time, we can use this as a base to build upon.
129
Adding larger and more complex data sources as we go.
130
Tackling increasingly complex problems, accruing wisdom as we do.
131
We can then start looking at ‘risk factors’ in populations of operational entities.
132
Be they machine, or people.
133
Or apps.
134
Giving us the facts and evidence we need to put detections in context, and
understand risk across our landscape.
135
I realize that this is quite high level, and the equivalent in some ways of this guide to
‘How to draw a horse’.
136
Nonetheless, I hope it’s been useful, and I will answer any questions you have as best
I can!
Thank you very much.
137
More Related Content

What's hot

Whats the problem_ebook
Whats the problem_ebookWhats the problem_ebook
Whats the problem_ebook
VC-ERP
 
Analytics and Creativity
Analytics and CreativityAnalytics and Creativity
Analytics and Creativity
Ogilvy Consulting
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
mark madsen
 
Big Data, Big Innovations
Big Data, Big Innovations  Big Data, Big Innovations
Big Data, Big Innovations
EMC
 
Coveo_Intelligent Workspace_eBook_FINAL
Coveo_Intelligent Workspace_eBook_FINALCoveo_Intelligent Workspace_eBook_FINAL
Coveo_Intelligent Workspace_eBook_FINALStephen Weidman
 
Transforming Customer Engagement with IBM Watson
Transforming Customer Engagement with IBM WatsonTransforming Customer Engagement with IBM Watson
Transforming Customer Engagement with IBM WatsonRahul A. Garg
 
The Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data ManagementThe Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data Management
mark madsen
 
Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...
Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...
Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...
Dana Gardner
 
Report: CIOs & Big Data
Report: CIOs & Big DataReport: CIOs & Big Data
Report: CIOs & Big Data
Infochimps, a CSC Big Data Business
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Inside Analysis
 
Spocto :: NPA and Data Recovery Solution
Spocto :: NPA and Data Recovery SolutionSpocto :: NPA and Data Recovery Solution
Spocto :: NPA and Data Recovery Solution
spocto
 
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
Dana Gardner
 
Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018
mark madsen
 
Watson join the cognitive era
Watson   join the cognitive eraWatson   join the cognitive era
Watson join the cognitive era
Anders Quitzau
 
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...
Dana Gardner
 
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Dana Gardner
 
Demystifying Big Data for Associations
Demystifying Big Data for AssociationsDemystifying Big Data for Associations
Demystifying Big Data for Associations
Patrick Dorsey
 
Ml in a Day Workshop 5/1
Ml in a Day Workshop 5/1Ml in a Day Workshop 5/1
Ml in a Day Workshop 5/1
CCG
 
Applications of AI in Supply Chain Management: Hype versus Reality
Applications of AI in Supply Chain Management: Hype versus RealityApplications of AI in Supply Chain Management: Hype versus Reality
Applications of AI in Supply Chain Management: Hype versus Reality
Ganes Kesari
 

What's hot (19)

Whats the problem_ebook
Whats the problem_ebookWhats the problem_ebook
Whats the problem_ebook
 
Analytics and Creativity
Analytics and CreativityAnalytics and Creativity
Analytics and Creativity
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
Big Data, Big Innovations
Big Data, Big Innovations  Big Data, Big Innovations
Big Data, Big Innovations
 
Coveo_Intelligent Workspace_eBook_FINAL
Coveo_Intelligent Workspace_eBook_FINALCoveo_Intelligent Workspace_eBook_FINAL
Coveo_Intelligent Workspace_eBook_FINAL
 
Transforming Customer Engagement with IBM Watson
Transforming Customer Engagement with IBM WatsonTransforming Customer Engagement with IBM Watson
Transforming Customer Engagement with IBM Watson
 
The Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data ManagementThe Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data Management
 
Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...
Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...
Tag-Team of Workshops Provides Proven Path of Data Center Transformation, Ass...
 
Report: CIOs & Big Data
Report: CIOs & Big DataReport: CIOs & Big Data
Report: CIOs & Big Data
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
 
Spocto :: NPA and Data Recovery Solution
Spocto :: NPA and Data Recovery SolutionSpocto :: NPA and Data Recovery Solution
Spocto :: NPA and Data Recovery Solution
 
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
 
Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018
 
Watson join the cognitive era
Watson   join the cognitive eraWatson   join the cognitive era
Watson join the cognitive era
 
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici...
 
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
 
Demystifying Big Data for Associations
Demystifying Big Data for AssociationsDemystifying Big Data for Associations
Demystifying Big Data for Associations
 
Ml in a Day Workshop 5/1
Ml in a Day Workshop 5/1Ml in a Day Workshop 5/1
Ml in a Day Workshop 5/1
 
Applications of AI in Supply Chain Management: Hype versus Reality
Applications of AI in Supply Chain Management: Hype versus RealityApplications of AI in Supply Chain Management: Hype versus Reality
Applications of AI in Supply Chain Management: Hype versus Reality
 

Similar to A strategy for security data analytics - SIRACon 2016

Semantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - IntroSemantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - Intro
Stephen Lahanas
 
Data analytics for the mid-market: myth vs. reality
Data analytics for the mid-market: myth vs. realityData analytics for the mid-market: myth vs. reality
Data analytics for the mid-market: myth vs. reality
Deloitte Canada
 
Big dataplatform operationalstrategy
Big dataplatform operationalstrategyBig dataplatform operationalstrategy
Big dataplatform operationalstrategy
Himanshu Bari
 
Building an enterprise security knowledge graph to fuel better decisions, fas...
Building an enterprise security knowledge graph to fuel better decisions, fas...Building an enterprise security knowledge graph to fuel better decisions, fas...
Building an enterprise security knowledge graph to fuel better decisions, fas...
Jon Hawes
 
The SRE Report 2024 - Great Findings for the teams
The SRE Report 2024 - Great Findings for the teamsThe SRE Report 2024 - Great Findings for the teams
The SRE Report 2024 - Great Findings for the teams
DILIPKUMARMONDAL6
 
Technology in financial services
Technology in financial servicesTechnology in financial services
Technology in financial services
Luis Caldeira
 
Technology in financial services
Technology in financial servicesTechnology in financial services
Technology in financial services
Luis Caldeira
 
Platform Progression
Platform ProgressionPlatform Progression
Platform Progression
Michael Henry
 
Ab cs of big data
Ab cs of big dataAb cs of big data
Ab cs of big data
Digimark
 
Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...
Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...
Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...
Dana Gardner
 
Horse meat or beef? (3) D Murphy, National Grid, 21/3/13
Horse meat or beef? (3) D Murphy, National Grid, 21/3/13Horse meat or beef? (3) D Murphy, National Grid, 21/3/13
Horse meat or beef? (3) D Murphy, National Grid, 21/3/13
BCS Data Management Specialist Group
 
Business analytics Project.docx
Business analytics Project.docxBusiness analytics Project.docx
Business analytics Project.docx
kushi62
 
Essay Narrative Example. Narrative Essay PDF Essays Narrative
Essay Narrative Example. Narrative Essay  PDF  Essays  NarrativeEssay Narrative Example. Narrative Essay  PDF  Essays  Narrative
Essay Narrative Example. Narrative Essay PDF Essays Narrative
Elizabeth Pardue
 
Can Agile Work for this Project?
Can Agile Work for this Project?Can Agile Work for this Project?
Can Agile Work for this Project?
Cognizant
 
[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...
[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...
[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...
DataScienceConferenc1
 
Cs633-1 Enterprise Architecture Foundation
Cs633-1 Enterprise Architecture FoundationCs633-1 Enterprise Architecture Foundation
Cs633-1 Enterprise Architecture Foundation
Casey Hudson
 
A practice to perfect the big data solution
A practice to perfect the big data solutionA practice to perfect the big data solution
A practice to perfect the big data solution
Parthasarathy Kannan
 
The Analytics Stack Guidebook (Holistics)
The Analytics Stack Guidebook (Holistics)The Analytics Stack Guidebook (Holistics)
The Analytics Stack Guidebook (Holistics)
Truong Bomi
 
Rapid-fire BI
Rapid-fire BIRapid-fire BI
Rapid-fire BI
Brett Sheppard
 

Similar to A strategy for security data analytics - SIRACon 2016 (20)

Semantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - IntroSemantech Inc. - Mastering Enterprise Big Data - Intro
Semantech Inc. - Mastering Enterprise Big Data - Intro
 
Data analytics for the mid-market: myth vs. reality
Data analytics for the mid-market: myth vs. realityData analytics for the mid-market: myth vs. reality
Data analytics for the mid-market: myth vs. reality
 
Big dataplatform operationalstrategy
Big dataplatform operationalstrategyBig dataplatform operationalstrategy
Big dataplatform operationalstrategy
 
Building an enterprise security knowledge graph to fuel better decisions, fas...
Building an enterprise security knowledge graph to fuel better decisions, fas...Building an enterprise security knowledge graph to fuel better decisions, fas...
Building an enterprise security knowledge graph to fuel better decisions, fas...
 
The SRE Report 2024 - Great Findings for the teams
The SRE Report 2024 - Great Findings for the teamsThe SRE Report 2024 - Great Findings for the teams
The SRE Report 2024 - Great Findings for the teams
 
Technology in financial services
Technology in financial servicesTechnology in financial services
Technology in financial services
 
Technology in financial services
Technology in financial servicesTechnology in financial services
Technology in financial services
 
Platform Progression
Platform ProgressionPlatform Progression
Platform Progression
 
Ab cs of big data
Ab cs of big dataAb cs of big data
Ab cs of big data
 
The ABCs of Big Data
The ABCs of Big DataThe ABCs of Big Data
The ABCs of Big Data
 
Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...
Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...
Agnostic Tool Chain Key to Fixing the Broken State of Data and Information Ma...
 
Horse meat or beef? (3) D Murphy, National Grid, 21/3/13
Horse meat or beef? (3) D Murphy, National Grid, 21/3/13Horse meat or beef? (3) D Murphy, National Grid, 21/3/13
Horse meat or beef? (3) D Murphy, National Grid, 21/3/13
 
Business analytics Project.docx
Business analytics Project.docxBusiness analytics Project.docx
Business analytics Project.docx
 
Essay Narrative Example. Narrative Essay PDF Essays Narrative
Essay Narrative Example. Narrative Essay  PDF  Essays  NarrativeEssay Narrative Example. Narrative Essay  PDF  Essays  Narrative
Essay Narrative Example. Narrative Essay PDF Essays Narrative
 
Can Agile Work for this Project?
Can Agile Work for this Project?Can Agile Work for this Project?
Can Agile Work for this Project?
 
[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...
[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...
[DSC Europe 22] Next-Wave of Value – Operating Model for Scaling Data Science...
 
Cs633-1 Enterprise Architecture Foundation
Cs633-1 Enterprise Architecture FoundationCs633-1 Enterprise Architecture Foundation
Cs633-1 Enterprise Architecture Foundation
 
A practice to perfect the big data solution
A practice to perfect the big data solutionA practice to perfect the big data solution
A practice to perfect the big data solution
 
The Analytics Stack Guidebook (Holistics)
The Analytics Stack Guidebook (Holistics)The Analytics Stack Guidebook (Holistics)
The Analytics Stack Guidebook (Holistics)
 
Rapid-fire BI
Rapid-fire BIRapid-fire BI
Rapid-fire BI
 

Recently uploaded

DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 

Recently uploaded (20)

DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 

A strategy for security data analytics - SIRACon 2016

  • 2. What does a strategy for data analytics looks like, which can win in the often chaotic reality of the business environment? 2
  • 3. While I can’t promise this talk will provide a definitive answer, hopefully it will offer some answers, and if not, at least some inspiration about what to think about, and a snag list of things to avoid. 3
  • 4. Security leaders and executives globally face 3 challenging questions: 1. What’s our business risk exposure from cyber? 2. What’s our security capability to manage that risk? 3. And based on the above, what do we prioritize? 4
  • 5. By applying data analytics to security, we can provide the meaningful, timely, accurate insights that security leadership need to do 2 things: 1) Support their colleagues in other teams, so they have the information to make robust, defensible risk decisions; and 2) Gain the evidence we need to justify improvement where it matters most – hopefully at best cost 5
  • 6. That’s easy to say, but harder to do. Because data analytics is a multi-dimensional problem space with a lot of moving parts. At the data layer, we have technologies that provide the output we need for analytics. But even for one type of technology like anti-virus, we may have multiple vendors in place, outputting data in different structures, with various coverage. They can also be swapped out from one year to the next. At the platform layer, how do we marshal our data so that we can deliver the analytics that meet user need? At the analysis layer, what techniques are available to us, and how repeatable are they (or can they be) for the scale and frequency of analysis we want to run? At the insight layer, how can we manage the fact that one question answered often leads to many harder questions emerging, with an expectation for faster turn- around? 6
  • 7. At the communication layer, how do we make insight relevant and accessible to our multiple stakeholders, based on their decision making context and concerns? And how do we make any caveats clear at the different levels of strategic, operational and tactical? And lastly, provability. How do we win trust when so much analysis that leaders have seen is wrong? 6
  • 8. Taking on multidimensional problems always has a high risk of resulting in a resume generating event. So if we know all this, we may be tempted not to try. 7
  • 9. Because if we do try – and precisely because the problem is complex, we will likely fail forward on the way to success. 8
  • 10. To our investors, the teams funding our projects, this is what failure looks like. - An increasing amount of spend - Very little visible value - Someone making a heroic effort to save the situation - Then getting frustrated with the politics they run into - And leaving 9
  • 11. In many cases, this happens because security data analytics efforts are “Death Star projects”. They are built on a big visions (and equally big promises), which require large teams to do stuff successfully they haven’t done before, with coordination of lots of moving parts over a long period of time. And sometimes these visions aim to tackle problems that we don’t know that we can solve, or even that solves the most important problem we have. 10
  • 12. This cartoon sums up a lot of ‘blue sky thinking’ threat detection projects, which often turn into sunk costs that the business can’t bear to put an end to because of the time and money they’ve ploughed into them. 11
  • 13. With any Death Star project, careful thought is needed about the legacy that other teams will have to pick up afterwards. 12
  • 14. But the same is also true at the other end of the spectrum, where it’s easy to end up spending a lot of money on ‘data science noodling’, which doesn’t provide foundational building blocks that can be built upon. We hire some data scientists, set them to work on some interesting things, but eventually they end up helping us fight the latest fire (either in security operations or answering this weeks knee jerk question from the Executive Committee), rather than doing what we hired them for. And while artisanal data analytics programs definitely have short term value they also have 2 core problems: 1) they aren’t scalable; and 2) their legacy doesn’t create the operational engine for future success. 13
  • 15. Although the actual amounts can differ, both the Death Star and artisanal model can lead to this situation. And in security, this isn't an unusual curve. For many executives it represents their experience across most of the security projects they’ve seen. 14
  • 16. So how do we bend the cost-to-value curve to our will? Not only in terms of security data analytics programs themselves, but also in terms of how security data analytics can bend the overall cost curve for security by delivering the meaningful, timely, accurate information that leadership need. 15
  • 17. Ultimately what we need to win in business, is to win in a game of ratios. This means : - minimizing the time where value is less than spend - maximizing the amount of value delivered for spend; and - making value very visible to the people who are funding us 16
  • 18. The conclusion of today’s talk is that we can only achieve this in the following way: 1) Start with a focus on sets of problems, not single use cases – and select initial problem sets that give us building blocks of data that combine further down the line to solve higher order problems with greater ease. 17
  • 19. 2) Take an approach to these problem sets that stacks Minimum Viable Products of ‘data plus analytics’ on top of each other. 18
  • 20. And 3), approach problem sets in a sequence that sets us up to deliver greatest value across multiple stakeholders with the fewest data sets possible. 19
  • 21. In summary, this means that the battle ground we select for our data analytics program needs to be here. 20
  • 22. So, onto Act 1 of this talk. 21
  • 24. Let’s imagine we work at ACME Inc., and our data analytics journey so far looks like this. Years ago, we invested heavily in SIEM - and while it was great if you had a small rule set and a narrow scope, it quickly became clear on deployment that this dream would be unattainable. As we moved from ‘rules’ to ‘search’, we invested in Splunk, only to run into cost problems that inhibited our ability to ingest high volume data sets. To manage a few specific headaches, we purchased some point analytics solutions. And then spun up our own Hadoop cluster, with the vision of feeding these other technologies with only the data they needed from a data store that we could also use to run more complex analytics. 23
  • 25. In meta terms, we could describe our journey as an oscillation between specific and general platforms … 24
  • 26. … as we adapted to the changing scale, complexity and IT operating model of our organisation. 25
  • 27. Let’s zoom in on our latest project, and walk through some scenarios that we may have experienced … 26
  • 28. … as we moved from build … 27
  • 29. … to ingest … 28
  • 30. … to analysis … 29
  • 31. … and finally, insight. 30
  • 32. And let’s imagine, not unreasonably, that our ‘value delivered’ curve across our scenarios looks like this. 31
  • 34. We’ve built our data lake, ingested data, done analysis, and delivered some insight – so we’re feeling good. 33
  • 35. But now we’ve run into problems. 34
  • 36. And we’re paid a visit by the CFO’s pet T-Rex. What’s gone wrong? 35
  • 37. Well, as the questions people asked us got harder over time, at some point our ability to answer them ran into limits. 36
  • 38. The first problem we had was the architecture of our data lake. It’s centrally managed by IT, and it doesn't support the type of analysis we want to do. Business units using the cluster care about one category of analytics problem, and the components we need to answer our categories of analytics problems aren’t in IT’s roadmap. We put in a change request, but we’re always behind the business in the priority queue – and change control is taking a long time to go through, as IT have to work out if and how an upgrade to one component in the stack will affect all the others. Meanwhile, the business is tapping its fingers impatiently, which means as a stopgap we're putting stuff in Excel and analysis notebooks … which is exactly what we wanted to avoid. 37
  • 39. Fortunately we got that problem solved, but then we encountered another. Now that we’re generating insights, we’re getting high demand for analysis from lots of different people. In essence, we’ve become a service. Everyone who has a question realizes we have a toolset and team that can provide answers, and we’re getting a huge influx of requests that are pulling us away from our core mission. Half our team are now working on an effort to understand application dependencies for a major server migration IT is doing. We're effectively DDoSed by our success, and have to service whoever can shout the loudest. We need to wrap a lot of process around servicing incoming requests, but while we’re trying to do that, prioritization has run amok. 38
  • 40. To try and get a handle on this, we called in a consultancy, who’ve convinced us that what we need to do is set up a self-service library of recipes so people can answer their own questions. We’ve built an intuitive front end interface to Hadoop, but we've quickly discovered that, with the same recipe, two people with different levels of skill can make dishes that taste and look very different. Now we're in a battle to scale the right knowledge on how to do analysis across people with different skills, to avoid inaccurate insights being presented. 39
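To make that concrete, here is a minimal sketch of what a ‘recipe’ can look like when the judgement calls live in the recipe rather than in the head of whoever runs it. It’s Python with pandas, and every table, column and threshold name below is hypothetical.

```python
# Minimal sketch of a self-service 'recipe': the scoping and quality decisions
# live inside the function, not in the head of whoever runs it.
# All table, column and threshold names here are hypothetical.
import pandas as pd


def overdue_critical_patches(vulns: pd.DataFrame, assets: pd.DataFrame,
                             sla_days: int = 30) -> pd.DataFrame:
    """Count critical findings older than the patching SLA, per business unit.

    Judgement calls are encoded here so every user gets the same answer:
    only 'live' assets are in scope, only 'critical' severity counts, and
    the SLA window is a named, reviewable parameter.
    """
    in_scope = assets.loc[assets["status"] == "live", ["hostname", "business_unit"]]
    overdue = vulns[(vulns["severity"] == "critical") & (vulns["days_open"] > sla_days)]
    joined = overdue.merge(in_scope, on="hostname", how="inner")
    return (joined.groupby("business_unit")
                  .size()
                  .rename("overdue_critical_findings")
                  .reset_index())
```

The specific metric doesn’t matter; the point is that the scope, the severity cut-off and the SLA window get encoded once and reused, rather than re-decided (differently) by every cook.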
  • 41. We're also finding that, as we deal with more complex questions, what we thought was insight is not proving that valuable to the people consuming it. Our stakeholders don’t want statuses or facts; they want us to answer the question ‘What is my best action?’ While we’re used to producing stuff at a tactical or operational level for well-defined problems, they are looking for strategic direction. 40
  • 43. Here, we’re pre-insight, and doing good stuff building analytics. 42
  • 44. But it’s taking a lot longer to get to insights than we thought it would. 43
  • 45. We didn’t understand the amount of work involved to: - understand data sets, clean them, and prepare them; then - work out the best analysis for the problem at hand, do that analysis, and communicate the results in a way that's meaningful to the stakeholders receiving them … - all with appropriate caveats to communicate the precision and accuracy of the information they’re looking at 44
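As a rough illustration of where that time goes, here is a hedged sketch of the clean / prepare / caveat loop for a single (hypothetical) data export. The ‘analysis’ is one comparison; everything around it is the work.

```python
# Sketch of the clean / prepare / caveat loop for one hypothetical data export.
import pandas as pd

# Stand-in for a messy AV console export (hypothetical fields and values)
raw = pd.DataFrame({
    "hostname":     ["  WEB01 ", "web01", "db02", "app03"],
    "last_checkin": ["2024-05-01", "2024-05-01", "not available", "2024-01-15"],
})

# Clean: normalise hostnames, drop duplicates, parse dates without crashing
raw["hostname"] = raw["hostname"].str.strip().str.lower()
raw["last_checkin"] = pd.to_datetime(raw["last_checkin"], errors="coerce")
clean = raw.drop_duplicates(subset="hostname").copy()

# Prepare: flag records we can't trust rather than silently dropping them
clean["unparseable_date"] = clean["last_checkin"].isna()

# Analyse: the 'insight' itself is a single comparison
stale = clean[clean["last_checkin"] < pd.Timestamp.now() - pd.Timedelta(days=30)]

# Caveat: report precision alongside the number, not in a footnote nobody reads
print(f"Stale agents: {len(stale)} of {len(clean)} hosts; "
      f"{clean['unparseable_date'].mean():.0%} of records had unparseable dates")
```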
  • 46. We’re now having conversations like this, because someone has read a presentation or a bit of marketing that suggests sciencing data happens auto-magically by throwing it at a pre-built algorithm. 45
  • 47. Specifically in the context of machine learning, a lot of the marketing we're seeing today is dangerous. First, there's a blurring of vocabulary, which doesn’t differentiate the disciplines of data analytics and data science from the methods they use. So when marketing pushes stories of automagic results from data analytics (wrongly used as a synonym for ML) – and that later turns out to be an illusion – the good work being done suffers by association. 46
  • 48. Second, it speaks to us on an emotional level, when we don’t have a good framework to assess whether these ‘solutions’ will do what they claim in our environments. As the CISO of a global bank said to me a few weeks ago, when we face all the problems we do with headcount, expertise and budget, it is tempting and comforting to think that, yes, perhaps some unsupervised machine learning algo can solve this thorny problem I have. So we give it a try, and it makes our problems worse, not better. 47
  • 49. Now, this isn’t a new problem in security. It’s summed up eloquently in a paper called ‘A Market For Silver Bullets’, which describes the fundamental asymmetry of information we face, where both sellers and buyers lack knowledge of what an effective solution looks like. (Of course, the threat actors know, but unfortunately, they’re not telling us.) In the world of ML, algos lack the business context they need – and it’s the enrichment of algos that makes the difference between lots of anomalies that are interesting but not high value, and output we can act on with confidence. 48
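As a hedged illustration of that enrichment step, the sketch below uses deliberately simple statistics (a z-score over made-up logon counts) rather than any particular vendor’s algorithm; the point is the join onto business context, which is what turns ‘interesting anomalies’ into something an analyst can act on.

```python
# Sketch: a naive anomaly score plus business-context enrichment.
# The scoring is deliberately simple (a z-score on made-up logon counts);
# the point is the enrichment join, not the algorithm.
import pandas as pd

logons = pd.DataFrame({
    "account":      ["svc_backup", "jsmith", "admin_db", "contractor01"],
    "daily_logons": [80, 45, 900, 60],
})
context = pd.DataFrame({
    "account":                ["svc_backup", "jsmith", "admin_db", "contractor01"],
    "privileged":             [False, False, True, True],
    "handles_regulated_data": [False, False, True, False],
})

# Anomaly score without context: 'unusual' is all we can say
mean, std = logons["daily_logons"].mean(), logons["daily_logons"].std()
logons["zscore"] = (logons["daily_logons"] - mean) / std

# Enrichment: only anomalies that intersect with business risk get escalated
enriched = logons.merge(context, on="account")
actionable = enriched[(enriched["zscore"] > 1.0) &
                      (enriched["privileged"] | enriched["handles_regulated_data"])]
print(actionable[["account", "zscore", "privileged", "handles_regulated_data"]])
# Only 'admin_db' survives: anomalous AND tied to something the business cares about.
```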
  • 50. But often neither the vendor nor the buyer knows exactly how to do that. So what you end up with is ‘solutions’ that have user experiences like this. Now, I don’t know if you know the application owners I know, but this is simply not going to happen. 49
  • 51. And it's definitely not going to happen if what vendors deliver is the equivalent of ‘false positive as a service’. 50
  • 52. Because if the first 10 things you give to someone who is very busy with business concerns are false positives, that’s going to be pretty much game over in getting their time and buy-in for the future. In the same way, Security Operations teams are already being fire-hosed with alerts. This means the tap may as well not be on if this is yet another pipe where there isn’t time to do the necessary tuning. 51
  • 53. In short, with ML and its promises, we face a classic fruit salad problem. Knowledge is knowing a tomato is a fruit. Wisdom is not putting it in a fruit salad. And while lots of vendors provide ML algos that have knowledge, it’s refining those so that they have wisdom in the context of our business that makes them valuable. Until that is possible (and easy), we’ll continue to be disappointed by results. 52
  • 55. Here, we’ve built the lake and ingested data, but analysis has hit a wall. 54
  • 56. We didn’t have mature process around data analytics in security when we started this effort, and what we've done is simply scaled up the approach we were taking before. This has created a data swamp, with loads of data pools that are poorly maintained. 55
  • 57. We’re used to running a workflow in which an analyst runs to our tech frankenstack, pulls any data they can on an ad hoc basis into a spreadsheet, runs some best effort analysis, creates a pie chart, and sends off a PDF we hope no one will read. 56
  • 58. By automating part of that mess, we now have … a faster mess. 57
  • 60. We run into trouble at the ingest stage. 59
  • 61. We’ve decided to ingest everything before starting on analysis. And because this costs money and takes a lot of time, the business is sat for a long time tapping their fingers waiting for insight. Eventually, they get sick of waiting and cut the budget before we have enough in the lake to do meaningful correlations and get some analysis going. We may try to present some conclusions, but they’re flimsy and unconvincing. 60
  • 63. In which we run into problems at the very first stage of building the lake. We’ve been running a big data initiative for 6 months, and the business has come to ask us how we were doing. 62
  • 64. We said it would be done soon while wrestling with getting technology set up that was stable and usable. 63
  • 65. They checked back on us when the next budget cycle rolled round. 64
  • 66. We said it would be done soon (while continuing to battle with the tech). 65
  • 67. And then they decided they were done with a transformation program that was on a trajectory to be anything but. 66
  • 68. So, if these are the foreseeable problems to avoid … 67
  • 69. … what does that mean as we consider our approach at strategic and operational levels? 68
  • 70. Let’s imagine at ACME, we understand all the problems we’ve just looked at, because our team has lived through them in other firms. And we want to take an MVP approach to solving a big problem, so that it has a good chance of success. 69
  • 71. The problem at the top of our agenda is how to deal with newly introduced DevOps pipelines in several business units. Our devs are creating code that’s ready to push into production in 2 weeks. Which is great. 70
  • 72. What’s not so great, is that security has a 3 month waterfall assurance process. And at the end of this, multiple high risk findings are raised consistently. 71
  • 73. So app dev asks the CIO for exceptions, which are now granted so frequently that eventually security is pretty much ignored altogether. 72
  • 74. Because of the pain involved in going through this risk management process, the status quo is fast becoming: let’s not go find and manage risk. 73
  • 75. We need to change this, so we can shift the timeline for getting risk under management from months to weeks. We know data analytics is critical to this, both to a) get the information we need to make good, data-informed decisions, then b) automate off the back of that to manage risk at the speed of the business and be as data-driven as possible. 74
  • 76. This means moving from a policy based approach, where only a tiny bit of code meets all requirements ... 75
  • 77. … to a risk based approach, where we can understand risk holistically, and manage it pragmatically. 76
  • 78. This means bringing together lots of puzzle pieces across security, ops and dev processes. 77
  • 79. And turning those puzzle pieces into a picture, to show risk as a factor of connectedness, dependencies and activity across the operational entities that support and deliver business outcomes. 78
  • 80. Our plan to do this is to understand where we should set thresholds in various relevant metrics, so that when data analytics identifies toxic combinations (or shows we’re getting close to them), we can jump on the problem. 79
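As a hedged sketch of what ‘identify toxic combinations’ could mean in practice (the fields and thresholds below are illustrative, not recommendations): conditions that are individually tolerable cross thresholds together, and that is what triggers action.

```python
# Sketch: flag 'toxic combinations' where individually tolerable conditions
# cross thresholds together. Fields and thresholds are illustrative only.
import pandas as pd

apps = pd.DataFrame({
    "app":             ["payments-api", "intranet-wiki", "hr-portal"],
    "internet_facing": [True,           False,           True],
    "critical_vulns":  [3,              7,               0],
    "days_since_scan": [45,             10,              12],
    "has_waf":         [False,          False,           True],
})

# Each condition on its own might be acceptable; the combination is not.
toxic = apps[
    apps["internet_facing"]
    & (apps["critical_vulns"] > 0)
    & ((apps["days_since_scan"] > 30) | ~apps["has_waf"])
]
print(toxic[["app", "critical_vulns", "days_since_scan"]])
# 'payments-api' is the one to jump on; 'intranet-wiki' has more findings,
# but isn't internet facing, so it can wait its turn.
```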
  • 81. In the long term, ideally we want to be able to do ‘what if’ analysis to address problems before they arise, and shift thresholds dynamically as internal and external factors relating to the threat, technology and business landscape change. 80
  • 82. This means we can start measuring risk to business value and revenue across business units, based on business asset exposure to compromise and impact. 81
  • 83. To top it all off, we then want to automate action on our environment using ‘security robotics’ - i.e. orchestration technologies. 82
  • 84. If what we’re building towards is to stand a chance of doing that, we’re going to want lots of optionality across the platform (or platforms!) that could eventually support these outcomes. 83
  • 85. We’ll need to tie in requirements from lots of stakeholders outside security. 84
  • 86. And consider how this effort (and other security controls) are introducing friction or issues into people’s ‘jobs to be done’. 85
  • 87. Especially where we’ve deployed ‘award winning solutions’ that people talk about like this in private. 86
  • 88. If we start with the question ‘What’s the user need?’, we can – no doubt – come up with a set of foundational insights, which will deliver value to the CIO, project managers, devs, sec ops, the CISO and security risk functions. 87
  • 89. And we can think about how to make information accessible to interrogate, so lots of different people can self-serve. 88
  • 90. The vision driving our MVP approach might look like this. Which sounds convincing. 89
  • 91. Except, what if we have 2000 developers in one Business Unit? Or at least we think we do. We know we’ve got at least 2000, but it could be more. And our code base is totally fragmented, so we don’t know where all our code is, and how we‘d get good coverage on scanning it. And we‘re about to move a load of infrastructure and operations to a Business Process Outsourcer. Which will make it challenging to get some of the data we want. And the available data that we can correlate in the short term, well ... to be honest, it ain‘t great. 90
  • 92. Perhaps we’ve chosen data analytics as a proxy for the problem that actually needs solving, as analytics is very unlikely to be able to solve the problem we have. 91
  • 93. All of which is to say, you can have the best strategy in the world, but if it isn’t focused on a problem you actually have, and one you know you can solve, then we’re back to square one and the CFO’s pet T-Rex. 92
  • 94. So onto our final act: Act 3. 93
  • 95. How do we choose our battleground to solve problems we know we have, and which we know we can solve? 94
  • 96. Simon Wardley is open sourcing really great thinking on strategy, and he talks a lot about the primacy of ‘where’; i.e. we have to understand our landscape to choose where to play, in order to win. In our strawman DevOps example, the problem we had to solve was dictated to us. - We had some great ideas and frameworks, but no understanding of our landscape - We had no time to build that up - We couldn’t choose a battleground where we had a good chance of winning - And we had to jump on the problem that was right in front of us, because we were firefighting 95
  • 97. This massively limited our chance of success, because we lacked context about the game, the landscape and the climate. 96
  • 98. Let’s return to the concept we started out with. We need to help our leaders demonstrate strong control over risk. 97
  • 99. And to do that we need to pull lots of puzzle pieces together into a picture. 98
  • 100. Measuring the probability of badness happening is a topic of great debate in security. But very often, at a practical level, we can end up in endless meetings arguing about how naked we need to be to catch a cold, when it would be more productive to just put our clothes on. Because if our cyber hygiene levels are low (or inconsistent), not only is the job of detect and respond harder, but it’s harder to know if we’re in a defensible position should the worst happen. 99
  • 101. Starting with foundational building blocks that are both possible and highly palatable to solve makes good sense. 100
  • 102. As long as we can present outputs and results that people want to hang on the office wall behind their metaphorical desk. 101
  • 103. If this is our battle ground … 102
  • 104. We can now assess where we have problems that sit in that box. This may be as simple as assuring that we have the AV coverage and operational consistency we expect across our different host types (servers and workstations) and OS types. 103
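As a hedged sketch of that kind of assurance (all names hypothetical), the check can be as plain as joining the asset inventory to the AV console export and reporting the gap by host type and OS.

```python
# Sketch: AV coverage by host type and OS, joining a hypothetical asset
# inventory to a hypothetical AV console export.
import pandas as pd

inventory = pd.DataFrame({
    "hostname":  ["web01", "web02", "db01", "ws-100", "ws-101"],
    "host_type": ["server", "server", "server", "workstation", "workstation"],
    "os":        ["linux", "linux", "windows", "windows", "windows"],
})
av_agents = pd.DataFrame({"hostname": ["web01", "db01", "ws-100"]})

merged = inventory.merge(av_agents.drop_duplicates(), on="hostname",
                         how="left", indicator=True)
merged["covered"] = merged["_merge"] == "both"

coverage = (
    merged.groupby(["host_type", "os"])["covered"]
          .agg(hosts="size", with_agent="sum")
          .assign(coverage_pct=lambda d: 100 * d["with_agent"] / d["hosts"])
          .reset_index()
)
print(coverage.sort_values("coverage_pct"))
```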
  • 105. The output should be relevant to various stakeholders, from the CIO to IT Ops to security control managers. 104
  • 106. And we should be able to track that we are moving from here … 105
  • 108. This is a model I call ‘the security cross fader’. 107
  • 109. It expresses that investment in detect / respond becomes unsustainable at scale, where that function is also picking up the side effects of poor cyber hygiene. This sets up an investment trade-off: implementing preventative controls or change processes to be secure by design where that makes financial sense, and having detect / respond pick up the slack where it doesn’t. 108
  • 110. The goal (and challenge) is to find the right balance, so there’s less noise for detect / respond to sift through, and an ability for Security Operations to control the scope of what they need to worry about. 109
  • 111. How does this shake out into our problem space for security data analytics? 110
  • 112. At the data layer … 111
  • 113. … we want to ensure we tackle problem sets in a way that delivers maximum value for multiple stakeholders with minimal data sets. 112
  • 114. With that constraint, we can ask, “If you could only choose 5 data sets to meet your user needs, what would they be?” Here is an example of an answer we might get back. 113
  • 115. The correlation we get in Gold is a 1st order confirmation, and in blue, 2nd order confirmation of ‘facts’ about ‘stuff’. 114
  • 116. We then have inferences we can draw based on our knowledge. 115
  • 117. And finally 3rd order signals … 116
  • 118. That don’t give us strong confirmations, but which we can use to join dots. 117
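A hedged sketch of what that joining of dots can look like, using a handful of made-up sources: the CMDB gives us 1st order facts, the vulnerability scanner a 2nd order confirmation, and DNS activity a weak 3rd order signal that is still enough to surface assets we didn’t know we had.

```python
# Sketch: corroborating 'facts' about assets from a few made-up sources,
# at decreasing orders of confidence.
import pandas as pd

cmdb  = pd.DataFrame({"hostname": ["web01", "db01"], "owner": ["ecomm", "finance"]})
scans = pd.DataFrame({"hostname": ["web01", "db01", "shadow05"]})      # scanner saw it
dns   = pd.DataFrame({"hostname": ["web01", "shadow05", "printer9"]})  # resolved recently

hosts = (pd.concat([cmdb["hostname"], scans["hostname"], dns["hostname"]])
           .drop_duplicates().reset_index(drop=True))
view = pd.DataFrame({"hostname": hosts})
view["in_cmdb"]     = view["hostname"].isin(cmdb["hostname"])    # 1st order: we own it
view["scanned"]     = view["hostname"].isin(scans["hostname"])   # 2nd order: we assessed it
view["seen_in_dns"] = view["hostname"].isin(dns["hostname"])     # 3rd order: something saw it

# The dots worth joining: active or assessed on the network, but unknown to the CMDB
print(view[~view["in_cmdb"] & (view["scanned"] | view["seen_in_dns"])])
```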
  • 119. Now we know what we’re aiming for, we might not start with Netflow, but we can target data sets for collection and analysis that get us on the ladder we eventually need to climb. 118
  • 121. If this is nirvana … 120
  • 122. … the journey can start with a user need that is far narrower. 121
  • 123. As Mark Madsen said in 2011, if you procrastinate long enough, most problems solve themselves. 122
  • 124. And when it comes to building data lakes that can handle data volume, velocity and diversity at scale, this is certainly where the market is heading. So before investing lots of money to try and get there ourselves (with all the interdependencies and challenges that entails), the best advice may be to wait a while. 123
  • 125. Finally, onto analysis and insight. 124
  • 126. If this is the approach we take to iterate quickly … 125
  • 127. Then what we are setting up is a phased approach to quickly understand our data, the value we can get from it currently, and the extent of the value we’ll be able to get in future. 126
  • 128. Like a musical canon, we want to solve early problems that make harmonies more pleasing over time as we add data sources and build analytics. 127
  • 129. For example, we can use these data sets (at the bottom in grey) to address the hygiene factors in green above. 128
  • 130. Over time, we can use this to build upon. 129
  • 131. Adding larger and more complex data sources as we go. 130
  • 132. Tackling increasingly complex problems, accruing wisdom as we do. 131
  • 133. We can then start looking at ‘risk factors’ in populations of operational entities. 132
  • 134. Be they machines or people. 133
  • 136. Giving us the facts and evidence we need to put detections in context, and understand risk across our landscape. 135
  • 137. I realize that this is quite high level, and the equivalent in some ways of this guide to ‘How to draw a horse’. 136
  • 138. Nonetheless, I hope it’s been useful, and I’ll answer any questions you have as best I can! Thank you very much. 137