13. IP theft is a growing issue
• Annual losses due to IP theft: more than $300 billion
• Ongoing theft of IP is “the greatest transfer of wealth in history.”
• Better protection of IP would encourage significantly more R&D investment and economic growth
• IP theft diminishes productivity growth, innovation, and product advancements
21. Customer: $20B manufacturer
• 2 engineers stole data
• Large security vendor (1 YEAR, $1 million spent): failed to find anything
• THREAT DETECTION (2 WEEKS):
- Easily identified the 2 engineers
- Found 3 additional users stealing data in North America
- Found 8 additional users stealing data in China
22. Example
• User: Developers have unrestricted source code access
• Activity: Volume of copying is high compared to past 30 days & as compared to other developers
• File: Source code has high risk and importance value
• Method/Exit: Local saves are considered high risk
John Smith is copying an unusually large amount of source code and is not checking it back in.
23. Helix Threat Detection
• Analytics Modeling
- Baselines and creates clusters
- Learns Patterns
- Learns Anomalies (unusual hours, data volumes, application types & more)
• Risk Scoring
- Risk by User
- Risk by Activity
- Risk by File
- Risk by Time
- Risk by Volume
- Risk by Method/Exit
• Verification & Investigation
- Highly Readable Event Alarms
- Very Intuitive UI
- Executive Reporting
[Screenshot: BEHAVIORAL ANALYTICS dashboard, All Users, risk from 0-100. Accounts and risk scores shown: Wintermute 89, Armitage 82, Hideo 26, Maelcum 26, Molly 25, Aerol 25, Strayllight 25, Case 18, Chiba 8, Proteus 7]
25. Data & risky behaviors
• Wandering: Accessing content that they don’t normally access
- e.g.: It was unusual for John Smith to take from project 54
• Sneaking: Active outside of their normal working times of day or days of week
- e.g.: John Smith kept unusual working hours
• Moocher: Taking unexpectedly large amounts of content, while contributing very little
- e.g.: John Smith mooched from project 54
• Hoarder: Taking more than expected, compared to self or others
- e.g.: John Smith took a large amount compared to other people
• Log Data: Timestamp (Date/Time), User, Resource (Folder Structure), Action (Give/Take), Item Number, Client, Size
28. Security Audit
• Detailed assessment by Perforce experts
• Review all aspects of Helix configuration
• Execute IP Threat Detection with personalised report
31. Things we see: The Top Five
• Default settings unchanged (e.g., no SSL, no passwords)
• Not using built-in mechanism for AD/LDAP, SSO, 2FA
• Wide open Protections table
• Super User accounts with no passwords; too many Super User accounts
• Helix server not secured behind a firewall
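To make the "wide open Protections table" item a little more concrete, here is a minimal sketch of a check you could run over protections entries. It assumes each entry is in the five-field form (access level, user or group, name, host, depot path) and that paths contain no spaces. The function name and the sample table are hypothetical, and this is a starting point for a review rather than a complete audit.

```python
def find_wide_open(protection_lines):
    """Return entries that grant access to every user, from every host, on the whole depot."""
    flagged = []
    for line in protection_lines:
        fields = line.split()
        # Skip blank lines, comments, and anything not in the five-field form.
        if len(fields) != 5 or fields[0].startswith("#"):
            continue
        level, kind, name, host, path = fields
        if kind == "user" and name == "*" and host == "*" and path == "//...":
            flagged.append(line.strip())
    return flagged


# Hypothetical sample table, for illustration only.
sample_table = [
    "write user * * //...",
    "super user admin * //...",
    "read group contractors 10.0.0.0/8 //depot/docs/...",
]

wide_open = find_wide_open(sample_table)
super_count = sum(1 for entry in sample_table if entry.split()[0] == "super")
print("Wide-open entries:", wide_open)
print("Super User entries:", super_count)
```

Counting the `super` entries in the same pass gives a quick view of the "too many Super User accounts" item as well.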
32. Our Recommendations
• Advanced Perforce Administration Training
• Online, instructor-led
• Visit: perforce.com/training
• A Special Offer
33. Special Offer: Try Helix Threat Detection Before Buying
• No Commitment for 60 days
• Sign up for Helix Threat Detection now, use free of charge before purchase
• 3 Days Consulting for the Price of 1
• For the price of one day’s consulting, you’ll get:
- A security audit for your company
- Deployment of Helix Threat Detection
- A risk analysis report which shows specific risks to your IP
• Normally, 3 days consulting, you pay for just 1
• Sign up today to reserve your slot for Perforce Consulting
• Offer expires Oct. 31, 2015
You’re probably already familiar with this quote. Amazing to think it’s nearly 4 years old now.
It’s definitely accurate though. I’m going to take a look at a few examples in a second but the important thing, for me, is that the software does not run in a vacuum. It is not entirely self-contained and self-running. The reality of this important software shift is that it needs a much more holistic view of the world – the entire stack from the end user experience to the silicon on which the software runs.
There are many topical facets to these challenges – DevOps looks at the interface between the developer and operations but doesn’t really look too closely at how the entire product is created. Continuous Delivery takes a broader view but becomes hard when considering physical manufacturing. Internet of Things depends on a “whole stack” view – from simple, low-cost, low-power physical sensors to huge Big Data analytics.
At the big scale, taking a holistic view is critical and the person that normally has that view is the person managing the configuration. But now they probably have to consider a much broader set of components than all but the largest organizations have in the past.
But now we have a whole new community having to address problems that those CMs might already have covered. If you’re a fashion company, a furniture company, a local cleaning service, you are having to consider how software will affect your business. It’s likely that this software will be your competitive differentiator. It’s also highly likely that it’s not your area of expertise.
Let’s start by looking at a few examples of where software is dominating in ways that perhaps couldn’t have been imagined 5 or 10 years ago.
I’m trying to be generic here and my examples are not, mostly, Perforce customers but they do illustrate the challenges.
Automotive has seen dramatic change in the last decade. This is the dashboard from the latest Audi TT. It’s not a particularly high-end car – in the past, only something like the Mercedes S-Class might have had all the leading-edge gadgetry – but Audi have said their intention is to move this kind of technology into their entire range. All manufacturers are doing the same. In 5 or 10 years none of us will have mechanical dials (unless we’re driving classic cars!).
But there’s a lot to consider behind this change: for example, the physical screen, graphics chips, CPUs, sensors, wiring looms, network connections, design tools for the hardware and graphics, and much more are all involved in putting this “one” component into the vehicle.
Think also of the tools that are used to design a UX like this. One of our customers, NVIDIA, is at the leading edge of this revolution: they not only make the physical components but also build an Integrated Development Environment (IDE) for designers to create these great graphics. That means there are a lot of different contributors that need to work together to create such a product. Of course, just to make it interesting, there will be many variants of the product, not just within the TT range but across Audi or even across multiple car manufacturers.
So, in the past Smiths or whoever may have needed mechanical engineers that could work out how to take a rapidly rotating disk on the engine or wheel and translate that motion into a needle dial in the car dashboard. Today, they need graphic designers, software engineers, video & audio artists, …
And that’s what we’ve helped them with. Adidas are moving to 3D modelling for their designs and they’re using Perforce to ensure those designs are protected and distributed across all their teams while allowing regional teams to create local variations to suit their market conditions – different materials, colours etc.
But that doesn’t mean to say all the skilled people they have want to be software engineers. Designers are highly skilled and creative people. They want to get on with their beautiful designs without impediment.
With that in mind, we worked with the design team and their preferred software partner to build a specific client for the designers. We had to avoid, as much as possible, all the complicated versioning and configuration management terminology that we all love. You may be able to see in this example that the initial view is more like some kind of catalog browser; in the details, some of the items have version numbers and you can easily drill down and scroll through the history. Importantly, for this group, we had to support tagging and a rich search facility to make it easy for them to find their work.
If you would like to see more, you can download a variation of this client from the Perforce Workshop at workshop.perforce.com where it’s called “Piper”.
We continue to work with the adidas team to further embed the versioning into their environment ensuring we can get out of the way of the creative process.
I hope this gives you some idea of how we tool vendors and CM experts have to be aware of these new contributors and how their needs may be different to those we traditionally work with.
Think about all these different contributors that are involved in building modern products. They’ll have a wide variety of skills, few will know what version management or configuration management means let alone want to use them.
They want the freedom to do their work the way they want to …
The business needs to trust them but also to be sure their valuable work is being protected. More on that later.
If you are a software configuration manager or release manager, you may be very familiar with dealing with traditional software development tools like Visual Studio but these new contributors and content use tools which you may or may not be familiar with. Also, the tools have varying levels of integration with software versioning tools.
Choice is key.
This creative freedom is critical if the company is going to be able to deliver a system to protect their valuable assets in a way that will actually be used. Providing such systems – tools and processes – is where the Configuration Manager can really influence the success of the business.
It’s hard enough to manage all these different people and assets if they’re in one place – it’s even harder to do when teams are distributed. That’s true if they’re on different floors of the same building, and even worse if they’re on different continents. And you’ll have the additional challenges of enforcing security and providing high performance for all these distributed teams.
So, as a Configuration Manager or Release Manager trying to bring together all the different components, how do you cope when there are so many different repositories?
By far the simplest solution is to have as few repositories as possible, ideally only one. The configuration management or version management (VM) tool is probably the best place to be that “single source of truth”. However, I frequently hear about situations where this hasn’t been possible because:
• The VM tool can’t handle all the different file content or size of data (e.g. Git)
• The VM tool isn’t particularly easy to use for non-coding staff (Dropbox is the anti-example – very easy to use for non-tech people. Does your VM tool offer that kind of simplicity?)
• The VM tool requires a certain workflow that doesn’t suit everyone
These are all important considerations when reviewing your tools and processes because it’s very expensive to do anything else …
You can probably guess the kinds of issues that can arise with a multitude of different tools and workflows.
And we’re increasingly seeing a new problem that this complexity is causing.
If the CM or DevOps team have difficulty working out the right set of bits to make a product, how hard can it be for people even more distant and, possibly, less technically aware such as CISO or Risk Managers to cope?
Once you accept that these digital or software assets are your company’s vital intellectual property, the archetypal “crown jewels”, then the business ought to be taking all necessary steps to protect them. Traditionally, though, people in these roles have worried about firewalls, access control, password policies etc. but, mostly because they don’t understand the software development world, they haven’t paid much attention to the source control tools and processes being used. As a result, we see far too many organizations relying on aging and insecure tools such as CVS or SVN. Or they’re using off-site storage like GitHub or even Box/Dropbox with even less protection. They’ve got away with it because no one with the real responsibility has been paying attention.
So, we’ve looked at why configuration management and version management is a business-critical service and at some of the risks that exist. So let’s start looking at what the required capabilities are for a platform to address those issues.
Given these requirements, you should evaluate your current tools and processes to see if they are fit for purpose. Are they the right tools to make you the Super Hero?
I’m not sure what your preferred super power might be but x-ray vision is one that probably turns up pretty often – I’m not going to comment on why!
In terms of CM, I’m associating x-ray vision with having visibility into what your teams are doing.
How many places do you have to look to find out the state of any component, which components need to be brought together, what was included in those deliveries? As I discussed earlier, this is really hard or impossible to do safely or efficiently with multiple repositories.
You need a CM repository that is capable of handling ALL your digital assets and provides the performance your users need, along with support for the tools and workflows they need. If the tools don’t do that, they will be bypassed. Human beings are pretty good at finding the path of least resistance, even if that is contrary to company policy.
I’ve outlined the increasing and changing need for security. There is no single right answer – you need a multi-layer approach.
Some of the older tools have pretty weak security. Sadly, some of the newer tools, like Git, have limitations too (e.g. you can’t set file-level access control).
One thing that is clear is that tools that rely on manual tuning will rarely be reliable and will take a lot of maintenance. You need tools that are “self-learning” to be able to provide reliable monitoring.
Do you have any tools at the moment?
Telling the story in a short version:
Company came to us after they found 2 engineers had stolen a large amount of very high value data
The company spent a year and over a million dollars working with a traditional security player and were unable to deploy a tool that would have detected these two engineers - traditional approaches failed.
They were a Perforce customer and approached Interset and Perforce to see if analytics could be the answer.
The company sent us 30 days of log data, including the theft by these users, to see if we could surface the attack – and it was easy. In two weeks, we had clearly defined the attacks of these two users. We also discovered 11 others that were stealing from them – they had no idea. Three in North America – action has been taken against them – and eight in China currently being investigated.
It is important to understand that the company was struggling with two problems:
• They had no visibility into the activities in their Perforce environment. They could not see if users were stealing data, or any other risky activity.
• The company had over 20K developers, which means the 30 days of logs contained millions of events that occurred in the Perforce system.
Behavioral analytics was able to collect, correlate, analyze and score all of those events. The result: the users doing bad things scored the highest risk, and were found.
We can solve the really hard threat problems – we can see things other tools cannot see. How? ...
Other Anecdotes:
Four engineers from a large company over a period of time stole a lot of data when they left the company and went to a competitor. It took this company almost nine months to determine what had happened afterwards. They found out from a 3rd party, a partner of both companies, that suddenly some of the things started looking the same. They were seeing some of the same design specs and the same training manuals were showing up. When they went back and looked, sure enough they found out that this attack had taken place.
Helix Threat Detection could have discovered this suspicious activity while it was happening and told you exactly who looked at what, and what to worry about. You can dig in and look at the activity in terms of a time period, a person, or a project or file folder structure, and prevent the data theft.
A large company kept inactive projects accessible for a long time. An insider breached the source code of these open projects, and this was undiscovered until many years later when a similar game was observed running and available in China.
Helix Threat Detection scores Behavioral Risk using a set of models that takes into account four elements: the user (accounts in the Perforce logs); the activity (behaviors carried out by the user are compared to normal baselines); the asset or file (assets are files or source code projects); and methods. These are all types of behaviors that are built into the math models looking for risky actions.
The first risk element is USER: the machine learning in the system would increase the risk score for this developer based on the interactions with source code repositories (which are sensitive).
The second risk element is ACTIVITY: there are two factors in this example. “Unusually large” is a volume measure that comes from comparing current activity to historical activity, or baseline. The second factor comes from the system’s automatic generation of cluster maps, which compares John to other users in a cluster – in this case other developers – comparing John’s baseline to the baseline of his peers. Our system automatically “clusters” like users together and then baselines them against each other. In that way we catch a person “stealing” even if it’s been part of their normal baseline activity since day one.
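As a rough illustration of those two factors, here is a minimal sketch that compares today's volume first to the user's own history and then to their peers. It uses a plain z-score as a stand-in; the actual Helix Threat Detection models are not described here, and all the names and numbers below are hypothetical.

```python
from statistics import mean, pstdev


def z_score(value, reference):
    """How many standard deviations `value` sits above the mean of `reference`."""
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        return 0.0 if value == mu else float("inf")
    return (value - mu) / sigma


# Hypothetical daily "take" volumes in MB.
john_history = [100, 120, 110, 90, 80]     # John's own recent days
peers_today = [95, 105, 110, 100, 90]      # other developers in John's cluster
john_today = 400

vs_self = z_score(john_today, john_history)
vs_peers = z_score(john_today, peers_today)

# Someone who has taken a lot since day one looks normal against their own
# history, but still stands out against the peer baseline.
steady_history = [400, 410, 390, 400, 400]
steady_vs_self = z_score(400, steady_history)
steady_vs_peers = z_score(400, peers_today)

print(round(vs_self, 1), round(vs_peers, 1))
print(round(steady_vs_self, 1), round(steady_vs_peers, 1))
```

The second pair of numbers is the point of the peer comparison: the self-baseline alone would miss the steady taker.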
The third risk element is FILE: John works on source code, and that file type would automatically be deemed riskier than other file types since it’s more important.
The final risk element is METHOD: John is not checking the source code back in, and that ratio of “take to give” also feeds into the risk calculation. We are comparing the amount of data checked out vs. the amount checked into Helix.
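The “take to give” comparison can be sketched in a few lines. This is only an illustration under assumed inputs: the event tuples, the `take`/`give` labels, and the one-byte smoothing in the denominator are hypothetical choices, not the product's actual calculation.

```python
from collections import defaultdict


def take_give_ratio(events):
    """events: iterable of (user, action, size_bytes), where action is 'take' or 'give'."""
    taken = defaultdict(int)
    given = defaultdict(int)
    for user, action, size in events:
        if action == "take":
            taken[user] += size
        elif action == "give":
            given[user] += size
    users = set(taken) | set(given)
    # Add 1 to the denominator so users who never check in do not divide by zero.
    return {user: taken[user] / (given[user] + 1) for user in users}


# Hypothetical events: John only takes, his peer gives back about as much as they take.
sample_events = [
    ("john", "take", 3000),
    ("john", "take", 2000),
    ("peer", "take", 1000),
    ("peer", "give", 999),
]
ratios = take_give_ratio(sample_events)
print(ratios)
```

A high ratio on its own is not proof of theft (a build account may only ever take), which is why it is one element among four rather than a verdict.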
The behavioral analytics algorithms continually calculate these four risk elements as part of an overall behavioral risk score. This approach is unique to the Helix Threat Detection tool.
Helix Threat Detection uses behavioral analytics and machine learning algorithms on detailed user activity logs to first establish baseline behavior. For example, analytics models will look at an action (commit, sync, get) and the resource (folder, file, path), and it will take that information and cluster groups of people together (designates a project team). The Threat Detection Engine will figure this out on its own, and make updates based on continuously monitoring the Helix user activity and interaction with IP files.
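To give a feel for how users might be grouped into project teams from the resources they touch, here is a minimal sketch. It uses Jaccard similarity with a simple greedy grouping, which is a stand-in chosen for brevity and not the clustering the Threat Detection Engine actually uses. The depot paths, user names, and the 0.5 threshold are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of folders, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_users(folders_by_user, threshold=0.5):
    """Greedily group users whose folder sets overlap with a cluster's first member."""
    clusters = []
    for user in sorted(folders_by_user):
        for cluster in clusters:
            representative = cluster[0]
            if jaccard(folders_by_user[user], folders_by_user[representative]) >= threshold:
                cluster.append(user)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([user])
    return clusters


# Hypothetical activity: which folders each account has synced or committed to.
activity = {
    "alice": {"//depot/engine/src", "//depot/engine/tests"},
    "bob": {"//depot/engine/src", "//depot/engine/tests", "//depot/tools"},
    "carol": {"//depot/art/models"},
    "dave": {"//depot/art/models", "//depot/art/textures"},
}
teams = cluster_users(activity)
print(teams)
```

Once users are grouped like this, each user's baseline can be compared with the baseline of the rest of their group.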
Once enough data has been aggregated and analyzed and baseline behavior has been established, then advanced math analytics models are used to determine behavioral risk scores based on the different ways that files or data are accessed by users. In addition to behavioral risk, an Entity Risk is also calculated which summarizes multiple behavioral risk scores and rolls them up into a single risk story & score.
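One simple way to picture a roll-up of several behavioral scores into a single entity score is sketched below. The combination rule (treat each 0-100 score as a probability and combine them "noisy-OR" style) is an assumption made for illustration; the source does not say how the Entity Risk is actually computed.

```python
def entity_risk(behavior_scores):
    """Roll several 0-100 behavioral risk scores up into one 0-100 entity score."""
    remaining = 1.0  # share of "no risk" left after accounting for each score
    for score in behavior_scores:
        clamped = min(max(score, 0), 100)
        remaining *= 1 - clamped / 100
    return round(100 * (1 - remaining), 1)


# Hypothetical behavioral scores for one account.
moderate_pair = entity_risk([50, 50])      # two moderate behaviors reinforce each other
with_zero = entity_risk([80, 50, 0])       # a zero score neither adds nor removes risk
print(moderate_pair, with_zero)
```

The property worth noticing is that several moderate signals on the same account add up to something higher than any one of them, while the result never leaves the 0-100 range.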
Here you see 2 high-risk project stories, identified by the red boxes, and their risk scores. Helix Threat Detection is able to surface high-risk threats and truly anomalous behavior and separate them out from lesser, more normal baseline behavior.
Helix Threat Detection gives security teams new visibility into threats that they haven’t been able to previously achieve.
This diagram describes the data flow (inputs and outputs and how things work). Helix Threat Detection has 3 components (Connector, Agent, and Engine).
Fine-grained user activity is fed through a Threat Detection Connector to the Threat Detection Engine. There are 5 elements that are used from the logs (as shown). A month (or more) of historical data can be loaded into the engine to be aggregated and analyzed to establish patterns of baseline behavior.
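To show what loading historical log data could look like, here is a minimal sketch that parses records into typed events. The field names follow the Log Data list on the slide; the CSV layout, the sample rows, and the `LogEvent` and `parse_events` names are hypothetical and do not represent the real Helix log format or the Connector's interface.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEvent:
    timestamp: datetime
    user: str
    resource: str      # folder structure
    action: str        # "give" or "take"
    item_number: int
    client: str
    size: int          # bytes


def parse_events(text):
    """Parse CSV text with a header row into a list of LogEvent records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        LogEvent(
            timestamp=datetime.fromisoformat(row["timestamp"]),
            user=row["user"],
            resource=row["resource"],
            action=row["action"],
            item_number=int(row["item_number"]),
            client=row["client"],
            size=int(row["size"]),
        )
        for row in reader
    ]


# Hypothetical sample data.
sample = """timestamp,user,resource,action,item_number,client,size
2015-09-01T02:14:00,jsmith,//depot/project54/src,take,1201,jsmith-laptop,52428800
2015-09-01T09:30:00,mjones,//depot/project12/src,give,1202,mjones-ws,4096
"""
events = parse_events(sample)
print(len(events), events[0].user, events[0].timestamp.hour)
```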
The Threat Detection Engine continuously monitors all IP-related interactions with the Helix environment, applies the behavioral analysis models and entity-based risk scores to detect anomalous behavior and potential threats. The Threat Detection Engine proactively surfaces the highest risk data, devices, activity, and accounts (it’s possible for user accounts to be taken over or compromised). It can process tens of millions of events across tens of thousands of users to generate a prioritized list of the highest-risk accounts, data, and projects.
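The "prioritized list" idea is easy to picture once every account has a score. The sketch below ranks the accounts using the scores shown on the BEHAVIORAL ANALYTICS slide; the `top_risks` helper is hypothetical and says nothing about how the engine produces the scores themselves.

```python
import heapq


def top_risks(scores, n):
    """Return the n highest (name, score) pairs, highest first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


# Risk scores (0-100) as shown on the slide.
slide_scores = {
    "Wintermute": 89, "Armitage": 82, "Hideo": 26, "Maelcum": 26, "Molly": 25,
    "Aerol": 25, "Strayllight": 25, "Case": 18, "Chiba": 8, "Proteus": 7,
}
highest = top_risks(slide_scores, 2)
print(highest)
```

With tens of thousands of accounts, only the top of this list needs a person's attention, which is the whole point of scoring rather than alerting on every event.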
A GUI clearly identifies threats in plain English and there is also online reporting (automatic generation of executive summary reports).
A REST API allows you to interface to other systems, such as a SIEM or Security Operations Center. For example, if someone is taking Perforce log data and putting it into a SIEM, you can take the data out of the SIEM. You can take the data either directly or through other tools.
The Threat Detection Agent is optional. Once threats have been identified, then it’s possible to load a threat detection agent to monitor offline activities on the endpoint, which is sent as metadata to the Threat Detection Engine. The agent runs hidden - it’s obfuscated - and is available for Mac and Windows desktops and laptops.
Perforce has partnered with a company, Interset, that has advanced behavioral analytics models. Both companies worked to develop a joint solution that leverages the fine-grained log data in the Helix repository in combination with Interset’s advanced math risk scoring models and threat detection technology.
Perforce-specific analytics models and a data connector were created for this Helix Threat Detection solution that enables customers to quickly and accurately identify potential security breaches using existing data in Helix repositories.
An interactive graphical user interface is used to clearly identify potential security threats (which are color coded for risk severity) as well as the dominant risky behaviors.
Helix Log Data drives the Helix Threat Detection Analytics.
- Helix log data enables the analytics engine to compute risk scores using the 4 components described earlier, and applies that concept with over 20 different analytical models.
The behavioral categories are:
Wandering – accounts that are accessing surprising content that they don’t normally access
Sneaking – accounts that are active outside of their normal working patterns
Mooching – accounts that are taking unexpectedly large amounts of content, while contributing very little
Hoarding – accounts that are taking more than expected, compared to others similar to them
Each category of behavior has multiple behavioral models rolling up into it. For example, wandering includes multiple models that determine when users access content from projects that are abnormal for them. And sneaking models monitor when it is unusual for someone to be working, whether it is an abnormal time of day or an abnormal day of the week.
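A very reduced picture of the wandering and sneaking checks is sketched below. The real models are learned from each account's history and are probabilistic; here the "usual" projects, hours, and weekdays are simply given as hypothetical sets, and the function names are made up for the example.

```python
from datetime import datetime


def is_wandering(project, usual_projects):
    """True when the account touches a project outside its usual set."""
    return project not in usual_projects


def is_sneaking(when, usual_hours, usual_weekdays):
    """True when activity falls outside the account's usual hours or days of the week."""
    return when.hour not in usual_hours or when.weekday() not in usual_weekdays


# Hypothetical baseline for one account.
usual_projects = {"project12", "project31"}
usual_hours = set(range(8, 19))       # 08:00 to 18:59
usual_weekdays = set(range(0, 5))     # Monday=0 through Friday=4

night_weekend = datetime(2015, 9, 5, 2, 14)   # a Saturday, 02:14
normal_weekday = datetime(2015, 9, 2, 10, 0)  # a Wednesday, 10:00

print(is_wandering("project54", usual_projects))
print(is_sneaking(night_weekend, usual_hours, usual_weekdays))
print(is_sneaking(normal_weekday, usual_hours, usual_weekdays))
```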
In addition to measuring behavior risk, Interset also measures entity risk. This is an important set of models that helps focus attention on the most significant risks.
High Level
Helix Threat Detection surfaces your highest risk accounts (user information is anonymized to meet privacy concerns) and data assets like shares, folders, files and source code. The highest risk user accounts, machines, activities and time frames are all surfaced.
In Depth
Unlike current rule-based or simple threshold-based security solutions that require pre-configuration and that may generate thousands of alerts per day (e.g., >50, always tripped or never tripped), the Helix Threat Detection Engine overcomes “security alert noise”. It scores the threats using advanced probabilistic math to detect anomalous behavior and actual threats. This gives security teams a new visibility into threats they haven’t been able to get before.
Helix Threat Detection uses advanced behavioral analytics and machine learning to evaluate every event that occurs and applies a risk score to each. It then “connects the dots” of high risk events and surfaces the most important ones to take action on. What caused the risk, who is involved and what projects/files or other assets are at greatest risk are all clearly defined in easy to understand terms and with just one or two clicks.
In screen one, the highest risk account and projects are easily defined. Note that the data is anonymized for privacy protection.
With one click to screen two, the interactions between accounts and projects are defined. Clicking to expand a story brings you to screens 3 and 4 to quickly see the details of the risky actions.
Easy to understand language explains exactly what happened to what project, and who was involved and when.
Training: Online instructor-led training courses, run out of US and UK every month.