3. Company Overview
Wholly-owned subsidiary of The Boeing Company
• Cybersecurity & R&D software company based in Silicon Valley (Sunnyvale, CA)
Innovative technology protected by a broad IP portfolio
• Focused on fusing semantic and data planes, applying it to cybersecurity and risk management
• Making sense of physical, content, and social networks
Established customer and partner base
8. Narus nSystem
Comprehensive & Adaptive Analytics To Enhance Cybersecurity and Protect Critical Assets with Machine Learning
nAnalytics
• Single UI with interactive dashboards that offer multidimensional views of cyber activity
‒ Network, Semantic & User Analytics
‒ Targeted Session Captures
• Advanced analytics for automated data fusion with machine learning
nProcessing
• Centralized scalable data processing & storage framework
• Automated ability to deal with petabytes of data
• Support for streaming, query-based and big-data analytics
• Machine learning applied to large volumes of data
nCapture
• Architected for distribution at multiple sites & links
‒ 100% of packets examined, metadata with necessary session fidelity
• Plugins to assimilate data from heterogeneous sources
• Precision targeted full packet capture
• Support for 20G (duplex 10G) per-link, path to 100G
11. Key Challenges
• Increasing network traffic
– Line speeds from 20 Gbps to 600 Gbps and above
– 210 TB to 6.3 PB Per Day
• Diversity of deployments
– Data rates, vertical application areas, SLA, price points:
everything is a variable
• Operational issues
– Datacenter connectivity
– Burstiness of network traffic
• Data Security
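The per-day volumes quoted under "Increasing network traffic" follow from simple arithmetic on the line speed. A minimal sketch, assuming a sustained rate over 24 hours and decimal units (1 TB = 10^12 bytes); the function name and the `utilization` parameter are illustrative and not taken from the slides. At full utilization the results come out slightly above the quoted 210 TB and 6.3 PB, which suggests the slide assumes a little under 100% utilization.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_bytes(line_rate_gbps: float, utilization: float = 1.0) -> float:
    """Bytes carried in one day at a sustained line rate (decimal units)."""
    bits_per_second = line_rate_gbps * 1e9 * utilization
    return bits_per_second / 8 * SECONDS_PER_DAY


for rate in (20, 600):
    terabytes = daily_volume_bytes(rate) / 1e12
    print(f"{rate} Gbps -> {terabytes:,.0f} TB/day")
```

Passing a lower `utilization` value shows how quickly the daily volume changes with link load, which is one reason deployments vary so much.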
12. Lessons Learned/Solutions
• Extract and store all metadata and provide full packets as identified by
the analyst
– 90% reduction in data volume
• Use domain knowledge for message compression
– Short codes for enumerated values (mobile apps, protocols, etc.)
– Session associations to eliminate referential fields
• HBase over HDFS – provides abstractions useful for modelling
dynamic schema
• Offload CPU work to special-purpose co-processors to accelerate
performance
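The two message-compression ideas above (short codes for enumerated values, and session associations that remove repeated referential fields) can be sketched roughly as follows. This is only an illustration: the field names, the code tables, and the use of JSON as the serialized form are all assumptions, not details from the slides.

```python
import json

# Hypothetical code tables: domain knowledge lets long enumerated
# strings be replaced with small integers.
PROTOCOL_CODES = {"HTTP": 1, "DNS": 2, "SMTP": 3}
APP_CODES = {"Facebook": 1, "Twitter": 2, "Netflix": 3}


def compress(records):
    """Encode enumerated values as short codes and store each session's
    endpoints once instead of repeating them in every record."""
    sessions = {}  # (src, dst) -> session id, in first-seen order
    messages = []
    for rec in records:
        sid = sessions.setdefault((rec["src"], rec["dst"]), len(sessions))
        messages.append(
            [sid, PROTOCOL_CODES[rec["protocol"]], APP_CODES[rec["app"]], rec["bytes"]]
        )
    # dicts keep insertion order, so list index == session id
    return {"sessions": [list(key) for key in sessions], "messages": messages}


def decompress(packed):
    """Invert compress(): expand codes and re-attach session endpoints."""
    protocol_names = {code: name for name, code in PROTOCOL_CODES.items()}
    app_names = {code: name for name, code in APP_CODES.items()}
    records = []
    for sid, protocol, app, nbytes in packed["messages"]:
        src, dst = packed["sessions"][sid]
        records.append({"src": src, "dst": dst, "protocol": protocol_names[protocol],
                        "app": app_names[app], "bytes": nbytes})
    return records


records = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "HTTP", "app": "Facebook", "bytes": 512},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "HTTP", "app": "Twitter", "bytes": 2048},
    {"src": "10.0.0.1", "dst": "10.0.0.3", "protocol": "DNS", "app": "Netflix", "bytes": 64},
]
packed = compress(records)
print(len(json.dumps(records)), "->", len(json.dumps(packed)), "characters")
```

The saving grows with the number of records per session, since the endpoints are stored once per session rather than once per record.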
13. Lessons Learned/Solutions
• Relational databases are not evil
– Believe it or not, relational algebra is quite powerful
– We use it for fast, in-memory computations in combination with
Java code for processing rule sets
• SQL interfaces on HDFS/HBase are catching up
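To make the point about relational algebra and rule sets concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database, standing in for the Java code and relational engine the slide mentions. The table names, columns, and the example rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
conn.executescript("""
    CREATE TABLE sessions (id INTEGER PRIMARY KEY, username TEXT,
                           protocol TEXT, bytes_out INTEGER);
    CREATE TABLE rules (name TEXT, protocol TEXT, max_bytes_out INTEGER);
""")
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?)", [
    (1, "alice", "HTTP", 1200),
    (2, "bob", "FTP", 9000000),
    (3, "carol", "FTP", 300),
])
conn.executemany("INSERT INTO rules VALUES (?, ?, ?)", [
    ("large-ftp-upload", "FTP", 1000000),
])

# One join plus one selection evaluates every rule against every session.
violations = conn.execute("""
    SELECT r.name, s.username, s.bytes_out
    FROM sessions AS s
    JOIN rules AS r ON r.protocol = s.protocol
    WHERE s.bytes_out > r.max_bytes_out
    ORDER BY s.id
""").fetchall()
print(violations)
```

Keeping the rules in a table rather than in code means a new rule is a new row, and the same query evaluates it with no code changes.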
[Figure: Analytics Data Store Performance. Average query processing time versus database size for MySQL Cluster, Impala, and Big SQL.]
14. Business Considerations
• Optimizing Total Cost of Ownership (TCO)
– System acquisition
– Data center costs
– Administration and maintenance
• Analytics development and skillset required
• Global support
15. Data Warehousing Vs Hadoop
Source: “Big Data: What does it really cost?” By Winter Corp
16. Summary
• We blend network, semantic, and user-oriented views to create unique
insights
– Data Loss Prevention
– Threat Detection
– Network pattern mining
• Real Time & At-Rest Analytics
– Stateless analysis, short-term trends, and classification
– At-Rest analysis for training models, opportunistic correlations, and
mega-trends
• Hybrid Approach
– Hadoop/HBase for horizontal scaling and cost-effective storage and
processing of massive data sets
– Relational databases for creating efficient business intelligence views
Editor's Notes
Narus is a wholly-owned subsidiary of The Boeing Company. We are a software company focused on cybersecurity and research and development, based in Silicon Valley. In the past few years, we’ve all seen how the web has changed from being static (what we call 1.0), serving up pages, to becoming the semantic web, where data has meaning and context. We are seeing a parallel evolution in approaches to cybersecurity. Narus is pioneering this new approach, termed Cyber 3.0, to address cybersecurity and risk management. We will discuss Cyber 3.0 in more detail later. One of the key elements of Cyber 3.0 is machine learning, which speeds up problem solving and lowers its cost. Narus’ solution incorporates big data analytics to handle the huge volume of diverse information collected from a wide range of devices, and it incorporates sentiment analysis to truly understand context: what is going on in your network and what the users are doing.
There are a lot of notes on this slide; the goal is to provide examples for the customer. In the consumer world, you start by reading content (for example, manuals posted on web sites). Now, based on one choice you have made, other recommendations are given to you (shopping recommendations on Amazon, for example), and with more context (your age, your birthday, etc.) those choices can be refined further. That is what constitutes the semantic web and Intelligent Cyber.
Let’s start the journey with Web 1.0, the Static Web. Remember that in the early days we had read-only content and static HTML websites. Information was served to us. In the area of cyber, where we conduct business over the Internet, we start with Cyber 1.0, what we call “Siloed Cyber”. During this phase, we had large volumes of data, though not as much as we see today, and the information tended to be homogeneous (e.g. corporate data with credit card info, customer info). The data tended to be “siloed”, collected and managed by individual organizations (e.g. sales vs. marketing vs. support). Data was analyzed on demand. We had a limited number of applications and protocols: PowerPoint, email, Excel, Word. Resources resided mostly with IT and not with the organizations trying to accomplish specific missions. We relied heavily on human analysts to analyze data and bring meaning and context to it.
We then entered the phase of Web 2.0, or the Social Web, with the introduction of Facebook, Twitter, LinkedIn, and Instagram, where users and communities generate content. We have the read-write web. Posting content on the web is no longer limited to a few authorized personnel and is not controlled by IT. This led to Cyber 2.0, “Integrated Cyber”. There are now higher volumes of data on faster networks to deal with. With the proliferation of devices, we are also seeing huge growth in applications and protocols.
With Facebook, Twitter and the other social networking applications, people are now connecting with each other and sharing content. Despite the increased complexity, we still rely mostly on highly skilled but scarce analysts to look at the data, extract information and bring context to it. With the increased variety of interactions (applications, devices, people), there is looser control, and cyber criminals are taking advantage of the situation by unleashing new cyber threats.
We are now entering the era of the “Semantic Web”: we are adding “context” to data and internet traffic based on a superior understanding of “relationships” within the data, such as who created the information, who received it, when it was created, where it came from, and where it went. With the changes brought on by “Web 3.0”, conducting business in the cyber world just got even more complex. This leads to the challenges we face today: “Cyber 3.0”, or “Intelligent Cyber”. Besides a high volume of data traveling at high velocity, we also have to deal with the variety of data available (not just the Microsoft suite, but also data generated from social networking and mobile applications). People are now hyperconnected and hyper-interactive. Just think about how many devices you use today and how often you interact with your applications. This has led to the need for organizations to automate the alignment of resources and missions, placing resources outside of IT where they are needed. With the added complexity, automated processes and reliance on machine learning to gain more intelligence and context about communications are becoming prevalent. Let’s look at some of these concepts in more detail.
In addition to the changing dynamic of how content is generated and consumed, other dynamics have changed the very nature of networks and how we address cybersecurity. In 2012, traffic volume was 10+ petabytes per day. By 2015, it’s going to double to 20.33 petabytes per day. Today, each person uses 2.5 devices on average, and by 2016 we will be generating 19 billion connections. On the velocity front (the rate at which information flows through the network, for example how quickly you can download a movie from Netflix), we are seeing fixed line speeds grow from 1 Gb to 100 Gb. And the variety of devices, applications, operating systems, and network topologies has a significant impact on cybersecurity. Did you know that there are more than 1.5 million applications in the Android and Apple stores? The types of data we have to discover, analyze and track have grown. In the past, enterprises mostly dealt with corporate data, mostly text and numbers stored in relational databases in nice rows and columns; we now have to add voice and video data, plus more protocols and network types (local, cloud, virtualized and hybrid).
Today we are addressing new problems using old approaches and tools.
Visibility: Organizations are having a very difficult time keeping pace with the unprecedented and unpredictable emergence of new applications, protocols, hosts and users on their networks. It’s like operating in the dark.
Context: Investigating the impact and root cause of cyber threats and security breaches requires understanding the “context” of the communications. We have traditionally relied on highly skilled, scarce and highly paid analysts to digest the overwhelming amount of data. Besides being costly, this also takes a lot of time, during which a cyber attack could be wreaking havoc in your network.
Control: In recent headlines, we read about theft of intellectual property, attacks on networks, and financial theft over the internet. Organizations are losing the security battle despite major investments in network security and information security. The static approach is not sufficient to solve today’s dynamic problems. We all have lots of tools; however, we still can’t enforce the more granular policies that match the mission and allow tighter control over networks, ensuring that resources are aligned with your mission or business goals.
In order to analyze the data that is streaming through the network, Narus has developed a methodology to disaggregate unique dimensions, so they can be processed and managed independently. The value of deriving these dimensions is in the analytics, where we fuse these dimensions together to paint a complete picture and provide the complete visibility that security organizations need. The unique dimensions are categorized into 3 different "planes." The network plane consists of information about devices (brand, type and operating system) and hosts (client, server, applications, protocols and services). The semantic plane consists of content, topics, trends, communities and locations. The user plane consists of presence, profiles, identities, associations and relationships related to users. Narus solutions automate the understanding of each of these planes, identify the context of the interactions, and aggregate data across the planes to deliver incisive intelligence.
Better visibility and control over your network, allowing you to proactively defend against new threats with machine-based intelligence for certainty in a complex world