OpenSOC
The Open Security Operations Center
for Analyzing 1.2 Million Network Packets per Second in Real Time

James Sirota
Big Data Architect
Cisco Security Solutions Practice
jsirota@cisco.com

Sheetal Dolas
Principal Architect
Hortonworks
sheetal@hortonworks.com

June 3, 2014
Cisco Confidential © 2013-2014 Cisco and/or its affiliates. All rights reserved.
Over Next Few Minutes
• Problem Statement & Business Case for OpenSOC
• Solution Architecture and Design
• Best Practices and Lessons Learned
• Q & A
Business Case
“… fatalism: It's no longer if or when you get hacked, but the assumption is that you've already been hacked, with a focus on minimizing the damage.”
Source: Dark Reading / Security’s New Reality: Assume The Worst
Breaches Happen in Hours… But Go Undetected for Months or Even Years
Source: 2013 Data Breach Investigations Report

Timespan of events by percent of breaches

                                           Seconds  Minutes    Hours     Days    Weeks   Months    Years
Initial Attack to Initial Compromise           10%      75%      12%       2%       0%       1%       1%
Initial Compromise to Data Exfiltration         8%      38%      14%      25%       8%       8%       0%
Initial Compromise to Discovery                 0%       0%       2%      13%      29%      54%       2%
Discovery to Containment/Restoration            0%       1%       9%      32%      38%      17%       4%

In 60% of breaches, data is stolen in hours
54% of breaches are not discovered for months
Cisco Global Cloud Index
Source: 2014 Cisco Global Cloud Index
Introducing OpenSOC
Intersection of Big Data and Security Analytics
Big Data Platform (Hadoop): Multi Petabyte Storage, Interactive Query, Real-Time Search, Scalable Stream Processing, Unstructured Data, Data Access Control, Scalable Compute
OpenSOC: Real-Time Alerts, Anomaly Detection, Data Correlation, Rules and Reports, Predictive Modeling, UI and Applications
OpenSOC Journey
Sept 2013: First Prototype
Dec 2013: Hortonworks joins the project
March 2014: Platform development finished
April 2014: First beta test at customer site
May 2014: CR Work off
Sept 2014: General Availability
Solution Architecture & Design
OpenSOC Conceptual Architecture
Sources: Raw Network Stream, Network Metadata Stream, Netflow, Syslog, Raw Application Logs, Other Streaming Telemetry
Processing: Parse + Format, Enrich, Alert
Enrichment inputs: Threat Intelligence Feeds, Enrichment Data
Storage: Hive (Long-Term Store), HBase (Raw Packet Store), Elastic Search (Real-Time Index)
Applications + Analyst Tools: Network Packet Mining and PCAP Reconstruction; Log Mining and Analytics; Big Data Exploration, Predictive Modeling
Key Functional Capabilities
• Raw Network Packet Capture, Store, Traffic Reconstruction
• Telemetry Ingest, Enrichment and Real-Time Rules-Based Alerts
• Real-Time Telemetry Search and Cross-Telemetry Matching
• Automated Reports, Anomaly Detection and Anomaly Alerts
The OpenSOC Advantage
• Fully Backed by Cisco and Used Internally for Multiple Customers
• Free, Open Source and Apache Licensed
• Built on Highly Scalable and Proven Platforms (Hadoop, Kafka, Storm)
• Extensible and Pluggable Design
• Flexible Deployment Model (On-Premise or Cloud)
• Centralize your processes, people and data
OpenSOC Deployment at Cisco
Hardware footprint (40u)
• 14 Data Nodes (UCS C240 M3)
• 3 Cluster Control Nodes (UCS C220 M3)
• 2 ESX Hypervisor Hosts (UCS C220 M3)
• 1 PCAP Processor (UCS C220 M3 + Napatech NIC)
• 2 SourceFire Threat alert processors
• 1 Anue Network Traffic splitter
• 1 Router
• 1 48 Port 10GE Switch
Software Stack
HDP 2.1
Kafka 0.8
Elastic Search 1.1
MySQL 5.5
OpenSOC - Stitching Things Together
Source Systems: Passive Tap, Traffic Replicator; Telemetry Sources (Syslog, HTTP, File System, Other)
Data Collection: PCAP; Flume (Agent A, Agent B, Agent N)
Messaging System (Kafka): PCAP Topic, DPI Topic, A Topic, B Topic, N Topic
Real Time Processing (Storm): PCAP Topology, DPI Topology, A Topology, B Topology, N Topology
Storage: Hive (Raw Data, ORC); Elastic Search (Index); HBase (PCAP Table)
Access: Analytic Tools (R / Python, Power Pivot, Tableau); Web Services (Search, PCAP Reconstruction)
PCAP Topology
Real Time Processing (Storm): Kafka Spout -> Parser Bolt -> HDFS Bolt, HBase Bolt, ES Bolt
Storage: Hive (Raw Data, ORC); HBase (PCAP Table); Elastic Search (Index)
DPI Topology & Telemetry Enrichment
Real Time Processing (Storm): Kafka Spout -> Parser Bolt -> GEO Enrich -> Whois Enrich -> CIF Enrich -> HDFS Bolt, ES Bolt
Storage: Hive (Raw Data, ORC); HBase (PCAP Table); Elastic Search (Index)
Enrichments
Parser Bolt -> GEO Enrich -> Who Is Enrich -> CIF Enrich
GEO Enrich: Cache backed by MySQL (Geo Lite Data)
Who Is Enrich: Cache backed by HBase (Who Is Data)
CIF Enrich: Cache backed by HBase (CIF Data)

RAW Message
{
  "msg_key1": "msg value1",
  "src_ip": "10.20.30.40",
  "dest_ip": "20.30.40.50",
  "domain": "mydomain.com"
}

Enriched Message (fields added by the enrichments)
"geo":[ {
  "region":"CA",
  "postalCode":"95134",
  "areaCode":"408",
  "metroCode":"807",
  "longitude":-121.946,
  "latitude":37.425,
  "locId":4522,
  "city":"San Jose",
  "country":"US"
} ],
"whois":[ {
  "OrgId":"CISCOS",
  "Parent":"NET-144-0-0-0-0",
  "OrgAbuseName":"Cisco Systems Inc",
  "RegDate":"1991-01-17",
  "OrgName":"Cisco Systems",
  "Address":"170 West Tasman Drive",
  "NetType":"Direct Assignment"
} ],
"cif":"Yes"
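The enrichment chain on this slide can be sketched in a few lines of plain Python. This is only an illustration, not the OpenSOC bolts themselves: the three lookup tables are hypothetical in-memory stand-ins for the MySQL and HBase backed caches, and the choice of lookup key for each enrichment (src_ip for geo, domain for whois, dest_ip for CIF) as well as the "No" value for a CIF miss are assumptions the slide does not confirm.

```python
from typing import Any, Dict

# Hypothetical in-memory stand-ins for the caches named on the slide.
GEO_CACHE: Dict[str, Dict[str, Any]] = {
    "10.20.30.40": {"city": "San Jose", "region": "CA", "country": "US"},
}
WHOIS_CACHE: Dict[str, Dict[str, Any]] = {
    "mydomain.com": {"OrgName": "Cisco Systems", "NetType": "Direct Assignment"},
}
CIF_CACHE = {"20.30.40.50"}  # addresses flagged by the threat intelligence feed


def enrich(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the parsed message with geo, whois and cif fields added."""
    enriched = dict(message)  # leave the raw message untouched

    geo = GEO_CACHE.get(message.get("src_ip"))  # assumed lookup key
    if geo is not None:
        enriched["geo"] = [geo]

    whois = WHOIS_CACHE.get(message.get("domain"))  # assumed lookup key
    if whois is not None:
        enriched["whois"] = [whois]

    # "Yes" matches the slide; "No" for a miss is an assumption.
    enriched["cif"] = "Yes" if message.get("dest_ip") in CIF_CACHE else "No"
    return enriched


raw = {
    "msg_key1": "msg value1",
    "src_ip": "10.20.30.40",
    "dest_ip": "20.30.40.50",
    "domain": "mydomain.com",
}
enriched = enrich(raw)
```

Each enrichment only adds fields, so the stages can be reordered or dropped without affecting the others.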
Applications: Telemetry Matching and DPI
Step 1: Search
Step 2: Match
Step 3: Analyze
Step 4: Build PCAP
Integration with Analytics Tools
Dashboards Reports
Best Practices and Lessons Learned
Journey Towards Highly Scalable Application
Kafka Tuning
This is where we began
Some code optimizations and increased parallelism
Kafka Tuning
• Is disk I/O heavy
• Kafka 0.8+ supports replication and JBOD
  • Better performance compared to RAID
• Parallelism is largely driven by number of disks and partitions per topic
• Key configuration parameters:
  • num.io.threads - keep it at least equal to the number of disks provided to Kafka
  • num.network.threads - adjust it based on the number of concurrent producers, consumers and the replication factor
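As a rough illustration of the num.io.threads guidance, the helper below derives a lower bound from the broker's log.dirs setting. It is a sketch under one assumption: in a JBOD layout each log.dirs entry sits on its own disk. The function name and the example paths are hypothetical. num.network.threads is deliberately left out, since the slide gives no formula for it.

```python
def suggest_io_threads(log_dirs: str, configured_io_threads: int) -> int:
    """Return a num.io.threads value that is at least the number of disks.

    Assumes each comma-separated entry in log.dirs is a separate disk (JBOD).
    """
    disks = [path.strip() for path in log_dirs.split(",") if path.strip()]
    return max(configured_io_threads, len(disks))


# Example: a broker with 12 data disks but only 8 I/O threads configured.
example_dirs = ",".join(f"/data{i}/kafka" for i in range(1, 13))
suggested = suggest_io_threads(example_dirs, configured_io_threads=8)
```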
After Kafka Tuning
Bottleneck Isolation, Resource Profiling, Load Balancing
HBase Tuning
This is where we began
Row Key Design
• Row key design is critical (gets or scans or both?)
• Keys with IP addresses
  • Most dotted-decimal IP addresses start with one of only two characters: 1 or 2
  • Minimum key length will be 7 characters and maximum 15, with a typical average of 12
  • Subnet range scans become difficult: keys sort as strings, so a range of 90 to 220 excludes 112
• IP converted to hex (10.20.30.40 => 0a141e28)
  • gives 16 variations of the first key character
  • consistently 8 character key
  • easy to search for subnet ranges
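The hex conversion is easy to reproduce with Python's standard ipaddress module. The sketch below is an illustration rather than the OpenSOC implementation; the function names are hypothetical. It shows the fixed-width key, the start and stop keys for a subnet scan, and why string ordering of dotted-decimal addresses breaks range scans.

```python
import ipaddress
from typing import Tuple


def ip_to_hex_key(ip: str) -> str:
    """Return the fixed-width, 8-character lowercase hex form of an IPv4 address."""
    return format(int(ipaddress.IPv4Address(ip)), "08x")


def subnet_scan_range(cidr: str) -> Tuple[str, str]:
    """Return (first_key, last_key) covering every address in an IPv4 subnet."""
    net = ipaddress.IPv4Network(cidr)
    return (
        format(int(net.network_address), "08x"),
        format(int(net.broadcast_address), "08x"),
    )


addresses = ["90.0.0.1", "112.0.0.1", "220.0.0.1"]

# Dotted-decimal strings sort lexicographically, so 112 lands before 90.
as_strings = sorted(addresses)

# Hex keys sort in true numeric order.
as_hex = sorted(addresses, key=ip_to_hex_key)
```

Because every key has the same width, lexicographic order and numeric order coincide, which is what makes a start/stop scan over a subnet possible.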
Experiments with Row Key
Region Splits
• Know your data
• Auto split under high workload can result in hotspots and split storms
• Understand your data and presplit the regions
• Identify how many regions a region server (RS) can have to perform optimally. Use the formula below:
  (RS memory) * (total memstore fraction) / ((memstore size) * (# column families))
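The regions-per-region-server formula is simple arithmetic, written out below as a small function. The parameter names and the example values are illustrative assumptions, not figures from the deployment; RS memory and memstore size must be given in the same unit.

```python
def max_regions_per_region_server(
    rs_memory_mb: float,
    total_memstore_fraction: float,
    memstore_size_mb: float,
    column_families: int,
) -> float:
    """(RS memory) * (total memstore fraction) / ((memstore size) * (# column families))."""
    if memstore_size_mb <= 0 or column_families < 1:
        raise ValueError("memstore size must be positive and column families at least 1")
    return (rs_memory_mb * total_memstore_fraction) / (memstore_size_mb * column_families)


# Illustrative values only: 16 GB of RS memory, 40% of it reserved for
# memstores, 128 MB per memstore, one column family.
regions = max_regions_per_region_server(16384, 0.4, 128, 1)
```

With these example numbers the result is about 51 regions per region server. Adding a second column family halves it.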
With Region Pre-Splits
Know Your Application
• Enable micro batching (client-side buffer)
• Smart shuffle/grouping in Storm
• Understand your data and situationally exploit various WAL options
• Watch for many minor compactions
• For a heavy ‘write’ workload, increase hbase.hstore.blockingStoreFiles (we used 200)
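Micro batching with a client-side buffer can be sketched generically as below. This is not the HBase client API: MicroBatcher, its sink callback and the batch size are hypothetical names chosen for illustration. The idea is only that records accumulate locally and are written in one call per batch instead of one call per record.

```python
from typing import Callable, List


class MicroBatcher:
    """Buffer records on the client and hand them to `sink` in batches."""

    def __init__(self, sink: Callable[[List[dict]], None], max_batch: int = 500) -> None:
        if max_batch < 1:
            raise ValueError("max_batch must be at least 1")
        self._sink = sink
        self._max_batch = max_batch
        self._buffer: List[dict] = []

    def add(self, record: dict) -> None:
        """Queue one record; flush automatically once the buffer is full."""
        self._buffer.append(record)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered. Call this on shutdown to avoid losing the tail."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._sink(batch)


# Usage: collect batches in a list instead of writing to a real store.
batches: List[List[dict]] = []
batcher = MicroBatcher(batches.append, max_batch=2)
for n in range(5):
    batcher.add({"n": n})
batcher.flush()
```

The trade-off is durability: anything still in the buffer is lost if the process dies before a flush, so the batch size should reflect how much data loss or replay the application can tolerate.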
And Finally
Kafka Spout
Kafka Spout
• Parallelism is controlled by the number of partitions per topic
• Set Kafka spout parallelism equal to the number of partitions in the topic
• Other key parameters that drive performance
  • fetchSizeBytes
  • bufferSizeBytes
Mysteriously Missing Data
Mysteriously Missing Data Root Cause
• A bug in the Kafka spout caused it to miss some partitions and lose data
• It is now fixed and available from the Hortonworks repository (http://repo.hortonworks.com/content/repositories/releases/org/apache/storm/storm-Kafka)
Storm
Storm
• Every small thing counts at scale
• Even simple string operations can slow down throughput when executed on millions of tuples
Storm
• Error handling is critical
• Poorly handled errors can lead to topology failure and eventually loss of data (or data duplication)
Storm
• Tune and scale individual spouts and bolts before performance testing/tuning the entire topology
• Write your own simple data generator spouts and no-op bolts
• Making as many things configurable as possible helps a lot
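The idea of testing one stage in isolation with a data generator and a no-op sink can be sketched without any Storm dependency. The code below is plain Python, not the Storm spout or bolt API; generate_tuples, no_op_sink and measure_stage are hypothetical names, and the synthetic tuple fields are borrowed from the raw message example earlier in the deck.

```python
import time
from typing import Any, Callable, Dict, Iterator, Tuple


def generate_tuples(count: int) -> Iterator[Dict[str, str]]:
    """Stand-in for a data generator spout: emits synthetic, predictable tuples."""
    for seq in range(count):
        yield {"src_ip": "10.20.30.40", "dest_ip": "20.30.40.50", "seq": str(seq)}


def no_op_sink(_result: Any) -> None:
    """Stand-in for a no-op bolt: accepts output and discards it."""
    return None


def measure_stage(
    stage: Callable[[Dict[str, str]], Any], count: int
) -> Tuple[int, float]:
    """Run `stage` alone between the generator and the no-op sink.

    Returns (tuples processed, elapsed seconds), so upstream and downstream
    components cannot mask or cause the bottleneck being measured.
    """
    processed = 0
    start = time.perf_counter()
    for item in generate_tuples(count):
        no_op_sink(stage(item))
        processed += 1
    return processed, time.perf_counter() - start


# Example stage under test: a trivial parser that upper-cases one field.
processed, elapsed = measure_stage(lambda t: t["src_ip"].upper(), count=10_000)
```

Keeping the tuple count and the stage as parameters follows the slide's last bullet: the more of the harness is configurable, the easier it is to rerun the same measurement after each change.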
Lessons Learned
• When it comes to Hadoop…partner up
• Separate the hype from the opportunity
• Start small then scale up
• Design iteratively
• It doesn’t work unless you have proven it at scale
• Keep an eye on ROI
Looking for Community Partners
How can you contribute?
• Technology Partner Program: contribute developers to join the Cisco and Hortonworks team
Cisco + Hortonworks + Community Support for OpenSOC
Thank you!
We are hiring:
jsirota@cisco.com
sheetal@hortonworks.com

More Related Content

What's hot

ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)NAVER D2
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Flink Forward
 
Apache Spark at Airbnb
Apache Spark at AirbnbApache Spark at Airbnb
Apache Spark at AirbnbDatabricks
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Flink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Building Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowBuilding Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowSid Anand
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak PerformanceTodd Palino
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafkaconfluent
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...HostedbyConfluent
 

What's hot (20)

ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
 
TiDB Introduction
TiDB IntroductionTiDB Introduction
TiDB Introduction
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
 
Apache Spark at Airbnb
Apache Spark at AirbnbApache Spark at Airbnb
Apache Spark at Airbnb
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Building Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache AirflowBuilding Better Data Pipelines using Apache Airflow
Building Better Data Pipelines using Apache Airflow
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafka
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Fast analytics kudu to druid
Fast analytics  kudu to druidFast analytics  kudu to druid
Fast analytics kudu to druid
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
 

Viewers also liked

Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataJames Sirota
 
Security Operations Center (SOC) Essentials for the SME
Security Operations Center (SOC) Essentials for the SMESecurity Operations Center (SOC) Essentials for the SME
Security Operations Center (SOC) Essentials for the SMEAlienVault
 
Building a Next-Generation Security Operation Center Based on IBM QRadar and ...
Building a Next-Generation Security Operation Center Based on IBM QRadar and ...Building a Next-Generation Security Operation Center Based on IBM QRadar and ...
Building a Next-Generation Security Operation Center Based on IBM QRadar and ...IBM Security
 
Building Security Operation Center
Building Security Operation CenterBuilding Security Operation Center
Building Security Operation CenterS.E. CTS CERT-GOV-MD
 
Security Operation Center - Design & Build
Security Operation Center - Design & BuildSecurity Operation Center - Design & Build
Security Operation Center - Design & BuildSameer Paradia
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsBigPanda
 
SOC presentation- Building a Security Operations Center
SOC presentation- Building a Security Operations CenterSOC presentation- Building a Security Operations Center
SOC presentation- Building a Security Operations CenterMichael Nickle
 
Open Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCOpen Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCSheetal Dolas
 
Chapter 10 Anomaly Detection
Chapter 10 Anomaly DetectionChapter 10 Anomaly Detection
Chapter 10 Anomaly DetectionKhalid Elshafie
 
An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)Ahmad Haghighi
 
DTS Solution - Building a SOC (Security Operations Center)
DTS Solution - Building a SOC (Security Operations Center)DTS Solution - Building a SOC (Security Operations Center)
DTS Solution - Building a SOC (Security Operations Center)Shah Sheikh
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
Network Traffic Search using Apache HBase
Network Traffic Search using Apache HBaseNetwork Traffic Search using Apache HBase
Network Traffic Search using Apache HBaseEvans Ye
 
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingAlex Pinto
 
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Alex Pinto
 
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRealtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRyan Bosshart
 
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...Alex Pinto
 

Viewers also liked (20)

Cisco OpenSOC
Cisco OpenSOCCisco OpenSOC
Cisco OpenSOC
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
Security Operations Center (SOC) Essentials for the SME
Security Operations Center (SOC) Essentials for the SMESecurity Operations Center (SOC) Essentials for the SME
Security Operations Center (SOC) Essentials for the SME
 
Building a Next-Generation Security Operation Center Based on IBM QRadar and ...
Building a Next-Generation Security Operation Center Based on IBM QRadar and ...Building a Next-Generation Security Operation Center Based on IBM QRadar and ...
Building a Next-Generation Security Operation Center Based on IBM QRadar and ...
 
Building Security Operation Center
Building Security Operation CenterBuilding Security Operation Center
Building Security Operation Center
 
Security Operation Center - Design & Build
Security Operation Center - Design & BuildSecurity Operation Center - Design & Build
Security Operation Center - Design & Build
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOps
 
SOC presentation- Building a Security Operations Center
SOC presentation- Building a Security Operations CenterSOC presentation- Building a Security Operations Center
SOC presentation- Building a Security Operations Center
 
Open Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCOpen Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOC
 
Chapter 10 Anomaly Detection
Chapter 10 Anomaly DetectionChapter 10 Anomaly Detection
Chapter 10 Anomaly Detection
 
An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)An introduction to SOC (Security Operation Center)
An introduction to SOC (Security Operation Center)
 
DTS Solution - Building a SOC (Security Operations Center)
DTS Solution - Building a SOC (Security Operations Center)DTS Solution - Building a SOC (Security Operations Center)
DTS Solution - Building a SOC (Security Operations Center)
 
Anomaly Detection
Anomaly DetectionAnomaly Detection
Anomaly Detection
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Network Traffic Search using Apache HBase
Network Traffic Search using Apache HBaseNetwork Traffic Search using Apache HBase
Network Traffic Search using Apache HBase
 
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
 
Csirt
CsirtCsirt
Csirt
 
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
 
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRealtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLib
 
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
 

Similar to Analyzing 1.2 Million Network Packets per Second in Real-time

Cisco UCS Application acceleration data optimization
Cisco UCS Application acceleration data optimizationCisco UCS Application acceleration data optimization
Cisco UCS Application acceleration data optimizationsolarisyougood
 
CONFidence2015: Real World Threat Hunting - Martin Nystrom
CONFidence2015: Real World Threat Hunting - Martin NystromCONFidence2015: Real World Threat Hunting - Martin Nystrom
CONFidence2015: Real World Threat Hunting - Martin NystromPROIDEA
 
Model driven telemetry
Model driven telemetryModel driven telemetry
Model driven telemetryCisco Canada
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...Timothy Spann
 
Model-driven Telemetry: The Foundation of Big Data Analytics
Model-driven Telemetry: The Foundation of Big Data AnalyticsModel-driven Telemetry: The Foundation of Big Data Analytics
Model-driven Telemetry: The Foundation of Big Data AnalyticsCisco Canada
 
IPv6IntegrationBestPracticesfinal.pdf
IPv6IntegrationBestPracticesfinal.pdfIPv6IntegrationBestPracticesfinal.pdf
IPv6IntegrationBestPracticesfinal.pdfCPUHogg
 
[Cisco Connect 2018 - Vietnam] Eric rennie sw cisco_connect
[Cisco Connect 2018 - Vietnam] Eric rennie  sw cisco_connect[Cisco Connect 2018 - Vietnam] Eric rennie  sw cisco_connect
[Cisco Connect 2018 - Vietnam] Eric rennie sw cisco_connectNur Shiqim Chok
 
IPv6 Security - Myths and Reality
IPv6 Security - Myths and RealityIPv6 Security - Myths and Reality
IPv6 Security - Myths and RealitySwiss IPv6 Council
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
Analyzing 1.2 Million Network Packets per Second in Real-time

  • 1. OpenSOC: The Open Security Operations Center for Analyzing 1.2 Million Network Packets per Second in Real Time. James Sirota, Big Data Architect, Cisco Security Solutions Practice, jsirota@cisco.com. Sheetal Dolas, Principal Architect, Hortonworks, sheetal@hortonworks.com. June 3, 2014
  • 2. Over Next Few Minutes
      • Problem Statement & Business Case for OpenSOC
      • Solution Architecture and Design
      • Best Practices and Lessons Learned
      • Q & A
  • 3. Business Case
  • 4. fatalism: It's no longer if or when you get hacked, but the assumption is that you've already been hacked, with a focus on minimizing the damage.” Source: Dark Reading / Security’s New Reality: Assume The Worst
  • 5. Breaches Happen in Hours… But Go Undetected for Months or Even Years
      Timespan of events by percent of breaches (Source: 2013 Data Breach Investigations Report)

      Phase                                    Seconds  Minutes  Hours  Days  Weeks  Months  Years
      Initial Attack to Initial Compromise       10%      75%     12%    2%    0%     1%      1%
      Initial Compromise to Data Exfiltration     8%      38%     14%   25%    8%     8%      0%
      Initial Compromise to Discovery             0%       0%      2%   13%   29%    54%      2%
      Discovery to Containment/Restoration        0%       1%      9%   32%   38%    17%      4%

      • In 60% of breaches, data is stolen in hours
      • 54% of breaches are not discovered for months
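
The 60% callout on slide 5 can be checked against the table by adding up the buckets. The sketch below is only an illustration of that arithmetic: the dictionary keys and function names are made up for this example, and the percentages are copied from the slide.

```python
# Percent of breaches per time bucket, copied from slide 5.
# Key names and helper names are hypothetical, chosen for this sketch.
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]

TIMESPANS = {
    "attack_to_compromise":       [10, 75, 12, 2, 0, 1, 1],
    "compromise_to_exfiltration": [8, 38, 14, 25, 8, 8, 0],
    "compromise_to_discovery":    [0, 0, 2, 13, 29, 54, 2],
    "discovery_to_containment":   [0, 1, 9, 32, 38, 17, 4],
}


def share_within(phase: str, bucket: str) -> int:
    """Percent of breaches where the phase completes in `bucket` or faster."""
    end = BUCKETS.index(bucket) + 1
    return sum(TIMESPANS[phase][:end])


def share_at_or_beyond(phase: str, bucket: str) -> int:
    """Percent of breaches where the phase takes `bucket` or longer."""
    start = BUCKETS.index(bucket)
    return sum(TIMESPANS[phase][start:])


# Data stolen within hours: 8 + 38 + 14
exfil_within_hours = share_within("compromise_to_exfiltration", "hours")
# Discovery taking months or longer: 54 + 2
discovery_months_plus = share_at_or_beyond("compromise_to_discovery", "months")
```

The first value reproduces the 60% callout. The 54% callout is the "Months" bucket alone; counting the "Years" bucket as well gives a slightly higher share.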
  • 6. Cisco Global Cloud Index. Source: 2014 Cisco Global Cloud Index
  • 7. Introducing OpenSOC: Intersection of Big Data and Security Analytics. Multi Petabyte Storage, Interactive Query, Real-Time Search, Scalable Stream Processing, Unstructured Data, Data Access Control, Scalable Compute; OpenSOC; Real-Time Alerts, Anomaly Detection, Data Correlation, Rules and Reports, Predictive Modeling, UI and Applications; Big Data Platform; Hadoop
  • 8. OpenSOC Journey
      • Sept 2013: First Prototype
      • Dec 2013: Hortonworks joins the project
      • March 2014: Platform development finished
      • April 2014: First beta test at customer site
      • May 2014: CR Work off
      • Sept 2014: General Availability
  • 9. Solution Architecture & Design
  • 10. OpenSOC Conceptual Architecture
      • Raw Network Stream, Network Metadata Stream, Netflow, Syslog, Raw Application Logs, Other Streaming Telemetry
      • Parse + Format, Enrich, Alert; Threat Intelligence Feeds; Enrichment Data
      • Hive (Long-Term Store), HBase (Raw Packet Store), Elastic Search (Real-Time Index)
      • Applications + Analyst Tools: Network Packet Mining and PCAP Reconstruction; Log Mining and Analytics; Big Data Exploration, Predictive Modeling
  • 11. Key Functional Capabilities
      • Raw Network Packet Capture, Store, Traffic Reconstruction
      • Telemetry Ingest, Enrichment and Real-Time Rules-Based Alerts
      • Real-Time Telemetry Search and Cross-Telemetry Matching
      • Automated Reports, Anomaly Detection and Anomaly Alerts
  • 12. The OpenSOC Advantage
      • Fully Backed by Cisco and Used Internally for Multiple Customers
      • Free, Open Source and Apache Licensed
      • Built on Highly Scalable and Proven Platforms (Hadoop, Kafka, Storm)
      • Extensible and Pluggable Design
      • Flexible Deployment Model (On-Premise or Cloud)
      • Centralize your processes, people and data
  • 13. OpenSOC Deployment at Cisco
      Hardware footprint (40U):
      • 14 Data Nodes (UCS C240 M3)
      • 3 Cluster Control Nodes (UCS C220 M3)
      • 2 ESX Hypervisor Hosts (UCS C220 M3)
      • 1 PCAP Processor (UCS C220 M3 + Napatech NIC)
      • 2 Sourcefire Threat alert processors
      • 1 Anue Network Traffic splitter
      • 1 Router
      • 1 48-Port 10GE Switch
      Software Stack: HDP 2.1, Kafka 0.8, Elastic Search 1.1, MySQL 5.5
  • 14. OpenSOC - Stitching Things Together
      • Source Systems: Passive Tap; Telemetry Sources (Syslog, HTTP, File System, Other)
      • Data Collection: PCAP, Traffic Replicator; Flume (Agent A, Agent B, Agent N)
      • Messaging System (Kafka): PCAP Topic, DPI Topic, A Topic, B Topic, N Topic
      • Real Time Processing (Storm): PCAP Topology, DPI Topology, A Topology, B Topology, N Topology
      • Storage: Hive (Raw Data, ORC), Elastic Search (Index), HBase (PCAP Table)
      • Access: Analytic Tools (R / Python, Power Pivot, Tableau); Web Services (Search, PCAP Reconstruction)
  • 15. OpenSOC - Stitching Things Together (same diagram as slide 14, with a "Deeper Look" callout)
  • 16. PCAP Topology
      • Real Time Processing (Storm): Kafka Spout, Parser Bolt, HDFS Bolt, HBase Bolt, ES Bolt
      • Storage: Hive (Raw Data, ORC), Elastic Search (Index), HBase (PCAP Table)
  • 17. DPI Topology & Telemetry Enrichment
      • Real Time Processing (Storm): Kafka Spout, Parser Bolt, GEO Enrich, Whois Enrich, CIF Enrich, HDFS Bolt, ES Bolt
      • Storage: Hive (Raw Data, ORC), Elastic Search (Index), HBase (PCAP Table)
  • 18. Enrichments: Parser Bolt -> GEO Enrich -> Who Is Enrich -> CIF Enrich
      RAW Message:
        { "msg_key1": "msg value1", "src_ip": "10.20.30.40", "dest_ip": "20.30.40.50", "domain": "mydomain.com" }
      GEO Enrich (Cache, MySQL, Geo Lite Data) adds:
        "geo":[ {"region":"CA", "postalCode":"95134", "areaCode":"408", "metroCode":"807", "longitude":-121.946, "latitude":37.425, "locId":4522, "city":"San Jose", "country":"US" } ]
      Who Is Enrich (Cache, HBase, Who Is Data) adds:
        "whois":[ { "OrgId":"CISCOS", "Parent":"NET-144-0-0-0-0", "OrgAbuseName":"Cisco Systems Inc", "RegDate":"1991-01-17", "OrgName":"Cisco Systems", "Address":"170 West Tasman Drive", "NetType":"Direct Assignment" } ]
      CIF Enrich (Cache, HBase, CIF Data) adds:
        "cif":"Yes"
      Enriched Message: the raw message plus the "geo", "whois" and "cif" fields
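
The enrichment chain on slide 18 can be sketched as a sequence of functions that each add one field to the message. This is a minimal illustration, not the OpenSOC implementation: the real enrichments run as Storm bolts backed by MySQL and HBase caches, whereas here plain dictionaries stand in for those caches, all function and variable names are hypothetical, and the "No" value for a CIF miss is an assumption (the slide only shows "Yes").

```python
from typing import Callable, Dict, Iterable

Message = Dict[str, object]

# Hypothetical in-memory stand-ins for the MySQL Geo Lite cache and the
# HBase Who Is / CIF caches. Values are abbreviated from the slide example.
GEO_CACHE = {"10.20.30.40": {"city": "San Jose", "region": "CA", "country": "US"}}
WHOIS_CACHE = {"mydomain.com": {"OrgId": "CISCOS", "OrgName": "Cisco Systems"}}
CIF_CACHE = {"mydomain.com"}  # indicators known to the threat intelligence feed


def geo_enrich(msg: Message) -> Message:
    """Attach geolocation data for the source IP when the cache has it."""
    hit = GEO_CACHE.get(msg.get("src_ip"))
    return {**msg, "geo": [hit]} if hit else msg


def whois_enrich(msg: Message) -> Message:
    """Attach registration data for the domain when the cache has it."""
    hit = WHOIS_CACHE.get(msg.get("domain"))
    return {**msg, "whois": [hit]} if hit else msg


def cif_enrich(msg: Message) -> Message:
    """Flag whether the domain appears in the threat intelligence cache."""
    # "No" for a miss is an assumption; the slide only shows the "Yes" case.
    return {**msg, "cif": "Yes" if msg.get("domain") in CIF_CACHE else "No"}


def enrich(
    msg: Message,
    stages: Iterable[Callable[[Message], Message]] = (geo_enrich, whois_enrich, cif_enrich),
) -> Message:
    """Run the message through each enrichment stage in order."""
    for stage in stages:
        msg = stage(msg)
    return msg


raw = {
    "msg_key1": "msg value1",
    "src_ip": "10.20.30.40",
    "dest_ip": "20.30.40.50",
    "domain": "mydomain.com",
}
enriched = enrich(raw)
```

Each stage returns a new dictionary rather than mutating its input, so the raw message stays available for storage alongside the enriched one.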
  • 19. Applications: Telemetry Matching and DPI. Step 1: Search; Step 2: Match; Step 3: Analyze; Step 4: Build PCAP
  • 20. Integration with Analytics Tools: Dashboards, Reports
  • 21. Best Practices and Lessons Learned
  • 22. Journey Towards Highly Scalable Application
  • 23. Kafka Tuning
  • 24. This is where we began
  • 25. Some code optimizations and increased parallelism
  • 26. Kafka Tuning
      • Kafka is disk I/O heavy
      • Kafka 0.8+ supports replication and JBOD
          • Better performance compared to RAID
      • Parallelism is largely driven by the number of disks and the number of partitions per topic
      • Key configuration parameters:
          • num.io.threads: keep it at least equal to the number of disks provided to Kafka
          • num.network.threads: adjust it based on the number of concurrent producers and consumers and on the replication factor
  • 27. After Kafka Tuning
  • 28. Bottleneck Isolation, Resource Profiling, Load Balancing
  • 29. HBase Tuning
  • 30. This is where we began
  • 31. Row Key Design
      • Row key design is critical (gets, scans, or both?)
      • Keys with IP addresses in dotted-decimal form:
          • Standard IP addresses have only two variations of the first character: 1 and 2
          • Minimum key length is 7 characters and maximum is 15, with a typical average of 12
          • Subnet range scans become difficult because keys sort as strings: a range of 90 to 220 excludes 112
      • IP converted to hex (10.20.30.40 => 0a141e28):
          • Gives 16 variations of the first key character
          • Consistently 8-character key
          • Easy to search for subnet ranges
  • 32. Experiments with Row Key
  • 33. Region Splits
      • Know your data
      • Auto split under high workload can result in hotspots and split storms
      • Understand your data and presplit the regions
      • Identify how many regions a region server (RS) can host and still perform optimally, using the formula:
          regions per RS = (RS memory) * (total memstore fraction) / ((memstore size) * (# column families))
  • 34. With Region Pre-Splits
  • 35. Know Your Application
      • Enable micro-batching (client-side buffer)
      • Use smart shuffle/grouping in Storm
      • Understand your data and situationally exploit the various WAL options
      • Watch for many minor compactions
      • For heavy write workloads, increase hbase.hstore.blockingStoreFiles (we used 200)
  • 36. And Finally
  • 37. Kafka Spout
  • 38. Kafka Spout
      • Parallelism is controlled by the number of partitions per topic
      • Set the Kafka spout parallelism equal to the number of partitions in the topic
      • Other key parameters that drive performance:
          • fetchSizeBytes
          • bufferSizeBytes
  • 39. Mysteriously Missing Data
  • 40. Mysteriously Missing Data: Root Cause
      • A bug in the Kafka spout caused it to miss some partitions and lose data
      • It is now fixed, and the fix is available from the Hortonworks repository ( http://repo.hortonworks.com/content/repositories/releases/org/apache/storm/storm-Kafka )
  • 41. Storm
  • 42. Storm
      • Every small thing counts at scale
      • Even simple string operations can slow down throughput when executed on millions of tuples
  • 43. Storm
      • Error handling is critical
      • Poorly handled errors can lead to topology failure and eventually to loss of data (or data duplication)
  • 44. Storm
      • Tune and scale individual spouts and bolts before performance testing or tuning the entire topology
      • Write your own simple data generator spouts and no-op bolts
      • Making as many things configurable as possible helps a lot
  • 45. Lessons Learned
      • When it comes to Hadoop…partner up
      • Separate the hype from the opportunity
      • Start small, then scale up
      • Design iteratively
      • It doesn’t work unless you have proven it at scale
      • Keep an eye on ROI
  • 46. Looking for Community Partners
      • How can you contribute?
      • Technology Partner Program: contribute developers to join the Cisco and Hortonworks team
      • Cisco + Hortonworks + Community Support for OpenSOC
  • 47. Thank you! We are hiring: jsirota@cisco.com sheetal@hortonworks.com
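
The hex row key idea from slide 31 can be sketched in a few lines of Python. This is an illustration under assumptions, not the OpenSOC implementation: the function names `ip_to_hex_key` and `subnet_scan_range` are hypothetical, and the stop key is treated as inclusive for readability.

```python
import ipaddress
from typing import Tuple


def ip_to_hex_key(ip: str) -> str:
    """Convert a dotted-decimal IPv4 address to a fixed-width 8-character hex key."""
    return format(int(ipaddress.IPv4Address(ip)), "08x")


def subnet_scan_range(cidr: str) -> Tuple[str, str]:
    """Return (first_key, last_key) covering every address in the subnet.

    Both bounds are inclusive in this sketch.
    """
    net = ipaddress.IPv4Network(cidr)
    first = format(int(net.network_address), "08x")
    last = format(int(net.broadcast_address), "08x")
    return first, last


# Dotted-decimal octets sort as strings, so "112" lands before "90".
decimal_order = sorted(["90", "112", "220"])

# The same octets as 2-character hex sort in numeric order.
hex_order = sorted(format(n, "02x") for n in (90, 112, 220))

example_key = ip_to_hex_key("10.20.30.40")
example_range = subnet_scan_range("10.20.30.0/24")
```

Because every key has the same width, string order and numeric order agree, so a subnet maps to one contiguous key range.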

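The region count formula from slide 33 is simple arithmetic, and a small helper makes the units explicit. The function name and the sample numbers below are assumptions chosen for illustration; they are not values reported in the talk.

```python
def max_regions_per_rs(rs_heap_mb: float,
                       memstore_fraction: float,
                       memstore_size_mb: float,
                       column_families: int) -> int:
    """(RS memory) * (total memstore fraction) / ((memstore size) * (# column families)).

    Heap and memstore size must use the same unit (MB here).
    The result is rounded down to a whole number of regions.
    """
    return int(rs_heap_mb * memstore_fraction / (memstore_size_mb * column_families))


# Hypothetical sizing: 16 GB heap, 40% of heap for memstores, 128 MB memstore size.
one_cf = max_regions_per_rs(16384, 0.4, 128, 1)
two_cf = max_regions_per_rs(16384, 0.4, 128, 2)
```

Doubling the number of column families halves the region budget, since each family keeps its own memstore per region.
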
Editor's Notes

  1. In Storm, group tuples by HBase region so that each HBase bolt receives data mostly for one or two regions, which minimizes region server (RS) round trips. In DoS attack situations, where the actual packets are very small (20-60 bytes) and individual packets are not critical for analysis, skip the WAL.
  2. Frequent minor compactions reduce the overall throughput of the system. For write-heavy workloads, reduce the frequency of minor compactions by increasing hbase.hstore.blockingStoreFiles (we used 200).
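
The region-based grouping described in note 1 can be sketched as a lookup from row key to region, followed by a mapping from region to bolt task. This is a minimal sketch under assumptions: the names are hypothetical, the split points assume keys spread evenly over the 8-character hex key space, and the wiring into a custom Storm stream grouping is not shown.

```python
import bisect
from typing import List


def presplit_points(num_regions: int) -> List[str]:
    """Evenly spaced split keys over the 8-hex-character key space."""
    step = 2 ** 32 // num_regions
    return [format(i * step, "08x") for i in range(1, num_regions)]


def region_index(row_key: str, split_points: List[str]) -> int:
    """Index of the region whose key range contains row_key.

    A split key is the first key of the region that follows it.
    """
    return bisect.bisect_right(split_points, row_key)


def task_for_key(row_key: str, split_points: List[str], num_tasks: int) -> int:
    """Send every key of a region to the same bolt task."""
    return region_index(row_key, split_points) % num_tasks


splits = presplit_points(4)
task = task_for_key("0a141e28", splits, num_tasks=2)
```

With more regions than tasks, each task serves a fixed subset of regions, so the batched writes from one bolt go to a small number of region servers.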