C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved.
1. Intro
2. Problem to solve?
3. How does Flume/Solr help?
4. Syslog indexing example
5. HA, DR & scalability
Ops Architect at Cisco CCATG (WebEx)
Ensure operational readiness for complex distributed services
HA, DR, monitoring, config, deployment
Previously eBay, Excite@Home, IBM, VISA
Operations architecture, monitoring, event correlation
Cisco WebEx Meetings
• Voice, video, desktop sharing
• Meeting/Event/Support/Training Centers
• Integration with TelePresence
Cisco WebEx Social
• Social networking
• Content creation
• Integrated IM
Cisco WebEx Messenger
• IM, presence
• Integrate with voice, video
• XMPP
Participants from over 231 countries; 52% market share
2.2 billion meeting minutes per month
40.5 million meeting attendees per month
9.4 million registered hosts worldwide
4 million mobile downloads
Map legend: Datacenter / PoP; leased network link
Global scale: 13 datacenters & iPoPs around the globe
Dedicated network: dual-path 10G circuits between DCs
Multi-tenant: 95k sites
Real-time collaboration: voice, desktop sharing, video, chat
People make mistakes
Hardware fails
Software fails
Even failovers sometimes fail
“If a problem has no solution, it may not be a problem,
but a fact, not to be solved, but to be coped with over time”
— Shimon Peres (“Peres’s Law”)
People/HW/SW failures are facts, not problems
Operations' main goal is to maintain high service availability
• Recovery/repair is how we cope with the above facts
• Improving recovery/repair improves availability
Unavailability = MTTR / MTBF
Cutting MTTR to 1/10th is just as valuable as a 10x increase in MTBF
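The ratio on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the baseline figures (4 hours to recover, one failure per 1000 hours) are hypothetical and not taken from the talk, and the function simply evaluates the slide's MTTR / MTBF ratio.

```python
import math

def unavailability(mttr_hours: float, mtbf_hours: float) -> float:
    """Return the slide's ratio MTTR / MTBF (fraction of time spent in repair)."""
    return mttr_hours / mtbf_hours

# Hypothetical baseline: one failure per 1000 hours, 4 hours to recover.
baseline = unavailability(mttr_hours=4.0, mtbf_hours=1000.0)
# Option A: recover ten times faster.
faster_repair = unavailability(mttr_hours=0.4, mtbf_hours=1000.0)
# Option B: fail ten times less often.
fewer_failures = unavailability(mttr_hours=4.0, mtbf_hours=10000.0)

print(f"baseline:    {baseline:.4%}")
print(f"1/10th MTTR: {faster_repair:.4%}")
print(f"10x MTBF:    {fewer_failures:.4%}")
print("equivalent:", math.isclose(faster_repair, fewer_failures))
```

Both options reduce the ratio by the same factor of ten, which is the point of the slide: investing in faster recovery pays off as much as investing in fewer failures.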
Even better: proactive
Good: reactive
Your search – What is the root cause of the outage? – did not match any documents.
Flume
Log4j
File
Avro
Syslog
Other Sinks
Solr
Sink
Application state & APIs
HDFS
Thrift
AMQP RDBMS
Sqoop
HTTP/REST
MySQL
Unstructured/semi-structured data Structured data
Cisco UCS C240 M3 servers
12 x 3TB = 36 TB / server
HDFS
Sink
SolrCloud
Raw data
Solr index
Diagram: two-tier Flume topology
• Collector tier: Flume collectors in each datacenter (DC 1 … DC N) receive syslog, log4j and file input
• Storage tier: Flume servers in DC 1 and DC 2 write to HDFS and SolrCloud
Diagram: Flume Collector server
• Agents send events to the collector's Avro source (failover & load balancing agents)
• Replicating fan-out flow: all events replicated to both channels (File Channel 1, File Channel 2)
• One Avro sink per channel forwards events to the Flume Storage tier in DC1 and DC2
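A replicating fan-out can be pictured as copying each incoming event into every channel, so that each datacenter's sink drains its own queue independently. The following is a minimal in-memory sketch of that idea, not Flume code: the `replicate` function, the `deque` channels and the sample event are all hypothetical stand-ins.

```python
from collections import deque

def replicate(event: dict, channels: list) -> None:
    """Append a shallow copy of the event to every channel (replicating fan-out)."""
    for channel in channels:
        channel.append(dict(event))

# Two stand-in channels, one per destination datacenter.
channel_dc1 = deque()
channel_dc2 = deque()

replicate({"headers": {"host": "web01"}, "body": "disk full"},
          [channel_dc1, channel_dc2])

# Simulate the DC2 sink delivering its copy while the DC1 sink is stalled:
delivered_to_dc2 = channel_dc2.popleft()
print(len(channel_dc1), len(channel_dc2))
```

Because each destination has its own channel, a slow or unreachable datacenter only causes its own queue to grow; the other destination keeps receiving events.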
Diagram: Flume Storage tier server
• Flume Collectors send events to the Avro source (failover & load balancing agents)
• Multiplexing fan-out flow into File Channel 1 and File Channel 2
• HDFS sink: all events go to HDFS
• Solr sink: events are routed to SolrCloud by Flume event header
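Header-based routing of this kind can be sketched as a lookup from one header value to a list of channels, with a default for everything else. This is an illustration only: the header name `index`, the channel names and the `select_channels` helper are hypothetical and do not come from the talk or from Flume itself.

```python
def select_channels(event: dict, header: str, mapping: dict, default: list) -> list:
    """Pick channels by the value of one event header; fall back to the default."""
    value = event["headers"].get(header)
    return mapping.get(value, default)

# Hypothetical routing table: events tagged index=syslog go to both channels,
# everything else goes only to the HDFS channel.
MAPPING = {"syslog": ["hdfs_channel", "solr_channel"]}
DEFAULT = ["hdfs_channel"]

syslog_event = {"headers": {"index": "syslog"}, "body": "%ACE-3-251008: ..."}
other_event = {"headers": {}, "body": "application log line"}

print(select_channels(syslog_event, "index", MAPPING, DEFAULT))
print(select_channels(other_event, "index", MAPPING, DEFAULT))
```

With this shape, every event reaches HDFS, and only events carrying the chosen header value are also indexed.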
Isn’t Big Data “schema on read”?
• Why does Solr require a schema on write?
• Dirty little secret: there’s always a schema
• Performance & functionality vs flexibility
• Optimize operations and storage based on field type - that's how you get sub-second response times
There’s always a schema
• Application code vs. central location
Cloudera Morphlines
• Framework to simplify event transformation
• Compatible with existing grok patterns
• Reusable across multiple index workloads: Flume & M/R
Diagram: morphline inside the SolrSink
• Input: Flume event = headers + body, converted to a Record
• The Record passes through a chain of commands: readLine, grok, tryRules, addValues, …, loadSolr
• Output: document matching schema.xml, loaded into Solr
Convert syslog message ...
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
... into Solr schema fields
Severity=[3]
Facility=[22]
host=[colo01-wxp00-ace01b-connect.webex.com]
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[54asdf654]
id=[b2f839c3-dece-404f-a535-e0141ad549bf]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
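The Severity and Facility values above follow from the `<179>` prefix of the message: a syslog priority value is facility × 8 + severity. The sketch below decodes it; the function name `decode_pri` is a hypothetical helper, not part of Flume or Morphlines.

```python
import re

PRI_PATTERN = re.compile(r"^<(\d{1,3})>")

def decode_pri(line: str) -> tuple:
    """Return (facility, severity) decoded from a leading <PRI> value."""
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not start with a <PRI> value")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    return divmod(int(match.group(1)), 8)

facility, severity = decode_pri(
    "<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : "
    "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234"
)
print(facility, severity)  # prints "22 3"
```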
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 1: readLine reads in Flume event headers and body
Headers
timestamp=[1371357409000]
host=[colo01-wxp00-ace01b-connect.webex.com]
category=[545f5sfsd5sf]
Severity=[3]
Facility=[22]
Body
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 2: convertTimestamp converts epoch to ISO 8601 format
timestamp=[2013-06-16T04:36:49.000Z]
host=[colo01-wxp00-ace01b-connect.webex.com]
category=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
Severity=[3]
Facility=[22]
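The conversion in this step can be reproduced with the standard library. This is a sketch of the transformation only, not the morphline command itself; the function name is a hypothetical helper, and the input is the epoch-milliseconds header value shown in Step 1.

```python
from datetime import datetime, timezone

def epoch_millis_to_iso8601(millis: int) -> str:
    """Format epoch milliseconds as an ISO 8601 UTC timestamp with milliseconds."""
    seconds, remainder_ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{remainder_ms:03d}Z"

iso_timestamp = epoch_millis_to_iso8601(1371357409000)
print(iso_timestamp)  # prints "2013-06-16T04:36:49.000Z"
```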
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 3: addValues creates new field access_token
timestamp=[2013-06-16T04:36:49.000Z]
category=[545f5sfsd5sf]
access_token=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
host=[colo01-wxp00-ace01b-connect.webex.com]
Severity=[3]
Facility=[22]
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 4: tryRules creates field severity_label for severity
timestamp=[2013-06-16T04:36:49.000Z]
severity_label=[error]
access_token=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
host=[colo01-wxp00-ace01b-connect.webex.com]
category=[545f5sfsd5sf]
Severity=[3]
Facility=[22]
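The idea behind this step is to try a list of rules in order and keep the result of the first one that succeeds. The sketch below mimics that behaviour in plain Python and is not the Morphlines implementation: the rule functions and the `unknown` fallback label are hypothetical, while the label names follow the standard syslog severity levels (3 = error).

```python
SEVERITY_LABELS = {
    "0": "emergency", "1": "alert", "2": "critical", "3": "error",
    "4": "warning", "5": "notice", "6": "informational", "7": "debug",
}

def rule_known_severity(record: dict) -> bool:
    """Succeed only if the Severity field maps to a standard label."""
    label = SEVERITY_LABELS.get(record.get("Severity"))
    if label is None:
        return False
    record["severity_label"] = label
    return True

def rule_fallback(record: dict) -> bool:
    """Hypothetical catch-all rule so that every record gets some label."""
    record["severity_label"] = "unknown"
    return True

def try_rules(record: dict, rules: list) -> dict:
    """Apply rules in order to a copy of the record; the first success wins."""
    for rule in rules:
        candidate = dict(record)
        if rule(candidate):
            return candidate
    raise ValueError("no rule matched")

RULES = [rule_known_severity, rule_fallback]
labelled = try_rules({"Severity": "3", "Facility": "22"}, RULES)
print(labelled["severity_label"])  # prints "error"
```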
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 5: tryRules creates new fields
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
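Grok patterns are named regular expressions, so the field extraction in this step can be approximated with a single regex using named groups. The pattern below is an assumption written to fit this one example message; it is not the grok pattern used in the talk, and real Cisco message codes may need a broader expression.

```python
import re

# Assumed shape: %PRODUCT-LEVEL-ID: free text
CISCO_CODE = re.compile(
    r"(?P<syslog_message>"
    r"(?P<cisco_code>%(?P<cisco_product>[A-Z0-9_]+)-(?P<cisco_level>\d)-(?P<cisco_id>\d+))"
    r":.*)$"
)

def extract_cisco_fields(message: str) -> dict:
    """Return the named fields if the message contains a Cisco-style code, else {}."""
    match = CISCO_CODE.search(message)
    return match.groupdict() if match else {}

fields = extract_cisco_fields(
    "<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : "
    "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234"
)
for name, value in fields.items():
    print(f"{name}=[{value}]")
```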
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 6: sanitizeUnknownSolrFields drops non-schema fields
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[545f5sfsd5sf]
host=[colo01-wxp00-ace01b-connect.webex.com]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
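Dropping non-schema fields amounts to filtering the record against the set of field names the schema declares. In the sketch below the set of schema fields is inferred from the fields shown on these slides and is an assumption; the actual command consults the Solr schema rather than a hard-coded set.

```python
# Assumed schema fields, inferred from the slides (not read from a real schema.xml).
SCHEMA_FIELDS = frozenset({
    "id", "timestamp", "host", "syslog_message", "severity_label",
    "access_token", "cisco_product", "cisco_level", "cisco_id", "cisco_code",
})

def sanitize_unknown_fields(record: dict, schema_fields: frozenset) -> dict:
    """Return a copy of the record containing only fields known to the schema."""
    return {name: value for name, value in record.items() if name in schema_fields}

record = {
    "timestamp": "2013-06-16T04:36:49.000Z",
    "severity_label": "error",
    "category": "545f5sfsd5sf",   # not in the schema, will be dropped
    "Severity": "3",              # not in the schema, will be dropped
    "Facility": "22",             # not in the schema, will be dropped
    "host": "colo01-wxp00-ace01b-connect.webex.com",
}
clean = sanitize_unknown_fields(record, SCHEMA_FIELDS)
print(sorted(clean))
```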
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 7: generateUUID creates a unique id for the document
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[545f5sfsd5sf]
id=[b2f839c3-dece-404f-a535-e0141ad549bf]
host=[colo01-wxp00-ace01b-connect.webex.com]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
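The id on this slide has the shape of a random (version 4) UUID. A minimal sketch of assigning such an id to a record follows; the helper name `with_generated_id` is hypothetical, and the morphline command may generate its ids differently.

```python
import uuid

def with_generated_id(record: dict, field: str = "id") -> dict:
    """Return a copy of the record with a random UUID stored in the given field."""
    document = dict(record)
    document[field] = str(uuid.uuid4())
    return document

first = with_generated_id({"host": "colo01-wxp00-ace01b-connect.webex.com"})
second = with_generated_id({"host": "colo01-wxp00-ace01b-connect.webex.com"})
print(first["id"] != second["id"])
```

Log lines carry no natural primary key, and two identical messages can legitimately occur, so a generated id lets each one be stored as its own document.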
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 8: loadSolr loads a record into a Solr server
Diagram: morphline inside the SolrSink
• Input: Flume syslog event = headers + body, converted to a Record
• The Record passes through a chain of commands: readLine, grok, tryRules, addValues, …, loadSolr
• Output: document matching schema.xml, loaded into SolrCloud
SolrCloud cluster
• Collections, shards & replicas
• Pluggable file system (local, HDFS)
• Central config & coordination with ZK
• Full HA, automatic fail-over
• NRT indexing
• Automatic routing
Diagram: one Collection with Shard1, Shard2 and Shard3, each with a leader and a replica (leader1/replica1 … leader3/replica3), coordinated by ZooKeeper (zk1, zk2, zk3)
Callouts: “Add doc to syslog index”, “Where can I index data?”, “leader3”
Collection “syslog” with three shards
Special case of search
• Logs are time series data: timestamp + data
• High indexing rate, no updates
• New data is more frequently searched than old
Collection aliases
• Time partitioned collections – e.g. one collection per day
• Reduces the workload to near-real-time data only
• One-to-many collection mapping: queries go to a logical representation mapped to multiple, same-schema collections
• Simplifies hot-warm-cold migration of data
Index expiration
• Old data is aged out by collection aliases
• Remap the alias to only the latest collections
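One way to picture alias-based expiration: keep one collection per day and periodically point a single alias at only the most recent N of them. The sketch below computes the collection names and the query string for Solr's Collections API `CREATEALIAS` action. The naming convention `syslog_YYYYMMDD`, the alias name `syslog_recent` and the three-day window are hypothetical, and the request is only built, not sent.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

def daily_collections(prefix: str, today: date, days: int) -> list:
    """Names of the per-day collections for the last `days` days, newest first."""
    names = []
    for offset in range(days):
        day = today - timedelta(days=offset)
        names.append(f"{prefix}_{day.strftime('%Y%m%d')}")
    return names

def create_alias_query(alias: str, collections: list) -> str:
    """Query string for a CREATEALIAS request covering the given collections."""
    return urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    })

recent = daily_collections("syslog", date(2013, 6, 16), days=3)
query = create_alias_query("syslog_recent", recent)
print(recent)
print(query)
```

Running this daily and re-issuing the alias request would move the window forward, so searches against the alias stop touching older collections without any per-document deletes.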
Solr
• No multi-datacenter cluster support
HDFS
• No multi-datacenter cluster support
Options?
• All our services must survive DC outage
• … so should logging and indexing
Diagram: collector tier in DC 1, DC 2 … DC N (Flume, fed by syslog, log4j, file) sending to the storage tier (Flume, HDFS, SolrCloud) in DC 1 and DC 2
Planned or unplanned outage
• Flume Collector disk channel buffering DC1 events
• DC1 Hadoop cluster back online after outage
• Replicate aggregate data
Diagram: collector tier in DC 1, DC 2 … DC N (Flume, fed by syslog, log4j, file); storage tier (Flume, HDFS, SolrCloud) in DC 1 and DC 2, linked by distcp
• Manual CNAME change to DC2
• DC1 back online, sync data from DC2
• Data sent only to a single DC
• DNS CNAME change back to DC1
• Flip distcp the other way
• Flume buffering events at collector tier
• Create indexes with M/R
Tiers to scale
• Flume Collector tier
• Flume Storage tier
• SolrCloud
100 – 5000 servers per datacenter
Diagram: Flume Collector scaling (more agents and data)
• Each collector: agents → Avro source → replicating fan-out flow → File Channel 1 and File Channel 2 → Avro sinks to DC1 and DC2
• FileChannel: 14MB/sec
• NIC: 100MB/sec
Max per server:
14MB/s
1.2 TB/day
70k events/s
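The three per-server figures are consistent with each other, as a quick calculation shows. The sketch assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes); the average event size is derived here and is not stated on the slide.

```python
MB = 10**6
TB = 10**12
SECONDS_PER_DAY = 86_400

bytes_per_second = 14 * MB
events_per_second = 70_000

# Daily volume at the sustained per-server rate.
terabytes_per_day = bytes_per_second * SECONDS_PER_DAY / TB
# Implied average event size.
average_event_bytes = bytes_per_second / events_per_second

print(f"{terabytes_per_day:.2f} TB/day")
print(f"{average_event_bytes:.0f} bytes/event")
```

At 14 MB/s the file channel, rather than the 100 MB/s NIC, is the limiting factor per server, which is why adding collector servers is the way to scale this tier.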
Diagram: Flume Storage tier scaling
• DC 1, DC 2 … DC N collectors, each with Avro sinks 1 … N, send to the storage tier in DC 1 and DC 2
• Each storage tier server (Flume 1, …): Avro source → multiplexing fan-out flow → File Chan1 and File Chan2 → HDFS sink and Solr sink
Max per server:
14MB/s
1.2 TB/day
70k events/s
Diagram: SolrCloud cluster scaling
• Shard1, Shard2, Shard3, each with a leader and a replica; ZooKeeper (zk1, zk2, zk3); pluggable filesystem (local, HDFS)
• Inputs: new logs to index, search queries
• 1000 tx/sec/core
• 2x8 cores: 16k tx/sec
• 3 shards: 3 x 16k = 48k tx/sec
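The capacity estimate on this slide is a simple product, sketched below. It assumes throughput scales linearly with cores and shards and that each shard runs on its own 2 x 8-core server, which is an idealisation rather than a measured result.

```python
def cluster_tx_per_second(tx_per_core: int, sockets: int,
                          cores_per_socket: int, shards: int) -> int:
    """Idealised indexing capacity, assuming linear scaling across cores and shards."""
    per_server = tx_per_core * sockets * cores_per_socket
    return per_server * shards

per_server = cluster_tx_per_second(1000, sockets=2, cores_per_socket=8, shards=1)
three_shards = cluster_tx_per_second(1000, sockets=2, cores_per_socket=8, shards=3)
print(per_server, three_shards)  # prints "16000 48000"
```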
Central syslog servers
• Network and OS system messages forwarded to several central syslog servers
Forward syslog to Solr using Flume Morphline SolrSink
• Parse messages with Morphline and grok patterns
SolrCloud
• Index log lines as documents into a Collection (i.e. index)
HUE Solr search
• Simple UI to build a customized search page layout with faceting, sorting.
• Easy drill down with multiple facets: severity, datacenter, hostname, etc
Screen shots
Search by time
Sort by selected field
Facets by selected fields
Wildcard query by field
Highlight the query keywords
Data sources: REST/JSON, log4j, syslog, Avro, Thrift
Parsing: Cloudera Morphlines
NRT Indexing: SolrCloud embedded in CDH
Batch indexing: MapReduce
Analytics: Use your favorite tool, raw detailed data stored in HDFS
email: ari.flink@webex.com
twitter: @raaka
Thank you.

More Related Content

What's hot

Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...DataWorks Summit
 
Stephan Ewen - Running Flink Everywhere
Stephan Ewen - Running Flink EverywhereStephan Ewen - Running Flink Everywhere
Stephan Ewen - Running Flink EverywhereFlink Forward
 
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINEKafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINEkawamuray
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresSteve Loughran
 
Data Architectures for Robust Decision Making
Data Architectures for Robust Decision MakingData Architectures for Robust Decision Making
Data Architectures for Robust Decision MakingGwen (Chen) Shapira
 
Flume and Hadoop performance insights
Flume and Hadoop performance insightsFlume and Hadoop performance insights
Flume and Hadoop performance insightsOmid Vahdaty
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planningconfluent
 
Streaming and Messaging
Streaming and MessagingStreaming and Messaging
Streaming and MessagingXin Wang
 
Intro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupIntro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupGwen (Chen) Shapira
 
Multitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEMultitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEkawamuray
 
Apache Kafka
Apache KafkaApache Kafka
Apache KafkaJoe Stein
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)Chris Nauroth
 
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on KubernetesHBaseCon
 
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014Steve Hoffman
 
LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...kawamuray
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaJoe Stein
 

What's hot (20)

Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
 
Stephan Ewen - Running Flink Everywhere
Stephan Ewen - Running Flink EverywhereStephan Ewen - Running Flink Everywhere
Stephan Ewen - Running Flink Everywhere
 
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINEKafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
Kafka Multi-Tenancy - 160 Billion Daily Messages on One Shared Cluster at LINE
 
Have your cake and eat it too
Have your cake and eat it tooHave your cake and eat it too
Have your cake and eat it too
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object Stores
 
Apache Flume (NG)
Apache Flume (NG)Apache Flume (NG)
Apache Flume (NG)
 
Data Architectures for Robust Decision Making
Data Architectures for Robust Decision MakingData Architectures for Robust Decision Making
Data Architectures for Robust Decision Making
 
Flume and Hadoop performance insights
Flume and Hadoop performance insightsFlume and Hadoop performance insights
Flume and Hadoop performance insights
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
Streaming and Messaging
Streaming and MessagingStreaming and Messaging
Streaming and Messaging
 
Intro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data MeetupIntro to Spark - for Denver Big Data Meetup
Intro to Spark - for Denver Big Data Meetup
 
Multitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINEMultitenancy: Kafka clusters for everyone at LINE
Multitenancy: Kafka clusters for everyone at LINE
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
 
Highlights Of Sqoop2
Highlights Of Sqoop2Highlights Of Sqoop2
Highlights Of Sqoop2
 
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
 
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
Chicago Hadoop User Group (CHUG) Presentation on Apache Flume - April 9, 2014
 
LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...LINE's messaging service architecture underlying more than 200 million monthl...
LINE's messaging service architecture underlying more than 200 million monthl...
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache Kafka
 

Similar to Cisco WebEx Syslog Indexing with Flume and Solr

OpenStack + Cloud Foundry for the OpenStack Boston Meetup
OpenStack + Cloud Foundry for the OpenStack Boston MeetupOpenStack + Cloud Foundry for the OpenStack Boston Meetup
OpenStack + Cloud Foundry for the OpenStack Boston Meetupragss
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigrainePeak Hosting
 
CtrlS: Cloud Solutions for Retail & eCommerce
CtrlS: Cloud Solutions for Retail & eCommerceCtrlS: Cloud Solutions for Retail & eCommerce
CtrlS: Cloud Solutions for Retail & eCommerceeTailing India
 
Presentation cloupia product overview and demo
Presentation   cloupia product overview and demoPresentation   cloupia product overview and demo
Presentation cloupia product overview and demoxKinAnx
 
Should healthcare abandon the cloud final
Should healthcare abandon the cloud finalShould healthcare abandon the cloud final
Should healthcare abandon the cloud finalsapenov
 
Service-Level Objective for Serverless Applications
Service-Level Objective for Serverless ApplicationsService-Level Objective for Serverless Applications
Service-Level Objective for Serverless Applicationsalekn
 
GVP8- Troubleshooting.pptx
GVP8- Troubleshooting.pptxGVP8- Troubleshooting.pptx
GVP8- Troubleshooting.pptxMiyuruChamath
 
The Real World - Plugging the Enterprise Into It (nodejs)
The Real World - Plugging  the Enterprise Into It (nodejs)The Real World - Plugging  the Enterprise Into It (nodejs)
The Real World - Plugging the Enterprise Into It (nodejs)Aman Kohli
 
MT49 Dell EMC XtremIO: Product Overview and New Use Cases
MT49 Dell EMC XtremIO: Product Overview and New Use CasesMT49 Dell EMC XtremIO: Product Overview and New Use Cases
MT49 Dell EMC XtremIO: Product Overview and New Use CasesDell EMC World
 
BIND 9 logging best practices
BIND 9 logging best practicesBIND 9 logging best practices
BIND 9 logging best practicesMen and Mice
 
Telecoms Service Assurance & Service Fulfillment with Neo4j Graph Database
Telecoms Service Assurance & Service Fulfillment with Neo4j Graph DatabaseTelecoms Service Assurance & Service Fulfillment with Neo4j Graph Database
Telecoms Service Assurance & Service Fulfillment with Neo4j Graph DatabaseNeo4j
 
Supporting Enterprise System Rollouts with Splunk
Supporting Enterprise System Rollouts with SplunkSupporting Enterprise System Rollouts with Splunk
Supporting Enterprise System Rollouts with SplunkErin Sweeney
 
MongoDB World 2018: Managing a Mission Critical eCommerce Application on Mong...
MongoDB World 2018: Managing a Mission Critical eCommerce Application on Mong...MongoDB World 2018: Managing a Mission Critical eCommerce Application on Mong...
MongoDB World 2018: Managing a Mission Critical eCommerce Application on Mong...MongoDB
 
Jim Stertz: Automation and Robotic Arm: Maximizing Throughput and Capacity
Jim Stertz: Automation and Robotic Arm: Maximizing Throughput and CapacityJim Stertz: Automation and Robotic Arm: Maximizing Throughput and Capacity
Jim Stertz: Automation and Robotic Arm: Maximizing Throughput and Capacity360mnbsu
 
OpenStack + CloudFoundry Austin Meetup
OpenStack + CloudFoundry Austin MeetupOpenStack + CloudFoundry Austin Meetup
OpenStack + CloudFoundry Austin Meetupragss
 
OS + CF Austin meetup
OS + CF Austin meetupOS + CF Austin meetup
OS + CF Austin meetupragss
 
Impact2014 session #1317 you have got a friend on z - tales from cics tran...
Impact2014  session #1317   you have got a friend on z - tales from cics tran...Impact2014  session #1317   you have got a friend on z - tales from cics tran...
Impact2014 session #1317 you have got a friend on z - tales from cics tran...Elena Nanos
 
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready EnterpriseRe-Architect Your Legacy Environment To Enable An Agile, Future-Ready Enterprise
Re-Architect Your Legacy Environment To Enable An Agile, Future-Ready EnterpriseDell World
 
SmartDB Office Hours: Connection Pool Sizing Concepts
SmartDB Office Hours: Connection Pool Sizing ConceptsSmartDB Office Hours: Connection Pool Sizing Concepts
SmartDB Office Hours: Connection Pool Sizing ConceptsKoppelaars
 

Similar to Cisco WebEx Syslog Indexing with Flume and Solr (20)

OpenStack + Cloud Foundry for the OpenStack Boston Meetup
OpenStack + Cloud Foundry for the OpenStack Boston MeetupOpenStack + Cloud Foundry for the OpenStack Boston Meetup
OpenStack + Cloud Foundry for the OpenStack Boston Meetup
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration Migraine
 
CtrlS: Cloud Solutions for Retail & eCommerce
CtrlS: Cloud Solutions for Retail & eCommerceCtrlS: Cloud Solutions for Retail & eCommerce
CtrlS: Cloud Solutions for Retail & eCommerce
 
Presentation cloupia product overview and demo
Presentation   cloupia product overview and demoPresentation   cloupia product overview and demo
Presentation cloupia product overview and demo
 
Should healthcare abandon the cloud final
Should healthcare abandon the cloud finalShould healthcare abandon the cloud final
Should healthcare abandon the cloud final
 
Building the Case for System z Linux
Building the Case for System z LinuxBuilding the Case for System z Linux
Building the Case for System z Linux
 
Service-Level Objective for Serverless Applications
Service-Level Objective for Serverless ApplicationsService-Level Objective for Serverless Applications
Service-Level Objective for Serverless Applications
 
GVP8- Troubleshooting.pptx
GVP8- Troubleshooting.pptxGVP8- Troubleshooting.pptx
GVP8- Troubleshooting.pptx
 
The Real World - Plugging the Enterprise Into It (nodejs)
The Real World - Plugging  the Enterprise Into It (nodejs)The Real World - Plugging  the Enterprise Into It (nodejs)
The Real World - Plugging the Enterprise Into It (nodejs)
 
Cisco WebEx Syslog Indexing with Flume and Solr

  • 1. C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 1
  • 2.
      1. Intro
      2. Problem to solve?
      3. How does Flume/Solr help?
      4. Syslog indexing example
      5. HA, DR & scalability
  • 3. Ops Architect at Cisco CCATG (WebEx)
      Ensure operational readiness for complex distributed services
      HA, DR, monitoring, config, deployment
      Previously eBay, Excite@Home, IBM, VISA
      Operations architecture, monitoring, event correlation
  • 5. Cisco WebEx Meetings
      • Voice, video, desktop sharing
      • Meeting/Event/Support/Training Centers
      • Integration with TelePresence
      Cisco WebEx Social
      • Social networking
      • Content creation
      • Integrated IM
      Cisco WebEx Messenger
      • IM, presence
      • Integrate with voice, video
      • XMPP
  • 6. Participants from over 231 countries, 52% market share
      2.2 billion meeting minutes per month
      40.5 million meeting attendees per month
      9.4 million registered hosts worldwide
      4 million mobile downloads
  • 7. [Figure legend: Datacenter / PoP; Leased network link]
      Global scale: 13 datacenters & iPoPs around the globe
      Dedicated network: dual path 10G circuits between DCs
      Multi-tenant: 95k sites
      Real-time collaboration: voice, desktop sharing, video, chat
  • 8. [Figure legend: Datacenter / PoP; Leased network link]
      People make mistakes
      Hardware fails
      Software fails
      Even failovers sometimes fail
  • 9. “If a problem has no solution, it may not be a problem, but a fact, not to be solved, but to be coped with over time”
      Shimon Peres (“Peres’s Law”)
      People/HW/SW failures are facts, not problems
      The main goal of operations is to maintain high service availability
      • Recovery/repair is how we cope with the above facts
      • Improving recovery/repair improves availability
      Unavailability = MTTR / MTBF
      Cutting MTTR to 1/10th is just as valuable as a 10x MTBF
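The claim on slide 9 can be checked with a few lines of arithmetic. This is a minimal sketch that takes the slide's formula at face value; the hour values are made-up illustrations, not figures from the talk.

```python
import math


def unavailability(mttr_hours: float, mtbf_hours: float) -> float:
    """Slide 9 formula: unavailability = MTTR / MTBF."""
    return mttr_hours / mtbf_hours


# Hypothetical numbers: 4 h to repair, one failure every 1000 h.
baseline = unavailability(mttr_hours=4.0, mtbf_hours=1000.0)

# Option A: repair ten times faster.
faster_repair = unavailability(mttr_hours=0.4, mtbf_hours=1000.0)

# Option B: fail ten times less often.
fewer_failures = unavailability(mttr_hours=4.0, mtbf_hours=10000.0)

# Both options reduce unavailability by the same factor of ten.
same_benefit = math.isclose(faster_repair, fewer_failures)
improvement = baseline / faster_repair

print(f"baseline={baseline:.4%} A={faster_repair:.4%} B={fewer_failures:.4%}")
```

Under this formula the two options are interchangeable, which is the argument for investing in faster diagnosis (searchable logs) rather than only in failure prevention.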
  • 10. Even better: proactive
      Good: reactive
      Your search – What is the root cause of the outage? – did not match any documents.
  • 12. [Architecture diagram; recoverable labels]
      Unstructured/semi-structured data: Log4j, File, Avro, Syslog, Thrift, AMQP, HTTP/REST, application state & APIs, into Flume
      Structured data: RDBMS, MySQL, via Sqoop
      Flume sinks: HDFS Sink (raw data in HDFS), Solr Sink (Solr index in SolrCloud), other sinks
      Cisco UCS C240 M3 servers: 12 x 3TB = 36 TB / server
  • 13. [Topology diagram]
      Collector tier: Flume collectors in DC 1 … DC N, fed by syslog, log4j and file sources
      Storage tier: Flume nodes in DC 1 and DC 2, each with HDFS and SolrCloud
  • 14. [Diagram: Flume Collector server]
      Agents (failover & load balancing) send to an Avro source
      Replicating fan-out flow: all events replicated to both channels, File Channel 1 and File Channel 2
      DC1 Avro sink and DC2 Avro sink forward to the Flume Storage tier in DC1 and DC2
  • 15. [Topology diagram]
      Collector tier: Flume collectors in DC 1 … DC N, fed by syslog, log4j and file sources
      Storage tier: Flume nodes in DC 1 and DC 2, each with HDFS and SolrCloud
  • 16. [Diagram: Flume Storage tier server]
      Flume Collectors (failover & load balancing agents) send to an Avro source
      Multiplexing fan-out flow into File Channel 1 and File Channel 2
      Solr sink writes to SolrCloud; HDFS sink writes to HDFS
      Routing to Solr by Flume event header
      All events to HDFS
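The routing rule on slide 16 (every event goes to HDFS, only selected events go to Solr) can be modelled in plain Python. This is an illustrative sketch, not Flume configuration: the header name `index`, the value `syslog` and the channel names are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical names; the slides do not say which header or channels are used.
ROUTING_HEADER = "index"
SOLR_HEADER_VALUES = {"syslog"}


def route(event: Dict) -> List[str]:
    """Return the channels an event is copied to.

    Every event goes to the HDFS channel; an event also goes to the
    Solr channel when its routing header has a value selected for indexing.
    """
    channels = ["hdfs_channel"]
    if event["headers"].get(ROUTING_HEADER) in SOLR_HEADER_VALUES:
        channels.append("solr_channel")
    return channels


indexed = route({"headers": {"index": "syslog"}, "body": b"<179>..."})
archived_only = route({"headers": {"index": "debug"}, "body": b"..."})
no_header = route({"headers": {}, "body": b"..."})
print(indexed, archived_only, no_header)
```

Keeping HDFS as the unconditional destination means the raw data is always available for later reprocessing, even for events that were never indexed.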
  • 17. Isn’t Big Data “schema on read”?
      • Why does Solr require a schema on write?
      • Dirty little secret: there’s always a schema
      • Performance & functionality vs flexibility
      • Optimize operations and storage based on field type - that's how you get sub-second response times
      There’s always a schema
      • Application code vs. central location
  • 18. Cloudera Morphlines
      • Framework to simplify event transformation
      • Compatible with existing grok patterns
      • Reusable across multiple index workloads: Flume & M/R
      [Diagram: inside the SolrSink, a Flume event (headers + body) becomes a Record that is passed through a chain of commands (readLine, grok, tryRules, addValues, …, loadSolr) and reaches Solr as a document matching schema.xml]
  • 19. Convert syslog message ..
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      .. into Solr schema fields
      Severity=[3]
      Facility=[22]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      timestamp=[2013-06-16T04:36:49.000Z]
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      severity_label=[error]
      access_token=[54asdf654]
      id=[b2f839c3-dece-404f-a535-e0141ad549bf]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
  • 20. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 1: readLine reads in Flume event headers and body
      Headers:
      timestamp=[1371357409000]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      category=[545f5sfsd5sf]
      Severity=[3]
      Facility=[22]
      Body:
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
  • 21. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 2: convertTimestamp converts epoch to ISO 8601 format
      timestamp=[2013-06-16T04:36:49.000Z]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      access_token=[545f5sfsd5sf]
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      Severity=[3]
      Facility=[22]
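The conversion in step 2 can be reproduced outside Morphlines. The sketch below is a plain-Python equivalent of the transformation shown on the slide, not the `convertTimestamp` implementation itself; the function name is made up, and it assumes a non-negative epoch value in milliseconds interpreted as UTC.

```python
from datetime import datetime, timezone


def epoch_millis_to_iso8601(millis: int) -> str:
    """Format epoch milliseconds as an ISO 8601 UTC timestamp with millisecond precision."""
    # Integer division avoids float rounding on the seconds part.
    dt = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis % 1000:03d}Z"


# Flume 'timestamp' header from step 1 of the walkthrough.
iso = epoch_millis_to_iso8601(1371357409000)
print(iso)  # 2013-06-16T04:36:49.000Z
```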
  • 22. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 3: addValues creates new field access_token
      timestamp=[2013-06-16T04:36:49.000Z]
      category=[545f5sfsd5sf]
      access_token=[545f5sfsd5sf]
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      Severity=[3]
      Facility=[22]
  • 23. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 4: tryRules creates field severity_label for severity
      timestamp=[2013-06-16T04:36:49.000Z]
      severity_label=[error]
      access_token=[545f5sfsd5sf]
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      category=[545f5sfsd5sf]
      Severity=[3]
      Facility=[22]
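The Severity and Facility values in the walkthrough follow from the `<179>` priority prefix of the message, and the label follows from the severity. A minimal sketch in plain Python, assuming the standard syslog encoding (priority = facility × 8 + severity); the label list uses the standard syslog severity names in lower case, of which the slides confirm only `3 → error`.

```python
import re
from typing import Tuple

# Standard syslog severity names, index = numeric severity.
SEVERITY_LABELS = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def decode_pri(line: str) -> Tuple[int, int]:
    """Return (facility, severity) from the leading <PRI> of a syslog line."""
    match = re.match(r"<(\d{1,3})>", line)
    if match is None:
        raise ValueError("line does not start with a syslog priority")
    return divmod(int(match.group(1)), 8)


facility, severity = decode_pri(
    "<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : "
    "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234"
)
severity_label = SEVERITY_LABELS[severity]
print(facility, severity, severity_label)  # 22 3 error
```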
  • 24. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 5: tryRules creates new fields
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
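The fields created in step 5 can be extracted with a single regular expression. This is an illustrative Python equivalent, not the grok pattern used in the talk (which the slides do not show); it assumes the `%PRODUCT-LEVEL-ID` form with a numeric ID, as in the example message.

```python
import re
from typing import Dict, Optional

# Named groups mirror the Solr field names on the slide.
CISCO_PATTERN = re.compile(
    r"(?P<syslog_message>"
    r"(?P<cisco_code>%(?P<cisco_product>[A-Z0-9_]+)-(?P<cisco_level>\d)-(?P<cisco_id>\d+))"
    r":.*)$"
)


def extract_cisco_fields(message: str) -> Optional[Dict[str, str]]:
    """Return the Cisco fields found in a syslog message, or None if absent."""
    match = CISCO_PATTERN.search(message)
    return match.groupdict() if match else None


fields = extract_cisco_fields(
    "<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com "
    "Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed "
    "for server 10.240.22.111 on port 1234"
)
print(fields)
```

Returning `None` for non-matching lines mirrors the idea behind `tryRules`: a rule that does not apply leaves the record for the next rule instead of failing the pipeline.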
  • 25. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 6: sanitizeUnknownSolrFields drops non-schema fields
      timestamp=[2013-06-16T04:36:49.000Z]
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      severity_label=[error]
      access_token=[545f5sfsd5sf]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
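The effect of step 6 is a filter on field names. A minimal sketch in plain Python, assuming the schema consists of the fields that survive on slides 25 and 26; the real field list lives in the collection's schema.xml, which the slides do not show.

```python
from typing import Dict, FrozenSet

# Assumed schema: the fields that remain after step 6, plus the id added in step 7.
SCHEMA_FIELDS = frozenset({
    "id", "timestamp", "host", "syslog_message", "severity_label",
    "access_token", "cisco_product", "cisco_level", "cisco_id", "cisco_code",
})


def drop_unknown_fields(record: Dict[str, str],
                        schema_fields: FrozenSet[str] = SCHEMA_FIELDS) -> Dict[str, str]:
    """Keep only the fields the index schema knows about."""
    return {name: value for name, value in record.items() if name in schema_fields}


record = {
    "timestamp": "2013-06-16T04:36:49.000Z",
    "host": "colo01-wxp00-ace01b-connect.webex.com",
    "severity_label": "error",
    "category": "545f5sfsd5sf",   # not in the schema, dropped
    "Severity": "3",              # not in the schema, dropped
    "Facility": "22",             # not in the schema, dropped
    "message": "<179>Jun 16 ...", # raw line, dropped
}
clean = drop_unknown_fields(record)
print(sorted(clean))
```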
  • 26. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 7: generateUUID creates a unique id for the document
      timestamp=[2013-06-16T04:36:49.000Z]
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      severity_label=[error]
      access_token=[545f5sfsd5sf]
      id=[b2f839c3-dece-404f-a535-e0141ad549bf]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
  • 27. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 8: loadSolr loads a record into a Solr server
  • 28. [Diagram: inside the SolrSink, a Flume syslog event (headers + body) becomes a Record that is passed through a chain of commands (readLine, grok, tryRules, addValues, …, loadSolr) and reaches SolrCloud as a document matching schema.xml]
  • 29. [Diagram: SolrCloud cluster with one collection split into Shard1 (leader1, replica1), Shard2 (leader2, replica2) and Shard3 (leader3, replica3) on a pluggable filesystem (local, HDFS), coordinated by ZooKeeper nodes zk1, zk2, zk3. Request flow: "Add doc to syslog index", "Where can I index data?", answer "leader3"]
      • Collections, shards & replicas
      • Pluggable file system
      • Central config & coordination with ZK
      • Full HA, automatic fail-over
      • NRT indexing
      • Automatic routing
  • 30. Collection “syslog” with three shards
  • 31. Special case of search
      • Logs are time series data: timestamp + data
      • High indexing rate, no updates
      • New data is more frequently searched than old
      Collection aliases
      • Time partitioned collections – e.g. one collection per day
      • Reduces the workload to near-real-time data only
      • One-to-many collection mapping: queries go to a logical representation mapped to multiple, same-schema collections
      • Simplifies hot-warm-cold migration of data
      Index expiration
      • Old data is aged out by Collection Aliases
      • Remap only the latest collection to an alias
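The alias scheme on slide 31 amounts to recomputing, once per day, which daily collections a search alias should cover. A minimal sketch, assuming one collection per day named `syslog_YYYYMMDD`; the naming scheme, the window lengths and the function name are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def daily_collections(today: date, days: int, prefix: str = "syslog_") -> List[str]:
    """Names of the per-day collections for the last `days` days, newest first."""
    return [
        prefix + (today - timedelta(days=offset)).strftime("%Y%m%d")
        for offset in range(days)
    ]


today = date(2013, 6, 16)

# Alias for near-real-time searches: only the newest collection.
hot_alias = daily_collections(today, days=1)

# Alias for general searches: the retained window. Older collections fall
# out of the list and are thereby aged out of search results.
search_alias = daily_collections(today, days=3)

# Comma-separated form, as an alias definition would list its members.
print(",".join(search_alias))  # syslog_20130616,syslog_20130615,syslog_20130614
```

In SolrCloud the resulting list would typically be passed to the Collections API `CREATEALIAS` action as its `collections` parameter; re-issuing the call with the new list remaps the alias without touching the queries that use it.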
  • 32. Solr
      • No multi-datacenter cluster support
      HDFS
      • No multi-datacenter cluster support
      Options?
      • All our services must survive DC outage
      • .. so should logging and indexing
  • 33. [Topology diagram: collector tier with Flume collectors in DC 1, DC 2 … DC N (syslog, log4j, file sources); storage tier with Flume, HDFS and SolrCloud in DC 1 and DC 2]
      Diagram annotations:
      • Planned or unplanned outage
      • Flume Collector disk channel buffering DC1 events
      • DC1 Hadoop cluster back online after outage
      • Replicate aggregate data
  • 34. Diagram: the same collector tier (Flume agents in DC 1, DC 2, … DC N receiving syslog, log4j and file sources) with storage in DC 1 and DC 2 (HDFS, SolrCloud) linked by distcp. Annotations: Data sent only to a single DC; Manual CNAME change to DC2; Flume buffering events at collector tier; DC1 back online, sync data from DC2; DNS CNAME change back to DC1; Flip distcp the other way; Create indexes with M/R.
  • 35. Tiers to scale: • Flume Collector tier • Flume Storage tier • SolrCloud
  • 36. Diagram: scaling the Flume Collector tier. 100 – 5000 servers per datacenter, each running an agent. Each Flume Collector: Avro src, replicating fan-out flow into File Channel 1 and File Channel 2, DC1 Avro sink and DC2 Avro sink. More agents and data are handled by adding collectors. Limits: NIC: 100MB/sec; FileChannel: 14MB/sec. Max per server: 14MB/s, 1.2 TB/day, 70k events/s
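The per-server figures on slide 36 are consistent with one another, which a short calculation makes visible. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^6 MB). The average event size is not stated on the slide; it is derived here from the two quoted rates. The helper names and the 200k events/s load are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(mb_per_sec):
    """Daily volume in TB for a sustained rate in MB/s (decimal units)."""
    return mb_per_sec * SECONDS_PER_DAY / 1_000_000


def implied_event_bytes(mb_per_sec, events_per_sec):
    """Average event size implied by a byte rate and an event rate."""
    return mb_per_sec * 1_000_000 / events_per_sec


def collectors_needed(total_events_per_sec, per_collector_events_per_sec):
    """Smallest number of collectors that covers the total event rate."""
    return -(-total_events_per_sec // per_collector_events_per_sec)  # ceiling


daily_tb = tb_per_day(14)                      # 14 MB/s sustained for a day
event_size = implied_event_bytes(14, 70_000)   # bytes per event
collectors = collectors_needed(200_000, 70_000)
```

Here `daily_tb` comes to about 1.2 TB, matching the slide, and `event_size` works out to 200 bytes per event. Since the FileChannel limit (14MB/sec) is well below the NIC limit (100MB/sec), the channel rather than the network is the bottleneck used for sizing.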
  • 37. Diagram: scaling the Flume Storage tier. Collectors in DC 1, DC 2, … DC N each run Avro sink 1, Avro sink 2, … Avro sink N, which send to the DC 1 and DC 2 storage tiers. Each storage-tier Flume instance (four shown): Avro src, multiplexing fan-out flow into File Chan1 and File Chan2, HDFS sink and Solr sink. Max per server: 14MB/s, 1.2 TB/day, 70k events/s
  • 38. Diagram: scaling SolrCloud. A SolrCloud cluster with three shards (Shard1: leader1, replica1; Shard2: leader2, replica2; Shard3: leader3, replica3) on a pluggable filesystem (local, HDFS), coordinated by ZooKeeper (zk1, zk2, zk3), receives new logs to index and search queries. Capacity: 1000 tx/sec/core; 2x8 cores = 16k tx/sec; 3 shards: 3 x 16k = 48k tx/sec
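The capacity estimate on slide 38 is a straight multiplication, which makes it easy to turn around and ask how many shards a target rate would need. The sketch below reuses the slide's figures. Reading "2x8 cores" as 16 cores per shard, and the 70k tx/sec target, are assumptions for illustration, and the estimate ignores the search load that shares the same cluster.

```python
TX_PER_SEC_PER_CORE = 1000   # from the slide
CORES_PER_SHARD = 2 * 8      # "2x8 cores", read here as 16 cores per shard


def shard_capacity(tx_per_core=TX_PER_SEC_PER_CORE, cores=CORES_PER_SHARD):
    """Indexing capacity of one shard in tx/sec."""
    return tx_per_core * cores


def cluster_capacity(num_shards):
    """Capacity of the whole collection, assuming it scales linearly."""
    return num_shards * shard_capacity()


def shards_needed(target_tx_per_sec):
    """Smallest shard count whose combined capacity meets the target."""
    return -(-target_tx_per_sec // shard_capacity())  # ceiling division


per_shard = shard_capacity()        # 16k tx/sec
three_shards = cluster_capacity(3)  # 48k tx/sec
for_70k = shards_needed(70_000)
```

Linear scaling is the optimistic case; in practice replication and query traffic would reduce the usable indexing rate.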
  • 39. Central syslog servers: • Network and OS system messages are forwarded to several central syslog servers. Forward syslog to Solr using the Flume Morphline SolrSink: • Parse messages with Morphline and grok patterns. SolrCloud: • Index log lines as documents into a Collection (i.e. index). HUE Solr search: • Simple UI to build a customized search page layout with faceting and sorting • Easy drill-down with multiple facets: severity, datacenter, hostname, etc.
  • 40. Screen shots
  • 41. Search by time; sort by selected field; facets by selected fields
  • 42. Wildcard query by field; highlight the query keywords
  • 43. Data sources: REST/JSON, log4j, syslog, Avro, Thrift. Parsing: Cloudera Morphlines. NRT indexing: SolrCloud embedded in CDH. Batch indexing: MapReduce. Analytics: use your favorite tool; raw detailed data is stored in HDFS
  • 44. email: ari.flink@webex.com twitter: @raaka
  • 45. Thank you.

Editor's Notes

  1. As of Feb 2013
  2. As of Feb 2013
  3. As of Feb 2013
  4. CEP: Complex Event Processing
  5. CEP: Complex Event Processing
  6. CEP: Complex Event Processing
  7. CEP: Complex Event Processing
  8. CEP: Complex Event Processing
  9. CEP: Complex Event Processing
  10. CEP: Complex Event Processing
  11. CEP: Complex Event Processing
  12. CEP: Complex Event Processing
  13. CEP: Complex Event Processing
  14. CEP: Complex Event Processing
  15. CEP: Complex Event Processing
  16. CEP: Complex Event Processing