Enabling Exploratory Analytics of Data in
Shared-service Hadoop Clusters
Presented by Sagi Zelnick, Principal Architect @ Yahoo, and Ledion Bitincka, Principal Architect @ Splunk
Hadoop Summit June 2014 San Jose, CA
Overview
2 Yahoo Proprietary
•  Hadoop @ Yahoo: 8+ years of innovation
•  Hunk @ Yahoo: organization-wide investment for the next 3+ years
•  Yahoo provides Hunk as a self-service to explore, analyze & visualize data in HDFS
›  Hunk allows for visually browsing very complex tables (250+ fields)
›  Rapid prototyping for new jobs with almost instant results for searches, without having to wait for the entire job/query to finish
›  Cuts down on development cycles through faster interaction with results
›  Built-in graphs/charts make for a powerful solution in many situations
About your speakers
Sagi Zelnick, Principal Architect, Yahoo
Ledion Bitincka, Principal Architect, Splunk
Hunk + Hadoop @ Yahoo
History of Hadoop innovation @ Yahoo
Over 600PB of Hadoop storage (over half an Exabyte)
•  Very large clusters used by many groups across the enterprise.
•  More than 35,000 individual datanodes.
•  Hadoop is provided as a service.
•  Multiple cluster types such as research, dev, sandbox and production.
•  Services such as HBase, Hive, Oozie, etc…
•  Users are free to run jobs, but have resource constraints.
•  Maintained by the Grid Operations Group.
Improving operational visibility with Hunk
•  We pointed Hunk at many operational logs and event data we already had on the grid.
•  This includes system metrics, HDFS ops, JVM stats and YARN metrics.
•  Created instrumentation to measure usage per user and job.
•  Analyzed terabytes of NameNode audit logs.
•  Job history leveraged for visualizing usage/growth and historical views.
•  Custom events for HBase statistics.
Use Case | Customer | Benefits
System metrics from 35k nodes | Grid Ops / Grid Customers | Identify slow tasks/nodes when debugging
Historical insights of resources | All Grid Customers | Track organic growth
Job performance | All Grid Customers | Improved job SLAs
HBase metrics | All Grid Customers | Track region/RS/table metrics…
Job logs in near real-time | All Grid Customers / Ops | Search for errors directly from the YARN logs
Namenode operational data | Research, Dev | Improved performance and stability
Tracking Hadoop performance and metrics in Hunk
Measuring NameNode performance pre & post upgrades
•  Historical visualizations of all operations.
•  Search data in Hunk from billions of NameNode events.
•  Measure JVM and memory usage.
•  Insights into operational performance.
index="simon_blue_new_all" this_cluster="dilithiumblue*" (log_subtype="DFS" #hdfs=hdfs) | timechart span=1h avg(number*) as num_*
Last 7 days
✓ 10,086 events (5/15/14 1:00:00.000 AM to 5/22/14 1:36:34.000 AM)
[Chart: hourly averages by _time, Fri May 16 to Tue May 20, 2014; y-axis ticks at 200,000,000, 400,000,000 and 600,000,000. Series: num_BlockReports, num_CopyBlockOperations, num_HeartBeats, num_ReadBlockOperations, num_ReadMetadataOperations, num_ReplaceBlockOperations, num_WriteBlockOperations, num_blockChecksumOp]
_time | num_BlockReports | num_CopyBlockOperations | num_HeartBeats | num_ReadBlockOperations | num_ReadMetadataOperations | num_ReplaceBlockOperations | num_WriteBlockOperations | num_blockChecksumOp
2014-05-15 01:00 | 1124437.735902 | 46721126.819672 | 514957.384098 | 12930433.077869 | 0.000000 | 94210832.786885 | 63512425.967213 | 13975.306557
2014-05-15 02:00 | 1115496.290492 | 53597000.262295 | 298717.637049 | 10402176.717213 | 0.000000 | 94109944.655738 | 93916552.393443 | 35459.288689
2014-05-15 03:00 | 1110372.4173 | 56566721.704918 | 428494.9449 | 13296385.590164 | 0.000000 | 94141430.295082 | 97353478.229508 | 20307.549344
(Some digits in the last row are cut off in the source.)
Visualization using Hunk
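The search on this slide buckets events into fixed time spans and averages every field matching `number*`, renaming each result to `num_*`. As a rough sketch of that bucketing only (not how Hunk evaluates the search), the same idea in Python could look like this. The event dictionaries, the `numberHeartBeats` field name and the function name `timechart_avg` are assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def timechart_avg(events, span, prefix="number", new_prefix="num_"):
    """Average every field starting with `prefix` per time bucket of width `span`."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for event in events:
        # Snap the timestamp down to the start of its bucket.
        bucket = EPOCH + ((event["_time"] - EPOCH) // span) * span
        for field, value in event.items():
            if field.startswith(prefix):
                # Mimic the "as num_*" rename: keep the wildcard part of the name.
                name = new_prefix + field[len(prefix):]
                sums[bucket][name] += value
                counts[bucket][name] += 1
    return {
        bucket: {name: total / counts[bucket][name] for name, total in fields.items()}
        for bucket, fields in sorted(sums.items())
    }


# Hypothetical events; field names and values are invented.
events = [
    {"_time": datetime(2014, 5, 15, 1, 5), "numberHeartBeats": 10.0},
    {"_time": datetime(2014, 5, 15, 1, 40), "numberHeartBeats": 20.0},
    {"_time": datetime(2014, 5, 15, 2, 10), "numberHeartBeats": 40.0},
]
hourly = timechart_avg(events, timedelta(hours=1))
```

Changing `span` to `timedelta(minutes=5)` gives the finer-grained view used for troubleshooting.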
n=5m avg(number*) as num_*
Last 2 days
✓ 2,753 events (5/20/14 1:14:21.000 AM to 5/22/14 1:14:21.000 AM)
[Chart: 5-minute averages by _time, 12:00 PM Tue May 20 to 12:00 PM Wed May 21, 2014; y-axis ticks from 250,000,000 to 1,000,000,000. Series: num_BlockReports, num_CopyBlockOperations, num_HeartBeats, num_ReadBlockOperations, num_ReadMetadataOperations, num_ReplaceBlockOperations, num_WriteBlockOperations, num_blockChecksumOp]
_time | num_BlockReports | num_CopyBlockOperations | num_HeartBeats | num_ReadBlockOperations | num_ReadMetadataOperations | num_ReplaceBlockOperations | num_WriteBlockOperations | num_blockChecksumOp
2014-05-20 01:15:00 | 1056047.024000 | 34677652.000000 | 124121.264000 | 26242490.800000 | 0.000000 | 88112292.800000 | 126478486.400000 | 51405.346000
2014-05-20 01:20:00 | 105551 | 30920700. | 10653 | 22756041.8 | 0.000000 | 87745422.40 | 92323387.2 | 32070.48
(The last row is cut off in the source; its values are partial.)
Sample troubleshooting in Hunk of 750 million events
index="simon_blue_new_all" this_cluster="dilithiumblue*" (log_subtype="JVM" ProcessName="NameNode") | timechart span=5m avg(Threads*) as threads_*
Last 2 days
✓ 8,463 events (5/20/14 12:00:00.000 AM to 5/22/14 12:00:00.000 AM)
[Chart: 5-minute averages of NameNode JVM thread states by _time, 12:00 AM Tue May 20 to 12:00 PM Wed May 21, 2014; y-axis ticks at 200 and 400. Series: threads_Blocked, threads_New, threads_Runnable, threads_Terminated, threads_TimedWaiting, threads_Waiting]
_time | threads_Blocked | threads_New | threads_Runnable | threads_Terminated | threads_TimedWaiting | threads_Waiting
2014-05-20 00:00:00 | 72.360000 | 10.638333 | 5.485833 | 0.000000 | 21.208333 | 78.555000
2014-05-20 00:05:00 | 70.177333 | 10.554667 | 5.277333 | 0.000000 | 20.744667 | 76.578000
2014-05-20 00:10:00 | 70.211333 | 9.998667 | 5.022000 | 0.000000 | 19.333333 | 73.766667
2014-05-20 00:15:00 | 70.300667 | 10.268000 | 5.156667 | 0.000000 | 17.488667 | 70.122000
2014-05-20 00:20:00 | 70.422667 | 10.376000 | 5.188000 | 0.000000 | 15.700000 | 66.611333
2014-05-20 00:25:00 | 70.444000 | 10.288000 | 5.144000 | 0.000000 | 14.089333 | 63.400667
Big picture plus granular details
Analyzing NameNode RPC calls (troubleshooting)
•  Who is making which RPC calls (open, listStatus, create, etc.).
•  How often they are making these RPC calls.
•  Which IP/host the calls are coming from.
•  Search and visualize historical data from billions of events.
•  Prevent NameNode abuse/misuse.
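To make the who/what/where questions above concrete, here is a minimal Python sketch that counts RPC commands per user and client IP. It assumes tab-separated `key=value` audit lines in the usual HDFS audit log layout (`ugi=`, `ip=`, `cmd=`); that layout, the sample lines, the user names and the function names are assumptions for illustration and are not taken from the slides.

```python
from collections import Counter


def parse_audit_line(line):
    """Extract key=value pairs from one tab-separated audit line."""
    _, _, payload = line.partition("audit: ")
    fields = {}
    for part in payload.split("\t"):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def rpc_call_counts(lines):
    """Count calls per (user, command, client IP)."""
    counts = Counter()
    for line in lines:
        fields = parse_audit_line(line)
        if {"ugi", "cmd", "ip"} <= fields.keys():
            user = fields["ugi"].split(" ")[0]  # drop the "(auth:...)" suffix
            counts[(user, fields["cmd"], fields["ip"].lstrip("/"))] += 1
    return counts


# Invented sample lines, for illustration only.
sample = [
    "2014-05-20 00:00:01,000 INFO FSNamesystem.audit: allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.1\tcmd=open\tsrc=/data/a\tdst=null\tperm=null",
    "2014-05-20 00:00:02,000 INFO FSNamesystem.audit: allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.1\tcmd=open\tsrc=/data/b\tdst=null\tperm=null",
    "2014-05-20 00:00:03,000 INFO FSNamesystem.audit: allowed=true\tugi=bob (auth:SIMPLE)\tip=/10.0.0.2\tcmd=listStatus\tsrc=/data\tdst=null\tperm=null",
]
counts = rpc_call_counts(sample)
top_caller = counts.most_common(1)[0]
```

Sorting the counter by count surfaces the heaviest callers first, which is the starting point for spotting NameNode misuse.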
Visualizing 834 million discrete events …
… continued
Queue insights (capacity & provisioning)
•  Each Hadoop job runs in a specific queue.
•  We track every aspect of the YARN framework.
•  Immediate queue performance and configuration profiling via job history server.
•  Historical views and trends that enable better capacity management.
•  Improved queue utilization and allocation management.
index="jobsummary_logs_all_red" cluster="dilithium*" |
eval total_slot_seconds=(mapSlotSeconds + reduceSlotSeconds) |
eval gb_hours=((total_slot_seconds * 0.5) / 3600) |
eval gb_hours=round(gb_hours) |
timechart span=6h sum(gb_hours) as gb_hours by queue
Last 7 days
✓ 1,175,726 events (5/20/14 8:00:00.000 PM to 5/27/14 8:26:26.000 PM)
[Chart: gb_hours per queue by _time, Wed May 21 to Mon May 26, 2014; y-axis ticks at 200,000, 400,000 and 600,000]
_time | OTHER | apg_dailyhigh_p3 | apg_dailymedium_p5 | apg_hourlyhigh_p1 | apg_hourlylow_p4 | apg_hourlymedium_p2 | apg_p7 | curveball_large | curveball_med | slingshot | slingstone
2014-05-20 18:00 | 4154 | 45512 | 7071 | 25643 | 12111 | 29664 | 3473 | 26547 | 14192 | 60875 | 45376
2014-05-21 00:00 | 19341 | 92661 | 18005 | 41008 | 22944 | 88115 | 10896 | 38648 | 8693 | 48186 | 87670
2014-05-21 06:00 | 211 | 108137 | 38398 | 35627 | 14934 | 101925 | 244 | 29269 | 14066 | 2434 | 4783
(The last row is cut off in the source; some of its values may be partial.)
Visualizing queues
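The queue search derives a gb_hours figure per job as (mapSlotSeconds + reduceSlotSeconds) * 0.5 / 3600, rounds it, and sums it per queue. The arithmetic can be sketched in Python as below; the 0.5 factor is copied from the query as-is (the slides do not say what it represents), and the job records, slot-second values and function names are made up for illustration. Time bucketing is left out to keep the sketch short.

```python
import math
from collections import defaultdict


def gb_hours(map_slot_seconds, reduce_slot_seconds, factor=0.5):
    """Mirror the eval steps: slot-seconds * factor / 3600, rounded to an integer."""
    total_slot_seconds = map_slot_seconds + reduce_slot_seconds
    value = total_slot_seconds * factor / 3600
    # Round half up (valid for non-negative values); Python's round() rounds half to even.
    return math.floor(value + 0.5)


def gb_hours_by_queue(jobs, factor=0.5):
    """Sum the per-job gb_hours for each queue."""
    totals = defaultdict(int)
    for job in jobs:
        totals[job["queue"]] += gb_hours(
            job["mapSlotSeconds"], job["reduceSlotSeconds"], factor
        )
    return dict(totals)


# Hypothetical job summary records.
jobs = [
    {"queue": "slingshot", "mapSlotSeconds": 7200, "reduceSlotSeconds": 7200},
    {"queue": "slingshot", "mapSlotSeconds": 3600, "reduceSlotSeconds": 0},
    {"queue": "apg_p7", "mapSlotSeconds": 36000, "reduceSlotSeconds": 36000},
]
totals = gb_hours_by_queue(jobs)
```

Because rounding happens per job before the sum, many small jobs can shift a queue's total slightly compared with rounding once at the end.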
Self-service job reports
•  Each job is unique and so are the map and reduce elements.
•  How to start analyzing jobs?
•  Historical job performance and profiling enable in-depth performance tuning.
•  Long-term historical views and trending of growth.
cluster | user | queue | jobName | jobId | status | gb-hours | run_mins
cobalt | gmon | grideng | PigLatin:findRemoteHDFSFromAudits.pig | job_1398982765383_315271 | SUCCEEDED | 108.00 | 33.07
cobalt | gmon | grideng | PigLatin:findRemoteHDFSFromAudits.pig | job_1398982765383_312700 | SUCCEEDED | 104.00 | 37.37
cobalt | gmon | grideng | PigLatin:findRemoteHDFSFromAudits.pig | job_1398982765383_309715 | SUCCEEDED | 88.00 | 29.83
cobalt | gmon | gridops | distcp: | job_1398982765383_309921 | SUCCEEDED | 36.00 | 68.49
cobalt | gmon | gridops | SPLK_spbl103n01.blue.ygrid.yahoo.com_1401125953.2076_0 | job_1398982765383_313570 | SUCCEEDED | 25.00 | 14.26
cobalt | gmon | gridops | nnaudit_DR_2014_05_25 | job_1398982765383_308938 | SUCCEEDED | 25.00 | 15.43
cob | g | grid | nnaudit_DB_2014_05_25 | job_1398982765 | SUCCE | 24.00 | 18.07
(The last row is cut off in the source; its text cells are partial.)
index="jobsummary_logs_all_blue" cluster="*" user="gmon" |
eval total_slot_seconds=(mapSlotSeconds + reduceSlotSeconds) |
eval gb_hours=((total_slot_seconds * 0.5) / 3600) |
eval gb_hours=round(gb_hours,2) |
eval runtime=(finishTime-submitTime)/1000 |
stats sum(gb_hours) as gb-hours avg(runtime) as run_mins by cluster user queue jobName jobId status |
eval run_mins=round(run_mins/60,2) |
sort -gb-hours
Yesterday
✓ 4,871 events (5/26/14 12:00:00.000 AM to 5/27/14 12:00:00.000 AM)
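The per-job report groups job summary events by cluster, user, queue, job name, job ID and status, sums gb_hours, averages the runtime, converts it to minutes, and sorts by gb-hours in descending order. A minimal Python sketch of those steps follows. It assumes `submitTime` and `finishTime` are epoch milliseconds (which the division by 1000 suggests but the slides do not state); the sample events, job IDs and the function name `job_report` are invented.

```python
from collections import defaultdict


def job_report(events, factor=0.5):
    """Return rows of (cluster, user, queue, jobName, jobId, status, gb_hours, run_mins)."""
    groups = defaultdict(lambda: {"gb_hours": 0.0, "runtimes": []})
    for e in events:
        key = (e["cluster"], e["user"], e["queue"], e["jobName"], e["jobId"], e["status"])
        slot_seconds = e["mapSlotSeconds"] + e["reduceSlotSeconds"]
        groups[key]["gb_hours"] += round(slot_seconds * factor / 3600, 2)
        # Assumed milliseconds -> seconds.
        groups[key]["runtimes"].append((e["finishTime"] - e["submitTime"]) / 1000)
    rows = []
    for key, agg in groups.items():
        avg_runtime = sum(agg["runtimes"]) / len(agg["runtimes"])
        rows.append(key + (round(agg["gb_hours"], 2), round(avg_runtime / 60, 2)))
    # Largest consumers first, like "sort -gb-hours".
    rows.sort(key=lambda row: row[6], reverse=True)
    return rows


# Hypothetical events; none of these values come from the slides.
events = [
    {"cluster": "c1", "user": "u1", "queue": "q1", "jobName": "small", "jobId": "job_0002",
     "status": "SUCCEEDED", "mapSlotSeconds": 36000, "reduceSlotSeconds": 0,
     "submitTime": 0, "finishTime": 600000},
    {"cluster": "c1", "user": "u1", "queue": "q1", "jobName": "big", "jobId": "job_0001",
     "status": "SUCCEEDED", "mapSlotSeconds": 360000, "reduceSlotSeconds": 360000,
     "submitTime": 0, "finishTime": 1800000},
]
report = job_report(events)
```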
More data to tap into with the metastore / Hive sources
•  Using the metastore we can set up virtual indexes to any table(s) in Hive, without the need to define the schema up-front
•  Visualize very complex tables (250+ fields)
•  Rapid prototyping for new jobs with almost instant results for searches, without having to wait for the entire job/query to finish
•  Built-in aggregates and graphs/charts
•  Accelerates development workflow by providing faster interaction with data
... it’s not just logs we’re looking at
Meet Hunk
Integrated Analytics Platform for Diverse Data Stores
Full-featured, Integrated Product
Fast Insights for Everyone
Works with What You Have Today
Explore, Visualize, Dashboards, Share, Analyze
Hadoop Clusters; NoSQL and Other Data Stores
Hadoop Client Libraries; Streaming Resource Libraries
Fast Deployment and Configuration
Just point at Hadoop
•  Certified integrations to all major Hadoop distributions
•  Choose 1st-gen MapReduce or YARN
•  Create Virtual Indexes across one or more clusters
•  From download to searching data in < 60 minutes
Connect to one or multiple Hadoop clusters
YARN certified
Interactive Search and Results Preview
Rapidly interact with data
•  Powerful Search Processing Language (SPL™)
•  Ad hoc exploratory analytics across massive datasets
•  Preview results
•  No fixed schema
•  No requirement to “understand” data upfront
Search interface
Preview results
Drill down to raw data
Pause or stop MapReduce jobs
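The idea of previewing results before a job finishes can be illustrated with a toy Python generator that merges partial counts as each chunk of input completes and yields a snapshot after every chunk. This is only a sketch of incremental aggregation under made-up names (`preview_counts`, the `status` field, the sample chunks); it is not a description of how Hunk implements previews.

```python
from collections import Counter


def preview_counts(chunks, field):
    """Yield a running count of `field` values after each processed chunk."""
    running = Counter()
    for chunk in chunks:
        running.update(event[field] for event in chunk if field in event)
        # Each snapshot is usable right away; later snapshots refine it.
        yield dict(running)


# Hypothetical input, split into two chunks of events.
chunks = [
    [{"status": "SUCCEEDED"}, {"status": "FAILED"}],
    [{"status": "SUCCEEDED"}],
]
previews = list(preview_counts(chunks, "status"))
```

A consumer can stop iterating as soon as a snapshot answers the question, which loosely mirrors pausing or stopping a job once the preview is good enough.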
Powerful Dashboards for Self-Service Analytics
Interactive Dashboards and Charts
•  Easy-to-use dashboard editor
•  Chart overlay
•  Pan and zoom
•  In-dashboard drill down
•  Embed charts and dashboards in 3rd party apps
•  Reuse skills with Splunk Enterprise 6.1 and Hunk 6.1
Automate Access for Rapid Exploration
Supported File Formats
•  Text files
•  Sequence files
•  RCFile
•  ORC files
•  Parquet
Role-based Security for Shared Clusters
Pass-through Authentication
•  Provide role-based security for Hadoop clusters
•  Access Hadoop resources under security and compliance
•  Integrates with Kerberos for Hadoop security
[Diagram: users mapped to queues]
Business Analyst (Queue: Biz Analytics)
Marketing Analyst (Queue: Marketing)
Sys Admin (Queue: Prod)
Powerful Developer Environment
•  Use a standards-based web framework and REST API
•  Customize dashboards and UIs with Simple XML, JavaScript or Django
•  Choose among SDKs
•  One integration for both Splunk Enterprise and Hunk
Build Analytics-Rich Big Data Apps
Full-Featured, Integrated Analytics Platform
FULL FEATURED ANALYTICS: Explore, analyze and visualize data in one integrated platform
FAST TO DEPLOY AND DRIVE VALUE: Point Hunk at your storage clusters and explore data immediately
INTERACTIVE SEARCH: Preview results as MapReduce jobs run and accelerate reports with no fixed schemas
RICH DEVELOPER ENVIRONMENT: Build big data apps using standard web languages and frameworks
Questions/Comments?
Sagi Zelnick – Principal Architect
Email: zelnicks@yahoo-inc.com
Ledion Bitincka – Principal Architect
Email: lbitincka@splunk.com

More Related Content

What's hot

Splunk Sales Presentation Imagemaker 2014
Splunk Sales Presentation Imagemaker 2014Splunk Sales Presentation Imagemaker 2014
Splunk Sales Presentation Imagemaker 2014Urena Nicolas
 
SplunkLive! London 2016 Splunk Overview
SplunkLive! London 2016 Splunk OverviewSplunkLive! London 2016 Splunk Overview
SplunkLive! London 2016 Splunk OverviewSplunk
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionSplunk
 
SplunkLive! Customer Presentation - Garmin International
SplunkLive! Customer Presentation - Garmin InternationalSplunkLive! Customer Presentation - Garmin International
SplunkLive! Customer Presentation - Garmin InternationalSplunk
 
How to Design, Build and Map IT and Business Services in Splunk
How to Design, Build and Map IT and Business Services in SplunkHow to Design, Build and Map IT and Business Services in Splunk
How to Design, Build and Map IT and Business Services in SplunkSplunk
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk
 
SplunkLive! Customer Presentation - ExxonMobil
SplunkLive! Customer Presentation - ExxonMobilSplunkLive! Customer Presentation - ExxonMobil
SplunkLive! Customer Presentation - ExxonMobilSplunk
 
Data Onboarding Breakout Session
Data Onboarding Breakout SessionData Onboarding Breakout Session
Data Onboarding Breakout SessionSplunk
 
SplunkLive! London: Splunk ninjas- new features and search dojo
SplunkLive! London: Splunk ninjas- new features and search dojoSplunkLive! London: Splunk ninjas- new features and search dojo
SplunkLive! London: Splunk ninjas- new features and search dojoSplunk
 
Sl boston 05_12_15_ener_noc_final_public
Sl boston 05_12_15_ener_noc_final_publicSl boston 05_12_15_ener_noc_final_public
Sl boston 05_12_15_ener_noc_final_publicSplunk
 
Splunk in Staples: IT Operations
Splunk in Staples: IT OperationsSplunk in Staples: IT Operations
Splunk in Staples: IT OperationsTimur Bagirov
 
Customer Presentation
Customer PresentationCustomer Presentation
Customer PresentationSplunk
 
Splunk for Developers
Splunk for DevelopersSplunk for Developers
Splunk for DevelopersSplunk
 
Splunk Enterprise 6.4
Splunk Enterprise 6.4Splunk Enterprise 6.4
Splunk Enterprise 6.4Splunk
 
SplunkLive! San Francisco Dec 2012 - Intuit
SplunkLive! San Francisco Dec 2012 - IntuitSplunkLive! San Francisco Dec 2012 - Intuit
SplunkLive! San Francisco Dec 2012 - IntuitSplunk
 
SplunkLive! Warsaw 2016 - Cisco
SplunkLive! Warsaw 2016 - Cisco SplunkLive! Warsaw 2016 - Cisco
SplunkLive! Warsaw 2016 - Cisco Splunk
 
Power of Splunk Search Processing Language (SPL) ...
Power of Splunk Search Processing Language (SPL)                             ...Power of Splunk Search Processing Language (SPL)                             ...
Power of Splunk Search Processing Language (SPL) ...Splunk
 
Splunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search DojoSplunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search DojoSplunk
 
SplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunk
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseSplunk
 

What's hot (20)

Splunk Sales Presentation Imagemaker 2014
Splunk Sales Presentation Imagemaker 2014Splunk Sales Presentation Imagemaker 2014
Splunk Sales Presentation Imagemaker 2014
 
SplunkLive! London 2016 Splunk Overview
SplunkLive! London 2016 Splunk OverviewSplunkLive! London 2016 Splunk Overview
SplunkLive! London 2016 Splunk Overview
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout Session
 
SplunkLive! Customer Presentation - Garmin International
SplunkLive! Customer Presentation - Garmin InternationalSplunkLive! Customer Presentation - Garmin International
SplunkLive! Customer Presentation - Garmin International
 
How to Design, Build and Map IT and Business Services in Splunk
How to Design, Build and Map IT and Business Services in SplunkHow to Design, Build and Map IT and Business Services in Splunk
How to Design, Build and Map IT and Business Services in Splunk
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search Dojo
 
SplunkLive! Customer Presentation - ExxonMobil
SplunkLive! Customer Presentation - ExxonMobilSplunkLive! Customer Presentation - ExxonMobil
SplunkLive! Customer Presentation - ExxonMobil
 
Data Onboarding Breakout Session
Data Onboarding Breakout SessionData Onboarding Breakout Session
Data Onboarding Breakout Session
 
SplunkLive! London: Splunk ninjas- new features and search dojo
SplunkLive! London: Splunk ninjas- new features and search dojoSplunkLive! London: Splunk ninjas- new features and search dojo
SplunkLive! London: Splunk ninjas- new features and search dojo
 
Sl boston 05_12_15_ener_noc_final_public
Sl boston 05_12_15_ener_noc_final_publicSl boston 05_12_15_ener_noc_final_public
Sl boston 05_12_15_ener_noc_final_public
 
Splunk in Staples: IT Operations
Splunk in Staples: IT OperationsSplunk in Staples: IT Operations
Splunk in Staples: IT Operations
 
Customer Presentation
Customer PresentationCustomer Presentation
Customer Presentation
 
Splunk for Developers
Splunk for DevelopersSplunk for Developers
Splunk for Developers
 
Splunk Enterprise 6.4
Splunk Enterprise 6.4Splunk Enterprise 6.4
Splunk Enterprise 6.4
 
SplunkLive! San Francisco Dec 2012 - Intuit
SplunkLive! San Francisco Dec 2012 - IntuitSplunkLive! San Francisco Dec 2012 - Intuit
SplunkLive! San Francisco Dec 2012 - Intuit
 
SplunkLive! Warsaw 2016 - Cisco
SplunkLive! Warsaw 2016 - Cisco SplunkLive! Warsaw 2016 - Cisco
SplunkLive! Warsaw 2016 - Cisco
 
Power of Splunk Search Processing Language (SPL) ...
Power of Splunk Search Processing Language (SPL)                             ...Power of Splunk Search Processing Language (SPL)                             ...
Power of Splunk Search Processing Language (SPL) ...
 
Splunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search DojoSplunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search Dojo
 
SplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - Staples
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk Enterprise
 

Viewers also liked

SplunkLive! Hunk Technical Deep Dive
SplunkLive! Hunk Technical Deep DiveSplunkLive! Hunk Technical Deep Dive
SplunkLive! Hunk Technical Deep DiveSplunk
 
SplunkLive! Hunk Technical Overview
SplunkLive! Hunk Technical OverviewSplunkLive! Hunk Technical Overview
SplunkLive! Hunk Technical OverviewSplunk
 
Monitoring a Database Driven System Utilizing Splunk's DB Connect
Monitoring a Database Driven System Utilizing Splunk's DB ConnectMonitoring a Database Driven System Utilizing Splunk's DB Connect
Monitoring a Database Driven System Utilizing Splunk's DB ConnectSplunk
 
Hunk - Unlocking the Power of Big Data
Hunk - Unlocking the Power of Big DataHunk - Unlocking the Power of Big Data
Hunk - Unlocking the Power of Big DataSplunk
 
Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDBSplunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDBMongoDB
 
Loushkii lookbook voyager 2012 spectrum
Loushkii lookbook voyager 2012 spectrumLoushkii lookbook voyager 2012 spectrum
Loushkii lookbook voyager 2012 spectrumloushkii
 
Power point brescia
Power point brescia Power point brescia
Power point brescia simonefelcaro
 
Thấu hiểu và vượt qua sự trì hoãn
Thấu hiểu và vượt qua sự trì hoãnThấu hiểu và vượt qua sự trì hoãn
Thấu hiểu và vượt qua sự trì hoãnTrần Onion
 
Sandy financial analysis
Sandy financial analysisSandy financial analysis
Sandy financial analysispiyush.u.t
 
ukol KPI
ukol KPIukol KPI
ukol KPISlavoM
 

Viewers also liked (20)

SplunkLive! Hunk Technical Deep Dive
SplunkLive! Hunk Technical Deep DiveSplunkLive! Hunk Technical Deep Dive
SplunkLive! Hunk Technical Deep Dive
 
Vantrix hunk
Vantrix hunkVantrix hunk
Vantrix hunk
 
SplunkLive! Hunk Technical Overview
SplunkLive! Hunk Technical OverviewSplunkLive! Hunk Technical Overview
SplunkLive! Hunk Technical Overview
 
Monitoring a Database Driven System Utilizing Splunk's DB Connect
Monitoring a Database Driven System Utilizing Splunk's DB ConnectMonitoring a Database Driven System Utilizing Splunk's DB Connect
Monitoring a Database Driven System Utilizing Splunk's DB Connect
 
Hunk - Unlocking the Power of Big Data
Hunk - Unlocking the Power of Big DataHunk - Unlocking the Power of Big Data
Hunk - Unlocking the Power of Big Data
 
Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDBSplunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
Splunk's Hunk: A Powerful Way to Visualize Your Data Stored in MongoDB
 
Catalog lc 2012-2
Catalog lc 2012-2Catalog lc 2012-2
Catalog lc 2012-2
 
cloning
cloningcloning
cloning
 
Loushkii lookbook voyager 2012 spectrum
Loushkii lookbook voyager 2012 spectrumLoushkii lookbook voyager 2012 spectrum
Loushkii lookbook voyager 2012 spectrum
 
Power point brescia
Power point brescia Power point brescia
Power point brescia
 
Thấu hiểu và vượt qua sự trì hoãn
Thấu hiểu và vượt qua sự trì hoãnThấu hiểu và vượt qua sự trì hoãn
Thấu hiểu và vượt qua sự trì hoãn
 
Sandy financial analysis
Sandy financial analysisSandy financial analysis
Sandy financial analysis
 
Kpi_závěr ukol
Kpi_závěr ukolKpi_závěr ukol
Kpi_závěr ukol
 
ukol KPI
ukol KPIukol KPI
ukol KPI
 
power point brescia
power point bresciapower point brescia
power point brescia
 
2013 ufsc rt_seccom
2013 ufsc rt_seccom2013 ufsc rt_seccom
2013 ufsc rt_seccom
 
Advertising awards
Advertising awardsAdvertising awards
Advertising awards
 
Rapid-fire BI
Rapid-fire BIRapid-fire BI
Rapid-fire BI
 
2013 ufsc rt_grad_class
2013 ufsc rt_grad_class2013 ufsc rt_grad_class
2013 ufsc rt_grad_class
 
Woocommerce
WoocommerceWoocommerce
Woocommerce
 

Similar to Yahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters

Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
Enabling Exploratory Analytics of Data in Shared-service Hadoop ClustersEnabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
Enabling Exploratory Analytics of Data in Shared-service Hadoop ClustersDataWorks Summit
 
Hunk - Unlocking The Power of Big Data Breakout Session
Hunk - Unlocking The Power of Big Data Breakout SessionHunk - Unlocking The Power of Big Data Breakout Session
Hunk - Unlocking The Power of Big Data Breakout SessionSplunk
 
Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup   Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup Qubole
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesQubole
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Bb world 2011 capacity planning
Bb world 2011 capacity planningBb world 2011 capacity planning
Bb world 2011 capacity planningSteve Feldman
 
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
Data Culture Series  - Keynote & Panel - Reading - 12th May 2015Data Culture Series  - Keynote & Panel - Reading - 12th May 2015
Data Culture Series - Keynote & Panel - Reading - 12th May 2015Jonathan Woodward
 
Unifying your data management with Hadoop
Unifying your data management with HadoopUnifying your data management with Hadoop
Unifying your data management with HadoopJayant Shekhar
 
Data Culture Series - Keynote - 3rd Dec
Data Culture Series - Keynote - 3rd DecData Culture Series - Keynote - 3rd Dec
Data Culture Series - Keynote - 3rd DecJonathan Woodward
 
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages JaunesBreizhJUG - Janvier 2014 - Big Data -  Dataiku - Pages Jaunes
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages JaunesDataiku
 
Build It And They Will Come: User Adoption SharePoint 2013 (SPS Charlotte)
Build It And They Will Come:  User Adoption SharePoint 2013 (SPS Charlotte)Build It And They Will Come:  User Adoption SharePoint 2013 (SPS Charlotte)
Build It And They Will Come: User Adoption SharePoint 2013 (SPS Charlotte)Stacy Deere
 
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham ALSecrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham ALMark Tabladillo
 
Building Analytics Infrastructure for Growing Tech Companies
Building Analytics Infrastructure for Growing Tech CompaniesBuilding Analytics Infrastructure for Growing Tech Companies
Building Analytics Infrastructure for Growing Tech CompaniesHolistics Software
 
Uncover Your Data Journey: End-To-End Data Lineage For SAP BOBJ And SAP Data ...
Uncover Your Data Journey: End-To-End Data Lineage For SAP BOBJ And SAP Data ...Uncover Your Data Journey: End-To-End Data Lineage For SAP BOBJ And SAP Data ...
Uncover Your Data Journey: End-To-End Data Lineage For SAP BOBJ And SAP Data ...Wiiisdom
 
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...Márton Kodok
 
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Séminaire Big Data Alter Way - Elasticsearch - octobre 2014ALTER WAY
 
2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQLYu Ishikawa
 

Similar to Yahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters (20)

Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
Enabling Exploratory Analytics of Data in Shared-service Hadoop ClustersEnabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
 
Agile scrum with Microsoft VSTS
Agile scrum with Microsoft VSTSAgile scrum with Microsoft VSTS
Agile scrum with Microsoft VSTS
 
Hunk - Unlocking The Power of Big Data Breakout Session
Hunk - Unlocking The Power of Big Data Breakout SessionHunk - Unlocking The Power of Big Data Breakout Session
Hunk - Unlocking The Power of Big Data Breakout Session
 
Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup   Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup
 
StaffingModel_EXAMPLE
StaffingModel_EXAMPLEStaffingModel_EXAMPLE
StaffingModel_EXAMPLE
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slides
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Bb world 2011 capacity planning
Bb world 2011 capacity planningBb world 2011 capacity planning
Bb world 2011 capacity planning
 
Frappe Open Day - August 2018
Frappe Open Day - August 2018Frappe Open Day - August 2018
Frappe Open Day - August 2018
 
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
Data Culture Series  - Keynote & Panel - Reading - 12th May 2015Data Culture Series  - Keynote & Panel - Reading - 12th May 2015
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
 
Unifying your data management with Hadoop
Unifying your data management with HadoopUnifying your data management with Hadoop
Unifying your data management with Hadoop
 
Data Culture Series - Keynote - 3rd Dec
Yahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters

  • 1. Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters. Presented by Sagi Zelnick, Principal Architect @ Yahoo, and Ledion Bitincka, Principal Architect @ Splunk. Hadoop Summit, June 2014, San Jose, CA.
  • 2. Overview (Yahoo Proprietary)
      ◦ Hadoop @ Yahoo: 8+ years of innovation
      ◦ Hunk @ Yahoo: organization-wide investment for the next 3+ years
      ◦ Yahoo is providing Hunk as a self-service to explore, analyze & visualize data in HDFS
          › Hunk allows visual browsing of very complex tables (250+ fields)
          › Rapid prototyping for new jobs, with almost instant results for searches, without having to wait for the entire job/query to finish
          › Cuts down development cycles through faster interaction with results
          › Built-in graphs/charts make for a powerful solution in many situations
  • 3. About your speakers: Sagi Zelnick, Principal Architect, Yahoo; Ledion Bitincka, Principal Architect, Splunk
  • 4. Hunk + Hadoop @ Yahoo
  • 5. History of Hadoop innovation @ Yahoo
  • 6. Over 600PB of Hadoop storage (over half an Exabyte)
      ◦ Very large clusters used by many groups across the enterprise.
      ◦ More than 35,000 individual datanodes.
      ◦ Hadoop is provided as a service.
      ◦ Multiple cluster types such as research, dev, sandbox and production.
      ◦ Services such as HBase, Hive, Oozie, etc…
      ◦ Users are free to run jobs, but have resource constraints.
      ◦ Maintained by the Grid Operations Group.
  • 7. Improving operational visibility with Hunk
      ◦ We pointed Hunk at many operational logs and event data we already had on the grid.
      ◦ This includes system metrics, HDFS ops, JVM stats and YARN metrics.
      ◦ Created instrumentation to measure usage per user and job.
      ◦ Analyzed terabytes of NameNode audit logs.
      ◦ Job history leveraged for visualizing usage/growth and historical views.
      ◦ Custom events for HBase statistics.
  • 8. Tracking Hadoop performance and metrics in Hunk
      Use Case | Customer | Benefits
      System metrics from 35k nodes | Grid Ops / Grid Customers | Identify slow tasks/nodes when debugging
      Historical insights of resources | All Grid Customers | Track organic growth
      Job performance | All Grid Customers | Improved job SLAs
      HBase metrics | All Grid Customers | Track region/RS/table metrics…
      Job logs in near real-time | All Grid Customers / Ops | Search for errors directly from the YARN logs
      Namenode operational data | Research, Dev | Improved performance and stability
  • 9. Measuring NameNode performance pre & post upgrades
      ◦ Historical visualizations of all operations.
      ◦ Search data in Hunk from billions of NameNode events.
      ◦ Measure JVM and memory usage.
      ◦ Insights into operational performance.
  • 10. Visualization using Hunk
      Search: index="simon_blue_new_all" this_cluster="dilithiumblue*" (log_subtype="DFS" #hdfs=hdfs) | timechart span=1h avg(number*) as num_*
      Time range: Last 7 days; 10,086 events (5/15/14 1:00:00.000 AM to 5/22/14 1:36:34.000 AM)
      [Chart: one series per table column, plotted over _time from Fri May 16 2014 to Tue May 20; y-axis ticks at 200,000,000, 400,000,000 and 600,000,000]
      _time | num_BlockReports | num_CopyBlockOperations | num_HeartBeats | num_ReadBlockOperations | num_ReadMetadataOperations | num_ReplaceBlockOperations | num_WriteBlockOperations | num_blockChecksumOp
      2014-05-15 01:00 | 1124437.735902 | 46721126.819672 | 514957.384098 | 12930433.077869 | 0.000000 | 94210832.786885 | 63512425.967213 | 13975.306557
      2014-05-15 02:00 | 1115496.290492 | 53597000.262295 | 298717.637049 | 10402176.717213 | 0.000000 | 94109944.655738 | 93916552.393443 | 35459.288689
      2014-05-15 03:00 | 1110372.4173… | 56566721.704918 | 428494.9449… | 13296385.590164 | 0.000000 | 94141430.295082 | 97353478.229508 | 20307.549344
  • 11. Sample troubleshooting in Hunk of 750 million events
      Search (start cut off in the source): …n=5m avg(number*) as num_*
      Time range: Last 2 days; 2,753 events (5/20/14 1:14:21.000 AM to 5/22/14 1:14:21.000 AM)
      [Chart: one series per table column, plotted over _time from Tue May 20 2014 to Wed May 21; y-axis ticks at 250,000,000, 500,000,000, 750,000,000 and 1,000,000,000]
      _time | num_BlockReports | num_CopyBlockOperations | num_HeartBeats | num_ReadBlockOperations | num_ReadMetadataOperations | num_ReplaceBlockOperations | num_WriteBlockOperations | num_blockChecksumOp
      2014-05-20 01:15:00 | 1056047.024000 | 34677652.000000 | 124121.264000 | 26242490.800000 | 0.000000 | 88112292.800000 | 126478486.400000 | 51405.346000
      2014-05-20 01:20:00 | 105551… | 30920700.… | 10653… | 22756041.8… | 0.000000 | 87745422.40… | 92323387.2… | 32070.48…
  • 12. Big picture plus granular details
      Search: index="simon_blue_new_all" this_cluster="dilithiumblue*" (log_subtype="JVM" ProcessName="NameNode") | timechart span=5m avg(Threads*) as threads_*
      Time range: Last 2 days; 8,463 events (5/20/14 12:00:00.000 AM to 5/22/14 12:00:00.000 AM)
      [Chart: one series per table column, plotted over _time from Tue May 20 2014 to Wed May 21; y-axis ticks at 200 and 400]
      _time | threads_Blocked | threads_New | threads_Runnable | threads_Terminated | threads_TimedWaiting | threads_Waiting
      2014-05-20 00:00:00 | 72.360000 | 10.638333 | 5.485833 | 0.000000 | 21.208333 | 78.555000
      2014-05-20 00:05:00 | 70.177333 | 10.554667 | 5.277333 | 0.000000 | 20.744667 | 76.578000
      2014-05-20 00:10:00 | 70.211333 | 9.998667 | 5.022000 | 0.000000 | 19.333333 | 73.766667
      2014-05-20 00:15:00 | 70.300667 | 10.268000 | 5.156667 | 0.000000 | 17.488667 | 70.122000
      2014-05-20 00:20:00 | 70.422667 | 10.376000 | 5.188000 | 0.000000 | 15.700000 | 66.611333
      2014-05-20 00:25:00 | 70.444000 | 10.288000 | 5.144000 | 0.000000 | 14.089333 | 63.400667
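The searches on slides 10 to 12 all end in a `timechart span=… avg(…)` stage, which groups events into fixed-width time buckets and averages a numeric field within each bucket. The following is a minimal Python sketch of that bucketing idea only, not of how Hunk or Splunk implements it. The event dictionaries, the `ThreadsBlocked` field name and the sample values are assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def timechart_avg(events, span, field):
    """Average `field` per time bucket of width `span` (a timedelta).

    `events` is an iterable of dicts with a `_time` datetime key.
    This mirrors the idea of `timechart span=<span> avg(<field>)`,
    nothing more.
    """
    epoch = datetime(1970, 1, 1)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ev in events:
        if field not in ev:
            continue  # events without the field do not contribute
        # Floor the timestamp to the start of its bucket.
        bucket = epoch + ((ev["_time"] - epoch) // span) * span
        sums[bucket] += ev[field]
        counts[bucket] += 1
    return {bucket: sums[bucket] / counts[bucket] for bucket in sorted(sums)}


# Hypothetical JVM samples (values invented for the example).
events = [
    {"_time": datetime(2014, 5, 20, 0, 1), "ThreadsBlocked": 70.0},
    {"_time": datetime(2014, 5, 20, 0, 4), "ThreadsBlocked": 74.0},
    {"_time": datetime(2014, 5, 20, 0, 7), "ThreadsBlocked": 71.0},
]
result = timechart_avg(events, timedelta(minutes=5), "ThreadsBlocked")
for bucket, avg in result.items():
    print(bucket, avg)
```

Changing `span` from five minutes to one hour gives the coarser view used on slide 10, at the cost of smoothing out short spikes.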
  • 13. Analyzing NameNode RPC calls (troubleshooting)
      ◦ Who is making which RPC calls (open, listStatus, create, etc.)?
      ◦ How often are they making these RPC calls?
      ◦ Which IP/host are the calls coming from?
      ◦ Search and visualize historical data from billions of events.
      ◦ Prevent NameNode abuse/misuse.
  • 14. Visualizing 834 million discrete events …
  • 15. … continued
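The questions on slide 13 (who makes which RPC call, how often, and from which host) reduce to counting audit events by user, source IP and command. The sketch below shows that counting in plain Python. It assumes audit lines carry `ugi=`, `ip=` and `cmd=` key-value fields, as stock HDFS audit logging does; the sample lines, users and addresses are invented, and the slides do not show the actual search Yahoo used.

```python
import re
from collections import Counter

# Pull out only the three fields needed for the who/what/where question.
FIELD_RE = re.compile(r"\b(ugi|ip|cmd)=(\S+)")


def count_rpc_calls(lines):
    """Count audit events per (user, client IP, command)."""
    counts = Counter()
    for line in lines:
        fields = dict(FIELD_RE.findall(line))
        if {"ugi", "ip", "cmd"} <= fields.keys():
            key = (fields["ugi"], fields["ip"].lstrip("/"), fields["cmd"])
            counts[key] += 1
    return counts


# Hypothetical audit lines (tab-separated key=value pairs).
sample = [
    "2014-05-20 01:15:00,123 INFO FSNamesystem.audit: allowed=true\tugi=gmon (auth:SIMPLE)\tip=/10.0.0.1\tcmd=open\tsrc=/data/a\tdst=null\tperm=null",
    "2014-05-20 01:15:01,456 INFO FSNamesystem.audit: allowed=true\tugi=gmon (auth:SIMPLE)\tip=/10.0.0.1\tcmd=open\tsrc=/data/b\tdst=null\tperm=null",
    "2014-05-20 01:15:02,789 INFO FSNamesystem.audit: allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.2\tcmd=listStatus\tsrc=/data\tdst=null\tperm=null",
]
counts = count_rpc_calls(sample)
for (user, ip, cmd), n in counts.most_common():
    print(user, ip, cmd, n)
```

Sorting by count surfaces the heaviest callers first, which is the starting point for spotting NameNode abuse or misuse.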
  • 16. Queue insights (capacity & provisioning)
      ◦ Each Hadoop job runs in a specific queue.
      ◦ We track every aspect of the YARN framework.
      ◦ Immediate queue performance and configuration profiling via job history server.
      ◦ Historical views and trends that enable better capacity management.
      ◦ Improved queue utilization and allocation management.
  • 17. Visualizing queues
      Search: index="jobsummary_logs_all_red" cluster="dilithium*" | eval total_slot_seconds=(mapSlotSeconds + reduceSlotSeconds) | eval gb_hours=((total_slot_seconds * 0.5) / 3600) | eval gb_hours=round(gb_hours) | timechart span=6h sum(gb_hours) as gb_hours by queue
      Time range: Last 7 days; 1,175,726 events (5/20/14 8:00:00.000 PM to 5/27/14 8:26:26.000 PM)
      [Chart: gb_hours per queue, plotted over _time from Wed May 21 2014 to Mon May 26; y-axis ticks at 200,000, 400,000 and 600,000]
      _time | OTHER | apg_dailyhigh_p3 | apg_dailymedium_p5 | apg_hourlyhigh_p1 | apg_hourlylow_p4 | apg_hourlymedium_p2 | apg_p7 | curveball_large | curveball_med | slingshot | slingstone
      2014-05-20 18:00 | 4154 | 45512 | 7071 | 25643 | 12111 | 29664 | 3473 | 26547 | 14192 | 60875 | 45376
      2014-05-21 00:00 | 19341 | 92661 | 18005 | 41008 | 22944 | 88115 | 10896 | 38648 | 8693 | 48186 | 87670
      2014-05-21 06:00 | 211… | 108137 | 38398 | 35627 | 14934 | 101925 | 244… | 29269 | 14066 | 2434… | 4783…
      Screenshot source: Search | Splunk 6.1.0, http://spbl103n01.blue.ygrid.yahoo.com:9999/en-US/app/search...
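The search on slide 17 converts slot-seconds to GB-hours with `(mapSlotSeconds + reduceSlotSeconds) * 0.5 / 3600`, rounds per job, and sums by queue. The Python sketch below restates that arithmetic without the 6-hour time bucketing. The slides do not say what the 0.5 factor represents; reading it as GB of memory per slot is an assumption, and the job records are invented.

```python
from collections import defaultdict

# Factor taken from the search on the slide. Interpreting it as
# "GB per slot" is an assumption; the deck does not explain it.
GB_PER_SLOT = 0.5


def gb_hours(map_slot_seconds, reduce_slot_seconds):
    """Slot-seconds to GB-hours, as in the slide's eval expressions."""
    total_slot_seconds = map_slot_seconds + reduce_slot_seconds
    return total_slot_seconds * GB_PER_SLOT / 3600


def gb_hours_by_queue(jobs):
    """Round per job, then sum per queue (like `sum(gb_hours) by queue`)."""
    totals = defaultdict(int)
    for job in jobs:
        # Note: Python's round() uses round-half-to-even, which may differ
        # from the search language's rounding on exact .5 values.
        totals[job["queue"]] += round(
            gb_hours(job["mapSlotSeconds"], job["reduceSlotSeconds"])
        )
    return dict(totals)


# Hypothetical job summary records.
jobs = [
    {"queue": "slingshot", "mapSlotSeconds": 7200, "reduceSlotSeconds": 0},
    {"queue": "slingshot", "mapSlotSeconds": 36000, "reduceSlotSeconds": 36000},
    {"queue": "apg_p7", "mapSlotSeconds": 14400, "reduceSlotSeconds": 7200},
]
totals = gb_hours_by_queue(jobs)
print(totals)
```

As a worked check, a job with 7,200 map slot-seconds and no reduce work comes to 7200 × 0.5 / 3600 = 1 GB-hour.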
  • 18. Self-service job reports
      ◦ Each job is unique and so are the map and reduce elements.
      ◦ How to start analyzing jobs?
      ◦ Historical job performance and profiling enables in-depth performance tuning.
      ◦ Long-term historical views and trending of growth.
  • 19. Job report for a single user
      Search: index="jobsummary_logs_all_blue" cluster="*" user="gmon" | eval total_slot_seconds=(mapSlotSeconds + reduceSlotSeconds) | eval gb_hours=((total_slot_seconds * 0.5) / 3600) | eval gb_hours=round(gb_hours,2) | eval runtime=(finishTime-submitTime)/1000 | stats sum(gb_hours) as gb-hours avg(runtime) as run_mins by cluster user queue jobName jobId status | eval run_mins=round(run_mins/60,2) | sort -gb-hours
      Time range: Yesterday; 4,871 events (5/26/14 12:00:00.000 AM to 5/27/14 12:00:00.000 AM); Statistics (4,871)
      cluster | user | queue | jobName | jobId | status | gb-hours | run_mins
      cobalt | gmon | grideng | PigLatin:findRemoteHDFSFromAudits.pig | job_1398982765383_315271 | SUCCEEDED | 108.00 | 33.07
      cobalt | gmon | grideng | PigLatin:findRemoteHDFSFromAudits.pig | job_1398982765383_312700 | SUCCEEDED | 104.00 | 37.37
      cobalt | gmon | grideng | PigLatin:findRemoteHDFSFromAudits.pig | job_1398982765383_309715 | SUCCEEDED | 88.00 | 29.83
      cobalt | gmon | gridops | distcp: | job_1398982765383_309921 | SUCCEEDED | 36.00 | 68.49
      cobalt | gmon | gridops | SPLK_spbl103n01.blue.ygrid.yahoo.com_1401125953.2076_0 | job_1398982765383_313570 | SUCCEEDED | 25.00 | 14.26
      cobalt | gmon | gridops | nnaudit_DR_2014_05_25 | job_1398982765383_308938 | SUCCEEDED | 25.00 | 15.43
      cob… | g… | grid… | nnaudit_DB_2014_05_25 | job_1398982765… | SUCCE… | 24.00 | 18.07
  • 23. More data to tap into with the metastore / Hive sources ... it’s not just logs we’re looking at
      ◦ Using the metastore we can set up virtual indexes to any table(s) in Hive, without the need to define the schema up-front
      ◦ Visualize very complex tables (250+ fields)
      ◦ Rapid prototyping for new jobs with almost instant results for searches, without having to wait for the entire job/query to finish
      ◦ Built-in aggregates and graphs/charts
      ◦ Accelerates development workflow by providing faster interaction with data
  • 27. Fast Deployment and Configuration
      Just point at Hadoop
      ◦ Certified integrations to all major Hadoop distributions
      ◦ Choose 1st-gen MapReduce or YARN
      ◦ Create Virtual Indexes across one or more clusters
      ◦ From download to searching data in < 60 minutes
      [Diagram labels: Connect to one or multiple Hadoop clusters; YARN certified]
  • 28. Interactive Search and Results Preview
      Rapidly interact with data
      ◦ Powerful Search Processing Language (SPL™)
      ◦ Ad hoc exploratory analytics across massive datasets
      ◦ Preview results
      ◦ No fixed schema
      ◦ No requirement to “understand” data upfront
      [Screenshot labels: Search interface; Preview results; Drill down to raw data; Pause or stop MapReduce jobs]
  • 29. Powerful Dashboards for Self-Service Analytics
      Interactive Dashboards and Charts
      ◦ Easy-to-use dashboard editor
      ◦ Chart overlay
      ◦ Pan and zoom
      ◦ In-dashboard drill down
      ◦ Embed charts and dashboards in 3rd party apps
      ◦ Reuse skills with Splunk Enterprise 6.1 and Hunk 6.1
  • 31. Role-based Security for Shared Clusters
      Pass-through Authentication
      ◦ Provide role-based security for Hadoop clusters
      ◦ Access Hadoop resources under security and compliance
      ◦ Integrates with Kerberos for Hadoop security
      [Diagram labels: Business Analyst, Queue: Biz Analytics; Marketing Analyst, Queue: Marketing; Sys Admin, Queue: Prod]
  • 34. Questions/Comments?
      Sagi Zelnick, Principal Architect. Email: zelnicks@yahoo-inc.com
      Ledion Bitincka, Principal Architect. Email: lbitincka@splunk.com