Erik Swan, Michael Wilde: Big Data for Everyman
A presentation given by Erik Swan, CTO/Co-Founder of Splunk, and Michael Wilde, Splunk Ninja, at the SXSW Interactive 2012 Conference on March 11, 2012

Published in: Technology, Business
Big Data for Everyman

  1. Erik Swan, Michael Wilde: Big Data for Everyman
  2. Hi... We work at Splunk.
  3. We stare at data all day.
  4. WTF is Big Data?!
  5. larger than small data?
  6. smaller than giant data?
  7. some cool sauce for DBAs?
  8. Aaaahhh, no.
  9. a simple way to describe a massive problem* (*or opportunity, depending on your p.o.v.)
  10. Big data comes out of machines: GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops. Volume | Velocity | Variety | Variability
  11. Big data comes out of machines. Machine-generated data is one of the fastest growing, most complex and most valuable segments of big data: GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops. Volume | Velocity | Variety | Variability
  12. no, not us. we're just nice guys who want to show you cool stuff
  13. building a service? using an app? you are a producer and consumer of data
  14. Seth Rabinowitz, CEO, and James Rodmell, CTO. Location-Based Messaging and Intelligence For Your App and Your Customers
  15. Data! Good! (slide callouts: DATE/TIME, DEVICE ID, LAT/LONG, BATTERY STRENGTH)
      2011-11-06 11:57:31,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.75496,-73.963853,60
      2011-11-06 12:17:32,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.755001,-73.963886,70
      2011-11-06 12:37:33,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754982,-73.963849,75
      2011-11-06 12:57:34,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754984,-73.963883,85
      2011-11-06 13:17:35,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754941,-73.9639,90
      2011-11-06 13:37:36,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754948,-73.963874,90
      2011-11-06 13:57:37,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754931,-73.963892,95
      2011-11-06 14:17:38,50,00027d27-ae02-627d-a79a-fa0004d3a347,40.755232,-73.963522,100
      2011-11-06 14:37:33,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754979,-73.9639,100
  16. show them something cool already!
  17. Oh, real quick. Did you check in or tweet #splunk #sxsw ...please
  18. All this data can be pretty cool and empowering
  19. except one little PROBLEM
  20. a lot of it looks like this
  21. (raw log sample)
      0,113/Apr/2011 08:52:53,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.16,192.168.1.6,(empty),(empty),1100,43025,43025_tcp,(empty),
      0,113/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1048,135,epmap,(empty),
      0,113/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1049,43025,43025_tcp,(empty),
      0,113/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1051,135,epmap,(empty),
      0,113/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1052,43025,43025_tcp,(empty),
      0,113/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.64,192.168.1.6,(empty),(empty),1694,135,epmap,(empty),
  22. and we’re expected to talk to it like this
  23. select (select max(answer.answer) from answer where answer.member_id in (select member_id from team_members where project_id in ( select project_id from project where Business_stream=Upstream and stage=Appraise and project_id in (select project_id from projectextra where subteam<>1 ) ) ) and answer.page_id=page.page_id) as thinl, (select max(avgscore) from task_project where task_project.project_id not in (select project_id from projectextra where subteam=1 ) and task_project.project_id in (select project_id from project where stage=Appraise and Business_stream = Upstream) and task_project.page_id=page.page_id) as bmax, (select max(answer) from answer where answer.page_id=page.page_id) as datamax, (select avg(avgscore) from task_project where project_id=1 and task_project.page_id=page.page_id) as projavg, (select avg(avgscore) from task_project where project_id not in (select project_id from projectextra where subteam=1) and task_project.page_id=page.page_id) as companyavg, (select avg(avgscore) from task_project where project_id not in (select project_id from projectextra where subteam=1) and project_id in (select project_id from project where Business_stream = Upstream) and task_project.page_id=page.page_id) as businessavg, page.* from page,riverorder where page.category_name=BusinessBoundaries and stage_name=Appraise and riverorder.category_name=page.category_name order by riverorder.riverorder,page.order_id select (select max(answer.answer) from answer where answer.member_id in ( select member_id from team_members where project_id in ( select project_id from project where
  24. It could be better. yes? better is good!
  25. readable, ya think?
      {[-]
        checkin : {[-]
          badges : [],
          created : 1331454784,
          geolat : "30.2640941786",
          geolong : "-97.7414819408",
          mayor : {[-]
            type : "nochange"
          },
          primarycategory : {[-]
            fullpathname : "Food:American Restaurants",
            iconurl : "https://foursquare.com/img/categories/food/default.png",
            id : "4bf58dd8d48988d14e941735",
            nodename : "American Restaurants"
          },
          timezone : "America/Chicago",
          user : {[-]
            gender : "male"
          },
          venue : {[-]
            id : "4d752b1bba682d43e7563876",
            name : "CNN Grill @ SXSW (Maxs Wine Dive)"
          }
        }
      }
  26. source=foursquare | timechart count by checkin.venue.name
      The languages to talk to data are getting better for us humans
  27. Guys.. come on! Go back to the data please.
  28. Need data? a simple way to describe a massive problem. A friend in Boulder can help
  29. The Social Media API. Jud Valeski, Co-Founder, CEO
  30. Just when you think you’re all done, wait. There is another consumer you may have forgotten
  31. Someone with a different perspective sees your service as input to theirs
  32. DEMAND REALTIME DATA IN A STREAM OVER THE WEB IN JSON FORMAT
  33. Hey audience! We still have a few minutes. What questions might you have been saving until this exact moment?
  34. Thanks. Erik Swan, CTO, Co-Founder, Splunk. Michael Wilde, Splunk Ninja. Who else sends you on your way with a cute dog photo?
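
For readers curious what the search on slide 26 is asking for, here is a rough Python sketch of the idea: group check-in events into time buckets and count them per venue name. It is not how Splunk implements `timechart`. The helper names (`get_path`, `count_by_bucket`), the one-hour bucket size, and the two extra sample events are assumptions made up for illustration; only the first event uses values shown on slide 25.

```python
from collections import Counter, defaultdict

# First event: values taken from the check-in on slide 25 (trimmed).
# The other two events are hypothetical, added only to make the counts interesting.
SAMPLE_EVENTS = [
    {"checkin": {"created": 1331454784,
                 "venue": {"name": "CNN Grill @ SXSW (Maxs Wine Dive)"}}},
    {"checkin": {"created": 1331454784 + 600,    # hypothetical, 10 minutes later
                 "venue": {"name": "CNN Grill @ SXSW (Maxs Wine Dive)"}}},
    {"checkin": {"created": 1331454784 + 7200,   # hypothetical, 2 hours later
                 "venue": {"name": "Hypothetical Coffee Shop"}}},
]


def get_path(event, dotted):
    """Follow a dotted path such as 'checkin.venue.name' through nested dicts."""
    value = event
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def count_by_bucket(events, field, time_field="checkin.created", span_seconds=3600):
    """Count events per value of `field`, grouped into fixed-width time buckets.

    Returns {bucket_start_epoch: {field_value: count}}, ordered by bucket start.
    Events missing either the timestamp or the field are skipped.
    """
    buckets = defaultdict(Counter)
    for event in events:
        ts = get_path(event, time_field)
        label = get_path(event, field)
        if ts is None or label is None:
            continue
        start = int(ts) - int(ts) % span_seconds  # floor to the bucket boundary
        buckets[start][label] += 1
    return {start: dict(counts) for start, counts in sorted(buckets.items())}


if __name__ == "__main__":
    for start, counts in count_by_bucket(SAMPLE_EVENTS, "checkin.venue.name").items():
        print(start, counts)
```

The dotted field name mirrors the `checkin.venue.name` syntax from the slide, which is the point the speakers make: the person asking the question names the field and the grouping, and does not write nested joins.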