Transcript

  • 1. CRUBY+JRUBY / FLUENTD / CEP / NORIKRA / MSGPACK-RPC-OVER-HTTP / LOGGING / STREAM PROCESSING / xQL / ESPER
      Saturday, June 1, 2013
  • 2. Complex Event Processing on Ruby, Fluentd and Norikra
      RubyKaigi 2013 (2013/06/01)
      TAGOMORI Satoshi (@tagomoris)
  • 3. TAGOMORI Satoshi (@tagomoris)
      LINE corp.
      Ruby, Perl, Node.js, Hadoop, ...
  • 4. TAGOMORI Satoshi (@tagomoris)
      LINE corp.
      Ruby, Perl, Node.js, Hadoop, ...
      Please call me MORIS!
  • 6. 2013/04- LINE Corporation (+NHN Japan)
      2012/01- NHN Japan
      -2011/12 livedoor (+NHN Japan +Naver Japan)
  • 9. My mission: logging
      Store access logs / application logs
      Calculate & visualize service activities
      Build data warehouse for application engineers operations
      Notify anomaly service statuses
        for system status (HTTP status, response time, ...)
        for application metrics
  • 10. Our log traffic
      Daily
        1.5+ TB (uncompressed)
        5.6+ billion lines/day
      Peak time
        140,000+ lines/sec
        300 Mbps
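The daily and peak figures on this slide can be cross-checked against each other. The sketch below is only back-of-the-envelope arithmetic; it assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and that the average line size at peak matches the daily average, neither of which the slide states.

```ruby
# Back-of-the-envelope check of the traffic figures on the slide.
# Assumptions (not from the slide): decimal units, and the same
# average line size at peak time as over the whole day.
daily_bytes        = 1.5e12   # 1.5 TB per day, uncompressed
daily_lines        = 5.6e9    # 5.6 billion lines per day
peak_lines_per_sec = 140_000

avg_line_bytes    = daily_bytes / daily_lines       # average size of one log line
avg_lines_per_sec = daily_lines / 86_400            # 86,400 seconds in a day
peak_mbps         = peak_lines_per_sec * avg_line_bytes * 8 / 1e6

puts format("average line size: %.0f bytes", avg_line_bytes)
puts format("average rate:      %.0f lines/sec", avg_lines_per_sec)
puts format("peak bandwidth:    %.0f Mbps", peak_mbps)
```

Under these assumptions the average line is roughly 268 bytes, the average rate is roughly 65,000 lines/sec (so the peak is a little over twice the average), and the peak line rate corresponds to about 300 Mbps, which agrees with the slide.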
  • 11. What we want to do
      COUNT PV, UU and others (daily/realtime)
      COUNT Service metrics (daily/hourly)
      FIND Surprising Errors [4xx, 5xx] (immediately)
      CHECK Response Times (immediately)
      SEARCH Logs in troubles (hourly/immediately)
      VISUALIZE/NOTIFY App Status (realtime)
  • 12. BATCHES AND STREAMS
  • 13. Batches and Streams
      Hadoop is for batches
        High performance batch is important
        HDFS has good performance
      Stream log writing and calculations are also VERY VERY IMPORTANT
      Hybrid System: Stream processing + Batch
  • 14. System Overview
      Diagram labels: Web Servers, Fluentd Cluster, Archive Storage (scribed), Fluentd Watchers, Graph Tools, Notifications (IRC), Hadoop Cluster (HDFS, YARN), webhdfs, Huahin Manager, hiveserver, Shib, ShibUI, Norikra
      Flows: STREAM, BATCH, SCHEDULED BATCH
  • 15. Stream processing
      Parsing logs
      Appending flags for analysis
      Counting rate/bytes
      Calculating system metrics
      Calculating application metrics
  • 16. Fluentd
      "Fluentd" is a lightweight and flexible log collector. Fluentd receives logs as JSON streams, buffers them, and sends them to other systems like Amazon S3, MongoDB, Hadoop, or other Fluentds.
      http://fluentd.org
  • 17. Fluentd on CRuby
      easy to install/setup (from rubygems.org)
      plugins
        easy to install (from rubygems.org)
        easy to write (with ruby!)
      stability (no crashes in the past year)
      throughput (17500 msgs/sec)
      td-agent (rpm/deb: ruby and fluentd and some plugins)
  • 18. Fluentd users
  • 19. Fluentd: stream aggregation
      System metrics: status / response time
  • 20. Fluentd: stream aggregation
      ### response time aggregation
      <match responsetime.monitor.*>
        type numeric_monitor
        tag monitor.responsetime
        aggregate tag
        unit minute
        monitor_key duration
        percentiles 50,90,95,98,99
      </match>
      ### response time counting
      <match responsetime.counter.*>
        type numeric_counter
        tag numcount.responsetime
        aggregate tag
        unit minute
        count_key duration
        pattern1 u100ms 0 100000
        pattern2 u500ms 100000 500000
        pattern3 u1s 500000 1000000
        pattern4 u3s 1000000 3000000
        pattern5 long 3000000
      </match>
      ### HTTP status counting
      <match httpstatus.counter.*>
        type datacounter
        tag_prefix datacount.httpstatus
        output_per_tag yes
        aggregate tag
        output_messages yes
        unit minute
        count_key status
        pattern1 2xx ^2\d\d
        pattern2 3xx ^3\d\d
        pattern3 429 ^429
        pattern4 4xx ^4\d\d
        pattern5 5xx ^5\d\d
      </match>
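To make the counting rules in this configuration concrete, here is a minimal Ruby sketch of the same idea. It is not the plugins' code: the helper names, the sample data, the assumption that `duration` is in microseconds, the half-open ranges (low <= value < high), and the first-match-wins ordering are all illustrative assumptions.

```ruby
# Hypothetical re-implementation of the counting idea; not the plugins' code.
# Assumed: durations are microseconds, ranges are half-open (low <= v < high).
duration_buckets = [
  ["u100ms", 0,         100_000],
  ["u500ms", 100_000,   500_000],
  ["u1s",    500_000,   1_000_000],
  ["u3s",    1_000_000, 3_000_000],
  ["long",   3_000_000, Float::INFINITY]
]

# Assumed: patterns are tried in order and the first match wins,
# so the specific 429 rule has to come before the 4xx catch-all.
status_patterns = [
  ["2xx", /^2\d\d/],
  ["3xx", /^3\d\d/],
  ["429", /^429/],
  ["4xx", /^4\d\d/],
  ["5xx", /^5\d\d/]
]

# Count how many numeric values fall into each named range.
def count_durations(values, buckets)
  counts = Hash.new(0)
  values.each do |v|
    name, = buckets.find { |_, low, high| v >= low && v < high }
    counts[name || "unmatched"] += 1
  end
  counts
end

# Count how many values match each named regular expression.
def count_statuses(values, patterns)
  counts = Hash.new(0)
  values.each do |v|
    name, = patterns.find { |_, regexp| regexp.match?(v.to_s) }
    counts[name || "unmatched"] += 1
  end
  counts
end

# Made-up sample data standing in for one minute of traffic.
durations = [12_000, 99_999, 100_000, 750_000, 2_500_000, 4_000_000]
statuses  = %w[200 200 302 404 429 503]

duration_counts = count_durations(durations, duration_buckets)
status_counts   = count_statuses(statuses, status_patterns)
p duration_counts
p status_counts
```

In this sketch a 429 response is counted only under `429`, not under `4xx`, because its pattern is listed first.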
  • 21. break
  • 22. And more: stream query
      Custom plugin: not casual enough
      xQL: declarative language
      streams processing
      for optional data fields
      no more schema management
      connectivity with Fluentd
  • 23. Stream query: vs stored data query
      No more query wait time
        Immediate result for time batch
      No more storages
      No more query execution management
        Once a query is registered, it runs forever
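The "register once, get a result per time batch" idea can be sketched in a few lines of Ruby. This is a hypothetical simplification, not how Esper or Norikra implement windows: the class name is invented, and windows here are advanced by event timestamps rather than by a clock.

```ruby
# Hypothetical sketch of a tumbling "time batch" counter.
# Simplification: windows close when a later event arrives, not on a timer.
class TimeBatchCounter
  def initialize(window_seconds)
    @window_seconds = window_seconds
    @window_start = nil
    @count = 0
  end

  # Feed one event timestamp (in seconds). Returns an array of
  # [window_start, count] pairs for every window this event closed.
  def push(timestamp)
    @window_start ||= timestamp - (timestamp % @window_seconds)
    closed = []
    while timestamp >= @window_start + @window_seconds
      closed << [@window_start, @count]
      @window_start += @window_seconds
      @count = 0
    end
    @count += 1
    closed
  end
end

counter = TimeBatchCounter.new(600) # 10-minute batches
results = [0, 30, 599, 600, 1250].flat_map { |t| counter.push(t) }
p results
```

The counter is created once and keeps emitting a result each time a window closes, with no stored data to query afterwards; the events at 0, 30 and 599 seconds land in the first window, and the event at 600 seconds closes it.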
  • 24. Norikra
  • 25. Norikra
      Full feature of Esper over JRuby
      Simple RPC: msgpack-rpc-over-http
      Simple RPC Server: mizuno (jetty + rack)
      Simple Client Library: norikra-client
        Just same code for cruby/jruby
  • 26. Norikra
      Diagram labels: Norikra Server (on JVM), Esper Instance (Query Engine), Type Definition Manager, Output Event Pool, Norikra Engine, RPC Server, mizuno (Jetty + Rack), Rack RPC Handler, Norikra Client (JRUBY), Norikra Client (CRUBY), msgpack-rpc-over-http
  • 27. Esper
      "Esper and Event Processing Language (EPL) provide a highly scalable, memory-efficient, in-memory computing, SQL-standard, minimal latency, real-time streaming Big Data processing engine for medium to high-velocity and high-variety data."
      http://esper.codehaus.org/
  • 28. Norikra Query: target "sales"
      goods_id:5 price:49.8 num:1 shop:"LINE"
      goods_id:2 price:12.5 num:3 shop:"Cookpad"
      goods_id:4 price:36.6 num:10 shop:"Cookpad"
        SELECT shop, sum(price*num) AS amount
        FROM sales.win:time_batch(10 minutes)
        GROUP BY shop
      goods_id:5 price:49.8 num:1 shop:"LINE"
      goods_id:2 price:12.5 num:3 shop:"Cookpad" affiliate:"BiS"
        SELECT affiliate, count(*) AS cnt
        FROM sales.win:time_batch(1 hour)
        GROUP BY affiliate
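As a rough illustration of what the first query computes, the Ruby sketch below applies the same grouping and sum to the three sample events, assuming all three arrive within one 10-minute batch. It is plain Ruby over an in-memory array and does not use Norikra or Esper.

```ruby
# The three sample "sales" events from the slide, as Ruby hashes.
# Assumption: all three fall inside the same 10-minute batch.
sales = [
  { goods_id: 5, price: 49.8, num: 1,  shop: "LINE" },
  { goods_id: 2, price: 12.5, num: 3,  shop: "Cookpad" },
  { goods_id: 4, price: 36.6, num: 10, shop: "Cookpad" }
]

# Equivalent of: SELECT shop, sum(price*num) AS amount ... GROUP BY shop
amounts = Hash.new(0.0)
sales.each { |event| amounts[event[:shop]] += event[:price] * event[:num] }

amounts.each { |shop, amount| puts format("%s %.1f", shop, amount) }
```

For this batch the totals come to about 49.8 for LINE and 403.5 for Cookpad (12.5 × 3 + 36.6 × 10).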
  • 29. Norikra query: vs Fluentd custom plugin
      SQL!!!
      No more restart for new queries
        register queries whenever we want
      No more private plugins
      No more fat Fluentd configurations
  • 30. fluent-plugin-norikra
      Fluentd plugin to use Norikra
      Norikra server autostart
      Automatically defined target (ex: table)
      Pre-defined queries for each target
  • 31. fluent-plugin-norikra
      installation
        `gem install fluent-plugin-norikra`
      configuration
        see DEMO
  • 32. Demo: bootstrap
      rbenv shell jruby-1.7.4
      gem install norikra
      which norikra
      rbenv shell 2.0.0-pxxx
      gem install fluent-plugin-norikra
      vi demo.conf
      fluentd -c demo.conf
  • 33. Demo: query streams
      some messages over fluent-cat
      register queries with norikra-client
      more messages over fluent-cat & norikra-client
  • 34. Roadmap of Norikra
  • 35. roadmap of norikra
      Norikra is still UNDER DEVELOPMENT
      Norikra feature updates (JOINs, etc)
      Web GUI
      query & target list management
      save & restore
      Distributed & orchestrated nodes
  • 36. Ruby without Rails
  • 37. Unbelievable to stop GC!!!!!!!!!!
  • 38. CRuby
      great partner for java & rubyist
      and for jvm middleware, like Hadoop
      Norikra uses Esper's internal API to parse queries
      gems across platforms?
      JRuby
      long-running daemons on cruby
      memory usage is big problem
  • 39. SHUT THE FUCK UP AND WRITE SOME QUERY
  • 40. See also:
      http://fluentd.org/
      http://fluentd.org/plugin/
      https://github.com/tagomoris/norikra
      https://github.com/tagomoris/norikra-client
      https://github.com/tagomoris/fluent-plugin-norikra
      http://esper.codehaus.org/
      "Fluentd: The ruby based middleware across the world"
        http://www.slideshare.net/tagomoris/fluentd-in-tkrk10
      "Log analysis system with Hadoop in livedoor 2013 Winter"
        http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013