Complex Event Processing on Ruby, Fluentd and Norikra #rubykaigi
Presentation Transcript

  • Complex Event Processing on Ruby, Fluentd and Norikra
    RubyKaigi 2013 (2013/06/01)
    TAGOMORI Satoshi (@tagomoris)
    Saturday, June 1, 2013
  • TAGOMORI Satoshi (@tagomoris)
    LINE corp.
    Ruby, Perl, Node.js, Hadoop, ...
    Please call me MORIS!
  • 2013/04- LINE Corporation (+NHN Japan)
    2012/01- NHN Japan
    -2011/12 livedoor (+NHN Japan +Naver Japan)
  • My mission: logging
    Store access logs / application logs
    Calculate & visualize service activities
    Build a data warehouse for application engineers and operations
    Notify on anomalous service statuses
        for system status (HTTP status, response time, ...)
        for application metrics
  • Our log traffic
    Daily
        1.5+ TB (uncompressed)
        5.6+ billion lines / day
    Peak time
        140,000+ lines / sec
        300 Mbps
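As a rough cross-check of these traffic figures, the daily totals can be turned into an average line size and an estimated peak bandwidth. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10**12 bytes) and that peak-time lines are about as long as the daily average, neither of which is stated on the slide.

```ruby
# Figures from the slide. Decimal units (1 TB = 10**12 bytes) are an assumption.
DAILY_BYTES = 1.5e12
DAILY_LINES = 5.6e9
PEAK_LINES_PER_SEC = 140_000
SECONDS_PER_DAY = 24 * 60 * 60

# Average size of one log line and average line rate over a whole day.
AVG_BYTES_PER_LINE = DAILY_BYTES / DAILY_LINES
AVG_LINES_PER_SEC = DAILY_LINES / SECONDS_PER_DAY

# Assumes peak-time lines are as long as the daily average (not stated in the talk).
PEAK_MBPS = PEAK_LINES_PER_SEC * AVG_BYTES_PER_LINE * 8 / 1e6

puts format("average line size: %.0f bytes", AVG_BYTES_PER_LINE)
puts format("average rate:      %.0f lines/sec", AVG_LINES_PER_SEC)
puts format("estimated peak:    %.0f Mbps", PEAK_MBPS)
```

Under those assumptions the estimate lands at about 300 Mbps, which agrees with the peak figure on the slide, and the peak line rate is a little over twice the daily average.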
  • What we want to do
    COUNT PV, UU and others (daily/realtime)
    COUNT service metrics (daily/hourly)
    FIND surprising errors [4xx, 5xx] (immediately)
    CHECK response times (immediately)
    SEARCH logs when in trouble (hourly/immediately)
    VISUALIZE/NOTIFY app status (realtime)
  • Batches and Streams
    Hadoop is for batches
        High-performance batch is important
        HDFS has good performance
    Stream log writing and calculations are also VERY VERY IMPORTANT
    Hybrid system: stream processing + batch
  • System Overview
    [Diagram. Components: Web Servers, Fluentd Cluster, Archive Storage (scribed), Fluentd Watchers, Graph Tools, Notifications (IRC), Hadoop Cluster (HDFS, YARN), webhdfs, Huahin Manager, hiveserver, Shib, ShibUI, Norikra. Path labels: STREAM, BATCH, SCHEDULED BATCH.]
  • Stream processing
    Parsing logs
    Appending flags for analysis
    Counting rate/bytes
    Calculating system metrics
    Calculating application metrics
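The first three steps of this list (parse, append flags, count bytes) can be pictured with a few lines of plain Ruby. This is a minimal sketch, not the talk's actual pipeline: the space-separated log layout, the field names, the `is_error` / `is_slow` flags and the one-second threshold are all hypothetical.

```ruby
# Hypothetical log layout for this sketch: "path status bytes duration_in_microseconds"
def parse_line(line)
  path, status, bytes, duration = line.split(" ")
  { "path" => path, "status" => status, "bytes" => bytes.to_i, "duration" => duration.to_i }
end

# Append flags so that later stages can filter and count without re-deriving them.
# The flag names and the 1-second threshold are assumptions made for illustration.
def append_flags(record)
  record.merge(
    "is_error" => record["status"].match?(/\A[45]\d\d\z/),
    "is_slow"  => record["duration"] >= 1_000_000
  )
end

# Sum of response bytes over a batch of parsed records.
def total_bytes(records)
  records.sum { |r| r["bytes"] }
end

lines = ["/index 200 5120 80000", "/api/items 503 312 2500000"]
records = lines.map { |l| append_flags(parse_line(l)) }
records.each { |r| p r }
puts "total bytes: #{total_bytes(records)}"
```

Doing this work once in the stream keeps the downstream counters simple: they only need to look at fields that are already present on each record.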
  • Fluentd
    "Fluentd" is a lightweight and flexible log collector. Fluentd receives logs as JSON streams, buffers them, and sends them to other systems like Amazon S3, MongoDB, Hadoop, or other Fluentds.
    http://fluentd.org
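The receive, buffer, send cycle described in this quote can be sketched as a toy class. This is not Fluentd's implementation or API: `ToyBufferedOutput`, its `chunk_limit` parameter and the sink block are hypothetical names used only to show the idea of collecting JSON records and handing them on in chunks.

```ruby
require "json"

# Toy illustration of "receive, buffer, send". Not Fluentd's real buffering code.
class ToyBufferedOutput
  # chunk_limit: number of records to collect before sending (hypothetical parameter).
  # sink: block that receives one chunk (an Array of JSON strings).
  def initialize(chunk_limit, &sink)
    @chunk_limit = chunk_limit
    @sink = sink
    @buffer = []
  end

  # Accept one event as (tag, time, record) and buffer it as a JSON string.
  def emit(tag, time, record)
    @buffer << JSON.generate({ "tag" => tag, "time" => time, "record" => record })
    flush if @buffer.size >= @chunk_limit
  end

  # Send whatever is buffered to the sink and start a new chunk.
  def flush
    return if @buffer.empty?
    @sink.call(@buffer)
    @buffer = []
  end
end

out = ToyBufferedOutput.new(2) { |chunk| puts "sending #{chunk.size} records" }
out.emit("access.web", 1_370_044_800, { "status" => "200" })
out.emit("access.web", 1_370_044_801, { "status" => "404" })
out.emit("access.web", 1_370_044_802, { "status" => "500" })
out.flush
```

Buffering lets the collector absorb short bursts and send larger chunks downstream instead of one request per log line.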
  • Fluentd on CRuby
    easy to install/setup (from [...] to install (from [...] to write (with ruby!)
    stability (no crashes in the past year)
    throughput (17500 msgs/sec)
    td-agent (rpm/deb: ruby and fluentd and some plugins)
  • Fluentd users
  • Fluentd: stream aggregation
    System metrics: status / response time
  • Fluentd: stream aggregation
    ### response time aggregation
    <match responsetime.monitor.*>
      type numeric_monitor
      tag monitor.responsetime
      aggregate tag
      unit minute
      monitor_key duration
      percentiles 50,90,95,98,99
    </match>
    ### response time counting
    <match responsetime.counter.*>
      type numeric_counter
      tag numcount.responsetime
      aggregate tag
      unit minute
      count_key duration
      pattern1 u100ms 0 100000
      pattern2 u500ms 100000 500000
      pattern3 u1s 500000 1000000
      pattern4 u3s 1000000 3000000
      pattern5 long 3000000
    </match>
    ### HTTP status counting
    <match httpstatus.counter.*>
      type datacounter
      tag_prefix datacount.httpstatus
      output_per_tag yes
      aggregate tag
      output_messages yes
      unit minute
      count_key status
      pattern1 2xx ^2\d\d
      pattern2 3xx ^3\d\d
      pattern3 429 ^429
      pattern4 4xx ^4\d\d
      pattern5 5xx ^5\d\d
    </match>
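What these three blocks compute for each one-minute unit can be approximated in plain Ruby. The sketch below is an illustration, not the plugins' code: treating each numeric range as [low, high), counting only the first matching status pattern, and using the nearest-rank percentile method are all assumptions of this example.

```ruby
# Bucket boundaries copied from the numeric_counter patterns (microseconds).
# Treating each range as [low, high) is an assumption of this sketch.
DURATION_BUCKETS = {
  "u100ms" => [0, 100_000],
  "u500ms" => [100_000, 500_000],
  "u1s"    => [500_000, 1_000_000],
  "u3s"    => [1_000_000, 3_000_000],
  "long"   => [3_000_000, Float::INFINITY]
}.freeze

# Status patterns copied from the datacounter block; order matters (429 before 4xx).
STATUS_PATTERNS = {
  "2xx" => /^2\d\d/, "3xx" => /^3\d\d/, "429" => /^429/,
  "4xx" => /^4\d\d/, "5xx" => /^5\d\d/
}.freeze

# Count how many durations fall into each bucket.
def count_durations(durations)
  counts = DURATION_BUCKETS.keys.map { |name| [name, 0] }.to_h
  durations.each do |d|
    name, _range = DURATION_BUCKETS.find { |_, (low, high)| d >= low && d < high }
    counts[name] += 1 if name
  end
  counts
end

# Count statuses by the first pattern that matches (assumed behaviour).
def count_statuses(statuses)
  counts = Hash.new(0)
  statuses.each do |s|
    name, _re = STATUS_PATTERNS.find { |_, re| re.match?(s.to_s) }
    counts[name || "unmatched"] += 1
  end
  counts
end

# Nearest-rank percentile; the plugin's exact method may differ.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank - 1, 0].max]
end

durations = [50_000, 100_000, 750_000, 5_000_000]
p count_durations(durations)
p count_statuses(%w[200 429 404 503])
p percentile(durations, 50)
```

In the real setup the same counts are emitted once per minute under the configured tags, so graphing and notification tools only see small summary records instead of the raw log stream.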
  • break
  • And more: stream query
    Custom plugin: not casual enough
    xQL: declarative language
    stream processing
    for optional data fields
        no more schema management
    connectivity with Fluentd
  • Stream query: vs stored data query
    No more query wait time
    Immediate results for each time batch
    No more storage
    No more query execution management
    Once a query is registered, it runs forever
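The contrast with stored-data queries can be illustrated with a toy time-batch counter: a callback is registered once, and a result is pushed out each time a batch closes, with no storage and no re-execution. This is a simplified sketch and not how Esper or Norikra are implemented. `TimeBatchCounter` is a hypothetical name, events are assumed to arrive in time order, and a batch is closed only when a later event arrives (a real engine would use a clock).

```ruby
# Toy tumbling time-batch counter; a sketch of the idea only.
class TimeBatchCounter
  # batch_seconds: length of each batch; on_batch: block called with (batch_start, count).
  def initialize(batch_seconds, &on_batch)
    @batch_seconds = batch_seconds
    @on_batch = on_batch
    @current_start = nil
    @count = 0
  end

  # Feed one event. `time` is an Integer Unix timestamp; events are assumed ordered.
  def feed(time)
    start = time - (time % @batch_seconds)
    if @current_start && start != @current_start
      # The previous batch is complete: emit its result immediately.
      @on_batch.call(@current_start, @count)
      @count = 0
    end
    @current_start = start
    @count += 1
  end
end

counter = TimeBatchCounter.new(60) { |start, n| puts "batch starting at #{start}: #{n} events" }
[0, 10, 59, 60, 61, 125].each { |t| counter.feed(t) }
```

Only the counter for the current batch is kept in memory, which is why no query wait time and no separate storage are needed for this kind of result.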
  • Norikra
  • Norikra
    Full features of Esper on JRuby
    Simple RPC: msgpack-rpc-over-http
    Simple RPC server: mizuno (jetty + rack)
    Simple client library: norikra-client
        The same code for CRuby/JRuby
  • Norikra
    [Diagram. Norikra Server (on JVM): Esper Instance (Query Engine), Type Definition Manager, Output Event Pool, Norikra Engine, RPC Server mizuno (Jetty + Rack), Rack RPC Handler. Clients: Norikra Client (JRuby), Norikra Client (CRuby), connected via msgpack-rpc-over-http.]
  • Esper
    "Esper and Event Processing Language (EPL) provide a highly scalable, memory-efficient, in-memory computing, SQL-standard, minimal latency, real-time streaming Big Data processing engine for medium to high-velocity and high-variety data."
  • Norikra Query: target "sales"
    goods_id:5 price:49.8 num:1 shop:"LINE"
    goods_id:2 price:12.5 num:3 shop:"Cookpad"
    goods_id:4 price:36.6 num:10 shop:"Cookpad"
        SELECT shop, sum(price*num) AS amount
        FROM [...] minutes)
        GROUP BY shop
    goods_id:5 price:49.8 num:1 shop:"LINE"
    goods_id:2 price:12.5 num:3 shop:"Cookpad" affiliate:"BiS"
        SELECT affiliate, count(*) AS cnt
        FROM [...] hour)
        GROUP BY affiliate
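To make the first query concrete, the sketch below computes the same grouping in plain Ruby over the three sample events, as if they all fell into one time window. It only illustrates what `sum(price*num) ... GROUP BY shop` returns; it does not use Norikra or Esper, and `amount_by_shop` is a hypothetical helper name.

```ruby
# The three sample events from the slide, as Ruby hashes.
SALES_EVENTS = [
  { "goods_id" => 5, "price" => 49.8, "num" => 1,  "shop" => "LINE" },
  { "goods_id" => 2, "price" => 12.5, "num" => 3,  "shop" => "Cookpad" },
  { "goods_id" => 4, "price" => 36.6, "num" => 10, "shop" => "Cookpad" }
].freeze

# Plain-Ruby equivalent of: SELECT shop, sum(price*num) AS amount ... GROUP BY shop
# applied to the events of a single window.
def amount_by_shop(events)
  events.group_by { |e| e["shop"] }.map do |shop, rows|
    { "shop" => shop, "amount" => rows.sum { |e| e["price"] * e["num"] } }
  end
end

amount_by_shop(SALES_EVENTS).each do |row|
  puts "#{row["shop"]}: #{row["amount"].round(2)}"
end
```

For these events the amounts are 49.8 for LINE and 403.5 for Cookpad (12.5 x 3 + 36.6 x 10). With a stream query, a result like this is produced for every window without the events being stored first.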
  • Norikra query: vs Fluentd custom plugin
    SQL!!!
    No more restarts for new queries
        register queries whenever we want
    No more private plugins
    No more fat Fluentd configurations
  • fluent-plugin-norikra
    Fluentd plugin to use Norikra
    Norikra server autostart
    Automatically defined targets (ex: table)
    Pre-defined queries for each target
  • fluent-plugin-norikra
    installation
        `gem install fluent-plugin-norikra`
    configuration
        see DEMO
  • Demo: bootstrap
    rbenv shell jruby-1.7.4
    gem install norikra
    which norikra
    rbenv shell 2.0.0-pxxx
    gem install fluent-plugin-norikra
    vi demo.conf
    fluentd -c demo.conf
  • Demo: query streams
    some messages over fluent-cat
    register queries with norikra-client
    more messages over fluent-cat & norikra-client
  • Roadmap of Norikra
  • Roadmap of Norikra
    Norikra is still UNDER DEVELOPMENT
    Norikra feature updates (JOINs, etc.)
    Web GUI
        query & target list management
        save & restore
    Distributed & orchestrated nodes
  • Ruby without Rails
  • Unbelievable to stop GC!!!!!!!!!!
  • JRuby
        great partner for Java & Rubyists
        and for JVM middleware, like Hadoop
        Norikra uses Esper's internal API to parse queries
        gems across platforms?
    CRuby
        long-running daemons on CRuby
        memory usage is a big problem
  • See also:
    "Fluentd: The ruby based middleware across the world"
    "Log analysis system with Hadoop in livedoor 2013 Winter"