Small Pieces Loosely Joined
#cgn13
or...
A practical example of processing real-time data with a distributed agent network
(Warning: does not contain real code)
Red Gate
12th October 2011
eMail Marketing
Mailchimp webhook
"type": "subscribe",
"fired_at": "2009-03-26 21:35:57",
"data[id]": "8a25ff1d98",
"data[list_id]": "a6b5da1054",
"data[email]": "api@mailchimp.com",
"data[email_type]": "html",
"data[merges][EMAIL]": "api@mailchimp.com",
"data[merges][FNAME]": "MailChimp",
"data[merges][LNAME]": "API",
"data[merges][INTERESTS]": "Group1,Group2",
"data[ip_opt]": "10.20.10.30",
"data[ip_signup]": "10.20.10.30"
Pump the callbacks into a message bus...
Messaging
mailchimp-pump.php


// Forward the webhook POST as JSON, using the event type as the last word of the routing key
$json = json_encode($_POST);
$msg = new AMQPMessage($json);
$channel->basic_publish($msg, 'mailchimp', "morat.campaign.mailchimp." . $_POST['type']);
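PHP expands bracketed form keys such as data[merges][FNAME] into nested arrays when it builds $_POST, which is why the valve later reads record['data']['merges']['FNAME']. A pump written in another language would have to do that expansion itself. A minimal Ruby sketch, assuming the flat keys shown on the webhook slide (expand_form_keys is a hypothetical helper name, not part of the talk):

```ruby
# Expand flat form keys such as "data[merges][FNAME]" into nested hashes.
# Illustrative helper only; the name expand_form_keys is hypothetical.
def expand_form_keys(flat)
  flat.each_with_object({}) do |(key, value), nested|
    # "data[merges][FNAME]" -> ["data", "merges", "FNAME"]
    path = key.scan(/[^\[\]]+/)
    leaf = path.pop
    # Walk (and create) the intermediate hashes, then set the leaf value.
    node = path.inject(nested) { |hash, part| hash[part] ||= {} }
    node[leaf] = value
  end
end

post = {
  'type' => 'subscribe',
  'data[email]' => 'api@mailchimp.com',
  'data[merges][FNAME]' => 'MailChimp',
  'data[merges][LNAME]' => 'API'
}

record = expand_form_keys(post)
puts record['data']['merges']['FNAME'] # prints "MailChimp"
```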
I’d like to watch the stream on IRC...
Valve

Subscribe to mailchimp exchange
morat.campaign.mailchimp.#
Translate to plain English for IRC
Inject into irc exchange with routing key morat.irc.[channel]
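The valve's subscription relies on AMQP topic matching: in a binding pattern, * stands for exactly one dot-separated word and # for zero or more. A rough Ruby sketch of that matching rule, written from scratch for illustration (topic_match? and match_words are hypothetical names, and the 'newsletter' key is made up; a real broker does this for you):

```ruby
# Return true when an AMQP-style topic binding pattern matches a routing key.
# "*" stands for exactly one dot-separated word, "#" for zero or more words.
def topic_match?(pattern, routing_key)
  match_words(pattern.split('.'), routing_key.split('.'))
end

def match_words(pattern, words)
  return words.empty? if pattern.empty?

  head, *rest = pattern
  case head
  when '#'
    # "#" may swallow zero, one, or many words: try every split point.
    (0..words.length).any? { |n| match_words(rest, words.drop(n)) }
  when '*'
    !words.empty? && match_words(rest, words.drop(1))
  else
    words.first == head && match_words(rest, words.drop(1))
  end
end

puts topic_match?('morat.campaign.mailchimp.#', 'morat.campaign.mailchimp.subscribe') # true
puts topic_match?('morat.campaign.mailchimp.#', 'morat.irc.campaign')                 # false
puts topic_match?('morat.*.mailchimp.#', 'morat.newsletter.mailchimp.unsubscribe')    # true
```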
mailchimp-irc-valve.rb

case record['type']
when 'subscribe'
    output :irc, "'#{record['data']['merges']['FNAME']} #{record['data']['merges']['LNAME']}' has joined the list"
when 'unsubscribe'
    output :irc, "'#{record['data']['merges']['FNAME']} #{record['data']['merges']['LNAME']}' has left the list"
...
Create a Sink to send the messages to IRC...
irc-sink.pl

$q = $amq->channel(1)->queue('morat.irc.' . $channel, {
    passive => 0, durable => 0, auto_delete => 1, exclusive => 0,
})->subscribe( sub {
    my ($payload, $meta) = @_;
    # The IRC channel name is the last dot-separated word of the queue name
    my ($channel) = $meta->{'queue'} =~ /\.([^.]+)$/;

    $irc->yield('privmsg', '#'.$channel, GREEN.$payload);
});
Where have we got to?

Pump: Mailchimp webhook (HTTP POST) > morat.[campaign].mailchimp.[type] (JSON)
Valve: morat.[campaign].mailchimp.[type] (JSON) > morat.irc.[campaign] (Text)
Sink: morat.irc.[campaign] (Text) > IRC server
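The three stages can be tried end to end without a broker. The sketch below wires a pump, a valve and a sink through a toy in-process bus that matches routing keys by prefix. ToyBus is a hypothetical stand-in for the real AMQP exchanges, and the IRC sink only appends to an array, so treat it as an illustration of the data flow rather than of the actual agents:

```ruby
require 'json'

# A toy in-process stand-in for the message bus (hypothetical class).
class ToyBus
  def initialize
    @subscribers = []
  end

  # Register a handler for routing keys that start with the given prefix.
  def subscribe(prefix, &handler)
    @subscribers << [prefix, handler]
  end

  def publish(routing_key, payload)
    @subscribers.each do |prefix, handler|
      handler.call(routing_key, payload) if routing_key.start_with?(prefix)
    end
  end
end

bus = ToyBus.new
irc_lines = []

# Sink: text on morat.irc.[channel] goes to the (simulated) IRC channel.
bus.subscribe('morat.irc.') do |key, text|
  irc_lines << '#' + key.split('.').last + ' ' + text
end

# Valve: JSON Mailchimp events become plain-English text for IRC.
bus.subscribe('morat.campaign.mailchimp.') do |_key, json|
  record = JSON.parse(json)
  name = record['data']['merges'].values_at('FNAME', 'LNAME').join(' ')
  verb = record['type'] == 'subscribe' ? 'joined' : 'left'
  bus.publish('morat.irc.campaign', "'#{name}' has #{verb} the list")
end

# Pump: a webhook callback is serialised to JSON and published.
event = { 'type' => 'subscribe',
          'data' => { 'merges' => { 'FNAME' => 'MailChimp', 'LNAME' => 'API' } } }
bus.publish("morat.campaign.mailchimp.#{event['type']}", JSON.generate(event))

puts irc_lines # prints "#campaign 'MailChimp API' has joined the list"
```

Because each stage only knows routing keys, any one of them can be replaced or duplicated without touching the others.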
That’s cool, but hey it would be great to see #campaign tweets as well...
twitter-search-pump.rb
TweetStream::Client.new.track(keywords.split(',')) do |status|
  keywords.split(',').each do |searchterm|
    if status.text.match(searchterm)
      searchterm.sub!(' ','')
      searchterm.sub!('#','')
      log.debug "Sending: #{status.user.screen_name} :: #{status.text} :: morat.twitter.search.#{searchterm}"
      broker.exchange.publish JSON.generate(status), :routing_key => "morat.twitter.search.#{searchterm}"
    end
  end
end
twitter-irc-valve.rb

case routing_key
when 'morat.twitter.@neildavidson.list.redgaters'
     output :irc, "RG chatter: #{record['user']['screen_name']} tweeted: #{record['text']}", :routing_key => "morat.irc.redgaters"
else
     # The search term is the last part of the routing key and names the IRC channel
     searchterm = routing_key.match(/morat\.twitter\.search\.(.+)/)[1]
     output :irc, "#{record['user']['screen_name']} tweeted: #{record['text']}", :routing_key => "morat.irc.#{searchterm}"
end
I feel the urge to graph...
Thanks @garethr
Valve
Subscribe to mailchimp exchange morat.[campaign].mailchimp.#
Translate to Graphite format: [value] [timestamp]
Inject into graphite exchange with routing key based on sample window: 10sec.[campaign].mailchimp.[action].count
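Before bringing in a CEP engine, the counting step can be pictured as a plain tumbling window: bucket each event by the window it falls in, then emit one "[value] [timestamp]" body per bucket under the routing key. A minimal sketch with made-up timestamps; window_counts is a hypothetical helper, stamping each count with the end of its window is an assumption, and windows with no events are simply skipped here:

```ruby
# Count events per fixed (tumbling) window and render "[value] [timestamp]".
# window_counts and its arguments are illustrative names, not from the talk.
def window_counts(event_times, window_seconds)
  counts = Hash.new(0)
  event_times.each do |t|
    # Integer division snaps each event to the start of its window.
    window_start = (t / window_seconds) * window_seconds
    counts[window_start] += 1
  end
  # Stamp each count with the end of its window, oldest window first.
  counts.sort.map { |start, count| "#{count} #{start + window_seconds}" }
end

# Unix timestamps (seconds) of hypothetical 'subscribe' events.
subscribes = [1_000, 1_003, 1_009, 1_010, 1_025]
routing_key = '10sec.campaign.mailchimp.subscribe.count'

window_counts(subscribes, 10).each do |body|
  puts "#{routing_key} <- #{body}"
end
# 10sec.campaign.mailchimp.subscribe.count <- 3 1010
# 10sec.campaign.mailchimp.subscribe.count <- 1 1020
# 10sec.campaign.mailchimp.subscribe.count <- 1 1030
```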
But let’s make it cool...
Complex Event Processing
mailchimp-graphite-valve.rb
%w{ subscribe unsubscribe campaign }.each do |action|
  [ '10 sec', '1 min', '5 min', '15 min' ].each do |window|
    valve.register "SELECT count(*) from MailchimpEvent(type='#{action}').win:time_batch(#{window})", (
      Listener.new(valve) do |agent, event|
        valve.output :graphite, "#{event.get('count(*)')}", :routing_key => window.delete(' ') + ".morat.#{valve.application}.mailchimp.#{action}"
      end
    )
  end
end
Why use CEP?
# find the sum of retweets of last 5 tweets which saw more than 10 retweets
SELECT sum(retweets) from TweetEvent(retweets >= 10).win:length(5)

# find max, min and average number of retweets for a sliding 60 second window of time
SELECT max(retweets), min(retweets), avg(retweets) FROM TweetEvent.win:time(60 sec)

# compute number of retweets for all tweets in 10 second batches
SELECT sum(retweets) from TweetEvent.win:time_batch(10 sec)

# number of retweets, grouped by timezone, buffered in 10 second increments
SELECT timezone, sum(retweets) from TweetEvent.win:time_batch(10 sec) group by timezone

# compute the sum of retweets in sliding 60 second window, and emit count every 30 events
SELECT sum(retweets) from TweetEvent.win:time(60 sec) output snapshot every 30 events

# every 10 seconds, report timezones which accumulated more than 10 retweets
SELECT timezone, sum(retweets) from TweetEvent.win:time_batch(10 sec) group by timezone having sum(retweets) > 10

       Courtesy @igrigorik http://www.igvita.com/2011/05/27/streamsql-event-processing-with-esper/
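To see what the engine saves you from writing, here is roughly what the first query (sum of the last 5 tweets with at least 10 retweets) looks like by hand. This is a sketch only: LengthWindow is a hypothetical class, the retweet numbers are made up, and unlike a real engine it reports the current sum on every push, including for events the filter rejects:

```ruby
# Keep the last `size` values that pass a filter and report their running sum,
# roughly what a filtered length window does. Class name is hypothetical.
class LengthWindow
  def initialize(size, &filter)
    @size = size
    @filter = filter
    @values = []
  end

  # Feed one value; returns the current sum over the window.
  def push(value)
    if @filter.call(value)
      @values << value
      # Evict the oldest value once the window is over capacity.
      @values.shift if @values.length > @size
    end
    @values.sum
  end
end

window = LengthWindow.new(5) { |retweets| retweets >= 10 }
retweet_counts = [12, 3, 40, 10, 7, 15, 11, 25]
sums = retweet_counts.map { |n| window.push(n) }
p sums # prints [12, 12, 52, 62, 62, 77, 88, 101]
```

Every other query on the slide would need its own hand-rolled state like this, which is the case for a declarative engine.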
Is there really a correlation?
Statistical Computing
Valve

Grab raw data for window from graphite via REST
Create scatter graph using R and calculate correlation
Inject correlation into graphite exchange
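The number being injected is a correlation coefficient between two series sampled over the same windows. R's cor computes Pearson's coefficient by default; the same calculation in plain Ruby looks roughly like this (pearson is a hypothetical helper and both data series are invented for illustration):

```ruby
# Pearson correlation coefficient of two equal-length numeric series.
def pearson(xs, ys)
  raise ArgumentError, 'series must have equal length' unless xs.length == ys.length

  n = xs.length.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  # Covariance and variances (the 1/n factors cancel in the ratio).
  cov = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }
  cov / Math.sqrt(var_x * var_y)
end

tweets     = [1, 2, 3, 4, 5]   # hypothetical tweets per window
subscribes = [2, 4, 6, 8, 10]  # hypothetical subscribes per window
puts pearson(tweets, subscribes).round(3) # prints 1.0
```

A value near 1 or -1 suggests the two streams move together; it says nothing about which one drives the other.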
twitter-correlation-valve.rb
require 'rsruby'
...

r.jpeg(filename)
r.assign('xs', data[1])
r.assign('ys', data[2])
fit = r.lm('ys ~ xs')
r.plot({
   'x' => data[1],
   'y' => data[2],
   'xlab' => label[1],
   'ylab' => label[2]
})
cor = r.cor(data[1],data[2]).to_s
r.title("Correlation: " + cor)
r.abline(fit['coefficients']['(Intercept)'],fit['coefficients']['xs'])
r.eval_R("dev.off()")
Let’s add some realtime visualisation...
Websockets
Valve
Subscribe to twitter exchange morat.twitter.search.[keyword]
Extract adjectives using engtagger
Inject adjectives into twitter exchange with routing key morat.twitter.search.[keyword].adjectives as: [adjective] [count]
twitter-sentiment-valve.rb
require 'engtagger'
...

log.debug "Received tweet from #{record['user']['screen_name']} on #{routing_key}"

adjectives = @parser.add_tags(record['text']).scan(EngTagger::ADJ).map do |n|
   @parser.strip_tags(n)
end

# Count each stemmed adjective, skipping blank strings
ret = Hash.new(0)
adjectives.each do |n|
  n = @parser.stem(n)
  ret[n] += 1 unless n =~ /\A\s*\z/
end
Sink

Subscribe to twitter exchange morat.twitter.search.[keyword].adjectives
Use node.js and Socket.IO to send data to web client via Websockets
Visualise with processing.js in web browser
twitter-sentiment-sink.js
io.sockets.on('connection', function (socket) {
    amqp_connection.on('ready', function () {
        var queue = amqp_connection.queue('');
        exchange = amqp_connection.exchange('twitter', { type: 'topic', passive: false, durable: true, autoDelete: true }, function (exchange) {
            queue.bind(exchange, routing_key);
            queue.subscribe(function (message) {
                socket.emit('data', { text: message.data.toString() });
            });
        });
    });
});
twitter-sentiment-sink.html
  <H1>Twitter Sentiment</H1>
  <div id="container">
    <canvas id="twitter-sentiment-sink" data-processing-sources="twitter-sentiment-sink.pde" WIDTH=800 HEIGHT=600></canvas>
  </div>
  <script src="/socket.io/socket.io.js"></script>
  <script type="text/javascript">
    var socket = io.connect('http://localhost');
    socket.on('data', function (data) {
      var pjs = Processing.getInstanceById('twitter-sentiment-sink');
      pjs.addDatum(data.text.split(' ')[0]);
    });
  </script>
@ennui2342

www.morat.co.uk
 polis.ecafe.org

Small pieces loosely joined

  • 1. Small Pieces Loosely Joined #cgn13
  • 2. or... A practical example of processing real-time data with a distributed agent network (Warning: does not contain real code)
  • 6. Mailchimp webhook
        "type": "subscribe",
        "fired_at": "2009-03-26 21:35:57",
        "data[id]": "8a25ff1d98",
        "data[list_id]": "a6b5da1054",
        "data[email]": "api@mailchimp.com",
        "data[email_type]": "html",
        "data[merges][EMAIL]": "api@mailchimp.com",
        "data[merges][FNAME]": "MailChimp",
        "data[merges][LNAME]": "API",
        "data[merges][INTERESTS]": "Group1,Group2",
        "data[ip_opt]": "10.20.10.30",
        "data[ip_signup]": "10.20.10.30"
  • 7. Pump the callbacks into a message bus...
  • 9. mailchimp-pump.php
        $json = json_encode($_POST);
        $msg = new AMQPMessage($json);
        $channel->basic_publish($msg, 'mailchimp', "morat.campaign.mailchimp." . $_POST['type']);
  • 10. I’d like to watch the stream on IRC...
  • 11. Valve
        Subscribe to mailchimp exchange
        morat.campaign.mailchimp.#
        Translate to plain English for IRC
        Inject into irc exchange with routing key morat.irc.[channel]
  • 12. mailchimp-irc-valve.rb
        case record['type']
        when 'subscribe'
          output :irc, "'#{record['data']['merges']['FNAME']} #{record['data']['merges']['LNAME']}' has joined the list"
        when 'unsubscribe'
          output :irc, "'#{record['data']['merges']['FNAME']} #{record['data']['merges']['LNAME']}' has left the list"
        ...
  • 13. Create a Sink to send the messages to IRC...
  • 14. irc-sink.pl
        $q = $amq->channel(1)->queue('morat.irc.' . $channel, {
            passive => 0, durable => 0, auto_delete => 1, exclusive => 0,
        })->subscribe( sub {
            my ($payload, $meta) = @_;
            # The IRC channel is the last dot-separated part of the queue name
            my ($channel) = $meta->{'queue'} =~ /\.([^.]+)$/;

            $irc->yield('privmsg', '#' . $channel, GREEN . $payload);
        });
  • 15. Where have we got to?
        Pump: Mailchimp webhook (HTTP POST) > morat.[campaign].mailchimp.[type] (JSON)
        Valve: morat.campaign.mailchimp.[type] (JSON) > morat.irc.[campaign] (Text)
        Sink: morat.irc.[campaign] (Text) > IRC server
  • 16. That’s cool, but hey, it would be great to see #campaign tweets as well...
  • 17. twitter-search-pump.rb
        TweetStream::Client.new.track(keywords.split(',')) do |status|
          keywords.split(',').each do |searchterm|
            if status.text.match(searchterm)
              searchterm.sub!(' ','')
              searchterm.sub!('#','')
              log.debug "Sending: #{status.user.screen_name} :: #{status.text} :: morat.twitter.search.#{searchterm}"
              broker.exchange.publish JSON.generate(status), :routing_key => "morat.twitter.search.#{searchterm}"
            end
          end
        end
  • 18. twitter-irc-valve.rb
        case routing_key
        when 'morat.twitter.@neildavidson.list.redgaters'
          output :irc, "RG chatter: #{record['user']['screen_name']} tweeted: #{record['text']}", :routing_key => "morat.irc.redgaters"
        else
          # Everything after "morat.twitter.search." is the search term
          searchterm = routing_key.match(/morat\.twitter\.search\.(.+)/)[1]
          output :irc, "#{record['user']['screen_name']} tweeted: #{record['text']}", :routing_key => "morat.irc.#{searchterm}"
        end
  • 19.
  • 20. I feel the urge to graph...
  • 22. Valve
        Subscribe to mailchimp exchange
        morat.[campaign].mailchimp.#
        Translate to Graphite format: [value] [timestamp]
        Inject into graphite exchange with routing key based on sample window:
        10sec.[campaign].mailchimp.[action].count
  • 23. But let’s make it cool...
  • 25. mailchimp-graphite-valve.rb
        %w{ subscribe unsubscribe campaign }.each do |action|
          [ '10 sec', '1 min', '5 min', '15 min' ].each do |window|
            valve.register "SELECT count(*) from MailchimpEvent(type='#{action}').win:time_batch(#{window})", (
              Listener.new(valve) do |agent, event|
                valve.output :graphite, "#{event.get('count(*)')}",
                  :routing_key => window.delete(' ') + ".morat.#{valve.application}.mailchimp.#{action}"
              end
            )
          end
        end
  • 26. Why use CEP?
        # find the sum of retweets of last 5 tweets which saw more than 10 retweets
        SELECT sum(retweets) from TweetEvent(retweets >= 10).win:length(5)
        # find max, min and average number of retweets for a sliding 60 second window of time
        SELECT max(retweets), min(retweets), avg(retweets) FROM TweetEvent.win:time(60 sec)
        # compute number of retweets for all tweets in 10 second batches
        SELECT sum(retweets) from TweetEvent.win:time_batch(10 sec)
        # number of retweets, grouped by timezone, buffered in 10 second increments
        SELECT timezone, sum(retweets) from TweetEvent.win:time_batch(10 sec) group by timezone
        # compute the sum of retweets in sliding 60 second window, and emit count every 30 events
        SELECT sum(retweets) from TweetEvent.win:time(60 sec) output snapshot every 30 events
        # every 10 seconds, report timezones which accumulated more than 10 retweets
        SELECT timezone, sum(retweets) from TweetEvent.win:time_batch(10 sec) group by timezone having sum(retweets) > 10
        Courtesy @igrigorik http://www.igvita.com/2011/05/27/streamsql-event-processing-with-esper/
  • 27.
  • 28. Is there really a correlation?
  • 30. Valve
        Grab raw data for window from graphite via REST
        Create scatter graph using R and calculate correlation
        Inject correlation into graphite exchange
  • 31. twitter-correlation-valve.rb
        require 'rsruby'
        ...
        r.jpeg(filename)
        r.assign('xs', data[1])
        r.assign('ys', data[2])
        fit = r.lm('ys ~ xs')
        r.plot({ 'x' => data[1], 'y' => data[2], 'xlab' => label[1], 'ylab' => label[2] })
        cor = r.cor(data[1],data[2]).to_s
        r.title("Correlation: " + cor)
        r.abline(fit['coefficients']['(Intercept)'],fit['coefficients']['xs'])
        r.eval_R("dev.off()")
  • 32.
  • 33. Let's add some realtime visualisation...
  • 35. Valve
        Subscribe to twitter exchange
        morat.twitter.search.[keyword]
        Extract adjectives using engtagger
        Inject adjectives into twitter exchange with routing key
        morat.twitter.search.[keyword].adjectives as: [adjective] [count]
  • 36. twitter-sentiment-valve.rb
        require 'engtagger'
        ...
        log.debug "Received tweet from #{record['user']['screen_name']} on #{routing_key}"
        adjectives = @parser.add_tags(record['text']).scan(EngTagger::ADJ).map do |n|
          @parser.strip_tags(n)
        end
        ret = Hash.new(0)
        adjectives.each do |n|
          n = @parser.stem(n)
          # Count each stemmed adjective, skipping blank stems
          ret[n] += 1 unless n =~ /\A\s*\z/
        end
  • 37. Sink
        Subscribe to twitter exchange
        morat.twitter.search.[keyword].adjectives
        Use node.js and Socket.IO to send data to web client via Websockets
        Visualise with processing.js in web browser
  • 38. twitter-sentiment-sink.js
        io.sockets.on('connection', function (socket) {
          amqp_connection.on('ready', function () {
            var queue = amqp_connection.queue('');
            exchange = amqp_connection.exchange('twitter',
              { type: 'topic', passive: false, durable: true, autoDelete: true},
              function (exchange) {
                queue.bind(exchange,routing_key);
                queue.subscribe(function (message) {
                  socket.emit('data', { text: message.data.toString() });
                });
              });
          });
        });
  • 39. twitter-sentiment-sink.html
        <H1>Twitter Sentiment</H1>
        <div id="container">
          <canvas id="twitter-sentiment-sink" data-processing-sources="twitter-sentiment-sink.pde" WIDTH=800 HEIGHT=600></canvas>
        </div>
        <script src="/socket.io/socket.io.js"></script>
        <script type="text/javascript">
          var socket = io.connect('http://localhost');
          socket.on('data', function (data) {
            var pjs = Processing.getInstanceById('twitter-sentiment-sink');
            pjs.addDatum(data.text.split(' ')[0]);
          });
        </script>
  • 40.
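
The valves in slides 11 and 22 bind to patterns such as morat.campaign.mailchimp.#, and the broker decides which routing keys those patterns catch. The sketch below illustrates AMQP topic-exchange matching in plain Ruby, where `*` stands for exactly one dot-separated word and `#` for zero or more. The helper names `topic_match?` and `match_words` are hypothetical and not part of the deck; in practice the broker does this matching for you.

```ruby
# Hypothetical helper (not from the deck): returns true when an AMQP-style
# topic binding pattern matches a routing key.
def topic_match?(pattern, key)
  match_words(pattern.split('.'), key.split('.'))
end

# Recursive word-by-word comparison of pattern words against key words.
def match_words(pat, words)
  return words.empty? if pat.empty?

  head, *rest = pat
  case head
  when '#'
    # '#' may absorb zero or more words: try every possible split point.
    (0..words.length).any? { |i| match_words(rest, words.drop(i)) }
  when '*'
    # '*' must consume exactly one word.
    !words.empty? && match_words(rest, words.drop(1))
  else
    # A literal word must match exactly.
    words.first == head && match_words(rest, words.drop(1))
  end
end

puts topic_match?('morat.campaign.mailchimp.#', 'morat.campaign.mailchimp.subscribe')
puts topic_match?('morat.irc.*', 'morat.irc')
```

The first call prints true and the second prints false, since `*` needs one more word after morat.irc. This is why one IRC valve bound with `#` sees every Mailchimp event type without listing them.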

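
Slides 22 and 25 count events per sample window and publish each count in Graphite's "[value] [timestamp]" form, with the metric name carried in the routing key. The following is a minimal sketch of that idea without a CEP engine: it only approximates a `time_batch` window by bucketing epoch-second timestamps into fixed, non-overlapping intervals. The function names, the `cgn13` campaign name and the sample timestamps are assumptions for illustration.

```ruby
# Hypothetical stand-in for a time_batch window: groups event times (epoch
# seconds) into fixed windows and returns [window_end, count] pairs in order.
# Windows with no events are simply absent from the result.
def batch_counts(event_times, window_secs)
  event_times
    .group_by { |t| t - (t % window_secs) } # key = start of the window
    .sort
    .map { |start, events| [start + window_secs, events.size] }
end

# Builds one sample as the deck describes it: payload "[value] [timestamp]",
# routing key "[window].[campaign].mailchimp.[action].count".
def graphite_sample(campaign, action, window_label, count, timestamp)
  {
    routing_key: "#{window_label}.#{campaign}.mailchimp.#{action}.count",
    payload: "#{count} #{timestamp}"
  }
end

# Five made-up subscribe events, batched into 10-second windows.
batch_counts([100, 103, 109, 110, 125], 10).each do |window_end, count|
  sample = graphite_sample('cgn13', 'subscribe', '10sec', count, window_end)
  puts "#{sample[:routing_key]} -> #{sample[:payload]}"
end
```

A real valve would hand the counting to the CEP engine, as the deck does, and only keep the formatting step.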
Editor's Notes

  41. Take the adjectives and store a running total in Redis to create long timeline tag clouds.
      Pull out @replies and RTs and throw them into Neo4j, a graph database, for post-competition analysis.
      Hook an Arduino up to IRC to receive Mailchimp subscriptions and create a physical visualisation in the office (e.g. a glow ball).
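
The first idea in note 41, keeping a running total of adjectives for a long-timeline tag cloud, can be sketched as follows. A plain in-memory Hash stands in for Redis here, which is an assumption made so the example runs on its own; the class name `RunningTotals` and the sample messages are also hypothetical. Input messages use the "[adjective] [count]" form from slide 35.

```ruby
# Hypothetical accumulator: an in-memory Hash stands in for Redis (assumption).
class RunningTotals
  def initialize
    @totals = Hash.new(0)
  end

  # Accepts one "[adjective] [count]" message and adds it to the total.
  def add(message)
    adjective, count = message.split(' ', 2)
    @totals[adjective] += Integer(count)
  end

  # Returns the n largest totals as [adjective, total] pairs,
  # ties broken alphabetically so the order is stable.
  def top(n)
    @totals.sort_by { |adjective, total| [-total, adjective] }.first(n)
  end
end

totals = RunningTotals.new
['great 2', 'slow 1', 'great 1'].each { |msg| totals.add(msg) }
p totals.top(2)
```

With a real Redis backend the same shape would likely map onto a sorted set, but that choice is outside what the deck specifies.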