Statim
time series interface for Perl

        Thiago Rondon
       YAPC::Brasil 2011
“Science progresses
through observation”
-- Isaac Newton
“The greatest value of a picture
is that it forces us to notice what
we never expected to see”
-- John Tukey
Where ?
•   Astronomy          •   Apps

•   Medicine           •   Operating System

•   Stock Market       •   Networks

•   Meteorology        •   Robotics

•   Biometrics         •   Events

•   Geology            •   ....
Wait! Let’s see
Chart::Clicker examples.
Line
StackedLine
Bar
StackedBar
Area
StackedArea
Bubble
CandleStick
Pie
PolarArea
What ?

• time series is a sequence of data points,
  measured typically at successive times
  spaced at uniform time intervals.
• an interface defines a set of methods and
  conventions that provide consistent access
  to a time series.
Why ?
• real-time statistics.
• simple to deploy.
• supports open standards.
• distributed input.
• provides default methods, functions, tools,
  etc.
Motivation
• independent
• many writers
• abstract layer on top of other databases to
  map and reduce.
• Portability
• easy integration
But we need more.
• 3D, visualizing data.
• Euclidean distance
 • Find similar patterns
 • Anomaly detection
 • Rule discovery
 • Frequent pattern discovery
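
The Euclidean-distance idea above can be sketched in a few lines. This is a minimal illustration, not Statim code: the function names `euclidean_distance` and `best_match` and the sample data are made up for the example. It slides a short pattern over a longer series and reports the window that lies closest to it, which is the basic step behind "find similar patterns".

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(series, pattern):
    """Slide `pattern` over `series`; return (offset, distance) of the closest window."""
    n = len(pattern)
    if n == 0 or n > len(series):
        raise ValueError("pattern must be non-empty and no longer than series")
    candidates = (
        (euclidean_distance(series[i:i + n], pattern), i)
        for i in range(len(series) - n + 1)
    )
    distance, offset = min(candidates)
    return offset, distance


# Hypothetical sample data, only for illustration.
series = [1, 2, 3, 9, 3, 2, 1, 2, 3, 4]
pattern = [2, 3, 4]
print(best_match(series, pattern))  # the window starting at offset 7 matches exactly
```

The same distance could, in principle, flag anomalies: a window that is far from every reference pattern is a candidate outlier.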
Architecture

[Diagram: storage, schema and engine; TCP server and TCP client; REST server and REST client; Perl client]
Conventions

• data collection
• period (time units)
• step and timestamp.
• type of aggregation
• fields
Engine

• AnyEvent
 • TCP
 • REST
Storage
•   Key Interface

    •   Redis

    •   Memcached

•   Table Interface

    •   DBI

    •   File

•   RRD Interface

    •   RRD
Schema
{
    "collection" : {
      "period" : "84600",
      "aggregate" : "sum",
      "fields" : {
         "foo" : "count",
         "bar" : "enum",
         "jaz" : "enum"
       }
    }
}
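
A short sketch of how a client or tool might read this schema. It only uses the JSON shown on the slide; the function name `load_schema` is an assumption, and so is the idea of converting `period` to an integer (the slide stores it as a string and does not state its unit).

```python
import json

# Schema text copied from the slide.
SCHEMA_TEXT = """
{
    "collection" : {
      "period" : "84600",
      "aggregate" : "sum",
      "fields" : {
         "foo" : "count",
         "bar" : "enum",
         "jaz" : "enum"
       }
    }
}
"""


def load_schema(text):
    """Parse the schema and return {collection_name: normalized settings}."""
    schema = {}
    for name, conf in json.loads(text).items():
        schema[name] = {
            "period": int(conf["period"]),  # string on the slide, integer here
            "aggregate": conf["aggregate"],
            "fields": dict(conf["fields"]),
        }
    return schema


schema = load_schema(SCHEMA_TEXT)
collection = schema["collection"]
enum_fields = sorted(f for f, kind in collection["fields"].items() if kind == "enum")
print(collection["period"], collection["aggregate"], enum_fields)
```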
Command

• add, get and set
• period, step
• version, quit
Commands
•   add collection foo:jaz bar:1
•   OK 1
•   add collection foo:jaz bar:1
•   OK 2
•   add collection foo:wow bar:10
•   OK 10
•   get collection foo:jaz bar
•   OK 2
•   get collection bar
•   OK 12
•   set collection foo:jaz bar:2
•   OK 2
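
To make the replies above easier to follow, here is a toy in-memory model that reproduces them. It is not the Statim server: the class `ToyCollection` is hypothetical, and the parsing rule (the last argument names the counter, every argument before it is a label filter) is an assumption inferred from the six commands on this slide.

```python
from collections import defaultdict


class ToyCollection:
    """In-memory stand-in for a single collection (illustrative only)."""

    def __init__(self):
        # (labels, counter_name) -> running total
        self.counters = defaultdict(int)

    def execute(self, line):
        cmd, _collection, *args = line.split()
        *labels, target = args  # assumed rule: last argument is the counter
        key = tuple(labels)     # e.g. ("foo:jaz",)
        if cmd == "add":
            name, amount = target.split(":")
            self.counters[(key, name)] += int(amount)
            return f"OK {self.counters[(key, name)]}"
        if cmd == "set":
            name, amount = target.split(":")
            self.counters[(key, name)] = int(amount)
            return f"OK {int(amount)}"
        if cmd == "get":
            # Sum every counter with this name whose labels include all filters.
            total = sum(
                value
                for (stored, name), value in self.counters.items()
                if name == target and all(label in stored for label in labels)
            )
            return f"OK {total}"
        raise ValueError(f"unknown command: {cmd}")


toy = ToyCollection()
replies = [
    toy.execute(line)
    for line in [
        "add collection foo:jaz bar:1",
        "add collection foo:jaz bar:1",
        "add collection foo:wow bar:10",
        "get collection foo:jaz bar",
        "get collection bar",
        "set collection foo:jaz bar:2",
    ]
]
print(replies)
```

Note that `get collection bar` has no label filter, so it sums over both `foo:jaz` and `foo:wow`, which is why the slide shows `OK 12`.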
Commands with “ts”

•   You can pass a timestamp with “ts”.

•   add collection foo:bar ts:1234567890 bar:1

•   You can get a new series with “ts”

•   get collection foo:bar ts:1234567880-1234567890
    bar
Commands with step

•   You can pass a step with “step”.

•   add collection foo:bar step:-10 bar:1

•   You can get a new series with “step”

•   get collection foo:bar step:-10..-20 bar
step vs timestamp

• You can ask the server for the step of a timestamp:
• step 1234567890
• Or calculate it in the client.
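
The slides do not say how a step is derived from a timestamp, so the client-side calculation below is only one plausible reading: a step is taken to be the index of the period-sized bucket that contains the timestamp, and a negative step (as in `step:-10`) is taken to be an offset from the current bucket. The function names and the 60-second period are assumptions for the example.

```python
def step_of(timestamp, period):
    """Index of the period-sized bucket containing `timestamp` (assumed definition)."""
    if period <= 0:
        raise ValueError("period must be positive")
    return timestamp // period


def relative_step(timestamp, now, period):
    """Step of `timestamp` as an offset from the step of `now` (0 = current bucket)."""
    return step_of(timestamp, period) - step_of(now, period)


def step_bounds(step, period):
    """First and last timestamp (inclusive) covered by an absolute step."""
    start = step * period
    return start, start + period - 1


now = 1234567890
period = 60  # hypothetical period in seconds
print(step_of(now, period))
print(relative_step(now - 600, now, period))
print(step_bounds(step_of(now, period), period))
```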
timestamp vs step


• Timestamps make it easy to insert data
• Steps make it easy to get data
New time series by periods.
• get collection bar:jaz ts:$ts1-$ts2 foo:max
• get collection bar:* ts:$ts1-$ts2 foo:max
• get collection bar:jaz ts:$ts1-$ts3 foo:min
• get collection ts:$ts1-$ts3 foo:sum
• get collection ts:$ts1-$ts3 foo:avg
• get collection ts:$ts1-$ts3 foo:distinct
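
What these queries compute can be sketched over a plain list of `(timestamp, value)` points. This is an illustration under assumptions, not the server's implementation: the range is treated as inclusive on both ends, `distinct` is taken to mean the sorted set of distinct values, and the function name `aggregate` and the sample points are invented.

```python
def aggregate(points, ts_from, ts_to, how):
    """Reduce the values whose timestamp lies in [ts_from, ts_to] (inclusive, assumed)."""
    reducers = {
        "max": max,
        "min": min,
        "sum": sum,
        "avg": lambda values: sum(values) / len(values),
        "distinct": lambda values: sorted(set(values)),  # assumed meaning
    }
    if how not in reducers:
        raise ValueError(f"unknown aggregation: {how}")
    values = [value for ts, value in points if ts_from <= ts <= ts_to]
    if not values:
        return None  # nothing recorded in the range
    return reducers[how](values)


# Hypothetical sample points: (timestamp, value of "foo").
points = [(100, 4), (110, 7), (120, 4), (130, 9)]
for how in ("max", "min", "sum", "avg", "distinct"):
    print(how, aggregate(points, 100, 120, how))
```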
New time series by periods.
• get collection ts:$ts1-$ts2
  foo:trend(5)
• get collection ts:$ts1-$ts2
  foo:anomaly(100)
• get collection ts:$ts1-$ts3
  foo:frequency(200)
Chart !

• You can use any library for charts.
• Server-side, e.g. Chart::Clicker.
• Client-side, e.g. HTML5 and JavaScript libraries.
REST


• You can extend to any protocol.
• statim-rest : /method/[args]
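
The slide gives only the pattern `/method/[args]`, so the mapping below is a guess at how a text command could be turned into such a path and back: the command name becomes the method and each remaining token becomes one path segment. The helper names `command_to_path` and `path_to_command` are hypothetical, and the real statim-rest routes may differ.

```python
from urllib.parse import quote, unquote


def command_to_path(line):
    """Map 'get collection foo:jaz bar' to '/get/collection/foo:jaz/bar' (assumed layout)."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command")
    # Keep ':' readable; percent-encode anything else that is unsafe in a path segment.
    return "/" + "/".join(quote(token, safe=":") for token in tokens)


def path_to_command(path):
    """Inverse of command_to_path."""
    return " ".join(unquote(segment) for segment in path.strip("/").split("/"))


path = command_to_path("get collection foo:jaz bar")
print(path)
print(path_to_command(path))
```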
Thanks.

• http://github.com/maluco/Statim
• http://search.cpan.org/dist/Statim

• @thiagorondon
