Overcoming the Top Four Challenges to 
Real‐Time Performance in Large‐Scale, 
      Data‐Centric Applications
              Tom Lubinski
           Founder and CEO
           F     d    d CEO
             SL Corporation
             17 November 2011
 Here to talk about Scalability and Performance
                               y

 Problem Space:

  Collection, Analysis, and Visualization in Real-Time
  of large volumes of monitoring data from large
                                              large-
  scale, complex, distributed applications

  Keywords: Real-Time, Large Volumes of Data
Challenges

 Challenge #1:

  Database Performance
  D t b    P f

     Common to see queries taking minutes
     How can you get real-time that way ?
Challenges

 Challenge #2:

  Network Data-Transfer Bandwidth
  N t   kD t T      f B d idth

     Bigger pipes, but there s more data to send
            pipes      there’s
     How do you get the greatest throughput ?
Challenges

 Challenge #3:

  Processor P f
  P         Performance

     More cores just means more processes !
     How do you optimize your utilization ?
Challenges

 Challenge #4:

  Lack f R l Ti
  L k of Real-Time Predictability
                   P di t bilit

     Virtualization is the new time-share !
                               time share
     How can you trust your data ?

  “time-sharing”, “network computer”, “cloud”, do
  things ever really change ?
 Solution – Clues ?

 F t of Life:
  Facts f Lif

  Database – can’t live with it can’t live without it
             can t           it, can t

  Network – it’s a funnel, no way around it
                         ,      y

  Processor – must limit what you ask it to do

  Virtualization - it’s erratic, have to compensate
Solutions

 Solution #1:

  Proper Data Model
  P      D t M d l

     Data structures designed for real-time
                                  real time
     In-memory structures to buffer database
Can your application be … 




… like a high‐performance racecar ?  
What is most important part of racecar ?
            (besides the engine)




        … the Transmission …
For Real‐Time performance, it’s the Cache …




 Not a simple                  High‐performance
                               High performance
“current value”          Real-time Multi-dimensional
     cache                       Data Cache
Real‐Time Cache – optimized for performance !
                    Current / History Tables:
                    C       / Hi      T bl




                    In                   Out




                               …
    Indexed Insertion ‐                    Indexed extraction ‐
asynchronous real‐time data            optimized transfer to clients
Real‐Time Cache – Data Processing / Aggregation

                                                     Detail
                                                     Views
Raw
Data           Reduced

                                                    Summary
                                                     Views

    Reduction, 
 Resolution,   Aging                     Aggregation
Real‐Time Cache – Database read/write through
          ( p
          (optimized for timestamped multi‐dimensional data)
                     f           p                         )




Real‐Time data
                                                Seamless timeline 
 automatically 
 automatically
                                            navigation with automatic 
 written to DB
                                                 database query
This sounds a bit like Oracle Coherence …



                                     Buffer database

                                     Read/write through

                                     Listeners

                                     Indexed queries

                                     What’s different ?
                                              ff
Different tools for different problems !

Real-Time Multi-dimensional d t
R l Ti    M lti di    i   l data:

      Current / History Tables:
                                     Multiple rows (time range) of 
                                        li l       (i         ) f
                                   selected columns returned in one 
                                                query 

                                  Coherence cache distributes objects 
                                    (rows) = optimized horizontally

                                  Real‐Time multi‐dimensional cache 
                                   manages columns and optimizes 
                 …                                    y
                                             vertically
Benefits: Indexed Real-Time Caching

Slow SQL queries minimized
Sl           i    i i i d

Users shielded from database details

Minimize CPU load using effective indexing
Solutions

 Solution #2

  Server-Side A
  S      Sid Aggregation
                    ti
     (am I being too obvious with this one ?)


     Know the use cases
     Joins and GroupBy done on server
                    p y
     SQL does this, but do you need it ?
Problems with SQL Database Queries

Slow
Sl
Slowwer with concurrent queries
If you need it fast, it goes even slowwwwwwer !
               fast

SQL = Not p
 Q        portable
   (Timestamps, especially)
Know your problem space !

Real-Time Monitoring:
R l Ti    M it i
   Join and GroupBy heavily used

We wrote our own!
Performed in real-time on server-side data
Optimized for real-time requirements
 Example: Server-Side Aggregation/Caching
       Servlet Data




                      GroupBy
                        App

Raw
Data
       App Data                            Totals By App
                                Join on 
                                                                                                  To 
                                  App
                                                                                                  Clients
                                                                                                  Cli t

                                                           GroupBy
                                                            Server



       Server Data                                                             Totals By Server
                                                                     Join on
                                                                     Server
 Each cache can maintain its own history
             Servlet Data




Cached
Data                        Totals By App
And                                                            To 
Aggregates                                                     Clients
                                                               Cli t


                                            Totals By Server
                  …
                                …


                                                  …
 Result: trend chart of Totals by History has all
  data available immediately

 Using SQL would require:

   Query 3 tables
   2 GroupBys, 2 Joins, + Join on Timestamp (not portable)
Benefits: Server-Side Aggregation

Client requests and gets exactly what i needed
Cli               d           l   h is     d d

Client processing = zero
Server processing = done ahead of time
Current/History for aggregates readily available (No SQL)

Response time = fast
Solutions

 Solution #3

  Use Appropriate D i
  U A        i t Design Patterns
                        P tt

     Server Side vs Client-Side
     Server-Side vs. Client Side Processing
     Efficient Data Transfer Patterns
 Pattern #1:

  Data Compaction
  D    C      i
     (obvious, initial approach for any data transfers)




       Server              Packets only partially filled  …
                                                                                  Client




                encode                                                  decode

                           … replaced with full packets



                                             … even simple, non‐proprietary 
                                             algorithms can make big difference
                                             algorithms can make big difference
 Pattern #2:

  Data Current / Changed
  D    C         Ch    d
      (large data tables with sparse real-time updates)



                          Entire table sent every update …
      Server
                                                                                    Client




               encode                                                     decode


                          … instead, send only changed rows



                                                … little more complex, requires indexing
 Pattern #3:

  Data History / C
  D    Hi        Current
      (trend chart invoke with real-time updates)



                          Entire history table sent every update …
      Server
                                                                                          Client




               manage                                                          merge
      …
                        … instead, send history once, then current updates



                                         … similar to current / changed pattern, but specific to history
 Pattern #4:

  Data Current / Subset
  D    C         S b
      (optimizing transfer of data subsets to multiple clients)


                                                                                              Client
                          Changed subset sent to every client …
      Server
                                                                                     listen
                                                                                   indexed




                                                                                              Client
               register
               indexed



                                                                                     listen
                                                                                   indexed



                           … instead, send subset only to registered client

                                                           … requires registration logic coupled with cache
Benefits: Design Patterns for Data Transfer

Same problem over and over again solved similar way
S       bl          d         i    l d i il

Reduce load on network
Optimize response time – no unnecessary data
Conclusions

 Conclusion #1:

  Know your data !
  K         d t

     Data Model designed for real-time
                             real time
     In-memory structures to buffer database
     Server-side aggregations
                  gg g
Conclusions

 Conclusion #2

  Respect Design P tt
  R     tD i     Patterns !

     Server Side vs Client-Side
     Server-Side vs. Client Side Processing
     Efficient Data Transfer Patterns
     Don’t over-generalize – solve the problem
                 g                     p
About SL Corporation



 Software Product company since 1983

 Headquarters in Marin County CA
                        County,

 Worldwide presence in Americas, APAC, EMEA

 Over 100,000 licenses sold

 Core expertise in application performance
  monitoring – special focus on middleware
Select SL RTView Customers
 Financial Services
 Financial Services   E‐Commerce/Retail
                      E Commerce/Retail   Telecommunications




                           Energy               Other
RTView: The Solution Offering

    Application                   Oracle
                                                            Middleware
   Performance                  Coherence
                                                            Monitoring
    Monitoring                  Monitoring
• Collect all necessary    •Understand the              • Determine how
application-centric and    behavior of Coherence        applications are
middleware-centric                                      interacting with
p
performance data           •Debug and validate
                                  g                     middleware systems
                                                                      y
                           functionality after
• Configure data           configuration changes        • Assess whether
aggregation and                                         applications are running
persistence, filters,      •Integrate OCM with          efficiently and reliably
analytics,
analytics alerts and       existing monitoring tools
displays to deliver                                     • Ensure the maximum
information tailored for   •Enable quick notification   benefit from an ESB
app support teams          of problems                  investment
RTView – Sample Applications




OOCL World WideShipment Tracking   Hospitality Card application at     Online Gaming Systems
                                   Harrah’s casino gaming tables




      Tax Season at Intuit         PJM Real‐time Energy Pricing       Banking application in Korea
RTView – Multi-Tier Visibility

                                 Unified Real‐time display of data
                                     from all Application tiers




              Update for ORCL




                                      In‐depth Monitoring of 
                                     Middleware Components
                                     Middleware Components
RTView – Large Data Volumes


 Typical large
  implementation,
  distributed over several
  regions with many
  custom applications

 Heatmap View showing
  current state of entire
  system – size represents
  number of servers for
      b     f        f
  application

 Color represents how
  close metric is to SLA –
  large red boxes are worst
  – drilldown to detail
RTView - Drill-Down to Detail Metrics


 Drilldown to detail level
  metrics showing i t
     t i   h i     internal l
  metrics from each application

 Sophisticated history and
  alert view allows fine-tuning
  of thresholds for each metric
RTView – Complex Visualizations

 Observe “internal load balancing” of Data Grid
Questions?
         See www.sl.com
for more info about SL and RTView

Overcoming the Top Four Challenges to Real‐Time Performance in Large‐Scale, Data‐Centric Applications

  • 1.
    Overcoming the Top Four Challenges to  Real‐Time Performance in Large‐Scale,  Data‐Centric Applications Tom Lubinski Founder and CEO F d d CEO SL Corporation 17 November 2011
  • 2.
     Here totalk about Scalability and Performance y  Problem Space: Collection, Analysis, and Visualization in Real-Time of large volumes of monitoring data from large large- scale, complex, distributed applications Keywords: Real-Time, Large Volumes of Data
  • 3.
    Challenges  Challenge #1: Database Performance D t b P f Common to see queries taking minutes How can you get real-time that way ?
  • 4.
    Challenges  Challenge #2: Network Data-Transfer Bandwidth N t kD t T f B d idth Bigger pipes, but there s more data to send pipes there’s How do you get the greatest throughput ?
  • 5.
    Challenges  Challenge #3: Processor P f P Performance More cores just means more processes ! How do you optimize your utilization ?
  • 6.
    Challenges  Challenge #4: Lack f R l Ti L k of Real-Time Predictability P di t bilit Virtualization is the new time-share ! time share How can you trust your data ? “time-sharing”, “network computer”, “cloud”, do things ever really change ?
  • 7.
     Solution –Clues ?  F t of Life: Facts f Lif Database – can’t live with it can’t live without it can t it, can t Network – it’s a funnel, no way around it , y Processor – must limit what you ask it to do Virtualization - it’s erratic, have to compensate
  • 8.
    Solutions  Solution #1: Proper Data Model P D t M d l Data structures designed for real-time real time In-memory structures to buffer database
  • 9.
  • 10.
    What is most important part of racecar ? (besides the engine) … the Transmission …
  • 11.
    For Real‐Time performance, it’s the Cache … Nota simple High‐performance High performance “current value” Real-time Multi-dimensional cache Data Cache
  • 12.
    Real‐Time Cache – optimized for performance ! Current / History Tables: C / Hi T bl In Out … Indexed Insertion ‐ Indexed extraction ‐ asynchronous real‐time data optimized transfer to clients
  • 13.
    Real‐Time Cache –Data Processing / Aggregation Detail Views Raw Data Reduced Summary  Views Reduction,  Resolution,   Aging Aggregation
  • 14.
    Real‐Time Cache –Database read/write through ( p (optimized for timestamped multi‐dimensional data) f p ) Real‐Time data Seamless timeline  automatically  automatically navigation with automatic  written to DB database query
  • 15.
    This sounds abit like Oracle Coherence … Buffer database Read/write through Listeners Indexed queries What’s different ? ff
  • 16.
    Different tools fordifferent problems ! Real-Time Multi-dimensional d t R l Ti M lti di i l data: Current / History Tables: Multiple rows (time range) of  li l (i ) f selected columns returned in one  query  Coherence cache distributes objects  (rows) = optimized horizontally Real‐Time multi‐dimensional cache  manages columns and optimizes  … y vertically
  • 17.
    Benefits: Indexed Real-TimeCaching Slow SQL queries minimized Sl i i i i d Users shielded from database details Minimize CPU load using effective indexing
  • 18.
    Solutions  Solution #2 Server-Side A S Sid Aggregation ti (am I being too obvious with this one ?) Know the use cases Joins and GroupBy done on server p y SQL does this, but do you need it ?
  • 19.
    Problems with SQLDatabase Queries Slow Sl Slowwer with concurrent queries If you need it fast, it goes even slowwwwwwer ! fast SQL = Not p Q portable (Timestamps, especially)
  • 20.
    Know your problemspace ! Real-Time Monitoring: R l Ti M it i Join and GroupBy heavily used We wrote our own! Performed in real-time on server-side data Optimized for real-time requirements
  • 21.
     Example: Server-SideAggregation/Caching Servlet Data GroupBy App Raw Data App Data Totals By App Join on  To  App Clients Cli t GroupBy Server Server Data Totals By Server Join on Server
  • 22.
     Each cachecan maintain its own history Servlet Data Cached Data Totals By App And  To  Aggregates Clients Cli t Totals By Server … … …
  • 23.
     Result: trendchart of Totals by History has all data available immediately  Using SQL would require: Query 3 tables 2 GroupBys, 2 Joins, + Join on Timestamp (not portable)
  • 24.
    Benefits: Server-Side Aggregation Clientrequests and gets exactly what i needed Cli d l h is d d Client processing = zero Server processing = done ahead of time Current/History for aggregates readily available (No SQL) Response time = fast
  • 25.
    Solutions  Solution #3 Use Appropriate D i U A i t Design Patterns P tt Server Side vs Client-Side Server-Side vs. Client Side Processing Efficient Data Transfer Patterns
  • 26.
     Pattern #1: Data Compaction D C i (obvious, initial approach for any data transfers) Server Packets only partially filled  … Client encode decode … replaced with full packets … even simple, non‐proprietary  algorithms can make big difference algorithms can make big difference
  • 27.
     Pattern #2: Data Current / Changed D C Ch d (large data tables with sparse real-time updates) Entire table sent every update … Server Client encode decode … instead, send only changed rows … little more complex, requires indexing
  • 28.
     Pattern #3: Data History / C D Hi Current (trend chart invoke with real-time updates) Entire history table sent every update … Server Client manage merge … … instead, send history once, then current updates … similar to current / changed pattern, but specific to history
  • 29.
     Pattern #4: Data Current / Subset D C S b (optimizing transfer of data subsets to multiple clients) Client Changed subset sent to every client … Server listen indexed Client register indexed listen indexed … instead, send subset only to registered client … requires registration logic coupled with cache
  • 30.
    Benefits: Design Patternsfor Data Transfer Same problem over and over again solved similar way S bl d i l d i il Reduce load on network Optimize response time – no unnecessary data
  • 31.
    Conclusions  Conclusion #1: Know your data ! K d t Data Model designed for real-time real time In-memory structures to buffer database Server-side aggregations gg g
  • 32.
    Conclusions  Conclusion #2 Respect Design P tt R tD i Patterns ! Server Side vs Client-Side Server-Side vs. Client Side Processing Efficient Data Transfer Patterns Don’t over-generalize – solve the problem g p
  • 33.
    About SL Corporation Software Product company since 1983  Headquarters in Marin County CA County,  Worldwide presence in Americas, APAC, EMEA  Over 100,000 licenses sold  Core expertise in application performance monitoring – special focus on middleware
  • 34.
    Select SL RTViewCustomers Financial Services Financial Services E‐Commerce/Retail E Commerce/Retail Telecommunications Energy Other
  • 35.
    RTView: The SolutionOffering Application Oracle Middleware Performance Coherence Monitoring Monitoring Monitoring • Collect all necessary •Understand the • Determine how application-centric and behavior of Coherence applications are middleware-centric interacting with p performance data •Debug and validate g middleware systems y functionality after • Configure data configuration changes • Assess whether aggregation and applications are running persistence, filters, •Integrate OCM with efficiently and reliably analytics, analytics alerts and existing monitoring tools displays to deliver • Ensure the maximum information tailored for •Enable quick notification benefit from an ESB app support teams of problems investment
  • 36.
    RTView – SampleApplications OOCL World WideShipment Tracking Hospitality Card application at  Online Gaming Systems Harrah’s casino gaming tables Tax Season at Intuit PJM Real‐time Energy Pricing Banking application in Korea
  • 37.
    RTView – Multi-TierVisibility Unified Real‐time display of data from all Application tiers Update for ORCL In‐depth Monitoring of  Middleware Components Middleware Components
  • 38.
    RTView – LargeData Volumes  Typical large implementation, distributed over several regions with many custom applications  Heatmap View showing current state of entire system – size represents number of servers for b f f application  Color represents how close metric is to SLA – large red boxes are worst – drilldown to detail
  • 39.
    RTView - Drill-Downto Detail Metrics  Drilldown to detail level metrics showing i t t i h i internal l metrics from each application  Sophisticated history and alert view allows fine-tuning of thresholds for each metric
  • 40.
    RTView – ComplexVisualizations Observe “internal load balancing” of Data Grid
  • 41.
    Questions? See www.sl.com for more info about SL and RTView