Walmart & IBM Revisit the Linear Road Benchmark - Roger Rea, IBM

The Linear Road benchmark was devised in 2004 to
compare Stream Data Management Systems. Walmart selected Linear Road to compare the performance of streaming analytics
offerings. IBM implemented the benchmark application using Redis to maintain state and IBM Streams to handle the
incoming events and queries. Walmart had to completely revamp the data drivers and test verification to take advantage
of the multicore, multithreaded servers available today. Tests were run on the Microsoft Azure cloud to ensure a fair
comparison of vendors. Redis and IBM Streams handled nearly 1 billion events in a 3-hour test on a single 16-core Azure
node, and 3.8 billion when scaled out to 4 nodes. Come learn about the application and the near-linear scalability of
Redis and IBM Streams.
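
The scalability claim can be checked with a little arithmetic. The sketch below is illustrative only: it takes the event counts from the slide 10 results table (963.1 million events at 50 expressways on one node, 3.8 billion at 200 expressways on four nodes) and assumes, as a simplification, that throughput is averaged evenly over the whole 3-hour run. The function and variable names are hypothetical and are not part of the benchmark code.

```python
# Duration of one benchmark run, per the talk description (3 hours).
TEST_SECONDS = 3 * 60 * 60

# L-Rating (expressways) -> (Azure nodes, total events processed).
# Values are copied from the slide 10 results table.
RESULTS = {
    50: (1, 963.1e6),
    200: (4, 3.8e9),
}


def avg_throughput(events, seconds=TEST_SECONDS):
    """Average events per second, assuming an even load over the run."""
    return events / seconds


def scaling_efficiency(base, scaled):
    """Observed speedup divided by the ideal speedup (ratio of node counts)."""
    base_nodes, base_events = base
    scaled_nodes, scaled_events = scaled
    speedup = scaled_events / base_events
    ideal = scaled_nodes / base_nodes
    return speedup / ideal


single_node = avg_throughput(RESULTS[50][1])
four_nodes = avg_throughput(RESULTS[200][1])
efficiency = scaling_efficiency(RESULTS[50], RESULTS[200])

print(f"1 node : {single_node / 1000:.1f}K events/second")
print(f"4 nodes: {four_nodes / 1000:.1f}K events/second")
print(f"scaling efficiency: {efficiency:.1%}")
```

Under these assumptions the four-node run retains roughly 98 to 99 percent of ideal scaling. Because the Linear Road load is not constant over a run, peak throughput would be higher than these averages.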

Published in: Technology

  1. Linear Road Benchmark
     Performance Comparison of Streaming Analytic Offerings
     IBM Streams
     22 April 2016
     Matt Grover, Walmart ISD Enterprise Architecture
     Roger Rea, IBM Streams Offering Manager
     Mike Spicer, IBM Lead Architect, IBM Streams
     © 2016 IBM Corporation
  2. Walmart and IBM
     Today, nearly 260 million customers visit our more than 11,500 stores under 72 banners in 28 countries and e-commerce sites in 11 countries each week. We employ 2.2 million associates around the world, 1.4 million in the U.S. alone.
     International Business Machines Corporation ('IBM') is a globally integrated enterprise operating in over 170 countries, with 380,000 employees. It brings innovative solutions to a diverse client base to help solve some of their toughest business challenges.
  3. Why Linear Road?
     Why rewrite Linear Road
     • No comprehensive streaming benchmark available
     • The storage design did not represent the current state of streaming data systems
     Requirement for Streaming Analytics at Walmart
     • Worldwide monitoring of logistics
     • Real time inventory control
     • Real time analytics
     Linear Road Benchmark (2004)
     • Original White paper (link)
     • Open Source benchmark
     • Enables comparison between offerings
     • Sophisticated application with state management
  4. Linear Road Benchmark
     • Linear City is a fictional metropolis, 100x100 miles
     • 10 expressways, one every 10 miles
     • Each expressway has an exit and on-ramp every mile
     • Each expressway has 4 lanes in each direction: 3 travel lanes and one lane for entrance and exit
     • Every vehicle emits a position report every 30 seconds
     • One accident occurs randomly on each expressway every 20 minutes, taking 10 to 20 minutes to clear
     Linear Road on github
  5. Linear Road Benchmark
     Four types of events
     • Type 0: 99% of events are real-time position reports
     • Type 2: Historical requests for account balances
     • Type 3: Daily expenditure for a specific day in the past 10 weeks
     • Type 4: Travel time predictions
     GOAL: Maximum L-Rating (max # expressways)
     Linear Road on github
  6. High level Linear Road architecture
     • Linear Road data generator (Courtesy of: Wal-Mart Stores Inc.)
     • Linear Road (Solution implementation using vendor specific Streaming analytics middleware)
     • Results Validator (Rewritten in Python by Wal-Mart Stores Inc.)
     • Determine L-Rating
  7. Why did IBM select Redis?
     • Great maturity level
     • Top performance
     • API is tremendously easy and very flexible
     • Clustered in-memory Key Value Store with fault tolerance
     • Option for in-memory or in-memory backed by persistence
     • Easy installation and monitoring
  8. High level Linear Road architecture with Redis and IBM Streams
     • Linear Road Data Feeder streaming the events via TCP or Kafka
     • Data Feeder
     • TCP receiver
     • Kafka consumer
     • Event router
     • IBM Streams Linear Road logic
     • Position report analytics (for each xway and direction), 1 .. N
     • Type 0 toll notifications
     • Type 1 accident alerts
     • Account Balance analytics
     • Type 2 results
     • Daily expenditure analytics
     • Type 3 results
     • Historical reference data loader (A separate Streams application)
     • Distributed state keeper
  9. Streaming analytics test bed
     Cloud Service (all nodes on a vnet named Subnet-1)
     • Linux or Windows Jump box
     • IBM Network
     • Subnet-1
     • CPU: Intel Xeon E5-2670 @ 2.60 GHz (16 cores on all the machines)
     • Memory: 110 GB on Nodes 1 to 6
     • Redis: Total of 10 instances running on 5 machines
     • Node 1: Streams Management Server
     • Node 2: Streams Application Server
     • Node 3: Streams Application Server
     • Node 4: Streams Application Server
     • Node 5: Streams Application Server (Ingest)
     • Node 6: Standby and scratch work Server
  10. Streams results
      • L-Rating 50 on one Azure node, 200 on 4 Azure nodes
      • 1 node, 16 cores, nearly 1B events
      • 4 nodes, 64 cores, nearly 4B events
      • Linear scalability
      • Handles bursty traffic
      • 99% of responses sub-second

      # of x-ways   # of cars      Entries          Memory     CPU
      1             278973         19.2 Million     2.2 GB     2%
      2             558726         38.5 Million     4.5 GB     4%
      5             1.3 Million    96.3 Million     10.9 GB    7%
      10            2.7 Million    192.5 Million    22.0 GB    11%
      15            4.1 Million    289.7 Million    33.0 GB    16%
      20            5.6 Million    385.2 Million    43.5 GB    20%
      25            6.9 Million    482.0 Million    54.5 GB    26%
      50            14.0 Million   963.1 Million    109.0 GB   31%
      100           27.6 Million   1.9 Billion      220 GB     22%
      150           41.5 Million   2.8 Billion      330 GB     33%
      200           55.0 Million   3.8 Billion      440 GB     45%

      [Charts: Avg. Throughput (K events/second) vs. Number of expressways, one for 1 to 50 expressways and one for 50 to 200 expressways]
  11. Streams results
      • Development effort: one person, 14.5 days
        • 1.5 days install Linux & Streams on 5 Azure nodes
        • 2 days design application
        • 8 days iterative development
        • 3 days unit testing & tuning
      • Scale automated with User Defined Parallelization
      [Graphic text: One Way]
  12. Comparison to other technologies

      Technology     Hardware on Azure   L-Rating
      IBM Streams    Option 1            200
      Apache Apex    Option 1            102
      Apache Storm   Option 2            10

      Four nodes of Option 1 or 2 for application processing:
      • Option 1: Azure A11 (16 cores, 112 GB RAM, 382 GB Disk, 10 Gbit/s networking)
        • CPU model: 45, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
      • Option 2: Azure D14 (16 cores, 112 GB RAM, 800 GB Disk (SSD), 1 Gbit/s)
        • CPU model: 45, Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
      Two nodes for ingesting data:
      • If A11 selected, then A10 (8 cores, 56 GB RAM, 382 GB Disk, 10 Gbit/s networking)
      • If D14 selected, then D13 (8 cores, 56 GB RAM, 400 GB Disk, 1 Gbit/s)
      Plus: an A10 or D13 Windows Server, or Linux, jump box (Windows if a GUI is needed)
      Six total nodes
      IBM Streams:
      • 2x better than Apex
      • 20x better than Storm *
      * Twitter has replaced Storm
  13. IBM recognized as a leader
      The Forrester Wave™: Big Data Streaming Analytics Platforms, Q1 ’16
      “IBM’s architecture can flex to handle any streaming challenge.”
      “The development environment provides one of the richest set of operators in the market.”
      “Streams can ingest and understand the always-on stream of data to make the decisions that underlie cognitive solutions.”
      IBM had the highest possible scores in Architecture, Operational Management, Streaming Operators, Application Development and Business Applications, Roadmap, Ability to Execute, Implementation Support and Partners.
      The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
  14. Stream Computing
      [Feature matrix; cell values not recoverable]
      Columns: Open Source, Extensible platform, Managed Service, Batch & Streaming, Command Line i/face, Web & JMX mgmt, At Least Once, Exactly once, State, Windows, Backpressure, Machine Learning, Model scoring, Video/Image, Geospatial, Text Analytics, Visual development, Automated HA, Enterprise adapters, Open source adapters
      Rows: Esper, IBM Streams, Storm, Flink, Spark Streaming, Dataflow
  15. Affordable Realtime Analytics
      IBM Streams
      • 100 Azure nodes: $110,261/Mo
      • 5 Azure nodes: $5,513/Mo
  16. IBM Streams Success
      Streams is the industry-leading stream computing runtime for real-time analytic processing for large-scale, in-memory distributed data processing.
      Why do customers choose Streams?
      • Superior performance and low latency
      • Superior reliability and management
      • Widest range of adapters
      • Rapid development/debug capabilities
      • User Community: StreamsDev, github
      • Advanced Analytics: Machine Learning, Audio/Video, Geospatial, Natural Language Processing
      • Enterprise integration & reliability
      • IBM worldwide services and support
  17. Additional resources
      Visit:
      • ibm.com/streams
      • github.com/Walmart
      • github.com/IBMStreams/benchmarks
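
The event router on slide 8 splits the incoming stream by the event types listed on slide 5. The following is a minimal sketch of that idea in plain Python, not the IBM Streams implementation. The event field names (`type`, `xway`, `direction`, `vid`), the queue keys, and the `route` function are all assumptions made for illustration. Slide 8 shows no separate branch for Type 4 travel time requests, so the sketch simply parks them in an `unrouted` queue.

```python
from collections import defaultdict

# Event type codes as listed on slide 5.
POSITION_REPORT = 0
ACCOUNT_BALANCE = 2
DAILY_EXPENDITURE = 3
TRAVEL_TIME = 4


def route(event, queues):
    """Append the event to the queue for its analytics branch.

    Returns the queue key so callers can see where the event went.
    Field names and queue keys are hypothetical.
    """
    etype = event["type"]
    if etype == POSITION_REPORT:
        # One position report branch per expressway and direction (1 .. N).
        key = ("position", event["xway"], event["direction"])
    elif etype == ACCOUNT_BALANCE:
        key = ("account_balance",)
    elif etype == DAILY_EXPENDITURE:
        key = ("daily_expenditure",)
    elif etype == TRAVEL_TIME:
        # No branch is shown for this type on slide 8; park it.
        key = ("unrouted",)
    else:
        raise ValueError(f"unknown event type: {etype}")
    queues[key].append(event)
    return key


queues = defaultdict(list)
events = [
    {"type": 0, "vid": 42, "xway": 3, "direction": 1},
    {"type": 0, "vid": 43, "xway": 3, "direction": 0},
    {"type": 2, "vid": 42},
    {"type": 3, "vid": 42},
]
routed = [route(event, queues) for event in events]
print(routed)
```

Keying position reports by expressway and direction mirrors the "for each xway and direction" label on the slide: each key can be processed independently, which is what lets the position report branch run as 1 .. N parallel instances.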
