Platform for Real-Time Production Operations

Prepared for LSPE Meet-up
November 21, 2013
Data torrent meetup-productioneng

DataTorrent presentation at http://www.meetup.com/SF-Bay-Area-Large-Scale-Production-Engineering/events/137185282/

Published in: Technology, Education

  1. Platform for Real-Time Production Operations. Prepared for LSPE Meet-up, November 21, 2013
  2. DataTorrent in Hadoop Ecosystem
     • Most powerful Hadoop platform for real-time stream computations
     • Massive Real-Time Production Monitoring, Analytics, and Alerting
       – Systems monitoring: Resource Utilization, Logs Analysis
       – Predictive Maintenance, DOS Attack, Launch Validation, etc.
  3. DataTorrent Technology Stack
     • Malhar – Open Source Operators and Apps Library (Apache v2 License)
     • SLA, Alerts, Tools, Web Services, State Snapshot, Security, Scalability, Fault Tolerance, Partitioning, Dynamic Modifications
     • StrAM (Stream Application Master)
  4. DataTorrent’s Platform Differentiators
     • Extreme Scalability
       – Automatically scale to changing loads
       – Sub-second latency with linear scalability
       – Complex monitoring applications with massive computations
     • Mission Critical
       – Built-in stateful fault tolerance; 24/7 uptime guaranteed
       – Predictive analysis and troubleshooting
       – Update your application while it's running!
     • Hadoop-Native
       – Runs on your existing Apache Hadoop cluster
       – Develop faster with our open-source framework
       – Integrate seamlessly with your existing monitoring stack
  5. Stream Processing
     [Diagram labels: Data Load; Stream 1, Stream 2, Stream 3, Stream 4; Window 1, Window 2, Window 3]
     • A Stream is a sequence of data events with a schema
     • An Operator takes input streams and computes output streams
     • An Application is a Directed Acyclic Graph (DAG)
     • In-memory, asynchronous, distributed computations
     • A Streaming Window is an atomic batch of sequential data events
  6. DataTorrent Hadoop GRID
     [Architecture diagram. Labels: DT Console, dtCLI, DT Gateway, Resource Manager, NM (shown on four nodes), StrAM, MapReduce containers, and boxes numbered 1 to 6]
  7. Live Demonstration
  8. Open Sourced Production Operations Application: Real-Time Dashboards and Actions
     • DOS Attack
     • Predictive maintenance of servers
     • Pre and post launch analysis
     • 404 Response
     • Root cause analysis for LAMP architecture
     • Segmentation
       – Geo Location
       – Gender, Age
       – Resource usage (URLs)
       – Etc.
     • URL Analysis
       – Response times
       – Patterns
     • Seamless integration into monitoring stacks
  9. How to Get Started?
     • DataTorrent
       – Try Sandbox (https://datatorrent.com)
       – Free for small to medium enterprises: contact us for details
     • Malhar Open Source (Apache 2.0) project
       – https://github.com/DataTorrent/Malhar
       – malhar-users@googlegroups.com
     • Applications available Jan 2014
       – LogStream: Site Operations
       – Map-Reduce Monitor
     DataTorrent Inc., 3200 Patrick Henry, 2nd Fl, Santa Clara, CA 95054
     info@datatorrent.com, www.datatorrent.com, Twitter.com/DataTorrent, Facebook.com/DataTorrent
  10. Platform Capabilities
      • Scalable High Performance
        – Throughput in billions of events/sec
        – Latency in milliseconds
      • Powerful Tools
        – GUI for cluster performance monitoring
        – GUI and debuggers for event data
        – Test framework, certification, versioning
        – CLI, macros
      • Fault-Tolerance
        – Node outage recovery with no state loss and no message loss
        – State management
        – Efficient state checkpointing
      • Easy To Use
        – Library of operator templates
        – Focus on business logic
        – Connectors to current tools: HDFS, HBase, MySQL, ActiveMQ
        – APIs for tool integrations
      • Adaptability
        – Runtime scaling and resource optimization
        – Dynamic application modification
      • Native YARN Application
        – Integrates with Hadoop 2.0 distributions: Apache, Cloudera, Hortonworks, MapR, Pivotal
        – Co-exists with existing batch infrastructure
        – Multi-tenancy with existing Hadoop applications
  11. Appendix
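
The stream-processing terms on slide 5 (stream, operator, DAG, streaming window) can be illustrated with a small plain-Python sketch. It does not use the DataTorrent or Malhar API. Every name in it (`windows`, `only_errors`, `count_by_status`, `run`, the `(url, status)` event schema, and the sample events) is hypothetical and chosen only for illustration. It also runs synchronously in one process, whereas the slides describe in-memory, asynchronous, distributed computation.

```python
from collections import Counter
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event schema for this sketch: (url, HTTP status code).
Event = Tuple[str, int]


def windows(stream: Iterable[Event], size: int) -> Iterator[List[Event]]:
    """Cut a stream into streaming windows: batches of sequential events."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def only_errors(window: List[Event]) -> List[Event]:
    """Operator 1: keep events whose status code is 400 or above."""
    return [event for event in window if event[1] >= 400]


def count_by_status(window: List[Event]) -> Dict[int, int]:
    """Operator 2: count events per status code within one window."""
    return dict(Counter(status for _, status in window))


def run(stream: Iterable[Event], size: int) -> List[Dict[int, int]]:
    """A two-node DAG (only_errors -> count_by_status), applied per window."""
    return [count_by_status(only_errors(w)) for w in windows(stream, size)]


events = [("/a", 200), ("/b", 404), ("/c", 404), ("/d", 500), ("/e", 200)]
results = run(events, size=2)
print(results)
```

Each window is processed as a unit, so a downstream consumer sees one result per window rather than one per event. In this sketch the window is defined by event count; that is a simplification made here, not something the slides specify.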

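The fault-tolerance claims on slide 10 (state checkpointing, recovery with no state loss and no message loss) can be pictured with the following minimal sketch. It is not DataTorrent's implementation. It assumes that operator state is snapshotted at window boundaries and that windows processed after the last snapshot can be replayed from upstream; both assumptions, and all names (`CountingOperator`, `run_with_recovery`, `checkpoint_every`, `fail_at`), are hypothetical.

```python
import copy
from typing import Dict, List


class CountingOperator:
    """Hypothetical stateful operator: running count of events per key."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def process_window(self, window: List[str]) -> None:
        for key in window:
            self.counts[key] = self.counts.get(key, 0) + 1


def run_with_recovery(
    windows: List[List[str]], checkpoint_every: int, fail_at: int
) -> Dict[str, int]:
    """Process windows in order, snapshotting state at window boundaries.

    A crash is simulated once, just before window index `fail_at`
    (use -1 for no crash). Recovery restores the last snapshot and
    replays every window processed since that snapshot.
    """
    op = CountingOperator()
    saved_state: Dict[str, int] = {}
    saved_index = 0  # number of windows reflected in saved_state
    crashed = False
    i = 0
    while i < len(windows):
        if i == fail_at and not crashed:
            crashed = True
            op = CountingOperator()  # in-memory state is lost
            op.counts = copy.deepcopy(saved_state)
            i = saved_index  # replay from the snapshot boundary
            continue
        op.process_window(windows[i])
        i += 1
        if i % checkpoint_every == 0:
            saved_state = copy.deepcopy(op.counts)
            saved_index = i
    return op.counts


data = [["a", "b"], ["a"], ["b", "b"], ["c"]]
clean = run_with_recovery(data, checkpoint_every=2, fail_at=-1)
recovered = run_with_recovery(data, checkpoint_every=2, fail_at=3)
print(clean, recovered)
```

Because the snapshot is taken only at window boundaries, replay always restarts at the beginning of a window, so no event is counted twice or skipped. The trade-off in this sketch is between snapshot frequency and the amount of work replayed after a failure.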