A Practical Reference Guide to Stream
Analytics Solutions in the Enterprise
Contents
Overview
Key Characteristics of Modern Stream Analytics Solutions
Challenges with Traditional Stream Analytic Platform Vendors
Stream Analytics in the Cloud
  AWS Kinesis
  Azure Stream Analytics
  Salesforce Thunder
Stream Analytics On-Premise
  Apache Storm
  Spark Streaming
  Apache Samza
Summary
Overview
Stream analytics is becoming a foundational piece of real-time analytics solutions.
The ability to process, analyze and react to data insights in real time is an essential
component of modern enterprise solutions. While most enterprises clearly
understand the value proposition of stream analytics solutions, that understanding
hasn’t translated into technology adoption. In our opinion, this adoption
challenge stems from two factors: the failure of traditional “complex event processing”
vendors to deliver modern stream analytic capabilities, and a lack of
understanding of modern platforms that deliver a simpler and yet more
sophisticated stream analytics experience.
Having implemented a few stream analytic solutions for Global 2000 companies, we
decided to compile some of our experiences in this report. The goal is not to
produce an analyst report but rather to offer practical guidance to organizations
evaluating stream analytics solutions. The thoughts expressed in this report are
based on practical experience implementing stream analytic solutions, not
on guidance or recommendations for specific products.
Key Characteristics of Modern Stream Analytics Solutions
While the principles of stream analytics solutions have been widely adopted in
specific domains like public markets, they are relatively novel to most industries. In
that sense, there are no well-established criteria for what makes an enterprise-ready
stream analytics solution. In our experience, there are a few factors that should be
considered.
Connectors to Line of Business Systems
Integration with existing line of business systems is an essential element of stream
analytics solutions. Although often overlooked, streaming data to and from enterprise
systems frequently becomes one of the most complex and time-consuming
aspects of stream analytics implementations. To address that, enterprise-ready
stream analytic solutions should provide connectors to business systems that
produce or consume data streams based on system events. Stream data connectors
are not only a huge time saver but also enforce consistency in how
data is produced and consumed in stream analytics solutions.
Stream Replay and Simulation
Testing is one of the most difficult capabilities to implement in real-world stream
analytics solutions. To test stream analytics scenarios efficiently, a solution should
be able to replay specific data streams in the correct order. When evaluating
stream analytics platforms, organizations should consider solutions that enable the
recording and replaying of events. This capability allows enterprises to recreate
different real-world scenarios by replaying the corresponding streams of data.
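To make the record-and-replay idea concrete, the following is a minimal sketch in plain Python rather than the API of any platform discussed in this report. The `Event` shape, the JSON-lines log format and the `record`/`replay` helpers are all assumptions introduced for illustration. Events are recorded as they arrive and later replayed to a handler in timestamp order, with a sequence number breaking ties.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Event:
    """A recorded stream event (hypothetical shape for illustration)."""
    timestamp: float   # event time, in seconds
    sequence: int      # tie-breaker assigned at recording time
    payload: dict


def record(events: Iterable[Event]) -> str:
    """Serialize events as JSON lines, one event per line."""
    return "\n".join(json.dumps(asdict(e)) for e in events)


def replay(log: str, handler: Callable[[Event], None]) -> int:
    """Feed recorded events to `handler` in (timestamp, sequence) order."""
    events = [Event(**json.loads(line)) for line in log.splitlines() if line]
    events.sort(key=lambda e: (e.timestamp, e.sequence))
    for event in events:
        handler(event)
    return len(events)


# Events recorded out of order, e.g. interleaved from two producers.
recorded = record([
    Event(2.0, 1, {"sensor": "a", "value": 7}),
    Event(1.0, 0, {"sensor": "b", "value": 3}),
    Event(2.0, 0, {"sensor": "b", "value": 5}),
])

seen: List[int] = []
count = replay(recorded, lambda e: seen.append(e.payload["value"]))
print(count, seen)
```

Because the handler is an ordinary callable, the same processing logic can be exercised against a recorded log in a test and against live data in production.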
Stream Tracking and Monitoring
Like stream replay, active tracking and monitoring of events is an
essential element of enterprise-ready stream analytics solutions. This capability is a
foundational building block for the operational management, support and
compliance of stream analytics solutions. Beyond the infrastructure aspect of
high-throughput event stream monitoring, enterprise-ready stream analytics solutions
should provide tools that allow DevOps and IT professionals to visualize,
understand and act on the runtime behavior of stream analytics
solutions.
Security and Data Privacy
Security and data privacy are key aspects of any enterprise data solution, and
stream analytics is no exception. Mechanisms to encrypt, sign and
protect data streams are required to address many of the security and compliance
requirements of mission-critical enterprise solutions. Enterprises
should therefore factor in the relevant security and data privacy capabilities when
evaluating stream analytic platforms.
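As one illustration of signing a data stream, the sketch below attaches an HMAC-SHA256 signature to each event so a consumer can detect tampering. It uses only the Python standard library. The envelope layout, the `sign_event`/`verify_event` helpers and the hard-coded key are assumptions for illustration and do not describe the mechanism of any particular platform. Encryption and key management are out of scope here.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def sign_event(event: dict, key: bytes = SECRET_KEY) -> dict:
    """Return an envelope holding the serialized event and its HMAC-SHA256 signature."""
    # Canonical serialization so producer and consumer hash identical bytes.
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"body": body.decode("utf-8"), "signature": signature}


def verify_event(envelope: dict, key: bytes = SECRET_KEY) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(key, envelope["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])


envelope = sign_event({"device": "pump-7", "pressure": 42})
tampered = dict(envelope, body=envelope["body"].replace("42", "99"))
print(verify_event(envelope), verify_event(tampered))
```

In a real deployment the key would come from a secrets store and be rotated; the point of the sketch is only that verification can happen per event, anywhere along the stream.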
Interfaces for Power Users
Stream analytics is typically considered an infrastructure capability. However, many
of the solutions powered by stream analytics platforms have a strong end-user
component. As a result, power users often need to interact with and extend aspects of
the solution in order to adapt it to new scenarios. To meet this requirement,
enterprises should look for stream analytic solutions that provide robust tools for
power users and enable important self-service capabilities.
Integration with Existing Analytic Tools
The analytics and data visualization space has evolved drastically in the last few
years. Many new entrants in the market like Tableau or QlikView have become
common citizens in modern enterprises. The integration with mainstream analytics
tools is an important element of modern stream analytics solutions. This capability
allows business users to interact with complex stream analytics infrastructures
using a familiar toolset, which minimizes the friction for the adoption of these
solutions in the enterprise.
Challenges with Traditional Stream Analytic Platform Vendors
When evaluating stream analytic solutions, organizations tend to segment the
market into different groups. In our experience, grouping stream analytics
solutions by their underlying topology (cloud or on-premise) is a very
effective way to start analyzing this market. While cloud and on-premise stream
analytics stacks can be very similar in their functional capabilities, they differ
drastically in terms of their underlying infrastructure as well as their management and
scalability models.
The next sections of this document look at the most important stream analytics
platforms in both cloud and on-premise topologies.
Stream Analytics in the Cloud
Stream analytics has become an important component of platform as a service
(PaaS) solutions. The elastic scalability model of cloud infrastructure drastically
simplifies the implementation of stream analytics solutions. As a result, many of the
most sophisticated stream analytics platforms in the market are delivered as part of
PaaS offerings.
Below we detail some of the most important players in the space.
AWS Kinesis
• Description: Amazon Kinesis is a platform for streaming data on AWS,
offering powerful services to make it easy to load and analyze streaming
data, and also providing the ability for you to build custom streaming data
applications for specialized needs. Web applications, mobile devices,
wearables, industrial sensors, and many software applications and services
can generate staggering amounts of streaming data – sometimes TBs per
hour – that need to be collected, stored, and processed continuously.
• Key Capabilities: AWS Kinesis is based on three fundamental building
blocks:
o Kinesis Firehose: Firehose provides an easy way to load
streaming data into AWS. By default, Kinesis Firehose can load data
streams into AWS services like Redshift or S3. It can also batch,
compress, and encrypt the data before loading it, minimizing the
amount of storage used at the destination and increasing security.
o Kinesis Analytics: Kinesis Analytics executes SQL queries against
dynamic data streams produced by Kinesis Firehose. Additionally,
Kinesis Analytics helps to model actions as a result of stream queries.
o Kinesis Streams: Kinesis Streams provides a model to implement
data streams that can be processed by Kinesis applications. This
component of the platform provides the programming models to build
connectors that generate data streams from various data systems,
including the services in the AWS platform.
• Challenges: AWS Kinesis is one of the most complete stream analytics
platforms in the market. However, like any new solution, it presents
well-known challenges when implemented in real-world solutions. The lack of tools
for non-developers, as well as the limited simulation and even tracking
capabilities, are some of the most notable limitations of AWS Kinesis.
Azure Stream Analytics
• Description: Azure Stream Analytics (ASA) is a fully managed, real-time
event-processing engine that helps to unlock deep insights from data.
Stream Analytics makes it easy to set up real-time analytic computations on
data streaming from devices, sensors, web sites, social media, applications,
infrastructure systems, and more. Using the Azure portal, you can author a
Stream Analytics job specifying the input source of the streaming data, the
output sink for the results of your job, and a data transformation expressed
in a SQL-like language. You can monitor and adjust the scale/speed of jobs
in the Azure portal to scale from a few kilobytes to a gigabyte or more of
events processed per second.
• Key Capabilities: Azure Stream Analytics provides some key capabilities
that should be considered during its evaluation.
o IoT Hub and Event Hubs: Azure Stream Analytics integrates with
Event Hubs and IoT Hub to process large volumes of events per second
and scale linearly.
o SQL Queries: Azure Stream Analytics powers the authoring and
execution of SQL-like queries over dynamic streams of data.
o Support for Storm and Spark Streaming: In addition to its native
stream analytics platform, Azure provides support for Apache Storm
and Spark Streaming. As a result, solutions built on the Azure platform
can take advantage of various stream analytic models depending on the
requirements.
• Challenges: Similar to AWS Kinesis, Azure Stream Analytics lacks strong
support and tooling for non-developers. Additionally, the platform hasn’t yet
developed an ecosystem of connectors that produce and consume streams of
data from Azure Stream Analytics.
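SQL-like queries over streams usually aggregate events within time windows rather than over a whole table. As a rough sketch of what such a query computes, the plain-Python example below averages a reading per device over fixed, non-overlapping (tumbling) windows. It is not the query language of Azure Stream Analytics or Kinesis Analytics; the `Reading` tuple layout and the `tumbling_average` helper are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Reading = Tuple[float, str, float]  # (event time in seconds, device id, value)


def tumbling_average(readings: Iterable[Reading],
                     window_seconds: float) -> Dict[Tuple[int, str], float]:
    """Average value per (window index, device) over fixed-size windows."""
    totals: Dict[Tuple[int, str], list] = defaultdict(lambda: [0.0, 0])
    for event_time, device, value in readings:
        # Each event belongs to exactly one window, chosen by its event time.
        window = int(event_time // window_seconds)
        bucket = totals[(window, device)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


readings = [
    (1.0, "d1", 20.0),
    (4.0, "d1", 22.0),
    (9.5, "d2", 30.0),
    (12.0, "d1", 26.0),
]
averages = tumbling_average(readings, window_seconds=10)
for (window, device), avg in sorted(averages.items()):
    print(window, device, avg)
```

A managed service would evaluate the same grouping continuously and emit each window's result once the window closes, instead of returning everything at the end as this batch-style sketch does.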
Salesforce.com Thunder
• Description: Based on Apache Storm, Salesforce.com Thunder is a very
scalable event-processing engine, designed to ingest and orchestrate billions
of events from the connected world in real time. Thunder uses smart
technology to reveal insights that were previously invisible and allows anyone
to take proactive, personalized actions from any device.
• Key Capabilities: Despite being a new entrant in the market,
Salesforce.com Thunder includes some key capabilities such as the following:
o Orchestration: Salesforce.com Thunder provides an end-user-friendly
model to author and manage rules that model actions based on the
results extracted from data streams.
o SQL Queries: Salesforce.com Thunder powers the authoring and
execution of SQL-like queries over dynamic streams of data.
o Integrated with Salesforce.com Cloud Products: Without any
required customization, Salesforce.com Thunder integrates with
different Salesforce.com Cloud products, which facilitates its
immediate applicability in business scenarios.
• Challenges: Compared to platforms like AWS and Azure, Salesforce.com
Thunder provides a more limited solution from the infrastructure standpoint.
Additionally, Salesforce.com Thunder still lacks connectors, stream producers
and even integration with third-party services outside the Salesforce.com
platform.
Stream Analytics On-Premise
While cloud stream analytics platforms are incredibly advanced from a capability
standpoint, they are off-limits for many organizations subject to regulatory and
compliance requirements. In those scenarios, on-premise stream analytics solutions
represent a powerful alternative. Among the on-premise platforms in the market,
we believe Apache Storm, Spark Streaming and Apache Samza represent the most
advanced technology stacks.
Apache Storm
• Description: Storm is a distributed real-time computation system for
processing large volumes of high-velocity data. Storm is extremely fast, with
the ability to process over a million records per second per node on a cluster
of modest size. Enterprises harness this speed and combine it with other data
access applications in Apache Hadoop to prevent undesirable events or to
optimize positive outcomes.
• Key Capabilities:
o Modular Architecture: Storm offers a very modular architecture in
which components can be composed into complex stream
analytics solutions.
o Trident: Trident offers a simpler interface to author applications on
top of Storm. This model facilitates the implementation of stream
analytics applications without having to be completely familiar with
Storm’s architecture.
o Connector Framework: Storm provides a number of connectors
(Spouts and Bolts) to different data platforms and provides models
that allow developers to implement custom connectors to new
systems.
• Challenges: The biggest challenge with enterprise implementations of Storm
is the lack of sophisticated, enterprise-ready tools that are well known and
adopted within IT environments. However, the Storm community and
implementers have slowly started to make progress in addressing some of these
limitations.
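The modular composition of sources and processing steps can be illustrated without Storm itself. The sketch below uses plain Python generators to mimic a linear topology in which a spout emits tuples and bolts transform them. It is a conceptual analogy, not Storm's API; the function names `sentence_spout`, `split_bolt` and `count_bolt` are hypothetical, and a real topology would run its components in parallel across a cluster.

```python
from collections import Counter
from typing import Iterable, Iterator


def sentence_spout() -> Iterator[str]:
    """Source component: emits raw tuples (here, a fixed list of sentences)."""
    yield from ["storm processes streams", "streams of events", "storm scales"]


def split_bolt(sentences: Iterable[str]) -> Iterator[str]:
    """Processing component: splits each sentence into words."""
    for sentence in sentences:
        yield from sentence.split()


def count_bolt(words: Iterable[str]) -> Counter:
    """Processing component: keeps a running count per word."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
    return counts


# Wire the components into a linear topology: spout -> split -> count.
word_counts = count_bolt(split_bolt(sentence_spout()))
print(word_counts.most_common(2))
```

Each component only depends on the shape of the tuples it receives, which is what lets components be swapped or recombined into larger solutions.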
Spark Streaming
• Description: Spark is rapidly becoming one of the most important big data
platforms in modern solutions. Spark Streaming is an extension of the core
Spark API that enables scalable, high-throughput, fault-tolerant stream
processing of live data streams. Data can be ingested from many sources like
Kafka, Flume, Twitter, ZeroMQ, Kinesis, or TCP sockets, and can be
processed using complex algorithms expressed with high-level functions like
map, reduce, join and window. Finally, processed data can be pushed out to
file systems, databases, and live dashboards.
• Key Capabilities:
o Integration with Spark: Spark Streaming is tightly integrated with
the other components of the Spark platform, including Spark MLlib and
Spark SQL. This feature facilitates the implementation of feature-rich
stream analytics solutions without abandoning the Spark platform.
o Simplicity: Spark Streaming provides a very simple programming
model that allows developers to author complex stream analytic solutions
without having to become experts in the space.
o Big Data Platform Ecosystem: Spark is being widely adopted by
most big data platform vendors such as Cloudera, Hortonworks or
MapR. This ecosystem is indirectly influencing the adoption of Spark
Streaming as the default engine to implement stream analytics
solutions in big data platforms.
• Challenges: The dependencies on Spark bring notable advantages to Spark
Streaming but also introduce notable challenges. Extending stream analytic
solutions beyond the capabilities provided by Spark can result in very complex
implementations. Another interesting challenge of Spark Streaming solutions is the
integration with third-party systems. Based on the MPP requirements of Spark,
integrating third-party systems into Spark requires a considerable engineering
effort compared to some of the other alternatives in the market.
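The high-level functions mentioned above (map, reduce, window) can be illustrated with a small sketch. Spark Streaming processes live data as a sequence of small batches, and a window operation combines the most recent batches. The plain-Python example below imitates that model for a sliding word count; it does not use the Spark API, and the `windowed_counts` helper and its parameters are assumptions for illustration.

```python
from collections import Counter, deque
from functools import reduce
from typing import Deque, Iterable, Iterator, List


def windowed_counts(batches: Iterable[List[str]],
                    window_batches: int) -> Iterator[Counter]:
    """For each incoming batch, yield word counts over the last `window_batches` batches."""
    window: Deque[Counter] = deque(maxlen=window_batches)
    for batch in batches:
        # map: line -> words; reduce: words -> per-batch counts
        words = [word for line in batch for word in line.split()]
        window.append(Counter(words))
        # Combine the per-batch counts currently inside the window.
        yield reduce(lambda a, b: a + b, window, Counter())


batches = [["a b"], ["a"], ["b b"]]
results = list(windowed_counts(batches, window_batches=2))
print(results[-1])
```

The oldest batch falls out of the window automatically once `window_batches` is exceeded, which is why the first batch no longer contributes to the final result.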
Apache Samza
• Description: Apache Samza is a distributed stream-processing framework. It
was initially created at LinkedIn as an alternative to platforms like Apache Storm.
Samza uses Apache Kafka for messaging, and Apache Hadoop YARN to provide
fault tolerance, processor isolation, security, and resource management.
Although a recent entrant in the stream analytics space, the current feature set is
quite impressive.
• Key Capabilities:
o Extremely Complete Stream Processing Engine: Apache Samza
was designed to abstract some of the most complex aspects of stream
processing from application developers. By default, Samza includes
capabilities such as state management, ordered processing and data
buffering, which are incredibly hard to implement in stream
analytic solutions.
o Simplicity: Apache Samza provides a very simple programming
model that allows developers to author complex stream analytic
solutions. Additionally, Samza was designed to support multi-language
applications even though the initial implementation has been
constrained to JVM languages.
o Integration with Apache Kafka: Samza relies on Apache Kafka for its
message passing models. Also built by LinkedIn, Kafka is quickly
becoming one of the most robust messaging platforms in the market
and is experiencing significant adoption in the enterprise. The current
model makes Apache Samza a great choice for organizations looking
to expand their Kafka infrastructure with stream processing
capabilities.
• Challenges: Apache Samza is the newest entrant in the stream analytics space. In
that sense, Samza still has limitations in terms of integration with third-party
systems, the robustness of its management tool stack and other capabilities
required to guarantee mainstream adoption in the enterprise. Additionally, the
Samza developer and customer communities are relatively small compared to other
platforms.
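To show why built-in state management and ordered processing matter, the sketch below keeps a running total per key while messages are routed to partitions, with one task instance holding local state for each partition. This is a conceptual illustration in plain Python, not Samza's or Kafka's API; the `RunningTotalTask` class, the `partition_for` function and the message shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[str, int]  # (key, amount)


def partition_for(key: str, num_partitions: int) -> int:
    """Deterministic partitioner so a key always lands in the same partition."""
    return sum(key.encode("utf-8")) % num_partitions


class RunningTotalTask:
    """Hypothetical task: one instance per partition, holding local state."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = defaultdict(int)

    def process(self, message: Message) -> Tuple[str, int]:
        key, amount = message
        self.state[key] += amount
        return key, self.state[key]


def run(messages: List[Message], num_partitions: int) -> List[Tuple[str, int]]:
    tasks = [RunningTotalTask() for _ in range(num_partitions)]
    output = []
    for message in messages:  # arrival order is preserved within each partition
        task = tasks[partition_for(message[0], num_partitions)]
        output.append(task.process(message))
    return output


outputs = run([("alice", 5), ("bob", 2), ("alice", 3)], num_partitions=2)
print(outputs)
```

Because every message for a given key reaches the same task in order, the task can keep its state locally without coordinating with other partitions.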
Summary
Stream analytics is quickly becoming one of the most important elements of
modern enterprise business intelligence solutions. When evaluating stream
analytics solutions, organizations should look for capabilities such as integration
with third-party systems and tracking and management tools that will facilitate
adoption in an enterprise environment. This paper provided an overview of the key
capabilities that are relevant to implementing stream analytic solutions in the
enterprise.
The stream analytic ecosystem can be divided between cloud and on-premise
platforms. In the cloud space, platforms like Azure, AWS or Salesforce.com have
released some of the most innovative stream analytic solutions in the market. Many
of those solutions are based on stream processing platforms like Apache Storm,
Spark Streaming or Apache Samza, which can be adopted on-premise. This paper
included an analysis of some of the key stream analytic platforms, including
strengths and weaknesses, based on our experience in real-world implementations.

More Related Content

What's hot

Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010
Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010
Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010thinkofdevil
 
S+S Partner Opportunity Whitepaper
S+S Partner Opportunity WhitepaperS+S Partner Opportunity Whitepaper
S+S Partner Opportunity Whitepapersumanthr
 
Avangate transition to_saa_s_-_whitepaper
Avangate transition to_saa_s_-_whitepaperAvangate transition to_saa_s_-_whitepaper
Avangate transition to_saa_s_-_whitepaper2Checkout
 
58961174 case-study-on-sql-server-high-availability-and-disaster-recovery
58961174 case-study-on-sql-server-high-availability-and-disaster-recovery58961174 case-study-on-sql-server-high-availability-and-disaster-recovery
58961174 case-study-on-sql-server-high-availability-and-disaster-recoveryhomeworkping3
 
IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)IBM India Smarter Computing
 
Cscc cloud-customer-architecture-for-e commerce
Cscc cloud-customer-architecture-for-e commerceCscc cloud-customer-architecture-for-e commerce
Cscc cloud-customer-architecture-for-e commercer_arorabms
 
Case Studies (Questions and Answers)
Case Studies (Questions and Answers)Case Studies (Questions and Answers)
Case Studies (Questions and Answers)113068
 
A collaborative requirement elicitation technique
A collaborative requirement elicitation techniqueA collaborative requirement elicitation technique
A collaborative requirement elicitation techniqueHasan Dwi Cahyono
 
Cloud computing
Cloud computingCloud computing
Cloud computingsfu-kras
 
Cloud computing for java and dotnet
Cloud computing for java and dotnetCloud computing for java and dotnet
Cloud computing for java and dotnetredpel dot com
 
IBM Watson Explorer for inbound call centers
IBM Watson Explorer for inbound call centersIBM Watson Explorer for inbound call centers
IBM Watson Explorer for inbound call centersVirginia Fernandez
 
Cloud report q4 2011
Cloud report q4 2011Cloud report q4 2011
Cloud report q4 2011Mathias Ekman
 

What's hot (19)

Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010
Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010
Hi tech whitepaper_augmenting_value_saas_changing_business_eco-system_09_2010
 
S+S Partner Opportunity Whitepaper
S+S Partner Opportunity WhitepaperS+S Partner Opportunity Whitepaper
S+S Partner Opportunity Whitepaper
 
Avangate transition to_saa_s_-_whitepaper
Avangate transition to_saa_s_-_whitepaperAvangate transition to_saa_s_-_whitepaper
Avangate transition to_saa_s_-_whitepaper
 
Cloud view platform-highlights-web3
Cloud view platform-highlights-web3Cloud view platform-highlights-web3
Cloud view platform-highlights-web3
 
Csb(박준성교수 080813)
Csb(박준성교수 080813)Csb(박준성교수 080813)
Csb(박준성교수 080813)
 
58961174 case-study-on-sql-server-high-availability-and-disaster-recovery
58961174 case-study-on-sql-server-high-availability-and-disaster-recovery58961174 case-study-on-sql-server-high-availability-and-disaster-recovery
58961174 case-study-on-sql-server-high-availability-and-disaster-recovery
 
IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)
 
IBM Point of View: Security and Cloud Computing
IBM Point of View: Security and Cloud ComputingIBM Point of View: Security and Cloud Computing
IBM Point of View: Security and Cloud Computing
 
Cscc cloud-customer-architecture-for-e commerce
Cscc cloud-customer-architecture-for-e commerceCscc cloud-customer-architecture-for-e commerce
Cscc cloud-customer-architecture-for-e commerce
 
Saa s hr automation
Saa s hr automationSaa s hr automation
Saa s hr automation
 
Case Studies (Questions and Answers)
Case Studies (Questions and Answers)Case Studies (Questions and Answers)
Case Studies (Questions and Answers)
 
A collaborative requirement elicitation technique
A collaborative requirement elicitation techniqueA collaborative requirement elicitation technique
A collaborative requirement elicitation technique
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Cloud computing for java and dotnet
Cloud computing for java and dotnetCloud computing for java and dotnet
Cloud computing for java and dotnet
 
IBM Watson Explorer for inbound call centers
IBM Watson Explorer for inbound call centersIBM Watson Explorer for inbound call centers
IBM Watson Explorer for inbound call centers
 
2011-ESB-WP-Draft
2011-ESB-WP-Draft2011-ESB-WP-Draft
2011-ESB-WP-Draft
 
Cloud report q4 2011
Cloud report q4 2011Cloud report q4 2011
Cloud report q4 2011
 
IJET-V2I6P19
IJET-V2I6P19IJET-V2I6P19
IJET-V2I6P19
 
ENERGY EFFICIENCY IN CLOUD COMPUTING
ENERGY EFFICIENCY IN CLOUD COMPUTINGENERGY EFFICIENCY IN CLOUD COMPUTING
ENERGY EFFICIENCY IN CLOUD COMPUTING
 

Similar to Your practical reference guide to build an stream analytics solution

Real-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdf
Real-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdfReal-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdf
Real-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdfBahaa Al Zubaidi
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopLynn Langit
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesVasu S
 
Azure. Is It Worth It? - TechEd Beijing 2010 - Ethos
Azure. Is It Worth It? - TechEd Beijing 2010 - EthosAzure. Is It Worth It? - TechEd Beijing 2010 - Ethos
Azure. Is It Worth It? - TechEd Beijing 2010 - EthosEthos Technologies
 
Adoption of Blockchain in SAP Supply Chain Management
Adoption of Blockchain in SAP Supply Chain ManagementAdoption of Blockchain in SAP Supply Chain Management
Adoption of Blockchain in SAP Supply Chain ManagementIRJET Journal
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Riccardo Zamana
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data LakeMetroStar
 
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICSHIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICSHappiest Minds Technologies
 
A treatise on SAP logistics information reporting
A treatise on SAP logistics information reportingA treatise on SAP logistics information reporting
A treatise on SAP logistics information reportingVijay Raj
 
Future of work machine learning and middle level jobs 112618
Future of work machine learning and middle level jobs 112618Future of work machine learning and middle level jobs 112618
Future of work machine learning and middle level jobs 112618Economic Strategy Institute
 
Bus 421 Research Paper
Bus 421 Research PaperBus 421 Research Paper
Bus 421 Research PaperCrystal Torres
 
Data Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxData Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxArunPandiyan890855
 
The State of Log Management & Analytics for AWS
The State of Log Management & Analytics for AWSThe State of Log Management & Analytics for AWS
The State of Log Management & Analytics for AWSTrevor Parsons
 
Analytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsAnalytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsInside Analysis
 

Similar to Your practical reference guide to build an stream analytics solution (20)

Real-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdf
Real-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdfReal-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdf
Real-time analytics in applications_ New Architectures - Bahaa Al Zubaidi.pdf
 
Evaluation guide to Streaming Analytics
Evaluation guide to Streaming AnalyticsEvaluation guide to Streaming Analytics
Evaluation guide to Streaming Analytics
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for Hadoop
 
Optimizing the Cloud Infrastructure for Enterprise Applications
Optimizing the Cloud Infrastructure for Enterprise ApplicationsOptimizing the Cloud Infrastructure for Enterprise Applications
Optimizing the Cloud Infrastructure for Enterprise Applications
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data Lakes
 
Cloud Analytics
Cloud AnalyticsCloud Analytics
Cloud Analytics
 
Azure. Is It Worth It? - TechEd Beijing 2010 - Ethos
Azure. Is It Worth It? - TechEd Beijing 2010 - EthosAzure. Is It Worth It? - TechEd Beijing 2010 - Ethos
Azure. Is It Worth It? - TechEd Beijing 2010 - Ethos
 
Adoption of Blockchain in SAP Supply Chain Management
Adoption of Blockchain in SAP Supply Chain ManagementAdoption of Blockchain in SAP Supply Chain Management
Adoption of Blockchain in SAP Supply Chain Management
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICSHIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
 
A treatise on SAP logistics information reporting
A treatise on SAP logistics information reportingA treatise on SAP logistics information reporting
A treatise on SAP logistics information reporting
 
Future of work machine learning and middle level jobs 112618
Future of work machine learning and middle level jobs 112618Future of work machine learning and middle level jobs 112618
Future of work machine learning and middle level jobs 112618
 
Overview of SaaS
Overview of SaaSOverview of SaaS
Overview of SaaS
 
Bus 421 Research Paper
Bus 421 Research PaperBus 421 Research Paper
Bus 421 Research Paper
 
Data Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptxData Modernization_Harinath Susairaj.pptx
Data Modernization_Harinath Susairaj.pptx
 
Cloud analytics
Cloud analyticsCloud analytics
Cloud analytics
 
The State of Log Management & Analytics for AWS
The State of Log Management & Analytics for AWSThe State of Log Management & Analytics for AWS
The State of Log Management & Analytics for AWS
 
Mapping Manager
Mapping ManagerMapping Manager
Mapping Manager
 
Analytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsAnalytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old Constraints
 
Your practical reference guide to building a stream analytics solution

A Practical Reference Guide to Stream Analytics Solutions in the Enterprise

Contents

Overview
Key Characteristics of Modern Stream Analytics Solutions
Challenges with Traditional Stream Analytic Platform Vendors
Stream Analytics in the Cloud
  AWS Kinesis
  Azure Stream Analytics
  Salesforce Thunder
Stream Analytics On-Premise
  Apache Storm
  Spark Streaming
  Apache Samza
Summary

Overview

Stream analytics is becoming a foundational piece of real-time analytics solutions. The ability to process, analyze and react to data insights in real time is an essential component of modern enterprise solutions. While most enterprises clearly understand the value proposition of stream analytics solutions, that understanding hasn't translated into technology adoption. In our opinion, this adoption challenge stems from two factors: the failure of traditional "complex event processing" vendors to deliver modern stream analytics capabilities, and a lack of understanding of the modern platforms that deliver a simpler yet more sophisticated stream analytics experience.

Having implemented a few stream analytics solutions for Global 2000 companies, we decided to compile some of our experiences in this report. The goal is not to produce an analyst report but rather to offer practical guidance to organizations evaluating stream analytics solutions. The thoughts expressed in this report are the result of practical experience implementing stream analytics solutions, not specific product guidance or recommendations.

Key Characteristics of Modern Stream Analytics Solutions

While the principles of stream analytics have been widely adopted in specific domains like public markets, they are relatively novel to most industries. In that sense, there are no well-established criteria for what makes an enterprise-ready stream analytics solution. In our experience, there are a few factors that should be considered.

Connectors to Line of Business Systems

Integration with existing line of business systems is an essential element of stream analytics solutions. Often overlooked, streaming data to and from enterprise systems frequently becomes one of the most complex and time-consuming aspects of a stream analytics implementation. To address that, enterprise-ready stream analytics solutions should provide connectors to business systems that will produce or consume data streams based on system events. Stream data connectors are not only a huge time saver but also enforce consistency in how data is produced and consumed in stream analytics solutions.

Stream Replay and Simulation

Testing is one of the most difficult capabilities to implement in real-world stream analytics solutions. To efficiently test stream analytics scenarios, a solution should be able to replay specific data streams in the correct order. When evaluating stream analytics platforms, organizations should consider solutions that enable the recording and replaying of events in order to recreate real-world conditions. This capability allows enterprises to recreate different real-world scenarios by replaying the relevant streams of data.

Stream Tracking and Monitoring

Similar to the previous point, actively tracking and monitoring events is an essential element of enterprise-ready stream analytics solutions. This capability is a foundational building block for the operational management, support and compliance of stream analytics solutions. Beyond the infrastructure aspect of high-throughput event stream monitoring, enterprise-ready stream analytics solutions should provide tools that allow DevOps and IT professionals to visualize, understand and take action based on the runtime behavior of stream analytics solutions.

Security and Data Privacy

Security and data privacy are key aspects of any enterprise data solution, and stream analytics is no exception. Providing mechanisms to encrypt, sign and protect data streams is required to address many of the security and compliance requirements of mission-critical enterprise solutions. In that sense, enterprises should factor in the relevant security and data privacy capabilities when evaluating stream analytics platforms.

Interfaces for Power Users

Stream analytics is typically considered an infrastructure capability. However, many of the solutions powered by stream analytics platforms have a strong end-user component. As a result, power users often need to interact with and extend aspects of the solution in order to adapt it to new scenarios. To meet this requirement, enterprises should look for stream analytics solutions that provide robust tools for power users in order to enable important self-service capabilities.

Integration with Existing Analytic Tools

The analytics and data visualization space has evolved drastically in the last few years. Many new entrants in the market like Tableau or QlikView have become common citizens in modern enterprises. Integration with mainstream analytics tools is an important element of modern stream analytics solutions. This capability allows business users to interact with complex stream analytics infrastructures using a familiar toolset, which minimizes the friction of adopting these solutions in the enterprise.

Challenges with Traditional Stream Analytic Platform Vendors

When evaluating stream analytics solutions, organizations tend to segment the market into different groups. Based on our experience, grouping stream analytics solutions by their underlying topology (cloud or on-premise) is a very effective way to start analyzing this market. While cloud and on-premise stream analytics stacks can be very similar in their functional capabilities, they differ drastically in their underlying infrastructure as well as their management and scalability models. The next section of this document looks at the most important stream analytics platforms in both on-premise and cloud topologies.

Stream Analytics in the Cloud

Stream analytics has become an important component of platform as a service (PaaS) solutions. The elastic scalability model of cloud infrastructure drastically simplifies the implementation of stream analytics solutions.
As a result, many of the most sophisticated stream analytics platforms in the market are delivered as part of PaaS offerings.
Below we detail some of the most important players in the space.

AWS Kinesis

• Description: Amazon Kinesis is a platform for streaming data on AWS, offering powerful services that make it easy to load and analyze streaming data, and also providing the ability to build custom streaming data applications for specialized needs. Web applications, mobile devices, wearables, industrial sensors, and many software applications and services can generate staggering amounts of streaming data (sometimes terabytes per hour) that need to be collected, stored, and processed continuously.
• Key Capabilities: AWS Kinesis is based on three fundamental building blocks:
  o Kinesis Firehose: Firehose provides an easy way to load streaming data into AWS. By default, Kinesis Firehose can load data streams into AWS services like Redshift or S3. It can also batch, compress, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
  o Kinesis Analytics: Kinesis Analytics executes SQL queries against dynamic data streams produced by Kinesis Firehose. Additionally, Kinesis Analytics helps to model actions as a result of stream queries.
  o Kinesis Streams: Kinesis Streams provides a model to implement data streams that can be processed by Kinesis applications. This component of the platform provides the programming models to build connectors that generate data streams from various data systems, including the services in the AWS platform.
• Challenges: AWS Kinesis is one of the most complete stream analytics platforms in the market. However, like any new solution, it presents well-known challenges when implemented in real-world solutions. The lack of tools for non-developers as well as the limited simulation and event tracking capabilities are some of the most notable limitations of AWS Kinesis.
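
The batch-and-compress step described for Kinesis Firehose above can be illustrated with a minimal Python sketch. This is only the general idea under stated assumptions: it uses the standard library alone, it does not call any AWS API, and the function names, the batch size of 100 and the sensor events are hypothetical. It does not claim to reflect how Firehose is implemented internally.

```python
import gzip
import json


def batch_records(records, max_batch_size):
    """Group an iterable of records into lists of at most max_batch_size."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


def compress_batch(batch):
    """Serialize a batch as newline-delimited JSON and gzip it."""
    payload = "\n".join(json.dumps(r, sort_keys=True) for r in batch)
    return gzip.compress(payload.encode("utf-8"))


def decompress_batch(blob):
    """Inverse of compress_batch, used here only to check the round trip."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.split("\n")]


# Hypothetical sensor readings standing in for a real data stream.
events = [{"sensor": "s1", "reading": i % 5} for i in range(250)]

blobs = [compress_batch(b) for b in batch_records(events, max_batch_size=100)]
raw_size = sum(len(json.dumps(e, sort_keys=True)) + 1 for e in events)
compressed_size = sum(len(b) for b in blobs)
print(len(blobs), raw_size, compressed_size)
```

Grouping records before compressing them is what makes the storage saving possible: repetitive records compress far better together than one at a time. A production pipeline would typically also flush on a time limit and encrypt each batch, both of which this sketch leaves out.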

Azure Stream Analytics

• Description: Azure Stream Analytics (ASA) is a fully managed, real-time event-processing engine that helps to unlock deep insights from data. Stream Analytics makes it easy to set up real-time analytic computations on data streaming from devices, sensors, web sites, social media, applications, infrastructure systems, and more. Using the Azure portal, you can author a Stream Analytics job specifying the input source of the streaming data, the output sink for the results of your job, and a data transformation expressed in a SQL-like language. You can monitor and adjust the scale/speed of jobs in the Azure portal to scale from a few kilobytes to a gigabyte or more of events processed per second.
• Key Capabilities: Azure Stream Analytics provides some key capabilities that should be considered during its evaluation.
  o IoT Hub and Event Hubs: Azure Stream Analytics integrates with Event Hubs and IoT Hub to process large volumes of events per second and scale linearly.
  o SQL Queries: Azure Stream Analytics powers the authoring and execution of SQL-like queries over dynamic streams of data.
  o Support for Storm and Spark Streaming: In addition to its native stream analytics platform, Azure provides support for Apache Storm and Spark Streaming. As a result, solutions built on the Azure platform can take advantage of various stream analytics models depending on the requirements.
• Challenges: Similar to AWS Kinesis, Azure Stream Analytics lacks strong support and tooling for non-developers. Additionally, the platform hasn't yet developed an ecosystem of connectors that produce and consume streams of data from Azure Stream Analytics.

Salesforce.com Thunder

• Description: Based on Apache Storm, Salesforce.com Thunder is a very scalable event-processing engine, designed to ingest and orchestrate billions of events from the connected world in real time. Thunder uses smart technology to reveal insights that were previously invisible and to allow anyone to take proactive, personalized actions from any device.
• Key Capabilities: Despite being a new entrant in the market, Salesforce.com Thunder includes some key capabilities such as the following:
  o Orchestration: Salesforce.com Thunder provides an end-user-friendly model to author and manage rules that model actions based on the results extracted from data streams.
  o SQL Queries: Salesforce.com Thunder powers the authoring and execution of SQL-like queries over dynamic streams of data.
  o Integrated with Salesforce.com Cloud Products: Without any required customization, Salesforce.com Thunder integrates with different Salesforce.com Cloud products, which facilitates its immediate applicability in business scenarios.
• Challenges: Compared to platforms like AWS and Azure, Salesforce.com Thunder provides a more limited solution from the infrastructure standpoint. Additionally, Salesforce.com Thunder still lacks connectors, stream producers and even integration with third-party services outside the Salesforce.com platform.

Stream Analytics On-Premise

While cloud stream analytics platforms are incredibly advanced from the capability standpoint, they are off-limits for many organizations subject to regulatory and compliance requirements. In those scenarios, on-premise stream analytics solutions represent a powerful alternative. Among the platforms in the market, we believe Apache Storm, Spark Streaming and Apache Samza represent the most advanced technology stacks.

Apache Storm

• Description: Storm is a distributed real-time computation system for processing large volumes of high-velocity data. Storm is extremely fast, with the ability to process over a million records per second per node on a cluster of modest size. Enterprises harness this speed and combine it with other data access applications in Apache Hadoop to prevent undesirable events or to optimize positive outcomes.
• Key Capabilities:
  o Modular Architecture: Storm offers a very modular architecture in which components can be composed into complex stream analytics solutions.
  o Trident: Trident offers a simpler interface to author applications on top of Storm. This model facilitates the implementation of stream analytics applications without having to be completely familiar with Storm's architecture.
  o Connector Framework: Storm provides a number of connectors (Spouts and Bolts) to different data platforms and provides models that allow developers to implement custom connectors to new systems.
• Challenges: The biggest challenge with enterprise implementations of Storm is the lack of sophisticated, enterprise-ready tools that are well known and adopted within IT environments. However, the Storm community and implementers have slowly started to make progress in addressing some of these limitations.

Spark Streaming

• Description: Spark is rapidly becoming one of the most important big data platforms in modern solutions. Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. Data can be ingested from many sources like Kafka, Flume, Twitter, ZeroMQ, Kinesis, or TCP sockets, and can be processed using complex algorithms expressed with high-level functions like map, reduce, join and window. Finally, processed data can be pushed out to file systems, databases, and live dashboards.
• Key Capabilities:
  o Integration with Spark: Spark Streaming is tightly integrated with the other components of the Spark platform, including Spark MLlib and Spark SQL. This feature facilitates the implementation of feature-rich stream analytics solutions without abandoning the Spark platform.
  o Simplicity: Spark Streaming provides a very simple programming model that allows developers to author complex stream analytics solutions without having to become experts in the space.
  o Big Data Platform Ecosystem: Spark is being widely adopted by most big data platform vendors such as Cloudera, Hortonworks or MapR. This ecosystem is indirectly influencing the adoption of Spark Streaming as the default engine to implement stream analytics solutions in big data platforms.
• Challenges: The dependencies on Spark bring notable advantages to Spark Streaming but also introduce notable challenges. Extending stream analytics solutions beyond the capabilities provided by Spark can result in very complex implementations. Another interesting challenge of Spark Streaming solutions is the integration with third-party systems. Based on the MPP requirements of Spark, integrating third-party systems into Spark requires a considerable engineering effort compared to some of the other alternatives in the market.

Apache Samza

• Description: Apache Samza is a distributed stream-processing framework. It was initially created at LinkedIn as an alternative to platforms like Apache Storm. Samza uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management. Although Samza is a recent entrant in the stream analytics space, its current feature set is quite impressive.
• Key Capabilities:
  o Extremely Complete Stream Processing Engine: Apache Samza was designed to abstract some of the most complex aspects of stream processing from application developers. By default, Samza includes capabilities such as state management, ordered processing and data buffering, which are incredibly hard to implement in stream analytics solutions.
  o Simplicity: Apache Samza provides a very simple programming model that allows developers to author complex stream analytics solutions. Additionally, Samza was designed to support multi-language applications, even though the initial implementation has been constrained to JVM languages.
  o Integration with Apache Kafka: Samza relies on Apache Kafka for its message passing models. Also built at LinkedIn, Kafka is quickly becoming one of the most robust messaging platforms in the market and is experiencing relevant adoption in the enterprise. The current model makes Apache Samza a great choice for organizations looking to expand their Kafka infrastructure with stream processing capabilities.
• Challenges: Apache Samza is the newest entrant in the stream analytics space. In that sense, Samza still has limitations in terms of the integration with third-party systems, the robustness of its management tool stack and other capabilities required to guarantee mainstream adoption in the enterprise. Additionally, the Samza developer and customer communities are relatively small compared to other platforms.

Summary

Stream analytics is quickly becoming one of the most important elements of modern enterprise business intelligence solutions. When evaluating stream analytics solutions, organizations should look for capabilities such as integration with third-party systems and tracking and management tools that will facilitate adoption in an enterprise environment. This paper provided an overview of the key capabilities that are relevant to implementing stream analytics solutions in the enterprise. The stream analytics ecosystem can be divided between cloud and on-premise platforms. In the cloud space, platforms like Azure, AWS or Salesforce.com have released some of the most innovative stream analytics solutions in the market. Many of those solutions are based on stream processing platforms like Apache Storm, Spark Streaming or Apache Samza, which can be adopted on-premise.
This paper included an analysis of some of the key stream analytic platforms including strengths and weaknesses based on our experience in real world implementations.
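
To make the processing model from the Spark Streaming description more concrete, the sketch below imitates the map, reduce and window steps over small batches of text using only the Python standard library. It does not use the Spark API. The function names (count_words, windowed_counts), the window size and the sample data are all hypothetical and chosen only for illustration; a real deployment would rely on the chosen platform's own operators.

```python
from collections import Counter, deque
from typing import Deque, Iterable, List


def count_words(batch: Iterable[str]) -> Counter:
    """Map each line to its words, then reduce to per-word counts for one batch."""
    counts: Counter = Counter()
    for line in batch:
        counts.update(line.lower().split())
    return counts


def windowed_counts(batches: Iterable[List[str]], window_size: int) -> List[Counter]:
    """After every batch, emit word counts over the last `window_size` batches.

    This is a plain-Python stand-in for a sliding window that advances by one
    batch at a time; it is an illustration, not a Spark Streaming API.
    """
    window: Deque[Counter] = deque(maxlen=window_size)
    results: List[Counter] = []
    for batch in batches:
        # deque(maxlen=...) drops the oldest batch once the window is full.
        window.append(count_words(batch))
        total: Counter = Counter()
        for batch_counts in window:
            total.update(batch_counts)
        results.append(total)
    return results


if __name__ == "__main__":
    # Hypothetical input: three small batches of log-like lines.
    stream = [["error timeout", "ok"], ["error"], ["ok ok"]]
    for index, counts in enumerate(windowed_counts(stream, window_size=2)):
        print(index, dict(counts))
```

Each emitted result covers only the most recent batches, so counts from older batches drop out as the window slides. Stream processing engines apply the same idea at scale, adding the distribution and fault tolerance discussed in this paper.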