Srinath Perera discusses the rise of streaming SQL and the evolution of streaming applications. He covers what streaming is, how almost all new data is streaming, the stream processing market, building streaming apps, the history of stream processing, why streaming SQL is useful, common solutions with stream processing, how stream processors are stateful and need high availability, how most are resource-intensive, and the need for machine learning and advanced query authoring with stream processing. He then introduces WSO2 Stream Processor as a lightweight option for streaming applications.
We are at the dawn of digital businesses that are reimagined to make the best use of digital technologies such as automation, analytics, cloud, and integration. These businesses are efficient, continuously optimizing, proactive, flexible, and able to understand customers in detail. A key part of a digital business is analytics: the eyes and ears of the system that tracks and provides a detailed view of what was and what is, and lets decision makers predict what will be.
This session will explore how the WSO2 analytics platform:
- Plays a role in your digital transformation journey
- Collects and analyzes data through batch, real-time, interactive, and predictive processing technologies
- Lets you communicate the results through dashboards
- Brings together all analytics technologies into a single platform and user experience
SoC Keynote: The State of the Art in Integration Technology - Srinath Perera
This talk outlines the state of the art of enterprise software and how we got there, as I see it. The second part describes Ballerina, a new programming language WSO2 has built for enterprise computing.
It was presented as a keynote at the 11th Symposium and Summer School on Service-Oriented Computing.
Machine learning, or predictive analytics, has started entering our daily lives. Businesses and enterprises can use predictive analytics to improve efficiency, improve user experience, and create new business opportunities. This talk presents WSO2 Machine Learner, our experiences predicting Super Bowl winners, and a few real-life use cases. Furthermore, the talk discusses open challenges and problems people are working on.
An Architecture for Agile Machine Learning in Real-Time Applications - Johann Schleier-Smith
Presented at KDD, August 11, 2015.
Abstract of the paper:
Machine learning techniques have proved effective in recommender systems and other applications, yet teams working to deploy them lack many of the advantages that those in more established software disciplines today take for granted. The well-known Agile methodology advances projects in a chain of rapid development cycles, with subsequent steps often informed by production experiments. Support for such a workflow in machine learning applications remains primitive.
The platform developed at if(we) embodies a specific machine learning approach and a rigorous data architecture constraint, allowing teams to work in rapid iterative cycles. We require models to consume data from a time-ordered event history, and we focus on facilitating creative feature engineering. We make it practical for data scientists to use the same model code in development and in production deployment, and for them to collaborate on complex models.
We deliver real-time recommendations at scale, returning top results from among 10,000,000 candidates with sub-second response times and incorporating new updates in just a few seconds. Using the approach and architecture described here, our team can routinely go from ideas for new models to production-validated results within two weeks.
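The time-ordered event history constraint described in the abstract can be sketched in a few lines. Everything below (the event tuples and the `features_at` helper) is hypothetical, not taken from the paper; it only illustrates why replaying a log up to a cut-off time yields the same features in development and in production.

```python
from collections import defaultdict

# Hypothetical time-ordered event history: (timestamp, user_id, event_type).
events = [
    (1, "alice", "view"),
    (2, "bob", "view"),
    (3, "alice", "like"),
    (4, "alice", "view"),
]

def features_at(events, as_of):
    """Per-user event counts, computed only from events up to `as_of`.

    Replaying the same log with the same cut-off gives identical
    features wherever the model code runs."""
    counts = defaultdict(lambda: defaultdict(int))
    for ts, user, kind in events:
        if ts <= as_of:
            counts[user][kind] += 1
    return {u: dict(kinds) for u, kinds in counts.items()}

print(features_at(events, as_of=3))
# {'alice': {'view': 1, 'like': 1}, 'bob': {'view': 1}}
```

Because features are a pure function of the log and a cut-off time, a model trained on historical features can be validated against production behavior without a separate feature pipeline.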
This presentation covers how data science is applied to build effective machine learning solutions, how to build end-to-end solutions in Azure ML, and how to build, model, and evaluate algorithms in Azure ML.
Use cases and examples using Apache Spark, presented at the Hadoop User Group (UK) November 2014 Hadoop Meetup
http://www.meetup.com/hadoop-users-group-uk/events/217791892/
This is part of a presentation given at Global Azure BootCamp 2017, Mohali location.
We talked about how to get started with your first data science experiment using Azure Machine Learning Studio.
The Machine Learning Workflow with Azure - Ivo Andreev
Machine learning is not black magic but a discipline that involves data analysis, data science, and of course, hard work. From finding patterns in data and applying algorithms to converting results into usable predictions, you need background knowledge and appropriate tools. In this session, we will go through the major approaches to preparing data and building and deploying ML models in Azure (ML Studio, Data Science VM, Jupyter Notebook). Most importantly, based on some examples from the real world, we will provide you with a workflow of best practices.
Intro to Machine Learning with H2O and AWS - Sri Ambati
Navdeep Gill @ Galvanize Seattle - May 2016
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Afternoons with Azure - Azure Machine Learning - CCG
A journey through programming languages, such as R and Python, that can be used for machine learning. Next, explore Azure Machine Learning Studio to see the interconnectivity.
For more information about Microsoft Azure, call (813) 265-3239 or visit www.ccganalytics.com/solutions
Building predictive models in Azure Machine Learning - Mostafa
This presentation covers how to build machine learning models and drive insights from data. The session shows how to develop and train models in Python/R using Azure Machine Learning, explores key concepts in data acquisition, preparation, exploration, and visualization, and takes a look at how to build a predictive solution using Azure Machine Learning, R, and Python. It also shares tips and tricks on selecting the right algorithm for your data science problem and how to utilize machine learning to solve it.
Global Big Data Conference Hyderabad - 2 Aug 2013 - Finance/Manufacturing Use Cases - Sanjay Sharma
Financial institutions today are under intense pressure to provide more value-add to customers, reduce IT costs, and grow year over year. This challenge has been further complicated by the huge amounts of data being generated as well as the mandatory federal compliance requirements in place.
Similarly, the manufacturing industry today faces the challenge of processing huge amounts of data in real time and predicting failures as early as possible to reduce cost and increase production efficiency.
The session will cover some high-level big data use cases applicable to the financial and manufacturing domains and how big data technologies are being used successfully to solve these challenges, with examples from the credit card/banking industry in the financial domain and semiconductor production in the manufacturing domain.
This talk presents how three Scala libraries - Smile, Saddle, and Spark ML - satisfy the requirements of new big data science projects, illustrated with a click-through rate prediction example.
Building Real Time Targeting Capabilities - Ryan Zotti, Subbu Thiruppathy - C... - Sri Ambati
A team of data and software engineers and data scientists at Capital One are experimenting with various technologies to enable lightning-fast promotional content that visitors will see when they visit Capital One’s website looking to apply for a credit card. In this presentation we’ll first talk about some of the technologies that we’re exploring, such as the Akka-based Play framework, and H2O, a popular open source machine learning library. We will explore our evolution of data science and the H2O tools used to create the groundwork for continuous and automated testing and optimization, with the ability to scale across the entire company. We'll then conclude with a quick demo followed by a few tips and tricks that we learned along the way. #h2ony
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Machines and the Magic of Fast Learning - SingleStore
Human-machine interaction is no longer the exclusive province of science fiction. The advance of the internet and connected devices has inspired data scientists to create machine-learning applications to extract value from these new forms of data.
So what's the next frontier?
Join MemSQL Engineer Michael Andrews and Sr. Director Mike Boyarski to learn how to use real-time data as a vehicle for operationalizing machine-learning models. Michael and Mike will explore advanced tools, including TensorFlow, Apache Spark, and Apache Kafka, and compelling use cases demonstrating the power of machine learning to effect positive change.
You will learn:
Top technologies for building the ideal machine-learning stack
How to power machine-learning applications with real-time data
A use case and demo of machine learning for social good
A slide deck introducing machine learning with Azure Machine Learning Service, focusing on deployment of models with the machine learning SDK and consumption of the models with Python and Go.
Afternoons with Azure - Power BI and Azure Analysis Services - CCG
See how Microsoft Power BI and Azure Analysis Services are influencing the BI and analytics market. Journey through data structures and fundamentals for setting up your next dashboard initiative.
Interested in learning more? Visit ccganalytics.com/resources or call (813) 265-3239.
Emerging Prevalence of Data Streaming in Analytics and its Business Significance - Amazon Web Services
Learning Objectives:
- Get an overview of streaming data and its application in analytics and big data.
- Understand the factors driving the accelerating transformation of batch processing to real-time.
- Learn how you should plan for incorporating data streaming in your analytics and processing workloads.
Businesses can now easily perform real-time analytics on data that has traditionally been analyzed using batch processing in data warehouses or using Hadoop frameworks, and react to new information in minutes or seconds instead of hours or days. In this webinar, Forrester analyst Mike Gualtieri and Amazon Kinesis GM Roger Barga will discuss this prevalent trend, its business significance, and how you should plan for it. You will also learn about the AWS services that can help you get started quickly with real-time streaming applications for your analytics and big data workloads.
This slide deck explores trends in stream processing, how streaming SQL has become a standard, the advantages of streaming SQL and more.
View video: https://wso2.com/library/conference/2018/07/wso2con-usa-2018-the-rise-of-streaming-sql/
What is stream processing? The evolution of streaming SQL, its advantages and challenges, and how we can overcome them. Presented at WSO2Con USA 2018.
This session takes an in-depth look at:
- Trends in stream processing
- How streaming SQL has become a standard
- The advantages of streaming SQL
- Ease of development with streaming SQL: Graphical and Streaming SQL query editors
- Business value of streaming SQL and its related tools: Domain-specific UIs
- Scalable deployment of streaming SQL: Distributed processing
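As a rough illustration of what a windowed streaming SQL aggregation expresses, the following is a minimal Python sketch of a tumbling-window count. The function name and event format are illustrative assumptions only; real streaming SQL dialects (e.g. Siddhi, KSQL) each have their own syntax for the same idea.

```python
from collections import defaultdict

def tumbling_window_counts(timestamps, width):
    """Count events per fixed (tumbling) window of `width` seconds,
    roughly what `SELECT count(*) ... GROUP BY window` expresses in
    a streaming SQL dialect."""
    counts = defaultdict(int)
    for ts in timestamps:
        counts[(ts // width) * width] += 1  # key by window start time
    return dict(counts)

# Event timestamps in seconds, bucketed into 60-second windows.
print(tumbling_window_counts([3, 10, 65, 70, 130], width=60))
# {0: 2, 60: 2, 120: 1}
```

The appeal of streaming SQL is that this bucketing, state management, and incremental emission of results are handled declaratively by the engine rather than hand-coded as above.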
Streaming Data Ingest and Processing with Apache Kafka - Attunity
Apache™ Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system. It offers high throughput, reliability, and replication. To manage growing data volumes, many companies are leveraging Kafka for streaming data ingest and processing.
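For readers new to the model, here is a toy in-memory sketch of publish-subscribe messaging. This is an illustration of the pattern only, not Kafka's actual API, and it deliberately omits partitions, replication, persistence, and consumer groups.

```python
class Broker:
    """Minimal in-memory publish-subscribe broker (illustrative only)."""

    def __init__(self):
        self.log = {}          # topic -> list of messages (append-only log)
        self.subscribers = {}  # topic -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        """Register a callback to receive every future message on `topic`."""
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        """Append to the topic's log and fan out to all subscribers."""
        self.log.setdefault(topic, []).append(message)
        for cb in self.subscribers.get(topic, []):
            cb(message)

broker = Broker()
received = []
broker.subscribe("db-changes", received.append)
broker.publish("db-changes", {"op": "insert", "row": 42})
print(received)  # [{'op': 'insert', 'row': 42}]
```

The decoupling shown here, where producers never address consumers directly, is what lets a system like Kafka feed the same database change stream to many independent downstream processors.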
Join experts from Confluent, the creators of Apache™ Kafka, and the experts at Attunity, a leader in data integration software, for a live webinar where you will learn how to:
-Realize the value of streaming data ingest with Kafka
-Turn databases into live feeds for streaming ingest and processing
-Accelerate data delivery to enable real-time analytics
-Reduce skill and training requirements for data ingest
The recorded webinar on slide 32 includes a demo using automation software (Attunity Replicate) to stream live changes from a database into Kafka and also includes a Q&A with our experts.
For more information, please go to www.attunity.com/kafka.
Neha Narkhede talks about the experience at LinkedIn moving from batch-oriented ETL to real-time streams using Apache Kafka and how the design and implementation of Kafka was driven by this goal of acting as a real-time platform for event data. She covers some of the challenges of scaling Kafka to hundreds of billions of events per day at LinkedIn, supporting thousands of engineers, etc.
Organizational success depends on our ability to sense the environment, grab opportunities, and eliminate threats in real time. Such real-time processing is now available to all organizations (with or without a big data background) through the new WSO2 Stream Processor.
These slides present WSO2 Stream Processor's new features and improvements and explain how they help an organization excel in the current competitive marketplace. Some key features we will consider are:
* WSO2 Stream Processor’s highly productive developer environment, with graphical drag-and-drop, and the Streaming SQL query editor
* The ability to process real-time queries that span from seconds to years
* Its interactive visualization and dashboarding features with improved widget generation
* Its ability to process at scale via distributed deployments with full observability
* Default support for HTTP analytics, distributed message trace analytics, and Twitter analytics
Independent of the source of data, the integration of event streams into an enterprise architecture is becoming more and more important in the world of sensors, social media streams, and the Internet of Things. Events have to be accepted quickly and reliably, and they have to be distributed and analysed, often with many consumers or systems interested in all or part of the events. Storing such huge event streams into HDFS or a NoSQL datastore is feasible and no longer such a challenge. But if you want to be able to react fast, with minimal latency, you cannot afford to first store the data and do the analysis later. You have to be able to include part of your analytics right after you consume the data streams.
Products for doing event processing, such as Oracle Event Processing or Esper, have been available for quite a long time and used to be called Complex Event Processing (CEP). In the past few years, another family of products has appeared, mostly out of the big data technology space, called Stream Processing or Streaming Analytics. These are mostly open source products/frameworks such as Apache Storm, Spark Streaming, Flink, and Kafka Streams, as well as supporting infrastructures such as Apache Kafka. In this talk I will present the theoretical foundations for stream processing, discuss the core properties a stream processing platform should provide, and highlight the differences you might find between the more traditional CEP and the more modern stream processing solutions.
What to Expect for Big Data and Apache Spark in 2017 - Databricks
Big data remains a rapidly evolving field with new applications and infrastructure appearing every year. In this talk, Matei Zaharia will cover new trends in 2016 / 2017 and how Apache Spark is moving to meet them. In particular, he will talk about work Databricks is doing to make Apache Spark interact better with native code (e.g. deep learning libraries), support heterogeneous hardware, and simplify production data pipelines in both streaming and batch settings through Structured Streaming.
Speaker: Matei Zaharia
Video: http://go.databricks.com/videos/spark-summit-east-2017/what-to-expect-big-data-apache-spark-2017
This talk was originally presented at Spark Summit East 2017.
Time's Up! Getting Value from Big Data Now - Eric Kavanagh
The Briefing Room with Dr. Robin Bloor and CASK
We all know the promise of big data, but who gets the value? There are plenty of success stories already, and most of them involve one key ingredient: facilitated access to important data sets. Most research studies suggest that the Pareto principle applies: 80 percent of the effort goes to data integration, and only 20 percent to analysis. Inverting that balance is the Holy Grail.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor explain why the time has finally come for turning the tables on the status quo in analytics. He'll be briefed by CASK CEO Jonathan Gray, who will showcase his company's big data integration platform, CDAP, which was specifically designed to expedite time-to-value for big data.
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases - Amazon Web Services
In this session, you will learn best practices for implementing simple to advanced real-time streaming data use cases on AWS. First, we’ll review decision points on near-real-time versus real-time scenarios. Next, we will take a look at streaming data architecture patterns that include Amazon Kinesis Analytics, Amazon Kinesis Firehose, Amazon Kinesis Streams, Spark Streaming on Amazon EMR, and other open source libraries. Finally, we will dive deep into the most common of these patterns and cover design and implementation considerations.
Big Data Day LA 2016 / Big Data Track - Building scalable enterprise data flow... - Data Con LA
Connecting enterprise systems has always been a tough task. Modern IoT applications have exacerbated the issue by the need to integrate legacy systems with novel high velocity data streams. Various patterns like messaging, REST, etc. have been proposed, but they necessitate rearchitecting the integration layer which is extremely arduous. In this talk we will show you how to use Apache NiFi to solve your data integration, movement and ingestion problems. Next, we will examine how Apache NiFi can be used to construct durable, scalable and responsive IoT apps in conjunction with other stream processing and messaging frameworks.
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote - StreamNative
In this talk, Till Rohrmann and Addison Higham discuss how Flink allows for ambitious stream processing workflows and how Pulsar and Flink enable new capabilities that push forward the state-of-the-art in streaming. They will also share upcoming features and new capabilities in the integrations between Flink and Pulsar and how these two communities are working together to truly advance the power of stream processing.
10 Big Data Technologies you Didn't Know About - Jesus Rodriguez
This session covers 9 new and exciting big data technologies that are starting to become relevant in the enterprise. The session focuses on technologies that are still not mainstream but have the potential to influence the next generation of enterprise big data solutions.
Book: Software Architecture and Decision-Making - Srinath Perera
Uncertainty is the leading cause of mistakes made by practicing software architects. The primary goal of architecture is to handle uncertainty arising from use cases as well as architectural techniques. The book discusses how to make architectural decisions and manage uncertainty. From the book, you will learn common problems encountered while designing a system, a default solution for each, more complex alternatives, and the 5Q & 7P (Five Questions and Seven Principles) that help you choose.
Book, https://amzn.to/3v1MfZX
Blog: http://tinyurl.com/swdmblog
Six min video - https://youtu.be/jtnuHvPWlYU
We have critically evaluated how AI will shape integration use cases, their feasibility, and timelines. Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, is the methodology of our study.
We observe that AI can significantly impact integration use cases and identify 13 AI-based use case classes for integration. Points to note include:
Enabling AI in an enterprise involves collecting, cleaning up, and creating a single representation of data as well as enforcing decisions and exposing data outside, each of which leads to many integration use cases. Hence, AI indirectly creates demand for integration.
AI needs data, which in some cases leads to significant competitive advantages. The need to collect data will drive vendors to offer most AI products in the cloud through APIs.
Due to a lack of expertise and data, custom AI model building will be limited to large organizations. It is hard for small and medium-sized organizations to build and maintain custom models.
The Role of Blockchain in Future IntegrationsSrinath Perera
We have critically evaluated blockchain-based integration use cases, their feasibility, and timelines. Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, is the methodology of our study. Based on our analysis, we observe that blockchain can significantly impact integration use cases.
In our paper, we identify 30-plus blockchain-based use cases for integration and four architecture patterns. Notably, each use case we identified can be implemented using one of the architecture patterns. Furthermore, we also discuss challenges and risks posed by blockchains that would affect these architecture patterns.
Our webinar presents a critical analysis of serverless technology and our thoughts about its future. We use Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, as the methodology of our study. Based on our analysis, we believe that serverless can significantly impact applications and software development workflows.
We’ve also made two further observations:
Limitations, such as tail latencies and cold starts, are not deal breakers for adoption. There are significant use cases that can work with existing serverless technologies despite these limitations.
We see a significant gap in required tooling and IDE support, best practices, and architecture blueprints. With proper tooling, it is possible to train existing enterprise developers to program with serverless. If proper tools are forthcoming, we believe serverless can cross the chasm in 3-5 years.
A detailed analysis can be found here: A Survey of Serverless: Status Quo and Future Directions. Join our webinar as we discuss this study, our conclusions, and evidence in detail.
1. Blockchain's potential impact is real. If successful, blockchain technologies can transform the way we live our day-to-day lives.
2. We believe the technology is ready for limited applications in digital currency, lightweight financial systems, ledgers (of identity, ownership, status, and authority), provenance (e.g. supply chains and other B2B scenarios), and disintermediation, which we believe will happen in the next three years.
3. However, with other use cases, blockchain faces significant challenges such as performance, irrevocability, the need for regulation, and the lack of consensus mechanisms. These are hard problems that will likely take years to solve.
4. It is not clear whether blockchain can sustain the current level of effort for an extended period of 5+ years. There are many startups, and they run the risk of running out of money before markets are ready. Failure of startups can inhibit further funding and investment.
5. The value and need of decentralization compared to centralized and semi-centralized alternatives are not clear.
A Visual Canvas for Judging New Technologies - Srinath Perera
In the fast-changing technology world, the technology landscape shifts faster and faster. The agents of these changes are emerging technologies, which sometimes create, destroy, or transform entire segments. In a shifting world, prevailing advantages are fleeting. Organizations that can master change and ride technology waves own the future.
Not all emerging technologies live up to their promise. Every year, as part of annual planning, most organizations need to decide the relevance, impact, and probability of success of emerging technologies and pick their bets. Although this is a regular decision, there is no widely accepted framework for evaluating emerging technologies.
As a solution to this problem, we present the Emerging Technology Analysis Canvas (ETAC), a framework to assess an individual emerging technology. Inspired by the Business Model Canvas, it represents different aspects of a technology visually on a single page. The approach includes a set of questions that probe the technology, arranged around a logical narrative. The visual representation is concise, compact, and comprehensible at a glance.
The talk discusses how analytics can attack privacy and what we can do about it. It covers the legal responses (e.g. GDPR) as well as technical responses (differential privacy and homomorphic encryption).
The video is available at https://www.facebook.com/eduscopelive/videos/314847475765297/ starting at 1:18.
Blockchain is often cited as one of the most impactful technologies along with AI. It has attracted many startups, venture investments, and academic research. If successful, blockchain technologies can transform the way we live our day-to-day lives.
However, blockchain faces significant challenges such as performance, irrevocability, the need for regulation, and the lack of consensus mechanisms. These are hard problems, and it will likely take at least 5-10 years to find answers to them.
Given the risk involved as well as the significant potential returns, we recommend a cautiously optimistic approach for blockchain with the focus on concrete use cases.
Today's Technology and Emerging Technology Landscape - Srinath Perera
We have seen the rise and fall of many technologies, some disappearing without a trace while others redefined the world. Collectively they have shaped our world beyond recognition. In this talk, Srinath will start with past technologies, exploring their behavior. Then he will explore the current middleware landscape, its composition, and the relationships between different segments. He will discuss significant developments and their future. Further, he will discuss emerging technologies, the forces that shape them, and the promise of each technology, and finally speculate about their evolution. You will walk away with knowledge of the evolution of middleware, the status quo, and how, at WSO2, we think those technologies will evolve.
Some died, some get by, but some have woven themselves into today's middleware so much that we do not notice them. The point I want to make is that not all emerging technologies are fads. Some are, and some are too early, like AI. But some are lasting.
Analytics and AI: The Good, the Bad and the Ugly - Srinath Perera
Analytics lets us question the data, which in effect questions the world around us. This lets us understand, monitor, and shape the world. AI lets us discover connections, predict possible futures, and automate tasks.
These twin technologies can change the world around us. On one hand, they make us efficient, connected, and fulfilled. At the same time, the change of the status quo can replace jobs, affect lives, and build biases into our systems that can marginalize millions.
In this talk, we will discuss core ideas behind analytics and AI, their possible impact, both good and bad outcomes, and challenges.
The dawn of digital businesses is upon us, with reimagined business models that make the best use of digital technologies such as automation, analytics, integration and cloud. Digital businesses are efficient, continuously optimizing, proactive, flexible and are able to fully understand their customers. Analytics is a key technology that helps in doing so. It acts as the eyes and ears of the system and provides a holistic view on the past and present so that decision-makers can predict what will happen in the future. This webinar will explore
Why becoming a digital business is not a choice
The role of analytics in digital transformation with examples
How best to leverage state of the art analytics technology
How can we filter the truth from lies and the complex shades between the two? In a time of data avalanche, this is a skill that serves both our careers and our lives.
In this talk, we will discuss where to find information, the importance of sources, understanding bias and conflicts of interests, and finally how to communicate our conclusions with their associated confidence.
Techniques to optimize the PageRank algorithm usually fall into two categories. One is to reduce the work per iteration, and the other is to reduce the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) helps avoid duplicate computations and thus could also reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance; the final ranks of chain nodes can be calculated directly, which could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
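The convergence-skipping idea above can be sketched in a few lines: a power-iteration PageRank that stops recomputing vertices whose rank has stabilized. The graph, damping factor, and tolerance below are illustrative assumptions, not part of the STICD paper.

```python
def pagerank_skip_converged(out_links, damping=0.85, tol=1e-6, max_iter=100):
    """out_links: dict vertex -> list of out-neighbours (no dangling nodes)."""
    n = len(out_links)
    ranks = {v: 1.0 / n for v in out_links}
    # Build reverse adjacency so each vertex pulls rank from its in-neighbours.
    in_links = {v: [] for v in out_links}
    for u, outs in out_links.items():
        for v in outs:
            in_links[v].append(u)
    converged = set()
    for _ in range(max_iter):
        new_ranks = dict(ranks)
        for v in out_links:
            if v in converged:
                continue  # skip work on already-converged vertices
            rank = (1 - damping) / n + damping * sum(
                ranks[u] / len(out_links[u]) for u in in_links[v])
            if abs(rank - ranks[v]) < tol:
                converged.add(v)
            new_ranks[v] = rank
        ranks = new_ranks
        if len(converged) == n:
            break  # every vertex converged: no more iterations needed
    return ranks

# Tiny 3-vertex cycle: every vertex ends up with rank 1/3.
ranks = pagerank_skip_converged({"a": ["b"], "b": ["c"], "c": ["a"]})
```

A real implementation would add the other optimizations (in-identical vertices, chain short-circuiting, per-component topological ordering) on top of this skeleton.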
The Rise of Streaming SQL and Evolution of Streaming Applications
1. The Rise of Streaming SQL and Evolution of Streaming Applications
Srinath Perera, VP Research, WSO2 (srinath@wso2.com)
2. What is Streaming?
• A stream is a series of events
• Query data streams
• Detect conditions fast (within 10ms-1m of receiving the data), e.g. receive an alert by querying a data stream coming from a temperature sensor and detecting when the temperature has reached freezing point
3. Almost all new data is Streaming
Almost all new data is streams; even batch data is, at some point, potential streaming data. One can choose to consume it as streaming data or batch data based on the value of responding to it fast.
• Transaction data
• Log data
• Sensor data
• Health data
• Traffic data
4. Stream Processing Market
Market: $200-500M, growing at 30%. A lack of proficient developers is slowing it down, and success depends on analytics. Positive trends:
• Microservices and observability
• Security analytics
• EDA and messaging
Many analytics and machine learning use cases will eventually shift to stream processing. Stream processing and IoT depend on each other.
5. Building a Streaming App
Code it yourself:
• Publish data to a message topic
• Write an actor: subscribe, process, and put results back to a topic
Use a Streaming SQL based stream processor:
• Just write Streaming SQL (discussed later)
Use a stream processor:
• You just write the actor; the stream processor handles data flow, scale, and failures
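The "code it yourself" option above can be sketched with an in-memory stand-in for a message topic; `Topic`, `temperature_actor`, and the event shape are hypothetical names for illustration, not a real broker API.

```python
from queue import Queue

class Topic:
    """A minimal in-memory stand-in for a message topic (hypothetical)."""
    def __init__(self):
        self._q = Queue()

    def publish(self, event):
        self._q.put(event)

    def poll(self):
        """Return the next event, or None when the topic is drained."""
        return None if self._q.empty() else self._q.get()

def temperature_actor(in_topic, out_topic, limit=350):
    """The actor from slide 5: subscribe, process, put back to a topic."""
    while (event := in_topic.poll()) is not None:
        if event["t"] > limit:  # the 'process' step: a simple condition
            out_topic.publish({"bid": event["bid"], "alert": True})

# Publish raw sensor events, run the actor, collect alerts.
raw, alerts = Topic(), Topic()
raw.publish({"bid": "B1", "t": 210})
raw.publish({"bid": "B1", "t": 400})
temperature_actor(raw, alerts)
```

With a real stream processor, the `Topic` plumbing, scaling, and failure handling disappear and only the actor (or a streaming SQL query) remains.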
6. History of Stream Processing
Started with active databases: users wanted to act when data met a condition (e.g. TelegraphCQ, based on PostgreSQL). People thought about this outside of databases as well, along two branches: Stream Processing and Complex Event Processing.
7. History of Stream Processing
Stream Processing:
• Create a graph of actors and run them using many machines
• e.g. Aurora, PIPES, STREAM, Borealis (academic)
Complex Event Processing:
• Provide a query language, focused on event matching on 1-2 nodes
• e.g. SASE, Esper, Cayuga, Siddhi (powers WSO2 SP), Apama, IBM InfoSphere
Niche applications: stock markets, monitoring and alerts, surveillance.
8. Stream Processing enters Big Data
Yahoo S4 (2010) and Twitter Storm (2011), both later donated to Apache, were described as "like Hadoop, but realtime" and gained wide adoption and visibility. Spark Streaming, Samza, and Flink followed.
9. Rise of Streaming SQL
Apache-based stream processing engines used code as their API, while Big Data switched from MapReduce to SQL. The Stream Processing and CEP branches merged to support SQL over many nodes: Streaming SQL.
• Apache Storm
• Apache Flink
• WSO2 CEP -> WSO2 SP
• Apache Kafka (KSQL)
• Apache Samza and Calcite
10. What is Streaming SQL?
A stream is a never-ending table: think of a table where new data (events) keeps being added.
Time      bID  T
07:23:30  B1   210
07:23:37  B1   234
…
Streaming SQL is SQL written against such a never-ending table:
Select bid, t*9/5 + 32 as tF from BoilerStream where t > 350
Unlike SQL, which returns data when the query is done, Streaming SQL outputs data as new events are added: you get a trigger whenever data matches.
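A minimal sketch of how the query above behaves, reusing the BoilerStream fields from the slide: output is produced per matching event as it arrives, rather than once when the query finishes.

```python
def streaming_query(event):
    """One streaming-SQL step: where t > 350, select bid, t*9/5 + 32 as tF."""
    if event["t"] > 350:
        return {"bid": event["bid"], "tF": event["t"] * 9 / 5 + 32}
    return None  # event did not match; no output fires

# Events arrive one at a time from the never-ending table; each matching
# event triggers an output immediately.
results = []
for ev in [{"bid": "B1", "t": 210}, {"bid": "B1", "t": 360}]:
    out = streaming_query(ev)
    if out is not None:
        results.append(out)
```

A real engine (Siddhi, KSQL, Flink SQL) compiles the declarative query into this kind of per-event evaluation, plus windows, joins, and patterns.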
11. Why Streaming SQL?
• Easy to learn for the many people who already know SQL
• It's expressive, short, sweet, and fast
• Manipulate streaming data declaratively without having to write code
• Core operations cover 90% of use cases without code; the rest is handled via extensions
• A query engine can better optimize execution with a streaming SQL model
12. Common Solutions with SP
• Detect a condition and trigger an alert that brings the user back to a dashboard
• Detect a condition and update a dashboard
• Detect a condition and trigger an action
• Train an ML model, apply it over streaming data, and switch models as they drift
• Calculate short-term values, store them long term in a database, and show a single view
A condition can be:
• a simple limit
• a complex trend over time
• correlations across streams
• a machine learning model
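The first pattern above (detect a condition that is a complex trend over time, then trigger an alert) can be sketched as a sliding-window average check; the window length and threshold are illustrative assumptions.

```python
from collections import deque

class WindowAlert:
    """Fire an alert when the sliding-window average crosses a limit."""
    def __init__(self, size=3, threshold=100.0):
        self.window = deque(maxlen=size)  # keeps only the last `size` values
        self.threshold = threshold

    def on_event(self, value):
        """Process one event; return True when the trend condition holds."""
        self.window.append(value)
        avg = sum(self.window) / len(self.window)
        # Only alert once the window is full, so a single spike cannot fire it.
        return len(self.window) == self.window.maxlen and avg > self.threshold

detector = WindowAlert()
alerts = [detector.on_event(v) for v in [90, 95, 100, 120, 130]]
```

In a streaming SQL engine the same logic would be one windowed query; this shows what the engine does for you per event.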
13. Stream Processors are Stateful
Stream processors work off memory; that is the secret of their performance at 50K+ events per second throughput. Most stream queries are stateful (e.g. patterns, windows, joins), and stream queries never end. When a stream processor fails, which it eventually must, the streaming app will lose its state. To avoid this, stream processors must have HA.
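Why statefulness forces HA can be sketched with a windowed count whose state lives in memory: a standby node must restore a snapshot, or the never-ending query's accumulated state is lost. The dict-copy snapshot here is an illustrative stand-in for real checkpointing to durable storage.

```python
class StatefulCounter:
    """A stateful streaming operator: a running count per key."""
    def __init__(self):
        self.counts = {}  # in-memory state: the secret of the performance

    def on_event(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1

    def snapshot(self):
        return dict(self.counts)  # checkpoint the state for failover

    def restore(self, snap):
        self.counts = dict(snap)  # a standby node resumes from here

node = StatefulCounter()
for k in ["a", "b", "a"]:
    node.on_event(k)
snap = node.snapshot()

# Simulate failure: a fresh node restores the checkpoint instead of
# restarting the never-ending query with empty state.
standby = StatefulCounter()
standby.restore(snap)
```

Real processors checkpoint to durable storage (and replay events since the last checkpoint), which is why HA deployments need extra nodes.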
14. Most Stream Processors are Obese
Most famous stream processors come from large internet companies. Their use cases are large, and so are their deployments: 5+ nodes are not a problem at that scale. However, given that a stream processor can do 50,000 events per second, most use cases need only one node. Yet most stream processors need 5+ nodes to set up an HA environment, so minimal HA size matters.
15. Stream Processing needs ML
Stream processing is the real-time extension of batch processing, so most batch ML use cases will apply in realtime as well. Two approaches:
• Train models offline and apply them online; when a model drifts from the data, retrain and swap the model
• Use streaming machine learning that learns on the fly
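The train-offline/apply-online/swap-on-drift loop above can be sketched as follows; the drift signal (a rolling prediction error above a limit) and the retrain hook are illustrative assumptions, not a specific WSO2 SP feature.

```python
from collections import deque

class DriftingModelRunner:
    """Apply an offline-trained model online; retrain and swap on drift."""
    def __init__(self, model, window=5, limit=0.5):
        self.model = model                   # trained offline
        self.errors = deque(maxlen=window)   # rolling prediction errors
        self.limit = limit
        self.swaps = 0

    def on_event(self, x, y_true, retrain):
        pred = self.model(x)
        self.errors.append(abs(pred - y_true))
        # When the rolling error drifts past the limit, retrain and swap.
        if (len(self.errors) == self.errors.maxlen
                and sum(self.errors) / len(self.errors) > self.limit):
            self.model = retrain()
            self.errors.clear()
            self.swaps += 1
        return pred

# The data starts as y = x, then drifts to y = x + 1 halfway through;
# retrain() hands back a model fitted to the new regime.
runner = DriftingModelRunner(model=lambda x: float(x))
for x in range(10):
    y_true = float(x) if x < 5 else x + 1.0
    runner.on_event(x, y_true, retrain=lambda: (lambda x: x + 1.0))
```

Streaming ML libraries learn incrementally instead, avoiding the explicit swap at the cost of a harder-to-audit model.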
16. SP needs Advanced Query Authoring Environments
The lack of programmers who are comfortable with stream processing is holding it back. Stream processing queries are like regular expressions:
• based on simple rules
• very powerful
• tough on new programmers
We need integrated development environments that let developers write, simulate, debug, trace, and verify queries.
17. Stream Processors So Far
• Introduction to stream processing
• Two branches: Stream Processing and CEP
• Apache Storm and inclusion into Big Data
• Rise of Streaming SQL
• Stateful and need HA
• Obese
• Need ML
• Need authoring tools
19. When to use WSO2 SP?
• When you want to detect complex patterns over time
• When you want to fuse data in motion and data at rest in the same application
• When you are not sure about the final load (scale with Kafka using the same queries)
• When you want to do ML within your queries
• When your load is less than 100,000 events/sec (WSO2 SP supports this with just two nodes)
• When you want your end users to tweak your queries
20. Next Steps
• Check out WSO2 Stream Processor
• Learn about streaming applications with 13 Stream Processing Patterns for building Streaming Applications
• Learn about Streaming SQL with Streaming SQL 101
• Webinar: Distributed Stream Processing with WSO2 SP
• Webinar: WSO2 Stream Processor