Presentation I gave at the IBM Big Data Developers meetup group in San Jose, CA.
There is also a video available of this talk at:
https://www.youtube.com/watch?v=TSt49yPBmW0&t=7m59s
Video and slides are synchronized; mp3 and slide downloads are available at http://bit.ly/1FQYcP0.
Gian Merlino presents the advantages, challenges, and best practices to deploying and maintaining lambda architectures in the real world, using the infrastructure at Metamarkets as a case study. Filmed at qconsf.com.
Gian Merlino is a senior software engineer at Metamarkets, responsible for the infrastructure behind its data ingestion pipelines, and a committer on the Druid project.
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an... (DataWorks Summit)
The Central Bank of the Republic of Turkey is primarily responsible for steering the monetary and exchange rate policies in Turkey.
One of the major core functions of the Bank is market operations. In this context, analyzing and interpreting real-time tick data related to money market instruments has become not only a requirement but also a challenge.
For this use case, an API provided by one of the financial data vendors has been used to gather real-time tick data and data routing has been orchestrated by Apache NiFi.
Gathered data is being transferred to Kafka topics and then handed off to Druid for real-time indexing tasks.
Indicators such as effective cost, bid-ask spread, price impact measures, and return reversal are calculated using Apache Storm and finally visualized with Apache Superset, giving decision-makers a new set of tools.
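The talk computes these indicators in Apache Storm; as an illustration only, the arithmetic behind two of them can be sketched in plain Python on a few hypothetical ticks. The formulas below use one common textbook definition of each measure; the exact definitions used by the Bank are not given in the abstract.

```python
# Illustration only: hypothetical ticks, common textbook definitions of the
# indicators named above (not necessarily the Bank's exact formulas).

def quoted_spread(bid, ask):
    """Relative bid-ask spread, measured against the quote midpoint."""
    mid = (bid + ask) / 2
    return (ask - bid) / mid

def effective_cost(trade_price, bid, ask, side):
    """Effective cost: twice the signed distance of the trade from the midpoint.
    side is +1 for a buyer-initiated trade, -1 for seller-initiated."""
    mid = (bid + ask) / 2
    return 2 * side * (trade_price - mid) / mid

# (bid, ask, trade_price, side) -- hypothetical values
ticks = [
    (99.90, 100.10, 100.05, +1),
    (99.95, 100.05, 99.97, -1),
]
for bid, ask, px, side in ticks:
    print(round(quoted_spread(bid, ask), 6),
          round(effective_cost(px, bid, ask, side), 6))
```

In the architecture described, a Storm bolt would apply functions like these to each tick consumed from the Kafka topic before the results reach Druid.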
Hadoop application architectures - using Customer 360 as an example (hadooparchbook)
Hadoop application architectures - using Customer 360 (more generally, Entity 360) as an example. By Ted Malaska, Jonathan Seidman and Mark Grover at Strata + Hadoop World 2016 in NYC.
Fast Data: A Customer’s Journey to Delivering a Compelling Real-Time Solution (Guido Schmutz)
This is my part of the Open World 2014 presentation on Fast Data and Oracle Event Processing (OEP) 12c.
It contains an architecture discussion with some architecture patterns of where Events are useful. The 2nd part is a demo showcase showing OEP12c and BAM12c in action, analyzing the live OOW2014 twitter feed.
Architecting a next-generation data platform (hadooparchbook)
Slides for Architecting a next-generation data platform at Strata + Hadoop World, London 2017.
https://conferences.oreilly.com/strata/strata-eu/public/schedule/detail/57652
GPU-Accelerating UDFs in PySpark with Numba and PyGDF (Keith Kraus)
With advances in computer hardware such as 10-gigabit network cards, InfiniBand, and solid state drives all becoming commodity offerings, the new bottleneck in big data technologies is very commonly the processing power of the CPU. To meet the computational demand of their users, enterprises have had to resort to extreme scale-out approaches just to get the processing power they need. One of the best-known technologies in this space, Apache Spark, has numerous enterprises publicly talking about the challenges of running multiple 1000+ node clusters to give their users the processing power they need. This talk is based on work completed by NVIDIA’s Applied Solutions Engineering team. Attendees will learn how they were able to GPU-accelerate UDFs in PySpark using open source technologies such as Numba and PyGDF, the lessons they learned in the process, and how they were able to accelerate workloads with a fraction of the hardware footprint.
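The talk itself relies on Numba and PyGDF compiling UDFs for the GPU; without those dependencies, the underlying idea can still be sketched in plain Python: evaluate the UDF over a whole column (a batch) at once instead of invoking it once per row, which is what makes the work amenable to a compiled GPU kernel. All names here are illustrative.

```python
# Dependency-free sketch of the batching principle behind GPU-accelerated UDFs.
# In the real system, Numba would compile the columnar loop into a GPU kernel
# operating on a PyGDF/GPU dataframe column; here both paths are pure Python.

def udf_scalar(x):
    """A toy per-row UDF, as a user might write for a row-at-a-time API."""
    return x * x + 1

def apply_row_at_a_time(column):
    """Baseline: one Python-level call per row (the slow path)."""
    return [udf_scalar(x) for x in column]

def udf_batched(column):
    """Columnar form of the same UDF: one pass over the whole column.
    This is the shape a compiler like Numba can vectorize or offload."""
    return [x * x + 1 for x in column]

col = list(range(5))
assert apply_row_at_a_time(col) == udf_batched(col)  # same results, same UDF
print(udf_batched(col))  # [1, 2, 5, 10, 17]
```

The speedup in the talk comes from replacing the per-row interpreter overhead of the first path with a compiled kernel over the second, columnar path.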
Architecting next generation big data platform (hadooparchbook)
A tutorial on architecting next generation big data platform by the authors of O'Reilly's Hadoop Application Architectures book. This tutorial discusses how to build a customer 360 (or entity 360) big data application.
Audience: Technical.
Case Study: Realtime Analytics with Druid (Salil Kalia)
The case study is about ViralGains, a US-based video marketing platform. The presentation was delivered by me (Salil Kalia) at the Great Indian Developer Summit (GIDS) 2016, and covers work we did at TO THE NEW Digital with our customer, ViralGains.
Here I showcased Druid (http://druid.io) and the supporting technologies (Kafka/ZooKeeper) to demonstrate how they helped us build a stable real-time analytics system capturing hundreds of millions of analytics events per day. In the ad industry, precision (or something very close to it) is essential, because money is involved at every step, down to a single ad impression.
The case study included a demo and a short talk on the journey from Redis to Cassandra and finally to Druid, with outstanding performance.
Resilience: the key requirement of a [big] [data] architecture - StampedeCon... (StampedeCon)
From the StampedeCon 2015 Big Data Conference: There is an adage, “If you fail to plan, you plan to fail.” When developing systems, the adage can be taken a step further: “If you fail to plan FOR FAILURE, you plan to fail.” At the Huffington Post, data moves between a number of systems to provide statistics for our technical, business, and editorial teams. Due to the mission-critical nature of our data, considerable effort is spent building resiliency into processes.
This talk will focus on designing for failure. Some material will focus on understanding the traits of specific distributed systems, such as message queues or NoSQL databases, and the consequences of different types of failures. Other parts of the presentation will focus on how systems and software can be designed to make re-processing batch data simple, and how to determine which failure-mode semantics are important for a real-time event processing system.
Reference architecture for Internet of Things (Sujee Maniyam)
What kind of data infrastructure is needed to support the Internet of Things?
This talk presents a reference architecture.
We are building this architecture as an open source project; see bit.ly/iotxyz
Lifting the hood on spark streaming - StampedeCon 2015 (StampedeCon)
At the StampedeCon 2015 Big Data Conference: Today, if a byte of data were a gallon of water, in only 10 seconds there would be enough data to fill an average home; in 2020 it will take only 2 seconds. The Internet of Things is driving a tremendous amount of this growth, providing more data at a higher rate than we’ve ever seen. With this explosive growth comes the demand from consumers and businesses to leverage and act on what is happening right now. Without stream processing these demands will never be met, and there will be no big data and no Internet of Things. Apache Spark, and Spark Streaming in particular, can be used to fulfill this stream processing need now and in the future. In this talk I will peel back the covers and take a deep dive into the inner workings of Spark Streaming, discussing topics such as DStreams, input and output operations, transformations, and fault tolerance.
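The DStream model mentioned above rests on one core idea: the stream is chopped into small time-indexed micro-batches, and the same transformation runs on each batch. A minimal dependency-free sketch of that idea, with illustrative names and no Spark required:

```python
# Sketch of the micro-batch idea behind Spark Streaming's DStreams:
# chop the stream into batches, apply the same transformation to each.
from itertools import islice

def micro_batches(events, batch_size):
    """Yield fixed-size batches from a (possibly unbounded) event iterator.
    Spark batches by time interval; batching by count keeps the sketch simple."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def count_by_key(batch):
    """A per-batch 'DStream transformation': count events by key."""
    counts = {}
    for key in batch:
        counts[key] = counts.get(key, 0) + 1
    return counts

stream = ["a", "b", "a", "c", "a", "b"]
results = [count_by_key(b) for b in micro_batches(stream, 3)]
print(results)
```

Fault tolerance in the real system falls out of this structure: a lost batch can be recomputed from its input, which is not possible with a pure record-at-a-time design.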
Big Data Architectures @ JAX / BigDataCon 2016 (Guido Schmutz)
Architecture makes or breaks every IT project. This is even more true for big data projects, where no standards have had decades to prove themselves. Nevertheless, good and effective solutions are spreading and becoming established here as well. This talk explains which building blocks matter for the various use cases in the big data space, and how they can be cast into concrete solutions. It covers both traditional big data architectures and current approaches such as the Lambda and Kappa architectures. Stream processing infrastructures and their combination with big data technologies are also a topic. Starting from a product- and technology-independent reference architecture, the talk presents several solution options based on open source components.
Using Multiple Persistence Layers in Spark to Build a Scalable Prediction Eng... (StampedeCon)
At the StampedeCon 2015 Big Data Conference: This talk will examine the benefits of using multiple persistence strategies to build an end-to-end predictive engine. Spark Streaming backed by a Cassandra persistence layer allows the rapid lookups and inserts needed for real-time model scoring. Spark backed by Parquet files stored in HDFS allows for high-throughput model training and tuning with Spark MLlib. Both persistence layers also support ad-hoc queries via Spark SQL, making it easy to analyze model sensitivity and accuracy. Storing the data this way also provides extensibility: existing tools like CQL can run operational queries on the data in Cassandra, and Impala can run larger analytical queries on the data in HDFS, further maximizing the benefits of the flexible architecture.
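The two-layer split described above can be sketched with in-memory stand-ins: a "Cassandra-like" key-value store holding the latest state for low-latency scoring, and a "Parquet/HDFS-like" append-only log holding full history for training. Every class and method name here is hypothetical; this only illustrates the routing, not either system's API.

```python
# In-memory stand-ins for the two persistence layers described in the talk.

class HotStore:
    """Stands in for Cassandra: keyed lookups/inserts, latest value wins."""
    def __init__(self):
        self._kv = {}
    def put(self, key, value):
        self._kv[key] = value
    def get(self, key):
        return self._kv.get(key)

class ColdStore:
    """Stands in for Parquet files on HDFS: append-only, scanned in bulk."""
    def __init__(self):
        self._rows = []
    def append(self, row):
        self._rows.append(row)
    def scan(self):
        return list(self._rows)

def ingest(event, hot, cold):
    """Each event feeds both layers: current state for real-time scoring,
    the full event history for model training and tuning."""
    hot.put(event["user"], event["features"])
    cold.append(event)

hot, cold = HotStore(), ColdStore()
ingest({"user": "u1", "features": [0.2, 0.7]}, hot, cold)
ingest({"user": "u1", "features": [0.3, 0.5]}, hot, cold)
print(hot.get("u1"))      # latest features only
print(len(cold.scan()))   # full history
```

The design choice is the same one the talk argues for: neither store alone serves both access patterns well, so each event is written to both.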
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La... (jaxLondonConference)
Presented at JAX London 2013.
With the proliferation of data sources and growing user bases, the amount of data generated requires new approaches to storage and processing. Hadoop opened new possibilities, yet it falls short of instant delivery. Adding stream processing with Nathan Marz’s Storm can overcome this delay and bridge the gap to real-time aggregation and reporting. In the batch layer, all master data is kept and is immutable. Once the base data is stored, a recurring process indexes it: it reads all master data, parses it, and creates new views from it.
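The lambda pattern described above can be reduced to a minimal sketch: an immutable master dataset feeds a batch view that is recomputed from scratch, a speed layer covers events that arrived after the last batch run, and queries merge the two. This is an illustration of the general pattern only, with hypothetical names, not code from the talk.

```python
# Minimal sketch of the lambda architecture's query-time merge.

def batch_view(master_events):
    """Recomputed from scratch over the immutable master dataset
    (what the recurring Hadoop indexing job produces)."""
    view = {}
    for key in master_events:
        view[key] = view.get(key, 0) + 1
    return view

def merged_query(key, batch, realtime):
    """Serve the batch result plus the speed layer's real-time delta."""
    return batch.get(key, 0) + realtime.get(key, 0)

master = ["page_a", "page_b", "page_a"]   # already indexed by the batch job
recent = {"page_a": 1}                    # Storm-maintained counts since then
batch = batch_view(master)
print(merged_query("page_a", batch, recent))
print(merged_query("page_b", batch, recent))
```

Because the batch view is rebuilt from immutable master data, any error in the speed layer is bounded: it disappears the next time the batch job recomputes and the real-time delta resets.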
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka (DataWorks Summit)
At NMC (Nielsen Marketing Cloud) we provide our customers (marketers and publishers) real-time analytics tools to profile their target audiences.
To achieve that, we need to ingest billions of events per day into our big data stores, and we need to do it in a scalable yet cost-efficient manner.
In this session, we will discuss how we continuously transform our data infrastructure to support these goals.
Specifically, we will review how we went from CSV files and standalone Java applications all the way to multiple Kafka and Spark clusters, performing a mixture of Streaming and Batch ETLs, and supporting 10x data growth.
We will share our experience as early-adopters of Spark Streaming and Spark Structured Streaming, and how we overcame technical barriers (and there were plenty...).
We will present a rather unique solution of using Kafka to imitate streaming over our Data Lake, while significantly reducing our cloud services' costs.
Topics include:
* Kafka and Spark Streaming for stateless and stateful use-cases
* Spark Structured Streaming as a possible alternative
* Combining Spark Streaming with batch ETLs
* "Streaming" over Data Lake using Kafka
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams (Databricks)
Designing a streaming application that has to process data from one or two streams is easy. Any streaming framework that provides scalability, high throughput, and fault tolerance would work. But when the number of streams grows to the order of hundreds or thousands, managing them can be daunting. How would you share resources among thousands of streams, all running 24×7? How would you manage their state, apply advanced streaming operations, and add or delete streams without restarting? This talk explains common scenarios and shows techniques that can handle thousands of streams using Spark Structured Streaming.
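One family of techniques for this problem multiplexes all events through a shared processor that keeps per-stream state keyed by stream id, so streams can be registered or retired without restarting a job per stream. The sketch below is a generic, dependency-free illustration of that idea with hypothetical names; it is not the Spark Structured Streaming API from the talk.

```python
# Sketch: multiplex thousands of logical streams through one processor,
# keyed by stream id, instead of running one job per stream.

class MultiplexedProcessor:
    def __init__(self):
        self.state = {}      # stream_id -> per-stream state (here: event count)
        self.active = set()  # streams currently registered

    def add_stream(self, stream_id):
        self.active.add(stream_id)

    def remove_stream(self, stream_id):
        """Retiring a stream needs no restart: drop its id and its state."""
        self.active.discard(stream_id)
        self.state.pop(stream_id, None)

    def process(self, stream_id, event):
        if stream_id not in self.active:
            return  # events for unregistered streams are dropped
        # A real job would apply this stream's transformation to `event`;
        # the sketch just maintains a running count as the per-stream state.
        self.state[stream_id] = self.state.get(stream_id, 0) + 1

proc = MultiplexedProcessor()
for sid in ("s1", "s2"):
    proc.add_stream(sid)
for sid, ev in [("s1", "x"), ("s2", "y"), ("s1", "z"), ("s3", "w")]:
    proc.process(sid, ev)
print(proc.state)  # s3 was never registered, so it holds no state
```

The resource-sharing benefit comes from all streams riding one executor pool and one state store, at the cost of isolating them logically (by key) rather than physically (by job).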
Big Data and Analytics: The IBM Perspective (The_IPA)
Gareth Mitchell-Jones, Associate Partner Big Data & Analytics at IBM, shares his thoughts on the hot topic of Big Data from his unique perspective at an IPA 44 Club event in London. To learn more about The IPA visit www.ipa.co.uk and The 44 Club here http://www.ipa.co.uk/groups/44-club-2
Scala: the unpredicted lingua franca for data science (Andy Petrella)
Talk given at Strata London with Dean Wampler (Lightbend) about Scala as the future of data science. The first part covers how Scala became important; the rest of the talk is in notebooks using the Spark Notebook (http://spark-notebook.io/).
The notebooks are available on GitHub: https://github.com/data-fellas/scala-for-data-science.
Connected Car - the future technology and opportunities in car networking (spirit conference)
Reza Zanjani, Unit Manager at SEVEN PRINCIPLES Solutions & Consulting AG, held a presentation about "Connected Car - the future technology and opportunities in car networking" during the spirit conference 2014.
So you’ve got a handle on what Big Data is and how you can use it to find business value in your data. Now you need an understanding of the Microsoft products that can be used to create a Big Data solution. Microsoft has many pieces of the puzzle, and in this presentation I will show how they fit together. How does Microsoft enhance and add value to Big Data? From collecting data, transforming it, and storing it to visualizing it, I will show you Microsoft’s solutions for every step of the way.
Enabling the Internet of Things with Real-time Hadoop (Becky Mendenhall)
These slides show how IoT is driving the growth of Hadoop adoption in enterprises across the world. They discuss real-life IoT use cases and how Pepperdata and MapR, two Hadoop technology experts, have helped customers take advantage of the IoT era and drive more business value with lower infrastructure costs, faster performance, and SLA guarantees.
Edge-controlled, cloud-connected: Design patterns for the IIoT (John Breitenbach)
RTI presents a RIoT lunch and learn on the IIC's layered databus design pattern for the IIoT. Industrial systems requiring millisecond response times like autonomous vehicles, patient monitoring, and the power grid can’t push data up to the cloud for live decision making. Latencies across the wide-area network mean those decisions arrive too late. For real-time response, data must be shared and acted on locally. Some data, however, may need to be pooled via the cloud to work with back-end systems for billing, analytics, diagnostics, and other enterprise level applications. What are the recommended design patterns to satisfy both needs? Join this lunch and learn for an overview of the “big 3” architectural patterns recommended by the Industrial Internet of Things Reference Architecture. A live walkthrough of the Layered Databus pattern will demonstrate real-time, millisecond control at the edge coupled with enterprise connectivity all the way back to the Web.
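The layered-databus split described above can be illustrated with a tiny in-memory pub/sub sketch: an edge bus delivers every message locally for millisecond-path decisions, while a gateway forwards only a filtered subset to a cloud bus for back-end analytics. All names and the message schema here are hypothetical; real IIoT databuses (e.g. DDS-based ones) have far richer QoS and discovery semantics.

```python
# Toy pub/sub illustrating the layered databus pattern: act locally on
# everything, forward only selected data to the enterprise/cloud layer.

class Bus:
    def __init__(self):
        self.subscribers = []
        self.log = []  # every message published on this bus

    def subscribe(self, fn):
        self.subscribers.append(fn)

    def publish(self, msg):
        self.log.append(msg)
        for fn in self.subscribers:
            fn(msg)

edge_bus, cloud_bus = Bus(), Bus()

# Local controller: reacts to every reading immediately (the real-time path).
alarms = []
edge_bus.subscribe(lambda m: alarms.append(m) if m["value"] > 100 else None)

# Gateway: forwards only messages flagged for back-end systems to the cloud bus.
edge_bus.subscribe(lambda m: cloud_bus.publish(m) if m.get("report") else None)

edge_bus.publish({"sensor": "temp1", "value": 120, "report": True})
edge_bus.publish({"sensor": "temp1", "value": 80})
print(len(alarms), len(cloud_bus.log))
```

The point the pattern makes is visible even at this scale: the alarm fires without any round trip to the cloud layer, and the cloud layer sees only the traffic it actually needs.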
Achieving Business Value by Fusing Hadoop and Corporate Data (Inside Analysis)
The Briefing Room with Richard Hackathorn and Teradata
Live Webcast March 25, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=e7254708146d056339a0974f097f569b2
Hadoop data lakes are emerging as peers to corporate data warehouses. However, successful analytic solutions require a fusion of all relevant data, big and small, which has proven challenging for many companies. By allowing business analysts to quickly access data wherever it rests, success factors shift to focus on three key aspects: 1) business objectives, 2) organizational workflow, and 3) data placement.
Register for this Special Edition of The Briefing Room to hear veteran Analyst Richard Hackathorn as he provides details from his recent research report focused on success stories using Teradata QueryGrid. Examples of use cases described will include:
* Joining sensor data in Hadoop with data warehouse labor schedules in seconds
* How bridging corporate cultures and systems creates new business opportunities
* The 360 view of customer journeys using weblogs in Hadoop via BI tools
* How you can put the data where you want and query it however you want
* Virtualizing Hadoop data with Teradata QueryGrid
Visit InsideAnalysis.com for more information.
Visualizing IoT: Rapid Business Data Discovery for the Internet of Things (Mia Yuan Cao)
As the Internet of Things (IoT) is making our world more connected, there is a growing need to understand the data through data visualization, analysis and discovery across different types of connected device platforms.
Most IT experts have heard of big data, or at least have some notion of it. In practice, however, only a few in Germany currently work with it. Yet big data brings entirely new momentum to modern software solutions and is indispensable in the context of mobile, cloud, and social change. Big data makes software intelligent and thereby lets users experience it in a completely new way. Big data gives rise to new software architectures, because information is processed in a completely different way: faster, in a more differentiated manner, and often with the goal of drawing conclusions and making predictions.
This talk explains how modern software architectures are designed so that big data paradigms can be implemented successfully, and what advantages result for increasingly mobile software solutions. We also take a look at the potential and options in industries such as banking, insurance, and retail.
Overcoming the AIoT Obstacles through Smart Component Integration (Innodisk Corporation)
Enterprises in every industry are gearing up for AI’s integration with IoT at the edge. Analytics and cloud-based applications are crucial foundations for the AIoT infrastructure. But even more importantly, AIoT requires complete, real-time access to data to fulfill the needs of highly responsive edge computing applications.
In our experience, many customers face the same difficulties with cyber-level and physical-level device integration in the new AI era. As the world's leading industrial storage and memory provider, Innodisk has a solid track record with more than 2000 customers and expertise built on more than a decade of integrating hardware, firmware, and software solutions.
Attend this webinar to learn about:
- Preparing your business for the new Internet of Things (IoT) and AI era
- How to overcome current architectural issues
- Increasing process efficiency and delivering a better customer experience
- Facilitating new platforms that enable rapid development of next-generation intelligent IoT systems
- Trends and technology in AIoT intelligent storage and data optimization
We're introducing MapR Streams, a reliable, global event streaming system that connects data producers and data consumers across shared topics of information. With the integration of MapR Streams comes the industry’s first and only converged data platform that integrates file, database, event streaming, and analytics to accelerate data-driven applications and address emerging IoT needs.
Are you ready to accelerate your business with the power of a truly global platform for integrating data-in-motion with data-at-rest?
Vertex Perspectives | AI-optimized Chipsets | Part I (Vertex Holdings / Yanai Oron)
Businesses are increasingly adopting AI to create new applications to transform existing operations, driving big data with the growth of IoT and 5G networks and increasing future process complexities for human operators. In this new environment, AI will be needed to write algorithms dynamically to automate the entire programming process. Fortunately, algorithms associated with deep learning are able to achieve enhanced performance with increasing data, unlike the rest associated with machine learning. To date, deep learning technology has primarily been a software play. Existing processors were not originally designed for these new applications. Hence the need to develop AI-optimized hardware.
QNAP Systems, Inc., headquartered in Taipei, Taiwan, provides a comprehensive range of cutting-edge Network-attached Storage (NAS) and video surveillance solutions based on the principles of usability, high security, and flexible scalability. QNAP offers quality NAS products for home and business users, providing solutions for storage, backup/snapshot, virtualization, teamwork, multimedia, and more. QNAP envisions NAS as being more than "simple storage", and has created many NAS-based innovations to encourage users to host and develop Internet of Things, artificial intelligence, and machine learning solutions on their QNAP NAS.
Bhadale group of companies technology ecosystem for modernizationVijayananda Mohire
This is our draft version of the technology ecosystem for modernization offers. We offer services at national, provincial, city, enterprise and individual levels that enables countries to adopt and improve their lifestyles and income
Startup pitch presented by co-founder and CEO Jaco Els. Cubitic offers a predictive analytics platform that allows developers to build custom solutions for analytics and visualisation on top of a machine learning engine.
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big DataVoltDB
This webinar with Chris Selland of HPE Vertica and Dennis Duckworth of VoltDB addresses the growing challenges with managing a complex IoT solution and how to enable real-time operational interaction with comprehensive data analytics.
Rapidly Developing Internet of Things (IoT) Applications. Demos include using the Raspberry Pi, Beacons, the Oculus Rift, and other sensors. Apps were developed using Bluemix services.
Similar to Virdata: lessons learned from the Internet of Things and M2M Cloud Services @ IBM Big Data Developers Meetup (20)
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Virdata: lessons learned from the Internet of Things and M2M Cloud Services @ IBM Big Data Developers Meetup
1. Big Data Developers - Virdata, Internet of Things #virdata
Big Data & IoT: lessons learned
Big Data Developers Meetup, San Jose, CA - June 5, 2014
#virdata | @nathan_gs
2.
Who is Technicolor?
Domains
● Media Services
● Entertainment Services
● Connected Home
● Emerging Ventures
● Technology & Innovations
Who We Are
Technicolor, a worldwide technology leader in the media and entertainment sector, is at the forefront of digital innovation. Our world-class research and innovation laboratories and our creative talent pool enable us to lead the market in delivering advanced services to content creators and distributors. We also benefit from an extensive intellectual property portfolio focused on imaging and sound technologies, supporting our thriving licensing business.
3.
Virdata – OUR CORE CLOUD SERVICES
● Device Monitoring
● Device Management
● Big Data Analytics
● Big Data Queries
● Application Monitoring
(Diagram: devices connect to the Virdata Cloud APIs over MQTT.)
4.
Virdata - 2 COMPONENTS: A CLOUD & A LIBRARY
The Cloud:
★ Elastic and scalable, built on cutting-edge technologies
★ APIs for different types of information/data consumption
★ Cloud agnostic through self-built monitoring tools
★ Running on both public & private cloud infrastructure
★ Bi-directional messaging
★ High-performance broker architecture
The Library:
★ Lightweight and portable
★ Multiple programming languages
★ Supports multiple transport protocols
★ Available for all HW and OS
★ Supports any type of data in any format/syntax
★ Payload is compressed and encrypted
5.
Virdata - SERVICE ARCHITECTURE
● Millions of simultaneous persistent bi-directional connections
● Millions of messages per second
● Real-time Complex Event Processing
● Distributed Pub/Sub Messaging
● Historical Data Archiving, Pre-computed Data, In-Memory real-time Data
● REST API: Launch Queries, Launch Jobs
(Diagram: integration and customization hooks feed NOC, operations, management reports, trends, and analytics.)
6.
Virdata - VERTICAL INDUSTRIES
AUTOMOTIVE
● Fleet Management
● Insurance
● Emergency Services
UTILITIES
● Remote Meter Management
● Monitor Energy Consumption
● Optimize Subscription Plan
CONSUMER ELECTRONICS
● Monitoring & Management
● Upsell Services
● Enhanced End User Experience
CUSTOMER CARE
● Monitor Device & Application
● One Button Care
● Call Avoidance
RETAIL
● Geo-location Based Adverts
● Heat Mapping
● Individualized Offering
HEALTH
● Promote Patient Independence
● Time-Series Analysis
● Pro-active Responses
7.
Live Demo
Contact us for a live demo at info@virdata.com or virdata.com.
8.
Connected “Things”
9.
Huge variety in devices and OSs.
10.
Virdata Client Libraries
12.
Northbound and Southbound API
Northbound API = Cloud API
● Messaging API
○ REST
○ PUB/SUB
○ MQTT
○ JMS
● Data Processing API
○ SQL
○ JobAPI
○ Query/REST
Southbound API provided at the device level.
13.
Integration of Virdata into IBM BlueMix
Objectives
• Show the strengths of the Virdata Internet of Things platform
• Scalability to support millions of connected devices
• Real-time and historical data processing
• Cloud APIs powering new data-driven services across vertical markets
• Demonstrate the power of the IBM BlueMix solution
• Rapid development and deployment of new applications
• Platform as a Service marketplace
• Highlight the value of combining both
• Internet of Things platform as a service
Use-case
• Virdata provides real-time car data
• App acts upon car trouble codes
• Invokes manufacturer analytics service
• Initiates recommended actions, e.g. through Maximo workflow service
• Schedules car dealer appointment
• Informs the car driver
14.
Messaging & Broker
15.
Messaging Architecture: Device to Platform
(Diagram: protocol adapters receive device traffic and publish to a Kafka cluster; Storm consumes from Kafka, backed by state stores, and feeds the API and the Data Processing API.)
16.
Messaging Architecture: Device to Device(s)
(Diagram: the same protocol adapter, Kafka, and Storm pipeline, with messages routed back out through the adapters to other devices.)
17.
Messaging Architecture: Large Fan Out
(Diagram: the same pipeline, with one inbound message fanned out through Kafka and Storm to many subscribed devices.)
18.
Horizontally scalable… and elastic as well.
Messaging
19.
Persistent connections
Broker
20.
Real-time bidirectional communication
21.
MQTT
Pub/Sub
Protocol Adaptor
22.
MQTT: QoS levels
QoS 0: best effort
QoS 1: at least once
QoS 2: exactly once
Protocol Adaptor
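The three QoS levels trade delivery guarantees against overhead. As a toy illustration of what QoS 1 (at least once) implies, not the MQTT wire protocol or Virdata's implementation, a publisher can keep retransmitting until an acknowledgement arrives, which is exactly why the receiver must tolerate duplicates:

```python
import random

def publish_qos1(send, max_retries=5):
    """Deliver one message with at-least-once semantics:
    retransmit until the receiver acknowledges."""
    for attempt in range(1, max_retries + 1):
        acked = send()          # returns True if an ack came back
        if acked:
            return attempt      # number of transmissions used
    raise RuntimeError("delivery failed after retries")

# A lossy channel: the receiver records every copy it sees,
# but the acknowledgement itself may be lost.
random.seed(1)
received = []

def lossy_send():
    received.append("sensor-reading")   # the message always arrives here
    return random.random() > 0.5        # ...but the ack may be lost

attempts = publish_qos1(lossy_send)
print(attempts, len(received))  # duplicates: one copy received per attempt
```

With a lost ack, the receiver sees the same reading twice; QoS 2 would add a handshake to deduplicate, and QoS 0 would simply not retry.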
25.
Message passing
Storm
26.
Stream/Message partitioning, as well as grouping.
Storm
27.
Storm
(Diagram: Nimbus coordinates the cluster through ZooKeeper; each worker node runs a Supervisor hosting multiple executors.)
28.
Storm
Tuple & Stream
(Diagram: a tuple is a list of fields, Field 1 | Field 2 | Field 3 | Field 4 | Field 5; a stream is an unbounded sequence of tuples.)
29.
Storm
Spout & Bolt
(Diagram: a spout emits tuples into the topology; bolts consume tuple streams, transform them, and feed further bolts or the API.)
30.
Storm
Grouping
(Diagram: a spout's stream is split across multiple bolt instances according to a grouping.)
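A grouping decides which bolt task receives each tuple. A minimal sketch of the idea behind a fields grouping (hypothetical helper, not Storm's actual API): hash the grouping field so that all tuples with the same key land on the same task, whereas a shuffle grouping would spread tuples evenly regardless of key.

```python
from zlib import crc32

def fields_grouping(tuple_, field, num_tasks):
    """Route a tuple to a bolt task by hashing one of its fields,
    so equal keys always go to the same task."""
    key = str(tuple_[field]).encode()
    return crc32(key) % num_tasks

tuples = [{"city": "Antwerp"}, {"city": "Cologne"}, {"city": "Antwerp"}]
tasks = [fields_grouping(t, "city", 4) for t in tuples]
# The two "Antwerp" tuples are routed to the same task:
assert tasks[0] == tasks[2]
print(tasks)
```

Keeping equal keys on one task is what makes per-key aggregations (counts, sums) possible without cross-task coordination.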
32.
Events used to manipulate the master data.
Events: Before
33.
Today, events are the master data.
Events: After
34.
Let’s store everything.
Data System
35.
Data is Immutable.
Data System
36.
Data is Time Based.
Data System
37.
The data you query is often transformed, aggregated, ...
Rarely used in its original form.
Query
38.
Query = function ( all data )
Query
39.
Functional computation, based on immutable inputs, is idempotent.
Batch Layer
40.
Query: number of cars living in each city

Car          | Location | Timestamp
BMW 1        | Antwerp  | 2008-10-11
Aston Martin | Cologne  | 2010-01-23
BMW 2        | Antwerp  | 2012-09-12
BMW 1        | Cologne  | 2014-04-29

Result (based on each car's latest location):

Location | Count
Antwerp  | 1
Cologne  | 2
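Computed as a pure function over the full, immutable event log, the view above keeps only each car's latest location and then counts per city. A small Python sketch of that batch computation:

```python
from collections import Counter

# The append-only event log: (car, location, timestamp)
events = [
    ("BMW 1",        "Antwerp", "2008-10-11"),
    ("Aston Martin", "Cologne", "2010-01-23"),
    ("BMW 2",        "Antwerp", "2012-09-12"),
    ("BMW 1",        "Cologne", "2014-04-29"),
]

def cars_per_city(events):
    """Batch view: latest location per car, then count per city."""
    latest = {}
    for car, location, ts in sorted(events, key=lambda e: e[2]):
        latest[car] = location      # later timestamps overwrite earlier ones
    return Counter(latest.values())

view = cars_per_city(events)
print(dict(view))  # {'Cologne': 2, 'Antwerp': 1}
```

Because the inputs are immutable and the function is deterministic, rerunning it always reproduces the same view, which is the idempotence property the batch layer relies on.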
41.
Query
(Diagram: All Data → Precomputed View → Query.)
42.
Layered Architecture
● Batch Layer
● Speed Layer
● Serving Layer
43.
Layered Architecture
(Diagram: incoming data feeds Spark and Cassandra (C*); queries are served from the resulting views.)
45.
Batch Layer
(Diagram: incoming data → Spark → Cassandra (C*).)
46.
Batch Layer
The batch layer can calculate anything, given enough time...
Unrestrained computation.
47.
Keep the data in its original format.
The batch layer stores the data normalized; the generated views are often, if not always, denormalized.
Batch Layer
48.
Horizontally scalable.
Batch Layer
49.
Stores a master copy of the data set… append only.
Batch Layer
50.
High Latency.
Let’s for now pretend the update latency doesn’t matter.
Batch Layer
52.
In-memory storage
Spark
53.
Advanced DAG execution engine
Cyclic data flows, in-memory computing.
Spark
54.
Multilanguage support, interactive shells
Scala, Java & Python
Spark
55.
Write programs in terms of transformations on distributed datasets.
RDDs are collections of objects, stored in RAM or on disk, built through parallel transformations, and automatically rebuilt on failure.
Spark
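In Spark these transformations are lazy and run distributed over RDD partitions; the same dataflow style can be mimicked with plain Python for illustration (this sketch runs locally and is not PySpark):

```python
from functools import reduce

# A "dataset" of raw sensor readings: (device_id, value)
readings = [("dev1", 3), ("dev2", 5), ("dev1", 4), ("dev3", 1), ("dev2", 2)]

# filter: drop low readings (in Spark: rdd.filter(...))
filtered = [r for r in readings if r[1] >= 2]

# reduceByKey: sum values per device (in Spark: rdd.reduceByKey(...))
def reduce_by_key(pairs, fn):
    acc = {}
    for k, v in pairs:
        acc[k] = fn(acc[k], v) if k in acc else v
    return acc

totals = reduce_by_key(filtered, lambda a, b: a + b)
count = reduce(lambda n, _: n + 1, filtered, 0)   # count: a fold over the data
print(totals, count)  # {'dev1': 7, 'dev2': 7} 4
```

In the real API the pipeline would read `rdd.filter(...).reduceByKey(...)`, with Spark handling partitioning and recomputation of lost partitions.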
56.
Spark: API
map, reduce
57.
Spark: API
map, filter, groupBy, sort, union, join, leftOuterJoin, rightOuterJoin, count, fold, reduceByKey, groupByKey, reduce, cogroup, cross, zip, sample, take, first, partitionBy, mapWith, pipe, save, ...
58.
Spark Ecosystem
(Diagram: Spark core runs atop HDFS, Tachyon, Mesos and YARN; around it sit Spark Streaming, Shark / Spark SQL, GraphX, MLlib, Mahout, MR v1, BlinkDB and Velox.)
59.
Every iteration produces the views from scratch.
Batch Layer
60.
Batch View Databases
We need a (read-only) database to store those views.
61.
Example: the automotive market
● Real Time Tracking
● Engine Block Performance
● Fleet Management
● 3rd Party API integration
● Integration with Informix
● Big Data Visualization
● 3rd Party Application Creation
● BlueMix Platform as a Service
● Process Integrations
(Columns: The Open Source Route; Enterprise Integration; Bringing Analytics to the Data.)
62.
Batch Layer
(Diagram: a timeline; data absorbed into batch views lags behind “now”, with the last few hours of data not yet absorbed.)
We are not done yet…
64.
Speed Layer
(Diagram: incoming data also flows into a Cassandra (C*) real-time view, alongside the Spark batch path.)
65.
Stream processing.
Speed Layer
66.
Continuous computation.
Speed Layer
67.
Storing a limited window of data.
Compensating for the last few hours of data.
Speed Layer
68.
All the complexity is isolated in the Speed Layer.
If anything goes wrong, it’s auto-corrected.
Speed Layer
69.
CAP
You have a choice between:
● Availability: queries are eventually consistent
● Consistency: queries are consistent
(Diagram: the CAP triangle: Consistency, Availability, Partition Tolerance.)
70.
Eventual accuracy
Some algorithms are hard to implement in real-time.
For those cases we could estimate the results.
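One illustrative estimation technique (not necessarily what Virdata uses): count only the items whose hash lands in a small slice of the hash space, then scale up. This approximates a distinct count in bounded memory, which suits a speed layer that cannot hold the full data set:

```python
import hashlib

def approx_distinct(items, slices=16):
    """Estimate the number of distinct items by tracking only those
    whose hash falls in 1 slice out of `slices` of the hash space."""
    sampled = set()
    for item in items:
        h = int(hashlib.md5(item.encode()).hexdigest(), 16)
        if h % slices == 0:             # keep roughly 1/16th of distinct keys
            sampled.add(item)
    return len(sampled) * slices

items = [f"device-{i}" for i in range(1000)]
# Feeding duplicates does not inflate the estimate, since we track a set:
estimate = approx_distinct(items * 3)
print(estimate)
```

The estimate converges on the true count (1000 here) with an error that shrinks as more slices are tracked; production systems typically use sketches such as HyperLogLog for the same trade-off.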
78.
Serving Layer
(Diagram: queries are answered from the views built by Spark and stored in Cassandra (C*).)
79.
Serving Layer
Random reads.
80.
This layer queries the batch & real-time views and merges them.
Serving Layer
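That merge can be sketched as a function over the two views: the batch view is complete but stale, the real-time view is fresh but covers only the hours since the last batch run. A minimal sketch with hypothetical view contents:

```python
def serve_query(batch_view, realtime_view):
    """Merge per-key counts from the batch view (everything up to the
    last batch run) and the speed-layer view (the hours since then)."""
    merged = dict(batch_view)
    for key, count in realtime_view.items():
        merged[key] = merged.get(key, 0) + count
    return merged

# Hypothetical views: batch absorbed data up to 02:00, speed covers the rest.
batch_view = {"Antwerp": 120, "Cologne": 80}
realtime_view = {"Cologne": 5, "Berlin": 2}

result = serve_query(batch_view, realtime_view)
print(result)  # {'Antwerp': 120, 'Cologne': 85, 'Berlin': 2}
```

For additive metrics the merge is a simple sum; non-additive metrics need view designs that remain mergeable, which is one of the practical constraints of the architecture.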
81.
Lambda Architecture
82.
Lambda Architecture
The Lambda Architecture can discard any view, batch and real-time, and just recreate everything from the master data.
83.
Mistakes are corrected via recomputation.
Wrote bad data? Remove the data & recompute.
Bug in view generation? Just recompute the view.
Lambda Architecture
84.
Using a new schema?
No problem: keep your data, change your view function F, and regenerate the output.
Lambda Architecture
85.
Data storage is highly optimized.
Lambda Architecture
87.
Cloud Agnostic
Control Plane
88.
IBM SoftLayer
Experiences & Observations
1. Smooth migration from SCE 2.2 to SoftLayer in one month's time, including:
■ Development of a SoftLayer-specific FOG abstraction layer extension to accommodate Virdata’s DevOps tooling (Chef)
■ Complete on-boarding of the Virdata Platform
■ Complete launch of simulation and emulation clusters
■ Very exhaustive and complete API
2. Very constructive and professional support throughout the complete on-boarding process
3. Availability of bare metal seen as a differentiator
89.
Cluster Management & Orchestration
Control Plane
90.
Monitoring and Logging
Control Plane
92.
Virdata - SERVICE ARCHITECTURE (recap)
● Millions of simultaneous persistent bi-directional connections
● Millions of messages per second
● Real-time Complex Event Processing
● Distributed Pub/Sub Messaging
● Historical Data Archiving, Pre-computed Data, In-Memory real-time Data
● REST API: Launch Queries, Launch Jobs
(Diagram: integration and customization hooks feed NOC, operations, management reports, trends, and analytics.)
93.
Questions?
@virdata_iot | #virdata
@nathan_gs
94.
Acknowledgements
I would like to thank Nathan Marz for writing a very insightful book, where the idea of the Lambda Architecture comes from.
● Lambda: Big Data, Nathan Marz (Manning)
● Lambda, Storm: A real-time architecture using Hadoop & Storm, Nathan Bijnens & Geert Van Landeghem, FOSDEM 2013
● Spark: Apache Spark website
● Spark: Apache Spark - the light at the end of the tunnel?, Michael Hausenblas (MapR), Data Science Day Berlin 2014
95.
Thank you
virdata.com | +1 (937) 569 4220 | info@virdata.com
#virdata | @virdata_iot
@nathan_gs | nathan.bijnens@virdata.com