IBM Informix - The Ideal Database for Internet of Things
Exclusive luncheon at IBM World of Watson 2016. Informix is the best fit for IoT sensor data analytics at the edge and in the cloud.
Informix Spark Streaming is an extension of Informix that allows data to be streamed out of the database as soon as it is inserted, updated, or deleted.
The protocol currently used to stream the changes is MQTT v3.1.1 (older versions are not supported). The extension can stream data to any MQTT broker, where it can be processed or passed on to subscribing clients.
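As a sketch of what a subscribing client might do with such a change event, here is a minimal Python handler. The JSON payload shape (operation/table/row fields) is an assumption for illustration, not the extension's documented format:

```python
import json

# Hypothetical shape of a change event published by the extension; the
# actual payload format depends on the Informix configuration.
def handle_change(payload: str) -> str:
    event = json.loads(payload)
    op = event["operation"]          # "insert", "update", or "delete"
    table = event["table"]
    row = event.get("row", {})
    return f"{op} on {table}: {row}"

# Simulated message as it might arrive from the MQTT broker.
sample = json.dumps({
    "operation": "insert",
    "table": "sensor_readings",
    "row": {"sensor_id": 7, "temp_c": 21.5},
})
print(handle_change(sample))
```

In a real deployment this handler would be registered as the message callback of an MQTT client subscribed to the broker's change topic.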
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa... - MSAdvAnalytics
Lance Olson. Cortana Analytics is a fully managed big data and advanced analytics suite that helps you transform your data into intelligent action. Come to this two-part session to learn how you can do "big data" processing and storage in Cortana Analytics. In the first part, we will provide an overview of the processing and storage services. We will then talk about the patterns and use cases which make up most big data solutions. In the second part, we will go hands-on, showing you how to get started today with writing batch/interactive queries, real-time stream processing, or NoSQL transactions all over the same repository of data. Crunch petabytes of data by scaling out your computation power to any sized cluster. Store any amount of unstructured data in its native format with no limits to file or account size. All of this can be done with no hardware to acquire or maintain and minimal setup time, giving you the value of "big data" within minutes. Go to https://channel9.msdn.com/ to find the recording of this session.
Zero Downtime, Zero Touch Stretch Clusters from Software-Defined Storage - DataCore Software
Business continuity, especially across data centers in nearby locations, often depends on complicated scripts, manual intervention, and numerous checklists. These error-prone processes become far more difficult when the data storage equipment differs between sites.
Such difficulties force many organizations to settle for partial disaster recovery measures, conceding data loss and hours of downtime during occasional facility outages.
In this webcast and live demo, you’ll learn about:
• Software-defined storage services capable of continuously mirroring data in real time between unlike storage devices.
• Non-disruptive failover between stretched clusters requiring zero touch.
• Rapid restoration of normal conditions when the facilities come back up.
Undertaking a digital journey starts with clearly articulating the success factors for the entire journey, and our experience in the field has shown this to be an Achilles heel for most CXOs across Fortune 500 organizations. Our findings were corroborated by a McKinsey study reporting that only 15% of organizations are able to calculate the ROI of a digital initiative.
In this talk we will discuss demonstrated examples from multi-billion-dollar businesses and proven methodologies for measuring the value of a digital enterprise. The panel will share experiences and provide actionable advice on immediate next steps around the following:
• Successful metrics for measuring the value of Digital / IoT / AI / Machine Learning engagements
• How 'Digital Traction Metrics' can provide actionable insights even before the financial metrics have been reported
• The best-in-class organizational constructs and forward-looking employee engagement methods that facilitate the digital revolution
Panelists for this session include:
• Christian Bilien - Head of Global Data at Societe Generale
• Pierre Alexandre Pautrat – Head of Big Data at BPCE/Natixis
• Ronny Fehling – VP, Airbus
• Juergen Urbanski – Silicon Valley Data Science
• Abhas Ricky - EMEA Lead, Innovation & Strategy, Hortonworks
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s... - Data Con LA
This talk draws on our experience debugging and analyzing Hadoop jobs to describe methodical approaches to the problem, and presents current and new tracing and tooling ideas that can help semi-automate parts of this difficult task.
The process of streaming real-time data from a wide variety of machine data sources and entities can be very complex and unwieldy. Using an agent-based approach, Informatica has invented a new technique and open access product that makes this process much more user friendly and efficient, even when dealing with multiple environments such as Hadoop, Cassandra, Storm, Amazon Kinesis and Complex Event Processing.
Innovation in the Enterprise Rent-A-Car Data Warehouse - DataWorks Summit
Big Data adoption is a journey. Depending on the business the process can take weeks, months, or even years. With any transformative technology the challenges have less to do with the technology and more to do with how a company adapts itself to a new way of thinking about data. Building a Center of Excellence is one way for IT to help drive success.
This talk will explore Enterprise Holdings Inc. (which operates the Enterprise Rent-A-Car, National Car Rental, and Alamo Rent A Car brands) and its experience with Big Data. EHI's journey started in 2013 with Hadoop as a POC; today the company is working to create the next-generation data warehouse in Microsoft's Azure cloud using a lambda architecture.
We’ll discuss the Center of Excellence, the roles in the new world, share the things which worked well, and rant about those which didn’t.
No deep Hadoop knowledge is necessary; the talk is aimed at the architect or executive level.
Microsoft: Building a Massively Scalable System with DataStax and Microsoft's... - DataStax Academy
We face the challenge of reliably storing massive quantities of data that remain available even in the face of infrastructure failures. We face similar challenges on the application side. The most successful cloud architectures break applications down into microservices. How, then, do we deploy, upgrade, and manage the scale of those microservices? This session will illustrate how to tackle these challenges by taking advantage of both Cassandra and Microsoft's next-generation PaaS infrastructure, Azure Service Fabric.
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa... - Data Con LA
Today’s Software Defined environments attempt to remove the weakness of computing hardware from the operational equation. There is no doubt that this is a natural progression away from overpriced, proprietary compute and storage layers. However, at the heart of any Software Defined universe is an underlying hardware stack that must be robust, reliable, and cost effective. Our 20+ years of experience delivering over 2,000 clusters and clouds has taught us how to properly design and engineer the right hardware solution for Big Data, cluster, and cloud environments. This presentation will share that knowledge, allowing users to make better design decisions for any deployment.
Data in Motion - Data at Rest - Hortonworks a Modern Architecture - Mats Johansson
Presentation at Data Innovation Summit 2016 in Stockholm
How to build a modern data architecture supporting data in motion and data at rest with Hortonworks Data Flow and Data Platform.
Choosing the right platform for your Internet-of-Things solution - IBM_Info_Management
Deploying a solution within the context of the Internet of Things (IoT) typically involves many considerations, ranging from the hardware involved to the architecture of the whole environment, and from decisions about where processing and analytics are to take place to the software choices that let you exploit the Internet of Things. This presentation will focus on the need to support a homogeneous processing environment: that is, it will be preferable if processing in all tiers of the IoT is consistent and compatible. This joint presentation will go on to discuss the implications of this consistency for database selection.
Making Hadoop Realtime by Dr. William Bain of ScaleOut Software - Data Con LA
Hadoop has been widely embraced for its ability to economically store and analyze large data sets. Using parallel computing techniques like MapReduce, Hadoop can reduce long computation times to hours or minutes. This works well for mining large volumes of historical data stored on disk, but it is not suitable for gaining real-time insights from live operational data. Still, the idea of using Hadoop for real-time data analytics on live data is appealing because it leverages existing programming skills and infrastructure – and the parallel architecture of Hadoop itself. This presentation will describe how real-time analytics using Hadoop can be performed by combining an in-memory data grid (IMDG) with an integrated, stand-alone Hadoop MapReduce execution engine. This new technology delivers fast results for live data and also accelerates the analysis of large, static data sets.
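The core idea, MapReduce executed directly over memory-resident partitions, can be sketched in plain Python. This toy stands in for an IMDG and is not ScaleOut's actual API:

```python
from collections import defaultdict
from itertools import chain

# Toy in-memory "grid": data lives in partitions, as it would across grid nodes.
partitions = [
    [("sensor_a", 21.0), ("sensor_b", 19.5)],
    [("sensor_a", 23.0), ("sensor_b", 20.5), ("sensor_a", 22.0)],
]

def map_phase(partition):
    # Emit (key, value) pairs from one partition; node-local in a real grid.
    return [(key, value) for key, value in partition]

def reduce_phase(pairs):
    # Group by key and reduce, here to a per-sensor average.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}

result = reduce_phase(chain.from_iterable(map_phase(p) for p in partitions))
print(result)  # per-sensor averages computed without touching disk
```

Because the data never leaves memory, a result like this can be recomputed continuously as live values change, which is the real-time property the talk describes.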
Integrating and Analyzing Data from Multiple Manufacturing Sites using Apache... - DataWorks Summit
In this talk, Mark Baker (CSL) will show how CSL Behring is integrating and analyzing data from multiple manufacturing sites, using Apache NiFi to feed a central Hadoop data lake.
The challenge of merging data from disparate systems has been a leading driver behind investments in data warehousing systems, as well as in Hadoop. While data warehousing solutions are ready-built for RDBMS integration, Hadoop adds the benefits of near-infinite and economical scale, not to mention the variety of structured and non-structured formats that it can handle. Whether using a data warehouse, Hadoop, or both, physical data movement and consolidation is the primary method of integration.
There may also be challenges in synchronizing rapidly changing data from a system of record to a consolidated Hadoop platform. This introduces the need for “data federation”, where data is integrated without copying it between systems.
For historical/batch use cases, data is replicated across remote data hubs into a central data lake using Apache NiFi.
We will demo Apache Zeppelin for analyzing data with Apache Spark and Apache Hive.
Presentation at IoT World, May 2016 in Santa Clara, CA. Session "Manage your IoT Sensor Data at the Edge! Control your IoT sensor data at the most appropriate spot" (Thursday, 12 May 2016. IoT & the Cloud Track)
Real-Time Robot Predictive Maintenance in Action - DataWorks Summit
Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, the reality is that some of the most advanced industrial makers in the world are barely getting started making use of this data, with relatively rudimentary, bespoke monitoring systems built at tremendous cost.
We believe that it is now possible, using a well-chosen selection of enterprise open source big data projects, to successfully deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech makers. We propose to show a working prototype of just such a system, and explain in some detail how it was made.
Our presentation describes a working real-time ML-based anomaly detection system. We show a working industrial robot analog fitted with a wireless movement sensor. Our system scores the data in a cloud-based cluster. For added realism, the live demonstration includes a working augmented-reality headset that shows real-time status overlaid on the working robot.
This talk demonstrates a concrete example of a real-time predictive maintenance system, built as a series of microservices connected by Kafka streams and powered by the H2O distributed machine learning tool. Our goal is for attendees to get a feel for what can realistically be achieved by a few non-genius-level engineers in a few months of effort, using the best in open source technology for real-time streams (Kafka) and machine learning (H2O).
Where appropriate, we’ll mention how our choice of using the MapR Converged Data Platform made the development easier thanks to some of its unique features.
Speaker
Cao Yi, MapR
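To give a rough feel for the scoring step in a system like the one above, here is a pure-Python stand-in that flags anomalies with a rolling z-score. The real system uses an H2O-trained model with readings arriving on a Kafka topic; both are simplified away here:

```python
from collections import deque
from statistics import mean, stdev

# Stand-in anomaly scorer: a rolling z-score over recent sensor values.
class RollingAnomalyDetector:
    def __init__(self, window=10, threshold=3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def score(self, value):
        # Flag a value that sits far outside the recent distribution.
        anomalous = False
        if len(self.window) >= 3:
            mu, sigma = mean(self.window), stdev(self.window)
            anomalous = sigma > 0 and abs(value - mu) / sigma > self.threshold
        self.window.append(value)
        return anomalous

detector = RollingAnomalyDetector()
readings = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 9.0, 1.0]  # simulated vibration data
alerts = [i for i, v in enumerate(readings) if detector.score(v)]
print(alerts)  # index of the anomalous spike
```

In the microservice layout described, each such scorer would consume one Kafka topic and publish alerts to another, keeping the services independently deployable.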
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi - DataWorks Summit
Cybersecurity requires an organization to collect data, analyze it, and alert on cyber anomalies in near real-time. This is a challenging endeavor when considering the variety of data sources which need to be collected and analyzed. Everything from application logs, network events, authentication systems, IoT devices, business events, cloud service logs, and more needs to be taken into consideration. In addition, multiple data formats need to be transformed and conformed to be understood by both humans and ML/AI algorithms.
To solve this problem, the Aetna Global Security team developed the Unified Data Platform based on Apache NiFi, which allows them to remain agile and adapt to new security threats and the onboarding of new technologies in the Aetna environment. The platform currently has over 60 different data flows with 95% doing real-time ETL and handles over 20 billion events per day. In this session learn from Aetna’s experience building an edge to AI high-speed data pipeline with Apache NiFi.
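The "transform and conform" step such a pipeline performs can be sketched as follows; the source formats and the common schema's field names here are hypothetical, chosen only to illustrate the idea:

```python
# Map source-specific field names onto one shared event shape, as a NiFi
# flow would do per data source. Both schemas below are illustrative.
def conform(raw: dict, source: str) -> dict:
    field_maps = {
        "app_log": {"ts": "timestamp", "msg": "message", "sev": "severity"},
        "cloud_audit": {"eventTime": "timestamp", "detail": "message",
                        "level": "severity"},
    }
    mapping = field_maps[source]
    event = {common: raw[native] for native, common in mapping.items()}
    event["source"] = source  # keep provenance for downstream analysis
    return event

a = conform({"ts": "2019-06-01T12:00:00Z", "msg": "login failed", "sev": "warn"},
            "app_log")
b = conform({"eventTime": "2019-06-01T12:00:01Z", "detail": "role change",
             "level": "info"}, "cloud_audit")
print(a["timestamp"], b["severity"])
```

Once every source is conformed to the same shape, both human analysts and ML/AI models can consume one stream instead of sixty.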
Real-time analysis using an in-memory data grid - Cloud Expo 2013 - ScaleOut Software
ScaleOut technical session at Cloud Expo 2013 in NY. Covers the use of in-memory data grids for real-time analysis of fast-changing data. Includes a financial services example.
Successful AI/ML Projects with End-to-End Cloud Data Engineering - Databricks
Trusted, high-quality data and efficient use of data engineers’ time are critical success factors for AI/ML projects. Enterprise data is complex: it comes from several sources, in a variety of formats, and at varied speeds. For your machine learning projects on Apache Spark, you need a holistic approach to data engineering: finding and discovering, ingesting and integrating, serverless processing at scale, and data governance. Stop by this session for an overview of how to set up AI/ML projects for success while Informatica takes the heavy lifting out of your data engineering.
Transform Your Mainframe Data for the Cloud with Precisely and Apache Kafka - Precisely
Your mainframe does hard work for your business, supporting essential computing transactions every day. However, mainframe data does not easily integrate with the cloud platforms driving data-driven, real-time, analytics-focused business processes. Integrating data from this critical technology often results in high costs and downtime. So, what can you do?
View this on-demand webinar to learn how Precisely Connect can help use the power of Apache Kafka to eliminate data silos and make cloud-based, event-driven data architectures a reality. Start your cloud transformation journey today, knowing you don’t need to leave essential transaction data behind!
During this webinar, you will learn more about:
· Where to begin your cloud transformation journey using mainframe data and Apache Kafka
· What you need to move mainframe data to the cloud while reducing costs, modernizing architectures, and using the staff you have today
· How Precisely Connect customers are using change data capture and Apache Kafka to deliver real-time insights to the cloud
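A minimal sketch of the change-data-capture idea behind the points above, assuming an illustrative event shape (Connect's actual output format and the Kafka consumption step are omitted):

```python
# Apply a stream of change-data-capture events to a keyed replica table,
# keeping a cloud-side copy in sync with the mainframe system of record.
def apply_cdc(replica: dict, event: dict) -> None:
    key = event["key"]
    if event["op"] in ("insert", "update"):
        replica[key] = event["after"]
    elif event["op"] == "delete":
        replica.pop(key, None)

replica = {}
events = [  # as they might arrive, in order, on a Kafka topic
    {"op": "insert", "key": "acct-1", "after": {"balance": 100}},
    {"op": "update", "key": "acct-1", "after": {"balance": 250}},
    {"op": "insert", "key": "acct-2", "after": {"balance": 75}},
    {"op": "delete", "key": "acct-2", "after": None},
]
for e in events:
    apply_cdc(replica, e)
print(replica)  # replica converges to the source's current state
```

Because each event carries the full after-image keyed by record, replaying the stream in order always converges the replica, which is what makes Kafka a good transport for this pattern.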
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud - Stefan Lipp
Take Data Management to the next level: connect Analytics and Machine Learning in a single governed platform consisting of a curated, portable open source stack. Run this platform on-prem, hybrid, or multi-cloud; reuse code and models; and avoid lock-in.
Lightning Fast Analytics with Hive LLAP and Druid - DataWorks Summit
Cox Communications, one of the largest network providers in the U.S., is primarily focused on ensuring network security and providing better service to customers including:
• Real-time monitoring of IP security traffic to identify and alert on unusual network activities across interfaces within the organization
• Equipping the security team with capabilities to determine the source and destination of traffic, class of service, and the causes of congestion from NetFlow data
Challenges:
Data related to network security includes highly granular streaming data. The major challenge lies in having a unified platform to perform data cleansing, transformation, analytics, and reporting on these huge streaming datasets. As network traffic grows, the associated data grows exponentially, so a scalable framework is needed to handle these datasets and derive useful information from them. Along with data processing, data retrieval also plays a major role in enabling better analysis. Previously, data processing was done in a daily batch using manual Python scripts and custom data structures specific to each use case. A more generic, unified framework was needed to provide an automated, real-time, end-to-end solution delivering high-performing, more granular business results.
Solution:
Automating this process offers opportunities on several fronts, notably consistency, repeatability, and modernization of OLAP analytics on an enterprise big data platform. Reports can be generated more easily and faster with the underlying OLAP engine.
• A modern Big Data platform provides the necessary tools and infrastructure to land, cleanse, and process real-time streaming data, enriching it with ecosystem components such as Spark, Kafka, and Hive
• Impressively faster OLAP analytics using Hive LLAP and Druid Integration
• Simple and faster reporting using Superset
All of the necessary components live under one roof: the Hortonworks Hadoop Platform.
An end-to-end solution on this Big Data platform produced faster, repeatable results with sub-second query response.
Value additions from the above solution:
• Deliver ultra-fast SQL analytics that can be consumed from the BI tool by security engineering team to get accelerated business results
• Opportunity for business users to explore and visualize real time streaming datasets with integration for various data sources and build dashboards for different slices
• Capability to run BI queries in just milliseconds over a 1 TB dataset
• High granular permission model on security datasets that allow intricate rules on accessibility for the datasets
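The kind of rollup such an OLAP engine answers in milliseconds can be illustrated with a toy pure-Python aggregation over NetFlow-style records (the field names are hypothetical):

```python
from collections import Counter

# Toy NetFlow-style records; in the real system these stream in via Kafka
# and land in Druid for sub-second slicing.
flows = [
    {"src": "10.0.0.5", "dst": "10.0.1.9", "bytes": 5200},
    {"src": "10.0.0.7", "dst": "10.0.1.9", "bytes": 300},
    {"src": "10.0.0.5", "dst": "10.0.2.3", "bytes": 1800},
    {"src": "10.0.0.8", "dst": "10.0.1.9", "bytes": 950},
]

# A typical OLAP rollup: total bytes by source, then a "top talkers" slice.
bytes_by_src = Counter()
for flow in flows:
    bytes_by_src[flow["src"]] += flow["bytes"]

top_talkers = bytes_by_src.most_common(2)
print(top_talkers)
```

Druid precomputes exactly this sort of rollup at ingest time, which is why the BI dashboards described above can slice it interactively rather than rescanning raw flows.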
The process of streaming real-time data from a wide variety of machine data sources and entities can be very complex and unwieldy. Using an agent-based approach, Informatica has invented a new technique and open access product that makes this process much more user friendly and efficient, even when dealing with multiple environments such as Hadoop, Cassandra, Storm, Amazon Kinesis and Complex Event Processing.
Innovation in the Enterprise Rent-A-Car Data WarehouseDataWorks Summit
Big Data adoption is a journey. Depending on the business the process can take weeks, months, or even years. With any transformative technology the challenges have less to do with the technology and more to do with how a company adapts itself to a new way of thinking about data. Building a Center of Excellence is one way for IT to help drive success.
This talk will explore Enterprise Holdings Inc. (which operates the Enterprise Rent-A-Car, National Car Rental and Alamo Rent A Car) and their experience with Big Data. EHI’s journey started in 2013 with Hadoop as a POC and today are working to create the next generation data warehouse in Microsoft’s Azure cloud utilizing a lambda architecture.
We’ll discuss the Center of Excellence, the roles in the new world, share the things which worked well, and rant about those which didn’t.
No deep Hadoop knowledge is necessary, architect or executive level.
Microsoft: Building a Massively Scalable System with DataStax and Microsoft's...DataStax Academy
We have the challenge of how to reliably store massive quantities of data that are available even in the face of infrastructure failures. We have similar challenges on the application side. The most successful cloud architectures break applications down into microservices. How then do we deploy, upgrade and manage the scale of those microservices? This session will illustrate how to tackle these challenges by taking advantage of both Cassandra and Microsoft's next generation PaaS infrastructure called Azure Service Fabric.
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Data Con LA
Today’s Software Defined environments attempt to remove the weakness of computing hardware from the operational equation. There is no doubt that this is a natural progress away from overpriced, proprietary compute and storage layers. However, even at the heart of any Software Defined universe is an underlying hardware stack that must be robust, reliable and cost effective. Our 20+ years experience delivering over 2000 clusters and clouds has taught us how to properly design and engineer the right hardware solution for Big Data, Cluster and Cloud environments. This presentation will share this knowledge allowing user to make better design decisions for any deployment.
Data in Motion - Data at Rest - Hortonworks a Modern ArchitectureMats Johansson
Presentation at Data Innovation Summit 2016 in Stockholm
How to build a modern data architecture supporting data in motion and data at rest with Hortonworks Data Flow and Data Platform.
Choosing the right platform for your Internet -of-Things solutionIBM_Info_Management
Deploying a solution within the context of the Internet of Things (IoT) typically requires involves many considerations, ranging from the hardware involved to the architecture of the whole environment, and from the decisions about where processing and analytics is to take place to the software choices that allow you to exploit the Internet of Things. This presentation will focus on the need to support a homogeneous processing environment. That is, it will be preferable if processing in all tiers of the IoT is consistent and compatible. This joint presentation will go on to discuss the implications of this consistency for database selection.
Making Hadoop Realtime by Dr. William Bain of Scaleout SoftwareData Con LA
Hadoop has been widely embraced for its ability to economically store and analyze large data sets. Using parallel computing techniques like MapReduce, Hadoop can reduce long computation times to hours or minutes. This works well for mining large volumes of historical data stored on disk, but it is not suitable for gaining real-time insights from live operational data. Still, the idea of using Hadoop for real-time data analytics on live data is appealing because it leverages existing programming skills and infrastructure – and the parallel architecture of Hadoop itself. This presentation will describe how real-time analytics using Hadoop can be performed by combining an in-memory data grid (IMDG) with an integrated, stand-alone Hadoop MapReduce execution engine. This new technology delivers fast results for live data and also accelerates the analysis of large, static data sets.
Integrating and Analyzing Data from Multiple Manufacturing Sites using Apache...DataWorks Summit
In this talk Mark Baker (CSL) will show how CSL Behring is Integrating and Analyzing Data from Multiple Manufacturing Sites using Apache NIFI to a central Hadoop data lake at CSL Behring
The challenge of merging data from disparate systems has been a leading driver behind investments in data warehousing systems, as well as, in Hadoop. While data warehousing solutions are ready-built for RDBMS integration, Hadoop adds the benefits of infinite and economical scale – not to mention the variety of structured and non-structured formats that it can handle. Whether using a data warehouse or Hadoop or both, physical data movement and consolidation is the primary method of integration.
There may also be challenges with synchronizing rapidly changing data from a system of record to a consolidated Hadoop platform .
This introduces the need for “data federation” , where data is integrated without copying data between systems.
For historical/batch data use cases there is a replication of data across remote data hubs into a central data lake using Apache NIFI.
We will demo using Apache Zeppelin for analyzing data using Apache Spark and Apache HIVE.
Presentation at IoT World, May 2016 in Santa Clara, CA. Session "Manage your IoT Sensor Data at the Edge! Control your IoT sensor data at the most appropriate spot" (Thursday, 12 May 2016. IoT & the Cloud Track)
Real-Time Robot Predictive Maintenance in ActionDataWorks Summit
Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, the reality is that some of the most advanced industrial makers in the world are barely getting started making use of this data, with relatively rudimentary, bespoke monitoring systems built at tremendous cost.
We believe that it is now possible, using a well-chosen selection of enterprise open source big data projects, to successfully deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech makers. We propose to show a working prototype of just such a system, and explain in some detail how it was made.
Our presentation describes a working real-time ML-based anomaly detection system. We show a working industrial robot-analog installed with a wireless movement sensor. Our system scores the data in a cloud-based cluster. For added realism, the system we demonstrate live includes a working augmented-reality headset that can show the real-time status overlaid on the working robot.
This talk is about demonstrating a concrete example of a real-time predictive maintenance system, built as a series of microservices connected by Kafka streams and powered by the excellent H2O distributed Machine Learning tool. Our goal is for our attendees to get a feel for what can be realistically achieved by a few non-genius-level engineers in a few months of effort using the best in open source technology for real-time streams (Kafka) and Machine learning (H2O).
Where appropriate, we’ll mention how our choice of using the MapR Converged Data Platform made the development easier thanks to some of its unique features.
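To give a feel for the scoring step in such a pipeline, here is a minimal stand-in for the anomaly detector (not the H2O model from the talk): a rolling z-score over a short window of sensor readings, with illustrative window and threshold values.

```python
from collections import deque
from statistics import mean, pstdev

def detect_anomalies(readings, window=5, threshold=3.0):
    """Flag indices whose value is more than `threshold` standard
    deviations away from the mean of the previous `window` readings."""
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(readings):
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
        history.append(value)
    return anomalies

stream = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 55.0, 10.2]
anomalies = detect_anomalies(stream)   # the spike at index 6 is flagged
```

In the talk's architecture this function would sit in a microservice consuming a Kafka topic and publishing alerts back to another topic.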
Speaker
Cao Yi, MapR
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
Cybersecurity requires an organization to collect data, analyze it, and alert on cyber anomalies in near real-time. This is a challenging endeavor when considering the variety of data sources which need to be collected and analyzed. Everything from application logs, network events, authentication systems, IoT devices, business events, cloud service logs, and more needs to be taken into consideration. In addition, multiple data formats need to be transformed and conformed to be understood by both humans and ML/AI algorithms.
To solve this problem, the Aetna Global Security team developed the Unified Data Platform based on Apache NiFi, which allows them to remain agile and adapt to new security threats and the onboarding of new technologies in the Aetna environment. The platform currently has over 60 different data flows with 95% doing real-time ETL and handles over 20 billion events per day. In this session learn from Aetna’s experience building an edge to AI high-speed data pipeline with Apache NiFi.
Real-time analysis using an in-memory data grid - Cloud Expo 2013ScaleOut Software
ScaleOut technical session at Cloud Expo 2013 in NY. Covers the use of in-memory data grids for real-time analysis of fast-changing data. Includes a financial services example.
Successful AI/ML Projects with End-to-End Cloud Data EngineeringDatabricks
Trusted, high-quality data and efficient use of data engineers’ time are critical success factors for AI/ML projects. Enterprise data is complex—it comes from several sources, in a variety of formats, and at varied speeds. For your machine learning projects on Apache Spark, you need a holistic approach to data engineering: finding & discovering, ingesting & integrating, server-less processing at scale, and data governance. Stop by this session for an overview on how to set up AI/ML projects for success while Informatica takes the heavy lifting out of your data engineering.
Transform Your Mainframe Data for the Cloud with Precisely and Apache KafkaPrecisely
Your mainframe does hard work for your business, supporting essential computing transactions every day. However, mainframe data does not easily integrate with the cloud platforms driving data-driven, real-time, analytics-focused business processes. Integrating data from this critical technology often results in high costs and downtime. So, what can you do?
View this on-demand webinar to learn how Precisely Connect can help use the power of Apache Kafka to eliminate data silos and make cloud-based, event-driven data architectures a reality. Start your cloud transformation journey today, knowing you don’t need to leave essential transaction data behind!
During this webinar, you will learn more about:
· Where to begin your cloud transformation journey using mainframe data and Apache Kafka
· What you need to move mainframe data to the cloud while reducing costs, modernizing architectures, and using the staff you have today
· How Precisely Connect customers are using change data capture and Apache Kafka to deliver real-time insights to the cloud
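The change-data-capture flow described above can be pictured as an ordered stream of change events applied to a cloud-side replica until it converges on the source state. This sketch uses plain dicts in place of Kafka messages; it is not the Precisely Connect API.

```python
# Sketch of change data capture (CDC): each mainframe-side change is
# emitted as an event and replayed, in order, against a replica.
def apply_change(replica, event):
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        replica[key] = event["row"]
    elif op == "delete":
        replica.pop(key, None)
    return replica

events = [
    {"op": "insert", "key": "acct-1", "row": {"balance": 100}},
    {"op": "update", "key": "acct-1", "row": {"balance": 250}},
    {"op": "insert", "key": "acct-2", "row": {"balance": 75}},
    {"op": "delete", "key": "acct-2"},
]

replica = {}
for event in events:
    apply_change(replica, event)
# replica now holds only acct-1, with the updated balance
```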
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Stefan Lipp
Take Data Management to the next level: connect Analytics and Machine Learning in a single governed platform consisting of a curated, portable open source stack. Run this platform on-premises, hybrid, or multi-cloud; reuse code and models and avoid lock-in.
Lightning Fast Analytics with Hive LLAP and DruidDataWorks Summit
Cox Communications, one of the largest network providers in the U.S., is primarily focused on ensuring network security and providing better service to customers including:
• Real-time monitoring of IP security traffic to identify and alert on unusual network activities across interfaces within an organization
• Enrich the security team with capabilities to determine the source and destination of traffic, class of service, and the causes of congestion on NetFlow data
Challenges:
Data related to network security includes highly granular streaming data. The major challenge lies in having a unified platform to perform data cleansing, transformation, analytics, and reporting on these huge streaming datasets. With growing network traffic, the associated data grows exponentially, so a scalable framework is needed to handle these datasets and derive useful information from them. Along with data processing, data retrieval also plays a major role in better analysis. Previously, data processing was done in a daily batch using manual Python scripts and custom data structures specific to each use case. A more generic, unified framework was needed to provide an automated, real-time, end-to-end solution delivering high-performing, more granular business results.
Solution:
Automation of this process has opportunities on several fronts – notably, providing consistency, repeatability, and modernization of OLAP analytics on an enterprise big data platform. Reports can be generated more easily and quickly with the underlying OLAP engine.
• A modern big data platform provides the tools and infrastructure to land, cleanse, and process real-time streaming data, and to enrich it using ecosystem components like Spark, Kafka, and Hive
• Impressively faster OLAP analytics using the Hive LLAP and Druid integration
• Simple, faster reporting using Superset
All of the necessary components sit under one roof: the Hortonworks Hadoop Platform.
The end-to-end big data solution produced faster, repeatable results with sub-second query responses.
Value Additions by above solution:
• Deliver ultra-fast SQL analytics that can be consumed from the BI tool by security engineering team to get accelerated business results
• Opportunity for business users to explore and visualize real time streaming datasets with integration for various data sources and build dashboards for different slices
• Capability to run BI queries in just milliseconds over a 1 TB dataset
• A highly granular permission model on security datasets that allows intricate accessibility rules
Internet of Things Cologne 2015: Why Your Dad’s Database won’t Work for IoT a...MongoDB
IoT is the next big paradigm shift in computing. The move to super-dense sensor networks creates a completely new set of opportunities and challenges for developers, designers and end-users. The databases we designed for the computing environments of the early 90s can no longer support modern, mobile super-scale web applications. In this talk, Joe discussed some of these changes and how they impact the requirements for a modern database.
OSGi Community Event 2014
Abstract:
This presentation tells how OSGi can help developing a distributed and cloud ready Internet of Things platform.
IoT brings unprecedented complexity both in terms of technological variety and new development paradigms. Modularity offered by OSGi is the key concept to build maintainable and robust IoT platforms. OSGi declarative services and dependency injection mechanism allow service producers and service consumers to interact with full respect of mutual component boundaries: this is the fundamental requirement to enable important aspects of an IoT platform like multi-tenancy, separation of concerns between M2M protocols management and application development and dynamic services management.
The Plat.One IoT platform revolves around OSGi technology: this presentation describes the lessons we learned during several years of hands-on OSGi activities and development.
Speaker Bio:
After graduating in Physics with a specialisation in High Energy Physics, he started working in industrial automation and machine-to-machine applications. In 2006 he joined Abo Data, where he started development of the PLAT.ONE IoT and M2M platform. Currently, he leads the PLAT.ONE development team. PLAT.ONE has already been adopted by major telco operators and system integrators to enable a new breed of cloud-based IoT applications and services.
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...MongoDB
Arthur Viegers, Senior Solutions Architect, MongoDB.
The value of the fast growing class of NoSQL databases is the ability to handle high velocity and volumes of data while enabling greater agility with dynamic schemas. MongoDB gives you those benefits while also providing a rich querying capability and a document model for developer productivity. Arthur Viegers outlines the reasons for MongoDB's popularity in IoT applications and how you can leverage the core concepts of NoSQL to build robust and highly scalable IoT applications.
Developing io t applications in the fog a distributed dataflow approachNam Giang
In this paper we examine the development of IoT applications from the perspective of the Fog Computing paradigm, where computing infrastructure at the network edge, in devices and gateways, is leveraged for efficiency and timeliness. Due to the intrinsic nature of the IoT – heterogeneous devices and resources, a tightly coupled perception-action cycle, and widely distributed devices and processing – application development in the Fog can be challenging. To address these challenges, we propose a Distributed Dataflow (DDF) programming model for the IoT that utilises computing infrastructures across the Fog and the Cloud. We evaluate our proposal by implementing a DDF framework based on Node-RED (Distributed Node-RED, or D-NR), a visual programming tool that uses a flow-based model for building IoT applications. Via demonstrations, we show that our approach eases the development process and can be used to build a variety of IoT applications that work efficiently in the Fog.
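The distributed dataflow idea can be sketched very simply: each node in a flow is a function tagged with where it should run ("edge" or "cloud"), and a deployer splits one logical flow across the two tiers. The node names and placement tags below are illustrative, not D-NR's actual API.

```python
# Sketch of a Distributed Dataflow: one logical flow, nodes tagged with
# a placement tier so the same definition can run split across devices.
flow = [
    ("read_sensor", "edge",  lambda x: x),                      # acquire
    ("smooth",      "edge",  lambda x: round(x, 1)),            # pre-process locally
    ("classify",    "cloud", lambda x: "hot" if x > 30 else "ok"),  # heavy step
]

def run_flow(flow, value, tier=None):
    """Run the whole flow, or only the nodes placed on one tier."""
    for name, placement, fn in flow:
        if tier is None or placement == tier:
            value = fn(value)
    return value

label = run_flow(flow, 31.07)              # end-to-end result
edge_only = run_flow(flow, 31.07, "edge")  # what the gateway computes alone
```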
Understanding the Operational Database Infrastructure for IoT and Fast DataVoltDB
Join this webinar as Ryan Betts, CTO of VoltDB, describes several data-as-a-service reference architectures for IoT and discusses a real use case highlighting how an in-memory operational database simplified a large-scale enterprise architecture handling real-time IoT data – faster and smarter. View the webinar in its entirety here: http://learn.voltdb.com/WRIoT.html
The Internet of Things (IoT) is one of the hottest mega-trends in technology – and for good reason: IoT touches all the components of what we consider Web 3.0, including big data analytics, cloud computing, and mobile computing.
How to build a Distributed Serverless Polyglot Microservices IoT Platform us...Animesh Singh
When people aren't talking about VMs and containers, they're talking about serverless architecture. Serverless is about no maintenance. It means you are not worried about low-level infrastructural and operational details. An event-driven serverless platform is a great use case for IoT.
In this session at @ThingsExpo, Animesh Singh, an STSM and Lead for IBM Cloud Platform and Infrastructure, detailed how to build a distributed serverless, polyglot, microservices framework using open source technologies like:
OpenWhisk: Open source distributed compute service to execute application logic in response to events
Docker: To run event-driven actions
Ansible and BOSH: To deploy the serverless platform
MQTT: Messaging protocol for IoT
Node-RED: Tool to wire IoT together
Consul: Tool for service discovery and configuration. Consul is distributed, highly available, and extremely scalable.
Kafka: A high-throughput distributed messaging system.
StatsD/ELK/Graphite: For statistics, monitoring and logging
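The glue between these pieces is an event-driven dispatch loop: something publishes an event on a topic, and every action registered for that topic fires. A minimal sketch follows; the topic names are illustrative, and this is not the OpenWhisk API.

```python
# Sketch of event-driven serverless dispatch: actions register against
# MQTT-style topics; publishing an event triggers every matching action.
actions = {}

def register(topic, fn):
    actions.setdefault(topic, []).append(fn)

def publish(topic, payload):
    """Fire every action registered for the topic; return their results."""
    return [fn(payload) for fn in actions.get(topic, [])]

register("sensors/temperature", lambda p: f"stored {p['value']}")
register("sensors/temperature", lambda p: "alert!" if p["value"] > 40 else "ok")

results = publish("sensors/temperature", {"value": 42})
```

In the real stack, the broker (MQTT/Kafka) and the action runtime (OpenWhisk on Docker) are separate services; here both collapse into one process for clarity.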
Reactive Data Centric Architectures with Vortex, Spark and ReactiveXAngelo Corsaro
An increasing number of software architects are realising that data is the most important asset of a system and are starting to embrace the Data-Centric revolution (datacentricmanifesto.org) – setting data at the center of their architecture and modelling applications as “visitors” to the data. At the same time, architects have also realised that reactive architectures (reactivemanifesto.org) facilitate the design of scalable, fault-tolerant, and high-performance systems.
Yet few architects have realised that reactive and data-centric architectures are two sides of the same coin and should always go hand in hand.
This presentation shows how reactive data-centric systems can be designed and built taking advantage of Vortex data sharing capabilities along with its integration with reactive and data-centric processing technologies such as Apache Spark and ReactiveX.
The Fog Computing [fɒg kəmˈpjuːtɪŋ] paradigm was introduced to extend and overcome the limitations imposed by cloud-centric architectures with respect to their assumptions on connectivity, bandwidth, and latency. As such, Fog Computing aims to bring elastic, high-performance computing, storage, and communication to the edge.
Early demonstrations of Fog Computing architectures, such as those carried out in the Barcelona Smart City demonstrator, have proved the effectiveness of this paradigm, and initiatives such as the OpenFog Consortium aim at popularising and accelerating the adoption of Fog Computing as one of the key paradigms at the foundation of IoT.
In this presentation we explain the forces that drove the introduction of Fog Computing and provide a thorough definition of the underlying architectural style. Additionally, we explore the relationships and synergies that exist between Fog and Cloud Computing. Finally, we show how Vortex naturally supports Fog Computing architectures.
IBM IoT Architecture and Capabilities at the Edge and Cloud Pradeep Natarajan
This slide deck answers the following questions:
1) What does the generalized IoT architecture look like?
2) What is the need for an IoT gateway or IoT edge solution?
3) Why use a database solution in the IoT gateway?
4) Why is IBM Informix the perfect data management solution for IoT gateways at the edge?
How Crosser Built a Modern Industrial Data Historian with InfluxDB and GrafanaInfluxData
Crosser are the creators of Crosser Node, a streaming analytics platform. This real-time analytics engine is installed at the edge and pulls data from any sensor, PLC, DCS, MES, SCADA system or historian. Their drag-and-drop tool enables Industry 4.0 data collection and integration. Discover how Crosser’s easy-to-use IIoT monitoring platform empowers non-developers to connect IIoT machine and sensor data with cloud services.
In this webinar, Dr. Göran Appelquist will dive into:
Crosser’s approach to enabling better IIoT data analysis and anomaly detection
Their methodology for equipping their clients with ML models by supporting all Python-based frameworks
How Crosser uses InfluxDB time series platform for storage
WSO2 Data Analytics Server is a comprehensive enterprise data analytics platform; it fuses batch and real-time analytics of any source of data with predictive analytics via machine learning.
How to scale your PaaS with OVH infrastructure?OVHcloud
ForePaaS has developed an “as-a-service” platform which lets you automate an infrastructure designed for analytical applications. The company has formed a cloud partnership with OVH in order to deliver flexible solutions for containerised and high-performance tools, such as Kubernetes and Docker.
OCCIware@CloudExpoLondon2017 - an extensible, standard XaaS Cloud consumer pl...Marc Dutoo
Who uses multi-cloud today? Everybody. Alas, this leads to a lot of "technical glue". Enter OCCIware's Studio and Runtime: manage all layers and domains of the Cloud (XaaS) in a uniform, standard, extensible way – the Cloud consumer platform. With demos of the Docker & Linked Data Studios and the OCCInterface playground.
Extensible and Standard-based XaaS Platform To Manage Everything in The Cloud...OCCIware
Who uses multi-cloud today? Everybody. Docker AND VMs, scaling internally AND bursting to Amazon, storing on a public cloud except for data legally required to stay within the country: different solutions for different needs, but more often than not used at the same time. Alas, this leads to a "noodle plate" architecture where a lot of "technical glue" for the various, incompatible clouds creeps in and makes it impossible to evolve.
To solve this problem, the OCCIware project builds on the Open Cloud Computing Interface (OCCI) standard's unified, uniform architectural approach and provides a platform to manage all layers and domains of the Cloud (XaaS), with two main components: the OCCIware Studio Factory and Runtime. The talk includes a demonstration of the Docker connector and shows how to use the OCCIware Cloud Designer to configure the business, platform, and infrastructure layers of a real-life, SmartCity-themed Cloud application (a Java API server on top of a MongoDB cluster) seamlessly on both VirtualBox and OW2's OpenStack infrastructure.
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
Whether to take data ingestion cycles off the ETL tool and the data warehouse or to facilitate competitive Data Science and building algorithms in the organization, the data lake – a place for unmodeled and vast data – will be provisioned widely in 2020.
Though it doesn’t have to be complicated, the data lake has a few key design points that are critical, and it does need to follow some principles for success. Build the data lake, but avoid building the data swamp! The tool ecosystem is building up around the data lake, and soon many organizations will have a robust lake alongside a data warehouse. We will discuss policies to keep them straight, send data to its best platform, and keep users’ confidence up in their data platforms.
Data lakes will be built in cloud object storage. We’ll discuss the options there as well.
Get this data point for your data lake journey.
By 2020, 50% of all new software will process machine-generated data of some sort (Gartner). Historically, machine data use cases have required non-SQL data stores like Splunk, Elasticsearch, or InfluxDB.
Today, new SQL DB architectures rival the non-SQL solutions in ease of use, scalability, cost, and performance. Please join this webinar for a detailed comparison of machine data management approaches.
SpringPeople - Introduction to Cloud ComputingSpringPeople
Cloud computing is no longer a passing fad. It is for real and is perhaps the most talked-about subject. Various players in the cloud ecosystem have provided definitions closely aligned to their own sweet spots – be it infrastructure, platforms, or applications.
This presentation will expose participants to a variety of cloud computing techniques, architectures, and technology options, and will familiarize them with cloud fundamentals in a holistic manner spanning dimensions such as cost, operations, and technology.
After a brief introduction to the basic concepts of the Internet of Things, this session provides an overview of the tools Microsoft makes available to developers for building their own IoT solutions: Windows 10 for IoT and several Azure services such as Event Hubs and Stream Analytics. A simple telemetry example is used to show the practical implementation of an end-to-end scenario that transforms data coming from a sensor into information useful for analysis and/or decision-making.
HPC and cloud distributed computing, as a journeyPeter Clapham
Introducing an internal cloud brings new paradigms, tools, and infrastructure management. When placed alongside traditional HPC, the new opportunities are significant. But getting to the new world of micro-services, autoscaling, and autodialing is a journey that cannot be achieved in a single step.
Similar to Informix - The Ideal Database for IoT
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
3. World of Watson 2016
Internet of Things Architecture – Analytics End-to-End
[Architecture diagram: Devices/Sensors → Smart Gateways (Analytics Zone) → Message Broker → Cloud (Deep Analytics Zone); alternate gateway-direct-to-cloud path; insights from big data; flexible hybrid data management]
4. World of Watson 2016
IoT applications have a common set of requirements
Opportunities for innovation
• Quickly and easily provision new sensors
• Create a real-time communication channel with the sensor
• Capture data from the sensor and store it in a time series database
• Provide secure access to the collected data – analytics at the edge and in the cloud, in real time and on historical data
• Trigger events based on specific data conditions
• Interact with the sensor from business/enterprise applications and/or from mobile devices
• Monetize the service based on usage
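One of these requirements – triggering events on specific data conditions – can be sketched as a small rule table: each rule pairs a predicate with a handler, and every incoming reading is checked against all rules. The field names and thresholds below are illustrative.

```python
# Sketch of condition-based event triggering on incoming sensor readings.
rules = [
    (lambda r: r["temp"] > 80,    lambda r: f"overheat on device {r['device']}"),
    (lambda r: r["battery"] < 10, lambda r: f"low battery on device {r['device']}"),
]

def on_reading(reading):
    """Run every rule against a reading; return the triggered events."""
    return [handler(reading) for predicate, handler in rules if predicate(reading)]

events = on_reading({"device": 7, "temp": 85, "battery": 5})
# both rules fire for this reading
```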
5. World of Watson 2016
How Do Gateways Help IoT Solutions?
• Gateways can reduce the cost of the backend cloud
  • Reduces cloud storage by filtering/aggregating/analyzing data locally
  • Reduces cloud CPU requirements by precomputing values
  • Reduces latency since actions can be taken immediately
• Intelligent gateways can detect and respond to local events as they happen rather than waiting for transfer to the cloud
• Some users are not comfortable putting all their data in the cloud
  • Gateways allow customers to capture and get value from their sensors without sending data to the cloud
• Protocol consolidation
  • The cloud does not need to implement the hundreds of IoT protocols
Over time, more and more of the processing will move from the cloud to gateway devices
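The filtering/aggregating point can be made concrete: the gateway summarizes a local batch of readings and ships one record upstream instead of many, cutting cloud storage and CPU. The batch contents below are illustrative.

```python
# Sketch of edge-side aggregation: many raw readings in, one summary out.
def summarize(batch):
    """Precompute the values the cloud would otherwise derive itself."""
    return {
        "count": len(batch),
        "min": min(batch),
        "max": max(batch),
        "avg": sum(batch) / len(batch),
    }

raw = [21.0, 21.5, 22.0, 21.5]   # 4 raw points captured at the edge
summary = summarize(raw)          # 1 record sent to the cloud
```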
6. World of Watson 2016
What are the Requirements for a Gateway Database?
• The database management system must:
  • Have a small install footprint – less than 100 MB
  • Run with low memory requirements – less than 256 MB
  • Use lossless compression techniques to minimize storage space
  • Have built-in support for common types of IoT data like time series, spatial, and JSON data
  • Offer simple application development supporting both NoSQL and SQL
  • Provide driverless, easy access to the data
  • Require little or no administration
  • Ideally, be able to network multiple gateways together to create a single distributed database
The database must be powerful enough to ingest, process, and analyze data in real time
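The lossless-compression requirement can be illustrated with delta encoding plus a general-purpose compressor: regularly sampled sensor values change little between readings, so the deltas compress far better than the raw series. The series below is synthetic; real ratios depend on the data, and Informix's own compression scheme is not shown here.

```python
import zlib

# Sketch of lossless sensor-data compression: delta-encode, then deflate.
readings = [1000 + (i % 3) for i in range(1000)]              # slowly varying samples
deltas = [readings[0]] + [b - a for a, b in zip(readings, readings[1:])]

raw = ",".join(map(str, readings)).encode()                   # plain storage
packed = zlib.compress(",".join(map(str, deltas)).encode())   # compressed deltas

# Lossless: the original series is fully recoverable from the deltas
rebuilt, acc = [], 0
for d in deltas:
    acc += d
    rebuilt.append(acc)
assert rebuilt == readings
assert len(packed) < len(raw)   # far smaller on this kind of series
```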
7. World of Watson 2016
IBM Informix: The Ideal Database for Gateways
Simple to use
Hands-Free operation – No administration
Supports popular interfaces such as MQTT, REST, and Mongo as
well as ODBC/JDBC
Handles SQL and NoSQL data in the same database
Performance
One of a kind support for TimeSeries and Spatial data
Stream data continuously into the database
Run analytics as data arrives
Dynamically add and update analytics when needed
Storage is typically 1/3 the size compared to other vendors
Invisible
Agile
Informix is the only database management system perfectly suited
to run in Gateways
8. World of Watson 2016
Sensor Data is TimeSeries Data
• What is a Time Series?
A logically connected set of records ordered by time
• What are the Key Strengths of Informix TimeSeries?
Much less space required
• Typically about 1/3 the space required by other vendors
Queries run orders of magnitude faster
• Unique optimized storage means code paths are shorter and more data fits in
memory
Purpose built streaming data loader for sensor data
• Automatically run analytic and/or aggregate functions on new data
Can store structured (SQL) or unstructured (JSON) data for quick application
development
• REST/ODBC/JDBC/MongoDB/MQTT interfaces available
Hundreds of predefined functions
• Programming APIs available to create your own analytics
9. World of Watson 2016
Traditional Table Approach

Device_ID | Time         | Sensor1 | Sensor2 | ... | ColN
1         | 1-1-11 12:00 | Value 1 | Value 2 | ... | Value N
2         | 1-1-11 12:00 | Value 1 | Value 2 | ... | Value N
3         | 1-1-11 12:00 | Value 1 | Value 2 | ... | Value N
1         | 1-1-11 12:15 | Value 1 | Value 2 | ... | Value N
2         | 1-1-11 12:15 | Value 1 | Value 2 | ... | Value N
3         | 1-1-11 12:15 | Value 1 | Value 2 | ... | Value N
...

Informix TimeSeries Approach

Device_ID | Series
1 | [(1-1-11 12:00, value 1, value 2, ..., value N), (1-1-11 12:15, value 1, value 2, ..., value N), ...]
2 | [(1-1-11 12:00, value 1, value 2, ..., value N), (1-1-11 12:15, value 1, value 2, ..., value N), ...]
3 | [(1-1-11 12:00, value 1, value 2, ..., value N), (1-1-11 12:15, value 1, value 2, ..., value N), ...]
4 | [(1-1-11 12:00, value 1, value 2, ..., value N), (1-1-11 12:15, value 1, value 2, ..., value N), ...]
...

Traditional sensor data storage vs. Informix TimeSeries storage
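The contrast between the two layouts can be sketched in plain Python (illustrative only, not how Informix stores data on disk): the traditional table holds one row per reading, while the TimeSeries layout keeps one entry per device with its readings contiguous and time-ordered:

```python
# Traditional layout: one row per (device, timestamp) reading.
flat_rows = [
    (1, "1-1-11 12:00", 20.1, 55.0),
    (2, "1-1-11 12:00", 19.8, 54.2),
    (1, "1-1-11 12:15", 20.3, 54.9),
    (2, "1-1-11 12:15", 19.9, 54.1),
]

# TimeSeries-style layout: one entry per device, readings in time order.
series = {}
for device_id, ts, sensor1, sensor2 in flat_rows:
    series.setdefault(device_id, []).append((ts, sensor1, sensor2))

# A per-device time-range scan now walks one contiguous list instead of
# filtering the whole table -- the source of the shorter code paths and
# better memory locality claimed above.
for reading in series[1]:
    print(reading)
```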
10. World of Watson 2016
IoT Requirements for SpatioTemporal Data
• Many IoT applications have a spatial component
to them
Vehicles, cell phones, even pets… tracking is
common
• In these cases both location and time are
important
Show me the vehicles that have passed by location
X in the last hour
Where has my car been over the last few hours?
• Informix allows you to combine Time series and
Spatial data in the same query
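A query like "show me the vehicles that have passed by location X in the last hour" combines a time predicate with a spatial one. A minimal Python sketch of that combined filter (the sample data, radius, and helper name are invented for illustration; in Informix this would be a single SQL query over TimeSeries and Spatial types):

```python
from datetime import datetime, timedelta

# (vehicle_id, timestamp, x, y) position samples -- invented example data
positions = [
    ("car1", datetime(2016, 10, 24, 9, 30), 1.0, 1.0),
    ("car2", datetime(2016, 10, 24, 9, 45), 5.0, 5.0),  # too far away
    ("car1", datetime(2016, 10, 24, 7, 0), 1.1, 0.9),   # too old
]

def passed_near(target, radius, since, rows):
    """Vehicles within `radius` of `target` at any sample newer than `since`."""
    tx, ty = target
    return sorted({vid for vid, ts, x, y in rows
                   if ts >= since and (x - tx) ** 2 + (y - ty) ** 2 <= radius ** 2})

now = datetime(2016, 10, 24, 10, 0)
print(passed_near((1.0, 1.0), 0.5, now - timedelta(hours=1), positions))  # ['car1']
```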
11. World of Watson 2016
Ability to Recognize Patterns and Predict Events
[Chart: a sensor time series with an abnormal or critical pattern highlighted, and similar patterns found elsewhere in the series]
• Patterns are significant – heart rate, ECG, blood glucose, respiratory rate,
physical activity, …
• Now looking for anomalies, deviations, and symptomatic patterns
12. World of Watson 2016
Both Structured and Unstructured Data is Common in IoT
[Diagram: SQL and NoSQL drivers joining SQL data with a JSON collection in the same database, running on an embedded device or database server, with horizontal scale-out via shards]
• Informix can store SQL and JSON data in the same database
• Write programs using SQL drivers or Mongo/NoSQL drivers
• SQL data is automatically transformed into JSON documents when needed
• NoSQL data is automatically transformed into SQL when needed
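The automatic SQL ↔ JSON transformation can be sketched like this (a toy illustration of the idea, not the Informix implementation; the column names are invented):

```python
import json

columns = ("device_id", "temp_c")  # hypothetical relational schema

def row_to_doc(row):
    """Expose a relational row as a JSON document (SQL -> NoSQL view)."""
    return json.dumps(dict(zip(columns, row)))

def doc_to_row(doc):
    """Map a JSON document back onto the relational columns (NoSQL -> SQL view)."""
    d = json.loads(doc)
    return tuple(d[c] for c in columns)

doc = row_to_doc((42, 21.5))
print(doc)              # {"device_id": 42, "temp_c": 21.5}
print(doc_to_row(doc))  # (42, 21.5)
```

Because both views map onto the same stored data, an application can write through a Mongo driver and query through SQL, or vice versa.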
13. World of Watson 2016
Informix Data Access Options
[Diagram: MongoDB and REST clients connect to the Informix DBMS through the Informix NoSQL Listener (NoSQL ↔ SQL translation; MQTT, REST, and MongoDB protocol support); SQLI and DRDA clients connect directly (SQLI and DRDA protocol support). The server stores relational, JSON collection, time series, and spatial data.]
A REST client is any program capable of making an HTTP request.
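Since a REST client is just any program that can make an HTTP request, inserting a document through the listener can look like the sketch below. The host, port, database, and collection path are hypothetical placeholders, as is the payload; only the request construction is shown (`urlopen` would actually send it):

```python
import json
import urllib.request

# Hypothetical listener endpoint: http://<host>:<port>/<database>/<collection>
payload = json.dumps({"device_id": 7, "temp_c": 19.5}).encode("utf-8")
req = urllib.request.Request(
    "http://gateway.example.com:8080/iotdb/readings",  # placeholder URL
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would perform the insert against a live listener.
print(req.method, req.full_url)
```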
14. World of Watson 2016
Informix Data Access Options
[Diagram: same as the previous slide, with an MQTT client added among the clients connecting through the Informix NoSQL Listener]
You can use all the client drivers that are available for MongoDB with the NoSQL Listener.
15. World of Watson 2016
IBM IoT Smart Gateway Kit
• git clone https://github.com/ibm-iot/iot-gateway-kit.git
• The iot-gateway-kit will install the following:
▪ Node.js
▪ Node-RED
▪ TimeSeries nodes
▪ Bluetooth node.js application sample
16. World of Watson 2016
IoT Developers - Get Started!
• Smart Gateway kit - https://ibm.biz/BdXr2W
• Code samples - https://ibm.biz/BdX4QV
• Github - https://github.com/IBM-IoT/
17. World of Watson 2016
Informix on Docker Hub
https://registry.hub.docker.com/u/ibmcom/informix-innovator-c/
• IBM Informix Innovator-C
• 12.10.FC7W1
https://registry.hub.docker.com/r/ibmcom/informix-rpi/
• IBM Informix Developer Edition for Raspberry Pi (32bit)
Docker Hub:
$ docker pull ibmcom/informix-innovator-c
18. World of Watson 2016 18
Informix for the Cloud and
Operational Zone
19. World of Watson 2016 19
What are the IoT Requirements for the Cloud?
• Requirements - similar to gateways (but for different reasons):
• Potentially thousands of servers means zero administration is a must
• Data volume adds up very quickly. Low storage overhead is required
• Data flows into the cloud continuously and must be processed in real-time
• Must be able to handle time series, spatial, and NoSQL data natively
• Additional requirements
• Must be able to scale-out
• Must be available as a service
The database must be able to ingest, process, and analyze
data in real time
20. World of Watson 2016 20
Why use Informix in the “Operational Zone”?
Simple to use
• Hands-Free operation
• Supports REST and Mongo APIs as well as ODBC/JDBC
• Stores SQL and JSON data in the same database
Highly Available
• Close to zero downtime
• Partition or Hash your data across servers in the cloud
• Dynamically add/remove additional servers
Performance
• Continuous High Performance Analytics
• Specialized support for Time Series and Spatial data
Invisible
Agile
Resilient
21. World of Watson 2016 21
Shards: Scale-out your Database across Servers or Gateways
• Distribute data among servers by
range or hash partitioning
• Each shard can have an associated
secondary server for high availability
• Run queries across all shards or a
subset of the shards
• Only shards that could qualify are
searched
• Shards are searched in parallel
• Ignores shards that are offline
[Diagram: shards distributed across a cloud]
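A minimal sketch of the hash-partitioning idea described above (illustrative only; Informix's actual shard routing is internal to the server, and the cluster size here is invented):

```python
NUM_SHARDS = 4  # hypothetical cluster size

def shard_for(device_id):
    """Hash partitioning: each key maps to exactly one shard."""
    return device_id % NUM_SHARDS

shards = {i: [] for i in range(NUM_SHARDS)}
for device_id in (3, 7, 8, 15):
    shards[shard_for(device_id)].append(device_id)

# A query with an equality predicate on device_id only searches the one
# shard that could qualify; a scan over all devices would hit the shards
# in parallel and skip any that are offline.
print(shard_for(7), shards[shard_for(7)])  # 3 [3, 7, 15]
```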
22. World of Watson 2016 22
Use Informix Warehouse Accelerator for Mixed Operational/Analytic Workloads
[Diagram: apps send SQL queries to Informix; a query router forwards eligible queries over TCP/IP to the Informix Warehouse Accelerator, whose bulk loader maintains a compressed, in-memory copy of (a portion of) the data warehouse; the accelerator's query processor returns results to Informix]
Informix Warehouse Accelerator:
• Connects to Informix via TCP/IP & DRDA
• Analyzes, compresses, and loads data into memory – a copy of (a portion of) the warehouse
• Processes routed SQL queries and returns answers to Informix
Informix:
• Routes SQL queries to the accelerator – users need not change SQL or apps
• Can always run a query itself instead, e.g. when the estimated execution time is too short to benefit from acceleration
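The routing decision can be sketched as a simple cost check (the 500 ms threshold and the function are invented placeholders; the real router uses the optimizer's cost estimates and knowledge of which data is loaded in the accelerator):

```python
ACCEL_THRESHOLD_MS = 500  # invented cutoff, for illustration only

def route(estimated_ms):
    """Pick where a query runs; apps and their SQL stay unchanged either way."""
    if estimated_ms < ACCEL_THRESHOLD_MS:
        return "informix"      # too short an estimated execution time to benefit
    return "accelerator"       # long analytic scan: use the in-memory copy

print(route(5))     # informix
print(route(9000))  # accelerator
```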
23. World of Watson 2016 23
Every IoT deployment will need to store time series data and
possibly spatial data and/or NoSQL data
Bluemix Cloud Service
Informix on Cloud – Hosted Service
• Quickly and simply deploy Informix
• Pre-installed and pre-configured instance
• Multiple size options (S, M, L, XL)
25. World of Watson 2016
Changing Business Model – Health care & Assisted Living
[Diagram: sensor data from hundreds of patients and thousands of devices flows through a gateway into an Informix historian for operational analytics]
1. Sensor data input – Patient/Caregiver
• Automatic sensors to monitor well-being
• Pendants, shower & bath buttons
• Activity sensors – rising in the morning, taking meds, using the fridge
• Bed & chair sensors for inactivity monitoring
• Outside alarms to alert neighbors
2. Data consolidation at the gateway
• Embedded at device/gateway
• Filter critical and life-saving data
• Locally act upon insights – e.g. blood pressure threshold exceeded
3. Notification to Assisted Living Central Monitoring Station
• Local decision making at the facility
• Display alerts and recommended actions
4. Informix historian – operational analytics
• Collection and analysis of data for all devices across assisted living facilities
• Leverage all data: NoSQL/SQL & TimeSeries data
5. Act on the insights
• Change patient's medication, closer monitoring, prevent stroke
• Assisted Living Corporation changes food sodium usage based on trend of high blood pressure
26. World of Watson 2016 26
And Many More…
See for yourself and talk to us @
Analytics Demo Room – 530
DMT09, DMT13
27. World of Watson 2016
Summary
• IBM Informix - best fit for IoT architecture
• IoT gateway
• IoT cloud analytics
• Supported on a wide array of platforms
• Best in class embeddability
• Native support for sensor data - TimeSeries & Spatial data
• Native support for unstructured (JSON) data
• Ease of application development - REST access
• Support to receive IoT data via MQTT protocol
• High availability and dynamic scaling