© 2016 IBM Corporation
IoT Analytics from Edge to Cloud
© 2016 IBM Corporation2
Agenda
 What is Edge
 Edgent (formerly Quarks) overview
 Database (Informix) on the edge
 Database in the Cloud
 Spark
© 2016 IBM Corporation3
What is the Edge?
 Systems with constrained compute platform
 Due to cost, weight restrictions, space constraints, borrowing resources, ...
 Limited connectivity to central systems
 Limited by expense, bandwidth
 Periods of being disconnected
 Can be mobile or static
 Access to sensors for system being analyzed
 Potential to control system based on analysis
 Typically systems that exist where data(sensor) is generated
© 2016 IBM Corporation4
Edge Examples
 Vehicle
 Car, truck, race car, bike, train, bus, boat, drone, plane, ...
 Analyze engine sensors to predict/reduce chance of failure
 Mobile, may lose connectivity
 Building/Store/Warehouse
 HVAC, climate, energy use, motion sensors, ...
 Server in machine room
 Analyze load, cpu temps, rack temps
 Raspberry Pi with a couple of sensors
 Cheap $5+
 Smartphone
 Industrial machine
© 2016 IBM Corporation5
What is the value of analytics at the edge?
• Reduce communication cost
• Send relevant data when an event of interest occurs
• Heartbeats alone may not contain enough data or be
too late to take action
• React locally to events
• More intelligent decision making on the device
• Execute analytics while disconnected
• Streaming analysis
• Data is constantly being generated
• Needing real-time/continuous analyzing
• Collaborate with related devices
• Learn from devices with similar characteristics
• or location
© 2016 IBM Corporation6
Streaming and In-database Analytics at the Edge
• Streaming analytics
• Data is constantly being generated
• Needing real-time/continuous analyzing
• In-memory analysis of data
• In-database analytics
• Store data at the edge efficiently and securely
• Provide historical analytics on the data
• Visualize the historical data at the edge
• Perform complex/advanced analytics on historical
data: windowed aggregates, pattern matching,
anomaly detection, etc.
© 2016 IBM Corporation7
Introducing Apache
 Apache Edgent (renamed from Quarks)
 A community for accelerating Edge Analytics
 Open Source, incubating at Apache Software Foundation
• http://edgent.incubator.apache.org/
• http://wiki.apache.org/incubator/QuarksProposal
 Programming model and runtime for analytics at the edge
 A programming SDK with Functional Flow API for streaming analytics
• Initial support for Java & Android, Goal is to support multiple languages with priorities driven by the community
 A modular, lightweight and embeddable runtime
 Initially built by IBM based on 12 years experience with IBM Streams
© 2016 IBM Corporation8
Features
 Quarks for edge analytics – on device or gateway
 Lightweight embedded streaming analytics runtime
 Analyze events locally on the edge
 Reduce communication costs by only sending relevant events
 Functional Flow API for streaming analytics
 Per-event and windowed processing with simple analytics and pattern detection
• Map, FlatMap, Filter, Aggregate
 Connectors
 Messaging systems & data stores
• MQTT, HTTP, Web Sockets, JDBC, File, Apache Kafka and IBM Watson IoT Platform, Informix
 Micro-kernel style runtime with multi-platform support
 Small-footprint edge devices or sensors
• Including Raspberry Pis or smart phones
© 2016 IBM Corporation9
A Simplified IoT/Industry 4.0 Sensor Data Flow
Sensor Data History
Sensors
In-memory Analytics
Predictive
Analytics
Publish /
Subscribe
Cloud
Infrastructure
HDFS / HadoopStream Processing
Real-time Analytics Operational Analytics Big Data Analytics
(no gateway)
(Gateways) IBM Watson IoT Platform
© 2016 IBM Corporation10
Typical IoT Scenario
Industrial IoT-/Edge-Gateway(s)
with specific industry communication stacks
and local Analytic capabilties (in motion / at rest analytics)
Industrial Shop Floor Sensors
Raw Sensor Data
via industry specific
automation protocols
(e.g. PROFINET, BACnet,
OPC UA, Modbus etc.)
Aggregated / Filtered Sensor Data
via standard communication protocols
and formats (e.g. MQTT/JSON) into
the Cloud (IBM Bluemix, IBM IoT Platform),
IBM PMQ plus on premises solutions etc.
© 2016 IBM Corporation11
Informix - The Sensor Data optimized Hybrid Database
 Superior IoT Sensor Data Performance and Scalability in the Cloud,
on Premises and on Edge Gateways
 Innovative Sensor Data Storage and Processing w/o the need for costly all-or-
nothing In-Memory database technologies
 Large Scale and Embedded Sensor Data Benchmarks
 Simplified Application Development due to Integration of all Sensor
Data relevant data types in one hybrid Database
 Relational, Time Series, JSON and Geospatial (all joinable!)
 JSON based Time Series for full Schema Flexibilty
 Free Developer Editions and Docker Images available
 Open, Standardized APIs allow easy Integration into existing
Architectures
 SQL, REST, MongoDB, Node-RED, MQTT, JDBC, ODBC, .NET etc.
 Very Cost Effective Operations in all Deployment Models
 Very efficient HW utilization (less disk space, less CPU & Memory) compared to
other approaches
© 2016 IBM Corporation12
DEMO
Recorded video of the demo
(https://ibm.box.com/s/hye8csvq14f6cylnk8cqrmtu0h5n2m8h)
© 2016 IBM Corporation13
Live Demo: Sensor Data Flow - Edgent & Informix on a Raspberry Pi
Sensortag
to
Node-red
(Node.js)
Raw sensor data
(JSON) captured
every 100ms by
node-red and sent to
Edgent via
Websockets
JDBC
Stream
(Insert)
Dynamic
Stream
Filter
Raspberry Pi
Time Series Data
Data
Aggregated
and
Thresholds
checked
Last, dynamic
moving average
over 30 seconds
Raw sensor
Data
Node-RED Dashboard
WebSockets
© 2016 IBM Corporation14
In-Motion & At-Rest Analytics on the Edge: & Informix
DirectProvider tp = new DirectProvider();
Topology t = tp.newTopology("InformixQuarksSensorTagMQTT");
JdbcStreams mydb = new JdbcStreams(t, () -> InformixConnector.getIfxDataSource() ,
(dataSource) -> dataSource.getConnection() );
MqttConfig mqttConfig = new MqttConfig("tcp://localhost:1883", "");
MqttStreams mqtt = new MqttStreams(t, () -> mqttConfig);
TStream<String> msgs = mqtt.subscribe("sensortag/cc2650", 0);
msgs = msgs.map(tuple -> new JSONObject(new JSONObject(tuple), new String[] {"tstamp","lux"}).toString());
mydb.executeStatement(msgs,
() -> "INSERT into quarks_json_ts_v values (?,?,?)",
(g, stmt) -> { String ts = EpochtoTS(new JSONObject(g).getLong("tstamp"));
stmt.setString(1, "RAW");
stmt.setString(2, ts);
BSONObject bson = new BasicBSONObject("value_r", new JSONObject(g).getDouble("lux"));
stmt.setObject(3, new IfxBSONObject(bson));
} );
mqtt.publish(msgs, "values/raw", 0, false);
TStream<String> threshold = mydb.executeStatement(msgs,
() -> "select elem.timestamp, elem.value FROM (select GetLastElem(Apply('TSRunningAvg($value_r, 30)',
CURRENT - 30 UNITS SECOND, CURRENT, sensor_values)::TimeSeries(one_value_t))
from quarks_json_ts where sensor_id = ?) AS t(elem);",
(g, stmt) -> stmt.setString(1,"RAW"),
(g, rs, exc, consumer) -> {
if (exc != null) { System.err.println("Unable to execute SELECT: " + exc); }
else {
rs.next();
Double last = rs.getDouble("value");
String ts = rs.getString("timestamp");
JSONObject jo = null;
if (ts != null)
{
jo = new JSONObject().put("threshold", setThresholdValue(last)).put("tstamp", TStoEpoch(ts));
}
consumer.accept(jo.toString());
}
}
);
mqtt.publish(threshold, "values/threshold", 0, false);
msgs = msgs.filter(tuple -> new JSONObject(tuple).getDouble("lux") > thresholdValue);
Insert raw sensor data from stream into an Informix time series
Dynamically calculate moving average from raw time series data
Apply dynamic moving average as a dynamic filter to the stream
Create a new Edgent stream from an MQTT queue
© 2016 IBM Corporation15
Deeper Cloud Analytics
 Integrates with centralized analytics
systems through IoT scale message hub
 Any hub
 Any central system
 Informix Hosted Service
 Push data to Spark for deeper analysis
 https://github.com/IBM-
IoT/InformixSparkStreaming
© 2016 IBM Corporation16
Heart Monitor Informix
Message
Broker
Apache Spark
Analytics
Display Results
IOT devices send data into
the Informix server
Data Streams from Informix
into an MQTT broker
From MQTT Data is
streamed into Spark for real-
time Analysis
Results from both Informix
and Spark available to the
end user
IoT Spark Streaming Analytics with Informix
© 2016 IBM Corporation17
Sample Code (Edge)
http://github.com/IBM-IoT/edgent-demo
IBM IoT Gateway Kit
http://github.com/IBM-IoT/iot-gateway-kit

IoT Analytics from Edge to Cloud - using IBM Informix

  • 1.
    © 2016 IBMCorporation IoT Analytics from Edge to Cloud
  • 2.
    © 2016 IBMCorporation2 Agenda  What is Edge  Edgent (formerly Quarks) overview  Database (Informix) on the edge  Database in the Cloud  Spark
  • 3.
    © 2016 IBMCorporation3 What is the Edge?  Systems with constrained compute platform  Due to cost, weight restrictions, space constraints, borrowing resources, ...  Limited connectivity to central systems  Limited by expense, bandwidth  Periods of being disconnected  Can be mobile or static  Access to sensors for system being analyzed  Potential to control system based on analysis  Typically systems that exist where data(sensor) is generated
  • 4.
    © 2016 IBMCorporation4 Edge Examples  Vehicle  Car, truck, race car, bike, train, bus, boat, drone, plane, ...  Analyze engine sensors to predict/reduce chance of failure  Mobile, may lose connectivity  Building/Store/Warehouse  HVAC, climate, energy use, motion sensors, ...  Server in machine room  Analyze load, cpu temps, rack temps  Raspberry Pi with a couple of sensors  Cheap $5+  Smartphone  Industrial machine
  • 5.
    © 2016 IBMCorporation5 What is the value of analytics at the edge? • Reduce communication cost • Send relevant data when an event of interest occurs • Heartbeats alone may not contain enough data or be too late to take action • React locally to events • More intelligent decision making on the device • Execute analytics while disconnected • Streaming analysis • Data is constantly being generated • Needing real-time/continuous analyzing • Collaborate with related devices • Learn from devices with similar characteristics • or location
  • 6.
    © 2016 IBMCorporation6 Streaming and In-database Analytics at the Edge • Streaming analytics • Data is constantly being generated • Needing real-time/continuous analyzing • In-memory analysis of data • In-database analytics • Store data at the edge efficiently and securely • Provide historical analytics on the data • Visualize the historical data at the edge • Perform complex/advanced analytics on historical data: windowed aggregates, pattern matching, anomaly detection, etc.
  • 7.
    © 2016 IBMCorporation7 Introducing Apache  Apache Edgent (renamed from Quarks)  A community for accelerating Edge Analytics  Open Source, incubating at Apache Software Foundation • http://edgent.incubator.apache.org/ • http://wiki.apache.org/incubator/QuarksProposal  Programming model and runtime for analytics at the edge  A programming SDK with Functional Flow API for streaming analytics • Initial support for Java & Android, Goal is to support multiple languages with priorities driven by the community  A modular, lightweight and embeddable runtime  Initially built by IBM based on 12 years experience with IBM Streams
  • 8.
    © 2016 IBMCorporation8 Features  Quarks for edge analytics – on device or gateway  Lightweight embedded streaming analytics runtime  Analyze events locally on the edge  Reduce communication costs by only sending relevant events  Functional Flow API for streaming analytics  Per-event and windowed processing with simple analytics and pattern detection • Map, FlatMap, Filter, Aggregate  Connectors  Messaging systems & data stores • MQTT, HTTP, Web Sockets, JDBC, File, Apache Kafka and IBM Watson IoT Platform, Informix  Micro-kernel style runtime with multi-platform support  Small-footprint edge devices or sensors • Including Raspberry Pis or smart phones
  • 9.
    © 2016 IBMCorporation9 A Simplified IoT/Industry 4.0 Sensor Data Flow Sensor Data History Sensors In-memory Analytics Predictive Analytics Publish / Subscribe Cloud Infrastructure HDFS / HadoopStream Processing Real-time Analytics Operational Analytics Big Data Analytics (no gateway) (Gateways) IBM Watson IoT Platform
  • 10.
    © 2016 IBMCorporation10 Typical IoT Scenario Industrial IoT-/Edge-Gateway(s) with specific industry communication stacks and local Analytic capabilties (in motion / at rest analytics) Industrial Shop Floor Sensors Raw Sensor Data via industry specific automation protocols (e.g. PROFINET, BACnet, OPC UA, Modbus etc.) Aggregated / Filtered Sensor Data via standard communication protocols and formats (e.g. MQTT/JSON) into the Cloud (IBM Bluemix, IBM IoT Platform), IBM PMQ plus on premises solutions etc.
  • 11.
    © 2016 IBMCorporation11 Informix - The Sensor Data optimized Hybrid Database  Superior IoT Sensor Data Performance and Scalability in the Cloud, on Premises and on Edge Gateways  Innovative Sensor Data Storage and Processing w/o the need for costly all-or- nothing In-Memory database technologies  Large Scale and Embedded Sensor Data Benchmarks  Simplified Application Development due to Integration of all Sensor Data relevant data types in one hybrid Database  Relational, Time Series, JSON and Geospatial (all joinable!)  JSON based Time Series for full Schema Flexibilty  Free Developer Editions and Docker Images available  Open, Standardized APIs allow easy Integration into existing Architectures  SQL, REST, MongoDB, Node-RED, MQTT, JDBC, ODBC, .NET etc.  Very Cost Effective Operations in all Deployment Models  Very efficient HW utilization (less disk space, less CPU & Memory) compared to other approaches
  • 12.
    © 2016 IBMCorporation12 DEMO Recorded video of the demo (https://ibm.box.com/s/hye8csvq14f6cylnk8cqrmtu0h5n2m8h)
  • 13.
    © 2016 IBMCorporation13 Live Demo: Sensor Data Flow - Edgent & Informix on a Raspberry Pi Sensortag to Node-red (Node.js) Raw sensor data (JSON) captured every 100ms by node-red and sent to Edgent via Websockets JDBC Stream (Insert) Dynamic Stream Filter Raspberry Pi Time Series Data Data Aggregated and Thresholds checked Last, dynamic moving average over 30 seconds Raw sensor Data Node-RED Dashboard WebSockets
  • 14.
    © 2016 IBMCorporation14 In-Motion & At-Rest Analytics on the Edge: & Informix DirectProvider tp = new DirectProvider(); Topology t = tp.newTopology("InformixQuarksSensorTagMQTT"); JdbcStreams mydb = new JdbcStreams(t, () -> InformixConnector.getIfxDataSource() , (dataSource) -> dataSource.getConnection() ); MqttConfig mqttConfig = new MqttConfig("tcp://localhost:1883", ""); MqttStreams mqtt = new MqttStreams(t, () -> mqttConfig); TStream<String> msgs = mqtt.subscribe("sensortag/cc2650", 0); msgs = msgs.map(tuple -> new JSONObject(new JSONObject(tuple), new String[] {"tstamp","lux"}).toString()); mydb.executeStatement(msgs, () -> "INSERT into quarks_json_ts_v values (?,?,?)", (g, stmt) -> { String ts = EpochtoTS(new JSONObject(g).getLong("tstamp")); stmt.setString(1, "RAW"); stmt.setString(2, ts); BSONObject bson = new BasicBSONObject("value_r", new JSONObject(g).getDouble("lux")); stmt.setObject(3, new IfxBSONObject(bson)); } ); mqtt.publish(msgs, "values/raw", 0, false); TStream<String> threshold = mydb.executeStatement(msgs, () -> "select elem.timestamp, elem.value FROM (select GetLastElem(Apply('TSRunningAvg($value_r, 30)', CURRENT - 30 UNITS SECOND, CURRENT, sensor_values)::TimeSeries(one_value_t)) from quarks_json_ts where sensor_id = ?) AS t(elem);", (g, stmt) -> stmt.setString(1,"RAW"), (g, rs, exc, consumer) -> { if (exc != null) { System.err.println("Unable to execute SELECT: " + exc); } else { rs.next(); Double last = rs.getDouble("value"); String ts = rs.getString("timestamp"); JSONObject jo = null; if (ts != null) { jo = new JSONObject().put("threshold", setThresholdValue(last)).put("tstamp", TStoEpoch(ts)); } consumer.accept(jo.toString()); } } ); mqtt.publish(threshold, "values/threshold", 0, false); msgs = msgs.filter(tuple -> new JSONObject(tuple).getDouble("lux") > thresholdValue); Insert raw sensor data from stream into an Informix time series Dynamically calculate moving average from raw time series data Apply dynamic moving average as a dynamic filter to the stream Create a new Edgent stream from an MQTT queue
  • 15.
    © 2016 IBMCorporation15 Deeper Cloud Analytics  Integrates with centralized analytics systems through IoT scale message hub  Any hub  Any central system  Informix Hosted Service  Push data to Spark for deeper analysis  https://github.com/IBM- IoT/InformixSparkStreaming
  • 16.
    © 2016 IBMCorporation16 Heart Monitor Informix Message Broker Apache Spark Analytics Display Results IOT devices send data into the Informix server Data Streams from Informix into an MQTT broker From MQTT Data is streamed into Spark for real- time Analysis Results from both Informix and Spark available to the end user IoT Spark Streaming Analytics with Informix
  • 17.
    © 2016 IBMCorporation17 Sample Code (Edge) http://github.com/IBM-IoT/edgent-demo IBM IoT Gateway Kit http://github.com/IBM-IoT/iot-gateway-kit