Basho Technologies | 1
Scaling Time Series Applications
Basho
Dorothy Pults – Product Evangelist @deepults
Tom Sigler – Solution Architect @tom_sigler
Databricks
Peyman Mohajerian - Solution Architect @mohajeri
BASHO TECHNOLOGIES
Distributed Systems Software for Big Data and IoT applications
2011 - Creators of Riak
• Riak KV: NoSQL Key Value database
• Riak TS: NoSQL Time Series database
• Integrations: Spark, Redis caching, Solr, Mesos, Riak S2
120+ employees
Global Offices
• Seattle (HQ), Washington DC, London, Paris
1/3 of the Fortune 50
$1.3 trillion market spend on the Internet of Things in 2019
30 billion installed base of IoT endpoints in 2020
*Source: IDC
56% have integrated IoT data
IoT is 24% of the average IT budget
20% decrease in downtime
21% increase in revenue
*Source: Vodafone IoT Barometer
CRITICAL SUCCESS FACTORS FOR IOT
• Explore new business models
• Address key IoT challenges like Edge Analytics
• Provide comprehensive solutions
• Engage with a broader ecosystem
100TB DAILY – IOT AND WEATHER DATA
530M personal weather station reports each day
9M webcam uploads
2M crowd reports
> 20M IoT barometric reports
WEATHER FORECAST PREDICTS SALES
Ideal BERRY purchasing weather
turns out to be low wind with
temperatures below 80 degrees.
People are more likely to eat STEAK
when it's warm out with higher winds
but no rain, but not if it gets too hot.
EDGE ANALYTICS
• Edge Analytics
• Fog Computing
• Inverted Web
• Reverse CDN
NEW ECOSYSTEM – DATA PIPELINE
WHAT’S NEEDED TO SCALE FOR IoT
• A database optimized for IoT data
• Review your data life cycle
• Summations and aggregation
• Data expiration
• Data cleansing
• Processing close to devices
• Scale for unstructured metadata
TIME SERIES (TS) DATA
• Consists of successive observations made
over a time interval
• Structured
• Time + State/Measurement
• Metadata/Context
• Frequency
TIME SERIES CHALLENGES AT SCALE
• Ingestion Velocity
• Data Volume
• Post Ingestion Workloads
– Real time
– Batch
• Lifecycle/Expiry
Riak TS Overview &
Architecture
WHAT IS RIAK TS?
Riak TS is a distributed NoSQL key/value store optimized for time
series data.
It provides a time series database solution that is extensible and
scalable.
Riak TS is derived from Riak KV and adds the ability to co-locate data
by composite primary key, including quanta, for efficient sequential
read I/O operations.
Why Riak TS?
• Highly available
• Fault Tolerant
• Geo data locality
• Scalability
– Operations
– Real-time range query performance
RIAK TS MASTERLESS ARCHITECTURE
Riak has a masterless architecture. Every node is:
• homogeneous
• capable of serving all read and write requests
• responsible for a subset of data
RIAK TS: DISTRIBUTION AND CO-LOCATION
• Variation of Dynamo
• Composite key drives
grouping on disk
– Partition Key
– Local Key (sort)
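The Dynamo-style placement above can be pictured with a small Python sketch. It is only an illustration: the ring size of 64, the `repr`-based key encoding, and the helper names `partition_for` and `replica_partitions` are assumptions made for this example, and Riak's real hashing and key encoding differ in detail.

```python
import hashlib

RING_SIZE = 64          # assumed number of partitions in the ring
HASH_SPACE = 2 ** 160   # SHA-1 digests are 160 bits wide


def partition_for(partition_key: tuple) -> int:
    """Map a partition key tuple to a partition index on the ring."""
    # Illustrative encoding only; not Riak's on-the-wire key format.
    encoded = repr(partition_key).encode("utf-8")
    digest = int.from_bytes(hashlib.sha1(encoded).digest(), "big")
    # Each partition owns an equal, contiguous slice of the hash space.
    return digest // (HASH_SPACE // RING_SIZE)


def replica_partitions(partition_key: tuple, n_val: int = 3) -> list:
    """Return n_val consecutive partitions that would hold the replicas."""
    first = partition_for(partition_key)
    return [(first + offset) % RING_SIZE for offset in range(n_val)]


# Every row sharing this partition key lands on the same partitions,
# where the local key then determines the sort order on disk.
print(replica_partitions(("abc-xxx-001-001", 1453224600000)))
```

Because the hash depends only on the partition key, all rows with the same partition key are grouped together, which is what makes sequential range reads possible.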
RIAK: REPLICATION OF DATA
• Intra-cluster replication
• Multi-cluster replication
put(“bucket/key”)
RIAK: HIGH AVAILABILITY
Hinted handoff allows Riak nodes to
temporarily take over storage
operations for a failed node and
update that node with changes when
it comes back online.
RIAK TS: SCALABILITY
Riak TS scales in a near-linear fashion, so increasing the number of nodes in a cluster
increases the number of reads and writes the cluster can handle in a predictable fashion.
Rebalancing the cluster is a non-blocking operation that doesn’t require downtime.
If 10 nodes can serve 40,000 writes/second, then 20 nodes should serve 72,000+ writes/second.
Adding a node:
> riak-admin cluster join riak@192.168.2.2
> riak-admin cluster plan
> riak-admin cluster commit
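The "near-linear" claim can be checked with a little arithmetic on the slide's own numbers. The Python sketch below, with a hypothetical helper named `scaling_efficiency`, compares the quoted throughput with what perfectly linear scaling would give.

```python
def scaling_efficiency(base_nodes, base_rate, new_nodes, new_rate):
    """Fraction of ideal linear throughput achieved after scaling out."""
    ideal_rate = base_rate * new_nodes / base_nodes
    return new_rate / ideal_rate


# Figures from the slide: 10 nodes at 40,000 writes/second,
# 20 nodes at 72,000 writes/second (the slide says "72,000+").
efficiency = scaling_efficiency(10, 40_000, 20, 72_000)
print(f"{efficiency:.0%} of linear")  # 90% of linear
```

Doubling the cluster would ideally give 80,000 writes/second, so the quoted 72,000+ corresponds to roughly 90% or more of linear scaling.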
RIAK TS: QUERY
Range:
select * from GeoCheckin where
time > 1453224610000 and time < 1453225490000 and
deviceId = 'abc-xxx-001-001'
Aggregate:
select MIN(temperature), AVG(temperature), MAX(temperature)
from GeoCheckin where
time > 1453224610000 and time < 1453225490000 and
deviceId = 'abc-xxx-001-001'
Arithmetic:
select (temperature * 2), (pressure - 1)
from GeoCheckin where
time > 1453224610000 and time < 1453225490000 and
deviceId = 'abc-xxx-001-001'
• SQL Interface
• Arithmetic Support
• Aggregate
– Count()
– Sum()
– Mean() & Avg()
– Min() & Max()
– STDDEV()
• Group By
• Expanded capabilities in future releases
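The time bounds in these queries are UNIX epoch milliseconds in UTC. As a minimal Python sketch, the helpers below (hypothetical names `to_epoch_ms` and `build_range_query`) show how an application might produce the range query from ordinary datetimes. The query text is assembled by plain string formatting, so the sketch assumes trusted inputs.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to UNIX epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def build_range_query(table: str, device_id: str,
                      start: datetime, end: datetime) -> str:
    """Build the range query from the slide for one device and time window."""
    return (
        f"select * from {table} where "
        f"time > {to_epoch_ms(start)} and time < {to_epoch_ms(end)} and "
        f"deviceId = '{device_id}'"
    )


start = datetime(2016, 1, 19, 17, 30, 10, tzinfo=timezone.utc)
end = datetime(2016, 1, 19, 17, 44, 50, tzinfo=timezone.utc)
print(build_range_query("GeoCheckin", "abc-xxx-001-001", start, end))
```

The two datetimes above correspond to the literal bounds 1453224610000 and 1453225490000 used in the slide.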
BATCH PROCESSING
• Real-time vs. Batch
• Spark Connector
• Parallel Extract
DATA LIFECYCLE
• Global expiry
• Per table expiry (coming soon)
• Spark batch for rollups/aggregation
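To make the rollup idea concrete, here is a minimal sketch in plain Python rather than Spark. It assumes readings arrive as (timestamp in milliseconds, value) pairs and uses a hypothetical helper named `hourly_rollup`; a real batch job would read from and write back to the database.

```python
from collections import defaultdict

HOUR_MS = 60 * 60 * 1000


def hourly_rollup(readings):
    """Summarize (timestamp_ms, value) pairs into per-hour statistics."""
    buckets = defaultdict(list)
    for timestamp_ms, value in readings:
        # Round each timestamp down to the start of its hour.
        buckets[timestamp_ms - timestamp_ms % HOUR_MS].append(value)
    return {
        hour_start: {
            "min": min(values),
            "avg": sum(values) / len(values),
            "max": max(values),
            "count": len(values),
        }
        for hour_start, values in sorted(buckets.items())
    }


readings = [
    (1453224610000, 20.0),
    (1453225490000, 22.0),
    (1453228200000, 30.0),
]
for hour_start, stats in hourly_rollup(readings).items():
    print(hour_start, stats)
```

Once the summaries are stored, the raw rows can be left to expire, which keeps data volume bounded while preserving the aggregates.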
Time Series
Data Modeling
SUPPORTED DDL DATA TYPES
• VARCHAR - Any string content is valid, including Unicode. Can only be
compared using strict equality, and will not be typecast (e.g., to an integer) for
comparison purposes. Use single quotes to delimit varchar strings.
• BOOLEAN - true or false (any case)
• TIMESTAMP - Timestamps are integer values expressing UNIX epoch time in
UTC in milliseconds. Zero is not a valid timestamp.
• SINT64 - Signed 64-bit integer
• DOUBLE - This type does not comply with its IEEE specification: NaN (not a
number) and INF (infinity) cannot be used.
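The constraints listed above can be expressed as a small client-side check before writing rows. This Python sketch is illustrative only: the function name `is_valid` is hypothetical, and rejecting negative timestamps is an assumption (the slide only states that zero is invalid).

```python
import math

SINT64_MIN = -(2 ** 63)
SINT64_MAX = 2 ** 63 - 1


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid(column_type: str, value) -> bool:
    """Check a Python value against the rules listed for a DDL type."""
    if column_type == "varchar":
        return isinstance(value, str)
    if column_type == "boolean":
        return isinstance(value, bool)
    if column_type == "timestamp":
        # Epoch milliseconds; zero is not valid (negatives rejected by assumption).
        return _is_int(value) and 0 < value <= SINT64_MAX
    if column_type == "sint64":
        return _is_int(value) and SINT64_MIN <= value <= SINT64_MAX
    if column_type == "double":
        # NaN and infinity cannot be stored.
        return isinstance(value, float) and math.isfinite(value)
    raise ValueError(f"unknown column type: {column_type}")


print(is_valid("timestamp", 0))            # False
print(is_valid("double", float("nan")))    # False
print(is_valid("sint64", 42))              # True
```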
THE KEY
Consists of:
• Partition Key
(node/partition)
• Quantum (optional)
• Local Key (sort order)
RIAK TS: CREATE TABLE
CREATE TABLE GeoCheckin (
    deviceID varchar not null,
    time timestamp not null,
    weather varchar not null,
    temperature double,
    PRIMARY KEY (
        (deviceID, quantum(time, 15, 'm')),
        deviceID, time
    )
)
Partition Key: (deviceID, quantum(time, 15, 'm'))
Local Key: deviceID, time
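The effect of quantum(time, 15, 'm') can be sketched in Python as rounding each timestamp down to the start of its 15-minute window. The helper names `quantum_start` and `partition_key` are hypothetical, and the sketch assumes quanta are aligned to the UNIX epoch.

```python
UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def quantum_start(timestamp_ms: int, size: int, unit: str) -> int:
    """Round an epoch-millisecond timestamp down to its quantum boundary."""
    span_ms = size * UNIT_MS[unit]
    return timestamp_ms - timestamp_ms % span_ms


def partition_key(device_id: str, timestamp_ms: int) -> tuple:
    """Partition key for the GeoCheckin table: device plus 15-minute quantum."""
    return (device_id, quantum_start(timestamp_ms, 15, "m"))


# Two readings about 14 minutes apart share a partition key...
print(partition_key("abc-xxx-001-001", 1453224610000))
print(partition_key("abc-xxx-001-001", 1453225490000))
# ...while a reading 10 seconds later falls into the next quantum.
print(partition_key("abc-xxx-001-001", 1453225500000))
```

Rows with the same device and quantum are stored together and sorted by the local key, so a range query inside one quantum becomes a sequential read.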
MODELING THE KEY
Methodology:
• What questions does your
application ask?
• How is the data presented?
USE CASE: PEDOMETER
• Questions
– How many steps today
(distance) for user?
– How many steps per
day this week for user?
– Daily average?
– Change in elevation?
• Key
– Partition: UserID
– Local: timestamp
– Optimized for reads: quantum of 1 week
– Optimized for writes: quantum of 1 day
• Fields
– timestamp
– steps
– device_id
– elevation
– geohash
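The read/write trade-off in the quantum choice can be illustrated by counting how many quanta a query range touches. This is a rough Python sketch with a hypothetical helper named `quanta_spanned`, again assuming quanta are aligned to the UNIX epoch.

```python
DAY_MS = 86_400_000
WEEK_MS = 7 * DAY_MS


def quanta_spanned(start_ms: int, end_ms: int, quantum_ms: int) -> int:
    """Number of quanta touched by the inclusive range [start_ms, end_ms]."""
    return end_ms // quantum_ms - start_ms // quantum_ms + 1


# "How many steps per day this week for user?" as a seven-day range.
week_start = 1453161600000            # 2016-01-19 00:00:00 UTC
week_end = week_start + WEEK_MS - 1   # last millisecond of the seventh day

print(quanta_spanned(week_start, week_end, DAY_MS))   # 7
print(quanta_spanned(week_start, week_end, WEEK_MS))  # 2
```

With a one-day quantum the weekly question touches seven partitions, while a one-week quantum touches at most two. The smaller quantum, in turn, spreads a user's writes across more partitions over time.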
DEMO
• Riak TS
• Python client
• Jupyter Notebook
• Pandas
• Matplotlib
THE DATA
Description             Field                Type
Sensor Status           status               varchar
Exit ID                 exitid               varchar
Timestamp               ts                   timestamp
Average Measured Time   avgMeasuredTime      sint64
Average Speed           avgSpeed             sint64
Median Measured Time    medianMeasuredTime   sint64
Number of Vehicles      vehicleCount         sint64
Sensor ID               id                   sint64
Report ID               report_id            sint64
• Vehicle traffic data
• City of Aarhus, Denmark
• Two sensors placed at each exit
• 5 min intervals
Spark and Riak: In-situ analytics
beyond Hadoop
Who is Databricks
Why Us
• Creators of Apache Spark. Contribute 75% of the code - 10x more than others
• Trained 20K Spark users
• Largest number of customers deploying Spark (200+)
Our Product
• Just-in-Time Data Platform – powered by Apache Spark.
• Empower your organization to swiftly build and deploy advanced analytics with Spark.
open source data processing engine built around speed,
ease of use, and sophisticated analytics
largest open source data project with 1000+ contributors
UNIFIED ENGINE ACROSS DIVERSE WORKLOADS & ENVIRONMENTS
Scale out, fault tolerant
Python, Java, Scala, and R APIs
Standard libraries
APACHE SPARK ENGINE
ANALOGY: EVOLUTION OF CONSUMER ELECTRONICS
First Cellular Phones, Specialized Devices, Unified Device
HISTORY REPEATS: FASTER, EASIER TO USE, UNIFIED
First Distributed Processing Engine, Specialized Data Processing Engines, Unified Data Processing Engine
Google Trends: Hadoop vs. Spark
Analytics in-situ
SQL
Streaming
ML
Enable SQL analytics over Riak
Use Riak to store streaming data
Use Riak to serve results generated by Spark
Riak Spark Connector
The user application contacts the coordinating node, which returns the
locations of the data based on cluster replication and availability
information.
Then “N” Spark workers open “N” parallel connections to different
nodes, which allows the application to retrieve the desired dataset
“N” times faster, without generating “hot spots”.
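The parallel extract pattern can be approximated in a few lines of Python. This sketch uses threads instead of Spark workers, and both the coverage plan (node name to partition list) and the `fetch_from_node` function are hypothetical stand-ins, not the connector's API.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_from_node(node: str, partitions: list) -> list:
    """Hypothetical stand-in for reading the given partitions from one node."""
    return [f"{node}:partition-{p}" for p in partitions]


def parallel_extract(coverage_plan: dict) -> list:
    """Read each node's share of the data over its own connection."""
    with ThreadPoolExecutor(max_workers=len(coverage_plan)) as pool:
        futures = [
            pool.submit(fetch_from_node, node, partitions)
            for node, partitions in coverage_plan.items()
        ]
        results = []
        for future in futures:  # collect in submission order
            results.extend(future.result())
    return results


# Step 1 (assumed shape): the coordinating node says which node serves what.
plan = {
    "riak@10.0.0.1": [0, 3],
    "riak@10.0.0.2": [1, 4],
    "riak@10.0.0.3": [2, 5],
}
# Step 2: one worker per node fetches its partitions in parallel.
print(parallel_extract(plan))
```

Because each worker talks to a different node, no single node has to serve the whole dataset, which is the point of avoiding hot spots.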
Demo
Build a PoC on Databricks today.
Professional services and training also available.
Contact sales@databricks.com
or
Sign up for a trial at https://databricks.com/try-databricks
Thank You!
If you have any questions
please reach out to us at
basho.com/contact

More Related Content

What's hot

Designing the Next Generation Data Lake
Designing the Next Generation Data LakeDesigning the Next Generation Data Lake
Designing the Next Generation Data Lake
Robert Chong
 
Data Warehousing 2016
Data Warehousing 2016Data Warehousing 2016
Data Warehousing 2016
Kent Graziano
 
Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data Lake
Caserta
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
Perficient, Inc.
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake Architecture
DATAVERSITY
 
Partner Enablement: Key Differentiators of Denodo Platform 6.0 for the Field
Partner Enablement: Key Differentiators of Denodo Platform 6.0 for the FieldPartner Enablement: Key Differentiators of Denodo Platform 6.0 for the Field
Partner Enablement: Key Differentiators of Denodo Platform 6.0 for the Field
Denodo
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
sambiswal
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
MetroStar
 
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
DATAVERSITY
 
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Seeling Cheung
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
James Serra
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data Warehouse
Bui Ha
 
[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...
[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...
[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...
PROJECT CONSULT Unternehmensberatung Dr. Ulrich Kampffmeyer GmbH
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
MapR Technologies
 
Data Lakes - The Key to a Scalable Data Architecture
Data Lakes - The Key to a Scalable Data ArchitectureData Lakes - The Key to a Scalable Data Architecture
Data Lakes - The Key to a Scalable Data Architecture
Zaloni
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Data Con LA
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Cloudera, Inc.
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
Ricky Barron
 
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
jdijcks
 
The principles of the business data lake
The principles of the business data lakeThe principles of the business data lake
The principles of the business data lake
Capgemini
 

What's hot (20)

Designing the Next Generation Data Lake
Designing the Next Generation Data LakeDesigning the Next Generation Data Lake
Designing the Next Generation Data Lake
 
Data Warehousing 2016
Data Warehousing 2016Data Warehousing 2016
Data Warehousing 2016
 
Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data Lake
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake Architecture
 
Partner Enablement: Key Differentiators of Denodo Platform 6.0 for the Field
Partner Enablement: Key Differentiators of Denodo Platform 6.0 for the FieldPartner Enablement: Key Differentiators of Denodo Platform 6.0 for the Field
Partner Enablement: Key Differentiators of Denodo Platform 6.0 for the Field
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
Webinar: Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data Warehouse
 
[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...
[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...
[EN] Trends in Records, Document and Enterprise Content Management | Ulrich K...
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
 
Data Lakes - The Key to a Scalable Data Architecture
Data Lakes - The Key to a Scalable Data ArchitectureData Lakes - The Key to a Scalable Data Architecture
Data Lakes - The Key to a Scalable Data Architecture
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
 
The principles of the business data lake
The principles of the business data lakeThe principles of the business data lake
The principles of the business data lake
 

Viewers also liked

kiran New
kiran Newkiran New
kiran New
Kiran David
 
Demand generation using mobile and data to boost bricks and mortar sales (1)
Demand generation using mobile and data to boost bricks and mortar sales (1)Demand generation using mobile and data to boost bricks and mortar sales (1)
Demand generation using mobile and data to boost bricks and mortar sales (1)
ad:tech London
 
Retail Road to Recovery-Bricks and Mortar Retail is Back!
Retail Road to Recovery-Bricks and Mortar Retail is Back!Retail Road to Recovery-Bricks and Mortar Retail is Back!
Retail Road to Recovery-Bricks and Mortar Retail is Back!
Desley Cowley
 
phoebe cv one
phoebe cv onephoebe cv one
phoebe cv one
phoebe muli
 
Apache Spark with Hortonworks Data Platform - Seattle Meetup
Apache Spark with Hortonworks Data Platform - Seattle MeetupApache Spark with Hortonworks Data Platform - Seattle Meetup
Apache Spark with Hortonworks Data Platform - Seattle Meetup
Saptak Sen
 
Projecte vertebrats A
Projecte vertebrats AProjecte vertebrats A
Projecte vertebrats A
mmatarin
 
WSDM2014
WSDM2014WSDM2014
WSDM2014
Jun Yu
 
SIP Presenation
SIP PresenationSIP Presenation
SIP Presenation
Rebecca Shapiro
 
A report on Nuclear energy -Globally
A report on Nuclear energy -GloballyA report on Nuclear energy -Globally
A report on Nuclear energy -Globally
Pranab Ghosh
 

Viewers also liked (9)

kiran New
kiran Newkiran New
kiran New
 
Demand generation using mobile and data to boost bricks and mortar sales (1)
Demand generation using mobile and data to boost bricks and mortar sales (1)Demand generation using mobile and data to boost bricks and mortar sales (1)
Demand generation using mobile and data to boost bricks and mortar sales (1)
 
Retail Road to Recovery-Bricks and Mortar Retail is Back!
Retail Road to Recovery-Bricks and Mortar Retail is Back!Retail Road to Recovery-Bricks and Mortar Retail is Back!
Retail Road to Recovery-Bricks and Mortar Retail is Back!
 
phoebe cv one
phoebe cv onephoebe cv one
phoebe cv one
 
Apache Spark with Hortonworks Data Platform - Seattle Meetup
Apache Spark with Hortonworks Data Platform - Seattle MeetupApache Spark with Hortonworks Data Platform - Seattle Meetup
Apache Spark with Hortonworks Data Platform - Seattle Meetup
 
Projecte vertebrats A
Projecte vertebrats AProjecte vertebrats A
Projecte vertebrats A
 
WSDM2014
WSDM2014WSDM2014
WSDM2014
 
SIP Presenation
SIP PresenationSIP Presenation
SIP Presenation
 
A report on Nuclear energy -Globally
A report on Nuclear energy -GloballyA report on Nuclear energy -Globally
A report on Nuclear energy -Globally
 

Similar to Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applications

Pydata london meetup - RiakTS, PySpark and Python by Stephen Etheridge
Pydata london meetup - RiakTS, PySpark and Python by Stephen EtheridgePydata london meetup - RiakTS, PySpark and Python by Stephen Etheridge
Pydata london meetup - RiakTS, PySpark and Python by Stephen Etheridge
Emmanuel Marchal
 
Data Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQLData Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQL
Basho Technologies
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
Maya Lumbroso
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
Dataconomy Media
 
Intro to InfluxDB
Intro to InfluxDBIntro to InfluxDB
Intro to InfluxDB
InfluxData
 
Riak TS
Riak TSRiak TS
Riak TS
clive boulton
 
Io t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moeIo t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moe
Shawn Moe
 
Informix - The Ideal Database for IoT
Informix - The Ideal Database for IoTInformix - The Ideal Database for IoT
Informix - The Ideal Database for IoT
Pradeep Natarajan
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
DataStax Academy
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
Claudiu Barbura
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
Spark Summit
 
Track A-2 基於 Spark 的數據分析
Track A-2 基於 Spark 的數據分析Track A-2 基於 Spark 的數據分析
Track A-2 基於 Spark 的數據分析
Etu Solution
 
Santhosh Resume
Santhosh ResumeSanthosh Resume
Santhosh Resume
Santhosh Ravisankar
 
Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...
Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...
Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...
Citus Data
 
Scaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssScaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ss
Anil Nair
 
DEVNET-1166 Open SDN Controller APIs
DEVNET-1166	Open SDN Controller APIsDEVNET-1166	Open SDN Controller APIs
DEVNET-1166 Open SDN Controller APIs
Cisco DevNet
 
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
Dataconomy Media
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Dataconomy Media
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Maya Lumbroso
 
IBM IoT Architecture and Capabilities at the Edge and Cloud
IBM IoT Architecture and Capabilities at the Edge and Cloud IBM IoT Architecture and Capabilities at the Edge and Cloud
IBM IoT Architecture and Capabilities at the Edge and Cloud
Pradeep Natarajan
 

Similar to Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applications (20)

Pydata london meetup - RiakTS, PySpark and Python by Stephen Etheridge
Pydata london meetup - RiakTS, PySpark and Python by Stephen EtheridgePydata london meetup - RiakTS, PySpark and Python by Stephen Etheridge
Pydata london meetup - RiakTS, PySpark and Python by Stephen Etheridge
 
Data Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQLData Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQL
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
 
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc..."An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
"An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Sc...
 
Intro to InfluxDB
Intro to InfluxDBIntro to InfluxDB
Intro to InfluxDB
 
Riak TS
Riak TSRiak TS
Riak TS
 
Io t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moeIo t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moe
 
Informix - The Ideal Database for IoT
Informix - The Ideal Database for IoTInformix - The Ideal Database for IoT
Informix - The Ideal Database for IoT
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
 
Track A-2 基於 Spark 的數據分析
Track A-2 基於 Spark 的數據分析Track A-2 基於 Spark 的數據分析
Track A-2 基於 Spark 的數據分析
 
Santhosh Resume
Santhosh ResumeSanthosh Resume
Santhosh Resume
 
Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...
Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...
Architecting peta-byte-scale analytics by scaling out Postgres on Azure with ...
 
Scaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ssScaling paypal workloads with oracle rac ss
Scaling paypal workloads with oracle rac ss
 
DEVNET-1166 Open SDN Controller APIs
DEVNET-1166	Open SDN Controller APIsDEVNET-1166	Open SDN Controller APIs
DEVNET-1166 Open SDN Controller APIs
 
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
 
IBM IoT Architecture and Capabilities at the Edge and Cloud
IBM IoT Architecture and Capabilities at the Edge and Cloud IBM IoT Architecture and Capabilities at the Edge and Cloud
IBM IoT Architecture and Capabilities at the Edge and Cloud
 

More from DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
DATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
DATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals

Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applications

  • 1. Basho Technologies | 1 Scaling Time Series Applications Basho Dorothy Pults – Product Evangelist @deepults Tom Sigler – Solution Architect @tom_sigler Databricks Peyman Mohajerian - Solution Architect @mohajeri
  • 2. BASHO TECHNOLOGIES Distributed Systems Software for Big Data and IoT applications 2011 - Creators of Riak • Riak KV: NoSQL Key Value database • Riak TS: NoSQL Time Series database • Integrations: Spark, Redis caching, Solr, Mesos, Riak S2 120+ employees Global Offices • Seattle (HQ), Washington DC, London, Paris 1/3 of the Fortune 50
  • 3. $1.3 Trillion market spend on Internet of Things in 2019 30 Billion installed base of IoT endpoints in 2020 *Source: IDC
  • 4. 56% have integrated IoT data IoT is 24% of the average IT budget 20% decrease in downtime 21% increase in revenue *Vodafone IOT Barometer
  • 5. CRITICAL SUCCESS FACTORS FOR IOT • Explore new business models • Address key IoT challenges like Edge Analytics • Provide comprehensive solutions • Engage with a broader ecosystem
  • 6. 100TB DAILY – IOT AND WEATHER DATA 530M personal weather station reports each day 9M webcam uploads 2M crowd reports > 20M IoT barometric reports
  • 7. WEATHER FORECAST PREDICTS SALES Ideal BERRY purchasing weather turns out to be low wind with temperatures below 80 degrees. People are more likely to eat STEAK when it's warm out with higher winds but no rain, but not if it gets too hot.
  • 8. EDGE ANALYTICS • Edge Analytics • Fog Computing • Inverted Web • Reverse CDN
  • 9. NEW ECOSYSTEM – DATA PIPELINE
  • 10. WHAT’S NEEDED TO SCALE FOR IoT • A database optimized for IoT data • Review your data life cycle • Summations and aggregation • Data expiration • Data cleansing • Processing close to devices • Scale for unstructured metadata
  • 11. TIME SERIES (TS) DATA • Consists of successive observations made over a time interval • Structured • Time + State/Measurement • Metadata/Context • Frequency
  • 12. TIME SERIES CHALLENGES AT SCALE • Ingestion Velocity • Data Volume • Post Ingestion Workloads – Real time – Batch • Lifecycle/Expiry
  • 13. Riak TS Overview & Architecture
  • 14. WHAT IS RIAK TS? Riak TS is a distributed NoSQL key/value store optimized for time series data. It provides a time series database solution that is extensible and scalable. Riak TS is derived from Riak KV and adds the ability to co-locate data by composite primary key, including quanta, for efficient sequential read I/O operations.
  • 15. Why Riak TS? • Highly available • Fault tolerant • Geo data locality • Scalability – Operations – Real-time range query performance
  • 16. RIAK TS MASTERLESS ARCHITECTURE Riak has a masterless architecture. Every node is: • homogeneous • capable of serving all read and write requests • responsible for a subset of data
  • 17. RIAK TS: DISTRIBUTION AND CO-LOCATION • Variation of Dynamo • Composite key drives grouping on disk – Partition Key – Local Key (sort)
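
As a rough illustration of how a partition key can drive co-location, the sketch below hashes a (device, quantum start) pair onto a fixed set of partitions. The ring size of 64, the use of SHA-1, and the helper name `partition_for` are assumptions chosen for the example; they are not taken from the slides and should not be read as Riak TS internals.

```python
import hashlib

RING_SIZE = 64  # assumed number of partitions, for illustration only


def partition_for(device_id: str, quantum_start_ms: int, ring_size: int = RING_SIZE) -> int:
    """Map a (device, quantum) partition key to a partition index (hypothetical helper)."""
    key = f"{device_id}|{quantum_start_ms}".encode("utf-8")
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest, "big") % ring_size


# Each tuple is (device, quantum start, row timestamp). Rows that share a
# partition key hash to the same partition; within that partition the local
# key (device, time) would decide the sort order.
rows = [
    ("abc-xxx-001-001", 1453224600000, 1453224610000),
    ("abc-xxx-001-001", 1453224600000, 1453224700000),
]
placements = {partition_for(device, quantum) for device, quantum, _ in rows}
```

Both rows fall in the same quantum for the same device, so `placements` contains a single partition index.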
  • 18. RIAK: REPLICATION OF DATA • Intra-cluster replication • Multi-cluster replication put(“bucket/key”)
  • 19. RIAK: HIGH AVAILABILITY Hinted handoff allows Riak nodes to temporarily take over storage operations for a failed node and update that node with changes when it comes back online.
  • 20. RIAK TS: SCALABILITY Riak TS scales in a near-linear fashion, so increasing the number of nodes in a cluster increases the number of reads and writes the cluster can handle in a predictable fashion. Rebalancing of the cluster is a non-blocking operation that doesn’t require downtime. If 10 nodes can serve 40,000 writes/second, then 20 nodes should serve 72,000+ writes/second. Adding a node: > riak-admin cluster join riak@192.168.2.2 > riak-admin cluster plan > riak-admin cluster commit
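
The "near-linear" figure on the scalability slide can be checked with a little arithmetic. The sketch below compares the quoted throughput with perfectly linear scaling; the function name is made up for this example, and the numbers are the ones from the slide.

```python
def scaling_efficiency(base_nodes: int, base_rate: float, new_nodes: int, new_rate: float) -> float:
    """Ratio of the quoted throughput to what perfectly linear scaling would give."""
    ideal_rate = base_rate * (new_nodes / base_nodes)
    return new_rate / ideal_rate


# Slide figures: 10 nodes at 40,000 writes/s, 20 nodes at 72,000+ writes/s.
efficiency = scaling_efficiency(10, 40_000, 20, 72_000)
print(f"{efficiency:.0%} of linear")
```

Linear scaling would give 80,000 writes/second on 20 nodes, so 72,000 corresponds to 90% of linear.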
  • 21. RIAK TS: QUERY • SQL Interface • Arithmetic Support • Aggregate – Count() – Sum() – Mean() & Avg() – Min() & Max() – STDDEV() • Group By • Expanded capabilities in future releases. Range: select * from GeoCheckin where time > 1453224610000 and time < 1453225490000 and deviceId = 'abc-xxx-001-001' Aggregate: select MIN(temperature), AVG(temperature), MAX(temperature) from GeoCheckin where time > 1453224610000 and time < 1453225490000 and deviceId = 'abc-xxx-001-001' Arithmetic: select (temperature * 2), (pressure - 1) from GeoCheckin where time > 1453224610000 and time < 1453225490000 and deviceId = 'abc-xxx-001-001'
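
The time bounds in the queries above are UNIX epoch milliseconds. A minimal sketch of building the range query text from Python datetimes follows; `to_epoch_ms` and `range_query` are hypothetical helpers written for this example, not part of any Riak client, and the string formatting does no escaping, so it is only suitable for trusted input.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to UNIX epoch milliseconds."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def range_query(table: str, device_id: str, start: datetime, end: datetime) -> str:
    """Build the range query text shown on the slide (no escaping; trusted input only)."""
    return (
        f"select * from {table} "
        f"where time > {to_epoch_ms(start)} and time < {to_epoch_ms(end)} "
        f"and deviceId = '{device_id}'"
    )


start = datetime(2016, 1, 19, 17, 30, 10, tzinfo=timezone.utc)
end = datetime(2016, 1, 19, 17, 44, 50, tzinfo=timezone.utc)
query = range_query("GeoCheckin", "abc-xxx-001-001", start, end)
```

With these two datetimes the generated text matches the range query on the slide, which shows that its bounds cover a window of just under 15 minutes.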
  • 22. BATCH PROCESSING • Real-time vs. Batch • Spark Connector • Parallel Extract
  • 23. DATA LIFECYCLE • Global expiry • Per table expiry coming soon • Spark batch for rollups/aggregation
  • 24. Time Series Data Modeling
  • 25. SUPPORTED DDL DATA TYPES • VARCHAR - Any string content is valid, including Unicode. Can only be compared using strict equality, and will not be typecast (e.g., to an integer) for comparison purposes. Use single quotes to delimit varchar strings. • BOOLEAN - true or false (any case) • TIMESTAMP - Timestamps are integer values expressing UNIX epoch time in UTC in milliseconds. Zero is not a valid timestamp. • SINT64 - Signed 64-bit integer • DOUBLE - This type does not comply with its IEEE specification: NaN (not a number) and INF (infinity) cannot be used.
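
The type rules above can be turned into a small client-side check before writing rows. This is only a sketch of the constraints as stated on the slide; `validate_value` is a hypothetical helper, and the server may enforce further rules that are not listed here.

```python
import math


def validate_value(riak_type: str, value) -> bool:
    """Check a Python value against the type rules listed on the slide (hypothetical helper)."""
    kind = riak_type.lower()
    is_int = isinstance(value, int) and not isinstance(value, bool)
    if kind == "varchar":
        return isinstance(value, str)
    if kind == "boolean":
        return isinstance(value, bool)
    if kind == "timestamp":
        # Epoch milliseconds as an integer; the slide only rules out zero.
        return is_int and value != 0
    if kind == "sint64":
        return is_int and -2**63 <= value < 2**63
    if kind == "double":
        # NaN and infinity are rejected, as the slide states.
        return isinstance(value, float) and math.isfinite(value)
    raise ValueError(f"unknown type: {riak_type}")


checks = [
    validate_value("timestamp", 1453224610000),
    validate_value("timestamp", 0),
    validate_value("double", float("nan")),
    validate_value("sint64", 2**63),
]
```

Here only the first check passes: a zero timestamp, a NaN double, and an integer outside the signed 64-bit range are all rejected.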
  • 26. THE KEY Consists of: • Partition Key (node/partition) • Quantum (optional) • Local Key (sort order)
  • 27. RIAK TS: CREATE TABLE CREATE TABLE GeoCheckin ( deviceID varchar not null, time timestamp not null, weather varchar not null, temperature double, PRIMARY KEY ( (deviceID, quantum(time, 15, 'm')), deviceID, time ) ) Partition Key: (deviceID, quantum(time, 15, 'm')) Local Key: deviceID, time
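
To make `quantum(time, 15, 'm')` concrete, the sketch below computes which 15-minute bucket a timestamp falls into. It assumes quanta are aligned to multiples of the quantum length counted from the UNIX epoch; that alignment is an assumption of this example, and `quantum_start` is a hypothetical helper.

```python
QUANTUM_MS = 15 * 60 * 1000  # 15 minutes in milliseconds


def quantum_start(ts_ms: int, quantum_ms: int = QUANTUM_MS) -> int:
    """Start of the epoch-aligned quantum containing ts_ms (alignment is assumed)."""
    return ts_ms - (ts_ms % quantum_ms)


# The two time bounds used in the example queries.
lower = quantum_start(1453224610000)
upper = quantum_start(1453225490000)
same_quantum = lower == upper
```

Under that assumption both bounds map to the quantum starting at 1453224600000, so rows in that range for one device share a single partition key.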
  • 28. MODELING THE KEY Methodology: • What questions does your application ask? • How is the data presented?
  • 29. USE CASE: PEDOMETER • Questions – How many steps today (distance) for user? – How many steps per day this week for user? – Daily average? – Change in elevation? • Key – Partition: UserID – Local: timestamp – Optimized for reads: quantum of 1 week – Optimized for writes: quantum of 1 day • Fields – timestamp – steps – device_id – elevation – geohash
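
One way to see the read/write trade-off in the pedometer key is to count how many quanta a query range touches for each quantum size. The sketch assumes epoch-aligned quanta, and `quanta_spanned` is a hypothetical helper; fewer quanta per query means fewer partition keys to read, while smaller quanta spread one user's writes over more partition keys.

```python
DAY_MS = 24 * 60 * 60 * 1000
WEEK_MS = 7 * DAY_MS


def quanta_spanned(start_ms: int, end_ms: int, quantum_ms: int) -> int:
    """Count epoch-aligned quanta touched by the half-open range [start_ms, end_ms)."""
    if end_ms <= start_ms:
        raise ValueError("end_ms must be greater than start_ms")
    first = start_ms // quantum_ms
    last = (end_ms - 1) // quantum_ms
    return last - first + 1


# An arbitrary week that starts exactly on a week-quantum boundary.
week_start = 2400 * WEEK_MS
with_day_quantum = quanta_spanned(week_start, week_start + WEEK_MS, DAY_MS)
with_week_quantum = quanta_spanned(week_start, week_start + WEEK_MS, WEEK_MS)
```

For this aligned week the "steps per day this week" question touches seven quanta with a one-day quantum and one with a one-week quantum; a week that is not aligned to a boundary would touch one more in each case.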
  • 30. DEMO • Riak TS • Python client • Jupyter Notebook • Pandas • Matplotlib
  • 31. THE DATA • Vehicle traffic data • City of Aarhus, Denmark • Two sensors placed at each exit • 5 min intervals. Fields (Description | Field | Type): Sensor Status | status | varchar; Exit ID | exitid | varchar; Timestamp | ts | timestamp; Average Measured Time | avgMeasuredTime | sint64; Average Speed | avgSpeed | sint64; Median Measured Time | medianMeasuredTime | sint64; Number of Vehicles | vehicleCount | sint64; Sensor ID | id | sint64; Report ID | report_id | sint64
  • 32. Spark and Riak: In-situ analytics beyond Hadoop
  • 33. Who is Databricks / Why Us / Our Product • Creators of Apache Spark. Contribute 75% of the code - 10x more than others • Trained 20K Spark users • Largest number of customers deploying Spark (200+) • Just-in-Time Data Platform – powered by Apache Spark. • Empower your organization to swiftly build and deploy advanced analytics with Spark.
  • 34. open source data processing engine built around speed, ease of use, and sophisticated analytics largest open source data project with 1000+ contributors
  • 35. UNIFIED ENGINE ACROSS DIVERSE WORKLOADS & ENVIRONMENTS Scale out, fault tolerant Python, Java, Scala, and R APIs Standard libraries APACHE SPARK ENGINE
  • 36. ANALOGY: EVOLUTION OF CONSUMER ELECTRONICS First Cellular Phones • Specialized Devices • Unified Device
  • 37. HISTORY REPEATS: FASTER, EASIER TO USE, UNIFIED First Distributed Processing Engine Specialized Data Processing Engines Unified Data Processing Engine
  • 39. Analytics in-situ: SQL, Streaming, ML • Enable SQL analytics over Riak • Use Riak to store streaming data • Use Riak to serve results generated by Spark
  • 40. Riak Spark Connector The user application contacts the coordinating node, which returns the locations of the data using cluster replication and availability information. Then “N” Spark workers open “N” parallel connections to different nodes, which allows the application to retrieve the desired dataset “N” times faster without generating “hot spots”.
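
The parallel-extract idea can be sketched without Spark: split the work into N pieces and fetch them over N concurrent connections. In the sketch below the split is by time range, which is a simplification chosen for the example (the slide describes the split as driven by data locations returned from the coordinating node), and `fetch_range` is a stand-in that reads from an in-memory list of made-up timestamps rather than a real node.

```python
from concurrent.futures import ThreadPoolExecutor

# Synthetic stand-in for stored rows: one timestamp every 10 ms (made-up data).
STORED_TIMESTAMPS = list(range(0, 1000, 10))


def split_range(start_ms: int, end_ms: int, n: int) -> list:
    """Split [start_ms, end_ms) into n contiguous half-open sub-ranges."""
    bounds = [start_ms + (end_ms - start_ms) * i // n for i in range(n + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def fetch_range(bounds: tuple) -> list:
    """Hypothetical per-worker fetch; a real worker would query one node."""
    lo, hi = bounds
    return [t for t in STORED_TIMESTAMPS if lo <= t < hi]


def parallel_fetch(start_ms: int, end_ms: int, n: int) -> list:
    """Fetch n sub-ranges concurrently and concatenate the results in order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        chunks = pool.map(fetch_range, split_range(start_ms, end_ms, n))
        return [row for chunk in chunks for row in chunk]


rows = parallel_fetch(0, 1000, 4)
```

Because the sub-ranges are contiguous and non-overlapping, the combined result is the same as a single sequential read of the whole range.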
  • 41. Demo
  • 42. Build a PoC on Databricks today. Professional services and training also available. Contact sales@databricks.com or sign up for a trial at https://databricks.com/try-databricks
  • 43. Thank You! If you have any questions please reach out to us at basho.com/contact