Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud

1,755 views

Published in: Data & Analytics

  1. The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud @garyorenstein @memsql MemSQL
  2. Today’s Talk: A Data Warehouse Blueprint for
     • Machine Learning and Artificial Intelligence
     • Hybrid Cloud
     Live demonstration of machine learning in SQL
     • K_means clustering
  3. Demonstration Step 1
     1. Launch cluster
     2. Set up k_means functions with MemSQL extensibility
     3. Load data
     4. Train data
     5. Gain insights
        • important_tags.sql
        • representative_channels.sql
  4. Step 1: Launch Cluster!
  5. The Real-Time Data Warehouse for the front lines of your business
  6. What is a real-time data warehouse? Similar to an “Operational Data Warehouse”
  7. A Real-Time Data Warehouse
     • Adds real-time to analytics
     • Reduces latency and ETL
     • Manages structured data, loaded continuously
     • Supports real-time decisions with embedded analytics
     • Serves as an operational data store
     • Delivers low latency reporting with automated queries
  8. MemSQL: A Real-Time Data Warehouse. Streaming, Live and Historical Data. Immediate Insights with SQL. Scalable and distributed.
  9. Sequel Pro Client and MemSQL Cluster
  10. (no text on this slide)
  11. Perspective
  12. MemSQL: #1 Operational Data Warehouse in 2016
  13. MemSQL: Top “non-megavendor” Operational Data Warehouse in 2017
  14. (no text on this slide)
  15. (no text on this slide)
  16. Digital Transformation is data based
  17. Digital Transformation database
  18. MemSQL is also a top ranked database by Gartner
  19. MemSQL: Top “non-megavendor” HTAP Database in 2016
  20. What is the advantage of being in both the data warehouse and database magic quadrants?
  21. INSERT UPDATE DELETE
  22. “...you can’t do AI without machine learning. You also can’t do machine learning without analytics, and you can’t do analytics without data infrastructure.” (Hilary Mason, Data Scientist)
  23. Demonstration Steps 2 and 3
     1. Launch cluster
     2. Set up k_means functions with MemSQL extensibility
     3. Load data
     4. Train data
     5. Gain insights
        • important_tags.sql
        • representative_channels.sql
  24. Steps 2 and 3: Setup and Load!
  25. (no text on this slide)
  26. Over a billion users. Almost 1/3 of all people on the Internet. Every day those users watch a billion hours of video, generating billions of views.
  27. Videos have tags. What can they tell us?
  28. YouTube Tags Data Set
     Channel, Video, Tag
     (Gary’s Channel, GO Video 1, hi)
     (Gary’s Channel, GO Video 1, hello)
     (Gary’s Channel, GO Video 2, hello)
     (Gary’s Channel, GO Video 2, blue)
     “Tag” Vector for Gary’s Channel: (hi:1, hello:2, blue:1)
  29. Now we can compare vectors and calculate clusters with k-means
  30. k-means clustering partitions observations into k clusters. Each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster.
  31. (no text on this slide)
  32. (no text on this slide)
  33. K-means in MemSQL with Extensibility
      create or replace procedure k_means(num_its bigint, num_centroids bigint)
      as
      begin
        call initialize_centroids(num_centroids);
        for i in 1 .. num_its loop
          call k_means_iteration();
        end loop;
      end //
  34. Demonstration Steps 4 and 5
     1. Launch cluster
     2. Set up k_means functions with MemSQL extensibility
     3. Load data
     4. Train data
     5. Gain insights
        • important_tags.sql
        • representative_channels.sql
  35. Steps 4 and 5: Train and Gain Insights!
  36. important_tags.sql
      select centroid_id, field_ids.field_id, importance, rn
      from (
        select centroids.centroid_id,
               centroids.field_id,
               centroids.val - centroid_sums.val importance,
               row_number() over (partition by centroids.centroid_id
                                  order by centroids.val - centroid_sums.val desc) rn
        from centroids
        join (
          select field_id,
                 sum(val) / (select count(distinct centroid_id) from centroids) as val
          from centroids
          group by field_id
        ) centroid_sums
          on centroids.field_id = centroid_sums.field_id
      ) centroids
      join field_ids on centroids.field_id = field_ids.id
      where rn < 10
      order by centroid_id, rn;
  37. k_means results
  38. (no text on this slide)
  39. representative channels
  40. A bit about Hybrid Cloud
  41. (no text on this slide)
  42. (no text on this slide)
  43. (no text on this slide)
  44. Check out our book! memsql.com/oreillyml
  45. Thank you! Visit us at the MemSQL Booth (behind you). Grab a T-shirt! Chat with engineers. See more tech demos. @garyorenstein @memsql
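
Slide 28 turns (channel, video, tag) rows into one tag-count vector per channel. The sketch below reproduces that step in plain Python using the four rows from the slide. The function name `tag_vectors` is an illustrative assumption; the talk itself does this inside MemSQL and does not show the loading code.

```python
from collections import Counter, defaultdict

# (channel, video, tag) rows copied from slide 28.
rows = [
    ("Gary's Channel", "GO Video 1", "hi"),
    ("Gary's Channel", "GO Video 1", "hello"),
    ("Gary's Channel", "GO Video 2", "hello"),
    ("Gary's Channel", "GO Video 2", "blue"),
]


def tag_vectors(rows):
    """Count how often each tag appears per channel, across all of its videos."""
    vectors = defaultdict(Counter)
    for channel, _video, tag in rows:
        vectors[channel][tag] += 1
    # Convert to plain dicts so the result prints and compares cleanly.
    return {channel: dict(counts) for channel, counts in vectors.items()}


vectors = tag_vectors(rows)
print(vectors["Gary's Channel"])  # {'hi': 1, 'hello': 2, 'blue': 1}
```

The printed vector matches the “Tag” vector shown on the slide: (hi:1, hello:2, blue:1).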
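
Slides 30 and 33 describe k-means as repeated iterations after an initialization step, but the bodies of `initialize_centroids` and `k_means_iteration` are not shown. The following is a minimal Python sketch of the standard algorithm with the same outer structure. Everything inside the two helpers is an assumption (including seeding with the first points), not MemSQL's implementation, and the sample points are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def initialize_centroids(points, num_centroids):
    # Assumption: seed with the first num_centroids points so the run is
    # deterministic. The slide does not say how the real procedure seeds.
    return [list(p) for p in points[:num_centroids]]


def k_means_iteration(points, centroids):
    # Assignment step: each observation joins the cluster with the nearest mean.
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda j: squared_distance(p, centroids[j]))
        clusters[nearest].append(p)
    # Update step: each centroid moves to the mean of its members.
    new_centroids = []
    for old, members in zip(centroids, clusters):
        if members:
            new_centroids.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centroids.append(old)  # keep an empty cluster's centroid in place
    return new_centroids


def k_means(points, num_its, num_centroids):
    """Same outer shape as the procedure on slide 33: initialize, then loop."""
    centroids = initialize_centroids(points, num_centroids)
    for _ in range(num_its):
        centroids = k_means_iteration(points, centroids)
    return centroids


# Hypothetical 2-D observations forming two obvious groups.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)]
centroids = k_means(points, num_its=5, num_centroids=2)
print(centroids)  # [[0.0, 1.0], [10.0, 11.0]]
```

A fixed iteration count, as in the slide's `num_its` parameter, keeps the loop simple at the cost of possibly running more iterations than needed once the centroids stop moving.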
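
The query on slide 36 scores each tag within a centroid by how far its value sits above the average value of that tag across all centroids, then ranks tags per centroid. As a rough illustration of that arithmetic, here is a Python sketch over (centroid_id, field_id, val) rows. The sample rows and tag names are hypothetical, and the join to `field_ids` is skipped by using tag names directly as field ids.

```python
from collections import defaultdict


def important_fields(centroid_rows, max_rank=10):
    """Return (centroid_id, field_id, importance, rn) rows with rn < max_rank."""
    # Mirrors: sum(val) / (select count(distinct centroid_id) from centroids)
    num_centroids = len({cid for cid, _field, _val in centroid_rows})
    field_sum = defaultdict(float)
    for _cid, field_id, val in centroid_rows:
        field_sum[field_id] += val
    field_mean = {f: s / num_centroids for f, s in field_sum.items()}

    # importance = this centroid's value minus the cross-centroid average
    by_centroid = defaultdict(list)
    for cid, field_id, val in centroid_rows:
        by_centroid[cid].append((field_id, val - field_mean[field_id]))

    result = []
    for cid in sorted(by_centroid):
        # Mirrors row_number() over (partition by centroid_id order by importance desc)
        ranked = sorted(by_centroid[cid], key=lambda fv: fv[1], reverse=True)
        for rn, (field_id, importance) in enumerate(ranked, start=1):
            if rn < max_rank:  # same filter as "where rn < 10" on the slide
                result.append((cid, field_id, importance, rn))
    return result


# Hypothetical centroid table: two centroids, two tags.
centroid_rows = [
    (0, "cats", 4.0), (0, "music", 1.0),
    (1, "cats", 0.0), (1, "music", 3.0),
]
for row in important_fields(centroid_rows):
    print(row)
```

Subtracting the cross-centroid average means a tag that is common everywhere scores near zero, so the top-ranked tags are the ones that distinguish a cluster from the others.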
