SlideShare a Scribd company logo
1 of 28
Download to read offline
C* for Deep Learning
Tractable
2
• Artificial Intelligence products for businesses
• Combining Deep Learning / Neural
Networks and traditional Machine Learning
• AI using Deep Learning has surpassed
human intelligence:
– Go
– Image Recognition
Deep Learning 101
3
Deep Learning
A branch of machine learning based on a set of algorithms that
attempt to model high-level abstractions in data by using a deep
graph with multiple processing layers, composed of multiple linear
and non-linear transformations.
1.
Requires large amounts of data to train
networks
2.
Computations made feasible by use of GPUs for dramatic speedup
Semantic Image Search
4
Similar ImageSimilar object
Search in the meaning of the image
NOT search on the image itself
Semantic Image Search
5
Similar ImageSimilar object
Semantic Search 2
6
Same
object
Same
style
• Intent can vary!
• With training AI can do both of these tasks
• Semantic search is more than just
classification:
– Ranking within a classification
– Search for things that are not
classification categories
Feature Extraction
7
Feature Vector
Trained Deep Net
V1
V2
.
.
.
Vn-1
Vn
1 Image
1 TB Images
4000 dimension vector
300 GB of features
Feature Extraction
GPU Servers C*
Feature Extraction
9
CREATE TABLE features
listing_id uuid,
image_id uuid,
feature_vector blob,
PRIMARY KEY ((listing_id), image_id)
• Processing images at
5 GigaBytes per second
• Features generated at
1.6 GigaBytes per second
Search
10
“I want a truck like this”
• 300 GB of features
• Millions of rows in C*
Spark
11
Distributed, in-memory computation
• Map-reduce
• Graph analysis
• SQL abstraction
DataStax Spark Connector
• Understands Cassandra partitioning
• Push-down queries to C* keys where possible
• Join on C* tables
Spark Similarity Search
12
CREATE TABLE features
listing_id uuid,
image_id uuid,
feature_vector blob,
PRIMARY KEY ((listing_id), image_id)
CREATE TABLE query_results
query_id uuid,
listing_id uuid,
score float,
PRIMARY KEY ((query_id), score)
1. Cache our 300GB features in Spark
2. Score every feature vector with our search
image
3. Aggregate score for images in each listing
4. (optional) Join on ‘live’ C* data
5. Write result back to C*
CREATE TABLE stock_level
listing_id uuid,
inventory int,
etag uuid,
PRIMARY KEY (listing_id)
Search
13
Search returns results within 5 seconds
100x Speedup from holding data in-memory in Spark
10x Speedup from co-locating Spark and C*
20x Speedup from partitioning on the group key (listing id)
5x Speedup from “Cassandra sort”
Search Results
14
Search Results
15
Sophisticated Analytics
• Visualisation
• Clustering
• Outlier Detection
Real time, interactive
exploration of large visual
data sets
Sophisticated Analytics
Gas pipe
Gas joint
Water pipe
Water joint
Electricity cable
Telecoms cable
• Visualisation
• Clustering
• Outlier Detection
Real time, interactive
exploration of large visual
data sets
© DataStax, All Rights Reserved. 18
Pre-processing
Ingest / ETL
Training Annotation /
Labelling
Analytics &
Exploration
Classification &
Feature Extraction
© DataStax, All Rights Reserved. 19
Pre-processing
SPANCKIngest / ETL
Training Annotation /
Labelling
Analytics &
Exploration
Classification &
Feature Extraction
© DataStax, All Rights Reserved.
SPANCK: Spark-Python Algolia Node Cassandra Kafka
20
Data Sources, Files,
DB extracts
Batched Data
Read the data

Write the model
Data Modeling
Fast Analytics
Distributed

Data Store
Event 

Processing
Read the model
Streaming Data
Web API
Alerts and 

Notifications
Fast Data
Processing
Data Exploration
Search Index
© DataStax, All Rights Reserved.
Many Services, 1 Engineer
21
FOSS70+ engineers
FOSSBerkley RAD Lab200+ contributors
© DataStax, All Rights Reserved.
Many Services, 1 Engineer
22
FOSS70+ engineers
FOSSBerkley RAD Lab200+ contributors
Resource Allocation 

Scheduling
Install & Configuration | Maintenance
AI Software
Persistence | Compute | Queue SPANCK
Service Discovery | Resilience
© DataStax, All Rights Reserved.
DCOS Frameworks
Easy Install
23
$ dcos package install kafka --
options kafka-options.json
$ dcos package install cassandra --
options cassandra-options.json
{

"nodes": {

"count": 12,

"cpus": 9,

"mem": 30000,

"disk": 400000,

"heap": {

"size": 8000,

"new": 1500

}

},

"cassandra": {

"row_cache_size_in_mb": 12000,

"row_cache_save_period": 600,

"commitlog_segment_size_in_mb": 64,

"concurrent_reads": 24,

"concurrent_writes": 64,

"memtable_allocation_type": "offheap_objects",

"compaction_throughput_mb_per_sec": 128

}

}
© DataStax, All Rights Reserved.
DCOS Frameworks
24
function fetchMarathonServiceBaseUrl(serviceName){

var dns_url = ['_' + serviceName, "_tcp.marathon.mesos"].join('.');

return resolveSrv(dns_url).then(function(addresses){

return "http://" + addresses[0].name + ":" + addresses[0].port;

})

}
fetchCassandraNodes = (mesosHost, authToken) ->
mesosDns.fetchMarathonServiceBaseUrl('cassandra').then (baseUrl) ->

request(

url: baseUrl + "/v1/nodes/connect",

headers: headers

).then (result) ->

result = JSON.parse(result)

return _.pluck(result.nodes, “ip")

Service Discovery
• 0 configuration files
• 0 configuration / orchestration systems
• Works ever time, in every environment
© DataStax, All Rights Reserved.
DCOS Frameworks
25
dcos cassandra —name=cassandra 
cleanup —key_spaces=dev,test
dcos cassandra —name=cassandra 
replace node-4
dcos kafka broker replace 3
Maintenance and Admin DCOS Command
Line Tools
• Replace a C* node
• Backup / Restore C* to AWS S3
• Run C* repair / cleanup
• Restart C* nodes
• Replace a Kafka broker
• Rebalance Kafka brokers
• Restart Kafka brokers
© DataStax, All Rights Reserved.
DCOS Frameworks
Shipping Docker Apps
1 JSON file
1 CLI command
Apps include
• Spark Drivers
• Web Servers
• Kafka Producers
• Kafka Consumers
• APIs
26
{

"id": "/my-app",

"cpus": 1,

"mem": 2048,

"instances": 1,

"env": {

"NODE_ENV": "dev"

},

"container": {

"type": "DOCKER",

"docker": {

"image": "tractableio/my-app:0.1",

"network": "BRIDGE"

}

},

"healthChecks": [

{

"path": "/healthy",

"maxConsecutiveFailures": 2

}

]

}
*JSON Slightly simplified
DCOS Cluster in 15 minutes
27
1.
Deploy DCOS cluster into AWS using CloudFormation template
2.
Install Cassandra & Kafka
3.
Deploy app docker containers into DCOS
4.
Profit!
Conclusion
• Cassandra’s high write speeds allow it to ingest features from
Deep Networks
• Cassandra and Spark provides a powerful compute+storage system
• Spark, Cassandra and Kafka can provide a versatile data backbone
that supports a range of use cases
• Mesosphere DCOS is a low-effort, high-reward way of running
distributed systems
28

More Related Content

What's hot

Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016DataStax
 
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016DataStax
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...DataStax
 
Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016
Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016
Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016DataStax
 
Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...
Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...
Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...DataStax
 
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...DataStax
 
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...DataStax
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...DataStax
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyondMatija Gobec
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016DataStax
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformDataStax Academy
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... CassandraInstaclustr
 
Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...
Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...
Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...DataStax
 
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...DataStax
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityHiromitsu Komatsu
 
Large partition in Cassandra
Large partition in CassandraLarge partition in Cassandra
Large partition in CassandraShogo Hoshii
 
Load testing Cassandra applications
Load testing Cassandra applicationsLoad testing Cassandra applications
Load testing Cassandra applicationsBen Slater
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXzznate
 

What's hot (19)

Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
 
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
Monitoring Cassandra at Scale (Jason Cacciatore, Netflix) | C* Summit 2016
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
 
Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016
Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016
Lambda Architecture with Cassandra (Vaibhav Puranik, GumGum) | C* Summit 2016
 
Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...
Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...
Deploying, Backups, and Restore w Datastax + Azure at Albertsons/Safeway (Gur...
 
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
 
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyond
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... Cassandra
 
Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...
Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...
Cassandra Exports as a Trivially Parallelizable Problem (Emilio Del Tessandor...
 
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
Develop Scalable Applications with DataStax Drivers (Alex Popescu, Bulat Shak...
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra Community
 
Large partition in Cassandra
Large partition in CassandraLarge partition in Cassandra
Large partition in Cassandra
 
Load testing Cassandra applications
Load testing Cassandra applicationsLoad testing Cassandra applications
Load testing Cassandra applications
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMX
 

Viewers also liked

Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Eric Evans
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflixnkorla1share
 
BKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopBKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopLinaro
 
Music Recommendations at Spotify
Music Recommendations at SpotifyMusic Recommendations at Spotify
Music Recommendations at SpotifyEmily Samuels
 
Building the Modern Data Hub: Beyond the Traditional Enterprise Data Warehouse
Building the Modern Data Hub: Beyond the Traditional Enterprise Data WarehouseBuilding the Modern Data Hub: Beyond the Traditional Enterprise Data Warehouse
Building the Modern Data Hub: Beyond the Traditional Enterprise Data WarehouseFormant
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Modelebenhewitt
 
Using Deep Learning to Find Similar Dresses
Using Deep Learning to Find Similar DressesUsing Deep Learning to Find Similar Dresses
Using Deep Learning to Find Similar DressesHJ van Veen
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 

Viewers also liked (9)

Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflix
 
BKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopBKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing Hadoop
 
Music Recommendations at Spotify
Music Recommendations at SpotifyMusic Recommendations at Spotify
Music Recommendations at Spotify
 
Building the Modern Data Hub: Beyond the Traditional Enterprise Data Warehouse
Building the Modern Data Hub: Beyond the Traditional Enterprise Data WarehouseBuilding the Modern Data Hub: Beyond the Traditional Enterprise Data Warehouse
Building the Modern Data Hub: Beyond the Traditional Enterprise Data Warehouse
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Model
 
Using Deep Learning to Find Similar Dresses
Using Deep Learning to Find Similar DressesUsing Deep Learning to Find Similar Dresses
Using Deep Learning to Find Similar Dresses
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 

Similar to C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016

Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaGoDataDriven
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Databricks
 
Smart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVecSmart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVecJosh Patterson
 
AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)
AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)
AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)Amazon Web Services Korea
 
Real-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to StreamingReal-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to StreamingDatabricks
 
Materials Project Validation, Provenance, and Sandboxes by Dan Gunter
Materials Project Validation, Provenance, and Sandboxes by Dan GunterMaterials Project Validation, Provenance, and Sandboxes by Dan Gunter
Materials Project Validation, Provenance, and Sandboxes by Dan GunterDan Gunter
 
Scalding big ADta
Scalding big ADtaScalding big ADta
Scalding big ADtab0ris_1
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.GeeksLab Odessa
 
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Spark Summit
 
Dublin Ireland Spark Meetup October 15, 2015
Dublin Ireland Spark Meetup October 15, 2015Dublin Ireland Spark Meetup October 15, 2015
Dublin Ireland Spark Meetup October 15, 2015eddiebaggott
 
Designing Artificial Intelligence
Designing Artificial IntelligenceDesigning Artificial Intelligence
Designing Artificial IntelligenceDavid Chou
 
Datacamp @ Transparency Camp 2010
Datacamp @ Transparency Camp 2010Datacamp @ Transparency Camp 2010
Datacamp @ Transparency Camp 2010Knowerce
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioAlluxio, Inc.
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksDatabricks
 
Airbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, AirbnbAirbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, AirbnbLucidworks
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQLYousun Jeong
 
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Databricks
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTORiccardo Zamana
 

Similar to C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016 (20)

Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
 
MCT Virtual Summit 2021
MCT Virtual Summit 2021MCT Virtual Summit 2021
MCT Virtual Summit 2021
 
Smart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVecSmart Data Conference: DL4J and DataVec
Smart Data Conference: DL4J and DataVec
 
AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)
AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)
AI 클라우드로 완전 정복하기 - 데이터 분석부터 딥러닝까지 (윤석찬, AWS테크에반젤리스트)
 
Real-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to StreamingReal-Time Spark: From Interactive Queries to Streaming
Real-Time Spark: From Interactive Queries to Streaming
 
Materials Project Validation, Provenance, and Sandboxes by Dan Gunter
Materials Project Validation, Provenance, and Sandboxes by Dan GunterMaterials Project Validation, Provenance, and Sandboxes by Dan Gunter
Materials Project Validation, Provenance, and Sandboxes by Dan Gunter
 
Scalding big ADta
Scalding big ADtaScalding big ADta
Scalding big ADta
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
 
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
 
Dublin Ireland Spark Meetup October 15, 2015
Dublin Ireland Spark Meetup October 15, 2015Dublin Ireland Spark Meetup October 15, 2015
Dublin Ireland Spark Meetup October 15, 2015
 
Designing Artificial Intelligence
Designing Artificial IntelligenceDesigning Artificial Intelligence
Designing Artificial Intelligence
 
Datacamp @ Transparency Camp 2010
Datacamp @ Transparency Camp 2010Datacamp @ Transparency Camp 2010
Datacamp @ Transparency Camp 2010
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on Databricks
 
Airbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, AirbnbAirbnb Search Architecture: Presented by Maxim Charkov, Airbnb
Airbnb Search Architecture: Presented by Maxim Charkov, Airbnb
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier Dominguez
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQL
 
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTO
 

More from DataStax

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?DataStax
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...DataStax
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsDataStax
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphDataStax
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...DataStax
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache KafkaDataStax
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseDataStax
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0DataStax
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...DataStax
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesDataStax
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDataStax
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudDataStax
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceDataStax
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...DataStax
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...DataStax
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...DataStax
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)DataStax
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsDataStax
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingDataStax
 

More from DataStax (20)

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise Graph
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerce
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
 

Recently uploaded

Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainAbdul Ahad
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 

Recently uploaded (20)

Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software Domain
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 

C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016

  • 1. C* for Deep Learning
  • 2. Tractable 2 • Artificial Intelligence products for businesses • Combining Deep Learning / Neural Networks and traditional Machine Learning • AI using Deep Learning has surpassed human intelligence: – Go – Image Recognition
  • 3. Deep Learning 101 3 Deep Learning A branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using a deep graph with multiple processing layers, composed of multiple linear and non-linear transformations. 1. Requires large amounts of data to train networks 2. Computations made feasible by use of GPUs for dramatic speedup
  • 4. Semantic Image Search 4 Similar ImageSimilar object Search in the meaning of the image NOT search on the image itself
  • 5. Semantic Image Search 5 Similar ImageSimilar object
  • 6. Semantic Search 2 6 Same object Same style • Intent can vary! • With training AI can do both of these tasks • Semantic search is more than just classification: – Ranking within a classification – Search for things that are not classification categories
  • 7. Feature Extraction 7 Feature Vector Trained Deep Net V1 V2 . . . Vn-1 Vn 1 Image 1 TB Images 4000 dimension vector 300 GB of features
  • 9. Feature Extraction 9 CREATE TABLE features listing_id uuid, image_id uuid, feature_vector blob, PRIMARY KEY ((listing_id), image_id) • Processing images at 5 GigaBytes per second • Features generated at 1.6 GigaBytes per second
  • 10. Search 10 “I want a truck like this” • 300 GB of features • Millions of rows in C*
  • 11. Spark 11 Distributed, in-memory computation • Map-reduce • Graph analysis • SQL abstraction DataStax Spark Connector • Understands Cassandra partitioning • Push-down queries to C* keys where possible • Join on C* tables
  • 12. Spark Similarity Search 12 CREATE TABLE features listing_id uuid, image_id uuid, feature_vector blob, PRIMARY KEY ((listing_id), image_id) CREATE TABLE query_results query_id uuid, listing_id uuid, score float, PRIMARY KEY ((query_id), score) 1. Cache our 300GB features in Spark 2. Score every feature vector with our search image 3. Aggregate score for images in each listing 4. (optional) Join on ‘live’ C* data 5. Write result back to C* CREATE TABLE stock_level listing_id uuid, inventory int, etag uuid, PRIMARY KEY (listing_id)
  • 13. Search 13 Search returns results within 5 seconds 100x Speedup from holding data in-memory in Spark 10x Speedup from co-locating Spark and C* 20x Speedup from partitioning on the group key (listing id) 5x Speedup from “Cassandra sort”
  • 16. Sophisticated Analytics • Visualisation • Clustering • Outlier Detection Real time, interactive exploration of large visual data sets
  • 17. Sophisticated Analytics Gas pipe Gas joint Water pipe Water joint Electricity cable Telecoms cable • Visualisation • Clustering • Outlier Detection Real time, interactive exploration of large visual data sets
  • 18. © DataStax, All Rights Reserved. 18 Pre-processing Ingest / ETL Training Annotation / Labelling Analytics & Exploration Classification & Feature Extraction
  • 19. © DataStax, All Rights Reserved. 19 Pre-processing SPANCKIngest / ETL Training Annotation / Labelling Analytics & Exploration Classification & Feature Extraction
  • 20. © DataStax, All Rights Reserved. SPANCK: Spark-Python Algolia Node Cassandra Kafka 20 Data Sources, Files, DB extracts Batched Data Read the data
 Write the model Data Modeling Fast Analytics Distributed
 Data Store Event 
 Processing Read the model Streaming Data Web API Alerts and 
 Notifications Fast Data Processing Data Exploration Search Index
  • 21. © DataStax, All Rights Reserved. Many Services, 1 Engineer 21 FOSS70+ engineers FOSSBerkley RAD Lab200+ contributors
  • 22. © DataStax, All Rights Reserved. Many Services, 1 Engineer 22 FOSS70+ engineers FOSSBerkley RAD Lab200+ contributors Resource Allocation 
 Scheduling Install & Configuration | Maintenance AI Software Persistence | Compute | Queue SPANCK Service Discovery | Resilience
  • 23. © DataStax, All Rights Reserved. DCOS Frameworks Easy Install 23 $ dcos package install kafka -- options kafka-options.json $ dcos package install cassandra -- options cassandra-options.json {
 "nodes": {
 "count": 12,
 "cpus": 9,
 "mem": 30000,
 "disk": 400000,
 "heap": {
 "size": 8000,
 "new": 1500
 }
 },
 "cassandra": {
 "row_cache_size_in_mb": 12000,
 "row_cache_save_period": 600,
 "commitlog_segment_size_in_mb": 64,
 "concurrent_reads": 24,
 "concurrent_writes": 64,
 "memtable_allocation_type": "offheap_objects",
 "compaction_throughput_mb_per_sec": 128
 }
 }
  • 24. © DataStax, All Rights Reserved. DCOS Frameworks 24 function fetchMarathonServiceBaseUrl(serviceName){
 var dns_url = ['_' + serviceName, "_tcp.marathon.mesos"].join('.');
 return resolveSrv(dns_url).then(function(addresses){
 return "http://" + addresses[0].name + ":" + addresses[0].port;
 })
 } fetchCassandraNodes = (mesosHost, authToken) -> mesosDns.fetchMarathonServiceBaseUrl('cassandra').then (baseUrl) ->
 request(
 url: baseUrl + "/v1/nodes/connect",
 headers: headers
 ).then (result) ->
 result = JSON.parse(result)
 return _.pluck(result.nodes, “ip")
 Service Discovery • 0 configuration files • 0 configuration / orchestration systems • Works ever time, in every environment
  • 25. © DataStax, All Rights Reserved. DCOS Frameworks 25 dcos cassandra —name=cassandra cleanup —key_spaces=dev,test dcos cassandra —name=cassandra replace node-4 dcos kafka broker replace 3 Maintenance and Admin DCOS Command Line Tools • Replace a C* node • Backup / Restore C* to AWS S3 • Run C* repair / cleanup • Restart C* nodes • Replace a Kafka broker • Rebalance Kafka brokers • Restart Kafka brokers
  • 26. © DataStax, All Rights Reserved. DCOS Frameworks Shipping Docker Apps 1 JSON file 1 CLI command Apps include • Spark Drivers • Web Servers • Kafka Producers • Kafka Consumers • APIs 26 {
 "id": "/my-app",
 "cpus": 1,
 "mem": 2048,
 "instances": 1,
 "env": {
 "NODE_ENV": "dev"
 },
 "container": {
 "type": "DOCKER",
 "docker": {
 "image": "tractableio/my-app:0.1",
 "network": "BRIDGE"
 }
 },
 "healthChecks": [
 {
 "path": "/healthy",
 "maxConsecutiveFailures": 2
 }
 ]
 } *JSON Slightly simplified
  • 27. DCOS Cluster in 15 minutes 27 1. Deploy DCOS cluster into AWS using CloudFormation template 2. Install Cassandra & Kafka 3. Deploy app docker containers into DCOS 4. Profit!
  • 28. Conclusion • Cassandra’s high write speeds allow it to ingest features from Deep Networks • Cassandra and Spark provides a powerful compute+storage system • Spark, Cassandra and Kafka can provide a versatile data backbone that supports a range of use cases • Mesosphere DCOS is a low-effort, high-reward way of running distributed systems 28