SlideShare a Scribd company logo
1 of 78
Download to read offline
The New
Analytics Toolbox
Going beyond Hadoop
whoami
Robbie Strickland
Software Dev Manager
linkedin.com/in/robbiestrickland
@rs_atl
rostrickland@gmail.com
Hadoop works fine, right?
Hadoop works fine, right?
● It’s 11 years old (!!)
Hadoop works fine, right?
● It’s 11 years old (!!)
● It’s relatively slow and inefficient
Hadoop works fine, right?
● It’s 11 years old (!!)
● It’s relatively slow and inefficient
● … because everything gets written to disk
Hadoop works fine, right?
● It’s 11 years old (!!)
● It’s relatively slow and inefficient
● … because everything gets written to disk
● Writing mapreduce code sucks
Hadoop works fine, right?
● It’s 11 years old (!!)
● It’s relatively slow and inefficient
● … because everything gets written to disk
● Writing mapreduce code sucks
● Lots of code for even simple tasks
Hadoop works fine, right?
● It’s 11 years old (!!)
● It’s relatively slow and inefficient
● … because everything gets written to disk
● Writing mapreduce code sucks
● Lots of code for even simple tasks
● Boilerplate
Hadoop works fine, right?
● It’s 11 years old (!!)
● It’s relatively slow and inefficient
● … because everything gets written to disk
● Writing mapreduce code sucks
● Lots of code for even simple tasks
● Boilerplate
● Configuration isn’t for the faint of heart
Landscape has changed ...
Lots of new tools available:
● Cloudera Impala
● Apache Drill
● Proprietary (Splunk, keen.io, etc.)
● Spark / Spark SQL
● Shark
Landscape has changed ...
Lots of new tools available:
● Cloudera Impala
● Apache Drill
● Proprietary (Splunk, keen.io, etc.)
● Spark / Spark SQL
● Shark
MPP queries for Hadoop data
Landscape has changed ...
Lots of new tools available:
● Cloudera Impala
● Apache Drill
● Proprietary (Splunk, keen.io, etc.)
● Spark / Spark SQL
● Shark
Varies by vendor
Landscape has changed ...
Lots of new tools available:
● Cloudera Impala
● Apache Drill
● Proprietary (Splunk, keen.io, etc.)
● Spark / Spark SQL
● Shark
Generic in-memory analysis
Landscape has changed ...
Lots of new tools available:
● Cloudera Impala
● Apache Drill
● Proprietary (Splunk, keen.io, etc.)
● Spark / Spark SQL
● Shark Hive queries on Spark, replaced by Spark SQL
Spark wins?
Spark wins?
● Can be a Hadoop replacement
Spark wins?
● Can be a Hadoop replacement
● Works with any existing InputFormat
Spark wins?
● Can be a Hadoop replacement
● Works with any existing InputFormat
● Doesn’t require Hadoop
Spark wins?
● Can be a Hadoop replacement
● Works with any existing InputFormat
● Doesn’t require Hadoop
● Supports batch & streaming analysis
Spark wins?
● Can be a Hadoop replacement
● Works with any existing InputFormat
● Doesn’t require Hadoop
● Supports batch & streaming analysis
● Functional programming model
Spark wins?
● Can be a Hadoop replacement
● Works with any existing InputFormat
● Doesn’t require Hadoop
● Supports batch & streaming analysis
● Functional programming model
● Direct Cassandra integration
Spark vs. Hadoop
MapReduce:
MapReduce:
MapReduce:
MapReduce:
MapReduce:
MapReduce:
Spark (angels sing!):
What is Spark?
What is Spark?
● In-memory cluster computing
What is Spark?
● In-memory cluster computing
● 10-100x faster than MapReduce
What is Spark?
● In-memory cluster computing
● 10-100x faster than MapReduce
● Collection API over large datasets
What is Spark?
● In-memory cluster computing
● 10-100x faster than MapReduce
● Collection API over large datasets
● Scala, Python, Java
What is Spark?
● In-memory cluster computing
● 10-100x faster than MapReduce
● Collection API over large datasets
● Scala, Python, Java
● Stream processing
What is Spark?
● In-memory cluster computing
● 10-100x faster than MapReduce
● Collection API over large datasets
● Scala, Python, Java
● Stream processing
● Supports any existing Hadoop input / output
format
What is Spark?
● Native graph processing via GraphX
What is Spark?
● Native graph processing via GraphX
● Native machine learning via MLlib
What is Spark?
● Native graph processing via GraphX
● Native machine learning via MLlib
● SQL queries via SparkSQL
What is Spark?
● Native graph processing via GraphX
● Native machine learning via MLlib
● SQL queries via SparkSQL
● Works out of the box on EMR
What is Spark?
● Native graph processing via GraphX
● Native machine learning via MLlib
● SQL queries via SparkSQL
● Works out of the box on EMR
● Easily join datasets from disparate sources
Spark on Cassandra
Spark on Cassandra
● Direct integration via DataStax driver -
cassandra-driver-spark on github
Spark on Cassandra
● Direct integration via DataStax driver -
cassandra-driver-spark on github
● No job config cruft
Spark on Cassandra
● Direct integration via DataStax driver -
cassandra-driver-spark on github
● No job config cruft
● Supports server-side filters (where clauses)
Spark on Cassandra
● Direct integration via DataStax driver -
cassandra-driver-spark on github
● No job config cruft
● Supports server-side filters (where clauses)
● Data locality aware
Spark on Cassandra
● Direct integration via DataStax driver -
cassandra-driver-spark on github
● No job config cruft
● Supports server-side filters (where clauses)
● Data locality aware
● Uses HDFS, CassandraFS, or other
distributed FS for checkpointing
Cassandra with Hadoop
Cassandra
DataNode
TaskTracker
Cassandra
DataNode
TaskTracker
Cassandra
DataNode
TaskTracker
NameNode
2ndaryNN
JobTracker
Cassandra with Spark (using HDFS)
Cassandra
Spark Worker
DataNode
Cassandra
Spark Worker
DataNode
Cassandra
Spark Worker
DataNode
Spark Master
NameNode
2ndaryNN
A Typical Spark Application
A Typical Spark Application
● SparkContext + SparkConf
A Typical Spark Application
● SparkContext + SparkConf
● Data Source to RDD[T]
A Typical Spark Application
● SparkContext + SparkConf
● Data Source to RDD[T]
● Transformations/Actions
A Typical Spark Application
● SparkContext + SparkConf
● Data Source to RDD[T]
● Transformations/Actions
● Saving/Displaying
Resilient Distributed Dataset (RDD)
Resilient Distributed Dataset (RDD)
● A distributed collection of items
Resilient Distributed Dataset (RDD)
● A distributed collection of items
● Transformations
○ Similar to those found in scala collections
○ Lazily processed
Resilient Distributed Dataset (RDD)
● A distributed collection of items
● Transformations
○ Similar to those found in scala collections
○ Lazily processed
● Can recalculate from any point of failure
RDD Transformations vs Actions
Transformations:
Produce new RDDs
Actions:
Require the materialization of the records to produce a
value
RDD Transformations/Actions
Transformations:
filter, map, flatMap, collect(λ):RDD[T], distinct, groupBy,
subtract, union, zip, reduceByKey ...
Actions:
collect:Array[T], count, fold, reduce ...
Resilient Distributed Dataset (RDD)
val numOfAdults = persons.filter(_.age > 17).count()
Transformation Action
Example
Demo
Spark SQL Example
Spark Streaming
Spark Streaming
● Creates RDDs from stream source on a
defined interval
Spark Streaming
● Creates RDDs from stream source on a
defined interval
● Same ops as “normal” RDDs
Spark Streaming
● Creates RDDs from stream source on a
defined interval
● Same ops as “normal” RDDs
● Supports a variety of sources
○ Queues
○ Twitter
○ Flume
○ etc.
Spark Streaming
The Output
Real World Task Distribution
Real World Task Distribution
Transformations and Actions:
Similar to Scala but …
Your choice of transformations need be made with task
distribution and memory in mind
Real World Task Distribution
How do we do that?
● Partition the data appropriately for number of cores (at
least 2 x number of cores)
● Filter early and often (Queries, Filters, Distinct...)
● Use pipelines
Real World Task Distribution
How do we do that?
● Partial Aggregation
● Create an algorithm to be as simple/efficient as it can be
to appropriately answer the question
Real World Task Distribution
Real World Task Distribution
Some Common Costly Transformations:
● sorting
● groupByKey
● reduceByKey
● ...
Thank you!
Robbie Strickland
@rs_atl
rostrickland@gmail.com
https://github.com/rstrickland/sparkdemo

More Related Content

What's hot

Spark Workflow Management
Spark Workflow ManagementSpark Workflow Management
Spark Workflow ManagementRomi Kuntsman
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Chris Fregly
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkSamy Dindane
 
Spark Sql for Training
Spark Sql for TrainingSpark Sql for Training
Spark Sql for TrainingBryan Yang
 
CliqueSquare processing
CliqueSquare processingCliqueSquare processing
CliqueSquare processingINRIA-OAK
 
SPARQL and Linked Data Benchmarking
SPARQL and Linked Data BenchmarkingSPARQL and Linked Data Benchmarking
SPARQL and Linked Data BenchmarkingKristian Alexander
 
Hadoop ecosystem framework n hadoop in live environment
Hadoop ecosystem framework  n hadoop in live environmentHadoop ecosystem framework  n hadoop in live environment
Hadoop ecosystem framework n hadoop in live environmentDelhi/NCR HUG
 
Spark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of DatabricksSpark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of DatabricksData Con LA
 
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...Databricks
 
Spark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin OderskySpark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin OderskySpark Summit
 
RESTful writable APIs for the web of Linked Data using relational storage sol...
RESTful writable APIs for the web of Linked Data using relational storage sol...RESTful writable APIs for the web of Linked Data using relational storage sol...
RESTful writable APIs for the web of Linked Data using relational storage sol...Antonio Garrote Hernández
 
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data Analytics
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data AnalyticsFugue: Unifying Spark and Non-Spark Ecosystems for Big Data Analytics
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data AnalyticsDatabricks
 
PostgreSQL Finland October meetup - PostgreSQL monitoring in Zalando
PostgreSQL Finland October meetup - PostgreSQL monitoring in ZalandoPostgreSQL Finland October meetup - PostgreSQL monitoring in Zalando
PostgreSQL Finland October meetup - PostgreSQL monitoring in ZalandoUri Savelchev
 
Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...
Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...
Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...Lucidworks
 
Simple Works Best
 Simple Works Best Simple Works Best
Simple Works BestEDB
 
The Revolution Will be Streamed
The Revolution Will be StreamedThe Revolution Will be Streamed
The Revolution Will be StreamedDatabricks
 
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)Spark Summit
 
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...Spark Summit
 
What Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and GoWhat Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and GoScyllaDB
 

What's hot (20)

Spark Workflow Management
Spark Workflow ManagementSpark Workflow Management
Spark Workflow Management
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Spark Sql for Training
Spark Sql for TrainingSpark Sql for Training
Spark Sql for Training
 
CliqueSquare processing
CliqueSquare processingCliqueSquare processing
CliqueSquare processing
 
SPARQL and Linked Data Benchmarking
SPARQL and Linked Data BenchmarkingSPARQL and Linked Data Benchmarking
SPARQL and Linked Data Benchmarking
 
Hadoop ecosystem framework n hadoop in live environment
Hadoop ecosystem framework  n hadoop in live environmentHadoop ecosystem framework  n hadoop in live environment
Hadoop ecosystem framework n hadoop in live environment
 
Spark SQL
Spark SQLSpark SQL
Spark SQL
 
Spark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of DatabricksSpark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of Databricks
 
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
 
Spark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin OderskySpark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin Odersky
 
RESTful writable APIs for the web of Linked Data using relational storage sol...
RESTful writable APIs for the web of Linked Data using relational storage sol...RESTful writable APIs for the web of Linked Data using relational storage sol...
RESTful writable APIs for the web of Linked Data using relational storage sol...
 
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data Analytics
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data AnalyticsFugue: Unifying Spark and Non-Spark Ecosystems for Big Data Analytics
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data Analytics
 
PostgreSQL Finland October meetup - PostgreSQL monitoring in Zalando
PostgreSQL Finland October meetup - PostgreSQL monitoring in ZalandoPostgreSQL Finland October meetup - PostgreSQL monitoring in Zalando
PostgreSQL Finland October meetup - PostgreSQL monitoring in Zalando
 
Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...
Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...
Solr & R to Deploy Custom Search Interface: Presented by Patrick Beaucamp, Bp...
 
Simple Works Best
 Simple Works Best Simple Works Best
Simple Works Best
 
The Revolution Will be Streamed
The Revolution Will be StreamedThe Revolution Will be Streamed
The Revolution Will be Streamed
 
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
 
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
Spark as a Platform to Support Multi-Tenancy and Many Kinds of Data Applicati...
 
What Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and GoWhat Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and Go
 

Viewers also liked

チーミングを通じて最強のチームを
チーミングを通じて最強のチームをチーミングを通じて最強のチームを
チーミングを通じて最強のチームをYoshiaki Yoshida
 
プログラマは何を勉強するか
プログラマは何を勉強するかプログラマは何を勉強するか
プログラマは何を勉強するかなおき きしだ
 
はじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudy
はじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudyはじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudy
はじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudyShinichi Nakagawa
 
Jenkinsを使った継続的セキュリティテスト
Jenkinsを使った継続的セキュリティテストJenkinsを使った継続的セキュリティテスト
Jenkinsを使った継続的セキュリティテストichikaway
 
本格的に始めるzsh
本格的に始めるzsh本格的に始めるzsh
本格的に始めるzshHideaki Miyake
 
ネイティブコードを語る
ネイティブコードを語るネイティブコードを語る
ネイティブコードを語るKenji Imasaki
 
zshでコマンドライン履歴を活用する
zshでコマンドライン履歴を活用するzshでコマンドライン履歴を活用する
zshでコマンドライン履歴を活用するHideaki Miyake
 
スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)
スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)
スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)Yuichiro Yamamoto
 
Mackerelに触れる前にサーバー監視について考えてみよう
Mackerelに触れる前にサーバー監視について考えてみようMackerelに触れる前にサーバー監視について考えてみよう
Mackerelに触れる前にサーバー監視について考えてみようgu4
 
インフラ自動化とHashicorp tools
インフラ自動化とHashicorp toolsインフラ自動化とHashicorp tools
インフラ自動化とHashicorp toolsUchio Kondo
 
hb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_auto
hb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_autohb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_auto
hb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_autoYuichiro Saito
 
Livesense tech night immutable-js at a glance
Livesense tech night   immutable-js at a glanceLivesense tech night   immutable-js at a glance
Livesense tech night immutable-js at a glanceYuta Shimakawa
 
メタ勉強会 - カジュアルトーク駆動学習
メタ勉強会 - カジュアルトーク駆動学習メタ勉強会 - カジュアルトーク駆動学習
メタ勉強会 - カジュアルトーク駆動学習Yoshiaki Yoshida
 
はじめてのVue.js
はじめてのVue.jsはじめてのVue.js
はじめてのVue.jskamiyam .
 
最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.
最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.
最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.Kensaku Komatsu
 
Behat+Symfony2ではじめるBDD超入門
Behat+Symfony2ではじめるBDD超入門Behat+Symfony2ではじめるBDD超入門
Behat+Symfony2ではじめるBDD超入門晃 遠山
 

Viewers also liked (20)

チーミングを通じて最強のチームを
チーミングを通じて最強のチームをチーミングを通じて最強のチームを
チーミングを通じて最強のチームを
 
プログラマは何を勉強するか
プログラマは何を勉強するかプログラマは何を勉強するか
プログラマは何を勉強するか
 
はじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudy
はじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudyはじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudy
はじめてのLean Canvas〜最初のアイディアを言語化してみよう #bpstudy
 
Jenkinsを使った継続的セキュリティテスト
Jenkinsを使った継続的セキュリティテストJenkinsを使った継続的セキュリティテスト
Jenkinsを使った継続的セキュリティテスト
 
pecoを使おう
pecoを使おうpecoを使おう
pecoを使おう
 
本格的に始めるzsh
本格的に始めるzsh本格的に始めるzsh
本格的に始めるzsh
 
ネイティブコードを語る
ネイティブコードを語るネイティブコードを語る
ネイティブコードを語る
 
zshでコマンドライン履歴を活用する
zshでコマンドライン履歴を活用するzshでコマンドライン履歴を活用する
zshでコマンドライン履歴を活用する
 
Antigenを使おう
Antigenを使おうAntigenを使おう
Antigenを使おう
 
スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)
スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)
スクラムナイト7 ユーザーストーリーマッピングやで、知らんけど(公開用抜粋)
 
Mackerelに触れる前にサーバー監視について考えてみよう
Mackerelに触れる前にサーバー監視について考えてみようMackerelに触れる前にサーバー監視について考えてみよう
Mackerelに触れる前にサーバー監視について考えてみよう
 
インフラ自動化とHashicorp tools
インフラ自動化とHashicorp toolsインフラ自動化とHashicorp tools
インフラ自動化とHashicorp tools
 
hb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_auto
hb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_autohb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_auto
hb-agent 秘伝のタレからソースコードへ (ITインフラ 業務自動化現状確認会 ) #infra_auto
 
Livesense tech night immutable-js at a glance
Livesense tech night   immutable-js at a glanceLivesense tech night   immutable-js at a glance
Livesense tech night immutable-js at a glance
 
宇宙zsh #2
宇宙zsh #2宇宙zsh #2
宇宙zsh #2
 
今から始めるzsh
今から始めるzsh今から始めるzsh
今から始めるzsh
 
メタ勉強会 - カジュアルトーク駆動学習
メタ勉強会 - カジュアルトーク駆動学習メタ勉強会 - カジュアルトーク駆動学習
メタ勉強会 - カジュアルトーク駆動学習
 
はじめてのVue.js
はじめてのVue.jsはじめてのVue.js
はじめてのVue.js
 
最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.
最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.
最新Web 通信系API総まくり!WebRTC, Streams, Push api etc.
 
Behat+Symfony2ではじめるBDD超入門
Behat+Symfony2ではじめるBDD超入門Behat+Symfony2ではじめるBDD超入門
Behat+Symfony2ではじめるBDD超入門
 

Similar to New Analytics Toolbox

New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015Robbie Strickland
 
Introduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at lastIntroduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at lastHolden Karau
 
Improving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVMImproving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVMHolden Karau
 
A super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAMA super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAMHolden Karau
 
A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...Holden Karau
 
Streaming & Scaling Spark - London Spark Meetup 2016
Streaming & Scaling Spark - London Spark Meetup 2016Streaming & Scaling Spark - London Spark Meetup 2016
Streaming & Scaling Spark - London Spark Meetup 2016Holden Karau
 
Apache Jena Elephas and Friends
Apache Jena Elephas and FriendsApache Jena Elephas and Friends
Apache Jena Elephas and FriendsRob Vesse
 
Beyond Shuffling and Streaming Preview - Salt Lake City Spark Meetup
Beyond Shuffling and Streaming Preview - Salt Lake City Spark MeetupBeyond Shuffling and Streaming Preview - Salt Lake City Spark Meetup
Beyond Shuffling and Streaming Preview - Salt Lake City Spark MeetupHolden Karau
 
The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...Holden Karau
 
Big data beyond the JVM - DDTX 2018
Big data beyond the JVM -  DDTX 2018Big data beyond the JVM -  DDTX 2018
Big data beyond the JVM - DDTX 2018Holden Karau
 
Keeping the fun in functional w/ Apache Spark @ Scala Days NYC
Keeping the fun in functional   w/ Apache Spark @ Scala Days NYCKeeping the fun in functional   w/ Apache Spark @ Scala Days NYC
Keeping the fun in functional w/ Apache Spark @ Scala Days NYCHolden Karau
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016Holden Karau
 
A fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsA fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsHolden Karau
 
[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache Spark[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache SparkNaukri.com
 
Spark Advanced Analytics NJ Data Science Meetup - Princeton University
Spark Advanced Analytics NJ Data Science Meetup - Princeton UniversitySpark Advanced Analytics NJ Data Science Meetup - Princeton University
Spark Advanced Analytics NJ Data Science Meetup - Princeton UniversityAlex Zeltov
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaItai Yaffe
 
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018
Beyond Wordcount  with spark datasets (and scalaing) - Nide PDX Jan 2018Beyond Wordcount  with spark datasets (and scalaing) - Nide PDX Jan 2018
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018Holden Karau
 
MapReduce with Hadoop and Ruby
MapReduce with Hadoop and RubyMapReduce with Hadoop and Ruby
MapReduce with Hadoop and RubySwanand Pagnis
 
Apache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource ManagerApache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource Managerharidasnss
 

Similar to New Analytics Toolbox (20)

New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015
 
Introduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at lastIntroduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at last
 
Improving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVMImproving PySpark performance: Spark Performance Beyond the JVM
Improving PySpark performance: Spark Performance Beyond the JVM
 
A super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAMA super fast introduction to Spark and glance at BEAM
A super fast introduction to Spark and glance at BEAM
 
A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...A really really fast introduction to PySpark - lightning fast cluster computi...
A really really fast introduction to PySpark - lightning fast cluster computi...
 
Streaming & Scaling Spark - London Spark Meetup 2016
Streaming & Scaling Spark - London Spark Meetup 2016Streaming & Scaling Spark - London Spark Meetup 2016
Streaming & Scaling Spark - London Spark Meetup 2016
 
Apache Jena Elephas and Friends
Apache Jena Elephas and FriendsApache Jena Elephas and Friends
Apache Jena Elephas and Friends
 
Beyond Shuffling and Streaming Preview - Salt Lake City Spark Meetup
Beyond Shuffling and Streaming Preview - Salt Lake City Spark MeetupBeyond Shuffling and Streaming Preview - Salt Lake City Spark Meetup
Beyond Shuffling and Streaming Preview - Salt Lake City Spark Meetup
 
The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...
 
Big data beyond the JVM - DDTX 2018
Big data beyond the JVM -  DDTX 2018Big data beyond the JVM -  DDTX 2018
Big data beyond the JVM - DDTX 2018
 
Keeping the fun in functional w/ Apache Spark @ Scala Days NYC
Keeping the fun in functional   w/ Apache Spark @ Scala Days NYCKeeping the fun in functional   w/ Apache Spark @ Scala Days NYC
Keeping the fun in functional w/ Apache Spark @ Scala Days NYC
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016Getting started with Apache Spark in Python - PyLadies Toronto 2016
Getting started with Apache Spark in Python - PyLadies Toronto 2016
 
A fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFsA fast introduction to PySpark with a quick look at Arrow based UDFs
A fast introduction to PySpark with a quick look at Arrow based UDFs
 
[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache Spark[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache Spark
 
Spark Advanced Analytics NJ Data Science Meetup - Princeton University
Spark Advanced Analytics NJ Data Science Meetup - Princeton UniversitySpark Advanced Analytics NJ Data Science Meetup - Princeton University
Spark Advanced Analytics NJ Data Science Meetup - Princeton University
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and Kafka
 
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018
Beyond Wordcount  with spark datasets (and scalaing) - Nide PDX Jan 2018Beyond Wordcount  with spark datasets (and scalaing) - Nide PDX Jan 2018
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018
 
MapReduce with Hadoop and Ruby
MapReduce with Hadoop and RubyMapReduce with Hadoop and Ruby
MapReduce with Hadoop and Ruby
 
Apache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource ManagerApache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource Manager
 

Recently uploaded

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

New Analytics Toolbox