DATA SCIENTIST’S
DAILY LIFE
BRYAN YANG 2015.09
ABOUT ME
• Blog
Bryan的行銷研究及資料分析筆記 (Bryan's Marketing Research and Data Analysis Notes)
http://bryannotes.blogspot.tw
• Group
Spark.TW
AGENDA
• Data scientist?
• Big data and data scientist
• Data scientist’s Toolbox
• Data is the biggest
Derive Knowledge from Big Data, Efficiently and Intelligently
FROM BACKEND TO FRONTEND
https://doubleclix.wordpress.com/2012/12/15/what-or-who-is-a-data-scientist/
WHAT IS BIG DATA?
WHERE DO THE DATA COME FROM
• Web Log data
• Machine data
• Transactional data
• Social media data
• …
https://plus.google.com/+DigitalStrategyIE
A WEB SERVICE RECEIVES MORE THAN 50 GB OF LOG DATA PER DAY
TOTAL SPACE USED IN THE LAST THREE MONTHS: 4,500 GB
TOTAL SPACE USED IN THE LAST YEAR: 18,000 GB (17.6 TB)
• Data Storage / Backup
• 2 TB per HDD
• How do we store MORE than 2 TB of data?
• $0.30 USD per gigabyte
• Pay 900 USD just for KEEPING the data, before doing anything else with it.
• Read/Write Speed
• Read: 131.6 MB/s / Write: 131.4 MB/s
• It takes 393 s (about 6.5 min) just to read ONE day of data.
• Large numbers of transactions arrive at once
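The storage and read-time figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes 1 GB = 1024 MB and a 90-day quarter, and the function names are made up for this example. All other numbers are taken from the slides.

```python
import math

# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not from the slides): 1 GB = 1024 MB, a quarter = 90 days.
DAILY_LOG_GB = 50            # log volume received per day
COST_PER_GB_USD = 0.30       # storage price per gigabyte
READ_SPEED_MB_S = 131.6      # sequential read speed of one HDD
HDD_CAPACITY_GB = 2 * 1024   # one 2 TB drive


def storage_gb(days: int) -> int:
    """Total log volume accumulated over `days` days."""
    return DAILY_LOG_GB * days


def cost_usd(gigabytes: float) -> float:
    """Cost of simply keeping `gigabytes` of data."""
    return gigabytes * COST_PER_GB_USD


def read_seconds(gigabytes: float) -> float:
    """Time to read `gigabytes` sequentially from a single drive."""
    return gigabytes * 1024 / READ_SPEED_MB_S


def drives_needed(gigabytes: float) -> int:
    """Number of 2 TB drives needed to hold `gigabytes`, rounded up."""
    return math.ceil(gigabytes / HDD_CAPACITY_GB)


print(storage_gb(90))                      # 4500 GB in three months
print(round(read_seconds(DAILY_LOG_GB)))   # 389 s for exactly 50 GB
print(drives_needed(storage_gb(90)))       # 3 drives
```

Reading exactly 50 GB comes out a few seconds under the 393 s on the slide, which is consistent with a day holding slightly more than 50 GB. At $0.30 per GB, the 900 USD figure corresponds to roughly 3,000 GB of stored data.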
HADOOP AND
MAPREDUCE
HADOOP AND HDFS
http://www.fraudtechwire.com/f-level-guide-to-hadoop-hdfs/
– DISTRIBUTED ALGORITHM
“The world will change when data is distributed.”
MAPREDUCE
http://www.milanor.net/blog/?p=853
https://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/
http://blog.agro-know.com/?p=3810
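The MapReduce idea can be sketched in plain Python with the classic word-count example. This is a single-process illustration of the map, shuffle, and reduce phases, not Hadoop's actual API; the function names and the sample input are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory input."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "data scientist", "big big data"])
print(counts)  # {'big': 3, 'data': 3, 'scientist': 1}
```

In a real cluster the map calls run in parallel on the nodes that hold each block of data, and the shuffle moves the pairs across the network so that each reducer sees all values for its keys.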
PERFORMANCE OF HADOOP?
• Not great, but at least the job runs.
• Counting one day of data (86,389,084 rows) takes 39 sec.
(10 nodes, each with 64 GB RAM and 2 × 8-core E5 CPUs)
• How about 39 sec × 30 days?
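The question above can be answered with simple arithmetic. The sketch assumes that run time grows linearly with the number of days scanned, which is an assumption made here and not a measurement; the names are illustrative.

```python
# Numbers from the slide: one day of data and the time to count it.
ROWS_PER_DAY = 86_389_084
SECONDS_PER_DAY_OF_DATA = 39


def naive_scan_seconds(days: int) -> int:
    """Estimated time to count `days` days of data, assuming linear scaling."""
    return SECONDS_PER_DAY_OF_DATA * days


def rows_per_second() -> float:
    """Throughput of the whole cluster for the one-day count."""
    return ROWS_PER_DAY / SECONDS_PER_DAY_OF_DATA


print(naive_scan_seconds(30))         # 1170 seconds, i.e. 19.5 minutes
print(round(rows_per_second() / 1e6, 1))  # 2.2 (million rows per second)
```

Under that assumption, a simple count over one month of data already takes close to 20 minutes.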
BEFORE ANALYTIC…
EXTRACT TRANSFORM LOAD
m/e-university/data-warehouse-etl-toolkit-tutorial-201/surrounding-the-requirements-1
e.net/capgemini/emc-world-2014-breakout-move-to-the-business-data-lake-not-a
/hortonworks/modern-data-architecture-for-a-data-lake-with-informatica-and-h
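A minimal sketch of the three ETL steps, using only the Python standard library. The log columns, the table name, and the cleaning rule are hypothetical and chosen only for illustration; a production pipeline would read from real log files and load into a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw web-log extract; the columns are made up for this sketch.
RAW_CSV = """user_id,page,duration_ms
1,/home,1200
2,/cart,
1,/cart,800
"""


def extract(text: str) -> list:
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list) -> list:
    """Transform: drop rows with a missing duration, convert ms to seconds."""
    cleaned = []
    for row in rows:
        if not row["duration_ms"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["user_id"]), row["page"], int(row["duration_ms"]) / 1000)
        )
    return cleaned


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views "
        "(user_id INTEGER, page TEXT, duration_s REAL)"
    )
    conn.executemany("INSERT INTO page_views VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT COUNT(*) FROM page_views").fetchone()[0])  # 2
```

Keeping the three steps as separate functions makes it easy to schedule and re-run each one on its own.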
DATA SCIENTIST’S
TOOL BOX
LINUX
• The best choice for servers
• Free of charge and free as in freedom
• Easy to control the system
• Easy data processing
• Hadoop is typically deployed on Linux
POWERFUL SHELL SCRIPT
SQL DATABASE
• MySQL, PostgreSQL, Hive, MongoDB (NoSQL)
• Standard SQL language
• Store and manage data
RELATIONAL DATABASE
TABLE RELATION
https://cloudant.com/blog/foundbites-data-model-relational-db-vs-nosql-on-cloudant/
http://ghtorrent.org/relational.html
SQL SYNTAX
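To make table relations and SQL syntax concrete, here is a small sketch that runs standard SQL through Python's built-in sqlite3 module. The two tables, their columns, and the sample rows are hypothetical; the same SELECT / JOIN / GROUP BY pattern applies to the databases listed earlier, with minor dialect differences.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two related tables: every order points at a user through user_id.
conn.executescript("""
CREATE TABLE users  (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     user_id INTEGER REFERENCES users(user_id),
                     amount REAL);
INSERT INTO users  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 30.0), (11, 1, 20.0), (12, 2, 5.0);
""")

# Join the tables on the shared key, then aggregate per user.
QUERY = """
SELECT u.name, COUNT(*) AS n_orders, SUM(o.amount) AS total
FROM users AS u
JOIN orders AS o ON o.user_id = u.user_id
GROUP BY u.name
ORDER BY total DESC
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('Ann', 2, 50.0), ('Bob', 1, 5.0)]
```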
R & PYTHON
• Basic Analysis Tools
• Easy to Learn
• Many Packages
• Example
• http://bryannotes.blogspot.tw/2014/08/r-ptt-wantedsocial-network-analysis.html
• http://bryannotes.blogspot.tw/2014/10/python-k-means-script.html
ETC…
• Excel
• Google Analytics
• Visualisation tools (Tableau)
• Web crawler
• Version control management (Git)
• ETL and job scheduling tools (Jenkins)
• …
DATA IS THE BIGGEST
– JOSH WILLS
“Person who is better at statistics than any software
engineer and better at software engineering than
any statistician.”
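STATISTICS
WHY DO WE NEED MACHINE LEARNING?
• Clustering
How many groups can these people be divided into?
• Classification
Which group does each person belong to?
• Regression
What is the probability that an event occurs, or that a person belongs to a given group?
• Dimensionality reduction
Reduce the number of dimensions in the data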
CLUSTERING
http://simplystatistics.org/2014/02/18/k-means-clustering-in-a-gif/
source http://humble-developer.blogspot.tw/2011/01/kmeans-clustering-algorithm-part-1.html
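The linked sources illustrate k-means clustering. The sketch below shows the two alternating steps (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain Python. The sample points and the starting centroids are made up, and fixed starting centroids are used here instead of random initialisation so that the result is reproducible.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def assign(points: List[Point], centroids: List[Point]) -> List[int]:
    """Assignment step: index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def update(points: List[Point], labels: List[int],
           centroids: List[Point]) -> List[Point]:
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if not members:  # an empty cluster keeps its old centroid
            new_centroids.append(old)
            continue
        new_centroids.append((
            sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members),
        ))
    return new_centroids


def kmeans(points: List[Point], centroids: List[Point], max_iter: int = 100):
    """Alternate the two steps until the centroids stop moving."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```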
CLASSIFICATION
http://letsmakerobots.com/content/tcs3200-color-sensor-with-k-nearest-neighbor-classification-algorithm
http://www.astroml.org/sklearn_tutorial/
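One of the linked examples uses k-nearest-neighbour classification on colour-sensor readings. A minimal sketch of that idea follows: a new sample gets the majority label among its k closest training samples. The RGB values and labels below are invented for illustration and are not taken from the linked article.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Return the majority label among the k training samples nearest to query."""
    neighbours = sorted(train, key=lambda s: euclidean(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical RGB readings labelled by colour.
train = [
    ((250, 10, 10), "red"), ((230, 30, 20), "red"), ((200, 0, 40), "red"),
    ((10, 240, 20), "green"), ((30, 220, 10), "green"), ((0, 200, 50), "green"),
]

print(knn_predict(train, (240, 20, 30)))  # red
print(knn_predict(train, (20, 230, 30)))  # green
```

Unlike clustering, classification needs labelled training data: the algorithm does not discover the groups, it only decides which known group a new sample belongs to.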
LOGISTIC REGRESSION
https://www.coursera.org/instructor/andrewng
COST FUNCTION
https://www.coursera.org/instructor/andrewng
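The slides show logistic regression and its cost function as images. Assuming they refer to the usual formulation, the model passes a weighted sum through the sigmoid function to get a probability, and the cost is the average cross-entropy, J = -(1/m) Σ [y log(h) + (1 - y) log(1 - h)]. The sketch below computes both in plain Python; the tiny data set and the parameter values are made up.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights: Sequence[float], bias: float,
                  x: Sequence[float]) -> float:
    """Predicted probability that x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def cost(weights: Sequence[float], bias: float,
         X: List[Sequence[float]], y: List[int]) -> float:
    """Average cross-entropy (log loss) over the data set."""
    total = 0.0
    for x, label in zip(X, y):
        p = predict_proba(weights, bias, x)
        total += -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return total / len(X)


# Hypothetical one-feature data set: small values are class 0, large are class 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

untrained = cost([0.0], 0.0, X, y)   # every prediction is 0.5, so cost = ln 2
better = cost([2.0], -3.0, X, y)     # decision boundary at x = 1.5
print(untrained > better)  # True
```

Training a logistic regression means searching for the weights and bias that minimise this cost, typically with gradient descent.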
OVERFITTING
https://www.coursera.org/instructor/andrewng
OH MY GOD!
HOW DO WE CHOOSE ONE?
MACHINE LEARNING ALGORITHMS
http://amueller.github.io/sklearn_tutorial/
STATISTICS VS ML
STATISTICS
• Focus on understanding data in terms of models
• Interpretability, hypothesis testing
MACHINE LEARNING
• Focus on the analysis of learning algorithms
• Greater focus on prediction
SYSTEMATICS AND AUTOMATION
http://www.slideshare.net/CetasAnalytics/cetas-e-baymeetupprezofinal
http://mlg.postech.ac.kr/projects/
SHOW YOUR DATA AND
FINDINGS
http://hortonworks.com/wp-content/uploads/2012/06/Tableau2.png
http://www.tableau.com
THE REAL CASE
HOW TO START?
• Codecademy http://www.codecademy.com/
Covers many programming languages, e.g. Python,
JavaScript, even shell script and SQL
• Coursera http://www.coursera.org/
A well-known MOOC website for self-paced learning.
http://nirvacana.com/thoughts/becoming-a-data-scientist/
