Project Progress
What we’ve been doing(1)
• Hacking on the Hadoop API.
• Writing different kinds of programs to
  understand it (not CV programs).
• AdaBoost
• SIFT, SURF
• Reading, reading
Segmentation

• Segmentation with overlap
• Get SIFT/SURF descriptors for partial segments
• Reduce the number of descriptors by grouping them
• Regions of interest (positive & negative)
• Count the frequency of occurrence of visual words
• AdaBoost
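
The "count the frequency of occurrence of visual words" step can be sketched as follows. This is only an illustration: it assumes the grouping step has already produced a codebook (for example by k-means), and the function names, the 2-D toy descriptors, and the normalisation are assumptions, not part of the project code.

```python
import numpy as np

def assign_visual_words(descriptors, codebook):
    """Return the index of the nearest codebook entry for each descriptor."""
    # Squared Euclidean distance between every descriptor and every visual word.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def visual_word_histogram(descriptors, codebook):
    """Count how often each visual word occurs, normalised to sum to 1."""
    words = assign_visual_words(descriptors, codebook)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts

# Toy data: 2-D stand-ins for real SIFT/SURF descriptors (hypothetical values).
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[0.5, -0.2], [9.0, 11.0], [10.5, 9.5], [0.1, 0.3]])
hist = visual_word_histogram(descriptors, codebook)
```

A histogram like `hist`, computed per region of interest, is the kind of fixed-length feature vector that could then be passed to AdaBoost.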
Methodology

• For simplicity, assume the same image is
  stored on all slave nodes.
• Use ROIs to run the algorithm.
• Hopefully this will make the “Reduce” step
  easier.
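
One way to picture this setup: every node holds the same image, each mapper works on one ROI, and the reducer only has to collect the per-ROI results. The sketch below is a plain-Python stand-in, not Hadoop code; the `(top, left, height, width)` ROI layout and the mean-intensity "feature" are placeholders chosen for illustration.

```python
import numpy as np

def map_roi(image, roi):
    """Mapper: crop one ROI from the shared image and emit (roi, feature)."""
    top, left, height, width = roi
    patch = image[top:top + height, left:left + width]
    # Placeholder feature: mean intensity. A real job would compute
    # SIFT/SURF descriptors for the patch here.
    return roi, float(patch.mean())

def reduce_rois(pairs):
    """Reducer: collect the per-ROI results into one dictionary."""
    return {roi: feature for roi, feature in pairs}

image = np.arange(16, dtype=float).reshape(4, 4)  # the image every node holds
rois = [(0, 0, 2, 2), (2, 2, 2, 2)]               # (top, left, height, width)
results = reduce_rois(map_roi(image, roi) for roi in rois)
```

Because each mapper's output is keyed by its ROI, the reduce step needs no knowledge of the image itself.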
Map-Reduce???
• It’s just a framework.
• You can also implement it yourself by reading the
  paper [1]. :)
• Hadoop is one implementation (Apache +
  Yahoo).
• Google’s implementation is not
  public.
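
To show how small the core idea is, here is a minimal in-memory sketch of the map, group-by-key, and reduce phases. It is a toy, not how Hadoop or Google's system is built: there is no distribution, fault tolerance, or disk I/O, and all names are made up for the example.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run mapper over records, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):   # map phase
            groups[key].append(value)       # shuffle: group values by key
    # Reduce phase: one reducer call per key.
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    for word in line.split():
        yield word.lower(), 1

def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)

counts = map_reduce(["map reduce", "Map again"], word_mapper, count_reducer)
```

The user supplies only `mapper` and `reducer`; everything else belongs to the framework, which is the part a real implementation distributes across machines.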
Map-Reduce for Machine Learning on Multi-core
Introduction

• Algorithms that fit the Statistical Query Model
  can be written in a certain “summation
  form”.
• Divide the data set into as many pieces as
  the number of cores.
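
A small sketch of what "summation form" buys: if the quantity needed is a sum over data points, each piece of the data can be summed independently and the partial sums added at the end. Mean and variance are used here only as a simple stand-in for a learning algorithm, and the thread pool stands in for cores; both choices are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def partial_stats(chunk):
    """Mapper: statistics that are plain sums over the points in one chunk."""
    return chunk.sum(), (chunk ** 2).sum(), len(chunk)

def mean_and_variance(data, n_cores):
    chunks = np.array_split(data, n_cores)  # one piece per core
    with ThreadPoolExecutor(max_workers=n_cores) as pool:
        partials = list(pool.map(partial_stats, chunks))
    # Reducer: add up the partial sums, then finish the computation.
    s, sq, n = (sum(p[i] for p in partials) for i in range(3))
    mean = s / n
    return mean, sq / n - mean ** 2

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
mean, var = mean_and_variance(data, n_cores=2)
```

The result does not depend on how the data is split, which is what lets the number of pieces follow the number of cores.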
Algorithms(1)
• Locally Weighted Linear Regression (LWLR)
• Naive Bayes
• Gaussian Discriminant Analysis
• k-means
• Logistic Regression
• Neural Network
Algorithms(2)

• Principal Components Analysis
• Independent Components Analysis
• Expectation Maximization
• Support Vector Machine
Example (LWLR)


Divide the computation among different mappers, each computing partial values for A and b.

Two reducers sum up the partial values for A and b, and the solution is finally computed from those sums.
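
The LWLR example can be sketched in code under one assumption: that A and b are the usual weighted normal-equation sums, A = Σ wᵢ xᵢ xᵢᵀ and b = Σ wᵢ xᵢ yᵢ, with the solution θ obtained from Aθ = b. The function names and toy data are hypothetical, and the mappers run sequentially here rather than on separate cores.

```python
import numpy as np

def lwlr_mapper(X, y, w):
    """Partial sums over one chunk: sum of w_i x_i x_i^T and sum of w_i x_i y_i."""
    weighted = X * w[:, None]
    return weighted.T @ X, weighted.T @ y

def lwlr_solve(X, y, w, n_mappers):
    chunks = zip(np.array_split(X, n_mappers),
                 np.array_split(y, n_mappers),
                 np.array_split(w, n_mappers))
    parts = [lwlr_mapper(Xc, yc, wc) for Xc, yc, wc in chunks]
    A = sum(p[0] for p in parts)   # first reducer: total A
    b = sum(p[1] for p in parts)   # second reducer: total b
    return np.linalg.solve(A, b)   # final step: solve A theta = b

# Toy data lying exactly on y = 1 + 2x; the first column is the intercept term.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = np.array([1.0, 0.5, 0.5, 1.0])  # per-point weights (arbitrary here)
theta = lwlr_solve(X, y, w, n_mappers=2)
```

Since the toy data is exactly linear, `theta` recovers the intercept and slope regardless of the weights or the number of mappers.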
Experiment Result
• Used the UCI Machine Learning Repository.
• Used only 2 cores.
• 1.9 times faster.
• 54 times speedup on 64 cores.
• The speedup is achieved only by “throwing cores”
  at the problem.
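
As a quick reading of these numbers, speedup divided by core count gives the parallel efficiency. The helper below is just that arithmetic, applied to the two figures quoted above; the function name is made up for the example.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores

eff_2 = parallel_efficiency(1.9, 2)    # about 0.95
eff_64 = parallel_efficiency(54, 64)   # 0.84375
```

So the reported speedup is close to linear on 2 cores and still above 80% of linear on 64 cores.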
