Apache Spark MLlib
● What is Apache Spark ?
● What is MLlib ?
● Functionality
● Dependencies
● Books
● Ecosystem
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Spark – What is it?
● An alternative to MapReduce for certain applications
● A low-latency cluster computing system
● For very large data sets
● May be up to 100 times faster than MapReduce for in-memory workloads
● Can be used with Hadoop / HDFS
● Uses in-memory cluster computing
● Memory access is faster than disk access
● Has APIs for Scala / Java / Python
Spark MLlib – What is it?
● The Spark machine learning library
● Provided with the Spark install
● Code in Scala / Java / Python
● Contains two packages
– spark.mllib
– spark.ml (from V1.2)
● Provides common functionality
– classification, regression, clustering
– collaborative filtering, dimensionality reduction
Spark MLlib – Functionality
● Basic statistics
● Classification and regression
● Collaborative filtering
● Clustering
● Dimensionality reduction
● Feature extraction and transformation
● Optimization
Spark MLlib – Dependencies
● NumPy for Python
● Breeze (linear algebra)
● netlib-java
● jblas
● gfortran runtime library
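
The Python-side dependency in the list above can be checked before running any MLlib code. The sketch below uses only the Python standard library; the helper names has_module and report are hypothetical, and the module names "numpy" and "pyspark" are the usual import names for those packages. The JVM libraries (Breeze, netlib-java, jblas) and the gfortran runtime are not Python modules, so this check does not cover them.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level Python module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def report(names):
    """Map each module name to whether it is available in this environment."""
    return {name: has_module(name) for name in names}


# Check the Python packages that PySpark MLlib code relies on.
status = report(["numpy", "pyspark"])
for name, available in status.items():
    print(name, "found" if available else "missing")
```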
Available Books
● See our Hadoop book from Apress / Springer
– “Big Data Made Easy”
● Look out for our Apache Spark based book
– from Packt in 2015
Spark Ecosystem
Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve your problems

An introduction to Apache Spark MLlib
