© 2015 Hudson Data Corp. All Rights Reserved. www.bitbootcamp.com
Our Training Offerings
Skills you need
Algorithms
The Brains
– Introduction to Data Science
– Data Munging & Fusion
– Text Mining
• Naïve Bayes
– Recommendation Engines
– Principal Component Analysis
– Classification
• Decision Trees
• Random Forest
• Gradient Boosting Machines
– Generalized Linear Models
– Clustering
• KNN
• K-Means
– Graph Theory
– Stable Marriage
Hadoop
Big Data
Core
Engineering
Training Overview
Evening Classes
Track 1: Big Data
Track 2: Machine Learning
Big Data Training
4 week intensive big data Evening Classes
Week 1 Week 2 Week 3 Week 4 Self Study
Certifications
Complete the industry
standard Hadoop certification
For Data Science
Track1
Machine Learning Training
6 Week Data Science Evening Classes
Week 1 Week 2 Week 3 Week 4 Week 5 Week 6
Introduction to
Machine Learning
Recommendation Engines
Collaborative Filtering
Gradient Boosting
Machines
For Data Science
Data Fusion
and Fuzzy Matching
Principal Component
Analysis
Graph Theory
and Stable Marriage
Generalized Linear Models
Linear Regression
Regularization
Logistic Regression
Decision Trees
Text Mining
Naive Bayes
Random Forests
Clustering
KNN
K-Means
Data Aggregation Project Data Science Project Career Counseling
Track2
Track 1: Big Data
Week 1
Introductions
• Motivation for Big Data
• Unix for Data Science
• Pushing and Pulling data from remote servers
• Columnar Compressions
• Extended Data Dictionary
Monday - 6:30 PM Wednesday - 6:30 PM
Pulling and Processing Data
• SQL overview
• SQL design patterns for data analytics
o Pivot Tables
o Aggregation
o Network Analysis
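As a rough illustration of the aggregation and pivot-table patterns listed above, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and the sample rows are hypothetical and are not part of the course material.

```python
import sqlite3

# Hypothetical in-memory table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "Q1", 100.0), ("east", "Q2", 150.0),
        ("west", "Q1", 80.0), ("west", "Q2", 120.0), ("west", "Q2", 30.0),
    ],
)

# Aggregation: one summary row per region.
totals = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()

# Pivot: turn quarter values into columns with conditional sums.
pivot = conn.execute(
    """
    SELECT region,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()
```

The conditional-sum form works in most SQL dialects, which is why it is used here instead of a vendor-specific PIVOT clause.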
Unix Assignments
• Process data in parallel
• Working with remote Machines
SQL Assignments
Big Data Training
Master the basics
• Five key design patterns
• Joins, Aggregation, Temp Tables,
Indexes, Functions
Cluster Setup
• Introduction to Big Data Ecosystem
• Acquire 5 machines in AWS
• Prepare machines for Hadoop
• Set up a 5–10 node cluster
• Say Hello to Hadoop
Monday - 6:30 PM Wednesday - 6:30 PM
Introduction to Hadoop
• Motivation for Hadoop
• HDFS
• ETL in Hadoop with large dataset
• SQOOP
• OOZIE
• Hadoop Streaming
Cluster Setup Assignment
• Set up a cluster in the cloud
• Develop automation scripts
ETL In Hadoop
Big Data Training
Spin up the cluster
• N Gram data in Hadoop
• Develop ETL jobs in cluster
Week 2
Hive
• Motivation for hive
• Hive architecture
• Aggregation and data selection
• Hive and Python Integration
Monday - 6:30 PM Wednesday - 6:30 PM
Advanced Hive
• Hive Jobs and Variables
• Custom Functions
• Custom data types
• Indexing and Performance issues
Hive Assignment
• Data aggregation
Hive Assignment 2
Big Data Training
Wrangle millions of records in Hadoop
• N Gram data in Hadoop
• Develop ETL jobs in cluster
Week 3
Hadoop Map Reduce
• Motivation for Map Reduce
• Map Reduce in action
• Map Reduce API
• Splitter and Combiners
• Custom data format
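The map, combine, and reduce stages named above can be sketched in plain Python. This is an in-memory simulation for intuition only, not the Hadoop MapReduce API; the function names and the word-count task are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for each word in a line."""
    return [(word.lower(), 1) for word in line.split()]

def combine(pairs):
    """Combiner: pre-aggregate one mapper's output before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def reduce_phase(grouped):
    """Reducer: sum the partial counts collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(splits):
    """Run map -> combine -> shuffle -> reduce over a list of input splits."""
    shuffled = defaultdict(list)
    for split in splits:  # each split stands in for one mapper's input
        pairs = [pair for line in split for pair in map_phase(line)]
        for key, value in combine(pairs):
            shuffled[key].append(value)  # shuffle: group values by key
    return reduce_phase(shuffled)

counts = word_count([["big data big"], ["data science"]])
```

The combiner reduces how many pairs cross the shuffle step: the first split sends ("big", 2) once instead of ("big", 1) twice.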
Monday - 6:30 PM Wednesday - 6:30 PM
Advanced Map Reduce
• Distributed Joins
• Data Compression in Map Reduce
• Optimizations
• Debugging and Tracing
M/R Assignment
• Data aggregation
• Extended Data Dictionaries
M/R Assignment 2
Big Data Training
Hadoop under the hood with Map Reduce
• N Gram data in Hadoop
• Develop ETL jobs in cluster
Week 4
Track 2: Machine Learning
Week 1
Introduction to ML & Unix
• Motivation for Machine Learning (ML)
• Geometric, Probabilistic and Logical Models
• Standardized ML Model lifecycle
• Unix for Data Science
• Pushing and Pulling data from remote servers
• Extended Data Dictionary
Tuesday - 6:30 PM Thursday - 6:30 PM
Python for Data Science
• Thinking in Python
• Python design patterns for data analytics
• Pandas
• Data Frames
• Aggregations
• Scripting in Python
1. Unix Assignments
• Data Processing in Unix
• Data Processing in parallel
• Working with remote machines
2. SQL & Python Assignments
Machine Learning
• Data Processing in Python
• Data Processing in SQL
Tuesday - 6:30 PM Thursday - 6:30 PM
3. Titanic Survivors
• Who is most likely to survive the Titanic disaster?
4. Classify recipes by ingredients
Machine Learning
• Analyzing ingredients to identify the origin of cuisine
• Data munging
Week 2
Decision Trees
• Motivation for Decision Trees
• ID3, C4.5 and CART
• Entropy, Information Gain
• Pruning and Purging
• Trees in Action
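Entropy and information gain, listed above, are the quantities that ID3-style trees use to choose a split. A minimal sketch follows; the toy labels are hypothetical and the helper names are chosen here for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of the split `groups`."""
    total = len(labels)
    weighted = sum(len(group) / total * entropy(group) for group in groups)
    return entropy(labels) - weighted

# Hypothetical toy data: survival labels before and after a candidate split.
parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves each group as mixed as the parent gains nothing.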
Text Mining / Naïve Bayes
• Motivation for Text Mining
• Working with unstructured datasets
• Tokenization and Standardization of text
• Naïve Bayes
• Applications and Results
Recommendation Engine
• Motivation for Recommendation Engines
• Sparse matrix operations
• Manhattan Distance, Euclidean Distance, Cosine Distance
• Similarity Matrices and results
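The three distance measures named above can be written in a few lines each. This is a minimal sketch for dense, equal-length numeric vectors; the function names are chosen here for illustration, and a real recommendation engine would typically work on sparse matrices instead.

```python
import math

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus cosine similarity; raises ZeroDivisionError for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

d_manhattan = manhattan([0, 0], [3, 4])
d_euclidean = euclidean([0, 0], [3, 4])
d_orthogonal = cosine_distance([1, 0], [0, 1])
d_parallel = cosine_distance([1, 2], [2, 4])
```

Cosine distance ignores vector length, so two users with the same rating pattern at different scales come out as near neighbours, which Manhattan and Euclidean distance would not show.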
Tuesday - 6:30 PM Thursday - 6:30 PM
5. Predict Customer Churn
• Data Munging
• Telecom customer churn model development
• Validate the model
6. Collaborative Filter
Machine Learning
• Identify similar analytical topics based on the Hacker News feed
Week 3
Random Forest
• Motivation for Random Forest
• Vote by democracy / Variable Importance
• Random Forest in Action
• Industry Use Cases
Tuesday - 6:30 PM Thursday - 6:30 PM
Principal Component Analysis
• Motivation for Principal Component Analysis
• Curse of dimensionality
• Best Practices for dimensionality reduction
• Use cases and applications
7. GBM Assignment
• Data Munging
• Telecom customer churn model development
• Compare Random Forest and GBM
8. Image processing
Machine Learning
• Reduce dimensions in image data
• Classify images in categories
Week 4
Gradient Boosting Machines (GBM)
• Motivation for GBM
• Boosting vs. Bagging
• Residual error and tree generations
• Metrics Search for best GBM Trees
• GBM in action
• Industry Use cases
Tuesday - 6:30 PM Thursday - 6:30 PM
9. Regression models
• Predict housing pricing
• Data munging
10. Clustering
Machine Learning
• Clustering around flower types
Week 5
Generalized Linear Models
• Linear Regression
• Regularization (Ridge, Lasso)
• Logistic Regression
• Generalized Linear Models
• Feature Selections
• Industry Use Case
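To make the link between linear regression and ridge regularization concrete, here is a minimal single-feature sketch. It assumes the intercept is not penalized, in which case the ridge slope has the closed form Sxy / (Sxx + lambda); the function name and data are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam=0.0):
    """Fit y = intercept + slope * x with an L2 penalty `lam` on the slope only.

    With lam = 0 this is ordinary least squares.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # the penalty shrinks the slope toward zero
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x.
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
ols = fit_ridge_1d(xs, ys)
ridge = fit_ridge_1d(xs, ys, lam=2.0)
```

Raising `lam` pulls the slope toward zero and moves the intercept toward the mean of `ys`, which is the bias-for-variance trade that regularization makes.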
Clustering: KNN & K-Means
• Motivation for unsupervised learning methods
• Intuition behind KNN and Applications
• Intuition behind K-Means and Applications
• Multi class classification
• Hierarchical Clustering
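The intuition behind K-Means is an alternation of two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. Below is a minimal sketch that assumes the starting centroids are supplied by the caller; the function name and points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on tuples of numbers, with caller-supplied start centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
```

The result depends on the starting centroids, so practical implementations usually run several random initializations and keep the best one.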
Graph Theory and Stable Marriage
• Master key graph theory metrics
• Bi-partite graphs
• Visualizing graph with Gephi
• Motivations for matching algorithms with preferences
• Preferences with both parties
• Incomplete List and Ties
• Industry Use cases
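Matching with preferences on both sides is classically solved with the Gale-Shapley algorithm. The sketch below assumes complete, strictly ordered preference lists and equally sized groups; it does not handle the incomplete lists and ties mentioned above. The names are hypothetical.

```python
def stable_match(proposer_prefs, receiver_prefs):
    """Gale-Shapley: return {receiver: proposer} for complete, strict preferences."""
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to propose to
    engaged = {}  # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p  # receiver was unmatched, accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p  # receiver trades up, the old partner is free again
            free.append(current)
        else:
            free.append(p)  # rejected, p will try the next choice
    return engaged

# Hypothetical preferences: both proposers want X first, and X prefers B.
proposers = {"A": ["X", "Y"], "B": ["X", "Y"]}
receivers = {"X": ["B", "A"], "Y": ["A", "B"]}
matching = stable_match(proposers, receivers)
```

The matching is stable because no proposer and receiver both prefer each other to their assigned partners: A prefers X, but X prefers its partner B.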
Tuesday - 6:30 PM Thursday - 6:30 PM
11. Data Fusion
• Fuzzy matching on names and addresses
• Data Munging
12. Graph Theory, Stable Marriage
Machine Learning
• Determine stable pairs between two groups based on preferences
Week 6
Data Fusion and Fuzzy Matching
• Merging data sets from multiple sources
• Probabilistic and Deterministic Matching
• String Fuzzy Matching
o Edit Distance, Jaro-Winkler Distance
• Fuzzy Address Matching
• Swap-in / Swap-out analysis
• Industry Use Cases
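Edit distance, one of the string measures named above, counts the single-character insertions, deletions, and substitutions needed to turn one string into another. Here is a minimal sketch of the Levenshtein variant using dynamic programming; the `similarity` helper and the sample strings are hypothetical additions for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Scale edit distance to a 0..1 score, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

d_classic = levenshtein("kitten", "sitting")
d_typo = levenshtein("Broadway", "Brodway")
score = similarity("Broadway", "Brodway")
```

In a fuzzy-matching job, a threshold on a score like this decides whether two name or address strings are treated as the same record.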
Contact Us
Enroll@bitbootcamp.com 917-819-0106 www.bitbootcamp.com
25 Broadway, Suite 1032
New York, NY
Made in NYC

BitBootCamp Evening Classes

  • 1.
  • 2. Algorithms The Brains – Introduction to Data Science – Data Munging & Fusion – Text Mining • Naïve Bayes – Recommendation Engines – Principal Component Analysis – Classification • Decision Trees • Random Forest • Gradient Boosting Machines – Generalized Linear Models – Clustering • KNN • K-Means – Graph Theory – Stable Marriage Hadoop Big Data Core Engineering Our Training Offerings Skills you need
  • 3. Training Overview Evening Classes Track 1: Big Data Track 2: Machine Learning
  • 4. Big Data Training 4 week intensive big data Evening Classes Week 1 Week 2 Week 3 Week 4 Self Study Certifications Complete the industry standard Hadoop certification For Data Science Track 1
  • 5. Machine Learning Training 6 Week Data Science Evening Classes Week 1 Week 2 Week 3 Week 4 Week 5 Week 6 Introduction to Machine Learning Recommendation Engines Collaborative Filtering Gradient Boosting Machines For Data Science Data Fusion and Fuzzy Matching Principal Component Analysis Graph Theory and Stable Marriage Generalized Linear Models Linear Regression Regularization Logistic Regression Decision Trees Text Mining Naive Bayes Random Forests Clustering KNN K-Means Data Aggregation Project Data Science Project Career Counseling Track 2
  • 6. Track 1: Big Data
  • 7. Big Data Training 4 week intensive big data training Week 1 Week 2 Week 3 Week 4 Self Study Certifications Complete the industry standard Hadoop certification For Data Science Track 1
  • 8. Week 1 Introductions • Motivation for Big Data • Unix for Data Science • Pushing and Pulling data from remote servers • Columnar Compressions • Extended Data Dictionary Monday - 6:30 PM Wednesday - 6:30 PM Pulling and Processing Data • SQL overview • SQL design patterns for data analytics o Pivot Tables o Aggregation o Network Analysis Unix Assignments • Process data in parallel • Working with remote Machines SQL Assignments Big Data Training Master the basics • Five key design patterns • Joins, Aggregation, Temp Tables, Indexes, Functions
  • 9. Cluster Setup • Introduction to Big Data Ecosystem • Acquire 5 machines in AWS • Prepare machines for Hadoop • Setup 5 – 10 Node Cluster • Say Hello to Hadoop Monday - 6:30 PM Wednesday - 6:30 PM Introduction Hadoop • Motivation for Hadoop • HDFS • ETL in Hadoop with large dataset • SQOOP • OOZIE • Hadoop Streaming Cluster Setup Assignment • Setup Cluster in cloud • Develop automation scripts ETL In Hadoop Big Data Training Spin up the cluster • N Gram data in Hadoop • Develop ETL jobs in cluster Week 2
  • 10. Hive • Motivation for hive • Hive architecture • Aggregation and data selection • Hive and Python Integration Monday - 6:30 PM Wednesday - 6:30 PM Advanced Hive • Hive Jobs and Variables • Custom Functions • Custom data types • Indexing and Performance issues Hive Assignment • Data aggregation Hive Assignment 2 Big Data Training Wrangle millions of records in Hadoop • N Gram data in Hadoop • Develop ETL jobs in cluster Week 3
  • 11. Hadoop Map Reduce • Motivation for Map Reduce • Map Reduce in action • Map Reduce API • Splitter and Combiners • Custom data format Monday - 6:30 PM Wednesday - 6:30 PM Advanced Map Reduce • Distributed Joins • Data Compression in Map Reduce • Optimizations • Debugging and Tracing M/R Assignment • Data aggregation • Extended Data Dictionaries M/R Assignment 2 Big Data Training Hadoop under the hood with Map Reduce • N Gram data in Hadoop • Develop ETL jobs in cluster Week 4
  • 12. Track 2: Machine Learning
  • 13. Machine Learning Training 6 Week Data Science Evening Classes Week 1 Week 2 Week 3 Week 4 Week 5 Week 6 Introduction to Machine Learning Recommendation Engines Collaborative Filtering Gradient Boosting Machines For Data Science Data Fusion and Fuzzy Matching Principal Component Analysis Graph Theory and Stable Marriage Generalized Linear Models Linear Regression Regularization Logistic Regression Decision Trees Text Mining Naive Bayes Random Forests Clustering KNN K-Means Data Aggregation Project Data Science Project Career Counseling Track 2
  • 14. Week 1 Introduction to ML & Unix • Motivation for Machine Learning (ML) • Geometric, Probabilistic and Logical Models • Standardized ML Model lifecycle • Unix for Data Science • Pushing and Pulling data from remote servers • Extended Data Dictionary Tuesday - 6:30 PM Thursday - 6:30 PM Python for Data Science • Thinking in Python • Python design patterns for data analytics • Pandas • Data Frames • Aggregations • Scripting in Python 1. Unix Assignments • Data Processing in Unix • Data Processing in parallel • Working with remote machines 2. SQL & Python Assignments Machine Learning • Data Processing in Python • Data Processing in SQL
  • 15. Tuesday - 6:30 PM Thursday - 6:30 PM 3. Titanic Survivors • Who is most likely to survive the Titanic disaster? 4. Classify recipes by ingredients Machine Learning • Analyzing ingredients to identify the origin of cuisine • Data munging Week 2 Decision Trees • Motivation for Decision Trees • ID3, C4.5 and CART • Entropy, Information Gain • Pruning and Purging • Trees in Action Text Mining / Naïve Bayes • Motivation for Text Mining • Working with unstructured datasets • Tokenization and Standardization of text • Naïve Bayes • Applications and Results
  • 16. Recommendation Engine • Motivation for Recommendation Engines • Sparse Matrix operations • Manhattan Distance, Euclidean Distance, Cosine Distance • Similarity Matrices and results Tuesday - 6:30 PM Thursday - 6:30 PM 5. Predict Customer Churn • Data Munging • Telecom customer churn model development • Validate the model 6. Collaborative Filter Machine Learning • Identify similar analytical topics based on hacker news feed Week 3 Random Forest • Motivation for Random Forest • Vote by democracy / Variable Importance • Random Forest in Action • Industry Use Cases
  • 17. Tuesday - 6:30 PM Thursday - 6:30 PM Principal Component Analysis • Motivation for Principal Component Analysis • Curse of dimensionality • Best Practices for dimensionality reduction • Use cases and applications 7. GBM Assignment • Data Munging • Telecom customer churn model development • Compare Random Forest and GBM 8. Image processing Machine Learning • Reduce dimensions in image data • Classify images in categories Week 4 Gradient Boosting Machines (GBM) • Motivation for GBM • Boosting vs. Bagging • Residual error and tree generations • Metrics Search for best GBM Trees • GBM in action • Industry Use cases
  • 18. Tuesday - 6:30 PM Thursday - 6:30 PM 9. Regression models • Predict housing pricing • Data munging 10. Clustering Machine Learning • Clustering around flower types Week 5 Generalized Linear Models • Linear Regression • Regularization (Ridge, Lasso) • Logistic Regression • Generalized Linear Models • Feature Selections • Industry Use Case Clustering: KNN & K-Means • Motivation for Unsupervised learning methods • Intuition behind KNN and Applications • Intuition behind K-Means and Applications • Multi class classification • Hierarchical Clustering
  • 19. Graph Theory and Stable Marriage • Master key graph theory metrics • Bi-partite graphs • Visualizing graph with Gephi • Motivations for matching algorithms with preferences • Preferences with both parties • Incomplete List and Ties • Industry Use cases Tuesday - 6:30 PM Thursday - 6:30 PM 11. Data Fusion • Fuzzy matching on Names and Address • Data Munging 12. Graph Theory, Stable Marriage Machine Learning • Determine stable pairs between two groups based on preferences Week 6 Data Fusion and Fuzzy Matching • Merging data sets from multiple sources • Probabilistic and Deterministic Matching • String Fuzzy Matching - Edit Distances, Jaro Winkler Distance • Fuzzy Address Matching • Swap-in / Swap-out analysis • Industry Use Cases
  • 20. Contact Us Enroll@bitbootcamp.com 917-819-0106 www.bitbootcamp.com 25 Broadway Suite 1032 New York, NY Made in NYC
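
The Week 6 assignment asks students to "determine stable pairs between two groups based on preferences". The deck does not say which algorithm the course uses; the sketch below assumes the classic Gale-Shapley proposal procedure, with complete preference lists, equal group sizes and no ties. The function name `stable_match` and the sample names are hypothetical, chosen only for illustration.

```python
def stable_match(proposer_prefs, receiver_prefs):
    """Gale-Shapley sketch (assumed algorithm, not confirmed by the deck).

    Both arguments map each member to a complete, tie-free list of
    members of the other group, most preferred first.
    Returns a dict mapping each proposer to the receiver they end up with.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    engaged = {}                                  # receiver -> current proposer
    free = list(proposer_prefs)                   # proposers without a partner

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p                # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)                # rejected: p tries the next choice later

    return {p: r for r, p in engaged.items()}


# Hypothetical example data.
proposers = {"a": ["x", "y"], "b": ["x", "y"]}
receivers = {"x": ["b", "a"], "y": ["a", "b"]}
matching = stable_match(proposers, receivers)
```

Here both proposers want `x` first, `x` prefers `b`, so `a` falls back to `y`. The "Incomplete List and Ties" variants listed on the slide need extra handling that this sketch leaves out.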