This document outlines a 20-module, 50-hour course from zekeLabs for becoming a data scientist. The course covers numerical computation with NumPy, essential statistics, and machine learning algorithms such as linear regression, logistic regression, naive Bayes, trees, and ensemble methods. It also covers model evaluation, feature engineering, deployment, and scaling. The document lists the topics covered in each module and gives contact information for the course.
Master guide to become a data scientist
“Goal - Become a Data Scientist”
info@zekeLabs.com | www.zekeLabs.com | +91 8095465880
“A Dream becomes a Goal when action is taken towards its achievement” - Bo Bennett
2. Essential Statistics & Maths - 5 hrs
● Relationships - Deterministic vs Statistical
● Statistics - Descriptive vs Inferential
● Sampling
● Variables
● Distribution
● Summarizing Distribution
● Correlation, Collinearity, Causation
● Probability
● Normal Distribution
● Confidence Interval
● Hypothesis Testing
● Calculus
● Linear Algebra
● Matrix Ops
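As a small illustration of the Confidence Interval topic above, here is a minimal sketch using only the Python standard library. The function name `mean_confidence_interval` and the sample values are made up for illustration. It uses the normal (z) approximation rather than a t-distribution, which is an assumption that gets rough for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) bounds for the sample mean.

    Assumption: normal approximation (z-score), not Student's t.
    """
    n = len(sample)
    centre = mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * standard_error, centre + z * standard_error


# Hypothetical measurements, for illustration only.
measurements = [10.0, 12.0, 9.0, 11.0, 13.0, 10.0, 12.0, 11.0]
low, high = mean_confidence_interval(measurements)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

A higher confidence level widens the interval, since the critical value `z` grows.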
3. Pandas & scipy for Data Wrangling & Statistics - 5 hrs
● Series vs DataFrames
● Loading CSV, JSON, DB etc.
● Access & Filters
● DataFrame
● Exploratory Data Analysis
● Finding & Handling Missing Data
● Duplicate Handling
● Rolling averages
● Applying functions
● Handling Time Series Data
● Merging & Grouping Data
● Pivot Table & Crosstab
● Random data using scipy
● Comparing datasets using scipy
● Analyzing sample using scipy
● Kernel Density Estimation using scipy
4. Data Visualization - 4 hrs
● Understanding matplotlib
● Plotting Quantitative data
● Plotting Qualitative data
● Histograms
● Frequency Polygons
● Box-Plots
● Bar charts
● Line Graphs
● Scatter Plots
● 3D Plots
● Exploring seaborn & Bokeh
● Introduction to Tableau
● Plotting scatter plot
● Bubble chart
● Bullet chart
● Gantt chart
8. Feature Selection - 2 hrs
● SelectKBest for Regression
● SelectKBest for Classification
● Variance Threshold
● Drop Highly correlated features
● Dropping based on non null values
● SelectFromModel
● Feature Selection using RandomForest
● Based on correlation with target
● Univariate Feature Selection
● Recursive Feature Elimination
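The Variance Threshold idea listed above can be sketched without any library: drop every column whose variance does not exceed a cutoff. This is a pure-Python illustration, not the scikit-learn implementation. The function name `variance_threshold` and the toy rows are assumptions made for the example.

```python
from statistics import pvariance


def variance_threshold(rows, threshold=0.0):
    """Keep only columns whose population variance exceeds `threshold`.

    Returns the kept column indices and the reduced rows.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return keep, reduced


# Toy data: the first column is constant, so it carries no information.
rows = [
    [1, 0, 3],
    [1, 1, 4],
    [1, 0, 5],
]
kept, reduced = variance_threshold(rows)
print(kept)
print(reduced)
```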
9. Model Evaluation - 1 hr
● Why do we need to evaluate at all?
● Metrics for Classification
● Metrics for Regression
● Clustering metrics
● Probability Calibration
● Pairwise metrics
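To make the classification metrics above concrete, here is a minimal sketch that computes precision, recall, and F1 from true and predicted labels. The function name `precision_recall_f1` and the label lists are hypothetical. A real project would more likely use a library's metrics module.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
print(precision_recall_f1(y_true, y_pred))
```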
10. Model Selection - 1 hr
● Motivation
● KFold
● StratifiedKFold
● Splitting training testing data
● Cross Validate
● GridSearchCV
● RandomizedSearchCV
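The KFold topic above comes down to splitting sample indices into k test folds and training on the rest. The sketch below shows that idea in plain Python. It is an illustration under stated assumptions (contiguous folds, no shuffling, a made-up helper name `k_fold_indices`), not a library implementation.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    When n_samples is not divisible by k, the first folds get one extra sample.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train, test in k_fold_indices(n_samples=10, k=3):
    print("train:", train, "test:", test)
```

Each sample lands in exactly one test fold, so every observation is used for validation once.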
11. Linear Regression - 3 hrs
● Understanding Ordinary Least Squares
● Cost Function
● Bias & Variance
● Coefficients & Intercept
● Simple Linear Regression
● Polynomial Linear Regression
● Ridge
● Lasso
● Elastic Net
● Stochastic Gradient Descent
● Robust Regression
● Problem - Insurance Payout Prediction
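For the Ordinary Least Squares and Simple Linear Regression topics above, the closed-form fit for one feature is short enough to write by hand. The sketch below is an illustration only. The function name `fit_simple_ols` and the data points are assumptions, and it handles a single feature with no regularization.

```python
def fit_simple_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_ols(xs, ys)
print(slope, intercept)
```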
12. Logistic Regression - 2 hrs
● Basics of Logistic Regression
● Sigmoid
● Cost Function
● Understanding important hyperparameters
● Predicting linear separator
● Predicting nonlinear decision boundary
● Handling Imbalanced classes
● Project - Predicting if income is less than 50K or more
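The Sigmoid and Cost Function topics above can be illustrated in a few lines. This is a minimal sketch, assuming the usual binary cross-entropy (log loss) as the cost. The function names and the example values are made up for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative z
    return e / (1.0 + e)


def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary cross-entropy between labels (0/1) and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical linear scores turned into probabilities.
scores = [-2.0, 0.0, 3.0]
probabilities = [sigmoid(s) for s in scores]
print(probabilities)
print(log_loss([0, 1, 1], probabilities))
```

Confident wrong predictions are penalized heavily, which is why the probabilities are clipped before taking the logarithm.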
13. Naive Bayes - 2 hrs
● Bayes Theorem
● Gaussian Naive Bayes
● Multinomial Naive Bayes
● Bernoulli Naive Bayes
● Out-of-core Naive Bayes using partial_fit
● Limitations of Naive Bayes
● Choosing right
● Problem - Mail data classification
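Bayes Theorem, the first topic above, can be shown with a one-word spam example. The probabilities below are invented for illustration, and the helper name `posterior_spam` is an assumption. This sketch uses a single feature, whereas a Naive Bayes classifier multiplies such likelihoods across many features.

```python
def posterior_spam(p_spam, p_word_given_spam, p_word_given_ham):
    """Return P(spam | word) using Bayes' theorem."""
    p_ham = 1.0 - p_spam
    # Total probability of seeing the word in any mail.
    evidence = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / evidence


# Invented numbers: 20% of mail is spam; the word appears in
# 60% of spam messages and 5% of non-spam messages.
print(posterior_spam(p_spam=0.2, p_word_given_spam=0.6, p_word_given_ham=0.05))
```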
14. Trees - 2 hrs
● Understanding Information Theory
● Entropy
● Decision Tree creation
● Tree for Classification
● Tree for Regression
● Advantages of Decision Tree
● Important Hyper-parameters
● Limitations of Decision Tree
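The Entropy topic above is what a decision tree can use to score candidate splits. Here is a minimal sketch of Shannon entropy and the information gain of a binary split. The function names and the toy labels are assumptions for illustration. Some tree implementations use Gini impurity instead.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Toy labels: a perfect split separates the two classes completely.
parent = [0, 0, 1, 1]
print(entropy(parent))
print(information_gain(parent, [0, 0], [1, 1]))
```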
15. Ensemble Methods - 3 hrs
● Bagging vs Boosting
● Forests
● AdaBoost
● XGBoost
● Gradient Tree Boosting
● Voting Classifier
● Role weak estimators play
● Problem - Attack detection on network data
17. Support Vector Machine - 3 hrs
● Understanding SVM
● Classification
● Regression
● OneClassSVM
● Imbalanced Classes
● Kernel Functions
● Understanding Maths behind it
● Problem - Face recognition
17b. Novelty & Outlier Detection - 1 hr
● Novelty vs Outlier
● OneClassSVM
● Fitting data with Elliptic Envelope
● Isolation Forest
● Local Outlier Factor
● When to use what
19. Deployment & Scaling - 3 hrs
● Bottom-Up approach for dealing with large data
● Extracting features using Hashing Techniques
● Incremental learning
● Serializing data for quicker access
● Running as a Python .egg or wheel
● Model behind REST server
● Persisting & Loading model
● Deploying model behind web application
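The feature hashing topic above maps tokens to a fixed-size vector without storing a vocabulary, which helps with large data. The sketch below is a simplified illustration, not a library's hasher. The function name `hash_features`, the bucket count, and the use of MD5 as the hash are all assumptions chosen to keep the example deterministic and self-contained.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map tokens to a fixed-length count vector using the hashing trick.

    Different tokens may collide in the same bucket; that is the price of
    not keeping a vocabulary in memory.
    """
    vector = [0] * n_buckets
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # Use the first 4 bytes of the digest as an integer bucket index.
        index = int.from_bytes(digest[:4], "big") % n_buckets
        vector[index] += 1
    return vector


# Hypothetical tokenized review text.
tokens = "great food great price".split()
print(hash_features(tokens))
```

Because the vector length is fixed in advance, new tokens seen later never change the feature dimension, which fits well with incremental learning.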
20. Use Cases
● Credit Risk - Predicting Defaulters
● Amazon Food Review Sentiment
● Predicting Employee Attrition
● Identifying characters in an unknown language
● Predicting insurance payout amount
● Text Categorization
● Churn Prediction
● Attack Prediction on network data
● Identifying faces
● Predict patient stay in hospital
Visit www.zekeLabs.com for more details.
Let us know how we can help your organization upskill its employees to stay updated in the ever-evolving IT industry.
www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com