This Edureka Data Science Training will help you understand what Data Science is and introduce you to its different components and concepts. The tutorial is ideal for both beginners and professionals who want to learn or brush up on Data Science concepts. Below are the topics covered in this tutorial:

1. What is Data Science?

2. Job Roles in Data Science

3. Components of Data Science

4. Concepts of Statistics

5. Power of Data Visualization

6. Introduction to Machine Learning using R

7. Supervised & Unsupervised Learning

8. Classification, Clustering & Recommenders

9. Text Mining & Time Series

10. Deep Learning

For structured training in Data Science, see the complete details of our Data Science Certification Training course here: https://goo.gl/OCfxP2

- 1. www.edureka.co/data-science EDUREKA DATA SCIENCE CERTIFICATION TRAINING
- 2. What to expect? What is Data Science?; Job Roles in Data Science; Components of Data Science; Concepts of Statistics; Power of Data Visualization; Introduction to Machine Learning using R; Supervised & Unsupervised Learning; Classification, Clustering & Recommenders; Text Mining & Time Series; Deep Learning
- 3. What is Data Science?
- 4. What is Data Science? Data Science involves using automated methods to analyze massive amounts of data and to extract knowledge from them. By combining aspects of statistics, computer science, applied mathematics and visualization, data science can turn the vast amounts of data the digital age generates into new insights and new knowledge. Figure: Data Science Components
- 5. Job Roles of Data Science
- 6. Job Roles of Data Science: Data Scientist, Data Analyst, Data Architect, Statistician, Data Engineer, Database Administrator, Business Analyst, Data & Analyst Manager
- 7. Job Roles of Data Science: Data Scientist. Role: Cleans and organizes big data. Works on distributed computing and predictive modeling. Languages: R, SAS, Python, Matlab, SQL, Hive, Pig and Spark
- 8. Job Roles of Data Science: Data Analyst. Role: Collects, processes and performs statistical data analyses. Languages: R, Python, HTML, JS, C, C++ and SQL
- 9. Job Roles of Data Science: Data Architect. Role: Creates blueprints for data management systems to integrate, centralize, protect and maintain data sources. Languages: SQL, XML, Hive, Pig and Spark
- 10. Job Roles of Data Science: Data Engineer. Role: Develops, constructs, tests and maintains architectures such as databases and large-scale processing systems. Languages: SQL, Hive, Pig, R, Matlab, SAS, Python, Java, Ruby, C++ and Perl
- 11. Job Roles of Data Science: Statistician. Role: Collects, analyses and interprets qualitative and quantitative data with statistical theories and methods. Languages: R, SAS, SPSS, Matlab, Tableau, Stata, Python, Perl, Hive, Spark and SQL
- 12. Job Roles of Data Science: Database Administrator. Role: Ensures that the database is available to all relevant users, is performing properly and is being kept safe. Languages: SQL, Java, Ruby on Rails, XML, C# and Python
- 13. Job Roles of Data Science: Business Analyst. Role: Improves business processes as an intermediary between business and IT. Languages: SQL, C, Excel, Tableau, Power BI and Python
- 14. Job Roles of Data Science: Data & Analyst Manager. Role: Manages a team of analysts and data scientists. Languages: SQL, R, SAS, Python, Matlab and Java
- 15. Components of Data Science
- 16. Components of Data Science: Data Science has the following components: Statistics, Visualization, Machine Learning and Deep Learning.
- 17. Concepts of Statistics
- 18. Concepts of Statistics: Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation and organization of data. Statistics began in ancient civilizations, going back at least to the 5th century BC, but it was not until the 18th century that it started to draw more heavily on calculus and probability theory. Figure: Concepts of Statistics (Collection, Analysis, Interpretation and Presentation of data; Visual Representation, Predictive Analysis)
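As a rough illustration of the collection, analysis and presentation steps named on this slide, here is a minimal Python sketch using only the standard library. The `visits` sample is made-up data and the choice of summary statistics is an assumption for illustration, not something taken from the deck.

```python
import statistics

# Collection: a hypothetical sample of daily website visits (made-up numbers).
visits = [120, 135, 128, 150, 142, 90, 85]

# Analysis: summarize the sample with a few descriptive statistics.
summary = {
    "mean": statistics.mean(visits),
    "median": statistics.median(visits),
    "stdev": statistics.stdev(visits),  # sample standard deviation
    "min": min(visits),
    "max": max(visits),
}

# Presentation: print a small, aligned report.
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

Interpretation is the step the code cannot do for you: for example, deciding whether the two lowest values reflect weekends or a data collection problem.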
- 19. Power of Visualization
- 20. Scope of Visual Analytics
- 21. Data Visualization: Integrate Different Data Sets, Analyze, Visualize
- 22. Introduction to Machine Learning using R
- 23. Machine Learning using R: Machine Learning explores the study and construction of algorithms that can learn from and make predictions on data. It is closely related to computational statistics and is used to devise complex models and algorithms that lend themselves to prediction, which in commercial use is known as predictive analytics. Applications: Speech Recognition, Face Recognition, Anti-Virus, Weather Prediction
- 24. Supervised & Unsupervised Learning
- 25. Supervised & Unsupervised Learning. Supervised Learning: Supervised learning is the machine learning task of inferring a function from labelled training data. The training data consists of a set of training examples. E.g. if you build a fruit classifier, the labels will be “this is an orange”, “this is an apple” and “this is a banana”, based on showing the classifier examples of apples, oranges and bananas. Algorithms: SVM, Regression, Naive Bayes, Decision Trees, K-nearest Neighbour Algorithm & Neural Networks. Unsupervised Learning: Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labelled responses. E.g. in the same example, a fruit clustering will categorize the fruits as “fruits with soft skin and lots of dimples”, “fruits with shiny hard skin” and “elongated yellow fruits”. Algorithms: Clustering, Anomaly Detection, Neural Networks and Latent Variable Models
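The fruit classifier described on this slide can be sketched with the K-nearest Neighbour algorithm it lists. This is a minimal illustration in plain Python, not the deck's own code: the feature choice (weight in grams, length in centimetres), the sample values and the function name `knn_predict` are all assumptions.

```python
import math
from collections import Counter

# Hypothetical labelled training examples: (weight in g, length in cm) -> fruit.
training = [
    ((150, 7.0), "apple"),
    ((160, 7.5), "apple"),
    ((155, 7.2), "apple"),
    ((200, 8.0), "orange"),
    ((210, 8.5), "orange"),
    ((205, 8.2), "orange"),
    ((120, 18.0), "banana"),
    ((125, 19.0), "banana"),
    ((118, 17.5), "banana"),
]


def knn_predict(point, examples, k=3):
    """Return the majority label among the k training examples nearest to point."""
    # Sort the labelled examples by Euclidean distance to the query point.
    nearest = sorted(examples, key=lambda item: math.dist(point, item[0]))[:k]
    # Let the k nearest neighbours vote on the label.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((122, 18.5), training))  # a light, long fruit
print(knn_predict((207, 8.1), training))   # a heavy, round fruit
```

The labels are what make this supervised: the function only echoes categories it was shown. A real classifier would also scale the features so that weight does not dominate the distance.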
- 26. Reinforcement Learning: Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. It differs from standard supervised learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected. Applications: Robots used in Manufacturing, Advertising, Inventory Management, Player vs AI Games.
- 27. Classifiers
- 28. Introduction to Classification: Classification is the problem of identifying to which of a set of categories a new observation belongs. Classification belongs to supervised learning. It is based on a training set of data containing observations whose category membership is known. Figure: Examples of Classification
- 29. Figure: Classification Algorithms (Linear, Non Linear, Quadratic, Perceptron, SVM, Logistic Regression, Naive Bayes, Decision Trees, Kernel Estimation, Neural Networks, Radial Basis Function (RBF), Recurrent Neural Network (RNN), Modular Neural Network)
- 30. Classification Example: Let us look at how a classification algorithm works. Here is an example of Linear Regression using the alternating least squares method.
- 31. Clustering
- 32. Clustering: Clustering is the problem of categorizing objects into different groups without any prior information about labels or classes. Clustering belongs to unsupervised learning.
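To make "grouping without labels" concrete, here is a minimal k-means sketch in plain Python. The deck does not name a specific clustering algorithm, so k-means is an assumption, as are the sample points, the function name `kmeans` and the simple choice of the first k points as initial centroids.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    # Assumption: use the first k points as initial centroids (simple, deterministic).
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical unlabelled 2-D observations forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The cluster numbers carry no meaning of their own. Unlike a classifier, the algorithm never learns what the groups are called, only which points belong together.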
- 33. Recommender Systems
- 34. Recommender Systems: A recommender system is a subclass of information filtering system that seeks to predict the "rating" or "preference" that a user would give to an item. Recommendations are everywhere, from Netflix & BookMyShow movies to YouTube videos, Amazon products to Goibibo hotels, Xbox games to Zomato restaurants. Figure: Companies using Recommendation Systems
- 35. Recommender Systems - Example: Recommendation systems work in two ways: 1. Collaborative Filtering: Collaborative filtering approaches build a model from a user's past behaviour as well as similar decisions made by other users. 2. Content-based Filtering: Content-based filtering approaches utilize a series of discrete characteristics of an item in order to recommend additional items with similar properties. Figure: Movie Recommendation in IMDb
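The collaborative filtering idea can be sketched as a user-based predictor: estimate a missing rating from the ratings of other users, weighted by how similar their past ratings are. This is a minimal sketch under stated assumptions. The users, movies and ratings are invented, and cosine similarity over co-rated items is one common choice, not necessarily what the systems named above use.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> {item: rating}.
ratings = {
    "alice": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "bob": {"Movie A": 5, "Movie B": 5, "Movie C": 1, "Movie D": 5},
    "carol": {"Movie A": 1, "Movie B": 2, "Movie C": 5, "Movie D": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity of two users over the items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for item."""
    numerator = denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        numerator += sim * other_ratings[item]
        denominator += sim
    return numerator / denominator if denominator else None


# Alice has not rated "Movie D"; her tastes resemble Bob's more than Carol's.
print(predict("alice", "Movie D"))
```

A content-based recommender would instead compare item attributes (genre, cast, and so on) and would not need other users' ratings at all.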
- 36. Text Mining
- 37. Text Mining: Text Clustering, Text Categorization, Sentiment Analysis, Concept Extraction, Document Summarization
- 38. Text Mining - Text Categorization: Text categorization (a.k.a. text classification) is the task of assigning predefined categories to free-text documents. E.g. news categories, academic paper categories.
- 39. Text Mining - Document Summarization: Multi-document summarization is an automatic procedure aimed at extraction of information from multiple texts written about the same topic.
- 40. Text Mining - Text Clustering: Text clustering is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction and fast information retrieval or filtering.
- 41. Text Mining - Concept Extraction: Concept mining is an activity that results in the extraction of concepts from artifacts. Solutions to the task typically involve aspects of artificial intelligence and statistics, such as data mining and text mining.
- 42. Text Mining - Sentiment Analysis: Sentiment analysis is the process of determining whether a piece of writing is positive, negative or neutral. Use Cases: Twitter Sentiment Analysis, Customer Sentiment Analysis
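A very small way to see the positive/negative/neutral decision in action is a word-list (lexicon) scorer. This is only a sketch: the two word lists are tiny invented examples, and practical sentiment analysis typically uses much larger lexicons or trained models.

```python
import re

# Tiny hypothetical lexicons; real systems use far larger word lists or models.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def sentiment(text):
    """Classify text as 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for tweet in [
    "I love this phone, the camera is great",
    "Terrible service and poor support",
    "The parcel arrived on Monday",
]:
    print(sentiment(tweet), "<-", tweet)
```

Counting words ignores negation and sarcasm ("not good" scores as positive), which is one reason learned classifiers are usually preferred for real tweets or reviews.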
- 43. Time Series
- 44. Time Series: A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average. Examples: Ocean Tides, Sunspots, Stock Market Prices
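The definition above (values indexed at equally spaced points in time) can be sketched in a few lines of Python. The closing values below are invented, the start date is arbitrary, and the 3-day moving average is just one simple example of an operation that relies on the time ordering.

```python
from datetime import date, timedelta

# Hypothetical daily closing values (made-up numbers), one per day.
start = date(2024, 1, 1)
closing = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0]

# A time series: each value is indexed by an equally spaced point in time.
series = [(start + timedelta(days=i), value) for i, value in enumerate(closing)]


def moving_average(values, window):
    """Average each run of `window` consecutive values, in time order."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


smoothed = moving_average(closing, window=3)

# Each smoothed value is reported on the last day of its window.
for (day, _), value in zip(series[2:], smoothed):
    print(day.isoformat(), round(value, 2))
```

Shuffling the rows would leave the mean of the data unchanged but would make the moving average meaningless, which is what distinguishes a time series from an unordered sample.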
- 45. Deep Learning
- 46. Deep Learning: Before moving ahead, let us look at some of the drawbacks of machine learning. 1. Traditional ML algorithms are not useful when working with high-dimensional data, that is, where we have a large number of inputs and outputs. For example, in handwriting recognition we have a large number of inputs, with different types of input associated with different types of handwriting. 2. The second major challenge with traditional machine learning models is a process called feature extraction. Specifically, the programmer needs to tell the computer what kinds of things it should look for in order to make more accurate decisions.
- 47. Deep Learning: Deep learning is one of the few methods by which we can circumvent the challenges of feature extraction in machine learning. This is because deep learning models are capable of learning to focus on the right features by themselves, requiring little guidance from the programmer. Therefore, we can say that Deep Learning is: 1. A collection of statistical machine learning techniques 2. Used to learn feature hierarchies 3. Often based on artificial neural networks. Figure: Artificial Intelligence, Machine Learning, Deep Learning
- 48. Deep Learning Examples. Figure: Face Recognition using Deep Learning
- 49. Deep Learning Examples: Speech Recognition, Self Driving Cars, Automatic Translation
- 50. Summary
- 51. Summary: What is Data Science?, Job Roles in Data Science, Components of Data Science, Concepts of Statistics, Power of Data Visualization, Machine Learning using R
- 52. Summary: Classification, Clustering, Recommendation Systems, Text Mining, Time Series, Deep Learning
- 53. Thank You … Questions/Queries/Feedback
