Data Science Training | Data Science Tutorial | Data Science Certification | Edureka

This Edureka Data Science Training will help you understand what Data Science is and introduce its different components and concepts. This tutorial is ideal for both beginners and professionals who want to learn or brush up on Data Science concepts. Below are the topics covered in this tutorial:

1. What is Data Science?
2. Job Roles in Data Science
3. Components of Data Science
4. Concepts of Statistics
5. Power of Data Visualization
6. Introduction to Machine Learning using R
7. Supervised & Unsupervised Learning
8. Classification, Clustering & Recommenders
9. Text Mining & Time Series
10. Deep Learning

For structured training in Data Science, see the complete details of our Data Science Certification Training course here: https://goo.gl/OCfxP2

  1. Edureka Data Science Certification Training (www.edureka.co/data-science)
  2. What to expect? What is Data Science?; Job Roles in Data Science; Components of Data Science; Concepts of Statistics; Power of Data Visualization; Introduction to Machine Learning using R; Supervised & Unsupervised Learning; Classification, Clustering & Recommenders; Text Mining & Time Series; Deep Learning
  3. What is Data Science?
  4. What is Data Science? Data Science involves using automated methods to analyze massive amounts of data and to extract knowledge from them. By combining aspects of statistics, computer science, applied mathematics and visualization, data science can turn the vast amounts of data the digital age generates into new insights and new knowledge. Figure: Data Science Components
  5. Job Roles of Data Science
  6. Job Roles of Data Science: Data Scientist, Data Analyst, Data Architect, Statistician, Data Engineer, Database Administrator, Business Analyst, Data & Analyst Manager
  7. Job Roles of Data Science: Data Scientist. Role: Cleans and organizes big data. Works on distributed computing and predictive modeling. Languages: R, SAS, Python, Matlab, SQL, Hive, Pig and Spark
  8. Job Roles of Data Science: Data Analyst. Role: Collects, processes and performs statistical data analyses. Languages: R, Python, HTML, JS, C, C++ and SQL
  9. Job Roles of Data Science: Data Architect. Role: Creates blueprints for data management systems to integrate, centralize, protect and maintain data sources. Languages: SQL, XML, Hive, Pig and Spark
  10. Job Roles of Data Science: Data Engineer. Role: Develops, constructs, tests and maintains architectures such as databases and large-scale processing systems. Languages: SQL, Hive, Pig, R, Matlab, SAS, Python, Java, Ruby, C++ and Perl
  11. Job Roles of Data Science: Statistician. Role: Collects, analyses and interprets qualitative and quantitative data with statistical theories and methods. Languages: R, SAS, SPSS, Matlab, Tableau, Stata, Python, Perl, Hive, Spark and SQL
  12. Job Roles of Data Science: Database Administrator. Role: Ensures that the database is available to all relevant users, is performing properly and is being kept safe. Languages: SQL, Java, Ruby on Rails, XML, C# and Python
  13. Job Roles of Data Science: Business Analyst. Role: Improves business processes as an intermediary between business and IT. Languages: SQL, C, Excel, Tableau, Power BI and Python
  14. Job Roles of Data Science: Data & Analyst Manager. Role: Manages a team of analysts and data scientists. Languages: SQL, R, SAS, Python, Matlab and Java
  15. Components of Data Science
  16. Components of Data Science. Data Science has the following components: Statistics, Visualization, Machine Learning and Deep Learning.
  17. Concepts of Statistics
  18. Concepts of Statistics. Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation and organization of data. Statistics dates back to ancient civilizations, at least to the 5th century BC, but it was not until the 18th century that it started to draw more heavily from calculus and probability theory. Figure: Concepts of Statistics (labels: Data, Collection, Analysis, Interpretation, Presentation, Visual Representation, Predictive Analysis)
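
As a small illustration of the analysis and presentation steps named above, the sketch below summarizes a sample with Python's standard `statistics` module. The sample values and the `summarize` helper are hypothetical, chosen only for illustration.

```python
import statistics

def summarize(values):
    """Return a few common descriptive statistics for a numeric sample."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.pstdev(values),  # population standard deviation
    }

# Hypothetical sample, e.g. eight measurements collected in a survey.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = summarize(sample)

# Present the results as simple "name: value" lines.
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this sample the mean is 5, the median is 4.5 and the population standard deviation is 2.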
  19. Power of Visualization
  20. Scope of Visual Analytics
  21. Data Visualization: Integrate Different Data Sets, Analyze, Visualize
  22. Introduction to Machine Learning using R
  23. Machine Learning using R. Machine Learning explores the study and construction of algorithms that can learn from and make predictions on data. It is closely related to computational statistics. It is used to devise complex models and algorithms that lend themselves to prediction, which in commercial use is known as predictive analytics. Applications: Speech Recognition, Face Recognition, Anti-Virus, Weather Prediction
  24. Supervised & Unsupervised Learning
  25. Supervised & Unsupervised Learning. Supervised Learning: Supervised learning is the machine learning task of inferring a function from labelled training data. The training data consists of a set of training examples. E.g. if you build a fruit classifier, the labels will be “this is an orange, this is an apple and this is a banana”, based on showing the classifier examples of apples, oranges and bananas. Algorithms: SVM, Regression, Naive Bayes, Decision Trees, K-nearest Neighbour Algorithm & Neural Networks. Unsupervised Learning: Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labelled responses. E.g. in the same example, a fruit clustering will categorize the fruits as “fruits with soft skin and lots of dimples”, “fruits with shiny hard skin” and “elongated yellow fruits”. Algorithms: Clustering, Anomaly Detection, Neural Networks and Latent Variable Models
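
The fruit classifier described above can be sketched with the K-nearest neighbour algorithm from the list. This is a minimal sketch, not the deck's own implementation: the two features (length in cm and a 0-10 skin roughness score), the training values and the `classify` helper are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical labelled training data: (length_cm, skin_roughness 0-10) -> fruit.
TRAINING = [
    ((7.0, 1.0), "apple"), ((7.5, 1.5), "apple"), ((6.8, 0.8), "apple"),
    ((8.0, 7.0), "orange"), ((7.6, 6.5), "orange"), ((8.3, 7.4), "orange"),
    ((18.0, 2.0), "banana"), ((19.5, 2.5), "banana"), ((17.2, 1.8), "banana"),
]

def classify(features, k=3):
    """Label a fruit by majority vote among its k nearest training examples."""
    nearest = sorted(TRAINING, key=lambda example: math.dist(features, example[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((7.2, 1.2)))   # small and smooth: apple
print(classify((18.5, 2.2)))  # long and smooth: banana
```

The labels come only from the training examples; the classifier has no built-in notion of what an apple is.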
  26. Reinforcement Learning. Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. It differs from standard supervised learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected. Applications: Robots used in Manufacturing, Advertising, Inventory Management, Player vs AI Games.
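
To make the reward-driven loop concrete, here is a minimal tabular Q-learning sketch. Q-learning is one common reinforcement learning method, picked here only as an illustration. The five-cell corridor environment, the reward of 1 at the goal and the constants `ALPHA` and `GAMMA` are all assumptions. The agent is never shown the correct action; it only observes rewards.

```python
import random

N_STATES = 5                  # hypothetical corridor: cells 0..4, goal at cell 4
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9       # assumed learning rate and discount factor

def step(state, action):
    """Move one cell; reaching the last cell pays reward 1 and ends the episode."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[nxt].values())
            # Nudge the estimate toward reward + discounted future value.
            q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
            state = nxt
    return q

q_table = train()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

After training, the greedy policy moves right in every cell, which is the shortest path to the goal.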
  27. Classifiers
  28. Introduction to Classification. Classification is the problem of identifying to which of a set of categories a new observation belongs. Classification is a form of supervised learning. It is based on a training set of data containing observations whose category is known. Figure: Examples of Classification
  29. Classification Algorithms. Diagram labels: Classifier, Linear, Quadratic, Non-Linear, SVM, Logistic Regression, Naive Bayes, Perceptron, Decision Trees, Kernel Estimation, Neural Networks, Radial Basis Function (RBF), Recurrent Neural Network (RNN), Modular Neural Network
  30. Classification Example. Let us look at how a classification algorithm works. Here is an example of Linear Regression using the alternating least squares method.
  31. Clustering
  32. Clustering. Clustering is the problem of grouping objects into different groups without any prior information about labels or classes. Clustering is a form of unsupervised learning.
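
One way to group unlabelled points is k-means, sketched below. The deck does not name a specific clustering algorithm, so k-means is an assumption, as are the sample points and the naive choice of the first k points as starting centroids.

```python
import math

def kmeans(points, k, max_iters=100):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical 2-D observations with no labels attached.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The first three points end up in one cluster and the last three in the other, even though no labels were supplied.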
  33. Recommender Systems
  34. Recommender Systems. A Recommender System is a subclass of information filtering system that seeks to predict the "rating" or "preference" that a user would give to an item. Recommendations can be found everywhere, from Netflix & BookMyShow movies to YouTube videos, Amazon products to Goibibo hotels, Xbox games to Zomato restaurants. Figure: Companies using Recommendation Systems
  35. Recommender Systems - Example. Recommendation systems work in two ways: 1. Collaborative Filtering: Collaborative filtering approaches build a model from a user's past behaviour as well as similar decisions made by other users. 2. Content-based Filtering: Content-based filtering approaches utilize a series of discrete characteristics of an item in order to recommend additional items with similar properties. Figure: Movie Recommendation in IMDb
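
Collaborative filtering, the first approach above, can be sketched as a similarity-weighted average of other users' ratings. This is one simple user-based variant under stated assumptions: the users, titles and ratings are made up, and cosine similarity is just one possible similarity measure.

```python
import math

# Hypothetical ratings on a 1-5 scale; the users and titles are made up.
RATINGS = {
    "alice": {"Space Saga": 5, "Love Letters": 1, "Robot Wars": 5},
    "bob": {"Space Saga": 5, "Love Letters": 1, "Robot Wars": 4,
            "Star Voyage": 5, "Paris in Spring": 1},
    "carol": {"Space Saga": 1, "Love Letters": 5, "Robot Wars": 1,
              "Star Voyage": 2, "Paris in Spring": 5},
}

def similarity(a, b):
    """Cosine similarity between two users over the items both have rated."""
    shared = set(RATINGS[a]) & set(RATINGS[b])
    if not shared:
        return 0.0
    dot = sum(RATINGS[a][i] * RATINGS[b][i] for i in shared)
    norm_a = math.sqrt(sum(RATINGS[a][i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(RATINGS[b][i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user):
    """Rank items the user has not rated by similarity-weighted average rating."""
    totals, weights = {}, {}
    for other in RATINGS:
        if other == user:
            continue
        sim = similarity(user, other)
        for item, rating in RATINGS[other].items():
            if item not in RATINGS[user]:
                totals[item] = totals.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    scores = {i: totals[i] / weights[i] for i in totals if weights[i] > 0}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

print(recommend("alice"))
```

Because alice's past ratings resemble bob's far more than carol's, bob's favourite unseen title ranks first for her.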
  36. Text Mining
  37. Text Mining: Text Clustering, Text Categorization, Sentiment Analysis, Concept Extraction, Document Summarization
  38. Text Mining: Text Categorization. Text categorization (a.k.a. text classification) is the task of assigning predefined categories to free-text documents. E.g. news categories, academic paper categories.
  39. Text Mining: Document Summarization. Multi-document summarization is an automatic procedure aimed at extraction of information from multiple texts written about the same topic.
  40. Text Mining: Text Clustering. Text Clustering is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction and fast information retrieval or filtering.
  41. Text Mining: Concept Extraction. Concept mining is an activity that results in the extraction of concepts from artifacts. Solutions to the task typically involve aspects of artificial intelligence and statistics, such as data mining and text mining.
  42. Text Mining: Sentiment Analysis. Sentiment Analysis is the process of determining whether a piece of writing is positive, negative or neutral. Use Cases: Twitter Sentiment Analysis, Customer Sentiment Analysis
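
As a rough sketch of the sentiment analysis task described above, the snippet below counts words from two small hand-made word lists. The word lists and the `sentiment` helper are hypothetical; production systems typically rely on trained models rather than a fixed lexicon.

```python
import re

# Tiny hand-made word lists, assumed for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "awful"}

def sentiment(text):
    """Classify text as positive, negative or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for review in (
    "I love this great phone",
    "Terrible battery and poor screen",
    "The parcel arrived on Tuesday",
):
    print(review, "->", sentiment(review))
```

The three sample reviews come out as positive, negative and neutral respectively.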
  43. Time Series
  44. Time Series. A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average. Figures: Ocean Tides, Sunspots, Stock Market Prices
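
A common first step with equally spaced data such as daily closing values is smoothing with a simple moving average. The sketch below assumes a hypothetical five-day series and a window of three; the `moving_average` helper is illustrative, not something the deck defines.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive points in a time-ordered series."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

# Hypothetical daily closing values, one per equally spaced time step.
daily_close = [10, 12, 14, 16, 18]
smoothed = moving_average(daily_close, 3)
print(smoothed)  # [12.0, 14.0, 16.0]
```

The output is shorter than the input because the first full window only becomes available at the third data point.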
  45. Deep Learning
  46. Deep Learning. Before moving ahead, let us look at some of the drawbacks of machine learning. 1. Traditional ML algorithms do not work well with high-dimensional data, that is, where we have a large number of inputs and outputs. For example, in handwriting recognition we have a large amount of input, with different types of input associated with different types of handwriting. 2. The second major challenge with traditional machine learning models is a process called feature extraction. Specifically, the programmer needs to tell the computer what kinds of things it should look for in order to make more accurate decisions.
  47. Deep Learning. Deep learning is one of the few methods by which we can circumvent the challenges of feature extraction in machine learning. This is because deep learning models are capable of learning to focus on the right features by themselves, requiring little guidance from the programmer. Therefore, we can say that Deep Learning is: 1. A collection of statistical machine learning techniques 2. Used to learn feature hierarchies 3. Often based on artificial neural networks. Figure labels: Artificial Intelligence, Machine Learning, Deep Learning
  48. Deep Learning Examples. Figure: Face Recognition using Deep Learning
  49. Deep Learning Examples: Speech Recognition, Self Driving Cars, Automatic Translation
  50. Summary
  51. Summary: What is Data Science?, Job Roles in Data Science, Components of Data Science, Concepts of Statistics, Power of Data Visualization, Machine Learning using R
  52. Summary: Classification, Clustering, Recommendation Systems, Text Mining, Time Series, Deep Learning
  53. Thank You … Questions/Queries/Feedback
