This document provides an introduction to data science. It discusses the different types of data, including traditional structured data, big unstructured data, and semi-structured data. It also summarizes the key differences between analysis and analytics and between qualitative and quantitative analytics, and outlines business intelligence, machine learning, and traditional data science methods, along with common data science tools and job positions.
2. ABOUT DATA SCIENCE
• Data science is the interdisciplinary field concerned with generating business insights and value by processing large amounts of data. That definition alone, however, does not capture the complexity and nuanced nature of the discipline, nor how it is shaping the technological and business landscapes.
3. 1 - The different data science fields
• Traditional data: Data in the form of tables containing numeric or text values; data that is structured and stored in databases.
• Big data: Extremely large data, humongous in terms of volume. It can come in various formats:
- structured
- semi-structured
- unstructured
9. 2 - Analysis vs analytics
• Analysis – Dividing data into digestible components that are easier to understand, and examining how the different parts relate to each other. Analysis is performed on past data, explaining why the story ended the way it did; we want to explain 'how' and 'why' something happened.
• Analytics – Explores the future. The application of logical and computational reasoning to the component parts obtained in an analysis, looking for patterns and exploring what you can do with them in the future.
10. 3 – Qualitative vs Quantitative analytics
• Qualitative analytics – using intuition and experience in conjunction with analysis to plan your next business move.
• Quantitative analytics – applying formulas and algorithms to the numbers you have gathered from your analysis.
11. 4 - Business Intelligence
• Business Intelligence (BI) is the process of analyzing and reporting historical business data. Once reports and dashboards have been prepared, end-users such as the general manager can use them to make informed strategic and business decisions. Concisely put, business intelligence aims to explain past events using business data.
• BI can also be seen as the preliminary step of predictive analytics: first you analyze past data, and the resulting inferences allow you to create models that can accurately predict the future of your business.
12. 5 - Machine learning
• AI simulates human knowledge and decision-making with computers. We have largely approached AI through machine learning and deep learning.
• Machine learning is the ability of machines to predict outcomes without being explicitly programmed to do so. It is about creating and implementing algorithms that let machines receive data and use it to:
Make predictions | Analyze patterns | Give recommendations
• Symbolic reasoning is the exception: a type of AI that uses neither machine learning nor deep learning.
13. 6 - Traditional data: Techniques
• Class labelling: Assigning each data point the correct data type (or arranging data by category).
• Data cleansing (also 'data cleaning' or 'data scrubbing'): dealing with inconsistent data. For example, working on a dataset containing US states and finding that some of the names are misspelled.
• Data shuffling: Shuffling the observations in your dataset, just like shuffling a deck of cards, to ensure it is free from unwanted patterns caused by problematic data collection.
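As a minimal sketch of data shuffling in plain Python (the `shuffle_rows` helper and its fixed seed are illustrative assumptions, not part of any particular library):

```python
import random

def shuffle_rows(rows, seed=42):
    """Return a shuffled copy of the dataset rows.

    Shuffling removes ordering artifacts introduced during data
    collection (e.g. records arriving sorted by date or by source).
    A fixed seed keeps the shuffle reproducible.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled

dataset = [("row", i) for i in range(10)]
mixed = shuffle_rows(dataset)
# Same rows, different order: nothing is lost, only the ordering pattern
assert sorted(mixed) == sorted(dataset)
```

Shuffling a copy (rather than in place) keeps the original collection order available in case it is itself informative.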
14. 6 - Traditional data: Techniques
• Data balancing: Ensuring that the sample gives equal priority to each class. For example, if your dataset contains 80% male and 20% female data, but you know the population is approximately 50% men and 50% women, you need to apply a balancing technique to counteract this problem (using an equal number of data points from each group).
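One common balancing technique is random downsampling of the over-represented classes. A minimal sketch (the `downsample` helper is an illustrative assumption):

```python
import random
from collections import defaultdict

def downsample(rows, label_of, seed=0):
    """Balance classes by randomly downsampling every class
    to the size of the smallest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced

# 80% male / 20% female, as in the slide's example
data = [("male", i) for i in range(80)] + [("female", i) for i in range(20)]
balanced = downsample(data, label_of=lambda row: row[0])
# Now 20 rows per class: equal priority for each group
```

Downsampling discards data; the alternative (oversampling the minority class) keeps all rows but duplicates some, which can inflate confidence in the minority examples.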
15. 6.1 - Traditional data: Examples
• Numerical variable: Numbers that can be meaningfully manipulated (for example, added), which gives us useful information.
• Categorical variable: Values that hold no numerical meaning, even when encoded as numbers. Dates are also considered categorical data.
16. 7 - Big data: Techniques
• Text data mining: The process of deriving valuable information from unstructured text data.
• Data masking: As a business, when you work with users' private data, you must be able to preserve confidential information. That doesn't mean the data can't be touched or used for analysis; instead, you apply data masking techniques to utilize the information without compromising private details. In essence, data masking conceals the original data with random, false data, allowing you to conduct analysis while keeping confidential information in a secure place.
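A toy format-preserving masking sketch, assuming the simple rule of swapping each letter and digit for a random one of the same kind (real masking systems are considerably more careful, e.g. keeping masks consistent across records):

```python
import random

def mask_value(value, rng):
    """Replace every letter with a random letter and every digit
    with a random digit, keeping punctuation and overall format
    so the masked value still 'looks like' the original field."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(str(rng.randint(0, 9)))
        elif ch.isalpha():
            out.append(rng.choice("abcdefghijklmnopqrstuvwxyz"))
        else:
            out.append(ch)
    return "".join(out)

rng = random.Random(7)
masked = mask_value("jane.doe@example.com", rng)
# Same shape as an email address, but no real information left
```

Because the format survives, downstream analysis (field lengths, value shapes, join keys when masked consistently) can still run on the masked copy.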
17. 7.1 - Big data: Examples
• We find big data in increasingly many industries and companies.
• Probably the most notable example of a company leveraging the true potential of big data is Facebook. The company keeps track of its users' names, personal data, photos, videos, recorded messages and so on. This means their data has a lot of variety, and with over 2 billion users worldwide, the volume of data stored on their servers is tremendous.
18. 8 - Business Intelligence: Techniques
• Measure: a simple descriptive statistic of past performance.
• Metric: a value derived from the measures you have obtained, aimed at gauging business performance or progress; it has a business meaning attached to it.
Metric = Measure + Business meaning
• KPIs: It doesn't make sense to keep track of every metric, so companies focus on the most important ones.
KPIs = Metrics + Business objective
Filtering out the uninformative metrics and turning the interesting, informative KPIs into easily understood and comparable visualizations is an important part of the business intelligence analyst's job.
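The measure → metric → KPI chain can be made concrete with a small sketch; the conversion-rate example and its target value are illustrative assumptions:

```python
def conversion_rate(purchases, visits):
    """Metric: purchases per visit. The ratio of two raw measures,
    plus the business meaning 'how well visits convert to sales'."""
    return purchases / visits

# Measures: simple descriptive numbers from past data
visits, purchases = 5000, 150

# Metric = Measure + Business meaning
metric = conversion_rate(purchases, visits)  # 0.03, i.e. 3%

# KPI = Metric + Business objective (here, a hypothetical 2.5% target)
target = 0.025
kpi_met = metric >= target  # True: the objective is achieved
```

The same metric becomes a KPI only once a business objective (the target) is attached to it, which is exactly the distinction the slide draws.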
19. 8.1 - Business Intelligence: Examples
• BI allows you to adjust your strategy to past data as soon as it becomes available. Done right, business intelligence helps you manage your shipment logistics efficiently and, in turn, reduce costs and increase profit.
20. 9 - Traditional methods: Techniques
• (classical statistical methods for forecasting)
• Clustering: grouping the data into neighborhoods to analyze meaningful patterns.
• Time series: used in economics and finance to show the development of certain values over time, such as stock prices or sales volume.
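As a minimal time-series sketch, a moving-average forecast, one of the simplest classical baselines (the `moving_average_forecast` helper and the sales numbers are illustrative assumptions):

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window`
    observations, a minimal classical forecasting baseline."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    recent = series[-window:]
    return sum(recent) / window

monthly_sales = [100, 110, 120, 130, 140, 150]
forecast = moving_average_forecast(monthly_sales)  # (130+140+150)/3 = 140.0
```

Classical methods like this (and richer ones such as exponential smoothing or ARIMA) remain strong baselines for the forecasting examples on the next slide.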
21. 9.1 - Traditional methods: Examples
• The application of traditional methods is extremely broad. Two relevant examples:
• Forecasting sales data: using time series data to predict a firm's future expected sales.
• UX: plotting customer satisfaction against customer revenue and finding that each cluster represents a different geographical location.
22. 10 - Machine Learning: Techniques
• There are four ingredients for machine learning: data, model, objective function, optimization algorithm.
• Model: the algorithm the computer uses to recognize certain types of patterns.
• Objective function: the specification of the machine learning problem; a function to be maximized or minimized depending on the task at hand.
• Optimization algorithm: a process in which previous solutions of the problem are compared until an optimal solution is reached.
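The four ingredients can be shown together in a tiny sketch: toy data, a one-parameter linear model, a squared-error objective, and gradient descent as the optimizer (all names and numbers here are illustrative assumptions):

```python
# Ingredient 1 — data: pairs (x, y) following the trend y = 2x
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def model(w, x):
    """Ingredient 2 — model: predicts y from x with one parameter w."""
    return w * x

def objective(w):
    """Ingredient 3 — objective function: mean squared error,
    to be minimized."""
    return sum((model(w, x) - y) ** 2 for x, y in data) / len(data)

def optimize(w=0.0, lr=0.01, steps=500):
    """Ingredient 4 — optimization algorithm: plain gradient descent,
    repeatedly nudging w downhill on the objective."""
    for _ in range(steps):
        grad = sum(2 * (model(w, x) - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

w = optimize()  # converges very close to 2.0, the true slope
```

Swapping any one ingredient (a deeper model, a different loss, a smarter optimizer) changes the learner, but the four-part recipe stays the same.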
23. 10.1 - Machine Learning: Types
• Three main types of machine learning:
• Supervised learning: Training the algorithm resembles a teacher supervising her students, providing feedback at every step and telling students whether they did 'good' or need to improve their performance. In supervised learning you use labelled data (in our example, every data point is categorized as 'good' performance or as 'performance that needs improvement').
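Supervised learning in miniature, keeping the slide's 'good' vs 'needs improvement' example: we learn a score threshold from labelled points (the `train_threshold` helper and the scores are illustrative assumptions):

```python
def train_threshold(points):
    """Supervised learning in miniature: `points` are (score, label)
    pairs, where each label is the teacher's feedback. We search for
    the score threshold that reproduces the labels best."""
    best_threshold, best_accuracy = None, -1.0
    for t in sorted(score for score, _ in points):
        accuracy = sum(
            (label == "good") == (score >= t) for score, label in points
        ) / len(points)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = t, accuracy
    return best_threshold

labelled = [(35, "needs improvement"), (40, "needs improvement"),
            (70, "good"), (85, "good"), (90, "good")]
threshold = train_threshold(labelled)  # 70: scores >= 70 count as 'good'
```

The labels are doing all the work here: without the teacher's 'good'/'needs improvement' feedback there would be nothing to fit the threshold to, which is exactly the contrast with unsupervised learning on the next slide.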
24. 10.1 - Machine Learning: Types
• Unsupervised learning: Here the algorithm trains itself; there is no teacher providing feedback. The algorithm uses unlabeled data that is not categorized as 'good' or as 'performance that needs improvement'. The unsupervised ML model simply takes the data and sorts it into groups. In our example, it would show us two groups, but it would not be able to tell us which one is 'good performance' and which is 'performance that needs improvement'.
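The same example without labels: a tiny 1-D two-means clustering that discovers the two groups but, as the slide notes, cannot name them (the `two_means` helper is an illustrative assumption):

```python
def two_means(values, iters=20):
    """Unsupervised learning in miniature: split unlabeled numbers
    into two groups by 1-D k-means (k=2). The algorithm finds the
    groups but has no way to say which group is 'good'."""
    c1, c2 = float(min(values)), float(max(values))  # initial centers
    for _ in range(iters):
        group1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        group2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        if group1:
            c1 = sum(group1) / len(group1)   # move center to group mean
        if group2:
            c2 = sum(group2) / len(group2)
    return sorted(group1), sorted(group2)

scores = [30, 35, 40, 80, 85, 90]      # the same scores, no labels
group_a, group_b = two_means(scores)   # [30, 35, 40] and [80, 85, 90]
```

Compare with the supervised sketch: the split is essentially the same, but attaching the meaning 'good' vs 'needs improvement' to a group still requires outside knowledge.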
25. 10.1 - Machine Learning: Types
• Reinforcement learning: A reward system is introduced. Every time the student performs a task better than before, they receive a reward (and nothing if the task is not performed better). Instead of minimizing an error, we maximize a reward; in other words, we maximize the objective function.
• Deep learning – the modern state-of-the-art approach to machine learning – leverages the power of neural networks and can be placed in both categories, supervised and unsupervised learning.
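The reward-maximization idea above can be sketched with the simplest reinforcement-learning setting, a two-armed bandit: an epsilon-greedy agent learns from rewards alone which action pays best (the `run_bandit` helper and its reward probabilities are illustrative assumptions):

```python
import random

def run_bandit(reward_probs, steps=2000, eps=0.1, seed=0):
    """Reinforcement learning in miniature: an epsilon-greedy agent
    repeatedly picks an arm, observes a 0/1 reward, and updates its
    estimate of each arm's value to maximize total reward."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    values = [0.0] * len(reward_probs)   # estimated reward per arm
    for _ in range(steps):
        if rng.random() < eps:                      # explore sometimes
            arm = rng.randrange(len(reward_probs))
        else:                                       # otherwise exploit
            arm = values.index(max(values))
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental mean: nudge the estimate toward the new reward
        values[arm] += (reward - values[arm]) / counts[arm]
    return values

estimates = run_bandit([0.2, 0.8])  # the agent learns arm 1 pays best
```

There are no labels anywhere: only the reward signal shapes the behavior, which is the defining contrast with the two previous types.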
26. 11 - Common data science tools
• There are two main types of tools.
• Programming languages enable you to devise programs that can execute specific operations, and you can reuse those programs whenever you need to execute the same action. The most popular programming language for data science is Python, followed by R.
- Python and R have their limitations: they are not able to address problems specific to some domains. One example is relational database management systems, where SQL works best.
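The two kinds of tools also combine: Python ships with the `sqlite3` module, so a relational query can be expressed in SQL from within a Python program. A minimal sketch (the `sales` table and its rows are illustrative assumptions):

```python
import sqlite3

# In-memory database with a hypothetical `sales` table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0), ("south", 200.0)],
)

def total_by_region(conn):
    """The kind of aggregation that is natural to state in SQL."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "GROUP BY region ORDER BY region"
    ).fetchall()

print(total_by_region(conn))  # [('north', 200.0), ('south', 200.0)]
```

Grouping and summing in SQL is one declarative line; the equivalent hand-rolled Python would loop and accumulate by hand, which is precisely why SQL "works best" in this domain.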
27. 11 - Common data science tools
• In terms of software, Excel plays an important role: it can do relatively complex computations and good visualizations quickly. SPSS is another popular tool for working with traditional data and applying statistical analysis.
- There is a significant amount of software designed for working with big data, such as Apache Hadoop, Apache HBase, and MongoDB.
- Power BI, Qlik, and Tableau are top-notch examples of software designed for business intelligence visualizations.
28. 12 - Data science job positions
• Data architect – designs the way data will be retrieved, processed, and consumed
• Data engineer – processes the obtained data so that it is ready for analysis
• Database administrator – handles the control of data; works with traditional data
• BI analyst – performs analyses and reporting of historical data
29. 12 - Data science job positions
• BI consultant – an 'external BI analyst'
• BI developer – performs analyses specifically designed for the company
• Data scientist – employs traditional statistical methods or unconventional machine learning techniques to make predictions
• Data analyst – prepares advanced analyses
• Machine learning engineer – applies state-of-the-art ML techniques