2. What is predictive analytics?
• Predictive analytics is an applied field that uses a variety of
quantitative methods that make use of data in order to make
predictions.
3. • Predictive analytics:
⮚ Is an applied field: predictive analytics is always used to solve problems, and it
is applied in virtually every industry and domain: finance, telecommunications,
advertising, insurance, healthcare, education, entertainment, banking, and so on.
Keep in mind that you will always be using predictive analytics to solve problems within a
particular domain.
⮚ Uses a variety of quantitative methods: when doing predictive analytics, you will be a
user of the techniques, theorems, best practices, empirical findings, and theoretical
results of the mathematical sciences.
⮚ That makes use of data: data is the raw material used for building the models. A key
aspect of predictive analytics is extracting useful information from data.
⮚ To make predictions: in the context of predictive analytics, a prediction concerns an
unknown event, not necessarily one in the future. We can build a predictive model that is
able to "predict" that unknown.
• For example: whether a patient has a disease, based on clinical data.
4. How a Doctor Analyzes a Patient
• Asks the patient about the problem.
• Then asks about the patient's age, reports, and any medication being taken.
• Based on the reports, the doctor identifies the problem.
5. Concepts of Predictive Analytics
Concept: Description
• Data: Any record that is captured and stored and that is meaningful in some context.
• Unit of observation: The entity that is the subject of analysis. For sales data, the units
of observation could be stores, cash registers, transactions, days, and so on.
• Attribute: A characteristic of a unit of analysis. For a patient, the attributes could be
age, height, weight, body mass index, cholesterol level, and so on.
• Data point, sample, observation, and instance: A single unit of observation with all its
available attributes.
• Dataset: A collection of data points, usually in a table format; think of a relational
database table or a spreadsheet.
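A minimal sketch of how these concepts map onto code, using only plain Python. The patient records, the field names, and the helper `body_mass_index` are all invented for illustration; they are not taken from any real dataset.

```python
# Hypothetical patient records, invented purely for illustration.
# Unit of observation: a patient. Attributes: age, height_cm, weight_kg.
dataset = [
    {"patient_id": 1, "age": 54, "height_cm": 170, "weight_kg": 82},
    {"patient_id": 2, "age": 37, "height_cm": 162, "weight_kg": 58},
    {"patient_id": 3, "age": 61, "height_cm": 178, "weight_kg": 95},
]


def body_mass_index(weight_kg, height_cm):
    """Derive a new attribute (BMI) from two existing ones."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


# Each dictionary is one data point; add the derived attribute to all of them.
for data_point in dataset:
    bmi = body_mass_index(data_point["weight_kg"], data_point["height_cm"])
    data_point["bmi"] = round(bmi, 1)

# The attribute names, leaving out the identifier column.
attributes = [name for name in dataset[0] if name != "patient_id"]

print("Data points:", len(dataset))
print("Attributes:", attributes)
print("First data point:", dataset[0])
```

Here the list plays the role of the dataset, each dictionary is a data point, and each key is an attribute.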
7. Predictive Analytics Process
• 1. Problem understanding and definition: Understand the problem and what a
potential solution would look like. Also, define the requirements for solving the
problem.
• 2. Data collection and preparation: Get a dataset that is ready for analysis.
• 3. Data understanding using exploratory data analysis (EDA): Understand your
dataset using EDA, a combination of numerical and visualization
techniques that allow us to understand different characteristics of our dataset,
its variables, and the potential relationships between them.
• 4. Model building: Produce some predictive models that solve the problem.
• 5. Model evaluation: Choose the best model among a subset of the most
promising ones and determine how good that model is at providing the solution.
• 6. Communication and/or deployment: Communicate the results and/or deploy
the model and start using it to make predictions.
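Steps 2 to 5 of this process can be sketched in a few lines, assuming scikit-learn is installed. The choice of the bundled breast cancer dataset, the logistic regression model, and the 75/25 split are assumptions made for this illustration, not something prescribed by the process itself.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 2: get a dataset that is ready for analysis
# (a small clinical dataset bundled with scikit-learn).
X, y = load_breast_cancer(return_X_y=True)

# Step 3: a very small piece of EDA, just the table shape and class balance.
n_rows, n_features = X.shape
class_1_share = y.mean()  # fraction of rows labelled 1

# Hold out a test set so the evaluation uses data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 4: build a model (feature scaling followed by logistic regression).
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 5: evaluate the model on the held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"{n_rows} rows, {n_features} features, class 1 share = {class_1_share:.2f}")
print(f"Held-out accuracy = {accuracy:.3f}")
```

In a real project, step 5 would usually compare several candidate models rather than score a single one.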
8. Python’s Data Science Stack
• Anaconda: Anaconda is a distribution of the Python and R programming
languages for scientific computing that aims to simplify package
management and deployment. It includes more than 300 libraries.
• Jupyter: JupyterLab is the latest web-based interactive development
environment for notebooks, code, and data.
• NumPy: NumPy is a Python library used for working with arrays. It also has
functions for working in the domains of linear algebra, Fourier transforms, and
matrices.
• SciPy: SciPy is a free and open-source Python library used for scientific
and technical computing.
• Pandas: Pandas is an open-source Python package that is widely used for
data science, data analysis, and machine learning tasks.
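A brief sketch of what NumPy and pandas look like in use, assuming both are installed. The numbers and column names below are made up for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: arrays and basic linear algebra.
ages = np.array([54, 37, 61])
mean_age = ages.mean()

matrix = np.array([[2.0, 0.0], [0.0, 3.0]])
vector = np.array([1.0, 2.0])
product = matrix @ vector  # matrix-vector product

# pandas: a labelled table (DataFrame) of hypothetical patient data.
patients = pd.DataFrame({"age": ages, "weight_kg": [82, 58, 95]})
older_than_50 = patients[patients["age"] > 50]  # filter rows by a condition

print(mean_age, product)
print(older_than_50)
```

NumPy handles the numerical arrays, while pandas adds column labels and table-style operations such as filtering.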
9. Python’s Data Science Stack
• Matplotlib: Matplotlib is a cross-platform data visualization and graphical
plotting library for Python.
• Seaborn: Seaborn is a Python data visualization library based on
Matplotlib.
• Scikit-learn: Scikit-learn is a free machine learning library for Python. It
features various algorithms such as support vector machines, random forests,
and k-nearest neighbors.
• TensorFlow and Keras: Keras is a deep learning API written in Python,
running on top of the machine learning platform TensorFlow.
• Dash: Dash is an open-source Python framework used for building
analytical web applications.