Data Visualization using different python libraries.pptx
1. DATA VISUALIZATION USING DIFFERENT
PYTHON LIBRARIES (E.G., PANDAS,
NUMPY, MATPLOTLIB)
TEHMIMA ISMAIL
0000412064
MSCS-II
2. WHAT IS DATA VISUALIZATION?
• Introduction to Data Visualization
• Definition: A field of data analysis concerned with representing data visually.
• Purpose: Communicates inferences from data effectively through graphics.
• Benefit: Provides a quick visual summary of the data.
• Tools: Uses pictures, maps, and graphs.
• Advantage: Visual information is easier for the human mind to process and understand.
• Application: Applies to both small and large data sets.
• Challenge: Large data sets are impractical to process manually.
3. DATA VISUALIZATION IN PYTHON
• Python offers several plotting libraries, including Matplotlib and Seaborn, along
with many other data visualization packages. Each has different features for
creating informative, customized, and appealing plots that present data in a
simple and effective way.
4. MATPLOTLIB AND SEABORN
• Matplotlib and Seaborn are Python libraries used for data visualization.
• Both provide built-in functions for plotting different kinds of graphs.
• Matplotlib is a general-purpose plotting library whose graphs can also be embedded in applications.
• Seaborn is built on top of Matplotlib and is primarily used for statistical graphs.
5. LINE CHARTS
• A line chart is a graph that represents information as a series of data points connected by
straight line segments.
• Each data point can be shown with a marker, and consecutive points are joined by a line or curve.
Let's consider the apple yield (tons per hectare) in Kanto. Let's plot a line graph using this data
and see how the yield of apples changes over time. We start by importing Matplotlib and
Seaborn.
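A minimal sketch of this first plot follows. The yield values are illustrative placeholders (the slides do not list the numbers), and only Matplotlib is imported here since Seaborn is not needed yet.

```python
import matplotlib.pyplot as plt

# Illustrative apple yields (tons per hectare); placeholder values, not from the slides.
yield_apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]

# With a single list, plt.plot uses 0, 1, 2, ... as the x values.
lines = plt.plot(yield_apples)
# In a script, call plt.show() to display the figure.
```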
7. To better understand the graph and its purpose, we can add the x-axis values
too.
8. • Let's add labels to the axes so that we can show what each axis represents.
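The two steps above (explicit x-axis values, then axis labels) might look like the sketch below. The years and yields are assumed example values, and the label texts are only suggestions.

```python
import matplotlib.pyplot as plt

# Assumed example data: years on the x-axis, yields on the y-axis.
years = [2010, 2011, 2012, 2013, 2014, 2015]
yield_apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]

plt.plot(years, yield_apples)  # first argument: x values, second: y values
plt.xlabel("Year")
plt.ylabel("Yield (tons per hectare)")
```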
9. • To plot multiple datasets on the same graph, just use the plt.plot function
once for each dataset. Let's use this to compare the yields of apples vs.
oranges on the same graph.
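As a sketch with assumed example values, calling plt.plot twice draws both series on the same axes:

```python
import matplotlib.pyplot as plt

# Assumed example data for both crops (tons per hectare).
years = [2010, 2011, 2012, 2013, 2014, 2015]
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908]

plt.plot(years, apples)   # first line
plt.plot(years, oranges)  # second line, drawn on the same axes
plt.xlabel("Year")
plt.ylabel("Yield (tons per hectare)")
```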
10. We can add a legend which tells us what each line in our graph means. To
understand what we are plotting, we can add a title to our graph.
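A possible version with a title and legend is sketched below. The data and the title text are assumptions made for illustration.

```python
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013, 2014, 2015]          # assumed example data
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908]

plt.plot(years, apples)
plt.plot(years, oranges)
plt.title("Crop Yields in Kanto")    # hypothetical title text
plt.legend(["Apples", "Oranges"])    # labels follow the order of the plt.plot calls
```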
11. To show each data point on our graph, we can highlight them with markers
using the marker argument. Many different marker shapes like a circle, cross,
square, diamond, etc. are provided by Matplotlib.
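For instance, the marker argument could be used as in this sketch (example data assumed); "o" draws circles and "x" draws crosses.

```python
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013, 2014, 2015]          # assumed example data
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908]

plt.plot(years, apples, marker="o")   # circle at each data point
plt.plot(years, oranges, marker="x")  # cross at each data point
plt.legend(["Apples", "Oranges"])
```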
12. You can use the plt.figure function with its figsize argument to change the size of the figure.
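A short sketch, assuming a 12 by 6 inch figure is wanted; figsize takes a (width, height) pair in inches.

```python
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013, 2014, 2015]          # assumed example data
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]

fig = plt.figure(figsize=(12, 6))  # create the figure first, then plot into it
plt.plot(years, apples, marker="o")
```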
13. USING SEABORN
• An easy way to make your charts look beautiful is to use some default styles from the
Seaborn library. These can be applied globally using the sns.set_style function.
14. • We can also use the darkgrid option to change the background color to a
darker shade
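The two styles mentioned above can be tried as in the following sketch. The data is assumed; the sketch also records the axes background color after each call to show that the style really changes it.

```python
import matplotlib.pyplot as plt
import seaborn as sns

years = [2010, 2011, 2012, 2013, 2014, 2015]          # assumed example data
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]

sns.set_style("whitegrid")                   # white background with grid lines
white_bg = plt.rcParams["axes.facecolor"]

sns.set_style("darkgrid")                    # darker background with grid lines
dark_bg = plt.rcParams["axes.facecolor"]

# The style is global, so it applies to every plot created afterwards.
plt.plot(years, apples, marker="o")
```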
15. BAR GRAPHS
• Categorical Data Representation:
• Utilize bar graphs for effective representation.
• Y-axis: Represents values; X-axis: Represents categories.
• Axis Interpretation:
• Y-axis reflects numerical data values.
• X-axis denotes categorical data labels.
• Data-Category Relationship:
• Bars visually linked to specific categories.
• Offers clear representation of data distribution.
• Effective Communication:
• Facilitates easy communication of categorical data.
• Provides quick understanding.
• Flexibility for Categorical Data:
• Applicable to various types of categorical data.
• Enables easy comparison between different categories.
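A basic bar chart along these lines could be drawn with plt.bar, as in this sketch. The years serve as categories and the yields are assumed example values.

```python
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013, 2014, 2015]          # categories on the x-axis
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]    # assumed values on the y-axis

bars = plt.bar(years, apples)  # one bar per category
plt.xlabel("Year")
plt.ylabel("Yield (tons per hectare)")
```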
17. WE CAN ALSO STACK BARS ON TOP OF EACH OTHER. LET'S
PLOT THE DATA FOR APPLES AND ORANGES
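One way to stack bars, sketched with assumed data, is to pass the first series as the bottom argument of the second plt.bar call:

```python
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013, 2014, 2015]          # assumed example data
apples = [0.895, 0.91, 0.919, 0.926, 0.929, 0.931]
oranges = [0.962, 0.941, 0.930, 0.923, 0.918, 0.908]

bars_apples = plt.bar(years, apples)
# bottom=apples makes each orange bar start where the apple bar ends.
bars_oranges = plt.bar(years, oranges, bottom=apples)
plt.legend(["Apples", "Oranges"])
```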
18. PLOTTING AVERAGES OF EACH BAR
• We can draw a bar chart to visualize how the average bill amount varies across different days of the
week. We can do this by computing the day-wise averages and then using plt.bar. The Seaborn library
also provides a barplot function that can automatically compute averages.
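Both approaches are sketched below on a tiny hypothetical table of bills. The column names day and total_bill and all values are assumptions, not the data used in the slides.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical bills: two per day, so the averages are easy to check by hand.
bills = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "total_bill": [10.0, 20.0, 15.0, 25.0, 20.0, 40.0, 30.0, 30.0],
})

# Option 1: compute the day-wise averages, then draw them with plt.bar.
avg_bill = bills.groupby("day")["total_bill"].mean()
plt.bar(avg_bill.index, avg_bill.values)

# Option 2: let Seaborn compute the averages (drawn on a new figure).
plt.figure()
ax = sns.barplot(x="day", y="total_bill", data=bills)
```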
19. PLOTTING MULTIPLE BAR GRAPHS
• If you want to compare bar plots side-by-side, you can use the hue argument. The comparison will be
done based on the third feature specified in this argument.
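A sketch of the hue argument, using a hypothetical table with a third column named sex (all names and values are assumed):

```python
import pandas as pd
import seaborn as sns

# Hypothetical bills with a third feature to compare side by side.
bills = pd.DataFrame({
    "day": ["Sat", "Sat", "Sat", "Sat", "Sun", "Sun", "Sun", "Sun"],
    "sex": ["Male", "Female", "Male", "Female", "Male", "Female", "Male", "Female"],
    "total_bill": [20.0, 18.0, 24.0, 22.0, 30.0, 26.0, 34.0, 28.0],
})

# hue splits every day into one bar per value of the "sex" column.
ax = sns.barplot(x="day", y="total_bill", hue="sex", data=bills)
```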
21. HISTOGRAMS
Histogram Overview:
• Utilizes bars to represent data variation across a range.
• Y-axis indicates data frequency, while the X-axis shows value ranges.
• Bars represent data quantities within specific value ranges.
22. IRIS DATASET
• Let's again use the ‘Iris’ data which contains information about flowers to plot histograms.
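The sketch below shows the basic plt.hist call. To stay self-contained it does not load the real Iris data; it uses synthetic numbers as a stand-in for the sepal widths, so the shape of the plot is only illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for 150 sepal-width measurements (NOT the real Iris data).
rng = np.random.default_rng(0)
sepal_width = rng.normal(loc=3.0, scale=0.4, size=150)

# By default plt.hist splits the value range into 10 equal-width bins.
counts, bin_edges, _ = plt.hist(sepal_width)
plt.xlabel("Sepal width")
plt.ylabel("Frequency")
```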
25. CHANGING NUMBER AND SIZE OF BINS
• We can control the number and size of bins, for example by generating the bin edges with NumPy.
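As a sketch on synthetic stand-in data, the bins argument accepts either a bin count or an array of bin edges such as one produced by np.arange:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
sepal_width = rng.normal(loc=3.0, scale=0.4, size=150)  # synthetic stand-in data

# An integer sets the number of equal-width bins.
counts_5, edges_5, _ = plt.hist(sepal_width, bins=5)

# An array sets the bin edges directly: here 2.0, 2.25, ..., 4.75.
plt.figure()
edges = np.arange(2, 5, 0.25)
counts, used_edges, _ = plt.hist(sepal_width, bins=edges)
```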
26. BINS OF UNEQUAL SIZE
• We can create bins of unequal size too.
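For example, passing a list of edges that are not evenly spaced gives bins of different widths. The edges and the data in this sketch are arbitrary illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
sepal_width = rng.normal(loc=3.0, scale=0.4, size=150)  # synthetic stand-in data

# Three bins: [1, 3), [3, 4) and [4, 4.5]
counts, bin_edges, _ = plt.hist(sepal_width, bins=[1, 3, 4, 4.5])
```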
27. MULTIPLE HISTOGRAMS
• Similar to line charts, we can draw multiple histograms in a single chart. We can reduce each
histogram's opacity so that one histogram's bars don't hide the others'. Let's draw separate histograms
for each species of flowers.
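A sketch of overlapping histograms follows, with three synthetic arrays standing in for the per-species measurements (the real Iris values are not used). The alpha argument sets the opacity.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the sepal widths of the three species.
rng = np.random.default_rng(0)
setosa = rng.normal(3.4, 0.4, 50)
versicolor = rng.normal(2.8, 0.3, 50)
virginica = rng.normal(3.0, 0.3, 50)

bins = np.arange(2, 5, 0.25)  # shared bin edges so the bars line up
plt.hist(setosa, alpha=0.4, bins=bins)
plt.hist(versicolor, alpha=0.4, bins=bins)
plt.hist(virginica, alpha=0.4, bins=bins)
plt.legend(["Setosa", "Versicolor", "Virginica"])
```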
28. STACKING HISTOGRAMS
• Multiple histograms can be stacked on top of one another by setting the stacked parameter to True.
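Stacking could look like the sketch below, again on synthetic stand-in data. The datasets are passed together as a list so Matplotlib can stack them bin by bin.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
setosa = rng.normal(3.4, 0.4, 50)        # synthetic stand-in data
versicolor = rng.normal(2.8, 0.3, 50)
virginica = rng.normal(3.0, 0.3, 50)

bins = np.arange(2, 5, 0.25)
# With stacked=True the returned counts are cumulative across the datasets.
counts, _, _ = plt.hist([setosa, versicolor, virginica], bins=bins, stacked=True)
plt.legend(["Setosa", "Versicolor", "Virginica"])
```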
29. 📊STOCK MARKET ANALYSIS 📈 + PREDICTION USING
LSTM
• Tesla Stock Price, S&P 500 stock data, AMZN, DPZ, BTC, NTFX adjusted
May 2013-May2019
Data Project - Stock Market Analysis
30. TIME SERIES DATA
• Time Series data is a series of data points indexed in time order.
• We will discover and explore data from the stock market, particularly some
technology stocks (Apple, Amazon, Google, and Microsoft). We will learn
how to use yfinance to get stock information, and visualize different aspects
of it using Seaborn and Matplotlib. We will also be predicting future stock
prices through a Long Short Term Memory (LSTM) method!
31. WE'LL BE ANSWERING THE FOLLOWING QUESTIONS
ALONG THE WAY
1.) What was the change in price of the stock over time?
2.) What was the daily return of the stock on average?
3.) What was the moving average of the various stocks?
4.) What was the correlation between different stocks?
5.) How much value do we put at risk by investing in a particular stock?
6.) How can we attempt to predict future stock behavior? (Predicting the closing
stock price of Apple Inc. using LSTM)
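As a rough illustration of questions 2 and 3, the sketch below computes daily returns and a moving average with pandas. The closing prices are hypothetical, and the 3-day window is an arbitrary choice.

```python
import pandas as pd

# Hypothetical closing prices for five consecutive trading days.
close = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0])

# Daily return: percentage change from the previous day's close.
daily_return = close.pct_change()
mean_return = daily_return.mean()  # average daily return (the first NaN is skipped)

# Moving average: mean of the current and the two preceding closes.
moving_avg = close.rolling(window=3).mean()
```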
32. GETTING THE DATA
• The first step is to get the data and load it to memory.
• We will get our stock data from the Yahoo Finance website.
• Yahoo Finance is a rich resource of financial market data and tools to find
compelling investments.
• To get the data from Yahoo Finance, we will use the yfinance library, which
offers a threaded and Pythonic way to download market data from Yahoo.
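A minimal sketch of such a download is shown below. The helper names date_window and download_stock are hypothetical, the one-year window is an assumption, and the call needs the third-party yfinance package plus network access, so it is wrapped in a function instead of being run on import.

```python
from datetime import datetime, timedelta


def date_window(end, years=1):
    """Return (start, end), where start lies `years` * 365 days before end."""
    return end - timedelta(days=365 * years), end


def download_stock(ticker, years=1):
    """Download daily price data for one ticker (hypothetical helper)."""
    import yfinance as yf  # third-party package: pip install yfinance

    start, end = date_window(datetime.now(), years)
    # yf.download returns a pandas DataFrame indexed by date.
    return yf.download(ticker, start=start, end=end)


# Example call (requires network access):
# aapl = download_stock("AAPL")
```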
33. WHAT WAS THE CHANGE IN PRICE OF THE STOCK
OVER TIME?
• In this section we'll go over how to handle requesting stock information with
pandas, and how to analyze basic attributes of a stock.
35. WHAT WAS THE CHANGE IN PRICE OF THE STOCK
OVER TIME?
https://www.kaggle.com/code/faressayah/stock-market-analysis-prediction-using-lstm?scriptVersionId=117825740&cellId=5
37. DESCRIPTIVE STATISTICS ABOUT THE DATA
• .describe() generates descriptive statistics. Descriptive statistics include those that
summarize the central tendency, dispersion, and shape of a dataset’s distribution,
excluding NaN values.
• It analyzes both numeric and object series, as well as DataFrame column sets of mixed data
types. The output varies depending on what is provided.
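A small sketch of .describe() on a hypothetical price table follows. One Close value is deliberately missing to show that NaN values are excluded from the statistics.

```python
import numpy as np
import pandas as pd

# Hypothetical data; one closing price is missing on purpose.
prices = pd.DataFrame({
    "Close": [100.0, 102.0, np.nan, 105.0, 110.0],
    "Volume": [1000, 1200, 900, 1500, 1100],
})

stats = prices.describe()  # rows: count, mean, std, min, 25%, 50%, 75%, max
print(stats)
```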
38. DESCRIPTIVE STATISTICS ABOUT THE DATA
We have only 255 records for one year because the market is closed on weekends and holidays, so those days are not included in the data.
39. INFORMATION ABOUT THE DATA
• .info() method prints information about a DataFrame including the
index dtype and columns, non-null values, and memory usage.
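The sketch below calls .info() on a hypothetical table. Since .info() prints its summary and returns nothing, the second call passes a buffer through the buf parameter to capture the same text as a string.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data with one missing closing price.
prices = pd.DataFrame({
    "Close": [100.0, 102.0, np.nan, 105.0, 110.0],
    "Volume": [1000, 1200, 900, 1500, 1100],
})

prices.info()  # prints index, columns, non-null counts, dtypes, memory usage

# Optional: capture the same summary as text instead of printing it.
buffer = io.StringIO()
prices.info(buf=buffer)
summary = buffer.getvalue()
```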
40. CLOSING PRICE
• The closing price is the last price at which the stock is traded during the regular trading day.
A stock’s closing price is the standard benchmark used by investors to track its performance
over time.
42. VOLUME OF SALES
• Volume is the amount of an asset or security that changes hands over some period of time, often over
the course of a day. For instance, stock trading volume refers to the number of shares of a
security traded between its daily open and close. Trading volume, and changes in volume over
time, are important inputs for technical traders.
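Closing price and volume can be plotted one above the other, as in this sketch. The table is hypothetical; the column names Close and Volume are assumed to match the downloaded data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical prices for five business days (weekends are skipped by freq="B").
dates = pd.date_range("2023-01-02", periods=5, freq="B")
stock = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 105.0, 110.0],
     "Volume": [1000, 1200, 900, 1500, 1100]},
    index=dates,
)

# Two stacked panels sharing the date axis.
fig, (ax_close, ax_volume) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))
stock["Close"].plot(ax=ax_close)
ax_close.set_ylabel("Close")
stock["Volume"].plot(ax=ax_volume)
ax_volume.set_ylabel("Volume")
```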