SlideShare a Scribd company logo
1 of 5
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 721
Spotify Stream Prediction using Regression Models
Aakash Kapadia1, Tanvi Khasgiwal2, Leena Kirtikar3, Sanjay Pandey4
1,2,3Student, Information Technology Department, Thadomal Shahani Engineering College, Mumbai
4 Asst. Professor, Dept. Information Technology, Thadomal Shahani Engineering College, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - One of the biggest musicplatformsinthe worldis
Spotify. With over a 100 million users per day and over 80
million tracks on it, people have been using this music
streaming app for multiple things like podcasts, motivational
audio and most importantly songs. Spotify is mainly used for
songs as most of the worlds renowned artists to publish their
content so the world can hear it and enjoy it. Being avid users
of Spotify ourselves, we tried to find out what drives the fame
of a song – or even try to understand why people listen to a
specific song. In this research paper and project, we used
multiple algorithms to check what attributes of a song makes
it famous and why does it top the weekly charts. This paper
uses a Spotify Database and performs Exploratory Data
analysis on it to recognize the most influential variables and
then further work on them. We have used algorithms like
Linear Regression, RandomForest, RidgeRegressionandLasso
Regression to compare accuracies of our admired results.
Finally, the accuracy will be compared so that we can
calculate the approximate streams of a song based on the
relevant attributes.
Key Words: Spotify, streams, linear regression, ridge
regression, random forest, lasso regression.
1. INTRODUCTION
We chose a Spotify as our subject because for years as
students, we have been using this app to help in various
ways. Spotify aides us in entertainment with its millions of
songs and several genres. It has many more advantages
which makes it a better music app than other appsthatexist.
Firstly, it remains of the easiest apps to use and thus
multiple age dynamics can be seen as users of the app.
Spotify gives a great music sharing experienceasfriendsand
family can share songs using a shared account. The app also
provides us with a premium version with noadvertisements
which enhances the experience by saving the music offline
and not using network data to stream it. The biggest
advantage is compatibility. It can be used over manydevices
including iOS, Android and Windows.
Finally, the music collection is what drives people on the
app and that’s why mainly the world uses it. We wanted to
know why is a song famous, does the artist on the song
matter, and what attributes of the song can affect the
number of times it has been streamed. To answer our
questions, we have implemented the following in our
research work:
i) Using data visualization plots like bar chart, scatter plot,
and heatmap we find the attributes which affect the streams
of the song.
ii) We then move forward by dropping the attributes like
Artist name, Song name, and its release date as in step 1 we
realize that they do not affect the streams of a song by a
larger scale.
iii) Characteristics of a song like Loudness, Popularityofthe
song, and its energy were the most contributing factors.
iv) Lastly, we take the regression algorithmsmentionedand
predict the streams of the song using the relevant factors.
2. LITERATURE REVIEW
Authors of research paper [2] investigated the connection
between song information, such as key & tempo from the
database of Spotify audio properties, and song popularity as
determined by the numerous streams of Spotify. They
researched four ML algorithms: Linear Regression (LR),
Random Forest (RF), K-means, Clustering & created a highly
accurate model for predicting success of particular songs.
Their research offers a prediction model for figuring out
whether a song will be well-liked by the general public and
uses machine learning to categorize songs according to how
well-liked they are.
In [3], Charts Carlos Vicente S. Araujo Marco A. P. Cristo
Rafael Guisti make predictions on whether an existing well-
liked music will garner greater than normal public interest
and go "viral." They also make predictions about whether
unexpected jumps in popularity will last over time. They
base their conclusions on information from the streaming
service Spotify, using "Most Popular" list as a proxy for
popularity and its "Virals" list as a proxy for interest
increase. Additionally, they take a classification approach to
the issue and use a SVM model predicated on famous data to
forecast interest and vice versa. Finally, they check to see if
acoustic data can be helpful features for both tasks.
In [4], Elena Georgieva, Marcella Suta, and Nicholas Burton
attempted to foretell which songs will top the Billboard Hot
100. They collected dataset of about four thousand popular
& nonpopular songs and used web API of Spotify to extract
the audio properties of each song. Using five machine
learning algorithms, they were able to predict a song's
billboard success with about 75% accuracyonthevalidation
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 722
set. The two most effective methods were neural network
logistic regression.
In [5], Rutger Nijkamp’s, study looks into the connection
between music information, such as the key and tempo of a
song & popularity of song as determined on the basis of
number of song’s streams received on Spotify.Theattribute-
approach was utilized to investigate the potential
explanatory power of song qualities on stream count. 1000
tracks from ten different genres were examined via the
Spotify database API. Regression was used to create a
prediction model. With this research design, the results
indicate that Spotify's audio features have minimal to
average explanation power of greater count of stream.
3. METHODOLOGY
Methodology section of the paper explores the different
methods which we have applied on the Spotify dataset to
predict the number of streams of a song. The dataset is used
in multiple cases of data visualization and predicting an
approximate number. Fig 1 displays a flowchart of the
methodology which has been used and implemented in our
research and includes data cleaning, its display on graphs
and usage of various regression models and obtain
appropriate results.
Fig. 1. Flow diagram of research work
3.1 Data Collection
The data is obtained from [1]. The dataset includes features
like song name, artist, artist followers, genre, popularity,
danceability, energy, loudness and many more.
Fig. 2. Dataset
3.2 Data Preprocessing
In this step we remove the inaccurate or null values directly
from the .csv file of the dataset which might lead to wrong
results. We had to perform less cleaning as the obtained
dataset was nearly obsolete.
3.3 Data Visualization
Figure 6 shows a heat map of the data which wehaveused to
achieve our results. Heatmap map is defined as graphical
notation of large volume of data coded by different colours.
The heat map takes all the variables present and correlates
them with each other at the same time. Figure 7 shows a
histogram of number of times a song was charted vs Song
Name, figure 8 is a bar chart plotting of an artist vs streams
and figure 9 is bar chart reflection of genre and streams.
3.4 Data Analysis
On further evaluation of the heatmap, we came to a
conclusion that the most prominent features are Loudness
and Energy. This can also be verified with the graphs shown
in Figure 3 and 4.
Figure 3 shows us a scatter plot between popularity and
loudness. The graph displays the relation that higher the
loudness, higher the popularity of the song.
Fig. 3. Popularity Vs Loudness
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 723
Figure 4 shows a scatter plot betweenpopularityandenergy
and it shows that higher the energy of the song, more the
popularity.
Fig. 4. Popularity Vs Energy
Figure 5 displays the regression models we have used with
their respective accuracies. The regression model with the
most accuracy is random forest.
Fig. 5. Accuracy Table
Linear Regression is a supervised machine learning
algorithm which predicts the target value on the basis of the
independent variables provided.
According to the accuracies found, we deduced that Linear
Regression algorithm would provide us with the best
possible results for our dataset.
The Hypothesis function of linear regressionisstated below:
y = α1 + α2.x … (A)
here x denotes input training
y = labels to data
α1 = intercept
α2 = coefficient of x
Regression in Random Forest makes use of ensemble
learning method. It operates by construction of many
decision trees during the training periodand providesusthe
mean of the classes as a prediction of all trees.
Here LASSO is used acronym of least absolute shrinkages
selection operator. Extension of it is penalty term related to
cost function. Summation of the coefficients is represented
by this phrase. This particular term confine, forcing the
model to curtail the value of coefficientsinordertominimize
the losses, when the value of coefficients increases from 0 to
1. In contrast to lasso regression, which commonly makes
the coefficient absolute zero, ridge regression never puts
coefficient value as zeros.
The LASSO regressioncombinesstatisticsandMLtoenhance
predictability as well as understandability of the resultant
model.
In essence, a regularised linear regressor is what a ridge
regressor is. In other words, we add a regularized term to
the initial cost function of the linear regressor in order to
drive the learning algorithm to suit the data and help
maintain the weights as low as feasible. The 'alpha'
parameter of the regularised term controls the
regularisation of model, therefore reducing the variance of
the estimations. In some conditions where variables are
independent & correlated, technique of ridge regression is
used for forecasting the coefficients of numerous
regressions.
Cost Function for Ridge Regressor:
Fig. 6. Heat map of Correlation Plot
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 724
Figure 7 shows that Dance Monkey is the most charted song
while Beggin’ is the least charted song.
Fig. 7. Histogram of number of times charted Vs Song
Name
Figure 8 shows that the artists, Maneskin have the highest
streams for their song while Bad Bunny has the lowest.
Fig. 8. Histogram of Artist Vs Streams
Figure 9 shows that the genres, Indie rock Italiano have the
highest streams while trap latino has the lowest.
Fig. 9. Histogram of Genre Vs Streams
4. CONCLUSION
Spotify is a very large platform that is used aroundtheworld
every day by millions of users and thus this research project
helped us to gain more knowledge about it. It was insightful
to know about the attributes that actually affect the streams
of a song and what causes a user to listen to a particular
artist. Moreover, we gained knowledge about various
regression models which can be used for other projects in
the future.
In this research paper, we have successfully predicted the
popularity of songs using relevant attributes, which helped
us gaining our desired accuracy. The most influential factors
that fulfilled our objective to determine streams were
loudness and energy. In the initial part of our research, we
have used techniques like Lasso Regression, Ridge
Regression, Random Forest and Linear Regression. We have
obtained the best results from Random Forest, which came
out to be 97.48%, Hence the most accurateregressionmodel
was Random Forest.
In the future, the research work can be improved with the
help of a dataset with more songs and attributes to gain
better accuracies of the models.
REFERENCES
[1] Dataset:https://www.kaggle.com/datasets/sashankpilla
i/spotify-top-200-charts-20202021
[2] J. S. Gulmatico, J. A. B. Susa, M. A. F. Malbog, A. Acoba, M.
D. Nipas and J. N. Mindoro, "SpotiPred: A Machine
Learning Approach Prediction of Spotify Music
Popularity by Audio Features," 2022 Second
International Conference on Power, Control and
Computing Technologies (ICPC2T), 2022, pp. 1-5, doi:
10.1109/ICPC2T53885.2022.9776765. R. Nicole, “Title
of paper with only first word capitalized,”J.NameStand.
Abbrev., in press.
[3] Araujo, C. S., Cristo, M., & Giusti, R. (2019). Predicting
music popularity on streaming platforms. Anais Do
Simpósio Brasileiro De Computação Musical (SBCM
2019).
Available:
https://doi.org/10.5753/sbcm.2019.10436
[4] Suta, M. (2018, January 1). Hitpredict: Predicting hit
songs using Spotify Data Stanford Computer Science
229:Machinelearning.Academia.edu.Retrieved October
7, 2022.
Available:
https://www.academia.edu/73249006/Hitpredict_Pred
icting_Hit_Songs_Using_Spotify_Data_Stanford_Computer
_Science_229_Machine_Learning
[5] Nijkamp, R. (n.d.). Prediction of product success:
Explaining song popularity by audio features from
Spotify data. Retrieved October 6, 2022.
Available:
https://essay.utwente.nl/75422/1/NIJKAMP_BA_IBA.pd
f
[6] https://www.statisticshowto.com/lasso-regression/
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 725
[7] https://towardsdatascience.com/ridge-regression-for-
better-usage-2f19b3a202db
[8] https://www.analyticsvidhya.com/blog/2021/06/unde
rstanding-random-forest/
[9] http://www.stat.yale.edu/Courses/1997-
98/101/linreg.htm
[10] https://www.mathworks.com/help/stats/what-is-
linear-regression.html

More Related Content

Similar to Spotify Stream Prediction using Regression Models

Turkey music( june 2021).pdf
Turkey music( june 2021).pdfTurkey music( june 2021).pdf
Turkey music( june 2021).pdfSrinivas Kanakala
 
Linear Regression with R programming.pptx
Linear Regression with R programming.pptxLinear Regression with R programming.pptx
Linear Regression with R programming.pptxanshikagoel52
 
IRJET- Musical Instrument Recognition using CNN and SVM
IRJET-  	  Musical Instrument Recognition using CNN and SVMIRJET-  	  Musical Instrument Recognition using CNN and SVM
IRJET- Musical Instrument Recognition using CNN and SVMIRJET Journal
 
Using R for Classification of Large Social Network Data
Using R for Classification of Large Social Network DataUsing R for Classification of Large Social Network Data
Using R for Classification of Large Social Network DataIJCSIS Research Publications
 
Predicting movie success from search
Predicting movie success from searchPredicting movie success from search
Predicting movie success from searchijaia
 
Project on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentProject on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentSiddharth Chaudhary
 
Automatic Music Generation Using Deep Learning
Automatic Music Generation Using Deep LearningAutomatic Music Generation Using Deep Learning
Automatic Music Generation Using Deep LearningIRJET Journal
 
Music Recommendation System
Music Recommendation SystemMusic Recommendation System
Music Recommendation SystemIRJET Journal
 
DATA MINING USING BUSINESS INTELLIGENCE AND SQL
DATA MINING USING BUSINESS INTELLIGENCE AND SQLDATA MINING USING BUSINESS INTELLIGENCE AND SQL
DATA MINING USING BUSINESS INTELLIGENCE AND SQLIRJET Journal
 
Understanding and Supporting Mobile Application Usage
Understanding and Supporting Mobile Application UsageUnderstanding and Supporting Mobile Application Usage
Understanding and Supporting Mobile Application UsageMatthias Böhmer
 
IRJET - Music Generation using Deep Learning
IRJET -  	  Music Generation using Deep LearningIRJET -  	  Music Generation using Deep Learning
IRJET - Music Generation using Deep LearningIRJET Journal
 
IRJET- A Personalized Music Recommendation System
IRJET- A Personalized Music Recommendation SystemIRJET- A Personalized Music Recommendation System
IRJET- A Personalized Music Recommendation SystemIRJET Journal
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningEditor IJCATR
 
Association Rule Mining using RHadoop
Association Rule Mining using RHadoopAssociation Rule Mining using RHadoop
Association Rule Mining using RHadoopIRJET Journal
 
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...IRJET Journal
 
Innovaccer service capabilities with case studies
Innovaccer service capabilities with case studiesInnovaccer service capabilities with case studies
Innovaccer service capabilities with case studiesAbhinav Shashank
 
IRJET- Movie Success Prediction using Popularity Factor from Social Media
IRJET- Movie Success Prediction using Popularity Factor from Social MediaIRJET- Movie Success Prediction using Popularity Factor from Social Media
IRJET- Movie Success Prediction using Popularity Factor from Social MediaIRJET Journal
 
Recommendation Engine Powered by Hadoop
Recommendation Engine Powered by HadoopRecommendation Engine Powered by Hadoop
Recommendation Engine Powered by HadoopPranab Ghosh
 
Recommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab GhoshRecommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab GhoshBigDataCloud
 

Similar to Spotify Stream Prediction using Regression Models (20)

Turkey music( june 2021).pdf
Turkey music( june 2021).pdfTurkey music( june 2021).pdf
Turkey music( june 2021).pdf
 
Linear Regression with R programming.pptx
Linear Regression with R programming.pptxLinear Regression with R programming.pptx
Linear Regression with R programming.pptx
 
IRJET- Musical Instrument Recognition using CNN and SVM
IRJET-  	  Musical Instrument Recognition using CNN and SVMIRJET-  	  Musical Instrument Recognition using CNN and SVM
IRJET- Musical Instrument Recognition using CNN and SVM
 
Using R for Classification of Large Social Network Data
Using R for Classification of Large Social Network DataUsing R for Classification of Large Social Network Data
Using R for Classification of Large Social Network Data
 
Predicting movie success from search
Predicting movie success from searchPredicting movie success from search
Predicting movie success from search
 
Project on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentProject on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environment
 
Automatic Music Generation Using Deep Learning
Automatic Music Generation Using Deep LearningAutomatic Music Generation Using Deep Learning
Automatic Music Generation Using Deep Learning
 
Music Recommendation System
Music Recommendation SystemMusic Recommendation System
Music Recommendation System
 
DATA MINING USING BUSINESS INTELLIGENCE AND SQL
DATA MINING USING BUSINESS INTELLIGENCE AND SQLDATA MINING USING BUSINESS INTELLIGENCE AND SQL
DATA MINING USING BUSINESS INTELLIGENCE AND SQL
 
Emofy
Emofy Emofy
Emofy
 
Understanding and Supporting Mobile Application Usage
Understanding and Supporting Mobile Application UsageUnderstanding and Supporting Mobile Application Usage
Understanding and Supporting Mobile Application Usage
 
IRJET - Music Generation using Deep Learning
IRJET -  	  Music Generation using Deep LearningIRJET -  	  Music Generation using Deep Learning
IRJET - Music Generation using Deep Learning
 
IRJET- A Personalized Music Recommendation System
IRJET- A Personalized Music Recommendation SystemIRJET- A Personalized Music Recommendation System
IRJET- A Personalized Music Recommendation System
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data Mining
 
Association Rule Mining using RHadoop
Association Rule Mining using RHadoopAssociation Rule Mining using RHadoop
Association Rule Mining using RHadoop
 
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
IRJET - Automatic Lip Reading: Classification of Words and Phrases using Conv...
 
Innovaccer service capabilities with case studies
Innovaccer service capabilities with case studiesInnovaccer service capabilities with case studies
Innovaccer service capabilities with case studies
 
IRJET- Movie Success Prediction using Popularity Factor from Social Media
IRJET- Movie Success Prediction using Popularity Factor from Social MediaIRJET- Movie Success Prediction using Popularity Factor from Social Media
IRJET- Movie Success Prediction using Popularity Factor from Social Media
 
Recommendation Engine Powered by Hadoop
Recommendation Engine Powered by HadoopRecommendation Engine Powered by Hadoop
Recommendation Engine Powered by Hadoop
 
Recommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab GhoshRecommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab Ghosh
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Artificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdfArtificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdfKira Dess
 
15-Minute City: A Completely New Horizon
15-Minute City: A Completely New Horizon15-Minute City: A Completely New Horizon
15-Minute City: A Completely New HorizonMorshed Ahmed Rahath
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Toolssoginsider
 
Intro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney UniIntro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney UniR. Sosa
 
Adsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptAdsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptjigup7320
 
Software Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdfSoftware Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdfssuser5c9d4b1
 
UNIT-2 image enhancement.pdf Image Processing Unit 2 AKTU
UNIT-2 image enhancement.pdf Image Processing Unit 2 AKTUUNIT-2 image enhancement.pdf Image Processing Unit 2 AKTU
UNIT-2 image enhancement.pdf Image Processing Unit 2 AKTUankushspencer015
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsMathias Magdowski
 
What is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, FunctionsWhat is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, FunctionsVIEW
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxKarpagam Institute of Teechnology
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfJNTUA
 
engineering chemistry power point presentation
engineering chemistry  power point presentationengineering chemistry  power point presentation
engineering chemistry power point presentationsj9399037128
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...archanaece3
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...drjose256
 
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...Amil baba
 
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas SachpazisSeismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas SachpazisDr.Costas Sachpazis
 
handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailingAshishSingh1301
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxkalpana413121
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxMustafa Ahmed
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
 

Recently uploaded (20)

Artificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdfArtificial intelligence presentation2-171219131633.pdf
Artificial intelligence presentation2-171219131633.pdf
 
15-Minute City: A Completely New Horizon
15-Minute City: A Completely New Horizon15-Minute City: A Completely New Horizon
15-Minute City: A Completely New Horizon
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
 
Intro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney UniIntro to Design (for Engineers) at Sydney Uni
Intro to Design (for Engineers) at Sydney Uni
 
Adsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptAdsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) ppt
 
Software Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdfSoftware Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdf
 
UNIT-2 image enhancement.pdf Image Processing Unit 2 AKTU
UNIT-2 image enhancement.pdf Image Processing Unit 2 AKTUUNIT-2 image enhancement.pdf Image Processing Unit 2 AKTU
UNIT-2 image enhancement.pdf Image Processing Unit 2 AKTU
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility Applications
 
What is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, FunctionsWhat is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, Functions
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptx
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
 
engineering chemistry power point presentation
engineering chemistry  power point presentationengineering chemistry  power point presentation
engineering chemistry power point presentation
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
 
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
 
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas SachpazisSeismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
 
handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailing
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptx
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptx
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...
 

Spotify Stream Prediction using Regression Models

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 721 Spotify Stream Prediction using Regression Models Aakash Kapadia1, Tanvi Khasgiwal2, Leena Kirtikar3, Sanjay Pandey4 1,2,3Student, Information Technology Department, Thadomal Shahani Engineering College, Mumbai 4 Asst. Professor, Dept. Information Technology, Thadomal Shahani Engineering College, Maharashtra, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - One of the biggest musicplatformsinthe worldis Spotify. With over a 100 million users per day and over 80 million tracks on it, people have been using this music streaming app for multiple things like podcasts, motivational audio and most importantly songs. Spotify is mainly used for songs as most of the worlds renowned artists to publish their content so the world can hear it and enjoy it. Being avid users of Spotify ourselves, we tried to find out what drives the fame of a song – or even try to understand why people listen to a specific song. In this research paper and project, we used multiple algorithms to check what attributes of a song makes it famous and why does it top the weekly charts. This paper uses a Spotify Database and performs Exploratory Data analysis on it to recognize the most influential variables and then further work on them. We have used algorithms like Linear Regression, RandomForest, RidgeRegressionandLasso Regression to compare accuracies of our admired results. Finally, the accuracy will be compared so that we can calculate the approximate streams of a song based on the relevant attributes. Key Words: Spotify, streams, linear regression, ridge regression, random forest, lasso regression. 1. INTRODUCTION We chose a Spotify as our subject because for years as students, we have been using this app to help in various ways. Spotify aides us in entertainment with its millions of songs and several genres. It has many more advantages which makes it a better music app than other appsthatexist. Firstly, it remains of the easiest apps to use and thus multiple age dynamics can be seen as users of the app. Spotify gives a great music sharing experienceasfriendsand family can share songs using a shared account. The app also provides us with a premium version with noadvertisements which enhances the experience by saving the music offline and not using network data to stream it. The biggest advantage is compatibility. It can be used over manydevices including iOS, Android and Windows. Finally, the music collection is what drives people on the app and that’s why mainly the world uses it. We wanted to know why is a song famous, does the artist on the song matter, and what attributes of the song can affect the number of times it has been streamed. To answer our questions, we have implemented the following in our research work: i) Using data visualization plots like bar chart, scatter plot, and heatmap we find the attributes which affect the streams of the song. ii) We then move forward by dropping the attributes like Artist name, Song name, and its release date as in step 1 we realize that they do not affect the streams of a song by a larger scale. iii) Characteristics of a song like Loudness, Popularityofthe song, and its energy were the most contributing factors. iv) Lastly, we take the regression algorithmsmentionedand predict the streams of the song using the relevant factors. 2. LITERATURE REVIEW Authors of research paper [2] investigated the connection between song information, such as key & tempo from the database of Spotify audio properties, and song popularity as determined by the numerous streams of Spotify. They researched four ML algorithms: Linear Regression (LR), Random Forest (RF), K-means, Clustering & created a highly accurate model for predicting success of particular songs. Their research offers a prediction model for figuring out whether a song will be well-liked by the general public and uses machine learning to categorize songs according to how well-liked they are. In [3], Charts Carlos Vicente S. Araujo Marco A. P. Cristo Rafael Guisti make predictions on whether an existing well- liked music will garner greater than normal public interest and go "viral." They also make predictions about whether unexpected jumps in popularity will last over time. They base their conclusions on information from the streaming service Spotify, using "Most Popular" list as a proxy for popularity and its "Virals" list as a proxy for interest increase. Additionally, they take a classification approach to the issue and use a SVM model predicated on famous data to forecast interest and vice versa. Finally, they check to see if acoustic data can be helpful features for both tasks. In [4], Elena Georgieva, Marcella Suta, and Nicholas Burton attempted to foretell which songs will top the Billboard Hot 100. They collected dataset of about four thousand popular & nonpopular songs and used web API of Spotify to extract the audio properties of each song. Using five machine learning algorithms, they were able to predict a song's billboard success with about 75% accuracyonthevalidation
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 722 set. The two most effective methods were neural network logistic regression. In [5], Rutger Nijkamp’s, study looks into the connection between music information, such as the key and tempo of a song & popularity of song as determined on the basis of number of song’s streams received on Spotify.Theattribute- approach was utilized to investigate the potential explanatory power of song qualities on stream count. 1000 tracks from ten different genres were examined via the Spotify database API. Regression was used to create a prediction model. With this research design, the results indicate that Spotify's audio features have minimal to average explanation power of greater count of stream. 3. METHODOLOGY Methodology section of the paper explores the different methods which we have applied on the Spotify dataset to predict the number of streams of a song. The dataset is used in multiple cases of data visualization and predicting an approximate number. Fig 1 displays a flowchart of the methodology which has been used and implemented in our research and includes data cleaning, its display on graphs and usage of various regression models and obtain appropriate results. Fig. 1. Flow diagram of research work 3.1 Data Collection The data is obtained from [1]. The dataset includes features like song name, artist, artist followers, genre, popularity, danceability, energy, loudness and many more. Fig. 2. Dataset 3.2 Data Preprocessing In this step we remove the inaccurate or null values directly from the .csv file of the dataset which might lead to wrong results. We had to perform less cleaning as the obtained dataset was nearly obsolete. 3.3 Data Visualization Figure 6 shows a heat map of the data which wehaveused to achieve our results. Heatmap map is defined as graphical notation of large volume of data coded by different colours. The heat map takes all the variables present and correlates them with each other at the same time. Figure 7 shows a histogram of number of times a song was charted vs Song Name, figure 8 is a bar chart plotting of an artist vs streams and figure 9 is bar chart reflection of genre and streams. 3.4 Data Analysis On further evaluation of the heatmap, we came to a conclusion that the most prominent features are Loudness and Energy. This can also be verified with the graphs shown in Figure 3 and 4. Figure 3 shows us a scatter plot between popularity and loudness. The graph displays the relation that higher the loudness, higher the popularity of the song. Fig. 3. Popularity Vs Loudness
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 723 Figure 4 shows a scatter plot betweenpopularityandenergy and it shows that higher the energy of the song, more the popularity. Fig. 4. Popularity Vs Energy Figure 5 displays the regression models we have used with their respective accuracies. The regression model with the most accuracy is random forest. Fig. 5. Accuracy Table Linear Regression is a supervised machine learning algorithm which predicts the target value on the basis of the independent variables provided. According to the accuracies found, we deduced that Linear Regression algorithm would provide us with the best possible results for our dataset. The Hypothesis function of linear regressionisstated below: y = α1 + α2.x … (A) here x denotes input training y = labels to data α1 = intercept α2 = coefficient of x Regression in Random Forest makes use of ensemble learning method. It operates by construction of many decision trees during the training periodand providesusthe mean of the classes as a prediction of all trees. Here LASSO is used acronym of least absolute shrinkages selection operator. Extension of it is penalty term related to cost function. Summation of the coefficients is represented by this phrase. This particular term confine, forcing the model to curtail the value of coefficientsinordertominimize the losses, when the value of coefficients increases from 0 to 1. In contrast to lasso regression, which commonly makes the coefficient absolute zero, ridge regression never puts coefficient value as zeros. The LASSO regressioncombinesstatisticsandMLtoenhance predictability as well as understandability of the resultant model. In essence, a regularised linear regressor is what a ridge regressor is. In other words, we add a regularized term to the initial cost function of the linear regressor in order to drive the learning algorithm to suit the data and help maintain the weights as low as feasible. The 'alpha' parameter of the regularised term controls the regularisation of model, therefore reducing the variance of the estimations. In some conditions where variables are independent & correlated, technique of ridge regression is used for forecasting the coefficients of numerous regressions. Cost Function for Ridge Regressor: Fig. 6. Heat map of Correlation Plot
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 724 Figure 7 shows that Dance Monkey is the most charted song while Beggin’ is the least charted song. Fig. 7. Histogram of number of times charted Vs Song Name Figure 8 shows that the artists, Maneskin have the highest streams for their song while Bad Bunny has the lowest. Fig. 8. Histogram of Artist Vs Streams Figure 9 shows that the genres, Indie rock Italiano have the highest streams while trap latino has the lowest. Fig. 9. Histogram of Genre Vs Streams 4. CONCLUSION Spotify is a very large platform that is used aroundtheworld every day by millions of users and thus this research project helped us to gain more knowledge about it. It was insightful to know about the attributes that actually affect the streams of a song and what causes a user to listen to a particular artist. Moreover, we gained knowledge about various regression models which can be used for other projects in the future. In this research paper, we have successfully predicted the popularity of songs using relevant attributes, which helped us gaining our desired accuracy. The most influential factors that fulfilled our objective to determine streams were loudness and energy. In the initial part of our research, we have used techniques like Lasso Regression, Ridge Regression, Random Forest and Linear Regression. We have obtained the best results from Random Forest, which came out to be 97.48%, Hence the most accurateregressionmodel was Random Forest. In the future, the research work can be improved with the help of a dataset with more songs and attributes to gain better accuracies of the models. REFERENCES [1] Dataset:https://www.kaggle.com/datasets/sashankpilla i/spotify-top-200-charts-20202021 [2] J. S. Gulmatico, J. A. B. Susa, M. A. F. Malbog, A. Acoba, M. D. Nipas and J. N. Mindoro, "SpotiPred: A Machine Learning Approach Prediction of Spotify Music Popularity by Audio Features," 2022 Second International Conference on Power, Control and Computing Technologies (ICPC2T), 2022, pp. 1-5, doi: 10.1109/ICPC2T53885.2022.9776765. R. Nicole, “Title of paper with only first word capitalized,”J.NameStand. Abbrev., in press. [3] Araujo, C. S., Cristo, M., & Giusti, R. (2019). Predicting music popularity on streaming platforms. Anais Do Simpósio Brasileiro De Computação Musical (SBCM 2019). Available: https://doi.org/10.5753/sbcm.2019.10436 [4] Suta, M. (2018, January 1). Hitpredict: Predicting hit songs using Spotify Data Stanford Computer Science 229:Machinelearning.Academia.edu.Retrieved October 7, 2022. Available: https://www.academia.edu/73249006/Hitpredict_Pred icting_Hit_Songs_Using_Spotify_Data_Stanford_Computer _Science_229_Machine_Learning [5] Nijkamp, R. (n.d.). Prediction of product success: Explaining song popularity by audio features from Spotify data. Retrieved October 6, 2022. Available: https://essay.utwente.nl/75422/1/NIJKAMP_BA_IBA.pd f [6] https://www.statisticshowto.com/lasso-regression/
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 11 | Nov 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 725 [7] https://towardsdatascience.com/ridge-regression-for- better-usage-2f19b3a202db [8] https://www.analyticsvidhya.com/blog/2021/06/unde rstanding-random-forest/ [9] http://www.stat.yale.edu/Courses/1997- 98/101/linreg.htm [10] https://www.mathworks.com/help/stats/what-is- linear-regression.html