Spotify Wrapped - Association Rules

This program uses the Apriori algorithm to generate association rules for visualizing listening patterns.

DATA SOURCE

Spotify

YEAR

2022

INDUSTRY

Music

Association Rules: Spotify Listening Patterns

Background

Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases.

The data used in this analysis is my own personal Spotify listening history from the past year. Individuals can request their data from Spotify through their privacy settings. Spotify provides the listening data in JSON format. I designed an Alteryx workflow to convert the JSON data into a data frame that can be read by the Apriori algorithm.

The Apriori Algorithm

The Apriori algorithm generates association rules from groups of items that frequently appear together in transactions. It is traditionally used for market research in retail, for example to place products that are commonly bought together near each other on store shelves.

The Alteryx workflow for this analysis groups songs into baskets of six songs listened to successively.
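
The basket-building step was done in Alteryx, so the following is only a minimal R sketch of the same idea, not the actual workflow. It assumes non-overlapping baskets (the write-up does not say whether the six-song windows overlap), and the track names and the `plays` vector are made up for illustration.

```r
# Hypothetical listening history in play order (track names are made up).
plays <- c("Track A", "Track B", "Track C", "Track D", "Track E", "Track F",
           "Track G", "Track A", "Track B", "Track H", "Track I", "Track J",
           "Track K", "Track L")

basket_size <- 6  # six successive songs per basket, as described above

# Assign each play to a basket: plays 1-6 -> basket 1, plays 7-12 -> basket 2, ...
basket_id <- ceiling(seq_along(plays) / basket_size)

# Split into a list of baskets; unique() drops repeats within a basket,
# because association rule mining treats each basket as a set of items.
baskets <- lapply(split(plays, basket_id), unique)

print(baskets)
```

With 14 plays this produces two full baskets and a final partial basket of two tracks. Whether to keep or drop a trailing partial basket is a design choice this sketch leaves open.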

Key Terms:

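  • Support: The percentage of transactions that include every item in the rule
  • Confidence: A measure of the reliability of the rule. It is the probability that a transaction containing the items on the left-hand side of the rule also contains the item on the right-hand side.
  • Lift: The ratio of the rule's confidence to the share of all transactions that contain the right-hand-side item. A lift above 1 means the item is more likely to appear alongside the left-hand-side items than in the population as a whole.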
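
To make the three key terms concrete, here is a small worked example in base R. The baskets and artist names are hypothetical and are not taken from the real listening data; `support()` is a helper defined only for this illustration.

```r
# Toy baskets (hypothetical artists, not taken from the real data).
baskets <- list(
  c("Artist A", "Artist B", "Artist C"),
  c("Artist A", "Artist B"),
  c("Artist A", "Artist B", "Artist D"),
  c("Artist C", "Artist D"),
  c("Artist C", "Artist E")
)

# Fraction of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "Artist A"  # left-hand side of the rule (antecedent)
rhs <- "Artist B"  # right-hand side of the rule (consequent)

rule_support    <- support(c(lhs, rhs), baskets)            # P(A and B)
rule_confidence <- rule_support / support(lhs, baskets)     # P(B | A)
rule_lift       <- rule_confidence / support(rhs, baskets)  # P(B | A) / P(B)

cat("support:   ", rule_support, "\n")
cat("confidence:", rule_confidence, "\n")
cat("lift:      ", rule_lift, "\n")
```

In this toy data, Artist A and Artist B share three of the five baskets, so the rule has a support of 0.6, a confidence of 1, and a lift of about 1.67. Artist B is therefore roughly 1.67 times as likely to appear in a basket with Artist A as in a basket picked at random.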

Due to the high number of unique tracks and artists, a small support threshold was used to evaluate the rules. The small basket size is a limitation that could be improved: with only six tracks per basket, the algorithm favors songs that are usually listened to in a fixed order, such as the tracks of a musical.
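
As a rough sketch of how thresholds like these might be passed to the algorithm, the example below assumes the arules package, which the plot calls later on this page appear to rely on. The baskets are hypothetical, and the threshold values are illustrative only, since the values used in the analysis are not stated. With thousands of unique tracks the support threshold would need to be far smaller than the one used on this five-basket toy data.

```r
# Toy baskets standing in for the real six-song baskets (hypothetical data).
baskets <- list(
  c("Artist A", "Artist B", "Artist C"),
  c("Artist A", "Artist B"),
  c("Artist A", "Artist B", "Artist D"),
  c("Artist C", "Artist D"),
  c("Artist C", "Artist E")
)

# Illustrative thresholds only; the values used in the analysis are not stated.
min_support    <- 0.2
min_confidence <- 0.5

# Only run the mining step if the arules package is installed.
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)

  # Convert the list of baskets into the transactions class arules expects.
  trans <- as(baskets, "transactions")

  # Mine rules with at least two items (one on each side of the rule).
  rules <- apriori(
    trans,
    parameter = list(supp = min_support, conf = min_confidence, minlen = 2),
    control   = list(verbose = FALSE)
  )

  # Rank by lift, the ordering used for the visualizations on this page.
  rules <- sort(rules, by = "lift")
  inspect(head(rules, 5))
}
```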


Spotify Wrapped

Artist Visualization

The following visualization includes the top 100 artist association rules by lift. Lift measures how much more likely another item is to be in the basket than it is in the population as a whole. For example, a lift of 2 means the second artist is twice as likely to appear in a basket that contains the first artist as in a basket chosen at random. Clusters form around commonly grouped artists.

plot(spotify_rules, method = "graph", limit = 100, engine = "htmlwidget")

Track Visualization

The following visualization includes the top 100 track association rules by lift. Clusters form around commonly grouped songs.

plot(track_rules, method = "graph", limit = 100, engine = "htmlwidget")
© 2023 Andrew Hattling