
Spotify Wrapped - Association Rules
This program uses the Apriori algorithm to generate association rules and visualize listening patterns.
Association Rules: Spotify Listening Patterns
Background
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases.
The data used in this analysis is my own Spotify listening history from the past year. Anyone can request their data from Spotify through the account privacy settings. Spotify provides the listening data in JSON format, so I designed an Alteryx workflow to convert the JSON into a data frame that the Apriori algorithm can read.
The Apriori Algorithm
The Apriori algorithm generates association rules from groups of items that frequently appear together in transactions. It is traditionally used in retail market research (market basket analysis) to find products that are commonly bought together, for example to decide which products to place near each other on store shelves.
In this analysis, each transaction is a basket of six songs listened to consecutively. The Alteryx workflow builds these baskets from the listening history.
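The basket step can be sketched in base R. This is only an illustration, not the Alteryx workflow itself: the track names are made up, and it assumes the baskets are consecutive, non-overlapping groups of six plays, which the write-up does not state explicitly.

```r
# Toy play history in listening order (hypothetical track names).
plays <- data.frame(
  track = paste("Track", 1:14),
  stringsAsFactors = FALSE
)

basket_size <- 6

# Assign a basket id to each play: rows 1-6 -> 1, rows 7-12 -> 2, and so on.
# Assumption: baskets are non-overlapping windows of consecutive plays.
plays$basket_id <- (seq_len(nrow(plays)) - 1) %/% basket_size + 1

# Collect the tracks of each basket into a list of character vectors.
baskets <- split(plays$track, plays$basket_id)

# The last basket is shorter when the history length is not a multiple of six.
lengths(baskets)
```

A list of character vectors like `baskets` is a common starting point for association rule packages in R, since each element corresponds to one transaction.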
Key Terms:
- Support: The proportion of transactions (baskets) that contain every item in the rule.
- Confidence: A measure of the reliability of the rule. It is the proportion of transactions containing the rule's left-hand-side items that also contain its right-hand-side item.
- Lift: The ratio of the rule's confidence to the overall frequency of the right-hand-side item. A lift above 1 means the items appear together more often than would be expected if they were independent.
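As a small worked example of the three measures above, the following base R sketch computes them by hand for one rule. The five baskets and the artist labels A to D are invented for illustration and are not taken from the analysis.

```r
# Five toy baskets of artists (hypothetical data, not from the analysis).
baskets <- list(
  c("A", "B", "C"),
  c("A", "B"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B", "D")
)

# Fraction of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Metrics for the rule {A} => {B}.
supp_rule <- support(c("A", "B"), baskets)      # share of baskets with A and B
conf_rule <- supp_rule / support("A", baskets)  # share of A-baskets that also have B
lift_rule <- conf_rule / support("B", baskets)  # confidence relative to B's base rate

round(c(support = supp_rule, confidence = conf_rule, lift = lift_rule), 3)
```

Here A and B share three of five baskets (support 0.6), three of the four baskets with A also contain B (confidence 0.75), and B appears in four of five baskets overall, so the lift is 0.75 / 0.8, slightly below 1.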
Because the data contains a large number of unique tracks and artists, any single rule covers only a small share of the baskets, so a small support threshold was used to evaluate the rules. The basket definition is also a limitation that could be improved: with only six tracks per basket, tracks that are usually played in a fixed order, such as the songs of a musical, are favored by the algorithm.
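To show where the support threshold enters, here is a minimal sketch using the `arules` package on toy baskets. The write-up does not give the thresholds actually used, so the `support`, `confidence`, and `minlen` values below are placeholder assumptions, and the artist names are hypothetical. The code only runs the mining step if `arules` is installed.

```r
# Toy baskets (hypothetical); the real analysis uses six-track baskets.
baskets <- list(
  c("Artist A", "Artist B", "Artist C"),
  c("Artist A", "Artist B"),
  c("Artist A", "Artist C"),
  c("Artist B", "Artist C"),
  c("Artist A", "Artist B", "Artist D")
)

if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)

  # Convert the list of baskets to the transactions class arules expects.
  trans <- as(baskets, "transactions")

  # Threshold values are illustrative assumptions, not the ones used
  # in the analysis. minlen = 2 drops rules with an empty left-hand side.
  rules <- apriori(
    trans,
    parameter = list(support = 0.2, confidence = 0.5, minlen = 2),
    control = list(verbose = FALSE)
  )

  # Show the strongest rules by lift.
  inspect(head(sort(rules, by = "lift"), n = 5))
}
```

With thousands of distinct tracks, a realistic support threshold would likely need to be far smaller than the 0.2 used for this toy data, since each pair of tracks appears in only a tiny fraction of baskets.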
Spotify Wrapped
Andrew Hattling
11/20/2022
Artist Visualization
The following visualization includes the top 100 artist association rules by lift. Lift compares how often artists appear in the same basket with how often they would be expected to appear together if they were independent of each other. For example, a lift of 2 means the artists in a rule appear together twice as often as expected under independence. Clusters form around commonly grouped artists.
plot(spotify_rules, method = "graph", limit = 100, engine = "htmlwidget")
Track Visualization
The following visualization includes the top 100 track association rules by lift. Clusters form around commonly grouped songs.
plot(track_rules, method = "graph", limit = 100, engine = "htmlwidget")