Sentiment Analysis and Data Exploration Spotify Web App

Project Overview

This project uses the Spotify Web API (through the Spotipy library), the Genius API, and Flask to build a web app that performs sentiment analysis on user-selected Spotify playlists. The goal is to provide insight into a playlist's sentiment and to explore its musical attributes. The app was later migrated to an AWS EC2 instance.

Key libraries used:

Data Sources

The data for analysis is sourced from the user’s Spotify playlists, providing attributes such as:

App Initialization

To use Spotipy, I registered the app with Spotify for Developers and obtained a Client ID and Client Secret. Lyrics retrieval similarly required a Genius API token.
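As a rough illustration of how these credentials might be supplied to the app, here is a minimal sketch that reads them from environment variables and fails early if any are missing. The variable names, the `ApiCredentials` class, and `load_credentials` are assumptions made for this example; they are not taken from the project's code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiCredentials:
    """Secrets the app needs; the field names are illustrative only."""
    spotify_client_id: str
    spotify_client_secret: str
    genius_token: str


def load_credentials(env=None):
    """Read credentials from a mapping (defaults to os.environ).

    Raises RuntimeError naming every missing variable, so a misconfigured
    deployment fails at startup rather than on the first API call.
    """
    env = os.environ if env is None else env
    # Assumed variable names; the project may store its secrets differently.
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "GENIUS_ACCESS_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return ApiCredentials(*(env[name] for name in names))
```

Keeping the secrets out of the source tree this way also means the same code can run unchanged on a local machine and on a hosted server.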

Database management is handled using SQLAlchemy, storing the song attributes mentioned previously for efficient access.

Data Fetching & Handling

Users sign in to their Spotify accounts and select a playlist, and Spotipy functions retrieve the playlist's song data. Each song is checked against the MySQL database; if it is not already present, its data is fetched and stored.
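The check-then-fetch step can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` module in place of SQLAlchemy and MySQL, and a stub function in place of the real API calls. The table layout, the `get_or_fetch_song` helper, and the track ID are hypothetical.

```python
import sqlite3


def get_or_fetch_song(conn, track_id, fetch):
    """Return (title, artist) for track_id, calling fetch only on a miss."""
    row = conn.execute(
        "SELECT title, artist FROM songs WHERE track_id = ?", (track_id,)
    ).fetchone()
    if row is not None:
        return row  # already stored, no API call needed
    title, artist = fetch(track_id)  # stand-in for the Spotipy/Genius calls
    conn.execute(
        "INSERT INTO songs (track_id, title, artist) VALUES (?, ?, ?)",
        (track_id, title, artist),
    )
    conn.commit()
    return (title, artist)


# In-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE songs (track_id TEXT PRIMARY KEY, title TEXT, artist TEXT)"
)

calls = []  # records how often the stub "API" is hit


def fake_fetch(track_id):
    calls.append(track_id)
    return ("L.A. NIGHT", "Yasuko Agawa")


first = get_or_fetch_song(conn, "track-1", fake_fetch)   # miss: fetch and store
second = get_or_fetch_song(conn, "track-1", fake_fetch)  # hit: read from table
```

After both calls, `calls` holds a single entry, which is the point of the pattern: a song that appears in several playlists only costs one round of API requests.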

All images that appear throughout this page are based on one of my playlists, Novelty!?.

playlist page tracks table

Playlist Analysis

The analysis includes:

Analysis results

Based on this analysis, the playlist mean polence is slightly higher than the database mean.
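The comparison behind that statement is a difference of two means. Here is a minimal sketch, assuming every song has a single numeric polence score; the scores below are invented for illustration and are not the playlist's real values.

```python
from statistics import mean


def compare_to_baseline(playlist_scores, database_scores):
    """Compare a playlist's mean score with the mean over all stored songs."""
    playlist_mean = mean(playlist_scores)
    database_mean = mean(database_scores)
    return {
        "playlist_mean": playlist_mean,
        "database_mean": database_mean,
        # Positive means the playlist scores higher than the database overall.
        "difference": playlist_mean - database_mean,
    }


# Made-up scores, for illustration only.
result = compare_to_baseline([0.2, 0.4, 0.6], [0.1, 0.3, 0.5, 0.5])
```

A positive `difference` corresponds to a playlist mean that sits above the database mean.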

The songs that contributed the highest polence values were 'Failed at Math(s)' by Panchiko and 'L.A. NIGHT' by Yasuko Agawa. Both are indeed fairly positive. 'L.A. NIGHT' in particular is a jazzy, groovy song about the romanticized allure of a night in Los Angeles, so I believe it belongs near the top.

The songs that contributed the lowest polence values were 'Loverboy' by The Marías and 'Welcome To Heartbreak' by Kanye West. Respectively, they are about ending a relationship with a cheating lover and being unable to find true happiness despite a lavish lifestyle. Both are also reserved instrumentally, so I believe they belong at the bottom of the scale.

The cluster visualization attempts to find groups of songs that are similar in some way. Cluster 0 (dark blue) contains many songs I would classify as downtempo but not necessarily sad; their composition and lyrical sentiment carry a tinge of negativity that could go either way. Cluster 1 (cyan) contains songs that feel sadder to me: many slow, longing tracks, especially love songs. Cluster 2 (yellow) contains more upbeat-sounding songs, many of them very positive. The two principal components do not capture all of the variance; only 46% of the full dataset's variance is represented here, which is something I'll look to improve.
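For readers unfamiliar with how songs end up in groups like these, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) with three clusters. This write-up does not state which clustering algorithm the app uses, so k-means is an assumption, and the two-dimensional points are invented stand-ins for scaled song features.

```python
import math


def assign(points, centroids):
    """Index of the nearest centroid for each point (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; keep it if it has none."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
            )
        else:
            new_centroids.append(old)
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = update(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Invented 2-D feature vectors forming three loose groups.
points = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.1, 0.9), (0.2, 0.8)]
# Fixed starting centroids keep the run deterministic.
labels, centroids = kmeans(points, [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
```

Each point receives the index of its nearest centroid, which is the kind of label a cluster plot would map to a color. The starting centroids are fixed here only so the example gives the same result on every run.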

AWS Migration

The app was migrated to AWS using an EC2 instance for hosting and an RDS instance for database management. The migration consisted of:

Code & Implementation

The source code for this project is available on GitHub. Key files include:

Future Work

Planned improvements:

Project Takeaways

This project reinforced my understanding of data processing, API integration, and clustering techniques. I'd never used AWS before, so using it to host the app has provided me with a deeper understanding of what AWS entails.