@cavedave
Last active December 6, 2019 17:57
Graph of how loud Music Genres are
---
title: "Loudness by Genre"
output: html_notebook
---
An analysis of music by genre to see whether loudness varies.
It is believed that online streaming platforms have reduced loudness. But does this have the same effect across all genres of music?
The dataset covers 26 genres and 232,725 tracks in total.
If we are right and loudness in music production is bad for hearing
but music streaming services like Spotify
Hip-hop, Pop, Rock and Electronic are now the most popular genres in the top 40:
https://www.economist.com/graphic-detail/2018/02/02/popular-music-is-more-collaborative-than-ever
The data is in
https://www.kaggle.com/zaheenhamidani/ultimate-spotify-tracks-db/download
Spotify documents the audio features in its API at
https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
This is digital loudness, which is different from acoustic loudness. This comment explains the difference well: https://www.reddit.com/r/dataisbeautiful/comments/cl5m3a/music_has_gotten_louder_oc/evube0a/?context=3
Digital loudness measures something like the gap between a song's average level and the peak level the digital format allows. Songs whose average sits closer to that peak jump out more on the radio, and this led to the loudness wars.
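To make the idea of a level measured against a digital ceiling concrete, here is a minimal sketch in base R. It assumes loudness can be approximated by the RMS level of the samples relative to full scale (dBFS). That is a simplification chosen for illustration, not Spotify's documented method, and the helper `rms_dbfs` and the two test tones are made up for this example.

```r
# Hypothetical helper: RMS level of a signal in dB relative to full scale (1.0).
# This is an illustrative assumption, not how Spotify computes loudness.
rms_dbfs <- function(samples) {
  20 * log10(sqrt(mean(samples^2)))
}

sample_rate <- 44100
time_s <- (0:(sample_rate - 1)) / sample_rate    # one second of sample times

loud_tone  <- 1.0 * sin(2 * pi * 440 * time_s)   # peaks at the digital ceiling
quiet_tone <- 0.1 * sin(2 * pi * 440 * time_s)   # same tone, a tenth of the amplitude

rms_dbfs(loud_tone)    # about -3 dB
rms_dbfs(quiet_tone)   # about -23 dB
```

Under this assumption, values closer to 0 dB mean the signal sits nearer the ceiling, so most tracks end up with negative loudness values.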
First load some data. This is a version of the full dataset with most of the columns removed. How it is constructed is shown further below.
```{r}
library(stringr)
library(tidyverse)
library(ggplot2)
library(cowplot)
library(readr)
# Reduced dataset: only the genres and columns needed for the plot
df2 <- read.csv("smalldata.csv", encoding = "UTF-8")
head(df2)
```
The raincloud graphs use geom_flat_violin, which is loaded from this gist:
```{r}
source("https://gist.githubusercontent.com/benmarwick/2a1bb0133ff568cbe28d/raw/fb53bd97121f7f9ce947837ef1a4c65a73bffb3f/geom_flat_violin.R")
```
Raincloud plots are introduced at https://micahallen.org/2018/03/15/introducing-raincloud-plots/
and in the paper 'Allen M, Poggiali D, Whitaker K et al. Raincloud plots: a multi-platform tool for robust data visualization' https://wellcomeopenresearch.org/articles/4-63/v1
```{r}
# Raincloud plot: half violin, jittered points and a narrow boxplot per genre
p3 <- ggplot(df2, aes(x = genre, y = loudness, fill = genre)) +
  geom_flat_violin(position = position_nudge(x = .2, y = 0), adjust = 2) +
  geom_point(position = position_jitter(width = .15), size = .25) +
  ylab('Loudness (dB)') + xlab('Genre') + coord_flip() + theme_cowplot() +
  guides(fill = "none") +  # hide the redundant fill legend
  geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
  # geom_boxplot(aes(x = as.numeric(genre), y = loudness),outlier.shape = NA, alpha = 0.3, width = .1, colour = "BLACK") +#+0.25
  # geom_errorbar(data = summary_loud, aes(x = genre, y = Mean, ymin = Mean-ci, ymax = Mean+ci), position = position_nudge(.25), colour = "BLACK", width = 0.1, size = 0.8)+
  ggtitle('Loudness by Genre') +
  theme(plot.title = element_text(hjust = 0.5))
# Pass the plot explicitly so the saved file is always p3
ggsave("GenreR.png", plot = p3, height = 20, width = 20)
p3
```
```{r}
summary_loud<-as.data.frame(tapply(df2$loudness, df2$genre, summary))
```
```{r}
summary_loud<-as.data.frame(summary(df2))
```
```{r}
summary_loud
```
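The commented-out `geom_errorbar` call in the plot chunk expects `summary_loud` to have `genre`, `Mean` and `ci` columns, which the summaries above do not produce. Here is a sketch of one way such a table might be built in base R. The toy data stands in for `df2`, and defining `ci` as the half-width of a 95% t-interval is an assumption, since the notebook does not say how it was meant to be computed.

```r
# Hypothetical toy data standing in for df2 (genre and loudness columns only)
toy <- data.frame(
  genre = c("Rock", "Rock", "Rock", "Pop", "Pop", "Pop"),
  loudness = c(-6, -8, -10, -5, -7, -6),
  stringsAsFactors = FALSE
)

# One row of statistics for a vector of loudness values.
# ci is assumed to be the half-width of a 95% t-interval around the mean.
summarise_genre <- function(x) {
  n <- length(x)
  data.frame(
    N = n,
    Mean = mean(x),
    ci = qt(0.975, df = n - 1) * sd(x) / sqrt(n)
  )
}

# Split loudness by genre, summarise each group, then stack the rows
parts <- lapply(split(toy$loudness, toy$genre), summarise_genre)
summary_loud <- do.call(rbind, parts)
summary_loud$genre <- names(parts)
rownames(summary_loud) <- NULL
summary_loud
```

With a table in this shape, `Mean - ci` and `Mean + ci` give the error bar limits per genre.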
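```{r}
# Full dataset, read from a local copy saved as spotify.csv
df <- read.csv("spotify.csv", encoding = "UTF-8")
head(df)
```
```{r}
# Make sure the first column is named "genre"
names(df)[1] <- "genre"
head(df)
```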
```{r}
#library(lavaan)
df<-dplyr::select(df, -c('artist_name', 'key','mode','time_signature','track_name','track_id','popularity','danceability','acousticness','duration_ms','energy','instrumentalness','liveness','speechiness','tempo','valence'))
```
```{r}
summary(df)
```
This gets the summary statistics (median, mean loudness and so on) for each genre.
```{r}
#summary<-
tapply(df$loudness, df$genre, summary)
```
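The output lists the same genre twice, because the data spells it with two different apostrophes (`'` and `’`):
$`Children's Music`
Min. 1st Qu. Median Mean 3rd Qu. Max.
-36.721 -14.094 -11.286 -11.642 -8.614 0.948
$`Children’s Music`
In pandas this can be fixed with
data['genre'] = data['genre'].str.replace('’','\'')
and in R with
df$genre <- str_replace_all(df$genre, "’", "'")
summary(df)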
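The apostrophe clean-up can be tried on a toy data frame first. This is a minimal sketch: the rows are made up, and base R's `gsub` is swapped in for `str_replace_all` so that it runs without any packages. Note that the result is assigned back to the `genre` column, not to the whole data frame.

```r
# Hypothetical toy data: the same genre spelled with a straight and a curly apostrophe
toy <- data.frame(
  genre = c("Children's Music", "Children\u2019s Music", "Rock"),
  loudness = c(-11.3, -12.1, -7.4),
  stringsAsFactors = FALSE
)

# Replace the curly apostrophe (U+2019) with a straight one in the genre column only
toy$genre <- gsub("\u2019", "'", toy$genre, fixed = TRUE)

# The two spellings now count as one genre
table(toy$genre)
```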
```{r}
head(df)
```
Filter out uncommon genres and ones I think are repetitive, keeping only genres whose names contain Dance, Rock, Pop or Classical.
```{r}
df2 <-filter(df, grepl('Dance|Rock|Pop|Classical', genre))
head(df2)
```
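Because `grepl` matches substrings, the filter keeps any genre whose name contains one of the four words, not only exact matches. The sketch below contrasts the two behaviours. The genre names are placeholders, and "Hypothetical Rock Blend" in particular is invented to show the difference.

```r
# Placeholder genre names; "Hypothetical Rock Blend" is invented for illustration
genres <- c("Pop", "Rock", "Classical", "Dance", "Jazz", "Hypothetical Rock Blend")

# Substring match, as in the filter: TRUE wherever the name contains one of the words
keep_pattern <- grepl("Dance|Rock|Pop|Classical", genres)

# Exact match: TRUE only for names that equal one of the four words
keep_exact <- genres %in% c("Dance", "Rock", "Pop", "Classical")

genres[keep_pattern]
genres[keep_exact]
```

If only the four exact genres are wanted, the `%in%` form is the stricter choice.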
```{r}
# Write CSV in R
write.csv(df2, file = "smalldata.csv", row.names=FALSE)
```
```{r}
# loudness > 1 selects the loudest tracks (those above 1 dB), not the quietest
loudest <- filter(df, loudness > 1)
loudest
```
The quietest songs are Brian Eno, "Neroli" and Shakuhachi Sakano, "Call to Wake".
The loudest are Justice, "We Are Your Friends - Justice Vs Simian" and The Stooges, "Shake Appeal - Iggy Pop Mix".
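One way to pull out both extremes at once is to sort by loudness and take the two ends. This is a sketch on made-up rows: the track names and values are placeholders, and it assumes a data frame that still has a `track_name` column, as the full dataset does before the columns are dropped.

```r
# Hypothetical toy data; names and loudness values are placeholders, not from the dataset
tracks <- data.frame(
  track_name = c("Track A", "Track B", "Track C", "Track D", "Track E"),
  loudness = c(-7.2, -41.5, 0.9, -12.3, -38.0),
  stringsAsFactors = FALSE
)

# Sort from quietest (most negative) to loudest
by_loudness <- tracks[order(tracks$loudness), ]

quietest <- head(by_loudness, 2)   # the two lowest loudness values
loudest  <- tail(by_loudness, 2)   # the two highest loudness values

quietest
loudest
```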
cavedave commented Dec 1, 2019

[Image: GenreR.png, the "Loudness by Genre" raincloud plot]
