TZstatsADS/Fall2016-proj4-jaime-gacitua

Project: Words 4 Music

Term: Fall 2016

  • Data link (courseworks login required)
  • Data description
  • Contributor's name: Jaime Gacitua
  • Project title: GBM to Recommend song lyrics
  • Project summary: A model is proposed to predict song lyrics given song features. A brief description of the model is below.

  • We have a dictionary of 5000 words and 2700 songs.

  • We have a matrix that, for every song, indicates how many times each word appears (bag of words).

  • We also have access to multiple features for every song.

  • The 19 features chosen to predict words are the following:

    1. tempo.median
    2. tempo.var
    3. tatums.median
    4. tatums.var
    5. loudness.median
    6. loudness.var
    7. duration
    8. timbre (median of each of the 12 dimensions, giving 12 features)
  • The word matrix is converted into a binary matrix.

    • If a word is present in a song, the value is 1. Otherwise, 0.
  • For each column (word) of the matrix, a Generalized Boosting Model (GBM) was fitted with a Bernoulli response.

    • The 19 features are the input, and the (0-1) word column is the output.
    • In total, 5000 GBM models are fit, one per word.
  • Parameter tuning was done using cross validation.

    • The error is calculated using the sum of ranks.
  • The model trains in around 30 minutes.

  • The best average sum of ranks achieved was 0.229, and the simplest model for that result was n=100 trees and depth=8.
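
The binarization and rank-based scoring steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names (`binarize`, `rank_words`, `normalized_rank_error`) are hypothetical, and the normalization used here (mean rank of the words present in a song, divided by the vocabulary size) is only an assumed reading of "sum of ranks". The predicted probabilities stand in for the output of the per-word GBM models.

```python
def binarize(counts):
    """Turn a song-by-word count matrix into a 0/1 presence matrix."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


def rank_words(probs):
    """Rank the words of one song by predicted probability (rank 1 = most likely)."""
    order = sorted(range(len(probs)), key=lambda j: -probs[j])
    ranks = [0] * len(probs)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def normalized_rank_error(probs, present):
    """Mean rank of the words present in a song, divided by vocabulary size.

    ASSUMPTION: this normalization is illustrative; the README does not
    spell out the exact formula behind the reported score.
    """
    ranks = rank_words(probs)
    hits = [r for r, p in zip(ranks, present) if p == 1]
    if not hits:
        return None  # song has no words from the dictionary
    return sum(hits) / len(hits) / len(probs)


# Toy data: 2 songs, 4-word dictionary (the real data has 2700 songs, 5000 words).
counts = [[0, 3, 1, 0],
          [2, 0, 0, 0]]
binary = binarize(counts)

# Hypothetical per-word probabilities for the first song,
# standing in for the predictions of the fitted models.
probs_song0 = [0.1, 0.7, 0.2, 0.4]
error_song0 = normalized_rank_error(probs_song0, binary[0])
print(binary, error_song0)
```

Under this reading, a lower value means the words that actually occur in a song were ranked closer to the top of the predicted list.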

Following suggestions by RICH FITZJOHN (@richfitz), this folder is organized as follows.

proj/
├── lib/
├── data/
├── doc/
├── figs/
└── output/

Please see each subfolder for a README file.

About

Fall2016-proj4-jaime-gacitua created by GitHub Classroom
