GitHub - amithash/spectro: Auto music analysis and auto playlist generator

amithash / spectro Public
Notifications You must be signed in to change notification settings
Fork 0
Star 1
Auto music analysis and auto playlist generator
Notifications
Name		Name	Last commit message	Last commit date
Latest commit History 302 Commits
experiments		experiments
include		include
lib		lib
spectradio		spectradio
utils		utils
.gitignore		.gitignore
README		README
SConstruct		SConstruct
setmaxspeed.pl		setmaxspeed.pl
Repository files navigation

===============================================================================
				SPECTRO

			Author: Amithash Prasad
			Email: [email protected]
===============================================================================

Description:
------------

Spectro is a collection of tools which in the end aid in listening to music
similar to an initially predetermined song name. Usage of each tool is provided

Install:
-------
In the root directory, 
scons
sudo scons install
# or scons install # as root

Uninstall
---------
sudo scons install -c
# or scons install -c # as root

===============================================================================
			Requirements
===============================================================================
I cannot right now, describe what all are required. I can state the basic minimum
(probably incomplete) required to compile and use the package:

0. Compilers - gcc, g++, scons (And in turn python)

Decoders
--------
libmpg123
libmpg123-dev
libvorbis0a
libvorbis-dev

Spectgen
--------
libfftw3-3
libfftw3-dev

Histdb
------
libtag1-dev
libtag1c2a 
libtagc0
libtagc0-dev

spectradio
----------
libqt4
libqt4-dev
libphonon-dev 
libphonon4

# To playback, you will also need one of the following
phonon-backend-gstreamer 
phonon-backend-xine
phonon-backend-vlc
#(Kubuntu is usually packaged with the xine or gstreamer backend)
#If you are on a KDE desktop (Like I), then you probably dont care as something should be installed :-)

===============================================================================
				DB Generation
===============================================================================
Run genhistdb 

genhistdb <Music Dir> <Hist DB File> [<Optional number of threads to use>]

1. If "Hist DB File" does not exist (Or was generated some time ago during
   which is no longer compatible), a new one will be created.
2. If "Hist DB File" exists, any music files in "Music Dir" not in the DB
   will be generated and appended to the DB (Update mode).

You can also use the "auto update mode" of genhistdb to add multiple 
music directories to the db.

Please note that, even if you provide number of threads as 1, the program
is pipelined and hence is already parallelized. 

===============================================================================
				Using spectradio
===============================================================================

1. Start up spectradio,
2. Load DB (the Open icon)
3. Browse and select your Music Directory

If you have already done this before, or used genhistdb to generate
/path/to/Music/db.hdb, your collection will show up instantly. 

Else, a DB file will be generated for you. This will take quite a while
depending on the size of your music collection. Just so that you can 
have a ballpark, it takes a bit more than 4 hours for a collection
of 8000 music files. 

4. From here on, you can select a song you want to listen to, then spectradio
   will pick songs similar to that on. (Like pandora, with your own collection).
   You can also search for a specific string in title, aritst or album.

There is a button to switch to player mode which will make spectradio 
to work like a normal music player. I dont know why you would use this,
there are _much_ better music players out there. In other words, use 
Amarok. 

You can also start spectradio from the command line with the path of the hist
db file:
> spectradio ~/Music.hdb ~/experimental.hdb # You get the drift

Pressing the more options button, you can select the distance function.
(This is mostly for the people who want to check out if it performs
 better with other distance functions)

Some features which might need more explanation:
1. The "next" icon skips the current song, and re-predicts the next
   one based on the current song.
2. The "refresh/retry" icon skips the current playing song, and 
   re-predicts the next one based on the previous played song.

Thus the next/retry functions should be understood in the context of
"this is a radio" and not as a music player which would have a static
playlist.

The "Distance choice" lets you choose which distance function is used.
Obviously, this is mostly there to try out different algorithms.

===============================================================================
				Design
===============================================================================

The design follows a pipelined paradim. The pipeline stages are as follows.

 *---------*    Queue    *----------*  Queue   *-----------* In Memory *--------*
 | DECODER |------------>| SPECTGEN |--------->| GENHISTDB |---------->| HISTDB |
 *---------* Float Mono  *----------*  spect   *-----------*  spect    *--------*
              samples                 samples               histograms

The decoder and spectgen stages are in their own respective threads. 
Genhistdb is a function call which initiates the pipeline for each music file,
accumilates the spect histograms in memory, and finally commits the results
to the histdb.

Thus even with nr_threads = 1, on a dual core CPU, you would find genhistdb
to consume around 150% CPU. Setting nr_threads to greater than 1 implies
that there will be multiple decoder and spectgen pipelines (all connecting to
genhistdb) with a guarded (mutex) write to the memory location. This essentially
makes use of the fact that procesessing each music file is independent of the
next and hence can be parallelizable. 

DECODER
-------

The decoder is split in to multiple segments:
1. Decoder backend: Registers the file extensions it can process.
2. Decoder:         Based on the file extension of the input file, it will
		    call the open of the respective registered backend.

Currently I have support for the following backends:
mpg123 - All mpg files.
vorbis - Vorbis media (.ogg)

There exists lib/decoder/decoder_backend_example.c which should aid in
writing backends to new formats. Please note that the term "backend" might
be misleading, as it is just an abstraction to the real libs which do the 
decoding and does not do any decoding on its own. 

I am not motivated enough to write any more backends other than the two.
(All my music files are mp3's, and I wrote the vorbis backend as no 
open-source music application is complete without supporting the open 
format)

SPECTGEN
--------

Spectgen reads the PCM samples from the decoder as floating point mono
samples with the range [-1, +1]. 

It collects enough samples to form an array of SPECT_WINDOW_SIZE.

runs the amalgamated array through fftw to get the fourier transform of 
the window.

Groups the result into signal powers into 24 bands based on the bark.

Multiplies the power in each bark-band with the inverse equal loudness
coefficients (Which is computed for each bark band at init).

queues up the resultant array of size 24.

Shifts by SPECT_STEP_SIZE samples of the input, and restarts.

GENHISTDB
---------

For each mp3 file, starts the above two stages to get the band stream.

For each sample (array of 24), it increments the respective bin.
Note, the histogram has SPECT_HIST_LEN bins, with the minimum value
of SPECT_MIN_VAL and maximum value of SPECT_MAX_VAL. 

Once the stream ends, it checks if there were sufficient samples, and if so,
run the music files through taglib to get the tags, and populates a
hist structure with the result and stores it in memory.

HISTDB FORMAT:
--------------

The final Hist DB in file, has a header which is a structure containing
all configuration and the length of the array of hist elements.
Finally the file contains the array of hist elements. 

CONFIGURATION/TWEAKING:
-----------------------

All mentions of values in CAPITAL_LETTERS are the configuration of
the hist db. These can be tweaked in include/spect-config.h

Note, currently NBANDS cannot be changed. It just exists for the global
define advantage in spect-config.h

===============================================================================
			       Performance
===============================================================================
System:
Intel Core 2 Duo (Dual core CPU)
CPU Governor set to "performance" to run at the maximum frequency: 2101 MHz

Music Files: 8109
Number of sessions set: 4 (The nr_threads parameter to genhistdb)

Time to complete: 4.00 hours
Average time per music file: ~1.77 seconds. 

Average CPU Load: 196% (Fire in all the cylinders)
Note, my CPU kept dropping to low speeds... So I ran this script

sudo ./setmaxspeed.pl

and killed it after genhistdb


Maximum Memory requirements
---------------------------

Average HIST DB size per MP3 ~2kB. This is usually negligable unless
you have close to a billion music files, or running this on a 1980's computer

Per session computation maximum memory:
MAX_SAMPLING_RATE - the maximum sampling rate of all music files.
MAX_LENGTH        - The longest song length in seconds.

MAXIMUM_POSSIBLE_MEMORY_USE = MAX_SAMPLING_RATE * MAX_LENGTH * 8
8 = 4 for float, 2 for each 16bit PCM channel (Assuming stereo). 

Assuming a maximum song length of 40 minutes (Trust me, I have a few of those!):
MAX_LENGTH = 2400
MAX_SAMPLIN_RATE = 44100 (This is generally the default for most music files).
   =   807MB!!!

Now multiply that with the number of sessions. 

Of course, this is the maximum in theory. even with the assumed maximums,
it will never reach that as this assumes that the decoder allocated memory
for all samples were consumed so slowly that the entire pipeline ended
before genhistdb got to process them! :-)

On my dual core CPU, I use an average of 200MB of my memory!
Thats a drastic reduction as samples are consumed as fast as they are 
generated due to the pipelined nature. You might see more memory 
consumption on single core CPUs and performance might be worse due to
constant switching between the decoder, spectgen and genhistdb threads.


===============================================================================
				Caveats
===============================================================================
1. The paths of mp3 files are stored as absolute paths in the .hdb file.
2. The hdb file contains binary floating point data.

Points 1 and 2 combined, implies that
a. You cannot rename/move your music directory. Everything will fail afterwhich.
b. The hdb files is not portable from one machine to another. 

You can probably move your MP3 collection and hdb file to
another machine if and only if:
a. They are of the same architecture (32bit to 32bit, or 64 bit to 64bit).
b. The Music directory is placed in exactly the same absolute path.

It is also recommended that you do not move the binaries around. Some binaries,
are compiled with specific information about the machine like RAM size, and
num cpus. The binaries are also compiled with optimization parameters which
are native to the current CPU. Essentially the compilation is to generate
the least portable and most performant binaries.

===============================================================================
			    Experimentation
===============================================================================
I maintain the experiments/ directory for all sorts of experiments I want to
do. They are mostly not related to the project, but are generally related
to music analysis. 

===============================================================================
			Credit where credit is due
===============================================================================

I started off with the code of moodbar which I no longer use, but it did help
start off the project.

spectradio was started using the base example code of the QT Music player
example code. Of course, it no longer the same, but it was based off it,
and hence my thanks.