-
Notifications
You must be signed in to change notification settings - Fork 0
amithash/spectro
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
=============================================================================== SPECTRO Author: Amithash Prasad Email: [email protected] =============================================================================== Description: ------------ Spectro is a collection of tools which in the end aid in listening to music similar to an initially predetermined song name. Usage of each tool is provided Install: ------- In the root directory, scons sudo scons install # or scons install # as root Uninstall --------- sudo scons install -c # or scons install -c # as root =============================================================================== Requirements =============================================================================== I cannot right now, describe what all are required. I can state the basic minimum (probably incomplete) required to compile and use the package: 0. Compilers - gcc, g++, scons (And in turn python) Decoders -------- libmpg123 libmpg123-dev libvorbis0a libvorbis-dev Spectgen -------- libfftw3-3 libfftw3-dev Histdb ------ libtag1-dev libtag1c2a libtagc0 libtagc0-dev spectradio ---------- libqt4 libqt4-dev libphonon-dev libphonon4 # To playback, you will also need one of the following phonon-backend-gstreamer phonon-backend-xine phonon-backend-vlc #(Kubuntu is usually packaged with the xine or gstreamer backend) #If you are on a KDE desktop (Like I), then you probably dont care as something should be installed :-) =============================================================================== DB Generation =============================================================================== Run genhistdb genhistdb <Music Dir> <Hist DB File> [<Optional number of threads to use>] 1. If "Hist DB File" does not exist (Or was generated some time ago during which is no longer compatible), a new one will be created. 2. If "Hist DB File" exists, any music files in "Music Dir" not in the DB will be generated and appended to the DB (Update mode). You can also use the "auto update mode" of genhistdb to add multiple music directories to the db. Please note that, even if you provide number of threads as 1, the program is pipelined and hence is already parallelized. =============================================================================== Using spectradio =============================================================================== 1. Start up spectradio, 2. Load DB (the Open icon) 3. Browse and select your Music Directory If you have already done this before, or used genhistdb to generate /path/to/Music/db.hdb, your collection will show up instantly. Else, a DB file will be generated for you. This will take quite a while depending on the size of your music collection. Just so that you can have a ballpark, it takes a bit more than 4 hours for a collection of 8000 music files. 4. From here on, you can select a song you want to listen to, then spectradio will pick songs similar to that on. (Like pandora, with your own collection). You can also search for a specific string in title, aritst or album. There is a button to switch to player mode which will make spectradio to work like a normal music player. I dont know why you would use this, there are _much_ better music players out there. In other words, use Amarok. You can also start spectradio from the command line with the path of the hist db file: > spectradio ~/Music.hdb ~/experimental.hdb # You get the drift Pressing the more options button, you can select the distance function. (This is mostly for the people who want to check out if it performs better with other distance functions) Some features which might need more explanation: 1. The "next" icon skips the current song, and re-predicts the next one based on the current song. 2. The "refresh/retry" icon skips the current playing song, and re-predicts the next one based on the previous played song. Thus the next/retry functions should be understood in the context of "this is a radio" and not as a music player which would have a static playlist. The "Distance choice" lets you choose which distance function is used. Obviously, this is mostly there to try out different algorithms. =============================================================================== Design =============================================================================== The design follows a pipelined paradim. The pipeline stages are as follows. *---------* Queue *----------* Queue *-----------* In Memory *--------* | DECODER |------------>| SPECTGEN |--------->| GENHISTDB |---------->| HISTDB | *---------* Float Mono *----------* spect *-----------* spect *--------* samples samples histograms The decoder and spectgen stages are in their own respective threads. Genhistdb is a function call which initiates the pipeline for each music file, accumilates the spect histograms in memory, and finally commits the results to the histdb. Thus even with nr_threads = 1, on a dual core CPU, you would find genhistdb to consume around 150% CPU. Setting nr_threads to greater than 1 implies that there will be multiple decoder and spectgen pipelines (all connecting to genhistdb) with a guarded (mutex) write to the memory location. This essentially makes use of the fact that procesessing each music file is independent of the next and hence can be parallelizable. DECODER ------- The decoder is split in to multiple segments: 1. Decoder backend: Registers the file extensions it can process. 2. Decoder: Based on the file extension of the input file, it will call the open of the respective registered backend. Currently I have support for the following backends: mpg123 - All mpg files. vorbis - Vorbis media (.ogg) There exists lib/decoder/decoder_backend_example.c which should aid in writing backends to new formats. Please note that the term "backend" might be misleading, as it is just an abstraction to the real libs which do the decoding and does not do any decoding on its own. I am not motivated enough to write any more backends other than the two. (All my music files are mp3's, and I wrote the vorbis backend as no open-source music application is complete without supporting the open format) SPECTGEN -------- Spectgen reads the PCM samples from the decoder as floating point mono samples with the range [-1, +1]. It collects enough samples to form an array of SPECT_WINDOW_SIZE. runs the amalgamated array through fftw to get the fourier transform of the window. Groups the result into signal powers into 24 bands based on the bark. Multiplies the power in each bark-band with the inverse equal loudness coefficients (Which is computed for each bark band at init). queues up the resultant array of size 24. Shifts by SPECT_STEP_SIZE samples of the input, and restarts. GENHISTDB --------- For each mp3 file, starts the above two stages to get the band stream. For each sample (array of 24), it increments the respective bin. Note, the histogram has SPECT_HIST_LEN bins, with the minimum value of SPECT_MIN_VAL and maximum value of SPECT_MAX_VAL. Once the stream ends, it checks if there were sufficient samples, and if so, run the music files through taglib to get the tags, and populates a hist structure with the result and stores it in memory. HISTDB FORMAT: -------------- The final Hist DB in file, has a header which is a structure containing all configuration and the length of the array of hist elements. Finally the file contains the array of hist elements. CONFIGURATION/TWEAKING: ----------------------- All mentions of values in CAPITAL_LETTERS are the configuration of the hist db. These can be tweaked in include/spect-config.h Note, currently NBANDS cannot be changed. It just exists for the global define advantage in spect-config.h =============================================================================== Performance =============================================================================== System: Intel Core 2 Duo (Dual core CPU) CPU Governor set to "performance" to run at the maximum frequency: 2101 MHz Music Files: 8109 Number of sessions set: 4 (The nr_threads parameter to genhistdb) Time to complete: 4.00 hours Average time per music file: ~1.77 seconds. Average CPU Load: 196% (Fire in all the cylinders) Note, my CPU kept dropping to low speeds... So I ran this script sudo ./setmaxspeed.pl and killed it after genhistdb Maximum Memory requirements --------------------------- Average HIST DB size per MP3 ~2kB. This is usually negligable unless you have close to a billion music files, or running this on a 1980's computer Per session computation maximum memory: MAX_SAMPLING_RATE - the maximum sampling rate of all music files. MAX_LENGTH - The longest song length in seconds. MAXIMUM_POSSIBLE_MEMORY_USE = MAX_SAMPLING_RATE * MAX_LENGTH * 8 8 = 4 for float, 2 for each 16bit PCM channel (Assuming stereo). Assuming a maximum song length of 40 minutes (Trust me, I have a few of those!): MAX_LENGTH = 2400 MAX_SAMPLIN_RATE = 44100 (This is generally the default for most music files). = 807MB!!! Now multiply that with the number of sessions. Of course, this is the maximum in theory. even with the assumed maximums, it will never reach that as this assumes that the decoder allocated memory for all samples were consumed so slowly that the entire pipeline ended before genhistdb got to process them! :-) On my dual core CPU, I use an average of 200MB of my memory! Thats a drastic reduction as samples are consumed as fast as they are generated due to the pipelined nature. You might see more memory consumption on single core CPUs and performance might be worse due to constant switching between the decoder, spectgen and genhistdb threads. =============================================================================== Caveats =============================================================================== 1. The paths of mp3 files are stored as absolute paths in the .hdb file. 2. The hdb file contains binary floating point data. Points 1 and 2 combined, implies that a. You cannot rename/move your music directory. Everything will fail afterwhich. b. The hdb files is not portable from one machine to another. You can probably move your MP3 collection and hdb file to another machine if and only if: a. They are of the same architecture (32bit to 32bit, or 64 bit to 64bit). b. The Music directory is placed in exactly the same absolute path. It is also recommended that you do not move the binaries around. Some binaries, are compiled with specific information about the machine like RAM size, and num cpus. The binaries are also compiled with optimization parameters which are native to the current CPU. Essentially the compilation is to generate the least portable and most performant binaries. =============================================================================== Experimentation =============================================================================== I maintain the experiments/ directory for all sorts of experiments I want to do. They are mostly not related to the project, but are generally related to music analysis. =============================================================================== Credit where credit is due =============================================================================== I started off with the code of moodbar which I no longer use, but it did help start off the project. spectradio was started using the base example code of the QT Music player example code. Of course, it no longer the same, but it was based off it, and hence my thanks.
About
Auto music analysis and auto playlist generator
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published