[ml] Meeting Notes: Generative Music, Music identification, Restricted Boltzmann Machines, etc.

Mike Schachter mschachter at eigenminds.com
Thu Mar 29 01:04:52 PDT 2012


I'm not gonna be around tomorrow night - maybe Gershon and
others will?

  mike


On Thu, Mar 29, 2012 at 12:12 AM, xone <xone at fromthegut.org> wrote:
> This looks like it was a really productive meeting. When's the next meetup?
>
>
> On Mar 24, 2012, at 10:17 AM, Ian Esten wrote:
>
>> Hi all,
>>
>> It sounds like the goal of the project is generating new music based on a seed piece (or maybe a training set?).
>>
>> If that is the case, I would recommend adding a couple of extra steps. As part of step 3, I would add additional feature detection to extract the musical structure of the audio. Features could include onsets, key detection, chords, etc. The generative music step would operate on this data and this would be resynthesised to make new music.
>>
>> The motivation for these extra features is that without them, you are generating new music based mainly on spectral data. The end result is likely to be very similar to the source and will likely have the same musical events.
>>
>> If this sounds like a good suggestion, libxtract would be a good tool to take a look at.
>>
>> Looking forward to joining in with this!
>>
>> Ian
>>
>> On Mar 23, 2012, at 10:17 PM, gershon bialer <gershon.bialer at gmail.com> wrote:
>>
>>> == Generative Music
>>> We discussed making something for generative music. I'm going to try
>>> to start something with this, and I should push this onto my github.
>>>
>>> This should involve:
>>> 1) Get the raw audio data from a file into a waveform
>>> PyMir (see https://github.com/jsawruk/pymir) does this by
>>> calling ffmpeg (see
>>> https://github.com/jsawruk/pymir/blob/master/pymir/audio/mp3.py).
>>>
>>> 2) Get the spectrogram
>>> We apply a Hamming window (see
>>> http://en.wikipedia.org/wiki/Hamming_function) to each frame and then
>>> take a Fourier transform, which gives the frequency content of the
>>> sound over time. You can see this in action at
>>> https://github.com/jsawruk/pymir/blob/master/pymir/audio/transform.py.
>>>
>>> 3) Additional pre-processing
>>> Some choices are
>>> a) MFCC (see http://en.wikipedia.org/wiki/Mel-frequency_cepstrum)
>>> b) Linear Predictive Coding (see
>>> http://en.wikipedia.org/wiki/Linear_predictive_coding)
>>> c) NMF (see http://en.wikipedia.org/wiki/NMF)
>>> d) Something better?
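>>>
>>> Of these, NMF is the easiest to sketch: it factors a non-negative
>>> matrix V (e.g. a magnitude spectrogram) into W @ H with everything
>>> non-negative. A minimal Lee-Seung multiplicative-update version in
>>> numpy (the rank and iteration count here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, rank, n_iter=200):
    """Lee-Seung multiplicative updates: factor non-negative V ~ W @ H."""
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    for _ in range(n_iter):
        # multiplicative updates keep W and H non-negative throughout
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# demo: recover an exactly rank-2 non-negative matrix
A = rng.random((8, 2))
B = rng.random((2, 6))
V = A @ B
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

>>> For audio, the columns of W end up looking like recurring spectral
>>> shapes and the rows of H like their activations over time.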
>>>
>>> 4) Fit the music to some sort of model for generating the music. The
>>> idea is to predict s_k (pre-processed sound at time k) from
>>> s_{k-1},s_{k-2},..s_{k-l} with some lag.
>>>
>>> 5) Apply the generative model from step 4 to generate music
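>>>
>>> Steps 4 and 5 could start as simply as a linear autoregressive model:
>>> fit s_k as a linear function of the previous l frames by least
>>> squares, then feed predictions back in to generate. A sketch (the lag
>>> and the 1-D toy data are placeholders, not a proposal for the real
>>> feature dimension):

```python
import numpy as np

def fit_ar(frames, lag):
    """Least-squares fit of s_k as a linear function of the previous `lag` frames."""
    X, y = [], []
    for k in range(lag, len(frames)):
        X.append(np.concatenate(frames[k - lag:k]))  # stacked context s_{k-l}..s_{k-1}
        y.append(frames[k])
    coeffs, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return coeffs  # shape: (lag * dim, dim)

def predict_next(frames, coeffs, lag):
    """Predict the next frame from the last `lag` frames."""
    return np.concatenate(frames[-lag:]) @ coeffs

# toy data: a geometric sequence of 1-D "frames"
frames = [np.array([0.9 ** i]) for i in range(30)]
coeffs = fit_ar(frames, lag=2)

# step 5: generate new frames by feeding predictions back in
generated = list(frames)
for _ in range(5):
    generated.append(predict_next(generated, coeffs, lag=2))
```

>>> Anything nonlinear (an RBM, as below, or a recurrent model) slots
>>> into the same predict-then-feed-back loop.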
>>>
>>> 6) Invert pre-processing steps to get a new waveform
>>> This may or may not work very well.
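>>>
>>> If the complex spectra from step 2 are kept (phase included), the
>>> spectrogram step inverts cleanly by overlap-add; generated magnitudes
>>> with no phase are the hard case. A sketch of the invertible pair
>>> (same arbitrary frame/hop sizes as before, not PyMir's code):

```python
import numpy as np

FRAME, HOP = 1024, 512

def stft(signal):
    """Complex short-time Fourier transform with a Hamming window."""
    w = np.hamming(FRAME)
    return np.array([np.fft.rfft(signal[s:s + FRAME] * w)
                     for s in range(0, len(signal) - FRAME + 1, HOP)])

def istft(spectra):
    """Overlap-add inversion; divides out the summed squared window."""
    w = np.hamming(FRAME)
    out = np.zeros(HOP * (len(spectra) - 1) + FRAME)
    norm = np.zeros_like(out)
    for i, spec in enumerate(spectra):
        frame = np.fft.irfft(spec, n=FRAME)
        out[i * HOP:i * HOP + FRAME] += frame * w
        norm[i * HOP:i * HOP + FRAME] += w ** 2
    return out / np.maximum(norm, 1e-8)

# roundtrip check on a short noise burst
sig = np.random.default_rng(1).standard_normal(4096)
rec = istft(stft(sig))
```

>>> With window applied at both analysis and synthesis, the division by
>>> the summed squared window makes the roundtrip exact.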
>>>
>>> == Music Identification
>>> Another interesting project is echoprint. The echoprint project (see
>>> http://echoprint.me/) has code for fingerprinting music. This involves
>>> some sort of preprocessing and then binning. It might be interesting
>>> to improve this.
>>>
>>> The relevant code seems to be at
>>> https://github.com/echonest/echoprint-codegen/tree/master/src in the
>>> SubBandAnalysis and FingerPrint classes. The sub-band class seems to
>>> create a time series of the amplitude of various frequency bands. I
>>> think the FingerPrint class quantizes this data, and applies
>>> MurmurHash. If someone has a better understanding of this, let me
>>> know.
>>>
>>> == Contributing
>>> === PyMir
>>> It is at https://github.com/jsawruk/pymir and depends on numpy and
>>> other Python libraries.
>>> Things to do:
>>> * Add more audio pre-processing functions (MFCC, NMF, LPC, etc.)
>>> * Improve documentation
>>> * Add better unit testing
>>> * Better visualization of audio (this should be fairly easy with
>>> matplotlib's pyplot)
>>> * Direct bindings to the FFMPEG api
>>> === A new library for restricted Boltzmann machine deep learning
>>> The idea would be to create a new C/C++ library for doing deep
>>> learning. Theano has some capabilities, but it isn't as fast as it
>>> could be, and it requires an Nvidia CUDA GPU to be fast. Presumably,
>>> this would follow the ideas of http://deeplearning.net/.
>>>
>>> For linear algebra operations, we could use Armadillo (see
>>> http://arma.sourceforge.net/), or Eigen (see
>>> http://eigen.tuxfamily.org/index.php?title=Main_Page).
>>>
>>> Boost (see http://www.boost.org/) might be useful for its
>>> pseudo-random number generator (see
>>> http://www.boost.org/doc/libs/1_49_0/doc/html/boost_random.html) and
>>> possibly other things.
>>>
>>> You can look over how this is done with Theano at
>>> http://deeplearning.net/tutorial/.
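>>>
>>> For reference, the core of what such a library would implement is
>>> small: a binary RBM trained with one-step contrastive divergence
>>> (CD-1), as in the deeplearning.net tutorial. A numpy sketch to pin
>>> down the math before porting it to C/C++ (sizes and learning rate
>>> are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

class RBM:
    """Binary restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden):
        self.W = rng.normal(0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_step(self, v0, lr=0.1):
        # positive phase: hidden probabilities given the data
        ph0 = self._sigmoid(v0 @ self.W + self.b_h)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # negative phase: one step of Gibbs sampling
        pv1 = self._sigmoid(h0 @ self.W.T + self.b_v)
        ph1 = self._sigmoid(pv1 @ self.W + self.b_h)
        # contrastive divergence updates
        self.W += lr * (v0[:, None] * ph0[None, :] - pv1[:, None] * ph1[None, :])
        self.b_v += lr * (v0 - pv1)
        self.b_h += lr * (ph0 - ph1)

    def reconstruct(self, v):
        ph = self._sigmoid(v @ self.W + self.b_h)
        return self._sigmoid(ph @ self.W.T + self.b_v)

# training on a single pattern should drive reconstruction error down
pattern = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
rbm = RBM(6, 4)
before = np.abs(rbm.reconstruct(pattern) - pattern).mean()
for _ in range(200):
    rbm.train_step(pattern)
after = np.abs(rbm.reconstruct(pattern) - pattern).mean()
```

>>> The matrix products here are exactly where Armadillo or Eigen would
>>> come in on the C++ side, and the sampling is where Boost's random
>>> generators would.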
>>>
>>> == Upcoming conferences, contests, etc.
>>> We are looking at entering PyMir in the ACM Multimedia conference's
>>> open source software competition in Japan (see
>>> http://www.acmmm12.org/call-for-open-source-software-competition/).
>>> I understand there are some other local conferences relating to this
>>> stuff. If you have details, please send them to the list.
>>>
>>> == Next meeting
>>> When does everyone want to meet next?
>>>
>>> ---------------------
>>> Gershon Bialer
>>> _______________________________________________
>>> ml mailing list
>>> ml at lists.noisebridge.net
>>> https://www.noisebridge.net/mailman/listinfo/ml

