Compressing phase vocoder data

Compressing phase vocoder data

by Ben Gillett

Hi,
I have written a phase vocoder, and would like to store its data on disk in this representation.  I'm using an overlap of 4, and the number of bands can be 128, 256, 512 or 1024.  I've found that using 10 bits to represent the amplitude and 6 bits to represent the deviation from the bin frequency gives good quality results.  Currently the phase vocoder data uses twice as many bytes as the equivalent audio file (at 16-bit, 44 kHz).  I would like to compress it so that it uses about the same amount of space as the audio file, while sounding indistinguishable from, or very similar to, the original file.  I would like the decompression algorithm to be very fast, and would like to implement the compression in a couple of days' coding time at most.
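The storage figures above can be checked with a minimal Python sketch. It assumes (this is not stated in the post) that the FFT size is twice the number of bands and that "an overlap of 4" means a hop of one quarter of the FFT size. The function names are hypothetical. Under those assumptions, 16 bits per bin comes to 4 bytes per audio sample per channel, against 2 bytes for 16-bit audio, which matches the factor of two mentioned.

```python
def pack_bin(amp10, dev6):
    """Pack a 10-bit amplitude code and a 6-bit frequency-deviation code
    into one 16-bit word (amplitude in the high bits)."""
    if not 0 <= amp10 < 1024:
        raise ValueError("amplitude code must fit in 10 bits")
    if not 0 <= dev6 < 64:
        raise ValueError("deviation code must fit in 6 bits")
    return (amp10 << 6) | dev6


def unpack_bin(word):
    """Inverse of pack_bin: return (amp10, dev6)."""
    return word >> 6, word & 0x3F


def vocoder_bytes_per_sample(bands, overlap=4, bits_per_bin=16):
    """Bytes of vocoder data produced per audio sample.

    Assumption: FFT size = 2 * bands, hop = FFT size / overlap.
    """
    fft_size = 2 * bands
    hop = fft_size // overlap
    bytes_per_frame = bands * bits_per_bin / 8
    return bytes_per_frame / hop


if __name__ == "__main__":
    for bands in (128, 256, 512, 1024):
        print(bands, vocoder_bytes_per_sample(bands))
```

Because the hop scales with the number of bands, the ratio comes out the same for all four band counts under these assumptions.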
I've searched the web for relevant information, but have drawn a blank so far.  Can anyone point me to relevant information about this problem, or alternatively give me the benefit of their experience?
The things that I was thinking of trying were storing an amplitude gain per time instant, so that fewer bits could be used to represent the amplitude component.  Also, I was considering assigning different numbers of bits to amp and freq depending on the frequency of the bin, on the assumption that higher-frequency bins could be represented less accurately.  I was also wondering whether it might be possible to store amplitude data less frequently, in either the time or frequency dimension, for higher-frequency bins without noticeable degradation in quality.  I was also considering using ADPCM with µ-law compression to compress both the amplitudes and frequencies.  If anyone has thoughts on how well any of the above might work, I'd be very grateful.
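To make the first and last ideas concrete, here is a minimal Python sketch of a per-frame gain combined with µ-law companding of the amplitudes. It is only an illustration under assumptions: the 6-bit code width, µ = 255, and all function names are hypothetical, and it shows plain µ-law companding rather than ADPCM.

```python
import math

MU = 255.0  # assumed companding constant, as in telephony mu-law


def mulaw_encode(x, bits):
    """Map x in [0, 1] to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    y = math.log1p(MU * x) / math.log1p(MU)
    return round(y * levels)


def mulaw_decode(code, bits):
    """Approximate inverse of mulaw_encode."""
    levels = (1 << bits) - 1
    y = code / levels
    return math.expm1(y * math.log1p(MU)) / MU


def encode_frame(amps, bits=6):
    """Store one gain (the frame peak) plus a short code per bin."""
    peak = max(amps, default=0.0) or 1.0  # avoid dividing by zero on silence
    codes = [mulaw_encode(a / peak, bits) for a in amps]
    return peak, codes


def decode_frame(peak, codes, bits=6):
    """Rebuild approximate amplitudes from the gain and the codes."""
    return [peak * mulaw_decode(c, bits) for c in codes]


if __name__ == "__main__":
    frame = [0.5, 0.01, 0.0, 0.25]
    peak, codes = encode_frame(frame)
    print(peak, codes, decode_frame(peak, codes))
```

Decoding costs one table lookup per bin if `mulaw_decode` is precomputed for all codes, which fits the requirement for fast decompression. Whether 6 bits is enough perceptually would need listening tests.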
Thanks
Ben
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp

Re: Compressing phase vocoder data

by Andy Farnell

On Wed, 14 May 2008 11:26:34 +0000 (GMT)
Ben Gillett <ben_j_gillett@...> wrote:

> I was also wondering whether it might be possible to store amplitude
> data less frequently in either the time or frequency dimension for higher
> frequency bins without noticeable degradation in quality.  I was also
> considering using ADPCM with uLaw compression to compress both the
> amplitudes and frequencies.

This seems the most useful data reduction approach. The first thing
that came to mind is fitting breakpoint line envelopes, as is often
done with additive resynthesis data.

You may be able to find a ready solution and hack it to your data
format.
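A minimal sketch of the breakpoint idea in Python, assuming each bin's amplitude is stored as one value per frame. The greedy split-at-worst-point strategy, the tolerance parameter, and the function names are illustrative assumptions, not a reference to any particular additive-synthesis tool.

```python
def fit_breakpoints(track, tol):
    """Pick (frame_index, value) breakpoints so that straight lines between
    them stay within tol of every value in track."""
    n = len(track)
    if n < 3:
        return list(enumerate(track))
    keep = {0, n - 1}
    stack = [(0, n - 1)]
    while stack:
        lo, hi = stack.pop()
        worst, worst_err = None, tol
        for i in range(lo + 1, hi):
            # Value predicted by the straight line from lo to hi.
            interp = track[lo] + (track[hi] - track[lo]) * (i - lo) / (hi - lo)
            err = abs(track[i] - interp)
            if err > worst_err:
                worst, worst_err = i, err
        if worst is not None:
            # Split the segment at the frame with the largest error.
            keep.add(worst)
            stack.append((lo, worst))
            stack.append((worst, hi))
    return [(i, track[i]) for i in sorted(keep)]


def render_breakpoints(points):
    """Rebuild a per-frame track by linear interpolation between breakpoints."""
    if not points:
        return []
    out = []
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for i in range(i0, i1):
            out.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
    out.append(points[-1][1])
    return out


if __name__ == "__main__":
    track = [0.0, 0.5, 1.0, 0.9, 0.8, 0.7, 0.6]
    points = fit_breakpoints(track, tol=0.01)
    print(points)
    print(render_breakpoints(points))
```

Decoding is just linear interpolation, so it stays cheap. The saving depends entirely on how smooth the per-bin tracks are and on how large a tolerance remains inaudible.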

--
Use the source

IOWA music samples - processing required?

by Joey McKay

Hi,

Is anybody on the list familiar with the IOWA music samples? If you
are, do you know how best to process the samples for the purpose of
classification?

This is what I'm trying to do: I'm currently developing a musical
instrument ID system. My literature review showed that the IOWA
samples are a popular database for musical instrument ID research.
However, I haven't come across any literature that details how the
IOWA samples have been processed. For example, the file
"AltoFlute.ff.C4B4.aiff" contains Alto Flute notes played in sequence
across the note range C4 to B4. Unlike the RWC database, there is no
silent gap inserted between the notes in the IOWA samples.

How should I continue? In order to train and test classifiers, I'm
assuming it is best to separate each note from the IOWA sample and
then train on the individual notes. Is this correct? And how does one
separate the notes? Is there an efficient process, manual or automated?
As the onsets, decays (and other dynamics) are important, how does
one decide where to make the cut?

Maybe I'm overcomplicating this issue, but if there is somebody out
there who has preprocessed the IOWA samples and has thought this task
through, then maybe they can assist?

Thank you,

Joey.

Re: IOWA music samples - processing required?

by Frederik Hjorth-Jensen

Hi Joey

I've used the IOWA music samples for my master's project in Instrument
Recognition. I did try to separate individual notes, but only so that I
could use them in a software sampler to produce more musical data. In the
end I didn't have enough time, so I didn't use the separation after all.

I can send you a PDF of my thesis if you're interested. I've also combined
files to get the largest register for each instrument at the different
levels pp, mf and ff. For most instruments I have made rex2 files for note
separation. If you're interested, maybe I can upload this data somewhere so
you can get it.

Right now I'm on vacation without access to the data but I'll be back during
the weekend.

Best regards,

Frederik
