README - Test data for ICML 2013 - ML4B - Machine Learning for Bioacoustics
===========================================================================

contact :

- Updated version, 28th of May (small correction in the dev set, clarifications on the metadata)

The data for this challenge are copyright of Fernand Deroussen and Jerome Sueur of the Museum National d'Histoire Naturelle; their usage is restricted to this challenge.
The competition test data were graciously provided by Jerome Sueur.

These data were recorded by 3 microphones in the same area, in 3 different forest states (A, B, C : mature, young, open), 16 bits, sampling frequency = 44.1 kHz, format: .wav.
All were recorded the same day, 30 minutes before sunrise, in Vallee Chevreuse (Paris).

The geography of the A, B, C sites (from west to east) is given on this map :

Singing species were identified by Frederic Jiguet (Museum national d'Histoire naturelle, France).

Each test file name contains the name of the site (A, B or C), the date (year, month, day) and the hour, min, sec. of recording; details are given in :

Depraetere M, Pavoine S, Jiguet F, Gasc A, Duvail S, Sueur J - Monitoring animal diversity using acoustic indices: Implementation in a temperate woodland. Ecological Indicators, 13: 46-54

We give below a short groundtruth (1), the metadata (2), and (3) we suggest the use of the given MFCC features, which have been optimized for bird song computation.

1) DEVELOPMENT SET (updated the 28th of May)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Please remember that the DEVELOPMENT set was updated on the 28th of May.
We give the groundtruth of the first day x 3 microphones (A, B, C) of the test set (1 = the species sings, else 0; *: corrected value).
It is given respectively for A_20090324_063100.wav, B_20090324_063100.wav, C_20090324_063100.wav :

Certhia brachydactyla     = 1,1,0
Corvus corone             = 0,1,0
Erithacus rubecula        = 0,1*,1*
Motacilla alba            = 0,0,1*
Parus caeruleus           = 0,0,1
Phasianus colchicus       = 0,0,0*
Sitta europaea            = 1,0,0
Troglodytes troglodytes   = 1,1,0
Turdus merula             = 0,1,1
Turdus philomelos Brehm   = 1,1,1
all other species (26)    = 0,0,0

2) METADATA:
%%%%%%%%%%%%%

We give the weather conditions during the recordings: they may be correlated with some species or noises and may be integrated into your models (* even in official runs *).
Source: Meteo France.
Starting day: the first day of the A, B, or C data set.

The fields are, in this order :
Date, Tmin (degrees C), Tmax (degrees C), Tmoy (mean temperature, degrees C), Rain (mm), TotalSunshine (h), Wind (km/h), WindOrientation, HumidityMax (%), HumidityMin (%)

Dates are given as day numbers, Day 01: 24th of March 2009, Day 02: 25th of March 2009,...

These data are also available in CSV format. The content is :

Day   Tmin  Tmax  Tmoy Rain Sun  Wind  Dir Hmax Hmin
D 01   2.2   9.1   5.6    0   3  12.1  NW    91   47
D 02   3.2  10.6   6.6    5   1  14.6  WNW   94   55
D 03   5.3   9.3   7.5    2   0  13.2  SW    93   75
D 04   5.3  10.7   7.9    3   1  12.1  SW    93   57
D 05   3.9   9.9   6.3    4   4   9.8  SW    93   58
D 06   0.5  10.1   5.6    0   4     6  NNW   94   44
D 07  -0.3  11.7   5.9    0   7     4  NNE   89   35
D 08   2.1  14.7   8.6    0  11   8.5  NNE   75   37
D 09   3.5  16.5  10.4    0  11  11.3  NNE   81   36
D 10   5.9  18.2  12.4    0  11   7.1  NNE   77   48
D 11   5.8  17.3  11.8    0   3     4  WNW   94   53
D 12   5.8  13.4   9.9    0   3   5.8  SW    93   57
D 13   7.7  14.9  11.2    0   2   1.8  NW    90   54
D 14   7.4  20.7  13.9    6  10   6.1  SE    83   35
D 15   9.6  15.7  11.7    0   3   5.8  SW    95   40
D 16     6  12.1   9.2    4   0  13.4  SSW   95   69
D 17  10.7  20.2  14.3    0   7  10.8  SE    95   42
D 18    11  19.2  14.8    0   4     9  SE    86   48
D 19   7.5  19.4  13.3    2   4   5.5  SE    98   43
D 20  10.4  18.5  14.1    0   5   5.8  NW    95   56
D 21   8.1  17.9  12.9    0   5     5  NW    96   56
D 22  11.2  19.6  15.1    7   3   2.1  R     94   52
D 23  11.9  20.4  14.9    1   5   8.5  SE    93   38
D 24   9.1  13.9  11.3    0   1   3.5  SW    93   56
D 25   7.4  15.6  11.3    0   5   6.1  WNW   91   51
D 26   7.3  10.9   9.3    5   0   4.8  WNW   95   78
D 27   9.2  15.2  11.4    0   2   5.6  NNE   95   57
D 28   6.8  19.2  13.3    0   9   8.7  NNW   96   47
D 29   6.7  19.5  13.1    0  10  10.1  NNW   94   44
D 30   6.3    19  13.1    0   9     6  NW    91   44
D 31   5.5  18.4  12.4    0  10   2.6  NE    90   24
D 32   6.3  17.8  12.3    0  13   7.4  E     78   36
D 33   8.4  14.1  10.5    0   2   7.7  SE    90   56
D 34   7.8  14.7  10.9    0   1   2.4  SE    94   57
D 35   6.7    15  10.1   11   3  11.3  SW    94   57
D 36   3.8  13.8   8.4    5   5   6.1  SSW  100   53
D 37   6.4  13.9   9.8    1   3   4.3  WNW   94   53
D 38   4.1  18.2  11.6    0   9   3.1  SE    98   40
D 39   6.1  20.2  13.6    0  12   4.8  NW    95   64
D 40   6.2  16.9  12.4    0   3   3.9  NNW   95   64
D 41   7.5  13.3  10.8    0   0   4.3  NW    95   57
D 42   4.6    15  10.5    0   7     6  NW    98   35
D 43   7.2  15.8  11.6    0   0   6.8  W     89   69
D 44   8.2  16.5  12.8    0   4   7.1  W     90   59
D 45     6  23.8  15.5    0  10   3.7  SSE   96   47
D 46  10.5  18.5  13.7    0   5   6.6  NW    91   49
D 47   9.7  16.1  12.5    0   2   2.7  E     93   66
D 48   9.3  20.2  14.2   10   2   4.8  E     95   62
D 49    13  14.2  13.7   18   0   7.1  NNE   96   87
D 50  11.4  17.8  13.9    1   3   6.8  NE    97   68
D 51  11.1  20.7  15.9    7   2   3.4  NE    99   65
D 52  12.4  18.3  14.8    3   0   2.4  W     96   79
D 53  10.9  14.8    12    3   3   8.4  SW    96   52
D 54   7.3  15.4  11.8    4   1    10  SW    92   75
D 55     9  15.8  11.4    0   2   9.7  SSW   95   68
D 56   7.2  18.7  12.4    0   7   8.4  SW    95   47
D 57   7.4  19.4  13.6    0   5   3.5  SW    96   47
D 58   6.7  22.5  15.4    0   9   1.6  E     96   44
D 59  11.9  21.9  16.9    0   4     4  WNW   92   52
D 60   8.4  19.5  14.8    0  14   5.1  NW    92   37
D 61   9.7  21.8  16.1    0   5     6  NNE   83   49
D 62  12.9  28.3  21.2    0   9   2.3  NE    94   53
D 63  17.9  28.4  22.8    4   9   6.8  NE    89   56
D 64  12.3  16.1  14.3    7   4  12.7  WNW   94   51
D 65   6.8  16.2  11.9    1   3   7.2  W     95   53
D 66  12.2    17  14.5    0   0   5.6  NE    95   76
D 67  11.8  21.9  16.9    0  11   7.7  E     95   40
D 68    12  22.5  17.7    0  15  10.9  NNE   60   35
D 69  11.5    23  17.4    0   9   7.7  NW    76   47
D 70  13.1  25.1  19.5    0  11   9.7  NNE   89   44
D 71  13.8  23.5  18.9    0  14  10.3  N     85   37
D 72  11.4  23.1  17.6    0  11   7.1  N     84   40
D 73   8.9  18.6  14.1    0  11   6.4  NNE   79   41
D 74   7.9  18.6  13.6    2  10   7.2  E     77   43

3) SUGGESTED AUDIO FEATURES:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%

We provide the usual MFCC (Mel Frequency Cepstral Coefficient) features for your classification task (they are available in OCTAVE, MATLAB -V4 format, and CSV on the Kaggle web site).
The ICML4B BIRD TRAIN and TEST data have been processed with the same parameters, which have been tuned for bird songs on several sets
(minimizing the residual energy of the spectral reconstruction from MFCC).

The power cepstrum of a signal is defined as the squared magnitude of the Fourier transform of the logarithm of the squared magnitude of the Fourier transform of the signal (Bogert et al. 1963).

We used the melfcc.m function from LabROSA at Columbia University, available at the following URL:

The melfcc.m function is tuned by the 17 input parameters given below.
It successively calculates three matrices: pspectrum, aspectrum and cepstra.
First, the auditory spectrum (aspectrum) is converted directly into LPC (Linear Predictive Coding) coefficients following the Levinson-Durbin method.
Then the LPC coefficients are directly converted into cepstral values.
The given cepstra matrix therefore holds, for each time frame, the values of 16 cepstral coefficients computed according to the parameters given below.

The suggested MFCC features were chosen to minimize the reconstruction error of the signal (on average over all the species).
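As an illustration of the power cepstrum definition quoted above, here is a minimal Python sketch. It is not the melfcc.m code: the function names `dft` and `power_cepstrum`, the naive O(N^2) transform and the small `eps` added before the logarithm are assumptions made for this example only.

```python
import cmath
import math


def dft(values):
    """Naive O(N^2) discrete Fourier transform (dependency-free; use an FFT in practice)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def power_cepstrum(frame, eps=1e-12):
    """|DFT(log(|DFT(frame)|^2))|^2, following the definition quoted above.

    eps is an assumption of this sketch: it only guards against log(0).
    """
    log_power = [math.log(abs(c) ** 2 + eps) for c in dft(frame)]
    return [abs(c) ** 2 for c in dft(log_power)]


# Toy frame: a unit impulse followed by a weaker copy 5 samples later.
frame = [0.0] * 64
frame[0], frame[5] = 1.0, 0.5
cepstrum = power_cepstrum(frame)

# Index of the largest value, ignoring index 0 and the mirrored upper half.
peak = max(range(1, 32), key=lambda q: cepstrum[q])
print("strongest component at index", peak)
```

On this toy frame the strongest component falls at the delay of the repeated impulse, which is how the cepstrum exposes periodic structure in the log spectrum.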
The scripts we ran are given here :

using these parameters :

% Here is the MATLAB usage of this toolkit, so that you can easily change your features :
% 'samples' and 'sr' must already hold the audio signal and its sampling rate (44100 Hz for these data)
window=[512];
val_fbtype='mel';
val_broaden=0;
val_maxfreq=sr/2;
val_minfreq=0;
val_wintime=window/sr;
val_hoptime=val_wintime/3;
val_numcep=[16];
val_usecmp=[0];
val_dcttype=[3];
val_nbands=[32];
val_dither=[0];
val_lifterexp=[0];
val_sumpower=[1];
val_preemph=[0];
val_modelorder=0;
val_bwidth=[1];
val_useenergy=[1]; % set the first coefficient to log(energy) ; if you prefer to set it to the first MFCC coeff, rerun with val_useenergy=0
cd /NAS2/CNN/METHODES/MFCC_DUFOUR/SCRIPTS/RASTAMAT/
[cepstra,aspectrum,pspectrum] = melfcc(samples, sr, ...
    'wintime',val_wintime, 'hoptime',val_hoptime, 'numcep',val_numcep, ...
    'lifterexp',val_lifterexp, 'sumpower',val_sumpower, 'preemph',val_preemph, ...
    'dither',val_dither, 'minfreq',val_minfreq, 'maxfreq',val_maxfreq, ...
    'nbands',val_nbands, 'bwidth',val_bwidth, 'dcttype',val_dcttype, ...
    'fbtype',val_fbtype, 'usecmp',val_usecmp, 'modelorder',val_modelorder, ...
    'broaden',val_broaden, 'useenergy',val_useenergy);
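To make the timing implied by these parameters concrete, the sketch below redoes the arithmetic in Python for window=512 and sr=44100 Hz. The helper names `frame_layout` and `frame_count` are hypothetical. Two points are assumptions rather than facts from this README: that durations are rounded to whole samples, and that the number of frames follows the common 1 + floor((n - win) / hop) rule.

```python
def frame_layout(sr=44100, window=512, hop_divisor=3):
    """Return window/hop durations (seconds) and sizes (samples) for the settings above."""
    wintime = window / sr            # mirrors val_wintime = window/sr
    hoptime = wintime / hop_divisor  # mirrors val_hoptime = val_wintime/3
    # Assumption: durations are converted back to whole samples by rounding.
    win_samples = round(wintime * sr)
    hop_samples = round(hoptime * sr)
    return wintime, hoptime, win_samples, hop_samples


def frame_count(n_samples, win_samples, hop_samples):
    """Assumed framing rule: one frame per hop, no padding at the end."""
    if n_samples < win_samples:
        return 0
    return 1 + (n_samples - win_samples) // hop_samples


wintime, hoptime, win_samples, hop_samples = frame_layout()
print(f"window: {wintime * 1000:.2f} ms ({win_samples} samples)")
print(f"hop:    {hoptime * 1000:.2f} ms ({hop_samples} samples)")
print("frames in one second of audio:", frame_count(44100, win_samples, hop_samples))
```

With these settings the window lasts about 11.6 ms and the hop about 3.9 ms, so consecutive windows overlap by roughly two thirds.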