IEEE Int. Workshop on
Environmental Acoustic Data Mining [EADM]
held in conjunction with 2015 IEEE Int. Conf. on Data Mining (ICDM'2015)
Atlantic City, USA, Nov. 14, 2015
PROGRAM
PROGRAM of the workshop, morning of Nov. 14 (.PDF)
PROCEEDINGS
PROCEEDINGS and DEMO (.PDF and .MP4)
EADM objectives
This workshop aims to bring together researchers and professionals from worldwide academia and industry for showcasing, discussing, and reviewing the whole spectrum of scientific and technological opportunities, challenges, solutions, and emerging applications in environmental acoustic data mining. We also encourage original work based on interdisciplinary research, such as computer science and ecology, where quantitative evidence is available demonstrating the mutual advantage of such an approach.
Data mining methods have become key tools for, among other things, recommendation, alerting, and the management of natural resources. New computational approaches provide analyses such as tracing, propagation, visualization, or simulation, and focus on representing, analyzing, and extracting useful patterns from environmental acoustic data.
This workshop focuses on environmental acoustic data mining, which has advanced significantly in recent years owing to the spread of biodiversity research based on long-term autonomous sound recording all over the world: in deep forests, undersea, in lakes, etc., as well as in habitats with different levels of anthropization, including agricultural lands and urban areas.
During the four sessions of the workshop, large-scale data mining methods will be investigated for soundscape or bioacoustic pattern detection and classification, at both low and high signal-to-noise ratios. The objectives are thus twofold: (a) to produce fast indexing of complex bioacoustic patterns with supervised or unsupervised data mining, and (b) to propose summaries, abstractions, or overviews of soundscape structures as theoretical network or flow analyses, which are required for biodiversity monitoring.
We will focus on methods scaled to environmental surveys using passive acoustics, and on the design of methods for accurate mid-level or high-level feature detection and classification based on advanced signal decomposition, compressed sensing for large-scale analyses, deep neural networks for accurate classification, as well as methods for real-time spatial tracking.
Illustrations will range from cetacean to bird songs, from bat to dolphin biosonars, and to other animals from deep forests and abysses. Biodiversity analysis and environmental protection projects are some of the direct outcomes of these algorithms.
Paper submissions are welcome on the topics below.
This workshop will open discussions to define an original data mining challenge on sound recordings from two natural reserves in Italy. This challenge aims to demonstrate a concrete EADM paradigm and to help develop the standards required to define methodologies for this expanding, but still unexplored, interdisciplinary data science area.
TOPICS
Topics of Interest (but not limited to):
- Environmental Acoustic Data (EAD) acquisition;
- EAD analysis and event detection;
- Habitat and acoustic comparisons;
- Evolution and dynamics of EAD;
- Applications of EAD analysis and mining;
- Acoustic scene analysis in EAD;
- Modeling and analysis of multidimensional EAD;
- Anomaly detection in EAD;
- Collective analysis of EAD / Crowdsourcing;
- Data mining and machine learning on EAD;
- Deep learning for analysis in EAD;
- Unsupervised learning in EAD;
- Data mining on dynamic, heterogeneous and large-scale EAD;
- Real-world applications of EAD;
- Anthropic noise estimation in EAD;
- Habitat quality and biodiversity estimation in EAD;
- Surveillance applications of EAD.
Discussion on categorization of soundscape
An original paradigm for the categorization of terrestrial soundscapes, to be applied to petabytes of continuous recordings, is discussed at EADM. A sample of more than one month of recordings from the SABIOD project (CIBRA, G. Pavan & LSIS, H. Glotin) will soon be distributed for automatic classification into 11 categories.
Spectrograms of some samples of this data set are given below; the data set is also described in this report (click here):
The challenge will open at the end of 2015 and the beginning of 2016. Each challenger will be able to submit two runs per week. Anonymous scores and the max pooler will be published online. Ranking is based on the mean average precision (MAP) metric computed from the similarity scores of the 11 categories (an evaluation script will be distributed in Python/Octave/MATLAB).
Definition of the 11 classes of the coming EADM challenge
- Biophony:
- - Class 1 = 'Insects' (inse) = 'There are at least 10 continuous seconds of high-frequency non-tonal sounds in long series, which may occupy a large frequency band'.
- - Class 2 = 'Low song' (lson) = 'There are at least 10 continuous seconds of structured patterns of different notes below 1.5 kHz'.
- - Class 3 = 'High song' (hson) = 'There are at least 10 continuous seconds of structured patterns of different notes above 1.5 kHz'.
- - Class 4 = 'Low call' (lcal) = 'There are at least 10 continuous seconds of a simple note (possibly in series) below 1.5 kHz (may be from birds or frogs)'.
- - Class 5 = 'High call' (hcal) = 'There are at least 10 continuous seconds of a simple note (possibly in series) above 1.5 kHz (likely all emitted by birds)'.
- - Class 6 = 'Mammals' (mamm) = 'There are at least 10 continuous seconds of low-frequency sounds, tonal or harsh'.
- Anthropophony:
- - Class 7 = 'Plane' (plan) = 'There is at least 1 minute of low-frequency airplane noise (generally below 2 kHz)'.
- Geophony:
- - Class 8 = 'Rain' (rain) = 'There is at least 1 minute of noise, from impulsive drops to continuous wide-band noise'.
- - Class 9 = 'Wind' (wind) = 'There is at least 1 minute of low-frequency noise and/or white noise from leaves'.
- - Class 10 = 'Thunder' (thun) = 'There is at least 1 sample of loud noise generated by strikes'.
- Other:
- - Class 11 = 'Other' (othe) = 'An acoustic event that belongs to none of classes 1 to 10, of any duration, with a peak absolute amplitude at least double that of the background in the same frequency band as the event'.
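The 11 class definitions above can be collected into a small lookup table, which may help when writing a detector or checking annotations. The following Python sketch is only an illustration: the names `SoundClass`, `CLASSES`, `BY_CODE` and `long_enough` are hypothetical, and encoding 'at least 1 sample' (thun) and 'any duration' (othe) as a minimum of 0 seconds is an assumption, not part of the challenge definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundClass:
    class_id: int          # 1 to 11, as in the definitions above
    code: str              # four-letter abbreviation
    name: str
    group: str             # Biophony, Anthropophony, Geophony or Other
    min_duration_s: float  # 0.0 = no minimum in seconds is stated (assumption)


CLASSES = (
    SoundClass(1, "inse", "Insects", "Biophony", 10.0),
    SoundClass(2, "lson", "Low song", "Biophony", 10.0),
    SoundClass(3, "hson", "High song", "Biophony", 10.0),
    SoundClass(4, "lcal", "Low call", "Biophony", 10.0),
    SoundClass(5, "hcal", "High call", "Biophony", 10.0),
    SoundClass(6, "mamm", "Mammals", "Biophony", 10.0),
    SoundClass(7, "plan", "Plane", "Anthropophony", 60.0),
    SoundClass(8, "rain", "Rain", "Geophony", 60.0),
    SoundClass(9, "wind", "Wind", "Geophony", 60.0),
    SoundClass(10, "thun", "Thunder", "Geophony", 0.0),
    SoundClass(11, "othe", "Other", "Other", 0.0),
)

# Index the table by abbreviation for quick lookup.
BY_CODE = {c.code: c for c in CLASSES}


def long_enough(code: str, duration_s: float) -> bool:
    """Return True if a continuous event of duration_s seconds reaches
    the minimum duration stated for the class with this abbreviation."""
    return duration_s >= BY_CODE[code].min_duration_s
```

For instance, `long_enough("plan", 45.0)` is false because the 'Plane' class requires at least 1 minute, whereas `long_enough("inse", 12.0)` is true.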
Format of the run
For each category and each file, the participant has to produce a table with a similarity score between 0 and 1 (1 for high similarity), the start time, and the cumulative estimated duration of the category. Thus each prediction item (i.e. each line of the run file, which has 2000*11 lines and 5 columns) has to respect the following .csv format:
- file number N ; ClassId 1; probability of ClassId 1 ; start time ClassId 1 in sec.; cumulative duration of classId 1 in sec.
- ...
- file number N ; ClassId 11; probability of ClassId 11; start time ClassId 11 in sec.; cumulative duration of classId 11 in sec.
- ...
Here is an example of a run file:
- 1;inse;0.323;12.14;3.20
- ...
- 1;rain;0.452;32.34;123.45
- ...
- 1;thun;0.000;0.00;0.00
- ...
- 2000;rain;0.932;0.23;300.43
- 2000;thun;0.561;0.30;1.54
- 2000;othe;0.916;300.21;123.45
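Before submitting, a run file in the format above can be checked with a few lines of Python. This is a minimal sketch and not the official evaluation script: it assumes that the class column holds the four-letter abbreviation (as in the example), that files are numbered from 1 to 2000, and that every (file, class) pair appears exactly once. The names `Prediction`, `parse_run_line` and `check_run` are hypothetical.

```python
from typing import Iterable, List, NamedTuple

# Four-letter abbreviations of the 11 classes.
CLASS_CODES = ("inse", "lson", "hson", "lcal", "hcal", "mamm",
               "plan", "rain", "wind", "thun", "othe")


class Prediction(NamedTuple):
    file_number: int
    class_code: str
    score: float       # similarity score in [0, 1]
    start_s: float     # start time in seconds
    duration_s: float  # cumulative duration in seconds


def parse_run_line(line: str) -> Prediction:
    """Parse one 'file;class;score;start;duration' line and validate it."""
    fields = [f.strip() for f in line.strip().split(";")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    file_number = int(fields[0])
    class_code = fields[1]
    score, start_s, duration_s = (float(x) for x in fields[2:])
    if class_code not in CLASS_CODES:
        raise ValueError(f"unknown class {class_code!r}")
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of [0, 1]: {score}")
    if start_s < 0.0 or duration_s < 0.0:
        raise ValueError("negative start time or duration")
    return Prediction(file_number, class_code, score, start_s, duration_s)


def check_run(lines: Iterable[str], n_files: int = 2000) -> List[Prediction]:
    """Parse a whole run and check that it has one line per (file, class)."""
    preds = [parse_run_line(line) for line in lines if line.strip()]
    seen = {(p.file_number, p.class_code) for p in preds}
    if len(seen) != len(preds):
        raise ValueError("duplicate (file, class) pair in run")
    expected = {(n, c) for n in range(1, n_files + 1) for c in CLASS_CODES}
    if seen != expected:
        raise ValueError(f"{len(expected - seen)} missing and "
                         f"{len(seen - expected)} unexpected (file, class) pairs")
    return preds
```

A run stored in a hypothetical file `run1.csv` could then be checked with `check_run(open("run1.csv"))`, which raises `ValueError` on the first problem found.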
The final ranking of the challengers will be based on the Mean Average Precision on the 11 classes.
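As an illustration of the ranking metric, the sketch below computes Mean Average Precision in its usual information-retrieval form: for each class, files are sorted by decreasing similarity score, the precision is taken at the rank of each file that truly contains the class, and these precisions are averaged; MAP is the mean over classes. The official evaluation script may differ in details; in particular, the tie handling (file order) and the score of 0 for a class with no positive file are assumptions, and the function names are hypothetical.

```python
from typing import Dict, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Average precision for one class.

    scores[i] is the similarity score given to file i,
    labels[i] is True if the class is really present in file i.
    """
    # Rank files by decreasing score; sorted() is stable, so ties keep file order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: a class with no positive file contributes 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(scores_by_class: Dict[str, Sequence[float]],
                           labels_by_class: Dict[str, Sequence[bool]]) -> float:
    """Mean of the per-class average precisions."""
    if not scores_by_class:
        raise ValueError("no class to evaluate")
    aps = [average_precision(scores_by_class[c], labels_by_class[c])
           for c in scores_by_class]
    return sum(aps) / len(aps)
```

With four files scored 0.9, 0.8, 0.3 and 0.1 for one class, of which the first and third truly contain it, the positives are found at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2 = 5/6.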
Each submitted run must be accompanied by a short readme text file describing in a few words: 'signal representation for each class', 'detector definition for each class', and 'CPU time for each class, specifying the machine type'.
Challengers are invited to send a working note (3 pages) describing their method to be published in the IEEE proceedings.
Challenge Registration :
Register for free or ask for details by simple mail to glotin ( a t ) univ-tln.fr
CALL FOR PAPER / SUBMISSION
Papers on any of the EADM topics are welcome. The paper review will be consistent with standard double-blind practice, with rigorous peer reviewing by at least 3 peer reviewers. Papers will be selected based on their originality, significance, relevance, technical content, and clarity of presentation.
A journal special issue is being organized with Elsevier, based on the papers presented at EADM.
Full paper submissions (August 12) should be around 3 to 5 pages, with a maximum of 10 pages. They must follow the IEEE ICDM format.
Keynote paper submissions (Sept. 30) on the EADM challenge should be around 2 to 4 pages (maximum 8 pages) and must follow the IEEE ICDM format.
More detailed information: IEEE ICDM 2015 Submission Instructions.
Please submit your manuscript through the EADM IEEE ICDM 2015 submission site.
All accepted papers will be included in the ICDM'15 Workshop Proceedings published by the IEEE Computer Society Press.
IMPORTANT DATES
- Data release (train and test data sets) starts: August 16 (please register for free by simple mail to glotin@univ-tln.fr)
- Full paper submissions due (independent of the challenge): August 12 (* last extension)
- Notifications of acceptance of the full paper: September 1
- Camera ready paper: October 30
- Workshop day: November 14, morning
WORKSHOP ORGANIZATION
Place
The EADM workshop takes place within the ICDM 2015 conference. Registration and accommodation information is available here at ICDM.
Workshop Chair
- Prof. Hervé Glotin, Univ. Toulon, LSIS CNRS and Institut Univ. de France, glotin at univ-tln.fr, [PI]
- Prof. Gianni Pavan, Univ. of Pavia, Italy, gianni.pavan at unipv.it
- Dr. Peter Dugan, Cornell Univ, head of data analytic system, Bioacoustics Research Program, USA, peterdugan68 at gmail.com
- Dr. Zhong-Qiu Zhao, Hefei University of Technology, China, zhongqiuzhao at gmail.com
Program Committee
- Emmanuel Bruno, LSIS CNRS, Toulon univ., FR
- Peter Dugan, Cornell univ., USA
- Karl-Heinz Frommolt, Biodiversity Leibniz-Institut, Berlin univ., DE
- Hervé Glotin, IUF, LSIS CNRS, Toulon univ., FR
- Alexis Joly, INRIA Zenith, FR
- Sébastien Paris, LSIS CNRS, Aix-Marseille univ., FR
- Gianni Pavan, CIBRA, Pavia univ., IT
- Joseph Razik, LSIS CNRS, Toulon univ., FR
- Zhong-Qiu Zhao, Hefei univ., CN
Program
This workshop is planned to be a half-day event, including a keynote and two oral sessions. The preliminary program is:
- Welcome (5 min)
- Invited speaker (60 min)
- Oral Session I (paper presentations) (60 min)
- Challenge session (40 min)
- Oral Session II (paper presentations) (60 min)
- General discussion / presentation of the call for the journal special issue on this workshop with Elsevier / Closing
SHORT BIO. OF CHAIR MEMBERS
Prof. Hervé Glotin
Hervé Glotin is a Professor at the Institut Universitaire de France and Univ. of Toulon, in the Systems & Information Sciences CNRS lab. He is leading the Scaled Acoustic Biodiversity Project for the Big Data French National Research Council (http://sabiod.org). He carried out his PhD at the Inst. of Perceptual Artificial Intelligence (IDIAP), CH and Inst. of Spoken Communication - Perception Team Grenoble on "Robust adaptive multi-stream automatic speech recognition using voicing and localization cues". In 2000 he was involved as an expert at the Johns Hopkins CSLP lab with the IBM human language team in audiovisual Large Vocabulary Speech Recognition. He became an assistant professor at the University of Toulon in 2003, where his research focuses on multimodal pattern analysis and retrieval systems, audiovisual indexing, cognitive models and bioacoustics. He is the co-author of one hundred international refereed articles, and of a USA patent on a real-time bio-acoustic indexing algorithm. He was the general chair of the ICML 2013, NIPS 2013 and ICML 2014 workshops on machine learning for bioacoustics ( http://sabiod.org/events.html ).
Selected publications:
- - Bartcus, Chamroukhi, Glotin, 'Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis', IJCNN 2015
- - Lellouch, Pavoine, Jiguet, Glotin, Sueur, 'Monitoring temporal change of bird communities with dissimilarity acoustic indices'. Methods in Ecology and Evolution, V4, (2014)
- - Paris S., Glotin et al.(2013) Physeter catodon localization by sparse coding, ICML for Bioacoustics workshop
- - Glotin H. (2013), Soundscape Semiotics, Ed. Intech, 220 pp.
- - Glotin H., Clark C., LeCun Y., Dugan P., Halkias X. and Sueur J., (2013), Proc. of the 1st wkp on Machine Learning for Bioacoustics, V1, 104 p, joint to Int. Conf. on Machine Learning, ICML 2013, Atlanta, ISBN 979-10-90821-02-6
- - Glotin H., Mallat, Artières, Lecun, Halkias X (2013), Proc. of the 1st wkp on Neural Inf. Proc. for Bioacoustics, Vol.1, joint to Int. Conf. NIPS
- - Halkias X., Paris S., Glotin H. (2014) Classification of Mysticete Sounds using Machine Learning Techniques, Jour. Acoustical Society of America (JASA)
- - Glotin H., Caudal F, Giraudet P. (2008-14) Whales cocktail party: a real-time tracking of multiple whales, int. Journal Canadian Acoustics, V.36(1), ISSN 0711-6659, ONLINE DEMO : http://sabiod.org , also published as USA Patent
Prof. Gianni Pavan
Gianni Pavan was Professor of Ecology at the University of Architecture of Venice (1994-2006) and has been Professor of Bioacoustics at the University of Pavia since 2006. He is President of the Bioacoustic and Environmental Research Interdisciplinary Center of the University of Pavia. He started working on bioacoustics in 1980 and is an expert in computational bioacoustics, developing software and equipment for acoustic monitoring of terrestrial and marine habitats; after many years of work on marine mammals, he is now mainly involved in the study of terrestrial soundscapes.
Recent publications:
- - Polidori C., Pavan G., Ruffato G., D’Asís J., Josè Tormos J.; 2013. Common features and species-specific differences in stridulatory organ and stridulation patterns of velvet ants (Hymenoptera: Mutillidae). ZOOLOGISCHER ANZEIGER 252(4): 457-468.
- - Obrist M.K., Pavan G., Sueur J., Riede K., Llusia D. and Márquez R., 2010. Bioacoustic approaches in biodiversity inventories. In: Manual on Field Recording Techniques and Protocols for All Taxa Biodiversity Inventories, Abc Taxa, Vol. 8: 68-99. ISSN 1784-1283
- - Pavan G., 2012. Paesaggi sonori terrestri e marini. In: “Filogenesi e ontogenesi della musica”, a cura di Avanzini G., Longo T., Majno M., Malavasi S., Martinelli D., pp 45-54. Franco Angeli Editore. ISBN 978-88-204-1130-5.
- - Pavan G., Fossati C., Caltavuturo G., 2013. Marine Bioacoustics and Computational Bioacoustics at the University of Pavia, in “Detection Classification and Localization of Marine Mammals using passive acoustics. 2003-2013: 10 years of international research.”, Adam O., Samaran F. Ed.,2013, ISBN 978-2-7466-6118-9.
- - Pavan G., 2013. Listening Underwater. In: On Listening (Edited by Carlyle A. & Lane C.), Uniformbooks: 63-70. ISBN 978-1910010-01-3.
Dr. Peter Dugan
Peter Dugan is currently the PI on the National Oceanic Partnership (NOPP) Grant focusing on detection, classification and localization of marine mammals. He received his PhD in Electrical Engineering and Combined behavioral biology from Binghamton University in NY. Prior to Cornell University he held positions in industry at companies such as Hughes Link Flight Simulation and Lockheed Martin. His interests and motivations include the research and development of computationally intelligent systems that combine traditional "shallow systems" with "deep learning systems" for object detection and classification in order to enhance system accuracy. The NOPP grant was awarded $1M for the years 2012-2015. As the PI, his goal is to investigate new approaches and deliver comparative studies, working in integrated teams representing Science, Technology, Engineering and Mathematics (STEM). He is the head of the data analytic system at the Cornell Bioacoustics Research Program (BRP). This project, funded through ONR, applies high-performance computing systems to explore the spatio-temporal dynamics of a suite of acoustically active marine mammals (fin, humpback, minke, and right whales).
Recent publications:
- - Dugan, P., M. Pourhomayoun, Y. Shiu, R. Paradis, A. Rice, and C. Clark. 2013. Using high performance computing to explore large complex bioacoustic soundscapes: case study for right whale acoustics. Procedia Computer Science 20.
- - Rice, Aaron N., Peter J. Dugan, Christopher W. Clark. 2010. Finding the calls of whales in a sea of sound. Sea Technology 51(10).
Dr. Zhong-Qiu Zhao
Dr. Zhong-Qiu Zhao is an associate professor at Hefei University of Technology, China. He obtained the Master’s degree in Pattern Recognition & Intelligent System at the Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, China, in 2004, and the PhD degree in Pattern Recognition & Intelligent System at the University of Science and Technology, China, in 2007. From April 2008 to November 2009, he held a postdoctoral position in image processing in the CNRS UMR6168 Lab Sciences de l’Information et des Systèmes, France. From Jan. 2012 to Dec. 2014, he held a ‘Hongkong Scholar’ research position in pattern recognition at the Department of Computer Science of Hong Kong Baptist University, Hong Kong, China. He now works in the Laboratory of Data Mining and Intelligent Computing, Hefei University of Technology, China. His research is about pattern recognition for multimodal data, including environmental data.
Recent publications:
- - Zhong-Qiu Zhao, et al., 'ApLeaf: An efficient android-based plant leaf identification system', Neurocomputing, V151, 2015, ISSN 0925-2312, http://dx.doi.org/10.1016/j.neucom.2014.02.077.
- - Zhong-Qiu Zhao, Bao-Jian Xie, Yiu-ming Cheung, Xindong Wu, 'Plant Leaf Identification via A Growing Convolution Neural Network with Progressive Sample Learning', ACCV, 2014.
- - J. Wang, Z.Q. Zhao, X. Hu, X. Wu, P.P. Li, Y.M. Cheung, M. Wang, “Online Group Feature Selection”, 23rd International Joint Conference on Artificial Intelligence (IJCAI), 2013.
- - Z.Q. Zhao, H. Glotin, Z. Xie, J. Gao, X.D. Wu, “Cooperative Sparse Representation in Two Opposite Directions for Semi-supervised Image Annotation”, IEEE Trans. on Image Proc., V21 I9, 2012.
LINKS
Past Workshops
Supports
- SABIOD CNRS
- University of Toulon
- Institut Universitaire de France
CONTACT
- Hervé GLOTIN
- Université de Toulon - France
- Tel. (+33) 04 94 14 28 24
- Email: glotin@univ-tln.fr