SoundSort

Introduction

In passive acoustic monitoring (PAM) it is often relatively easy to make a detector, i.e. an algorithm which picks out all, or a subset, of interesting sounds, e.g. all short transient sounds (check out http://www.pamguard.org). However, making a classifier which accurately picks out a very particular subset of sound types can be difficult due to variation within species, different noise conditions etc. Machine learning algorithms are now being developed and deployed rapidly to analyse the vast quantities of acoustic data collected on modern PAM hardware. Such algorithms can make excellent classifiers (check out this paper on right whale detection, the BANTER framework and even a Google humpback whale classifier). However, they often heavily reflect the context of the data they were trained on, i.e. if something unexpected turns up that was not in the training datasets, these algorithms can fail easily. Humans are slow at analysis but are excellent pattern recognition machines and, crucially, can recognise and deal with unexpected inconsistencies in datasets.

Because we have not come anywhere near to completely categorising the multitude of sounds that exist in our terrestrial and marine environments, there is an ongoing and significant role for humans in the analysis of acoustic data. We are initially required to explore datasets and create the training data used to develop automated classifiers. We are then needed to validate whether analysis algorithms are working effectively on novel data and to check for any strange patterns or inconsistencies in results. This can be a slow process but is much faster if powerful data visualisation tools are available.

Thus an effective acoustic workflow requires both sophisticated analysis algorithms (that’s our machine learning algorithms) and powerful data visualisation tools to allow us to check the performance of those algorithms.

What is SoundSort?

SoundSort is designed to fulfil two functions: it is a powerful data visualisation and annotation tool, and a platform on which to test machine learning algorithms. It is designed to be ultra user friendly and can therefore be used by any researcher who understands how to interpret a spectrogram.

There are many powerful PAM programs out there (PAMGuard, Raven, Ishmael, Triton) which can also be used to visualise raw recordings, analyse large quantities of acoustic data and/or deploy classification algorithms. However, all have a somewhat steep learning curve and some are not open source. SoundSort is not a replacement for these but an open source experiment in UI design and machine learning which performs a limited set of functions well.

Data Annotation

SoundSort was initially inspired by a Google experiment which presents a fantastic mosaic of bird sounds to explore within a web browser. SoundSort can import a large set of sound clips, providing the user with a highly interactive and intuitive UI to visualise and annotate the acoustic data. It is designed for situations where an initial detection algorithm (e.g. Ishmael energy sum) has extracted a large number of possible detections at a high false positive rate. The clips are presented as spectrograms on a large mosaic which the user can zoom in to and out of. Spectrogram colours and colour limits can be changed, files decimated and individual clips inspected more closely. Groups can be created and clips categorised, then results exported into group folders or a .mat file. Presenting data in this way allows an analyst to spot patterns quickly and is significantly more efficient than listening to clips individually.

Annotating clips in SoundSort. The annotated clips are coloured with each colour corresponding to a different group e.g. boat noise, fish sounds, dolphin whistle.
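
For readers less familiar with how a clip becomes one tile of the mosaic, here is a minimal sketch of a spectrogram calculation in plain Python. It is not SoundSort's own code (SoundSort is written in Java and uses Apache Commons Math for the FFT). The function name, the FFT length of 64, the hop of 32 and the Hann window are all assumptions chosen for illustration, and a slow direct DFT is used instead of an FFT to keep the example dependency free.

```python
import cmath
import math


def spectrogram(samples, fft_length=64, hop=32):
    """Return a list of time frames, each a list of dB magnitudes
    for frequency bins 0 .. fft_length / 2 (hypothetical helper)."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (fft_length - 1))
              for n in range(fft_length)]
    frames = []
    for start in range(0, len(samples) - fft_length + 1, hop):
        segment = [samples[start + n] * window[n] for n in range(fft_length)]
        column = []
        for k in range(fft_length // 2 + 1):
            # Direct DFT of one bin; an FFT would be used in practice.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / fft_length)
                      for n in range(fft_length))
            # Small offset avoids log10(0) for silent bins.
            column.append(20 * math.log10(abs(acc) + 1e-12))
        frames.append(column)
    return frames


# Synthetic test clip: a 1 kHz tone sampled at 8 kHz (assumed values).
sample_rate = 8000
clip = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(clip)

# The loudest bin of the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each inner list is one vertical slice of the image. Mapping the dB values to colours between a chosen minimum and maximum is what the colour limit controls adjust.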

Machine Learning

SoundSort provides a framework to run machine learning algorithms on the imported clips and then lets the user inspect the results. Currently only one algorithm is implemented: a clustering algorithm. Clips can be clustered using the Barnes-Hut implementation of the t-SNE algorithm, which organises clips based on their spectral content. Grouping similar clips together allows more rapid exploration and annotation of the dataset.

SoundSort grouping detections using a t-SNE clustering algorithm. Like clips are clustered together.
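
To give a feel for what "organising clips based on spectral content" means, the sketch below computes the first ingredient of t-SNE: for each clip, a probability that it would pick each other clip as a neighbour, based on the distance between their feature vectors. This is only an illustration, not SoundSort's implementation. The feature vectors are made-up band energies, the function names are hypothetical, and a single fixed sigma is assumed (real t-SNE tunes sigma per point from a perplexity setting and then goes on to optimise a 2D layout, which is omitted here).

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def neighbour_probabilities(features, sigma=1.0):
    """Return P where P[i][j] is the t-SNE style conditional
    probability p(j|i), using one fixed sigma (an assumption)."""
    n = len(features)
    probabilities = []
    for i in range(n):
        # Gaussian weight for every other clip; a clip is never its own neighbour.
        weights = [
            0.0 if j == i
            else math.exp(-squared_distance(features[i], features[j]) / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        probabilities.append([w / total for w in weights])
    return probabilities


# Hypothetical band-energy features for four clips:
# clips 0 and 1 have energy at high frequencies, clips 2 and 3 at low frequencies.
features = [
    [0.1, 0.2, 0.9, 0.8],
    [0.2, 0.1, 0.8, 0.9],
    [0.9, 0.8, 0.1, 0.1],
    [0.8, 0.9, 0.2, 0.1],
]
P = neighbour_probabilities(features, sigma=0.5)
```

Here clip 0 assigns almost all of its neighbour probability to clip 1, which is why the two would end up next to each other once the points are laid out in 2D.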

Downloads

You can download the latest SoundSort from GitHub, or here’s the direct link for a .exe file.

Upcoming features

SoundSort is a work in progress. Some of the planned features are:

  • Import PAMGuard click files and display as either Wigner plots or spectrograms.
  • Much improved inspection graph for individual clips including options to explore waveform, spectrum and Wigner transform.
  • Provide a framework for users to import classifiers developed using machine learning APIs, e.g. TensorFlow.
  • Additional annotation options including drawing a polygon to select large numbers of clips at once (coming very soon).
  • Increase options for feature extraction e.g. by adding thresholds to spectrograms (coming very soon)
  • Apple compatibility. SoundSort works on Mac OS and Linux but neither has an easy-to-use installer. If anyone has a Mac and wants to make a .dmg file for SoundSort, get in touch!

Acknowledgments

SoundSort was originally created to analyse data for a project funded by Code4Africa (@Code4Africa @SeaSensors). See this blog post about the project.

There are multiple libraries used in AIPAM without which a program like this would simply not be feasible to build.

  • JavaFX is an impressive modern UI framework, perfect for an application like this which shows a large number of images and requires a user friendly, modern design.
  • The excellent controlsfx library for extra JavaFX bits and pieces.
  • A fast and native Java implementation of the t-SNE algorithm.
  • JMetro for styling the app with a Fluent Design theme.
  • FontawesomeFX for icons.
  • Apache Commons Math 3 for the fast Fourier transform and plenty of other useful functions.
  • iirj for filtering acoustic data before decimating.
  • alg4 for solving the assignment problem, i.e. taking clustered points from t-SNE and assigning them to a grid.
  • MatFileRW for writing and reading .mat files. This allows integration of the Java code with MATLAB/Octave.
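
The grid assignment step mentioned in the list above can be illustrated with a small sketch. Each clustered point must be given its own grid cell so that the total distance moved is as small as possible. The code below is not how SoundSort or its libraries do it: it swaps in an exhaustive search over all permutations, which is only practical for a handful of points, whereas a real application would use a dedicated assignment problem solver. The function name and the example coordinates are made up.

```python
import itertools
import math


def assign_points_to_grid(points, grid_cells):
    """Exhaustively find the one-to-one assignment of points to grid
    cells with the minimum total squared distance (hypothetical helper).
    Returns the cell chosen for each point, in point order, and the cost."""
    if len(points) != len(grid_cells):
        raise ValueError("need exactly one grid cell per point")
    best_cost = math.inf
    best_perm = None
    # perm[i] is the index of the grid cell given to point i.
    for perm in itertools.permutations(range(len(grid_cells))):
        cost = sum(
            (points[i][0] - grid_cells[j][0]) ** 2
            + (points[i][1] - grid_cells[j][1]) ** 2
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost = cost
            best_perm = perm
    return [grid_cells[j] for j in best_perm], best_cost


# Four made-up 2D points, e.g. a tiny t-SNE output scaled to the unit square.
points = [(0.9, 0.8), (0.1, 0.2), (0.2, 0.9), (0.8, 0.1)]
# A 2 x 2 grid of cells.
grid = [(0, 0), (1, 0), (0, 1), (1, 1)]

cells, cost = assign_points_to_grid(points, grid)
```

The number of permutations grows factorially with the number of clips, so this approach is for intuition only.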