Disclaimer: I am a PAMGuard developer; however, my main job is the study of wild harbour porpoise behaviour using PAM, and I have therefore used both C-PODs and high frequency recorders in my research. These are my views and not those of the PAMGuard team.
At a few recent conferences there have been posters, talks and people comparing the performance of C-PODs to PAMGuard. Underlying this appears to be the emergence of a rivalry that splits opinion amongst the marine bioacoustics community. Some people love C-PODs, some say they don’t work or are even unscientific. Others say PAMGuard is unusable, too complicated or performs badly. Usually competition is good; it pushes methods forward, and indeed C-PODs and acoustic recordings analysed by PAMGuard represent two very different methods of bioacoustics analysis. However, the debate being had at the moment is neither driving competition nor moving the field forward; it is in fact framed entirely in the wrong way and is as much a product of ignorance as of competing ideologies. In this long-read post, I’m going to try to explain why.
What are C-PODs and PAMGuard?
Before delving into detection theory and how it applies to C-PODs and PAMGuard, it is important to begin by describing what both C-PODs and PAMGuard are.
C-PODs (www.chelonia.co.uk) are relatively low-cost zero-crossing detectors that are sold alongside a proprietary detection and classification system. C-PODs can detect a variety of impulsive sounds including dolphin and porpoise echolocation clicks, boat sonar, snapping shrimp, and sediment transport noise caused by suspended particles colliding with the instrument housing. For each detected impulsive sound, high-resolution time, duration, amplitude, peak frequency, number of zero crossings and slope are saved to an SD card. Thus, each detected sound is represented by only 6 metrics (time, duration, amplitude, frequency, zero crossings and slope) and so the volume of data collected by the instruments per unit time is very low. For example, a continuous recorder sampling at 500 kHz for six months would produce ~8.25 TB of raw recordings, whereas a C-POD deployed for a similar period would collect only a dozen or so gigabytes. This low data volume and the ultra-low power consumption of C-POD devices mean they can monitor continuously for up to a year without requiring recharging or data recovery.
Once data has been retrieved from a C-POD it is analysed by CPOD.exe, a bespoke and closed source PAM software package. CPOD.exe uses a mostly undocumented click train detection algorithm to extract probable porpoise click trains and returns a spreadsheet of ‘porpoise positive minutes’. The analysis is fully automated and relatively fast, with little room for user control. Practical PAM devices capable of recording the 130 kHz narrow band high frequency clicks of harbour porpoises have only recently become available to the research community, whereas C-PODs and T-PODs (the C-POD’s predecessor) have been available for the last two decades and are therefore ubiquitous amongst the PAM community. The decades-long dominance of C-PODs in porpoise research and their relative user friendliness (i.e. the ability to get results with minimal input) mean that large numbers of past and current research projects are based on C-PODs, and many university departments, consultancies and governmental institutes have multiple operating C-POD devices.
In contrast to C-PODs, which are sold as a unified hardware + detection and classification system, PAMGuard is an open source software package for the analysis of PAM data (www.pamguard.org). It has no associated hardware and can instead be used to analyse raw recordings collected on any PAM device (it can also unpack raw C-POD data and SoundTrap click data). The software was first released in 2005 with the goal of providing a data analysis platform to aid in the mitigation of anthropogenic impacts for the offshore seismic industry. Since then, however, PAMGuard has morphed into a comprehensive PAM software suite used by both industry and researchers. It is capable of measuring ambient noise, detecting and classifying impulsive sounds such as cetacean clicks, extracting the contours of tonal sounds such as whistles, and many other tasks not discussed here. Essentially, PAMGuard can be thought of as a PAM sound analysis toolbox with an associated graphical user interface, which can work both in real time and post data collection. To deal with a wide range of different hardware types, PAMGuard has always prioritised flexibility, so the different tools (called modules in PAMGuard) usually have a multitude of options, introducing an almost infinite number of possible configurations. This makes PAMGuard very flexible; the flip side is that using PAMGuard for the first time is often a daunting prospect.
The general idea behind PAMGuard has always been to provide users with a set of powerful signal processing, annotation and data visualisation tools, which allow for a semi-automated approach to acoustic analysis. Users can run automated detection and classification algorithms and then assess them by manually scrolling through an acoustic dataset and checking that detectors and species classifiers are performing well. Data can be reprocessed easily, allowing users to add or alter classifiers until a satisfactory performance is achieved. Data can also be exported directly to MATLAB or an SQLite database, allowing users to perform further analysis if required.
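To make the ‘porpoise positive minutes’ metric concrete, here is a minimal Python sketch of how a list of detection times can be collapsed into detection-positive minutes. This is not the CPOD.exe code (which is closed source); the function name and the example timestamps are hypothetical and purely for illustration.

```python
from datetime import datetime, timedelta

def detection_positive_minutes(detection_times):
    """Return the sorted minutes (truncated to :00 s) that contain at least one detection."""
    return sorted({t.replace(second=0, microsecond=0) for t in detection_times})

# Hypothetical click-train start times, for illustration only.
start = datetime(2020, 1, 1, 12, 0, 0)
trains = [start + timedelta(seconds=s) for s in (5, 20, 59, 61, 300, 305)]

ppm = detection_positive_minutes(trains)
print(len(ppm), "detection-positive minutes from", len(trains), "click trains")
```

Note how six click trains become three positive minutes: the metric deliberately throws away the number of detections within a minute, which makes it less sensitive to how many clicks a detector happens to pick up from a single encounter.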
From the suite of available detection and classification algorithms native to the PAMGuard system, the click detector is the module most analogous to CPOD.exe. The click detector scans acoustic recordings for impulsive sounds within a specified frequency band. The software then saves each transient sound with a time stamp and a snippet of the raw waveform. Because the whole waveform of each click is saved, measurements of peak frequency, amplitude, Wigner plots etc. are possible at any time post processing. These metrics have proved invaluable in discriminating between different odontocete species and inferring behaviour. The PAMGuard click detector module also contains a classifier; however, this classifier compares individual clicks to user-specified metrics and, unlike C-PODs, does not classify based on likely click trains. PAMGuard click classification settings are user defined and based upon all or any combination of a click’s amplitude, peak frequency, energy within a specified band and zero crossings. The click detector is the most popular module in PAMGuard and is used in multiple research areas, from towed array surveys off the coast of California to tracking harbour porpoises around tidal turbines in the frigid northern waters of Scotland.
For the remainder of this article we will be comparing C-PODs to PAMGuard. More specifically, we are comparing both hardware and software: a deployed C-POD device which then has its CP1 files analysed in CPOD.exe, and a PAM recorder with a sampling frequency typical for detecting harbour porpoises (e.g. 384 kHz to 576 kHz) which has its raw recordings analysed by the PAMGuard click detector module.
Before getting into the technicalities of comparing two very different passive acoustic monitoring systems, it’s important to have a good grounding in basic detection theory.
Humans are excellent at pattern recognition, which is a very useful skill in bioacoustics analysis. For example, we can easily trace dolphin whistles on a spectrogram, something computer algorithms still find difficult to achieve consistently. However, humans can also be inconsistent: different analysts give different results, and the same analyst may perform differently depending on mood, tiredness etc. Crucially, we are also extremely slow and expensive compared to a computer. Therefore, with the increasing quantities of acoustic data being collected, automated algorithms are required for analysis. These have the great advantage of consistency and speed; however, computer algorithms do not perform well if unexpected inconsistencies in data occur, so they must be trained for most possible eventualities. For example, a porpoise detection algorithm might classify 120 kHz ship echo-sounder pings as porpoises unless it is specifically set up to recognise this sound as non-biological.
No matter how advanced and/or well-trained a detector, whether human or automated, it will never be 100% accurate. For a binary classifier (i.e. one that assigns either yes, it’s species A, or no, it’s not species A), there are generally two types of error: false positives (FP), when a sound is falsely identified as belonging to a species or group of species, and false negatives (FN), when a detection has been missed by the algorithm. These two metrics are linked, i.e. it is possible to make an algorithm that almost never misses a detection; however, that will usually result in a much higher false positive rate. Conversely, it is possible to make an algorithm which has almost no false positives; however, it will miss lots of possible detections because the criteria for detection/classification will be set so high. This interplay between false positives and false negatives leads to a classic concept in detection theory called receiver operating characteristic (ROC) curves. ROC curves are conventionally drawn as the true positive rate against the false positive rate; here I describe the same trade-off as the false negative rate (proportion of true calls missed) versus the false positive rate (proportion of classified calls which are not true calls), i.e. for a specific algorithm on a specific dataset, to achieve a false positive rate of X the false negative rate will be Y. So, for any classification algorithm, as users adjust the system parameters to decrease the false negative rate, the false positive rate generally increases, and vice versa. (Strictly, the proportion of classified calls which are not true calls is one minus precision, so this description is closer to a precision/recall curve. Precision/recall curves are analogous to ROC curves but more useful in marine mammal science; see this paper by Marie A Roch et al. 2011.)
So, what causes these false negatives and false positives? Assuming a relatively well-trained algorithm that deals with call variability in the target species, the single all-encompassing answer to that question is noise, or more specifically the signal to noise ratio (SNR), which is the ratio of the signal (e.g. a detected click) to the noise. Noise is essentially any signal which is unwanted, e.g. ambient ocean noise, non-target species such as snapping shrimp or indeed other marine mammals, flow noise, system (electrical) noise and anthropogenic noise (echo-sounders, shipping, sonars). If noise is consistent and falls within the frequency band of interest, very little can be done about it; it will simply be impossible to detect any signal below the noise level. In some cases, it’s possible to reduce noise by filtering out certain frequency bands or, for intermittent impulsive sounds, by using a classifier to at least partially reject unwanted transients before data is analysed for cetacean species. However, no matter how intelligent signal processing algorithms are, noise will still be present and will still result in some degree of false positive (unwanted) and false negative (missed) classifications. Different environments have different noise profiles and soundscapes of varying complexity, and these will alter the ratio of false positives to false negatives, leading to the different ROC curves shown in Figure 3. For example, where one environment might have lots of transients which could increase the false positive rate, another may have a higher uniform noise level, increasing the false negative rate. Thus, the ratio of false positive to false negative rates for a specified detector threshold will be different in different environments. The key point of this section is this: a detector and classifier will have different ROC curves depending on the type of dataset which is being analysed.
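As a rough illustration of click-by-click classification, the Python sketch below measures peak frequency, in-band energy and zero crossings on a waveform snippet and applies simple thresholds. It is not PAMGuard code: the synthetic click, the function names, the 100 to 150 kHz band and the 0.6 energy fraction are all assumptions chosen for the example, not PAMGuard defaults.

```python
import cmath
import math

def synth_click(fs=500_000.0, f0=130_000.0, n=128, sigma=15e-6):
    """Hypothetical click: a Gaussian-windowed sine wave centred in the snippet."""
    mid = (n - 1) / 2
    return [math.exp(-0.5 * (((i - mid) / fs) / sigma) ** 2)
            * math.sin(2 * math.pi * f0 * (i - mid) / fs) for i in range(n)]

def peak_and_band_fraction(x, fs, band):
    """Naive DFT: return (peak frequency in Hz, fraction of energy inside band)."""
    n = len(x)
    total = in_band = 0.0
    peak_f, peak_p = 0.0, -1.0
    for k in range(1, n // 2):
        f = k * fs / n
        p = abs(sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))) ** 2
        total += p
        if band[0] <= f <= band[1]:
            in_band += p
        if p > peak_p:
            peak_f, peak_p = f, p
    return peak_f, in_band / total

def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)

def looks_like_nbhf(x, fs, band=(100_000.0, 150_000.0), min_fraction=0.6):
    """Accept a click if its peak frequency and most of its energy lie in the band."""
    peak_f, fraction = peak_and_band_fraction(x, fs, band)
    return band[0] <= peak_f <= band[1] and fraction >= min_fraction

fs = 500_000.0
porpoise_like = synth_click(fs=fs, f0=130_000.0)
low_freq_transient = synth_click(fs=fs, f0=30_000.0)

print(looks_like_nbhf(porpoise_like, fs), looks_like_nbhf(low_freq_transient, fs))
print("zero crossings in snippet:", zero_crossings(porpoise_like))
```

Because the waveform snippet is kept, any of these thresholds can be changed and the classification rerun later, which is not possible when only a handful of summary metrics per click are stored.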
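The trade-off described above can be sketched in a few lines of Python. The scores and labels below are made up for illustration (a real analysis would use classifier output checked against manually validated detections), and the function name is hypothetical.

```python
def error_rates(scores, labels, threshold):
    """Missed-call and false-alarm proportions when accepting scores >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return {
        "missed": 1.0 - recall,           # proportion of true calls missed
        "false_alarms": 1.0 - precision,  # proportion of accepted calls that are wrong
    }

# Made-up classifier scores; True marks a genuine call from the target species.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, True, False, True, False, True, False, False, False]

for threshold in (0.75, 0.55, 0.35):
    print(threshold, error_rates(scores, labels, threshold))
```

Sweeping the threshold from strict to lenient trades missed calls for false alarms; plotting one against the other gives the kind of curve discussed in this section.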
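A small Python sketch of why the same detector misses more in a noisier place. It assumes a deliberately simplistic detector that needs a fixed SNR (10 dB here, an arbitrary value) to trigger, and the received click levels and noise levels are hypothetical numbers, not measurements.

```python
import math

def snr_db(signal_rms, noise_rms):
    """Signal to noise ratio in decibels from two RMS amplitudes."""
    return 20 * math.log10(signal_rms / noise_rms)

def missed_fraction(received_levels_db, noise_level_db, min_snr_db=10.0):
    """Fraction of clicks whose level is less than min_snr_db above the noise."""
    missed = [lvl for lvl in received_levels_db if lvl - noise_level_db < min_snr_db]
    return len(missed) / len(received_levels_db)

# Hypothetical received click levels (dB) from animals at a range of distances.
levels = [100, 105, 110, 115, 120, 125, 130, 135]

quiet_site = missed_fraction(levels, noise_level_db=90)
noisy_site = missed_fraction(levels, noise_level_db=110)
print("missed at quiet site:", quiet_site, "missed at noisy site:", noisy_site)
```

Nothing about the detector changed between the two calls, only the noise level, yet the false negative rate is very different. This sketch only covers uniform noise; transient-rich environments would mainly push up the false positive side instead.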
When building an automated detector and classifier there should therefore be two objectives.
- To effectively detect a target species’ calls and achieve a false positive and false negative rate which is acceptable for the target study. For example, some studies do not require an efficient detector (low false positive and high true positive). However, in some cases efficient detectors with low errors are vital (see Marjolaine Caillat et al 2013).
- To know what the false positive and false negative rates of the detector/classifier are and how they respond to changes in the different environments the algorithm is being used in, e.g. how many of the impulsive sounds classified as belonging to the target species are actually from the target species, and how many of the target species’ calls are missed. And then crucially, how do these rates change with noise?
This is a very brief introduction to detection theory which has omitted a lot of interesting ideas and concepts. A more comprehensive introduction, written by Doug Gillespie, can be found in section 3.2 of this SMRU ltd report.
Comparison of C-PODs and PAMGuard
Now that we have an understanding of detection and classification theory, we can better understand the potential pitfalls of comparing the C-POD detection and classification system to PAMGuard software.
So, let’s begin with hardware by comparing the C-POD devices to a generic high frequency recorder (remember PAMGuard has no associated hardware). Impulse detectors, such as those used in the C-POD, are simple and well-developed. There is little doubt that the raw C-POD data, i.e. the impulsive sounds recorded in CP1 files, is robust and that the detector performs in a similar way to other detectors used in PAMGuard or other analysis programs. However, the saved representation of the clicks/impulsive sounds is very sparse; whereas a continuous recorder would measure several hundred or thousand points per echolocation click, the CP1 files record only 6 metrics. On the other hand, C-PODs can last for months, and high frequency recordings take up a lot of space. Modern recording devices, at full bandwidth and constantly recording, can only last a few weeks before hard drives or SD cards are filled. C-PODs can be deployed for an order of magnitude longer, which means the average cost of data per hour monitored is extremely low. For cash-strapped researchers this is a major selling point. It should be mentioned, though, that a hybrid approach has recently been gaining traction. SoundTraps are recorders/detectors produced by Ocean Instruments Ltd. in New Zealand (www.oceaninstruments.co.nz). They can run a click detector at high frequency (576 kHz sample rate) and at the same time record data at medium frequencies, e.g. 48 kHz. Detected impulsive sounds are saved along with a snippet of the raw waveform and so provide far more data than C-POD detections, whilst also allowing for 99.9% data compression compared to raw recordings at the same sample rate. Once retrieved, the detected impulsive sounds can be unpacked in PAMGuard and classified, whilst the lower frequency continuous recordings can be analysed to detect dolphin whistles and other interesting sounds which are missed by click detectors. With an additional battery pack, SoundTraps can run for over two months (70 days) and provide a much richer dataset.
This is a relatively new development and so financial/institutional inertia is an important consideration. SoundTraps are the same price as a C-POD but C-PODs have been around a long time and lots of institutions have multiple units and people trained to use them. Even if everyone decided switching is a good idea, it would be expensive and the additional training of staff would be time-consuming; in other words, even with other companies catching up on hardware, C-PODs are here to stay, at least in the short to medium term.
Next is the analysis software. This is where the discussions on C-PODs and PAMGuard begin to become problematic. C-PODs are marketed as hardware units with accompanying classification software (CPOD.exe) which provides automatic species classifiers, i.e. Chelonia provides the hardware and analysis suite and the user is required to do very little. The C-POD software uses a bespoke click train classifier to classify species in CP1 datasets, whilst PAMGuard works on an individual click by click basis to classify species within acoustic recordings. In Figure 4 Chelonia Ltd. (the company which makes C-PODs) claims C-POD analysis software is “fully automated” and “the C-POD click train identification gives low error rates, even at high sensitivity. By contrast, click characterisation methods, (e.g. PAMGuard) require manual checking of each possible porpoise click (NBHF) and struggle on dolphin clicks (BBT)”
This is a false comparison. Once PAMGuard is set up on a machine it is as fully automated as the CPOD.exe and associated classifiers. After analysis on an acoustic dataset has been completed, manual validation is indeed required to check the performance of the classifiers and adjust if necessary. However, the exact same is true for C-PODs – some sort of validation is needed of the species detection/classification algorithms to assess performance (in fact manual validation is a little more difficult because of the sparsity of data per echolocation click).
We discussed in the detection theory section that noise can impact detection efficiency. Chelonia Ltd. has not tuned their click train detectors to cope with the huge variety of acoustic environments that exist in our oceans and offers only one parameter for users to adjust. This lack of flexibility in the system is problematic as the ocean soundscapes/noise conditions are highly dependent on where the unit is deployed. Site-specific influences on the soundscape could include shipping, biological sounds (e.g. snapping shrimp), storms, water depth and tide, to name a few. Whilst PAMGuard software offers users the ability to change parameter settings, such as detection thresholds, frequency bands to ignore etc., Chelonia Ltd. has largely chosen to ignore these issues and/or rely on independent researchers to validate their systems. Some studies have looked at the minimum detection threshold or site and species-specific detection ranges; however, there are no comprehensive studies on how C-POD detection algorithms respond to different levels and types of noise. Click train algorithms might be quite robust to increasing noise or they might fall over completely at some signal to noise threshold; the problem is we don’t really know, and because the C-POD software is closed source it’s hard to even begin to predict this. Another consideration is the fact that C-PODs do not record noise; in fact, it’s difficult to get any idea of the soundscape other than impulsive sounds. This is problematic because, even if there were good knowledge of how the C-POD software’s classification algorithms respond to noise, the C-POD devices are not able to report what the noise conditions actually were during a deployment. Therefore, it’s difficult for CPOD.exe to predict likely performance metrics unless a recorder has been used to sample the relevant soundscape.
Like PAM recordings analysed with PAMGuard, C-PODs and the CPOD.exe classification algorithms do not perform identically in different habitats. A C-POD system does not circumvent the laws of detection theory. It will respond to changes in environmental noise like any other detector/classifier. The magnitude of this response is currently unknown.
Unlike PAM recorders, C-PODs do not record noise, so they do not sufficiently record the soundscape they are operating in. Therefore, even if the response of the click train detector to noise were known, noise can only be measured externally, e.g. by using a recorder.
Does this actually matter to a C-POD user? Whether detection efficiency matters or not depends on the question being asked. Say, for example, that you’re interested in porpoise temporal patterns at one site with a relatively homogeneous noise profile. In that case, you need to know neither the detection efficiency nor how it changes with noise, because the probability of detecting an animal (assuming behaviour doesn’t significantly change) is roughly the same for all units, and thus understanding the relationship between the soundscape and the likelihood of detecting a click train is unimportant. In such an environment it will be possible to extract temporal patterns in animal detections, and the relative occupancy or even relative density can be calculated. However, let’s say you wish to sample a site in which noise conditions change over time, space, or both. In this scenario it will be very important to know how the C-POD algorithm responds to noise in order to calibrate measurements and extract unbiased data. For example, if a site in a shipping lane detects few animals and one in a quiet environment detects many, it won’t be possible to tell whether the quiet site detected more because there were more animals there or because it could “hear” things at a greater distance than the noisy site. Knowledge of the false positive and false negative rates must be carefully considered for any experiment which is trying to infer information on animal density.
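To give a feel for how much further a quiet site can “hear”, here is a back-of-envelope Python sketch using a simple passive sonar equation (spherical spreading plus absorption). Everything numeric here is an assumption chosen for illustration rather than a measurement: the source level, the absorption coefficient, the noise levels in the detector band and the 10 dB SNR needed to trigger a detection. The function names are hypothetical too.

```python
import math

def received_snr_db(range_m, source_level_db, noise_level_db, absorption_db_per_m):
    """SNR at a given range: source level minus transmission loss minus noise."""
    transmission_loss = 20 * math.log10(range_m) + absorption_db_per_m * range_m
    return source_level_db - transmission_loss - noise_level_db

def max_detection_range_m(source_level_db, noise_level_db, min_snr_db=10.0,
                          absorption_db_per_m=0.04, upper_m=5000.0):
    """Bisect for the range at which SNR drops to min_snr_db (SNR falls with range)."""
    lo, hi = 1.0, upper_m
    if received_snr_db(lo, source_level_db, noise_level_db, absorption_db_per_m) < min_snr_db:
        return 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        snr = received_snr_db(mid, source_level_db, noise_level_db, absorption_db_per_m)
        if snr >= min_snr_db:
            lo = mid
        else:
            hi = mid
    return lo

# Illustrative placeholder values only.
quiet_range = max_detection_range_m(source_level_db=190.0, noise_level_db=100.0)
noisy_range = max_detection_range_m(source_level_db=190.0, noise_level_db=115.0)
area_ratio = (quiet_range / noisy_range) ** 2

print(f"quiet site: {quiet_range:.0f} m, noisy site: {noisy_range:.0f} m, "
      f"monitored area ratio: {area_ratio:.1f}")
```

Under these assumptions the quiet site monitors a considerably larger area than the noisy one, so the same animal density would produce different detection counts at the two sites.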
Comparing the performance of C-PODs versus PAMGuard
Above we have discussed how C-PODs and PAMGuard, like any other detection/classification system, are likely to vary in performance depending on the environment, and that measuring this performance can be very important, depending on the study/experiment being conducted. However, even if both systems vary in performance, one still might perform better or worse than the other, and naturally two different PAM methodologies invite comparison. Perhaps the most intuitive way to compare PAMGuard and C-PODs is simply to take a C-POD, strap a SoundTrap to it and put both out in the sea for a while. Then pass the C-POD data through the CPOD.exe software and use the PAMGuard click detector and default classifier to find porpoise clicks on the recorder. As C-PODs run a detector with a very low false positive rate, there will be many more clicks detected in PAMGuard than by the C-POD. Can both be right? Yes and no. Here are the problems with this approach.
Neither a C-POD nor PAMGuard is truth because neither are perfect detectors/classifiers (because it’s impossible to be a perfect detector/classifier). In fact, it’s hard to get at truth but the best attempt would probably be for a user to scroll through PAMGuard clicks and manually and meticulously mark porpoise click trains.
The default porpoise settings in PAMGuard were only ever designed as a guide. The idea behind PAMGuard, as discussed above, is for a user to tweak their algorithms depending on the environment. The default settings may (in fact do) perform very badly in some situations.
But the biggest problem is that this is essentially a needless exercise, because all that is being done is to confirm the detection theory discussed above: different algorithms will have different ROC curves and, depending on threshold levels, will have differing false positive and false negative rates. Sometimes, having low false positive rates might be very important for a study; in other studies, maximising the number of detections by accepting more false positives might be advantageous.
So, what would be a more informative experiment? A better approach to testing these algorithms would be to design an experiment in which a noise versus detection efficiency graph could be constructed. As it is difficult to alter threshold values in CPOD.exe, full ROC curves are probably not possible. However, the detection efficiency response of the CPOD.exe click train detection algorithm to varying levels and types of noise would be highly informative and could be calculated experimentally, and indeed there have been some moves towards this, e.g. SAMBAH. The same tests could be carried out on the PAMGuard click detectors for comparison.
Knowing the magnitude of this response would be very useful, especially because the C-POD click train algorithm is clearly quite complex and may respond quite strangely to different soundscapes and noise profiles. For example, in high noise, what if only the loudest fragments of click trains are detected? Does the algorithm still classify those effectively? It may also be that in general the C-POD performs better than PAMGuard by having a more accurate detector in different environments; this would not be surprising, as a click train detector is certainly a more sophisticated analysis methodology than click by click classification.
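As a sketch of what the analysis side of such an experiment might look like, the Python below bins known true events (for example, playbacks or manually validated click trains) by the noise level measured on a co-deployed recorder and reports the proportion detected in each bin. The trial data are invented and the function name is hypothetical.

```python
from collections import defaultdict

def efficiency_by_noise(trials, bin_width_db=5.0):
    """trials: (noise_level_db, detected) pairs for events known to be real.

    Returns {bin_start_db: proportion detected}, sorted by noise level.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [detected, total]
    for noise_db, detected in trials:
        bin_start = (noise_db // bin_width_db) * bin_width_db
        counts[bin_start][0] += int(detected)
        counts[bin_start][1] += 1
    return {b: d / n for b, (d, n) in sorted(counts.items())}

# Invented example: noise level (dB) at the time of each known event,
# and whether the detector under test reported it.
trials = [(91, True), (93, True), (94, True),
          (96, True), (98, False), (99, True),
          (101, True), (102, False), (103, False), (104, False)]

curve = efficiency_by_noise(trials)
for bin_start, efficiency in curve.items():
    print(f"{bin_start:.0f}-{bin_start + 5:.0f} dB: {efficiency:.2f}")
```

Running the same binning on CPOD.exe output and on PAMGuard output for the same set of known events would give two directly comparable curves, which is far more informative than comparing raw detection counts.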
Finally, I feel it’s important to mention that as a PAM community we should take note that a significant portion of our research (especially in Europe) is being conducted using a device which has closed source hardware and software. Code between different versions of the CPOD.exe software can be changed at any time without users knowing and the “secret” nature of the algorithm means we have very little idea what it might be doing. If the algorithm were very simple, this might be acceptable, however a click train detector is a sophisticated classification algorithm dealing with a difficult pattern recognition problem. The fact this is used by many researchers and not open to the community certainly worries me.
What C-PODs Can Do Better
C-PODs are sophisticated devices and undoubtedly the device + classification software makes for an effective and user-friendly PAM system. However, the one-fits-all and user-friendly approach should not be a substitute for a basic understanding of the capabilities and limitations of these systems. Some of the claims and marketing material used by Chelonia Ltd. do not address the fundamental limitations of PAM systems and fully automated classifiers; this should change. The lack of transparency and the closed software are not appropriate for a PAM system which attempts to solve a difficult pattern recognition problem (click train detection) and is used in cutting edge science. This should be a serious consideration for researchers, and I would implore Chelonia Ltd. to either open up their source code or at the very least provide a compiled library which would allow the algorithm to be accessed programmatically, allowing us to test it efficiently.
Note: Chelonia Ltd. are bringing out the F-POD which provides the zero crossing measurements for detected clicks. Still not as good as a waveform but certainly an improvement on the sparse data per click on C-PODs.
What PAMGuard Can Do Better
The click by click approach in PAMGuard might be OK for harbour porpoises as there is usually little interfering noise in the 100 to 150 kHz frequency band; however, it is certainly not ideal for dolphins because broadband click classifiers are easily fooled by other impulsive sounds, and click waveforms and spectra change substantially depending on the off-axis beam angle. A click train detection and classification approach is therefore definitely the way to go and should be an option in PAMGuard. This is being implemented now; however, it should not be used as a fully automated solution. There will be settings to tweak for different species etc., and some manual validation should still be used to assess performance for different types of acoustic datasets. More generally, there is certainly a lot of scope to improve PAMGuard’s user friendliness, e.g. wizards to help people using the software for the first time and improvements to make the user interface more intuitive. PAMGuard is still in active development, so hopefully progress will be made on this front.
Arguing that C-PODs and PAMGuard perform differently is a somewhat pointless exercise. They use two different methods of analysis and for any comparison, both are sitting somewhere on their own respective ROC curves. The key for future comparison is working out what those curves are and crucially how they change with signal to noise ratio.
More modern equivalent hardware which can begin to match the longevity of C-PODs (though still lasting ~5 times less), such as the SoundTrap running a click detector with a battery pack, costs over $4000 per device (roughly the same as a C-POD), and so replacing C-PODs on the usually tight budgets of marine mammal research institutes is simply not feasible in the short term. In the short to medium term, C-PODs are therefore here to stay.
Like PAM recorders, C-PODs are indeed very useful devices. They are available now, there’s lots of them and they can provide important data on harbour porpoises and dolphins for very low deployment and analysis costs. It is therefore understandable that many researchers wish to use C-PODs, however their user friendliness and the one-fits-all-approach should not be a replacement for a basic appreciation of the capabilities of PAM systems in general and therefore some of the weaknesses of C-PODs compared to recorders and vice versa.
If used properly with the caveats mentioned above, C-PODs can be, and have been, used for important, interesting and useful research projects, e.g. monitoring the decline of the vaquita. However, users should know they are closed source devices and that the performance of the detection algorithms has not yet been sufficiently explored. They do not circumvent the laws of detection theory, and as such care should be taken to consider and compensate for the various factors which affect all PAM devices. The magnitude of the CPOD.exe response to noise is unknown and can and should be experimentally measured by the research community.
Whether using C-PODs, PAMGuard or any other PAM analysis program, we, as marine mammal researchers, should appreciate that PAM is still in active development; new methods and algorithms are being tested and proposed all the time and the field is moving forward. No one has a one stop solution, which is partly why it’s a great place to be a researcher! C-PODs have been part of this development story for the last decade; hopefully further knowledge of the way C-PODs work and more engagement by Chelonia Ltd. with acoustic researchers will keep them, and the new F-POD, a relevant and useful asset for the marine mammal community in the future.