AT68


Special issue: Acoustics 2012 Congress

A new method for optimising the geometric distance in an automatic birdsong recognition system

Accuracy also increases as the number of reference calls in the template is increased. For example, if a template holds 100 calls of a certain species, each signal on the recording is compared with every one of the 100 reference calls in turn. Each comparison is assigned a geometric distance (GD), and the reference call with the smallest GD is declared the match.
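The nearest-reference rule described above can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that each call has already been reduced to a fixed-length feature vector and that the GD is the Euclidean distance between two such vectors; this passage specifies neither, and the names `geometric_distance` and `best_reference` are hypothetical.

```python
import math

def geometric_distance(a, b):
    """Euclidean distance between two equal-length feature vectors (an assumption)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_reference(signal, references):
    """Compare the signal with every reference call; return (index, GD) of the closest."""
    distances = [geometric_distance(signal, ref) for ref in references]
    index = min(range(len(distances)), key=distances.__getitem__)
    return index, distances[index]

# Three made-up reference calls for one species and one detected signal.
references = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0], [0.0, 0.0, 1.0]]
signal = [1.0, 2.0, 2.0]
index, gd = best_reference(signal, references)
```

With more reference calls in the list, the chance that one of them lies close to a genuine call of the species goes up, which is one way to read the accuracy gain reported here.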

This goes even further when there are multiple species. Assuming each species has its own template, then as each template is run in turn the lowest GD becomes the “best match”. Some species sound similar; for example, the Australian Currawong sometimes imitates the Grey Shrike Thrush (both species considered above). So on a first pass the software may assign a call to the Currawong (when it is running the corresponding template). However, later it might run the Grey Shrike Thrush template (we assume here that the Grey Shrike Thrush is the calling bird). To all but the most expert human ear the two calls are the same, but the software will not be fooled: it assigns a lower GD to the Grey Shrike Thrush and, at the exact time of the call, overrides its first “guess” of a Currawong with the Grey Shrike Thrush. When the run is completed the correct assignment will have been made.
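The override behaviour can be illustrated with a running "lowest GD wins" table. This is a minimal sketch, not the authors' implementation: it assumes each template pass yields hypothetical `(call_time, species, gd)` tuples and that two passes report exactly the same time for the same call. A real system would presumably need a time tolerance. The GD values are invented.

```python
def assign_species(detections):
    """Keep, for each call time, the species with the lowest GD seen so far.

    detections: iterable of (call_time, species, gd), in the order the
    templates are run (hypothetical data layout).
    """
    best = {}
    for call_time, species, gd in detections:
        current = best.get(call_time)
        if current is None or gd < current[1]:
            best[call_time] = (species, gd)  # a later, closer match overrides
    return best

# Made-up GD values: the Currawong template runs first, then the Grey Shrike Thrush.
detections = [
    (12.4, "Currawong", 0.31),
    (12.4, "Grey Shrike Thrush", 0.18),
    (57.9, "Currawong", 0.22),
]
result = assign_species(detections)
```

Here the call at 12.4 s ends up assigned to the Grey Shrike Thrush, while the call at 57.9 s, which no other template matched more closely, stays with the Currawong.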

So the software accuracy improves not only with more examples of the target species, but also with more examples of other species that might be calling in the area.

Trade-offs

As the software developed it became clear that it was both possible and desirable to trade CPU time for greater accuracy. Increasing the number of templates (each with its own settings) certainly improved the accuracy but increased CPU usage almost in direct proportion to the number of templates.

For most calls the 2-D analysis closely approximates the 3-D analysis, and since it runs faster it is the preferred mode. The 3-D mode is best suited to those situations where the temporal signature is important (for example in estimating the number of frogs calling in a chorus).

Noise performance of the system is good, and it is possible to trade off accuracy for the ability to work in a noisy environment. The system will perform well at S/N levels of 10 dB, but can still give useful results in conditions as noisy as -20 dB S/N if required. In this instance it is found that good matching is possible by matching just the peak energy part of the call (that is, by setting the parameters to focus on the peak energy section of the call). However, by doing this, a lot of information about the rest of the call is not used and the uniqueness of the call has to be found entirely within a small portion around the energy peak, so that false positives will increase.
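To put the two S/N figures in perspective, the short sketch below converts decibels to power ratios using the standard definition, S/N in dB = 10 log10(signal power / noise power). It assumes the quoted levels are power ratios; the article does not state how or over which frequency band they were measured, and the function names are hypothetical.

```python
import math

def snr_db_to_power_ratio(snr_db):
    """Convert an S/N level in dB to a signal-to-noise power ratio."""
    return 10.0 ** (snr_db / 10.0)

def power_ratio_to_snr_db(signal_power, noise_power):
    """Convert signal and noise powers to an S/N level in dB."""
    return 10.0 * math.log10(signal_power / noise_power)

good = snr_db_to_power_ratio(10.0)    # signal power is 10 times the noise power
noisy = snr_db_to_power_ratio(-20.0)  # signal power is 1/100 of the noise power
```

Under that assumption, at -20 dB the noise carries about a hundred times the power of the call, which is consistent with only the peak energy part of the call remaining usable for matching.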

After a lot of testing we found that the PC clock speed was the most important indicator of the total run-time. In recent years clock speeds seem to have saturated, and 3.8 GHz seems to be about as fast as a PC will run without over-clocking. Modern PCs tend to add processors rather than ramp up clock speed, and this is a minor dilemma for the software. While most 64-bit code has powerful parallel implementations, the older 32-bit code usually does not. We therefore decided to limit the 32-bit code to one processor (although on a multiprocessor machine multiple instances of the code can be run at any time, and this is recommended for processing very large file collections).
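One simple way to use several single-processor instances on a large collection is to split the file list into one batch per instance. The sketch below shows only that splitting step. The function name and file names are hypothetical, and how each instance is actually launched is not described in the article, so it is left out.

```python
def partition_files(files, n_instances):
    """Split a list of recordings into n_instances batches, round-robin."""
    if n_instances < 1:
        raise ValueError("n_instances must be at least 1")
    return [files[i::n_instances] for i in range(n_instances)]

# Ten made-up recordings shared between four instances of the recogniser.
files = [f"rec_{i:03d}.wav" for i in range(10)]
batches = partition_files(files, 4)
```

Each batch would then be handed to its own instance of the program. Round-robin splitting keeps the batches within one file of each other in size, which helps the instances finish at about the same time if the recordings are of similar length.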

Field Applications

It is well established that employing bioacoustic methods for animal surveys in ecological studies offers a number of advantages over visual surveying methods. These advantages include a reduced need to disturb or handle animals, an ability to survey visually cryptic species, effective monitoring during inclement weather, and cost savings through reducing the number of times a site needs to be visited by highly skilled specialists.

Despite its clear utility and widespread use (see for example work by Rountree et al. [1], Payne et al. [2] and Riede [3]), the application of bioacoustic monitoring can be somewhat constrained by “back end” data analysis requirements. That is, studies using bioacoustics can generate large amounts of acoustic data, often thousands of hours of recordings [1]. Processing large amounts of acoustic data can be laborious and time consuming. As such, the time, resource and cost savings gained by implementing bioacoustic monitoring over other survey methods may be completely eroded.

Automated analysis of bioacoustic data is promoted as the answer to circumventing manual data processing and analysis obstacles. In practice, however, the development of automated sound recognition is challenging, primarily because the vocalisations of many species can be complex. For example, some bird species can sing in duets, while others can intentionally mask their calls, perform vocal mimicry, have regional dialects, have large song repertoires or perform improvisational songs [4]. Furthermore, despite widespread reporting of successful automated sound recognition in the literature (including for bird species [5]), the utility, practicality and accessibility of these systems to field ecologists seemed limited.

Field Implementation

As discussed, the recorder and automated sound recognition system was initially designed as an acoustic surveillance tool for rare parrots, and in this regard is an advancement in acoustic survey techniques for the conservation of rare and threatened fauna species. To illustrate, the system can be deployed at key sites on a long-term basis (e.g. over the fruiting season of certain food source trees within the known range of the target species) to make high quality recordings over a distance of hundreds of metres. Recordings can then either be accurately analysed by the software in real time to produce a particular response, for example sending an SMS notification over a mobile phone network if a rare parrot is detected, or analysed post-recording on a PC to extract information on timing and frequency of site visitation and potentially species abundance.

Clearly, however, the system has additional utility across the fields of natural resource management. For example, in most Australian jurisdictions a Development Application requires the completion of an Environmental Impact Assessment (EIA). In these cases, and particularly for large scale developments such as mines, a comprehensive survey of local fauna is required to determine the impact the development will have on wildlife populations. Because the automated sound recognition system can be used to recognise the sound/vocalisation of any species or group of animals, it can be used in conjunction with conventional techniques to enable a more accurate census of wildlife to be undertaken in the EIA process with minimal additional resourcing.
