This image depicts a Deep Learning model's prediction of adult Tarāpuka (Black-billed gulls) sitting on nests in a colony. The ground truth is pictured on the left, and the prediction is on the right. Image courtesy of Saif Khan.

Using Deep Learning to detect braided river bird populations

"To be honest, being an ecologist, I was a bit nervous to approach people from advanced data science space. But this fear diminished quickly as all involved in the project was more than ready to understand my needs and was ready to work with my strengths."
The case study below shares some of the technical details and outcomes of the scientific and HPC-focused programming support provided to a research project through NeSI’s Consultancy Service.
This service supports projects across a range of domains, with the aim of lifting researchers’ productivity, efficiency, and skills in research computing. If you are interested in learning more or applying for Consultancy support, visit our Consultancy Service page.


Research background

When Saif Khan was a PhD Student at the University of Otago, he studied remote sensing applications in braided rivers. One of his projects used an unmanned aerial vehicle (UAV) to monitor threatened bird colonies and their habitat in braided river ecosystems of New Zealand.

There is currently a lack of reliable methods and tools for detecting animals accurately and precisely and counting them automatically. The alternative, manually marking and counting thousands of individuals on aerial images, is time-consuming and prone to observer bias.

Working with Professor Phil Sneddon, Saif trained a deep learning model to detect and count a colony of Tarāpuka (Black-billed gulls). Trained on aerial images with the Mask R-CNN modelling approach, Saif's model could detect sitting adults in the colony.


Project challenges

Saif's deep learning model was originally developed on Google Colab, a free service with limited computational power. With a prototype in place, he wanted to scale up his model, train it on other types of birds, and detect smaller targets like chicks. To achieve this, he needed computational resources from NeSI.

In addition, Saif wanted to learn how his approach could be adapted to different scenarios. In particular, he was interested in applying deep learning to detect White-fronted terns, as well as chicks in Black-billed gull colony images.

He worked with Maxime Rio, NeSI Data Science Engineer, and Matt Bixley, NeSI Application Support Specialist, to tackle this challenge.


What was done

As a first step, Maxime ported Saif's model training to NeSI. He provided reproducible installation instructions adapted to the NeSI HPC platform, converted the original notebook into a set of CLI (command line interface) tools, and implemented Slurm job submission scripts to perform model training on the HPC GPU nodes.
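To illustrate what converting a notebook into a CLI tool can look like, here is a minimal Python sketch. It is not the project's actual code: the option names (--images, --output, --epochs), the train_model placeholder and the model_final.pth file name are all assumptions made for illustration. The point is only that every value once edited by hand in a notebook cell becomes a command-line argument, so the training can be launched without any interaction.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Command-line options replacing values once hard-coded in notebook cells."""
    parser = argparse.ArgumentParser(
        description="Train a bird detection model (illustrative sketch)."
    )
    parser.add_argument("--images", type=Path, required=True,
                        help="folder of training images")
    parser.add_argument("--output", type=Path, default=Path("results"),
                        help="folder where trained weights are written")
    parser.add_argument("--epochs", type=int, default=30,
                        help="number of training epochs")
    return parser


def train_model(images: Path, output: Path, epochs: int) -> Path:
    """Hypothetical placeholder: the real training loop would go here."""
    weights = output / "model_final.pth"
    print(f"training on {images} for {epochs} epochs, weights -> {weights}")
    return weights


def main(argv=None) -> Path:
    """Parse the arguments (sys.argv by default) and run the training."""
    args = build_parser().parse_args(argv)
    return train_model(args.images, args.output, args.epochs)


# When saved as a script, the usual entry point would be added:
# if __name__ == "__main__":
#     main()
```

A Slurm job script would then only need to call such a tool with the desired arguments, which is what makes the training runnable as a batch job on GPU nodes.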

The resulting Python script made it possible to train a Mask R-CNN model non-interactively on the HPC platform. This matters because, to be executed as a batch job, code must run without user interaction and ideally without a graphical interface. In terms of compute, training this model for 30 epochs took 12 minutes on a PCIe A100 GPU, whereas it sometimes took days to do the same using a free account on Google Colab.

Initial results were not fully satisfactory, partly because of how early stopping was implemented in the toolbox employed, PixelLib. Early stopping is a mechanism that stops model training when performance on a validation set no longer improves, saving resources and ensuring the best model is used for inference. PixelLib did not offer the possibility of using AP@50 (average precision at an intersection-over-union, or IoU, threshold of 50%) or mAP (mean average precision) for early stopping, although these are standard benchmark metrics. The best model obtained an AP@50 of 50.

It was then decided to test an alternative deep learning framework, the Detectron2 toolbox. Detectron2 is a well-known framework for visual recognition tasks developed by the Facebook AI Research team, and it provides a wealth of pretrained models.
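The early stopping mechanism described above can be sketched in a few lines of Python. This is a generic illustration, not the implementation found in PixelLib or Detectron2: the EarlyStopping class, its patience and min_delta parameters, and the validation scores are all made up for the example. It monitors a higher-is-better metric such as AP@50 and stops once the metric has failed to improve for a given number of consecutive epochs.

```python
class EarlyStopping:
    """Stop training when a higher-is-better validation metric stops improving."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience      # epochs tolerated without improvement
        self.min_delta = min_delta    # smallest gain that counts as improvement
        self.best_score = float("-inf")
        self.best_epoch = -1
        self.stale_epochs = 0

    def update(self, epoch: int, score: float) -> bool:
        """Record one validation score; return True when training should stop."""
        if score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.stale_epochs = 0     # a real trainer would save a checkpoint here
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Invented validation AP@50 values, one per epoch, for illustration only.
scores = [41.0, 47.5, 50.2, 49.8, 50.1, 49.0, 48.7]

stopper = EarlyStopping(patience=3)
for epoch, score in enumerate(scores):
    if stopper.update(epoch, score):
        break

print(f"stopped at epoch {epoch}; best epoch was {stopper.best_epoch} "
      f"with AP@50 = {stopper.best_score}")
```

Keeping the checkpoint from the best epoch, rather than the last one, is what ensures the best model is the one used for inference.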

Switching to Detectron2 provided multiple advantages:

  • It allowed saving the best models in terms of average precision.
  • It allowed switching from an instance segmentation task (predicting a mask for each object) to a simpler object detection task (i.e. bounding box prediction). Changing the task made labelling the training dataset easier, which increased the training dataset size and improved model performance.
  • It provided many base models to use for transfer learning. A small model (Faster R-CNN with ResNet-50 backbone) and a larger model (Faster R-CNN with ResNeXt-101 backbone) were compared. The larger model had the best performance, with an AP@50 of 86.10 (measured with a 5-fold cross-validation scheme).
  • It was able to handle object detection without a mask and support empty images (i.e. no birds) in the training set. This was critical to train the model to not recognise small features as birds, for example little twigs or foam on the river.
  • Trained models were compatible with the SAHI (Slicing Aided Hyper Inference) toolbox, making it possible to detect birds on very large images, using many smaller tiles and stitching together detections. The final model was tested on a 12,623 x 9,901 aerial image, detecting about 8,000 individuals in a Black-billed gull colony.
Pictured above: A subset of a large aerial image of a Black-billed gull colony, with detections from the trained Faster R-CNN model. Image provided by Saif Khan.
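The idea of detecting on many smaller tiles and stitching the detections together can be sketched in plain Python. This is not the SAHI API and not the project's code: the function names, the 1024-pixel tile size, the 128-pixel overlap and the example boxes are assumptions chosen for illustration. Tiles overlap so that a bird cut by one tile border appears whole in a neighbouring tile; tile-local boxes are then shifted to full-image coordinates, and duplicates from overlapping tiles are removed with a simple greedy non-maximum suppression.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def tile_origins(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of overlapping tiles covering the range [0, length)."""
    assert 0 <= overlap < tile, "overlap must be smaller than the tile size"
    if length <= tile:
        return [0]
    step = tile - overlap
    origins = list(range(0, length - tile, step))
    origins.append(length - tile)  # last tile sits flush with the image edge
    return origins


def make_tiles(width: int, height: int, tile: int = 1024,
               overlap: int = 128) -> List[Tuple[int, int, int, int]]:
    """Tile rectangles (x_min, y_min, x_max, y_max) in full-image coordinates."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in tile_origins(height, tile, overlap)
            for x in tile_origins(width, tile, overlap)]


def to_global(box: Box, tile_x: int, tile_y: int) -> Box:
    """Shift a box predicted inside a tile to full-image coordinates."""
    x0, y0, x1, y1 = box
    return (x0 + tile_x, y0 + tile_y, x1 + tile_x, y1 + tile_y)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_detections(detections: List[Tuple[Box, float]],
                     iou_threshold: float = 0.5) -> List[Tuple[Box, float]]:
    """Greedy non-maximum suppression: keep the best-scoring box of each cluster."""
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, other) < iou_threshold for other, _ in kept):
            kept.append((box, score))
    return kept


# Tiles for an image of the size mentioned above (12,623 x 9,901 pixels).
tiles = make_tiles(12623, 9901)

# The same (invented) bird seen in two horizontally overlapping tiles.
seen_in_first_tile = to_global((900, 100, 920, 120), tile_x=0, tile_y=0)
seen_in_second_tile = to_global((4, 100, 24, 120), tile_x=896, tile_y=0)
merged = merge_detections([(seen_in_first_tile, 0.9), (seen_in_second_tile, 0.8)])
print(len(tiles), "tiles;", len(merged), "detection after merging")
```

In practice the detector would be run on each tile and its predictions passed through to_global before merging; a toolbox such as SAHI handles these steps itself and offers more refined merging strategies than this greedy sketch.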


Main outcomes 

At the end of the Consultancy, NeSI was able to provide Saif with a set of scripts to train models that detect birds in very large aerial images. The testing and development phases of the Consultancy also yielded useful insights into the benefits and shortcomings of two deep learning toolboxes, PixelLib and Detectron2.

Overall, Saif now has a set of more reliable and automated tools for detecting and counting the bird populations he's studying. Ultimately, successful completion of his project will help reduce the cost and time required for bird population monitoring by organisations like the Department of Conservation.


Researcher feedback

"I was truly amazed by the combination of warmth, professionalism and enthusiasm from any contact with NeSI regarding my project. To be honest, being an ecologist, I was a bit nervous to approach people from advanced data science space. But this fear diminished quickly as all involved in the project was more than ready to understand my needs and was ready to work with my strengths.

"Having Maxime to work with my project was a blessing. He literally walked me through from setting up the project to the end. Maxime went above and beyond to further the project's scope. At some point, I would wonder whether this is his project or mine. He truly was inspirational and I learnt heaps from him." 

- Saif Khan, former PhD Student, University of Otago


Do you want to bring your research to the next level? We can help. Contact us to learn more about our Consultancy support.

