Researcher James Shepherd studies how land cover changes in a given area, using temporal sequences of satellite images. Working with NeSI, he has been able to reduce the turnaround time for processing his data.

Improved tracking of land cover changes

James Shepherd wanted to reduce the turnaround time for processing his data, so he applied for a Consultancy project with NeSI research software engineers.
The case study below shares some of the technical details and outcomes of the scientific and HPC-focused programming support provided to a research project through NeSI’s Consultancy Service.
This service supports projects across a range of domains, with the aim of lifting researchers’ productivity, efficiency, and skills in research computing. If you are interested in learning more or applying for Consultancy support, visit our Consultancy Service page.

Research background

James Shepherd is a senior scientist at Manaaki Whenua – Landcare Research who specialises in correcting satellite imagery for atmospheric, topographic, and directional effects and its subsequent classification for environmental applications.

TMASK is software written by James that detects land cover change occurring in a given area from a temporal sequence of satellite images. Examples of land cover change are forest clearing, urbanisation, and the management of agricultural fields. TMASK analyses multi-year sequences of green, near-infrared (NIR), and shortwave infrared (SWIR) Sentinel satellite imagery, provided as top-of-atmosphere reflectance values.


Project challenges

Because land cover classification can be difficult or impossible in areas covered by cloud or snow, a critical part of the algorithm involves filtering out these effects. After this step, the code computes a set of five Fourier coefficients for each image pixel to account for seasonal changes. The coefficients are computed using a regression method implemented in the GNU Scientific Library (GSL). The coefficients stored in the output file can then be used to compare new imagery against the model and determine whether the land cover has changed.
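As an illustration of this step, here is a minimal sketch of fitting five harmonic coefficients to one pixel’s clear-sky time series with GSL’s linear least-squares routines. It assumes an annual period and an ordinary least-squares fit; the exact model terms and fitting details in TMASK may differ.

```c
#include <math.h>
#include <gsl/gsl_multifit.h>

/* Fit five harmonic coefficients to a pixel's clear-sky time series:
 *   rho(t) ~ c0 + c1*cos(w*t) + c2*sin(w*t) + c3*cos(2*w*t) + c4*sin(2*w*t)
 * where t[] holds acquisition dates in days, y[] the reflectance values,
 * and n is the number of cloud/snow-free observations. */
static void fit_harmonics(const double *t, const double *y, size_t n,
                          double coef[5])
{
    const size_t p = 5;                   /* number of coefficients   */
    const double w = 2.0 * M_PI / 365.25; /* assumed annual frequency */

    gsl_matrix *X   = gsl_matrix_alloc(n, p);
    gsl_vector *yv  = gsl_vector_alloc(n);
    gsl_vector *c   = gsl_vector_alloc(p);
    gsl_matrix *cov = gsl_matrix_alloc(p, p);
    gsl_multifit_linear_workspace *work = gsl_multifit_linear_alloc(n, p);
    double chisq;

    /* Build the design matrix: constant, annual and semi-annual terms. */
    for (size_t i = 0; i < n; i++) {
        gsl_matrix_set(X, i, 0, 1.0);
        gsl_matrix_set(X, i, 1, cos(w * t[i]));
        gsl_matrix_set(X, i, 2, sin(w * t[i]));
        gsl_matrix_set(X, i, 3, cos(2.0 * w * t[i]));
        gsl_matrix_set(X, i, 4, sin(2.0 * w * t[i]));
        gsl_vector_set(yv, i, y[i]);
    }

    gsl_multifit_linear(X, yv, c, cov, &chisq, work);

    for (size_t j = 0; j < p; j++)
        coef[j] = gsl_vector_get(c, j);

    gsl_multifit_linear_free(work);
    gsl_matrix_free(cov);
    gsl_vector_free(c);
    gsl_vector_free(yv);
    gsl_matrix_free(X);
}
```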

TMASK takes only about 300-400 μs to compute the five coefficients for each pixel, but at the scale of New Zealand it can take about 8 hours or more to compute the coefficients for a single satellite band, and around 24 hours for the green, NIR, and SWIR bands combined. James wanted to reduce the turnaround time for processing his data, so he applied for a Consultancy project with NeSI research software engineers.


What was done

The code looked ideally suited to parallelisation, since the computation of Fourier coefficients can proceed independently for each pixel and each block of pixels. Unfortunately, the time to compute the coefficients was found to vary substantially from one block to another (see figure below).

Two factors contribute to the spread in CPU time: (1) blocks at the edge of the domain are often truncated (i.e., they have fewer pixels), and (2) the number of temporal data points not covered by cloud or snow can vary substantially from block to block.

Using the observed spread of CPU times for each block, we modelled different parallelisation approaches. The simplest approach uses a domain decomposition, where tasks are pre-assigned to processes without regard to load balance; some processes then sit waiting for others to complete, which limits scalability. The other approach we considered uses a master-worker setup, where tasks are distributed dynamically: a worker that finishes a task ahead of the others can immediately take on a new one.
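To give a flavour of this kind of modelling, the sketch below estimates the wall time (makespan) of both strategies from a list of per-block CPU times. The block timings used here are invented placeholders, not the measured values.

```c
#include <stdio.h>

#define MAX_PROCS 256

/* Makespan under static round-robin pre-assignment of blocks. */
static double static_makespan(const double *t, int nblocks, int nprocs)
{
    double worst = 0.0;
    for (int p = 0; p < nprocs; p++) {
        double sum = 0.0;
        for (int b = p; b < nblocks; b += nprocs)
            sum += t[b];
        if (sum > worst) worst = sum;
    }
    return worst;
}

/* Makespan under dynamic scheduling: each block goes to the process
 * that becomes free first (communication costs neglected). */
static double dynamic_makespan(const double *t, int nblocks, int nprocs)
{
    double busy[MAX_PROCS] = {0.0};  /* finish time of each process */
    for (int b = 0; b < nblocks; b++) {
        int idle = 0;
        for (int p = 1; p < nprocs; p++)
            if (busy[p] < busy[idle]) idle = p;
        busy[idle] += t[b];
    }
    double worst = 0.0;
    for (int p = 0; p < nprocs; p++)
        if (busy[p] > worst) worst = busy[p];
    return worst;
}

int main(void)
{
    /* Placeholder per-block times (seconds); in practice these vary with
     * block size and the number of clear observations per block. */
    double t[] = {1.2, 0.3, 2.5, 0.8, 1.9, 0.4, 2.2, 0.6, 1.1, 0.9, 2.8, 0.5};
    int n = sizeof t / sizeof t[0];
    printf("static:  %.2f s\n", static_makespan(t, n, 4));
    printf("dynamic: %.2f s\n", dynamic_makespan(t, n, 4));
    return 0;
}
```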

We found the master-worker approach to be about 20% faster than domain decomposition when 3-4 blocks are assigned to each MPI process. We estimated that a maximum speedup of 55x could be achieved by assigning one worker process per block (neglecting the cost of communication), and decided to move forward with the master-worker setup.
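The sketch below shows the general shape of such a master-worker setup in MPI. It is not the TMASK implementation itself: process_block is a hypothetical stand-in for the per-block coefficient computation, and for simplicity the sketch assumes there are at least as many blocks as workers.

```c
#include <mpi.h>

enum { TAG_WORK = 1, TAG_DONE = 2, TAG_STOP = 3 };

/* Hypothetical stand-in: in TMASK this is where the Fourier coefficients
 * for every pixel of the given block would be computed. */
static void process_block(int block_id) { (void)block_id; }

/* Rank 0 hands out block indices and collects completion messages. */
static void master(int nblocks, int nworkers)
{
    int next = 0, done;
    MPI_Status status;

    /* Seed each worker with an initial block. */
    for (int w = 1; w <= nworkers && next < nblocks; w++, next++)
        MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);

    /* As each worker finishes, give it the next block or tell it to stop. */
    for (int received = 0; received < nblocks; received++) {
        MPI_Recv(&done, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                 MPI_COMM_WORLD, &status);
        if (next < nblocks) {
            MPI_Send(&next, 1, MPI_INT, status.MPI_SOURCE, TAG_WORK,
                     MPI_COMM_WORLD);
            next++;
        } else {
            MPI_Send(&next, 1, MPI_INT, status.MPI_SOURCE, TAG_STOP,
                     MPI_COMM_WORLD);
        }
    }
}

/* Workers loop: receive a block, process it, report back. */
static void worker(void)
{
    int block;
    MPI_Status status;
    for (;;) {
        MPI_Recv(&block, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        if (status.MPI_TAG == TAG_STOP)
            break;
        process_block(block);
        MPI_Send(&block, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
    }
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0)
        master(/* nblocks = */ 100, size - 1);
    else
        worker();
    MPI_Finalize();
    return 0;
}
```

Because idle workers immediately pull the next available block, a few slow blocks no longer hold up an entire fixed partition of the domain.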


Main outcomes

The TMASK code has been rewritten to use MPI parallelism. This yielded a 53x speedup with respect to the original serial code for the supplied problem size, reducing the wall clock time from 50 minutes to less than one minute. A particularly efficient setup used 16 CPUs with multithreading (32 MPI tasks), yielding a 14x speedup (85% parallel efficiency).

In the speedup plot, the dashed line shows the result of placing two MPI ranks per CPU using hyperthreading. This provides an additional performance gain compared to one MPI rank per CPU, and because NeSI charges per physical CPU rather than per logical core (hyperthread), the gain comes at no extra cost.


Do you have a research project that could benefit from working with NeSI research software engineers? Learn more about what kind of support they can offer and get in touch by emailing support@nesi.org.nz.
