A scalable digital elevation builder for flood mapping
Flooding can damage property, endanger life and cost millions of dollars. Computer models can help predict how areas flood, letting countries plan ahead. These models need to combine different data types. One key input is a hydrologically conditioned Digital Elevation Model, and building one can be challenging.
GeoFabrics is an open-access Python library that can help researchers build hydrologically conditioned digital elevation maps by combining LiDAR and other data types. These maps are a vital part of flood models.
Dr Rose Pearson used NeSI to build GeoFabrics’s multi-threading tools. This allows researchers to use GeoFabrics to build digital elevation models on single computer cores, clusters or supercomputers, depending on their needs.
Flooding impacts NZ through loss of life, property damage and economic loss. A recent release from Finance Minister Grant Robertson valued the impact of 2023’s Cyclone Gabrielle and the Auckland Anniversary floods at 0.1% of NZ’s GDP, which equates to roughly $250 million USD.
While every level of NZ government plans for floods, these plans are dependent on accurate flood models that can predict where water channels and collects during intense weather. By understanding this better, they may be able to develop better flood warning systems and infrastructure to prevent flood damage.
Dr Rose Pearson is a remote sensing scientist at NIWA. Rose developed the GeoFabrics Python package, which can be used to combine multiple types of data when developing digital elevation models (DEMs). Accurate DEMs are important for ensuring accurate hydrological flood maps.
Hydrological flood maps simulate the movement of water across an environment. These models require accurate information about the landscape to properly simulate the water’s course. This becomes important for flood resilience and infrastructure planning.
“GeoFabrics is a Python library that aims to ensure a consistent and reproducible approach to produce hydrologically conditioned digital elevation models.”
Hydrologically conditioned DEMs use three types of data:
- Point data: unstructured measurements, such as Light Detection and Ranging (LiDAR) data from drones or satellites.
- Vector data: shapes that represent regions and carry attributes, such as where land, the ocean, rivers, waterways or culverts are located.
- Raster data: a grid covering an area of land, with a number assigned to each grid cell representing things like land elevation. Raster data is the key output of GeoFabrics, but where no better information is available, coarse DEMs that are not hydrologically conditioned can also be used as inputs.
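To make the three data types concrete, here is a minimal Python sketch of how they might fit together. It is not GeoFabrics's API or method: the sample points, the `land` polygon, and the `inside` and `rasterise` helpers are all hypothetical, and the averaging rule is an assumption chosen for simplicity. Unstructured points are filtered by a vector boundary and then averaged into raster cells.

```python
from collections import defaultdict
from statistics import mean

# Point data: unstructured (x, y, z) samples, e.g. LiDAR ground returns (made-up values).
points = [(0.5, 0.5, 10.0), (0.7, 0.2, 12.0), (1.5, 0.5, 8.0),
          (1.5, 1.5, 6.0), (3.5, 0.5, 1.0)]

# Vector data: a polygon plus an attribute, here a hypothetical land boundary.
land = {"kind": "land",
        "vertices": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]}


def inside(x, y, vertices):
    """Ray-casting test: return True if (x, y) lies inside the polygon."""
    hit = False
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        # Count edges that a horizontal ray starting at (x, y) crosses.
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            hit = not hit
    return hit


def rasterise(points, polygon, cell_size, n_rows, n_cols):
    """Average the elevations of points inside the polygon into grid cells."""
    buckets = defaultdict(list)
    for x, y, z in points:
        if not inside(x, y, polygon["vertices"]):
            continue  # skip points outside the vector boundary
        row, col = int(y // cell_size), int(x // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            buckets[(row, col)].append(z)
    # Raster data: one value per cell; None marks cells with no points.
    return [[mean(buckets[(r, c)]) if (r, c) in buckets else None
             for c in range(n_cols)]
            for r in range(n_rows)]


dem = rasterise(points, land, cell_size=1.0, n_rows=2, n_cols=2)
print(dem)  # [[11.0, 8.0], [None, 6.0]]
```

The empty cell in the output is the kind of gap that a coarser raster input could fill where no better information exists.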
Rose needed to ensure that GeoFabrics was able to handle each of these data types and that it could handle very large datasets collected across regions or the whole of NZ. But to test these large datasets, Rose needed access to NeSI’s Maui and Mahuika supercomputers.
“GeoFabrics is a library for computing at scale, which allows me to write Python code to run across many CPUs and specify the memory that I want it to use, and also the number of cores that are available. NeSI is really the only platform for producing our hydrologic condition digital elevation models at the national scale.”
Rose worked with NeSI Data Science Engineer Maxime Rio to make GeoFabrics scalable for supercomputers, using pseudocode examples that Maxime provided.
“I had never used a supercomputer before,” says Rose. “We set up a one-hour meeting where I went through code architecture as it was. We went through this at a high level for maybe half an hour. Maxime considered this for about five minutes and then discussed how I could restructure the code to make use of Dask architecture.”
Dask was a key library for turning GeoFabrics’s single-threaded approach, where one command is run after another, into a multi-threaded approach where multiple commands are run concurrently on different supercomputer CPU cores. This led to GeoFabrics’s multistep processing pipeline, in which different data types are processed separately and then combined into a regional grid of digital elevation model data.
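The shift from sequential to concurrent execution can be illustrated with a small Dask sketch. This is not GeoFabrics's own code: the tile names, values, and the `process_tile` and `combine` functions are hypothetical stand-ins for real processing steps. It assumes only that Dask is installed, and uses `dask.delayed` to build a task graph whose independent tile tasks can run at the same time before a final combining step.

```python
from dask import delayed


def process_tile(tile_id, elevations):
    """Hypothetical per-tile step: compute the mean elevation of one tile."""
    return tile_id, sum(elevations) / len(elevations)


def combine(results):
    """Hypothetical final step: merge the per-tile results into one mapping."""
    return dict(results)


# Made-up input: elevation samples grouped by tile.
tiles = {"tile_a": [1.0, 2.0, 3.0], "tile_b": [10.0, 20.0], "tile_c": [5.0]}

# Build the task graph lazily; nothing is computed at this point.
tasks = [delayed(process_tile)(tile_id, values)
         for tile_id, values in tiles.items()]
summary = delayed(combine)(tasks)

# Run the graph on a local thread pool with two workers.
# The per-tile tasks are independent, so Dask can run them concurrently.
result = summary.compute(scheduler="threads", num_workers=2)
print(result)  # {'tile_a': 2.0, 'tile_b': 15.0, 'tile_c': 5.0}
```

The same task graph can be handed to a different Dask scheduler without changing the processing functions, which is what makes this style of code portable from a single machine to a cluster.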
GeoFabrics is an open-access module available on GitHub (link: https://github.com/rosepearson/GeoFabrics). GeoFabrics was developed as part of the Ministry of Business, Innovation and Employment funded project Mā te haumaru ō te wai: Flood resilience Aotearoa. As an open-access module, researchers and flood planners around the world may be able to use it to incorporate different data types into their flood models. By standardising flood model data input, existing models could be improved, and it could be easier to combine satellite, weather station and community data into new models.
“There is documentation for anyone to make use of, either on their own computer, or a cluster. There are examples to help users, so it's easy for users to get set up.”
Through NeSI, Rose was able to develop GeoFabrics so users could build models to run on single cores, clusters or supercomputers. This makes it a useful tool for anybody performing flood modelling, whether they have limited computing power or huge regions to model.
“It's a real treat to be able to work with the highly skilled group of individuals at NeSI. They are responsive to queries or upskilling me so I can better make use of NeSI and its power the next time around.”