Supporting data management, discovery, and access

 

Cover of NeSI 2022 Annual Review

Each year NeSI publishes an Annual Review to celebrate the work we do, the people we work with, and the groundwork we're laying for future directions. Below is an excerpt from our 2022 Annual Review.

 


Throughout 2022, with the growing demand for services, we’ve optimised our storage platforms. We also learned about new approaches and service models for research data management, and considered how to facilitate easy and secure sharing of research data.

Supporting data management, discovery, and access

NeSI is proud to be supporting two large and growing data sets that are nationally, culturally and scientifically significant: the Aotearoa Genomic Data Repository (AGDR), a secure place for researchers to store, browse, and request access to data sets; and the Rakeiora project, a proof-of-concept platform to support clinical genomics research embodying CARE Principles (Collective benefit, Authority to control, Responsibility, Ethics). 

The features and functionality of the AGDR (https://data.agdr.org.nz/), jointly developed by NeSI and Genomics Aotearoa, matured significantly in 2022. A production instance of AGDR was approved to operate by the University of Auckland cybersecurity team and successfully deployed on NeSI's Flexible HPC platform at the end of October.

Development of the service continued with work integrating local context hubs, traditional knowledge, and biocultural labels and notices, which allow Indigenous communities to express local and specific conditions for sharing and engaging in future research. Membership in the New Zealand DOI Consortium allowed us to give each project in the AGDR a Digital Object Identifier (DOI) that can be cited in publications, which makes datasets easier to discover. More than a dozen new datasets were added to the pipeline last year, including lamprey, rimurapa, killer whale, tarakihi, tuatara, tīeke, Venenivibrio spp., hoiho, and Munida spp.

A view of some of the datasets housed in the Aotearoa Genomic Data Repository.

The Rakeiora project is a genomic pathfinder project in Aotearoa funded by MBIE to develop a culturally and genomically safe database. The project is a collaboration between the University of Otago, Ngati Porou Oranga, Waipapa Taumata Rau, Te Whatu Ora Te Toka Tumai, the Institute of Environmental Science and Research (ESR), Ira Tātai Whakaheke, Genomics Aotearoa, and NeSI. In line with Te Tiriti o Waitangi, the project aimed to give effect to mātauranga and tikanga Māori, in particular kaitiakitanga.

“Our role was to ensure Māori beliefs and values were incorporated into the database system,” says Huti Watson, Ngati Porou Oranga, Chairperson for Ira Tātai Whakaheke and a member of the Rakeiora project team. “Taking into account the importance of ensuring NeSI incorporated mātauranga and tikanga Māori alongside international best practice genomic knowledge, we began to establish a uniquely Aotearoa genomic platform that would fit within the newly developing health system. Drawing on these complexities of knowledges, NeSI’s approach was to develop a ‘walled garden’ under the governance and oversight of Ira Tātai Whakaheke who acted as Kaitiaki over the development of the ‘walled garden’. It was a truly insightful journey for us all as we transitioned different stages of development.”

The first phase of Rakeiora was successfully delivered in 2022. In mid-June, NeSI demonstrated a prototype end-to-end system to key stakeholders, whose feedback has been triaged and fed into the next phases of development. A series of governance sessions informed a draft data access agreement, and ongoing hui with Ira Tātai Whakaheke ensure we’re including appropriate metadata and applying the correct controls to different metadata.

A screenshot of the Rakeiora portal interface, currently in development.

Optimising storage solutions

Across 2022 we identified storage capacities and profiles as one of our most pressing matters, and they will continue to be a focus into 2023. New data management features to support data compression and archiving have optimised our high-performance storage capacity, ensuring it delivers the most value possible where needed for data-intensive computing and analytics.

Also, our focus on data lifecycle management and data workflows has allowed us to keep pace with the significant changes to and diversification of Aotearoa’s research landscape.

As part of this work we continue to build awareness within our user community of best practice approaches to data storage and sharing, as well as tools to improve or automate their workflow processes.

Sharing data across and beyond Aotearoa

The National Data Transfer Platform is a growing network of managed endpoints across Aotearoa, as part of an international network of endpoints through Globus, a global research infrastructure for moving, sharing, and discovering data. The National Data Transfer Platform is delivered through a partnership between NeSI, Globus, REANNZ, and New Zealand research institutions. In 2022, Manaaki Whenua – Landcare Research and Scion joined the platform.

Discussions are underway to also onboard the University of Canterbury, University of Waikato, and the Institute of Environmental Science and Research (ESR). Last year saw Globus roll out new functionality with support for web links and better scalability, enhancing its existing strengths in service stability and reliability.

Activity on the National Data Transfer Platform continues to grow, with increases seen in the amount of data transferred, the number of files transferred, and the overall number of transfers made. Ongoing collaboration with AARNet and REANNZ to support trans-Tasman research data transfers reached a major milestone in October 2022, when salmon genome files totalling approximately 3 TB were transferred from the Biomolecular Resource Facility at the Australian National University to Plant & Food Research in three hours. We anticipate this will open more doors for regional research collaborations and improve researcher access to specialised infrastructure to power their projects.
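To put the trans-Tasman transfer in perspective, the average throughput it implies can be estimated with a few lines of Python. This is only a rough sketch: it assumes "3 TB" means 3 × 10^12 bytes (decimal units) and that the transfer took exactly three hours, neither of which is stated precisely above. The function name is illustrative and is not part of any Globus or NeSI tooling.

```python
def average_throughput(total_bytes: float, seconds: float) -> tuple[float, float]:
    """Return average throughput as (megabytes per second, gigabits per second).

    Uses decimal units: 1 MB = 10**6 bytes, 1 Gbit = 10**9 bits.
    """
    bytes_per_second = total_bytes / seconds
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e9


# Assumed figures: approximately 3 TB moved in approximately three hours.
mb_per_s, gbit_per_s = average_throughput(3e12, 3 * 3600)
print(f"{mb_per_s:.0f} MB/s, {gbit_per_s:.2f} Gbit/s")  # prints "278 MB/s, 2.22 Gbit/s"
```

Under those assumptions the transfer averaged a little over 2 Gbit/s sustained across the three hours.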

Table of data transfer metrics from Globus usage in 2022.

These are just some of the many partnerships and activities that kept us busy last year.
Click here to read more from our 2022 Annual Review.

If you'd like to receive a printed copy of our review, get in touch.

 
