How Globus enables national cyber infrastructures
Researchers now more than ever need a highly scalable, friction-free data management solution to effectively address the complexities of conducting modern research.
Careful planning is necessary to avoid research data management obstacles. For example, researchers must be able to handle extremely large data sets, often terabytes and, more recently, petabytes of data, created by powerful instruments and other new tools. Data must be collected and shared rapidly, easily, and securely, whether across campus or across the globe.
Mundane research data management tasks should be fully automated and essentially “invisible” to researchers, allowing them to focus on the core mission of scientific research.
To this end, several countries have set up national cyber infrastructures that democratize access to advanced data management capabilities for researchers at diverse institutions and facilities. Below is a look at how NeSI (New Zealand eScience Infrastructure) is building Data Services in the national interest.
A collaborative approach
As researchers increasingly generate and transfer data at expanding rates, NeSI is responding to growing requirements for a range of data services, including data transfer capabilities.
Since 2014, NeSI has partnered with Globus to offer a high-speed option for transferring large and distributed data nationally and internationally.
Designed for use with NeSI’s national High Performance Computing (HPC) platforms as well as data storage and research facilities around New Zealand, the service was a major step towards a national framework for data sharing and access, and towards supporting international collaboration.
“Bringing the platform online was truly a collaborative effort, involving coordination and cooperation between our international partner Globus, national partner REANNZ, and on a regional scale with several innovative research institutions,” says Brian Flaherty, Data Services Product Manager at NeSI. “Our goal was to lower barriers and to normalise expectations of moving demanding volumes of data to enable data-intensive science.”
NeSI’s National Data Transfer Platform launched with four endpoints:
NeSI HPC systems (Māui and Mahuika) hosted at NIWA in Wellington
AgResearch in Christchurch
University of Auckland in Auckland
University of Otago in Dunedin
In 2021, a fifth endpoint came online at Plant and Food Research in Auckland. A further two endpoints are in later stages of implementation, with Manaaki Whenua - Landcare Research in Hamilton and Scion in Rotorua planned to come online by the end of the year.
NeSI has also been collaborating with AARNet, Australia's national research and education network, to investigate endpoints at Australian research facilities as part of a phased “whole of Australia research sector” Globus subscription.
AARNet is provisioning a service in collaboration with Globus, called SciDataMover, that will help Australian researchers move large datasets between endpoints easily, reliably, and securely. The goal is to deploy Globus across the research sector in Australia, particularly at large data facilities, to support researchers.
AARNet currently has eight universities trialling Globus as part of a proof of concept, and CSIRO has subscribed directly for three years of service. Another six universities have an active interest in deploying Globus in a pilot.
Early-stage tests of transfers between the Australian endpoints are underway, with an eye to enabling higher-performing trans-Tasman research collaborations.
Building national capability
NeSI is invested in building and sustaining capability to operate Globus as a national service provider, and to enable adoption widely across the New Zealand research system.
Using the National Data Transfer Platform, researchers can move data about 1,000 times faster than through a broadband connection. Over the REANNZ network, data transfers can run at 10 Gbps.
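To put the 10 Gbps figure in perspective, the short Python sketch below estimates ideal transfer times for a 1 TB dataset. It is illustrative only: it assumes a sustained line rate with no protocol overhead, disk limits, or network contention, it uses decimal units (1 TB = 10^12 bytes), and it takes the broadband rate to be 10 Mbps, a value inferred here from the "1,000 times faster" comparison rather than stated in the article. The function and constant names are hypothetical.

```python
def transfer_seconds(size_bytes: float, rate_bits_per_second: float) -> float:
    """Ideal transfer time: ignores protocol overhead, disk speed, and contention."""
    if rate_bits_per_second <= 0:
        raise ValueError("rate must be positive")
    return size_bytes * 8 / rate_bits_per_second


TB = 10**12                             # decimal terabyte, in bytes
PLATFORM_RATE = 10 * 10**9              # 10 Gbps, the rate quoted for the platform
BROADBAND_RATE = PLATFORM_RATE / 1000   # assumption: 10 Mbps, implied by "1,000 times faster"

size = 1 * TB
fast = transfer_seconds(size, PLATFORM_RATE)
slow = transfer_seconds(size, BROADBAND_RATE)

print(f"1 TB at 10 Gbps: {fast / 60:.1f} minutes")
print(f"1 TB at 10 Mbps: {slow / 86400:.1f} days")
```

Under these assumptions, a transfer that would occupy a broadband link for more than a week finishes in under a quarter of an hour. Real transfers will be slower than the ideal figure on both links.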
“We know that research today is increasingly data-intensive and NeSI's Data Services make it easier for New Zealand researchers to access, store, transfer, and share large and distributed datasets,” says Flaherty. “Globus is a valuable and essential component to many of those services.”
In 2020, 182 researchers used the National Data Transfer Platform to transfer 870 TB of data in more than 72 million files.
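As a rough sense of scale, the 2020 figures can be turned into simple averages. The sketch below is back-of-the-envelope arithmetic, not a reported statistic: it assumes decimal units, treats "more than 72 million" as exactly 72 million (so the mean file size is an upper bound), and says nothing about how unevenly usage was actually distributed across researchers.

```python
TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes

researchers = 182
total_bytes = 870 * TB
files = 72_000_000  # article says "more than", so this is a lower bound on the count

# Mean volume per researcher and mean size per file.
per_researcher_tb = total_bytes / researchers / TB
mean_file_mb = total_bytes / files / MB

print(f"Mean data per researcher: {per_researcher_tb:.2f} TB")
print(f"Mean file size (upper bound): {mean_file_mb:.2f} MB")
```

The small mean file size relative to the total volume suggests workloads made up of very many files, which is the kind of transfer where automated, managed tooling matters most.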
As a national provider, NeSI is focused on building capability into institutions and across the research community, whether through shared support channels in Slack, Globus webinars, or national forums considering flows of data across the research ecosystem and across the lifecycle of the data.
Supporting FAIR and CARE principles
New Zealand’s researchers are becoming more proactive about research data management. Many communities are assessing their shared understanding of practices, workflows, and governance. As an organisation, NeSI strives to support FAIR (Findable, Accessible, Interoperable, and Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) principles in data and research software.
The National Data Transfer Platform’s potential as a national framework for data sharing and access has important applications in NeSI’s collaboration with Genomics Aotearoa, where discussions are underway to address broader issues in data management, custodianship, and access management requirements for sensitive data, indigenous data, and Māori data sovereignty.
NeSI is working with Genomics Aotearoa to develop a pilot data repository for the storage of genomic data generated from taonga species. By early 2020, it hosted six genome data sets of taonga species, totalling around 5.5 TB.
As the repository takes shape, contributions of test datasets are helping identify challenges and opportunities that exist around how researchers store, share, publish, and archive data. Use of Globus has helped manage the security and data sovereignty considerations related to sharing access to these taonga species collections.
Now more than ever, collaboration is key to maintaining a progressive and sustainable research ecosystem. As the New Zealand research sector looks to answer national science imperatives across institutional boundaries, NeSI seeks to build national capability in running and optimising use of HPC and eResearch infrastructure.
This is where partnerships like NeSI’s with Globus, REANNZ, and research institutions across New Zealand come into play.
"Working together, we can better respond to research community needs — be it related to computational power, data management, or other advanced digital research capabilities," says Nick Jones, Director of NeSI. "Building shared understanding helps us connect with a research community’s aspirations and goals and better equip their researchers to deliver new and valuable insights in their fields at both local and global scales."