Projects

Below is a list of projects with allocations on our facilities. These summaries are derived from the project summaries from the access process, but they are written by NeSI staff rather than by the project teams themselves. They may therefore contain errors and should be considered indicative rather than authoritative.

For more detailed information on specific exemplars of eScience and our high performance computing facilities, see our case studies.

April 2013

Proposal Development Allocation Class

Genome assembly of two New Zealand insect species

Advances in DNA sequencing technologies have now made sequencing whole genomes of non-model organisms possible. These technologies currently generate millions to billions of short sequence reads (50-150 base pairs). Because the reads are short and contain errors, each site in the DNA sequence must be sequenced many times to ensure the sequence is accurate. Furthermore, assembly of the reads into a genome is only possible with a large number of overlapping reads.

Whole genome sequencing and assembly therefore requires the generation of massive data sets. The algorithms for assembling these short reads into a genome are memory intensive. The data set size and the software together create the need for large amounts of RAM and disk storage. Landcare Research is currently gathering sequence data to assemble the genomes of the common stick insect (Clitarchus hookeri) and the Poor Knights giant weta (Deinacrida fallai). Both of these genomes are unexpectedly large, so the assemblers will require a machine with very large memory.

Principal Investigator | Software Used
Thomas Buckley, Landcare Research

Bioavailability of iron in infant formula

Iron fortification of goat milk formula will be studied using quantum chemical calculations and molecular dynamics simulations to evaluate the role of Fe absorption. The casein phosphopeptides from β-casein and αs1-casein will be tested for iron binding sites to assess Fe bioavailability.

Principal Investigator | Software Used
Aruna Awasthi, Callaghan Innovation
  • NWChem
  • VASP
  • LAMMPS

Synthetic Lethal interaction discovery

Synthetic Lethal interactions between genes are regarded as a large untapped reservoir of potential therapeutic targets for cancers. A tool in R has been developed to predict the synthetic lethal partners of a given query gene using normalised TCGA expression data, loops and χ-squared tests on quantile tables. The next phase of this project is to globally predict the synthetic lethal partners of every gene in the dataset. This will prove more effective than a candidate gene approach to find novel genes with high numbers of significant synthetic lethal interactions.
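The χ-squared test on a quantile table mentioned above can be illustrated with a short sketch. This is a generic Python example, not the project's R tool: the choice of three quantile bins, the ranking by raw statistic and the function names quantile_bin, chi_squared and rank_partners are all assumptions made for illustration, and how the actual tool interprets the table to call a synthetic lethal partner is not described in the summary.

```python
def quantile_bin(values, n_bins=3):
    """Assign each sample to a quantile bin (0 = lowest) based on its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def chi_squared(bins_a, bins_b, n_bins=3):
    """Pearson chi-squared statistic for the n_bins x n_bins quantile table."""
    n = len(bins_a)
    # Contingency table: rows are bins of gene A, columns are bins of gene B.
    table = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins_a, bins_b):
        table[a][b] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(n_bins)) for j in range(n_bins)]
    stat = 0.0
    for i in range(n_bins):
        for j in range(n_bins):
            expected = row[i] * col[j] / n  # count expected under independence
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def rank_partners(query_expr, expr_by_gene, n_bins=3):
    """Loop over candidate genes and rank them by chi-squared statistic (assumed criterion)."""
    q = quantile_bin(query_expr, n_bins)
    scores = {
        gene: chi_squared(q, quantile_bin(expr, n_bins), n_bins)
        for gene, expr in expr_by_gene.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Predicting partners for every gene, as the next phase proposes, amounts to running such a loop once per query gene, so the work grows with the square of the number of genes and each query can be processed independently.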

Principal Investigator | Software Used
Tom Kelly, Otago University
  • unspecified

March 2013

Research Allocation Class

Simulating social rules for human population genetics

By analysing populations around the world, population genetics has allowed us to infer the history of humankind. Many analyses in human population genetics have empirically associated mating behaviours with genetic patterns. The bias that marriage rules in human society induce in population genetic inferences has not yet been quantified. The purpose of this study is to provide a theoretical framework, using computational simulation of genetic evolution in human populations, that links mating behaviour and genetic diversity in small communities. A large associated dataset of Indonesian DNA will be used to test the models. This project will be both theoretical and practical, mixing simulation and real genetic datasets.

Principal Investigator | Software Used
Elsa G Guillot, Massey University
  • own codes

Identification of lateral gene transfer events in Clavicipitaceae genomes

The main goal of this project is to identify Lateral Gene Transfer (LGT) events in the genomes of Clavicipitaceae, which are economically important symbionts of grasses. This work is based on phylogeny reconstruction and on the analysis of differences in gene composition. To be able to rebuild the phylogeny, we will first have to annotate the genomes (finding as many gene functions as possible).

Principal Investigator | Software Used
Pierre-Yves Dupont, Massey University
  • own codes

Dynamic rupture simulations of the 2011 Magnitude 6.3 Christchurch earthquake

In February 2011, the city of Christchurch was severely damaged by strong ground motion of the Magnitude 6.3 earthquake. Observed peak ground accelerations in the vertical component were particularly large at stations close to the up-dip end of the fault. Such a feature might have been caused by “wedge effects”, i.e., the interaction of seismic waves with shallow sedimentary layers at the tip of a dipping fault.

In this project, the research group proposes to simulate dynamic rupture processes and the corresponding seismic wave propagation of the Magnitude 6.3 earthquake using the state-of-the-art dynamic-rupture code SPECFEM3D. SPECFEM3D is an elastic wave-propagation code widely used in seismology, and a version of the code has been developed and used for dynamic-rupture problems by the PI and his collaborators over the last several years. The goal of this project is to identify factors influencing the observed high-frequency strong ground motion in the vicinity of the fault trace during the February 2011 Magnitude 6.3 earthquake.

Principal Investigator | Software Used
Yoshihiro Kaneko, GNS Science
  • SPECFEM3D

How does the heart grow?

This project will test the hypothesis that interactions between cell proliferation, cytoskeletal reorganisation and shear stress play a role in the development of the heart during days 20-28 of its development. The project team will use experimental measurements of shape change and tissue structure (part of the research project but not this application) combined with continuum mechanics modelling of the growing heart to predict cell proliferation, growth, apoptosis and migration. Continuum mechanics and finite element methods provide a mathematical framework for solving the coupled physical laws that govern electrical activation processes, large deformation mechanics and 3D fluid flow.

Principal Investigators | Software Used
Dr Chris Bradley and Prof. Peter Hunter, Auckland Bioengineering Institute
  • OpenCMISS

Metagenomic analysis of aerosol samples collected from a meat processing environment

It is known that meat workers have a higher incidence of lung cancer, and a biological cause has been proposed. In this project we have collected aerosol samples from meat processing factories and are investigating the composition of organisms in these samples by applying culture-independent approaches, namely next generation sequencing and metagenomics. Sequencing results are compared with databases using BLAST and other sequence search software to identify and characterise micro-organisms present within the samples.

Principal Investigator | Software Used
Richard Hall, ESR
  • mpiBLAST
  • NCBI BLAST+

Gangnam-style melting: An intriguing structural trend in first-principles melting of gallium clusters

Gallium is a peculiar and interesting metal. With an intriguingly diverse phase diagram, the multitude of potential applications for such a polymorphic element has generated extensive experimental and theoretical interest extending from the bulk to the molecular scale. In the bulk phase at normal temperature and pressure, alpha-gallium is often termed a “molecular metal,” with a structure composed of covalently bound dimers arranged in metallic planes. Diminished to the size of tens of atoms, this unique metal continues to beguile, with single-atom additions changing the melting temperature by as much as 100 K and cluster melting temperatures nearly twice that of the bulk.

As nanoscale particles are typically less stable than their bulk counterparts, this thermodynamic behavior is in striking contrast to classical expectations. At present, the chemistry underlying the molecular, pair-bonded nature of alpha-gallium as well as the fascinating thermodynamic behavior at the nanoscale remains poorly understood.

A previous research allocation has allowed the research group to attain converged results for 7 clusters, taking a significant step toward understanding the intriguing size-sensitivity and anomalous greater-than-bulk melting experimentally reported for these small clusters. These novel results have revealed a size limit of 9 atoms as a lower bound to melting in gallium, possibly identifying the transition from the cluster to the molecular regime. In alluring contrast to previous experimental and theoretical work on gallium, we have also discovered the first small gallium cluster melting at a lower-than-bulk temperature. This fascinating result is partially attributed to persistent pair bonding reminiscent of the bulk structure of alpha-gallium. This work has led to two publications: one published in a peer-reviewed journal and one submitted for review. The previous research allocation has also allowed us to make significant progress towards modelling 5 larger clusters sized 32-36 atoms. Although 3 of the 5 clusters are fully converged, 2 require additional simulation time to reach convergence. As each of these clusters has been experimentally measured, the research group has been able to compare our initial results to the experimental curves.

Initial analysis has revealed an interesting Gangnam-style structural trend, yielding important insights into the way these clusters melt as well as the intriguing one-atom thermodynamic differences observed for these small systems. In probing the nature of bonding in these small clusters, this research will contribute significantly to our understanding of metallic and covalent systems as well as how these properties might contribute to both bulk and nanoscale characteristics.

Principal Investigator | Software Used
Krista G. Steenbergen, MacDiarmid Institute/VUW
  • VASP

NZLAM-12 Hindcast Simulations To Force A Lagrangian Dispersion Model For Studying The New Zealand CO2 Budget

The main scientific goal is to quantify New Zealand's carbon budget, in particular the terrestrial CO2 uptake, by combining the information about atmospheric transport with measurements made at the surface. Regional Lagrangian modeling is a powerful tool to link CO2 measurements at stations directly with atmospheric transport and regional CO2 uptake and release. When run in backward mode, regions of CO2 release or uptake that contributed to the measurements made at a specified receptor can be identified.

The project team will use a combination of the NZLAM-12 regional Numerical Weather Prediction model and the NAME III dispersion model, both developed by the UK Met Office, to produce large numbers of high-resolution back trajectories over a recent two year time period, which will enable us to identify predominant transport pathways. Combining these pathways with available measurements will allow us to estimate New Zealand's terrestrial CO2 sink.

Principal Investigator | Software Used
Kay Steinkamp, NIWA
  • NZLAM-12
  • NAME III

Proposal Development Allocation Class

Stream Biofilm

The stream biofilm research project aims to determine the factors driving biofilm function and the influence of these on the stream food webs. The research group also aims to determine the critical factors for ensuring the establishment and maintenance of biofilms and their contribution to environmental services in stream systems. High throughput sequencing is one of the methods currently being used to investigate the constituent members of the bacterial community in stream biofilms, and how the community composition changes according to environmental variables, anthropogenic disturbances and biogeography.

Principal Investigator | Software Used
Kelvin Lau, The University of Auckland
  • unspecified

Seismic tremor in Mahia-Gisborne region

This is a Marsden-funded research programme (2012-2014) targeting the detection, location and understanding of seismic tremor and seismic phenomena associated with slow slip deformation events on the Hikurangi subduction zone. The focus for this project is the Mahia-Gisborne region.

Principal Investigator | Software Used
Stephen Bannister, GNS Science

Computational challenges arising from large scale population resequencing

LIC has embarked on a large scale population resequencing project for Bos taurus. Numerous computational challenges arise in short read mapping, population or pedigree based SNP calling, genotype phasing and imputation. Currently 500 animals have been resequenced. LIC hopes to impute up to 50,000 animals genotyped on a 50K SNP chip into the reference population. The team would like to determine the tradeoff between computational time and accuracy for this imputation.

Principal Investigator | Software Used
Mike Keehan, LIC

JStar Parallelism Benchmarks

This project will measure the speedup of JStar programs in massively parallel computing environments.

Principal Investigator | Software Used
Assoc. Prof. Mark Utting, Waikato University

Molecular Dynamics and Enzyme Function

The research group is aiming to understand how molecular motions determine the catalytic and allosteric properties of enzymes.

Principal Investigator | Software Used
Emily Parker, Canterbury University

Groundwater properties simulation and uncertainty propagation

A broad range of open and closed source groundwater modelling tools and frameworks exist, for example MODFLOW (USGS), the MIKE and FEFLOW series (DHI) and ArcGIS models (ESRI). However, because of their spatial data components and inter-relationships, those applications typically have a mainly sequential workflow. Additionally, the uncertainty of the simulated data cannot be captured easily. Therefore, within the SMART project, the research group is seeking to develop an open implementation of a simple groundwater properties data simulation.

Principal Investigator | Software Used
Alexander Kmoch, GNS Science
  • own codes

Assembling a de novo eukaryotic genome using genetic markers

This project is exploring a new method for assembling de novo eukaryotic genomes using typed offspring from a cross. The research group's particular focus is yeast.

Principal Investigator | Software Used
Prof. Richard Gardner, The University of Auckland
  • own codes

Quasi-static modeling of deformation at the Hikurangi subduction interface

This project will involve both static (elastic) and viscoelastic models of deformation occurring along the Hikurangi subduction zone. The static models will involve the generation of Green's functions for use in inversions for slow slip events. Current inversions make use of homogeneous elastic half-space models to examine the slip distributions. Using the finite element model we can use a New Zealand-wide velocity model to look at the effects of material heterogeneity as well as topographic effects. The viscoelastic models will look at the behavior of the subduction zone over multiple earthquake cycles, examining the roles of rheology and frictional fault behavior.

Principal Investigator | Software Used
Charles Williams, GNS Science

February 2013

Proposal Development Allocation Class

GCMC simulations for Metal-Organic Frameworks

Metal-Organic Frameworks (MOFs) are known as the most porous materials discovered so far. Their high porosity and robust structures make MOFs promising for real-life applications such as energy gas storage, separation, purification and greenhouse gas capture. Grand Canonical Monte Carlo (GCMC) simulations make it possible to simulate the surface area of a MOF and screen its gas storage and separation capabilities. They also help scientists to screen hypothetical MOFs for particular applications, such as methane storage, and to target the best candidates for synthesis.

Principal Investigator | Software Used
Assoc. Prof. Shane Telfer, Massey University
  • unspecified

January 2013

Research Allocation Class

Application development for computer models of cell coupling in vascular geometries

The aim of this project is to develop a pilot computational model able to investigate the relationship between the cellular chemical dynamics of endothelial and smooth muscle cells and blood flow in areas of the vasculature that are susceptible to atherosclerosis.

Principal Investigator | Software Used
Prof. Tim David, University of Canterbury
  • unspecified

Proposal Development Allocation Class

Dynamic rupture simulations of the 2011 Magnitude 6.3 Christchurch earthquake

In February 2011, the city of Christchurch was severely damaged by strong ground motion of the Magnitude 6.3 earthquake. Observed peak ground accelerations in the vertical component were particularly large at stations close to the up-dip end of the fault. Such a feature might have been caused by “wedge effects”, i.e., the interaction of seismic waves with shallow sedimentary layers at the tip of a dipping fault.

In this project, the research group proposes to simulate dynamic rupture processes and the corresponding seismic wave propagation of the Magnitude 6.3 earthquake using the state-of-the-art dynamic-rupture code SPECFEM3D. SPECFEM3D is an elastic wave-propagation code widely used in seismology, and a version of the code has been developed and used for dynamic-rupture problems by the Principal Investigator and his collaborators over the last several years.

The goal of this project is to identify factors influencing the observed high-frequency strong ground motion in the vicinity of the fault trace during the February 2011 Magnitude 6.3 earthquake. Currently, a full simulation would take several hours on ~100 cores.

Principal Investigator | Software Used
Yoshihiro Kaneko, GNS Science
  • SPECFEM3D

Massively parallel protein similarity searches

The Smith-Waterman algorithm can be used to search a database for protein sequences similar to a query. This is much slower than heuristic methods (e.g. BLAST) but will find the best matches. A recent implementation that speeds up the algorithm is called SWIPE or mpiswipe. The research group wishes to test its ability to speed up the searches on a massively parallel machine, the BlueGene/P.
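The Smith-Waterman search described above can be sketched as a small dynamic programme. This is a minimal illustration in Python, not the SWIPE or mpiswipe implementation: the scoring values (match, mismatch and a linear gap penalty) and the function names smith_waterman and search are assumptions chosen for the example, and real protein searches normally use a substitution matrix and affine gap penalties.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences."""
    rows, cols = len(query) + 1, len(target) + 1
    # H[i][j] is the best score of a local alignment ending at
    # query[i-1] and target[j-1]; row 0 and column 0 stay at zero.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == target[j - 1]
            diag = H[i - 1][j - 1] + (match if same else mismatch)
            up = H[i - 1][j] + gap    # gap in the target
            left = H[i][j - 1] + gap  # gap in the query
            # The floor at zero is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


def search(query, database):
    """Score the query against every database entry, best match first."""
    scored = [(smith_waterman(query, seq), name) for name, seq in database.items()]
    return sorted(scored, reverse=True)
```

Every query/database pair fills its own scoring matrix, with cost proportional to the product of the two sequence lengths. This is why the exhaustive search is slow, and also why it can be split across many processors: each database sequence can be scored independently.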

Principal Investigator | Software Used
Chris Brown, Otago University

Modelling interaction between ocean and ice shelves in the Ross Sea

This project will further investigate how to build and successfully run long-term ROMS (Regional Ocean Modeling System) simulations on the NIWA HPCF. The scientific goal is to understand more of the interaction between the ocean and the Ross Ice Shelf and the annual variability of these processes. ROMS provides the computational framework for the simulations, which need to be adapted and optimized to utilize the performance advantages of the NIWA HPCF.

Principal Investigator | Software Used
Stefan Jendersie, NIWA
  • ROMS

Jurassic Genomics – Using the Tuatara Genome data to re-examine the Basal Reptilian Phylogeny

The tuatara, Sphenodon punctatus, is iconic and unique to New Zealand. It is perhaps one of the most enigmatic of extant terrestrial vertebrates. Once widespread across the supercontinent of Gondwana, tuatara are now restricted to a small number of offshore islands in Cook Strait and the north of the North Island, New Zealand.

Via a collaboration with Ngatiwai iwi and funding and support from the Allan Wilson Centre, Centre for Reproduction and Genomics, New Zealand Genomics Ltd, Illumina and Biomatters Ltd., the research group has begun to sequence the genome of this internationally iconic species as part of the Genome 10K initiative.

The reasons for sequencing the tuatara genome are manifold. Foremost among these is that the tuatara is phylogenetically unique: it is the only living member of an archaic reptilian order, Rhynchocephalia (Sphenodontia), that last shared a common ancestor with the rest of the reptiles some 220-250 million years ago. As such, tuatara represent a key link to the now extinct stem reptiles from which dinosaurs, modern reptiles, birds and mammals evolved. This provides unique insight into what those early vertebrate ancestors may have been like.

Our newly acquired tuatara data, together with data from recently completed representative turtle, crocodilian, lizard, amphibian, fish and bird genomes, provide an opportunity to examine the pattern and timing of diversification at the base of the modern reptiles, a problem that has proven elusive despite significant efforts, including those of Allan Wilson himself.

Principal Investigator | Software Used
Neil Gemmell, Otago University

NZLAM-12 Hindcast Simulations To Force A Lagrangian Dispersion Model For Studying The New Zealand CO2 Budget

The main scientific goal is to quantify New Zealand's carbon budget, in particular the terrestrial CO2 uptake, by combining information about atmospheric transport with measurements made at the surface. Regional Lagrangian modeling is a powerful tool to link CO2 measurements at stations directly with atmospheric transport and regional CO2 uptake and release.

When run in backward mode, regions of CO2 release or uptake that contributed to the measurements made at a specified receptor can be identified. We will use a combination of the NZLAM-12 regional Numerical Weather Prediction model and the NAME III dispersion model, both developed by the UK Met Office, to produce large numbers of high-resolution back trajectories over a recent two-year time period. This will enable the research group to identify predominant transport pathways. Combining these pathways with available measurements will allow us to estimate New Zealand's terrestrial CO2 sink.

The research group's first aim is to discover the optimal model set-up that will provide the most economical NZLAM-12 model throughput and to determine the performance of the NAME III dispersion model in an HPCF computing environment. This will include building and testing the necessary post-processing steps. Ultimately, we aim to generate a continuous meteorological hindcast over the period September 2010 to January 2013. While the project focuses on CO2, the high-resolution meteorological hindcast can be used to study a wide range of atmospheric species associated with climate change, ocean biogeochemistry, atmospheric chemistry, and air pollution.

Principal Investigator | Software Used
Kay Steinkamp, NIWA
  • NZLAM-12
  • NAME III

December 2012

Proposal Development Allocation Class

Origins of Two-Dimensional Quantum Turbulence

The research group is attempting to simulate quantum turbulence. This is a computationally intensive task and the team anticipates that the use of NeSI HPC resources will enable deeper and more wide-ranging studies of the origins of two-dimensional quantum turbulence than have previously been attempted.

The characteristics of turbulent flow in a two-dimensional classical fluid are remarkably different from those of turbulent flow in three-dimensional situations. Key features of classical two-dimensional turbulent dynamics are an inverse cascade of energy from small to large scales, and an associated aggregation of vorticity into large, coherent, vortex structures. In contrast to the classical case, turbulence in superfluids — viscosity-free, quantum degenerate fluids such as an atomic Bose-Einstein condensate — is strongly influenced by the laws of quantum mechanics.

The dynamics of quantum turbulence are determined by the nonlinear Gross-Pitaevskii equation, and are dominated by the motion of quantized vortices, which represent a fundamental and indivisible unit of vorticity. While quantum turbulence in three-dimensional superfluids has been extensively studied, the characteristics of two-dimensional quantum turbulence remain largely unexplored. In particular, demonstrating the existence, and possible origins, of inverse energy cascade and the formation of coherent rotating structures in two-dimensional quantum turbulence remains an open problem. Addressing this problem requires the study of superfluid systems which can be compared to classical models such as Onsager's point-vortex model and the Navier-Stokes equations. Such systems necessarily contain a large number of quantized vortices and display dynamics over a wide range of scales.
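For reference, the Gross-Pitaevskii equation mentioned above is commonly written in the following form. The symbols (condensate wavefunction ψ, particle mass m, external potential V, interaction strength g and integer winding number n) are standard notation rather than anything taken from the project summary, and the exact form used by the research group may differ, for example by including damping terms or dimensionless units.

```latex
% Time-dependent Gross-Pitaevskii equation for the condensate wavefunction
i\hbar \frac{\partial \psi(\mathbf{r},t)}{\partial t}
  = \left[ -\frac{\hbar^{2}}{2m}\nabla^{2} + V(\mathbf{r})
           + g\,\lvert \psi(\mathbf{r},t) \rvert^{2} \right] \psi(\mathbf{r},t)

% Quantization of circulation around a vortex, with superfluid velocity v
\oint \mathbf{v}\cdot d\mathbf{l} = n\,\frac{h}{m}, \qquad n \in \mathbb{Z}
```

The second relation expresses why each vortex carries an indivisible unit of vorticity: the circulation of the superfluid velocity around any closed loop can only take integer multiples of h/m.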

Principal Investigator | Software Used
Ashton Bradley, University of Otago
  • Own codes

November 2012

Research Allocation Class

Detailed analysis of the electronic structure of electron rich π-systems

The problem of the storage of molecular hydrogen is a major issue and its solution a first step on the way to a hydrogen economy. Hydrogen is an energy carrier that has many advantages over fossil fuels and other sources of energy that are currently in use: it burns cleanly with oxygen to form water as its only product, and the fuel cell technology used in this process is well developed in the field of engineering. However, the storage of hydrogen before it is used to generate energy is difficult because of hydrogen's nature as a volatile, combustible gas.

Intercalating molecular hydrogen between graphene sheets could be a way to achieve the desired storage concentrations in order to compete with straightforward compressed gas. But in order to achieve this goal it is necessary to understand the properties of the aromatic surface in detail so it can be functionalized to accommodate guest molecules that do not typically bind to plain graphene.

The probe used to understand those surface properties will be a transition metal atom, ruthenium. Ruthenium is a versatile element with many applications in organometallic chemistry. The first sandwich compound of ruthenium, ruthenocene (a ruthenium atom "sandwiched" between two aromatic rings), was synthesized as early as 1952.

In the first phase of the project all theoretically possible structures of a RuCp*-HBC complex will be modeled in order to determine the structure that is lowest in energy. In the next step these findings will be corroborated through conformational searches to determine the influence that the position of the side chains of the molecules might have on the total stability. Those results will then be used to adjust the initial calculations and a final data set will be produced. This final data set of structural information will then be subjected to a range of analytical tools in order to determine the characteristics of the chemical bonding that are responsible for directing the metal atoms to the binding sites that are found experimentally. The most prominent analytical tools will be the QTAIM (Quantum Theory of Atoms in Molecules) method, which uses an analysis of the topology of the electron density to describe the bonding features, and NBO (Natural Bond Orbital) analysis, which uses localized orbitals for a Lewis-structure-like description of the bonding situation. The combination of these two methods ensures a complete and unbiased picture of the compounds in question.

Principal Investigator | Software Used
Matthias Lein, VUW
  • Own codes

Proposal Development Allocation Class

Identification of lateral gene transfer events in Clavicipitaceae genomes

The research group will use the allocation to determine which of NeSI’s facilities is best suited for the analysis. The scientific goal of this project is to identify Lateral Gene Transfer (LGT) events in the genomes of Clavicipitaceae, which are economically important symbionts of grasses. This work is based on phylogeny reconstruction and on the analysis of differences in gene composition. To be able to rebuild the phylogeny, we will first have to annotate the genomes (finding as many gene functions as possible).

To annotate the genomes, the research group will use BLAST to find homologies with known genes in Clavicipitaceae genomes. The reconstruction of the phylogeny and the identification of LGT events will then use parallelized algorithms based on Markov or Bayesian models.

Principal Investigator | Software Used
Pierre-Yves Dupont, Massey University

HPC challenges in the development of a multi-scale, multi-physics model of the heart

The heart is a complicated organ responsible for pumping oxygenated blood around the body. The pumping function of the heart is the result of multiple physical processes which interact together in a coordinated fashion. Some of these processes include the electrical activation of the heart muscle, the mechanical contraction of the muscle and the fluid mechanics of the blood that is pumped from the heart as a result of the muscle contraction. Degradation of these processes or their precise interaction can result in heart disease and failure of pump function.

To better understand the function of the heart and the processes and implications of heart disease the Auckland Bioengineering Institute (ABI) has been developing a coupled multi-scale and multi-physics mathematical model of the heart for the last 30 years. This mathematical model generates an extremely large number of equations which require the use of large parallel computers to solve. In addition to the heart model the ABI has also been developing OpenCMISS, an open source computing environment for the solution of biological and bioengineering models. OpenCMISS is a major re-write of our current code CMISS and has been designed from the outset to take advantage of modern high performance computer architectures. Architectures allowed for include shared memory, distributed memory (using MPI) and accelerators (GPUs and FPGAs).

After the initial development and testing of OpenCMISS we are now in a position to look at scaling up the size of the models and concentrating on improving the performance of the code. This is a crucial step in order to demonstrate the feasibility of using OpenCMISS for large coupled models such that further research funding can be obtained. The next step for OpenCMISS is to improve the performance and scalability of the code.

Initial optimisations have started on machines with a lower CPU count and we require access to a larger machine in order to take this further. Another crucial development step is parallel I/O. The temporal nature of the problem requires that a large amount of data representing the solution at a point in time be written out at each time step during the solution process. In order to obtain an effective overall solution it is vital that the I/O is efficient and scalable. For I/O, OpenCMISS uses FieldML, a standard being developed at the ABI to encode spatially and temporally varying information. Initial work has started on using FieldML together with HDF5 for parallel I/O. As part of the development proposal we would look to investigate and optimise the best possible I/O platform (e.g. MPI-IO, HDF5, NetCDF, ADIOS etc.) for OpenCMISS and FieldML.

In order to model the physical processes that occur at small time and space scales fine computational grids are required. However, the use of uniform high density grids may result in an unfeasibly large computational problem. As a possible further development step we would investigate the use of adaptive meshing techniques to improve computational performance. The use of adaptive meshing in distributed computing architectures introduces a number of challenges e.g., minimising the cost of re-distribution of data and load balancing problems.

Principal Investigator | Software Used
Chris Bradley, ABI

October 2012

Proposal Development Allocation Class

Tidal Energy Array Optimisation

The project team is undertaking a preparatory project to evaluate the potential of Gerris on NeSI’s POWER6 and Intel clusters. These results will be compared with the performance and accuracy of NIWA’s current modelling cluster. This will enable the project team to determine whether performance advantages can be gained over the current workflow to meet the current project’s research goals.

Principal Investigator | Software Used
Tim Divett, NIWA

Tidal Energy Array Optimisation

This doctoral project aims to develop a better understanding of the behaviour of composite helicopter structures when impacting on water. Helicopters are generally used to access locations in challenging environments such as offshore platforms. Surveys conducted in the 1990s showed that the proportion of helicopter accidents occurring over water was significant and that most of them led to severe or fatal injuries. Although engineers are aware that a structure impacting on hard ground will provide a mechanical response very different to an impact on a soft surface or a fluid, current helicopter designs still perform very poorly in the latter situation.

The main objective of the project is to develop accurate modelling methodologies to capture the response of a helicopter structure impacting on water using the current numerical methods such as SPH and ALE, implemented in commercially available explicit structural analysis software (LS-DYNA). Numerical models will be developed for rigid and deformable structures in 2D and 3D configurations. The results will be validated through comparison with experimental tests conducted in Auckland with a Servo-Hydraulic Slam Testing System. Once the numerical models are validated, they will be used as predictive models to understand the mechanical response of composite structures impacting on water under various crash conditions.

Principal InvestigatorSoftware Used
Thomas Billac, The University of Auckland

Performance Analysis of Algorithms to Validate XML Keys

This project seeks to study several alternatives for parallelising sequential algorithms the team has devised to process large XML document collections. The team seeks to formulate three types of parallelisation: 1) using the MapReduce programming model, 2) MPI parallelisation on distributed memory using BSPonMPI, and 3) OpenMP parallelisation, forming a hybrid parallel scheme with BSPonMPI.

Principal InvestigatorSoftware Used
Flavio Ferrarotti, The University of Auckland

The Vault Enigma

The project involves identifying homologs of the Major Vault Protein (MVP) across groups. MVP monomers spontaneously form vault particles and the equilibrium is heavily in favour of whole vaults in metazoa. The project team seeks to know how such a highly conserved protein evolved. There are additional questions, such as a) whether MVP homologs form vault particles in all the species where they are found and what its purpose might be or if its purpose depends on its environment, b) whether the RNA associated with the vault ribonucleoprotein is very ancient, as many RNAs involved with RNPs are, or if it is a relatively recent addition in some species only, and c) why MVP (or indeed vaults) don't seem to be needed in some species.

Principal InvestigatorSoftware Used
Toni Daly

Modelling interaction between ocean and ice shelves in the Ross Sea

With a customised version of the Regional Ocean Model System (ROMS), the team aims to simulate the observed variability of water mass formation and exchange between McMurdo Sound and the surrounding Ross Sea. This is a preparatory project evaluating our ROMS derivative’s potential on NeSI’s POWER 6. Performance and accuracy will be compared with NIWA’s Turbine cluster, our current platform, to determine what performance advantages can be gained to meet the project's research goals.

Principal InvestigatorSoftware Used
Stefan Jendersie

September 2012

Research Allocation Class

Breaking superheating in gallium clusters: size limit for greater-than-bulk melting

Gallium’s use in many semiconductor technologies makes its electronic and structural properties of particular interest. However, gallium’s place in the periodic table as an open-shell, group 13 metal with d-electrons makes predictive modelling and simulation especially challenging. One intriguing attribute of this unique metal is the melting temperature: bulk Ga melts at a relatively tepid 303 K, while small Ga clusters (n=30-55) have been shown to melt at temperatures as high as ~500–800 K. Additionally, one-atom size differences can shift the melting temperature by up to 100 K. While these anomalies have been discovered and demonstrated experimentally, theory and modelling lag in providing a full description of the underlying cause.

A better understanding of these fascinating characteristics could help us better exploit gallium’s molecular properties for use in modern, increasingly “nano” nanotechnologies. Using density functional theory-based molecular dynamics simulations as implemented in VASP, coupled with our novel parallel tempering wrapper code, we will investigate the thermodynamic properties of small gallium clusters in two size regimes.

Using clusters sized 32-38 atoms, we compare our results to those of experiment in order to validate our model. For simulations completed to date for the 20- and 34-atom cases, we have been able to demonstrate that our simulations capture the melting temperature and latent heat behavior to a high degree of accuracy. By extending the model to additional cluster sizes, we will be able to probe the electronic and atomic structure contributions to the intriguing one-atom differences experimentally observed for these clusters. We also extend our simulations to the size range of 6-15 atoms, in order to simplify the system and better illustrate the various contributions to melting mechanisms.

With simulations already completed on Pan, we have been able to demonstrate that the greater-than-bulk melting trend appears to hold for the 11-atom case but breaks down for 10 atoms and below. Additional cluster sizes will offer more clarity regarding the mechanism by which this breakdown occurs, offering additional insight into the physics driving this anomalous thermodynamic behavior. With the unprecedented number of cluster sizes, this exciting work will substantially extend our understanding of these small, intriguing systems. Given the recent research focus on the thermodynamic properties of small clusters, we are certain the results will be of great interest to the scientific community at large.

Principal InvestigatorSoftware Used
Krista G. Steenbergen, VUW

Proposal Development Allocation Class

Dipolar gas thermodynamics

The project seeks to benchmark a current codebase for dipolar gas thermodynamics on NeSI's Pan cluster. Results will be used to gauge the feasibility of future projects.

Principal InvestigatorSoftware Used
Blair Blakie, Otago University
  • In-house codes

Can species delimitation methods mistake population structure for speciation?

The discovery of new species is one of biology’s most important goals. It is estimated that the 1 million species currently described are only 10% of the true diversity of life on earth. This lack of knowledge hampers conservation and means the real diversity of life is not available for biologists to study. The widespread availability of DNA sequence data presents a powerful opportunity to overcome the so-called “taxonomic impediment”. Indeed, methods of species delimitation based on the coalescent model of population genetics are now routinely used to test hypothesised species boundaries. These methods are certainly very powerful, but great power comes with great responsibility. Powerful methods greatly increase the chance of false positives.

This may be particularly important for species delimitation methods, as the null models used presume putative species are drawn from a single population with no degree of genetic structure. This is seldom the case in species delimitation studies, and deviation from this assumption may lead to positive results even in the absence of speciation. We have recently found that a commonly used species delimitation statistic (the GSI) is prone to false positives in the presence of population structure.

The project team is interested in finding out whether a more sophisticated approach, the Bayesian species delimitation algorithm implemented in the program bpp, is prone to similar errors. In order to test this we have simulated genetic datasets under various demographic histories, with the hope of running bpp on these simulated data to measure the effect of population structure on the results of species delimitation with bpp. The large number of simulations required to tackle this question, combined with the current hour-long run-time for each analysis, means we simply cannot achieve these analyses without some sort of High Performance Computing. Each analysis is independent of the others, so the whole process should be easily parallelised.

Principal InvestigatorSoftware Used
David Winter, Otago University
  • bpp
  • In-house codes

powerPlant NeSI

The project team is seeking to evaluate the appropriateness of NeSI’s facilities for staff at Plant + Food Research. The team will be profiling applications which are run in-house. This data will be used to inform future applications.

Principal InvestigatorSoftware Used
Matthew Laurenson, Plant + Food Research

Trial Parallelised Genome Assembly Using Ray Assembler

Over the past few years, the research team has needed to assemble genome sequences de novo. The team’s approach has been to utilise large-memory SMP boxes. Over time the type of genomes is moving away from model genomes to real-world genomes of increased complexity and size. Approaches using single SMP boxes to tackle these tasks may ultimately prove unsustainable as genome size increases into the tens of gigabases. The team intends to trial the use of Ray Assembler on NeSI infrastructure.

Principal InvestigatorSoftware Used
Ross Crowhurst, Plant + Food Research

August 2012

Proposal Development Allocation Class

Mountain precipitation

Investigation of the effect of temperature on modelled precipitation distribution in the Southern Alps.

Principal InvestigatorSoftware Used
Tim Kerr, NIWA

Probabilistic Local Tsunami Hazard Assessment for New Zealand

The goal of this investigation is to be able to make assessments of the tsunami hazard for the whole New Zealand coast and onshore run-up areas.

Principal InvestigatorSoftware Used
Christof Müller, GNS Science

July 2012

Research Allocation Class

Bayesian evolutionary analysis: Species trees, phylogeography, epidemiology and model averaging

This research aims to develop computational tools to improve our understanding of the processes that generate Earth's biodiversity. The project is split into three themes:

  • develop novel methods for estimating species delimitation and relationships directly by combining genomic and ecological data
  • address the integration of ecological niche modeling with statistical phylogeography to aid prediction of future ecological outcomes
  • statistical unification of mathematical epidemiology and viral phylogenetics to create analytical methods that can better predict epidemic disease outcomes from sequence-based surveillance data

Principal InvestigatorSoftware Used
Alexei Drummond, The University of Auckland
  • BEAST 1
  • BEAST 2
  • Hamlet
  • in-house codes

The chemical environment of metal clusters and complexes

Gold clusters and nanoparticles have attracted continuing attention due to their interesting and important electronic, catalytic and optical properties. Understanding the effects of the chemical environment on these properties is essential to obtain a complete and realistic picture. In order to understand the influence of the ligand shell on the electronic properties, we study various gold and gold-palladium clusters interacting with different ligands using quantum chemical methods.

Principal InvestigatorSoftware Used
Doreen Mollenhauer, VUW

Proposal Development Allocation Class

Valuing ecosystem services in a new recreational forest park in the Bay of Plenty Region

This project aims to address two key questions in choice experiments (an economic valuation method): (1) Do order and latent response affect the accuracy of estimating the parameters in choice models? and (2) If so, do these two also affect the behavioural efficiency of respondents?

Principal InvestigatorSoftware Used
Richard Yao, Scion

Improved numerical weather prediction (NWP) over New Zealand using WRF

This project aims to find an improved WRF NWP model configuration for New Zealand. It seeks to answer questions about which convection scheme, land surface models, land-use data, and resolution perform best under different synoptic situations, both on average and in extreme situations.

Principal InvestigatorSoftware Used
Andy Ziegler, MetService

Genetic evolution of human populations

This project is investigating the effect of mating system on the genetics of human populations through simulation. The project consists of developing a program for such simulation, then running those simulations.

Principal InvestigatorSoftware Used
Elsa G Guillot, Massey University
  • in-house codes

Multiple optima of likelihood on trees

It is known that the maximum likelihood function can have multiple local optima on a given tree. Some simulation studies suggest that this is not likely to affect tree-building, but these simulations used data generated on a single tree. In contrast, it has been shown that simple mixture models can generate data where multiple optima can occur even on the tree with the highest likelihood. We will investigate how often multiple optima occur with real biological sequence data, by hill-climbing from random starting points on random topologies.

Principal InvestigatorSoftware Used
Bennet McComish, Massey University

Genome Mining

This project aims to assemble the sequences of transcripts of genomes from short reads provided by next generation sequencing machines. Once the transcriptome is assembled it will be analysed to identify the functions of genes.

Principal InvestigatorSoftware Used
Chris M. Brown, Otago University

Modelling methotrexate in red blood cells

The project aims to determine mechanisms of methotrexate loss from red blood cells. The team have developed a model from patient data, which can be used to test hypotheses.

Principal InvestigatorSoftware Used
Stephen Duffull, Otago University

Using High Performance Computing to Investigate Bovine and Human Genomic Information for the Dairy and Human Health Fields respectively

The overall aim of the project is to develop the capability to analyse whole genome and exome next generation sequence data from bovine and human samples in order to find causative mutations. The expectation is that the solution will evolve with the development of the software approaches. It is also anticipated that this project will be the forerunner of many human genetics projects. Our initial interests in human biology will be in familial neurological disorders with unidentified mutations and autism spectrum disorder.

Principal InvestigatorSoftware Used
Russell Snell, The University of Auckland

Mining deep sequencing data for the discovery of novel viruses in humans, animals and environmental samples

ESR presently has a number of projects funded by the ESR Capability Fund and the New Zealand Health Research Council, which seek to identify new or unknown viruses in human, environmental or animal samples. Datasets are generated from Roche or Illumina sequencing instruments. These may consist of 5-7 million sequences, each 150 base pairs in length. This project requires the comparison of each sequence read to known sequence databases (i.e. GenBank) to determine if viruses are present.

Principal InvestigatorSoftware Used
Richard Hall, ESR
Jing Wang, ESR

Identification of regions of differential methylation in human sequence data

The goal of this project is to identify differentially methylated sections of the human genome using computational methods. Sequence data have previously been generated via reduced representation bisulphite sequencing (RRBS), allowing whole-genome identification of CpG site methylation within the DNA of ten individuals. Using a computational approach, we would like to identify regions of the genome that exhibit different levels of methylation across individuals, with each region comprising multiple CpG sites.

Principal InvestigatorSoftware Used
Mik Black, Otago University
  • R
  • in-house codes

Material point method simulations of complex deformable solids

The Material Point Method is a particle-based numerical method that is used to simulate the dynamics of solids and solid-fluid interactions. This project will leverage the Uintah parallel computing framework to explore new constitutive models for foam growth and nonlinear foam deformation. The goal is to develop the capability for virtual testing of complex materials over a range of length scales.

Principal InvestigatorSoftware Used
Biswajit Banerjee, IRL

May 2012

Research Allocation Class

CFD Simulation of Complex Flows

Understanding flows in several environments. Current models are of the atmospheric boundary layer flowing through a wind farm, startup flows in a rotating torus, and the aerodynamics and hydrodynamics of America's Cup yachts.

Principal InvestigatorSoftware Used
Stuart Norris, The University of Auckland
  • in-house codes
  • Intel Fortran
  • Portland Group Fortran
  • ANSYS CFX

Nanofunctional Gas Sensors

Understanding the mechanism of action of semiconducting gas sensors that can be related directly to the calculated density of states (electronic structure) of the material.

Principal InvestigatorSoftware Used
Nicola Gaston, IRL

Species Delimitation and Global Biosecurity

Delimiting species of concern to the global biosecurity community. The target taxa have a very close genetic relationship, as revealed by DNA sequencing of a few standard gene regions, but may possess other as yet unrealised regions of the genome which are more suited to identifying species-level relationships. The goal is to search for such regions (of known or anonymous function) that may contain informative single nucleotide polymorphisms (SNPs) for use later in a more comprehensive and targeted phylogenetic analysis.

Principal InvestigatorSoftware Used
Laura Boykin, Lincoln

Regional Climate Simulations of New Zealand recent past climate change episodes

Exploring a 15,000-year record of storm intensity and precipitation over New Zealand at annual resolution. The computational work in this project follows a feasibility study which was recently conducted to investigate the quality of the climate record captured in the sediment cores extracted from Lake Ohau. These samples will be used to allow the climate simulations to be assessed.

Principal InvestigatorSoftware Used
Abha Sood, NIWA
  • Unified Model

Accelerating bioactive metabolite discovery through molecular modelling

The approach taken is to build homology models of the 3D protein structures of the novel adenylation domains. Ligand docking software is then used to dock potential amino acid substrates into these homology models. The docked amino acids need to be ranked by binding affinity to predict which is the true substrate of the adenylation domain. However, while able to accurately predict the binding modes of the amino acids in the binding site, the ligand docking software does not accurately rank them. 

In many cases, molecular dynamics simulations have been shown to be a more accurate computational method of predicting the binding affinity of small molecules bound to proteins. Hence, molecular dynamics simulations are used to perform post-processing of the results from the ligand docking of the amino acids into the adenylation domains. This should result in a better ranking of their binding affinity and allow us to predict the specificity of those adenylation domains. 

Molecular dynamics simulations are complex, with a large number of parameters that need to be optimised. In addition, there are a number of different approaches to estimating binding affinity using molecular dynamics simulations. The project team aims to optimise the simulation parameters and test a number of different binding affinity estimation methods. Adenylation domains with experimentally determined specificity will be used to do this. If this effort is successful, the team will be able to employ the method to predict the specificities of novel adenylation domains with no known specificity.

Principal InvestigatorSoftware Used
Verne Lee, The University of Auckland

Development, testing and application of phylodynamic methods

Reconstructing evolutionary histories of rapidly evolving populations, such as human immunodeficiency virus or influenza viruses, by developing new methods or enhancing current ones. Throughout the development, the project team verifies the methods through test simulations. Eventually, the team's methods are applied to real data sets. The method currently under development, BDSIR, incorporates an epidemiological SIR model into the phylogenetic analysis.

Principal InvestigatorSoftware Used
Denise Kuehnert, The University of Auckland

Toward new mechanisms for inhibiting PI 3-kinases

The PI 3-kinase isoform p110α regulates many cellular processes, and activating mutations occur in 15% of cancers. Blocking dysregulated activity is a hot area in cancer drug discovery. The project is looking for new inhibitory mechanisms and molecules that can specifically inhibit oncogenic p110α. This will be accomplished by computing protein flexibility using molecular dynamics simulation.

Principal InvestigatorSoftware Used
Jack Flanagan, The University of Auckland

Development of computationally intensive statistical methods for phylogenetics

One of the aims of my research is to develop novel phylogenetic models that explicitly incorporate information with regard to the spatial distribution of individual virus taxa or their hosts and the geographic and ecological features of the surrounding habitats. In addition, it is also of interest to formulate a framework which enables hypothesis testing for structured population dynamic models. 

Principal InvestigatorSoftware Used
Chieh-Hsi Wu, The University of Auckland
  • in-house codes

Proposal Development Allocation Class

Photodissociation of nitrous oxide

The project team is investigating the rate and products of photodissociation in these previously unconsidered nitrous oxide complexes using theoretical and experimental techniques.

Principal InvestigatorSoftware Used
Jo Lane, University of Waikato

Theoretical foundations of Graphene based Hydrogen Storage Systems

In the first phase of the project all theoretically possible structures of a RuCp*-HBC complex will be modeled in order to determine the structure that is lowest in energy. In the next step these findings will be corroborated through conformational searches to determine the influence of the position that the side chains of the molecules might have on the total stability. Those results will then be used to adjust the initial calculations and a final data set will be produced.

This final data set of structural information will then be subjected to a range of analytical tools in order to determine the characteristics of the chemical bonding that are responsible for directing the metal atoms to the binding sites that are found experimentally. The most prominent analytical tools will be the QTAIM method, which uses an analysis of the topology of the electron density to describe the bonding features, and NBO (Natural Bond Orbital) analysis, which uses localized orbitals for a Lewis-structure-like description of the bonding situation. The combination of these two methods ensures a complete and unbiased picture of the compounds in question.

Principal InvestigatorSoftware Used
Matthias Lein, Victoria University of Wellington
  • ORCA

Modelling the anomalous melting temperatures of small Ga clusters with first-principles molecular dynamics

This research combines parallel tempering with Density Functional Theory-based molecular dynamics, enabling us to investigate the thermodynamic, electronic and structural properties of small gallium clusters. Addressing the fundamental question of how melting occurs at the nanoscale, the team calculates the heat capacity (melting) curves for small gallium clusters and, using a variety of in-house analysis tools, characterises the relevant structural and electronic contributions to the various cluster melting signatures. This is used to explain the greater-than-bulk melting temperatures and the interesting odd-even oscillation in the structures and properties of small gallium clusters.

Principal InvestigatorSoftware Used
Krista Steenbergen, Victoria University of Wellington
  • VASP

A study of microbial populations in agricultural soils

The project involves studying microbial populations in the soil using next generation sequencing (NGS) techniques.

Principal InvestigatorSoftware Used
Andriy Podolyan, Lincoln University
  • QIIME