Most universities already have some form of cloud infrastructure in place for graduate-level students and staff, such as Office 365 or Dropbox, or HR, payroll and finance services. But cloud is only just taking off among university-based postgraduate researchers.
OCF is currently working on a number of projects to build cloud infrastructure specifically aimed at helping researchers from different universities collaborate on scientific research. These projects are at the leading edge and, to my knowledge, unique; there aren’t any other projects or cloud infrastructures like this in place yet. Increasingly, though, we are seeing interest from universities and research institutes that want to follow this pioneering work and create cloud infrastructure for their own researchers.
One of our customers, a senior lecturer heading up a project to build a cloud infrastructure, told me that, for bioinformatics in particular, the UK needs a dedicated system to advance research into bacteria that cause disease and to increase the understanding of bacterial resistance to drugs. He told me that normal High Performance Computing (HPC) machines simply aren’t suitable for the kinds of research they want to do.
The compute power needed to analyse bacterial pathogens is very different from the power needed to analyse and data mine ancient documents, for example. Researchers want to be able to access research data without delays, and they want a system that runs the workloads they send to it as fast as possible. Cloud infrastructure may well be the solution. Even though the projects we’re working on will benefit bioinformaticians in particular, any researcher wanting to use HPC systems could potentially benefit from a dedicated cloud infrastructure.
To interrogate large data sets, researchers need applications that can analyse the data, and they will often develop and write their own. Sharing these applications is very difficult: what works on one HPC system won’t necessarily work on another, so it’s often quicker to write a new application from scratch.
This is where I see the real value of cloud for researchers. A cloud infrastructure is all about collaboration and sharing resources such as compute and storage, as well as sharing applications. We’ve been building cloud infrastructure for researchers based on OpenStack, because this technology can create virtual machines: in effect, a virtual disk that contains the entire environment, including the operating system, the data set and the software applications that interrogate the data. The data can therefore stay in place, and researchers can create or ‘spin up’ their own virtual HPC clusters to run the software, packages and applications that interrogate it.
One of the challenges in setting up a private cloud across different universities, for researchers in particular, is the networking and communications between the separate sites. When services communicate with each other, you don’t want them to destroy or overwrite specific configurations. This is where virtualisation really comes into its own, because of the way you can share these virtual machines.
By far the biggest benefit of building a cloud infrastructure is the collaboration that can be achieved. Researchers are often experts in a very particular area, so there may be only dozens of people in the UK sharing one specialism. Sharing research data often involves sending hard disks by courier from one researcher to another, because the data sets are simply too large to send over the internet. But with a cloud infrastructure, a researcher at Land’s End could access the data and research of a researcher based in John O’Groats, for example.
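To put rough numbers on why couriered hard disks can beat the internet, here is a minimal back-of-the-envelope sketch in Python. The data set size, link speeds and the 80% efficiency factor are illustrative assumptions rather than figures from any of the projects mentioned here, and `transfer_days` is a hypothetical helper name.

```python
def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to move a data set over a network link.

    size_tb    -- data set size in terabytes (decimal, 10**12 bytes); assumed value
    link_mbps  -- nominal link speed in megabits per second; assumed value
    efficiency -- fraction of the nominal speed actually sustained; assumed value
    """
    bits_to_send = size_tb * 1e12 * 8                   # terabytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    seconds = bits_to_send / usable_bits_per_second
    return seconds / 86_400                             # seconds -> days


if __name__ == "__main__":
    # Hypothetical 10 TB data set over a few illustrative link speeds.
    for mbps in (100, 1_000, 10_000):
        print(f"{mbps:>6} Mbit/s: {transfer_days(10, mbps):.2f} days")
```

Under these assumptions a 10 TB data set needs well over a week on a 100 Mbit/s link, which is why leaving the data in place and bringing the compute to it is attractive.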
Building a cloud specifically for researchers is no easy task, but it is one that will bring huge benefits to universities willing to collaborate and invest in the infrastructure. Researchers from across the UK, Europe and the rest of the world could all potentially work on one research project, with no limitations on where they access the HPC system via the cloud.
Ultimately, the projects we’re working on will benefit hundreds of researchers studying cancers, cardiovascular and rare diseases, helping them understand these diseases and how a person’s genetics may influence their predisposition and potential response to treatment.