Distributed Genomics

The growing volume of data from genetics research creates new technical challenges and, at the same time, raises legal, moral and ethical questions. One estimate of healthcare-funded DNA analysis projected that 60 million new genomes would be sequenced by 2022. At present, each lab has its own analytical pipelines and file formats for processing and storing genomics data, which complicates interoperability. Proposed cloud solutions do not remove the need to copy large amounts of data, nor do they ensure confidentiality while storing and providing access to personal genomics data. Federated analysis means that data can be accessed for distributed analysis without being physically shared. A distributed infrastructure for genomics will provide federated access to, and analysis of, locally controlled data in a standardized way.

This cyber-infrastructure project will consolidate sequencing efforts across multiple centers while meeting each jurisdiction's security requirements for the confidentiality of personal information. It will significantly increase the amount of data available to researchers working in the field of genomics-driven healthcare. New tools and analytical pipelines will help to harmonize genetics research outcomes.

All data access will go through Application Programming Interfaces (APIs) with appropriate logging and security measures. This will allow scientists to test their hypotheses on larger genomic datasets without having to dedicate time and compute resources to copying and storing data locally. Federated access could be used for joint variant calling in separate genes, analysis of differential gene expression of specific transcripts, and validation of associations. This approach is already being used in large-scale personalized healthcare programs run by Australian Genomics, Genomics England, and many others.
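
The federated access pattern described above can be illustrated with a minimal Python sketch. It is not the project's actual API: the Site class, the allele_counts method and the federated_allele_frequency function are hypothetical names invented for this example, the sites are simulated in memory rather than reached over a network, and diploid genotypes are assumed. The point it shows is that each site answers a query with aggregate counts only, logs who asked, and never releases individual-level genotypes.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for one data-holding center."""
    name: str
    # genotypes[variant_id] -> alt-allele count per individual (0, 1 or 2),
    # assuming diploid samples. This data never leaves the site.
    genotypes: dict
    access_log: list = field(default_factory=list)

    def allele_counts(self, variant_id, requester):
        """Return (alt_alleles, total_alleles) for one variant.

        Only aggregate counts are returned; every request is logged.
        """
        self.access_log.append((requester, variant_id))
        calls = self.genotypes.get(variant_id, [])
        return sum(calls), 2 * len(calls)


def federated_allele_frequency(sites, variant_id, requester):
    """Combine per-site aggregate counts into one pooled allele frequency."""
    alt = total = 0
    for site in sites:
        site_alt, site_total = site.allele_counts(variant_id, requester)
        alt += site_alt
        total += site_total
    # None signals that no site holds data for this variant.
    return alt / total if total else None


# Illustrative data only; the variant and requester names are made up.
site_a = Site("site_a", {"var1": [0, 1, 2, 0]})
site_b = Site("site_b", {"var1": [1, 1]})
frequency = federated_allele_frequency([site_a, site_b], "var1", "researcher_1")
print(frequency)
```

In a real deployment each call to allele_counts would be an authenticated API request to a remote center, and a site would typically also refuse to answer when the number of matching individuals is too small to protect privacy. Both are left out here to keep the sketch short.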

Common standards across a distributed national-scale platform will allow shared access to genomics data and promote genomic research by facilitating data analysis on a global scale. This means closer cooperation between research laboratories and clinics, as well as participation in international projects such as the BRCA Challenge, Matchmaker Exchange, and the Beacon Project.