Frequently Asked Questions

General topics

The German Human Genome-Phenome Archive (GHGA) is a national omics data infrastructure that provides a secure platform for data sharing and secondary use of human genome data from research and clinical sequencing.

GHGA’s mission can be found here.

While the GHGA Data Portal serves as the point of contact for the upload, download, and analysis of omics data, the data itself is stored in a federated manner on servers at the GHGA Data Hubs.

The GHGA Data Hubs are associated with German universities and research centers and are co-located with major omics sequencing centers. Together, they operate as a federated network that shares joint standards and infrastructure under central management, coordinated via GHGA central, represented by DKFZ.

GHGA is organized in a federated manner: under central management, data is stored locally at the GHGA Data Hubs.

Besides storage and compute infrastructure for GHGA services, the data hubs provide significant resources, professional operations, data stewardship, technical security, and scalability. One great benefit of this data hub network is that it will establish replication services across data hubs, which in the longer term will provide geo-redundant storage and backup at each data hub.

GHGA strives to understand the expectations and concerns of patients. We have held deliberative forums to elicit their views on transparent and trustworthy governance. In group discussions, we also want to understand the perspectives and communication needs of those affected by and interested in the topic of sharing human genome data for research purposes. We hope to make patients, not just their data, an integral part of GHGA’s growth and development.

More on patient engagement in GHGA here.

Yes. In June 2022, GHGA, represented by DKFZ as the coordinating legal entity, signed a collaboration agreement with the European Genome-Phenome Archive (EGA) to become the German national node of the federated EGA.

More here.

Yes. The European Genomic Data Infrastructure (GDI) aims to bring the mission of the 1+MG initiative into fruition, connecting national genomics data infrastructures to form a European genomic data network. GHGA is the German GDI node, thereby linking GHGA to the pan-European GDI infrastructure.

More here.

As a partner of genomDE, GHGA has contributed to the development of a nationwide platform that, in connection with the Genomic Sequencing Model Project (MV GenomSeq), enables broader clinical use of genomic data.

Six GHGA Data Hubs have been approved as genome data centers, making GHGA the archiving and research platform for the genomic data generated within MV GenomSeq.

Learn more here.
Using GHGA

The GHGA Data Portal allows users to browse, search, and download omics datasets submitted to GHGA. It uses the GHGA Metadata Model.

Detailed information on the usage of the GHGA Data Portal can be found in our User Documentation.

GHGA is still in an early phase of the project and therefore generally accepts data submissions only from partner institutions. If you would like to submit human omics data to GHGA, please contact the GHGA Helpdesk.

To stay informed on new feature releases and updates to GHGA, please sign up for our GHGA Newsletter.

The GHGA Data Portal allows users to request access to data through the portal. Identify your dataset of interest using the browse and filter functions of the GHGA Data Portal. Click on the "Request access" button. This will direct you to a data access request form. Complete the form with the necessary information and submit it to request access to the dataset. The data access request will be sent to the Research Data Controller, who will review your request and respond accordingly.

Please note that GHGA is not involved in the subsequent process of negotiating data access.

More information can be found in our user documentation here.

GHGA is an archive for human omics data, which can comprise sequencing raw data files, associated metadata files, or any type of human omics data. As such, most file formats are supported.

Datasets to be submitted to GHGA need to be described using the GHGA Metadata schema, which is available on GitHub. Further details can be found in the GHGA Metadata Whitepaper on Zenodo.

For data from non-human model systems that are not subject to controlled access, we advise users to use specialised archives such as the European Nucleotide Archive (ENA).

Please contact the GHGA Helpdesk for data submissions and specific questions.

GHGA is implementing an ethical-legal framework tailored to the German legal landscape and interpretation of the GDPR. This is something other international initiatives cannot guarantee, and it is particularly useful for German data producers.

As GHGA is part of the federated EGA, data deposited within the German node is discoverable within the federated European Genome-Phenome Archive (FEGA). Hence, the utility and benefits of regular EGA submissions will be retained when depositing data in GHGA.

While data submissions to GHGA are not limited to German researchers or institutions, European researchers are encouraged to use the federated EGA node of their own country of residence. Within the FEGA, European datasets stored at the national nodes can be found in one central database.

Yes. As a partner of the federated European Genome-Phenome Archive (FEGA) and the European Genomic Data Infrastructure (GDI), data collected within GHGA will be findable internationally. 

In addition, through the use of common technical and metadata standards, we aim to ensure the interoperability of the collected data with datasets worldwide. However, access to the data is still subject to the national implementation of GDPR.

Data security and protection

The prevention of data misuse is a primary objective in GHGA's mission. To ensure data safety, we take a layered approach. Our newly developed infrastructure provides high-level cybersecurity based on zero-trust networking, combined with technical and organizational measures, allowing data to be archived and shared safely.

In addition, we have developed a framework for GDPR-compliant data processing and help data producers to inform patients and manage consent. Enabling controlled, yet FAIR, data access is the last layer to ensure data is protected while fulfilling its potential to advance research.

GHGA does not act as the controller or owner of any data deposited within GHGA. Instead, GHGA acts as a data processor that processes and shares data as instructed by the data controller.

GHGA is therefore only able to share data with other researchers based on the explicit approval of the institution which has submitted the data. It is the responsibility of the controller to decide on data access requests submitted via GHGA by other researchers who want to use the data stored within GHGA.

More details on Data Use and Access Practices can be found here.

Only non-personal metadata is publicly available within the GHGA Data Portal.

Researchers intending to use any of the archived data or view personal metadata must apply for access with the respective data controllers.

A data access committee or a comparable body will review the legitimacy of the request prior to granting access. This step ensures that only researchers with a valid research purpose gain access to sensitive data, adding another layer of protection.

More details on Data Use and Access Practices can be found here.

GHGA has developed different tools to help users to submit omics data to GHGA in a legally and ethically sound manner.

We have developed modules to be included in existing or newly written consent forms and designed an app to help evaluate legacy consent forms. The tool guides researchers in assessing whether their pre-GDPR consent forms are sufficient to permit new data processing.