RNA Sequencing
RNA sequencing, or RNA-Seq, is the latest technology to study the transcriptome, i.e., the full set of RNA transcripts as genome readouts in a cell or population of cells. This technology directly sequences RNA molecules in the transcriptome in order to determine their genes of origin and abundance. RNA species need to undergo a sequencing library preparatory process prior to sequencing. The libraries are then sequenced to generate millions of reads for each sample. After sequencing, the generated reads are mapped to the reference genome to identify their genomic origin. The total number of reads mapped to a particular genomic region represents the level of transcriptional activity in the region. The more transcriptionally active a genomic region is, the more copies of RNA transcripts it produces, and the more RNA-Seq reads it generates. RNA-seq is essentially a counting game.
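The counting idea above can be illustrated with a minimal Python sketch. It assumes each mapped read has already been assigned to a gene, and it uses counts per million (CPM) as one common way to make samples of different depth comparable. The gene labels and toy data are hypothetical, and this is not necessarily the normalization used in the GMAK pipeline.

```python
from collections import Counter


def count_reads_per_gene(read_assignments):
    """Count how many mapped reads were assigned to each gene."""
    return Counter(read_assignments)


def counts_per_million(counts):
    """Scale raw read counts to counts per million (CPM) mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads to normalize")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical toy data: one gene label per mapped read.
reads = ["geneA"] * 6 + ["geneB"] * 3 + ["geneC"] * 1
counts = count_reads_per_gene(reads)
cpm = counts_per_million(counts)

for gene in sorted(cpm):
    print(gene, counts[gene], round(cpm[gene]))
```

A gene with twice as many assigned reads gets twice the CPM value, which is the sense in which read counts stand in for transcriptional activity.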
GMAK provides five types of RNA-seq services as detailed below:
- mRNA-Seq: Starts with 100 ng to 1 µg high-quality total RNA. Prepares library from poly(A)-enriched mRNA species. Aims to identify differentially expressed protein-coding genes. Most requested.
- Total RNA-Seq: Starts with 100 ng to 1 µg total RNA. Library prep based on rRNA depletion. Targets both protein-coding genes and long noncoding RNAs. Can accommodate degraded RNA, such as that extracted from FFPE or laser capture microdissected samples.
- Low-input RNA-Seq: Accommodates limited amounts of total RNA in the range of 5-100 ng. Library prep can be based on poly(A) mRNA enrichment (default) or rRNA depletion (at additional cost).
- Small RNA-Seq: Prepares sequencing libraries for small RNA species, e.g., miRNAs, from total RNA. Can start from 100 ng to 1 µg (standard input) or 5 to 100 ng (low input) total RNA.
- Single-Cell RNA-Seq
Users who are new to bulk RNA-seq may refer to core documents RNA-Seq Workflow Steps and Examples, and the RNA-Seq Decision Tree to help decide the type of RNA-seq service needed.
Read Length and Sequencing Depth
Standard mRNA- or total RNA-Seq: Paired-end 50 bp reads are mostly used for general gene expression profiling. To study alternative splicing variants, longer paired-end reads (up to 150 bp) are often requested. For sequencing depth, 25-30 million reads per sample are usually appropriate for general gene expression profiling, while 40-50 million reads are suggested for splicing variant detection.
Low-input RNA-seq: Read length remains the same as standard mRNA- or total RNA-seq. Sequencing depth may be reduced to some extent based on the amount of starting material.
Small RNA-seq: GMAK generates paired-end 50 bp reads for small RNA-seq. The suggested sequencing depth is 5-10 million reads per sample.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
GMAK takes extracted total RNA for RNA-seq (no tissues or cells). The quality of RNA is the single most important factor that determines final outcome. After sample drop-off, core staff conduct sample QC, which includes Qubit concentration measurement and Bioanalyzer-based RNA Integrity Number (RIN) generation, prior to library construction. A RIN of at least 8 is required to proceed with mRNA-seq library construction. Submitted RNA samples also need to be DNA-free, and we suggest always including a DNase treatment step during RNA extraction. Genomic DNA contamination is visible on Bioanalyzer traces in the range of 4-10 kb. Where RNA degradation is unavoidable, such as when using FFPE tissues, total RNA-seq is suggested as it is less dependent on the intactness of RNA. Use our Sample Submission Form.
Bioinformatics
Data analysis is provided upon request. Standard RNA-seq bioinformatics service includes sequencing data QC, alignment, normalization, and differential expression analysis.
Single-Cell Sequencing
GMAK has offered single-cell sequencing since 2017. This state-of-the-art technology offers unprecedented opportunities to study cell-to-cell variation, identify/visualize different cell types/identities in a population, and infer cellular developmental trajectories. To help users accomplish these goals, GMAK assists users in every step of this process – from cell prep and sequencing library construction to bioinformatic analysis.
As the technology evolves, single-cell sequencing becomes more diverse to meet varying project needs. Currently, GMAK offers high-throughput single-cell sequencing based on 10x Genomics technology.
10x Genomics Chromium
- Target cell number: 1,000-10,000 cells in each sample
- Input type: freshly prepared single-cell (or nucleus) suspension, fixed cells (or nuclei), cryopreserved cells, and FFPE tissue.
- Applications: 3’ and 5’ single-cell (or nucleus) RNA-seq, T-cell and B-cell V(D)J clone profiling, cell surface protein profiling, single nucleus ATAC-seq and multiome (simultaneous single nucleus ATAC-seq and RNA-seq on the same cells)
Service Request
To initiate a single-cell sequencing project, please contact us. Project consultation is provided free-of-charge. Consultation with the core prior to starting a single-cell sequencing experiment is highly recommended to ensure accomplishment of project goals.
Sample Submission
For all single-cell sequencing services, please make a sample submission appointment with us in advance.
Recommended Sequencing Parameters
- RNA-Seq Libraries – Read 1 of 28 bp (Cell Barcode and UMI), i7 Index of 10 bp (Sample Index), i5 Index of 10 bp (Sample Index), and Read 2 of 90 bp (Transcript Insert), with a sequencing depth of >20,000 reads per cell.
- ATAC-Seq Libraries – Read 1 of 50 bp (Transposed DNA), i7 Index of 8 bp (Sample Index), i5 Index of 16 bp (10x Barcode), and Read 2 of 49 bp (Transposed DNA), with a sequencing depth of >25,000 reads per cell.
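The per-cell depths above translate into a per-sample read budget by simple multiplication. The following Python sketch shows the arithmetic. The function name is hypothetical, and the result is a lower bound, since the recommendations are stated as minimum reads per cell.

```python
def min_reads_per_sample(target_cells, reads_per_cell):
    """Minimum reads to request for one sample: cells x reads per cell."""
    if target_cells <= 0 or reads_per_cell <= 0:
        raise ValueError("target_cells and reads_per_cell must be positive")
    return target_cells * reads_per_cell


# Upper end of the target range (10,000 cells) at the recommended minimum depths.
rna_reads = min_reads_per_sample(10_000, 20_000)
atac_reads = min_reads_per_sample(10_000, 25_000)

print(f"scRNA-seq:  {rna_reads:,} reads")
print(f"scATAC-seq: {atac_reads:,} reads")
```

For 10,000 targeted cells this comes to at least 200 million reads for an RNA-Seq library and 250 million for an ATAC-Seq library.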
Bioinformatics
Data analysis is provided upon request.
Spatial Transcriptomics
Spatial transcriptomics enables interrogation of gene expression within the context of tissue architecture, tissue microenvironments and cell groups (especially when coupled with single-cell sequencing). To meet the rapidly increasing need for spatial-omics studies, GMAK teams up with the HZI Core Unit Mouse Histology and Pathology, hosted by the VMED department. This internal cooperation ensures users have access to the various techniques needed to carry out a typical spatial analysis workflow, such as tissue prep, cryosectioning, staining, imaging, tissue section QC, sequencing library prep, sequencing and data analysis. The established workflow accommodates both fresh frozen (FF) and formalin-fixed paraffin-embedded (FFPE) tissues. Pre-cut tissue sections on standard glass slides may also be used for the Visium platform from 10x Genomics, as the 10x CytAssist instrument available at GMAK enables sample transfer from pre-existing slides to Visium slides. The Visium platform from 10x Genomics has been offered since August 2023.
Service Request
Please contact GMAK to initiate a spatial analysis project. The Core works closely with the other HZI cores on the different steps of the workflow. Project consultation is provided free of charge. Consultation with the core prior to starting a spatial transcriptomics experiment is highly recommended to ensure accomplishment of project goals.
Whole Genome Sequencing
Whole-genome sequencing (WGS) is a method used to determine the complete DNA sequence of an organism's genome, including chromosomal DNA and mitochondrial DNA, and so provides a comprehensive view of the genome. WGS methods can be categorized into de novo sequencing projects and re-sequencing projects. To decide which method suits which project, see the DNA-Seq_Decision_Tree.
De novo WGS
For de novo sequencing of genomes, long reads such as those produced by Oxford Nanopore sequencers (MinION, GridION) are advantageous. Since this technology is available at GMAK, we recommend a hybrid approach for de novo WGS. Here, short reads from Illumina sequencers and long reads from Oxford Nanopore (ONT) sequencers are used to improve the assembly results for genomes.
WGS Re-Sequencing
WGS Re-Sequencing is notably more cost-effective than de novo WGS, as it can be accomplished using only short reads when a high-quality assembled reference genome is available.
Sequencing Mode
- Short reads (Illumina): paired-end run of at least 150 bp, preferably 300 bp, with 50x coverage of the genome for re-sequencing and 100x coverage of the genome for de novo sequencing
- Long reads (ONT): 50x coverage of the genome for a hybrid assembly with Illumina short reads and 200x coverage of the genome for de novo sequencing
For successful de novo sequencing of small genomes (bacteria, viruses), we strongly recommend a hybrid approach that combines short reads (Illumina) with long reads (ONT).
For the method of re-sequencing and the associated SNP/variant calling, we recommend the use of short reads (Illumina), as this sequencing technology currently has the lowest sequencing error rate and thus better prerequisites for subsequent qualitative analysis steps.
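To turn a coverage target into a sequencing request, the usual rule of thumb is coverage = total sequenced bases / genome size. The Python sketch below applies it under stated assumptions: the 5 Mb genome is a hypothetical bacterial example, the function names are illustrative, and the calculation ignores losses from adapter trimming, duplicates and unmapped reads, so real projects usually request some margin on top.

```python
def bases_for_coverage(genome_size_bp, coverage):
    """Total bases to sequence for a given mean coverage."""
    return genome_size_bp * coverage


def read_pairs_for_coverage(genome_size_bp, coverage, read_length_bp):
    """Paired-end read pairs needed; each pair contributes 2 x read length."""
    bases = bases_for_coverage(genome_size_bp, coverage)
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division, so a partial pair is rounded up.
    return -(-bases // bases_per_pair)


genome_size = 5_000_000  # hypothetical 5 Mb bacterial genome

# Illumina re-sequencing: 50x coverage with paired-end 150 bp reads.
pairs_reseq = read_pairs_for_coverage(genome_size, 50, 150)

# ONT long reads for a hybrid assembly: 50x coverage, expressed in bases.
ont_bases = bases_for_coverage(genome_size, 50)

print(f"Illumina read pairs (50x, PE150): {pairs_reseq:,}")
print(f"ONT bases (50x): {ont_bases:,}")
```

For long reads the requirement is best stated in bases rather than reads, because read length varies widely from molecule to molecule.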
Service Request
If required, a free project consultation is offered.
Sample Submission
- Library Prep for Illumina sequencer: 150 ng high-quality genomic DNA required.
- Library Prep for MinION Sequencer (ONT): 1 µg high-molecular-weight (HMW) DNA required.
Fluorometric methods, such as Qubit or PicoGreen, are preferred for DNA quantification. Spectrophotometric methods, such as Nanodrop, may not be accurate enough. Use our Sample Submission Form.
Bioinformatics
Data analysis is offered on request. Depending on the method, the standard bioinformatics service includes quality control of sequencing data, alignment, assembly, hybrid assembly, SNP/variant calling, and automatic annotation.
Whole Exome Sequencing
While the protein-coding region of the genome (i.e., the exome) represents only a small portion of the genome (less than 2 percent in humans), it is the most studied and best annotated. For example, the human exome contains approximately 85 percent of all known disease-related variants. Due to its cost effectiveness and better data manageability, whole exome sequencing (WES) offers an ideal approach when whole-genome sequencing is not practical or needed.
WES enables core users to focus their resources on genes that are most likely to have an impact on the phenotype or disease of interest. By scanning through the entire amino acid coding region of the genome, it leads to identification of relevant variants across a wide range of applications, including genetic diseases, cancer development and population genetics.
GMAK uses a capture-based approach to target exome regions for sequencing. We use biotinylated nucleic acid baits, which are complementary to the target exome, to hybridize to genomic DNA libraries for the capture. For our WES, we only require 150 ng of high-quality human genomic DNA.
Sequencing Mode
Paired-end 100 or 150 bp high- or mid-output runs are recommended for WES. A high-output run generates 1600 million paired-end reads, and a mid-output run generates 800 million.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
For WES, 150 ng of high-quality human genomic DNA is required. For DNA quantification, fluorometric-based methods, such as Qubit or PicoGreen, are preferred. Spectrophotometric-based methods, such as Nanodrop, may not be accurate.
Bioinformatics
Data analysis is provided upon request. Standard WES bioinformatics service for variant discovery includes sequencing data QC, alignment, and variant calling. Delivered results are variant call (VCF) files.
ATAC-Seq for Open Chromatin Profiling
The eukaryotic genome is highly packaged to fit into the very limited nuclear space. As a result, access to genomic information is tightly regulated based on cellular state. What regions of the genome are accessible reveals a great deal about the state of the cell. ATAC-seq, or Assay for Transposase-Accessible Chromatin coupled with next-gen sequencing, is a technique to locate accessible chromatin regions.
As the name suggests, ATAC-seq is based on the use of an engineered, hyperactive transposase (called Tn5), which fragments DNA in open regions of the chromatin. In the same process, it simultaneously tags the ends of the fragmented DNA with sequencing adapters. This tagmentation process is a key part of ATAC-seq library construction.
Standard ATAC-seq: For input material, GMAK needs 50,000 live cells to start library prep. Extra cells are needed for cell viability and density checks before library prep. Please submit at least 60,000 cells to the Core. Since cells need to be processed immediately after delivery to the Core, the user needs to contact the Core at least one week ahead of time to schedule the work.
Single-cell ATAC-seq using 10x Chromium: A single-nucleus suspension prepared from fresh, cryopreserved, or flash-frozen tissue or cell samples is needed for library prep. As for single-cell RNA-seq, the single-nucleus prep will first be checked for quality and concentration, and 500-10,000 nuclei can be targeted in each sample. As the single-nucleus prep needs to be processed by Core personnel immediately upon arrival, the user needs to schedule the work at least two weeks ahead of time.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
Standard ATAC-seq: Live cells, not genomic DNA, are required as input for ATAC-seq. The number of cells is critical to project success, with 50,000 cells as a good starting point. Healthy cells in a homogeneous single-cell suspension work best. Before proceeding to library prep, GMAK staff will check cell viability and density.
Single-cell ATAC-seq: A single-nucleus suspension prepared from fresh, cryopreserved, or flash-frozen tissue or cell samples is needed for library prep. The 10x Genomics scATAC-seq Support page lists Demonstrated Protocols for sample prep. GMAK staff will check the quality and concentration of the single-nucleus prep upon sample delivery.
It is essential to coordinate with core staff for sample delivery because samples need to be processed right away. It is advisable to schedule the work with the Core as early as possible, prior to cell preparation.
Sequencing Mode
- Standard ATAC-seq: Paired-end 50 bp reads are usually enough for mapping ATAC-seq reads to the reference genome.
- Sequencing depth: at least 50 million reads per sample are recommended.
Bioinformatics
Data analysis is provided upon request. Raw data and analysis results are usually returned to the user via FTP service.
ChIP Sequencing
ChIP-seq is a genomics technology developed to map binding sites of a DNA-interacting protein across the genome. Examples of DNA-interacting proteins include transcription factors, histones, and enzymes for DNA repair and modification. A common application of ChIP-seq is to locate transcription factor binding patterns under different conditions, such as development stages or pathological conditions.
ChIP-seq starts with covalent cross-linking of DNA with interacting proteins, then shearing of chromatin into fragments, followed by enrichment of the protein of interest with its bound DNA by immunoprecipitation using an antibody specific for the protein. Subsequently, after dissociating the enriched protein-DNA complex, the released DNA fragments are subjected to sequencing. One key experimental factor in the ChIP-Seq process is the quality of the antibody used in the enrichment step, as the use of a poor-quality antibody can lead to high experimental noise due to non-specific precipitation of DNA fragments.
Sequencing Mode
A paired-end 50 bp run is sufficient for most cases. Longer reads may help read alignment, especially in repetitive regions.
Service Request
If needed, project consultation is provided free-of-charge.
Sample Submission
For library prep, 10 ng of ChIP and input DNA is required. For DNA quantification, fluorometric-based methods, such as Qubit or PicoGreen, are preferred. Spectrophotometric-based methods, such as Nanodrop, may not be accurate.
Bioinformatics
Data analysis is provided upon request. Standard ChIP-seq bioinformatics service includes sequencing data QC, alignment and peak calling.
DNA Methylation Sequencing
The methylation of cytosines is a major epigenomic mechanism that modulates the primary genomic code. This leads to the formation of 5-methylcytosines (5mCs) at select sites of the genome. Cytosine methylation regulates gene expression and chromatin remodeling, and as a result plays important roles in many biological functions including embryonic development, cell differentiation, and stem cell pluripotency. Abnormal DNA methylation can lead to diseases, such as cancer.
DNA methylation sequencing is a newer technology that is usually based on bisulfite conversion to differentiate methylated from unmethylated cytosines. Upon treatment with bisulfite, unmethylated cytosines are converted to uracils, while 5mCs are nonreactive and retained. In the sequencing step, unmethylated cytosines are read as thymines, while methylated cytosines are still read as cytosines.
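The conversion logic can be sketched in a few lines of Python. This is a simplified, single-strand illustration with hypothetical function names and a toy sequence: it assumes complete conversion and error-free sequencing, and it ignores the reverse strand, which real bisulfite aligners and methylation callers have to handle.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate the sequenced read-out of one strand after bisulfite treatment.

    Unmethylated C is read as T; methylated C (0-based positions listed in
    methylated_positions) is still read as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Compare a converted read with the reference at every reference C.

    Returns {position: True} for methylated (C retained) and
    {position: False} for unmethylated (C read as T).
    """
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper())):
        if ref == "C" and obs in ("C", "T"):
            calls[i] = obs == "C"
    return calls


reference = "ACGTCCGA"  # hypothetical toy sequence
read = bisulfite_convert(reference, methylated_positions=[1])
calls = call_methylation(reference, read)

print(read)
print(calls)
```

Here only the cytosine at position 1 is methylated, so it is the only C that survives in the read, and the comparison against the reference recovers that call.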
Based on genomic coverage, bisulfite-conversion-based sequencing can be conducted at GMAK as either Whole-Genome Bisulfite Sequencing (WGBS) or Reduced Representation Bisulfite Sequencing (RRBS). WGBS costs more and the associated data analysis is much more involved. RRBS instead provides a cost-effective approach to survey DNA methylation by sampling CpG-rich regions of the genome. To perform RRBS, genomic DNA is digested with a methylation-insensitive restriction enzyme, such as MspI. The digested DNA fragments are then subjected to adapter ligation, bisulfite conversion, and PCR to generate a library for sequencing.
Sequencing Mode
A paired-end 150 bp run is suggested for most cases.
Service Request
If needed, project consultation is provided free-of-charge.
Sample Submission
For library prep, 150 ng of high-quality genomic DNA is required. For DNA quantification, fluorometric-based methods, such as Qubit or PicoGreen, are preferred. Spectrophotometric-based methods, such as Nanodrop, may not be accurate.
Bioinformatics
Data analysis is provided upon request. Standard RRBS bioinformatics service includes sequencing data QC, alignment, and DNA methylation localization and quantification.
Nanopore Long-Reads Sequencing
Oxford Nanopore sequencing is a third-generation, single-molecule sequencing platform. The reads it produces are typically 10-100 kb long in long-read sequencing mode and 100-300 kb in ultra-long-read sequencing mode. The longest read achieved so far is 4 Mb. These long reads are needed for a number of applications (see below) that short-read sequencing struggles with.
With continuous technology development, the raw-read error rate of Nanopore sequencing is now about 1%, i.e., 99% (Q20) accuracy. Consensus accuracy can reach Q47 at 60x coverage for human DNA by combining multiple raw reads from a genomic region into a single consensus sequence. The most common sequencing errors in nanopore sequencing occur in homopolymeric regions.
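The Q values quoted here follow the standard Phred scale, Q = -10 * log10(p), where p is the per-base error probability. A short Python sketch of the conversion in both directions (the function names are illustrative):

```python
import math


def phred_from_error(error_probability):
    """Phred quality score for a per-base error probability."""
    if not 0 < error_probability <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(error_probability)


def error_from_phred(q):
    """Per-base error probability for a Phred quality score."""
    return 10 ** (-q / 10)


raw_q = phred_from_error(0.01)        # 1% raw-read error rate
consensus_error = error_from_phred(47)  # Q47 consensus

print(f"1% error corresponds to Q{raw_q:.0f}")
print(f"Q47 corresponds to about 1 error in {1 / consensus_error:,.0f} bases")
```

On this scale a 1% error rate is Q20, and Q47 corresponds to roughly one error in 50,000 bases.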
Oxford Nanopore currently offers three main devices at different data throughput levels, i.e., MinION, GridION, and PromethION. MinION/GridION use the same flow cell type which typically produces 10-20 Gb data (30 Gb maximum). The flow cell used on the PromethION has a throughput of 50-100 Gb (170 Gb maximum).
Below are some of the major applications for Oxford Nanopore sequencing:
- Structural variation detection
- Single nucleotide variant phasing
- Full-length transcript sequencing and splicing isoform detection
- Detection of fusion transcripts
- De novo genome assembly
- Direct detection of epigenetic base modifications on DNA and RNA
Read Length and Sequencing Output
Long-reads sequencing mode: 10-100 kb in length. Good for:
- Full-length transcript sequencing and splicing isoform detection
- Detection of fusion transcripts
- Direct detection of base modifications on DNA and RNA
Ultra-long reads sequencing mode: 100-300 kb in length. Good for:
- De novo genome assembly
- Haplotype phasing
- Structural variants detection
Total data output: 10-20 Gb per MinION flow cell; up to 200 Gb per PromethION flow cell.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
GMAK takes extracted total RNA for RNA-seq or enriched poly(A) mRNA for direct RNA modification sequencing. The quality of RNA is the single most important factor that determines final outcome. After sample drop-off, core staff conduct sample QC, which includes Qubit concentration measurement and Bioanalyzer-based RNA Integrity Number (RIN) generation, prior to library construction. A RIN of at least 8 is required to proceed with mRNA-seq library construction. Submitted RNA samples also need to be DNA-free, and we suggest always including a DNase treatment step during RNA extraction. Genomic DNA contamination is visible on Bioanalyzer traces in the range of 4-10 kb.
For DNA samples used for genome assembly, DNA purity and length (in Mb) are critical to obtaining high-quality data. Please consult with the GMAK core about your project.
Bioinformatics
Data analysis is provided upon request.
RNA Sequencing
RNA sequencing, or RNA-Seq, is the latest technology to study the transcriptome, i.e., the full set of RNA transcripts as genome readouts in a cell or population of cells. This technology directly sequences RNA molecules in the transcriptome in order to determine their genes of origin and abundance. RNA species need to undergo a sequencing library preparatory process prior to sequencing. The libraries are then sequenced to generate millions of reads for each sample. After sequencing, the generated reads are mapped to the reference genome to identify their genomic origin. The total number of reads mapped to a particular genomic region represents the level of transcriptional activity in the region. The more transcriptionally active a genomic region is, the more copies of RNA transcripts it produces, and the more RNA-Seq reads it generates. RNA-seq is essentially a counting game.
GMAK provides five types of RNA-seq services as detailed below:
- mRNA-Seq: Starts with 100 ng to 1 ug high quality total RNA. Prepares library from poly-A enriched mRNA species. Aims to identify differentially expressed protein-coding genes. Most requested.
- Total RNA-Seq: Starts with 100 ng to 1 ug total RNA. Library prep based on rRNA depletion. Targets both protein-coding genes and long noncoding RNAs. Can accommodate degraded RNA, such as those extracted from FFPE or laser capture microdissected samples.
- Low-input RNA-Seq: Accommodates limited amounts of total RNA in the range of 5-100 ng. Library prep can be based on poly(A) mRNA enrichment (default), or rRNA depletion (at additional cost).
- Small RNA-Seq: Prepares sequencing libraries for small RNA species, e.g., miRNAs, from total RNA. Can start from 100 ng to 1 ug (standard input) or 5 to 100 ng (low input) total RNA.
- Single-Cell RNA-Seq
Users who are new to bulk RNA-seq may refer to core documents RNA-Seq Workflow Steps and Examples, and the RNA-Seq Decision Tree to help decide the type of RNA-seq service needed.
Read Length and Sequencing Depth
Standard mRNA- or total RNA-Seq: Paired-end 50 reads are mostly used for general gene expression profiling. To study alternative splicing variants, paired-end, longer reads (up to 150 bp) are often requested. On sequencing depth, 25-30 million reads per sample are usually appropriate for general gene expression profiling, while 40-50 million reads are suggested for splicing variant detection.
Low-input RNA-seq: Read length remains the same as standard mRNA- or total RNA-seq. Sequencing depth may be reduced to some extent based on the amount of starting material.
Small RNA-seq: GMAK generates paired-end 50 bp reads for small RNA-seq. The suggested sequencing depth is 5-10 million reads per sample.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
GMAK takes extracted total RNA for RNA-seq (no tissues or cells). The quality of RNA is the single most important factor that determines final outcome. After sample drop-off, core staff conducts sample QC, which includes Qubit concentration measurement and Bioanalyzer-based RNA Integrity Number (RIN) generation, prior to library construction. A RIN of 8 is required to proceed with mRNA-seq library construction. Submitted RNA samples also need to be DNA-free and we suggest to always include a DNase treatment step during RNA extraction. Presence of genomic DNA contamination is visible on Bioanalyzer traces in the range of 4-10 kb. In situations under which RNA degradation is unavoidable, such as when using FFPE tissues, total RNA-seq is suggested as it is less dependent on the intactness of RNA. Use our Sample Submission Form.
Bioinformatics
Data analysis is provided upon request. Standard RNA-seq bioinformatics service includes sequencing data QC, alignment, normalization, and differential expression analysis.
Single-Cell Sequencing
GMAK has offered single-cell sequencing since 2017. This state-of-the-art technology offers unprecedented opportunities to study cell-to-cell variation, identify/visualize different cell types/identities in a population, and infer cellular developmental trajectories. To help users accomplish these goals, GMAK assists users in every step of this process – from cell prep and sequencing library construction to bioinformatic analysis.
As the technology evolves, single-cell sequencing becomes more diverse to meet varying project needs. Currently, GMAK offers high-throughput single-cell sequencing based on 10x Genomics Technology
10x Genomics Chromium
- Target cell number: 1,000-10,000 cells in each sample
- Input type: freshly prepared single-cell (or nucleus) suspension, fixed-cell (or nucleus), cryopreserved cell, and FFPE embedded tissue.
- Applications: 3’ and 5’ single-cell (or nucleus) RNA-seq, T-cell and B-cell V(D)J clone profiling, cell surface protein profiling, single nucleus ATAC-seq and multiome (simultaneous single nucleus ATAC-seq and RNA-seq on the same cells)
Service Request
To initiate a single-cell sequencing project, please contact us. Project consultation is provided free-of-charge. Consultation with the core prior to starting a single-cell sequencing experiment is highly recommended to ensure accomplishment of project goals.
Sample Submission
For all single-cell sequencing services, please make a sample submission appointment with us in advance.
Recommended Sequencing Parameters
- RNA-Seq Libraries – Read 1 of 28 bp (Cell Barcode and UMI), i7 Index of 10 bp (Sample Index), i5 Index of 10 bp (Sample Index), and Read 2 of 90 bp (Transcript Insert) with a sequencing depth of >20,000 reads per cell;
- ATAC-Seq Libraries – Read 1 of 50 bp (Transposed DNA), i7 Index of 8 bp (Sample index), i5 Index of 16 bp (10x Barcode) and Read 2 of 49 bp (Transposed DNA) with a sequencing depth of >25,000 reads per cell
Bioinformatics
Data analysis is provided upon request.
Spatial Transcriptomics
Spatial transcriptomics enables interrogation of gene expression within the context of tissue architecture, tissue microenvironments and cell groups (especially when coupled with single cell sequencing). To meet the rapidly increasing needs for spatial -omics studies, GMAK teams up with the HZI Core Unit Mouse Histology and Pathology hosted by VMED department . This internal cooperation ensures users have access to the various techniques needed to carry out a typical spatial analysis workflow, such as tissue prep, cryosectioning, staining, imaging, tissue section QC, sequencing library prep, sequencing and data analysis. The established workflow accommodates both fresh frozen (FF) and formalin-fixed paraffin-embedded (FFPE) tissues. Pre-cut tissue sections on standard glass slides may also be used for the Visium platform from 10x Genomics, as the 10x CytAssist instrument available at GMAK enables sample transfer from pre-existing slides to Visium slides. The Visium platform from 10x Genomics has been offered since August 2023.
Service Request
Please contact GMAK to initiate a spatial analysis project. The Core works closely with rest of HZI cores on the different steps of the workflow. Project consultation is provided free of charge. Consultation with the core prior to starting a spatial transcriptome experiment is highly recommended to ensure accomplishment of project goals.
Whole Genome Sequencing
Whole-Genome-Sequencing (WGS) is a method used to determine the complete DNA sequence of an organism's genome, including chromosomal DNA and mitochondrial DNA. While WGS provides a comprehensive view of the genome. WGS methods can be categorized into de novo sequencing projects and re-sequencing projects. Which method for which project? DNA-Seq_Decision_Tree
De novo WGS
For de novo sequencing of genomes, long reads such as those produced by Oxford Nanopore sequencers (MinION, GridION) are advantageous. Since this technology is available at GMAK, we recommend a hybrid approach for de novo WGS. Here, short reads from Illumina sequencers and long reads from Oxford Nanopore (ONT) sequencers are used to improve the assembly results for genomes.
WGS Re-Sequencing
WGS Re-Sequencing is notably more cost-effective than de novo WGS, as it can be accomplished using only short reads when a high-quality assembled reference genome is available.
Sequencing Mode
- Short reads (Illumina): paired-end run of at least 150bp, better 300bp with a 50x coverage of the genome for re-sequencing and a 100x coverage of the genome for de novo sequencing
- Long reads (ONT): 50x coverage of the genome for a hybrid assembly with Illumina short reads and 200x coverage of the genome for de novo sequencing
We strongly recommend a hybrid approach for successful de novo sequencing of small genomes (bacteria, viruses), the combination of short reads (Illumina) with long reads (ONT).
For the method of re-sequencing and the associated SNP/variant calling, we recommend the use of short reads (Illumina), as this sequencing technology currently has the lowest sequencing error rate and thus better prerequisites for subsequent qualitative analysis steps.
Service request
If required, a free project consultation is offered.
Submission of the sample
- Library Prep for Illumina sequencer: 150 ng high-quality genomic DNA required.
- Library Prep for MinION Sequencer (ONT): 1µg HMW DNA
Fluorometric methods, such as Qubit or PicoGreen, are preferred for DNA quantification. Spectrophotometric methods, such as Nanodrop, may not be accurate enough. Use our Sample Submission Form.
Bioinformatics
Data analysis is offered on request. Depending on the method, the standard bioinformatics service includes quality control of sequencing data, alignment, assembly, hybrid assembly, SNP/variant calling, automatic annotation.
Whole Exome Sequencing
While the protein-coding region of the genome (i.e., the exome) represents only a small portion of the genome (less than 2 percent in humans), it is the most studied and best annotated. For example, the human exome contains approximately 85 percent of all known disease-related variants. Due to its cost effectiveness and better data manageability, whole exome sequencing (WES) offers an ideal approach when whole-genome sequencing is not practical or needed.
WES enables core users to focus their resources on genes that are most likely to have an impact on the phenotype or disease of interest. By scanning through the entire amino acid coding region of the genome, it leads to identification of relevant variants across a wide range of applications, including genetic diseases, cancer development and population genetics.
GMAK uses a capture-based approach to target exome regions for sequencing. We use biotinylated nucleic acid baits, which are complementary to the target exome, to hybridize to genomic DNA libraries for the capture. For our WES, we only require 150 ng of high-quality human genomic DNA.
Sequencing Mode
Paired end 100 or 150 bp high- or mid-output runs are recommended for WES. Each high- or mid-output run generates 1600 million, or 800 million, paired-end reads, respectively.
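As a rough planning aid, the mean depth over the capture target can be estimated from the number of reads and the read length. The following is a back-of-the-envelope sketch only: the function name, the 40 Mb target size and the 70% on-target fraction are illustrative assumptions, not GMAK specifications, and PCR duplicates and uneven capture efficiency are ignored.

```python
def mean_target_depth(n_reads, read_length_bp, target_size_bp, on_target_fraction=1.0):
    """Estimate the mean sequencing depth over a capture target.

    n_reads            -- number of individual reads (count both mates of a pair)
    read_length_bp     -- read length in bp
    target_size_bp     -- total size of the captured region in bp
    on_target_fraction -- share of sequenced bases that fall on the target (0..1)
    """
    if target_size_bp <= 0:
        raise ValueError("target_size_bp must be positive")
    if not 0.0 <= on_target_fraction <= 1.0:
        raise ValueError("on_target_fraction must be between 0 and 1")
    return n_reads * read_length_bp * on_target_fraction / target_size_bp


# Hypothetical example: 40 million 150 bp reads on an assumed 40 Mb target,
# with an assumed 70% of bases on target.
depth = mean_target_depth(
    n_reads=40_000_000,
    read_length_bp=150,
    target_size_bp=40_000_000,
    on_target_fraction=0.7,
)
print(f"Estimated mean depth: {depth:.0f}x")
```

The actual target size depends on the capture kit in use, so the numbers should be replaced with kit-specific values during project consultation.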
Service Request
Project consultation is provided free-of-charge.
Sample Submission
For WES, 150 ng of high-quality human genomic DNA is required. For DNA quantification, fluorometric-based methods, such as Qubit or PicoGreen, are preferred. Spectrophotometric-based methods, such as Nanodrop, may not be accurate.
Bioinformatics
Data analysis is provided upon request. Standard WES bioinformatics service for variant discovery includes sequencing data QC, alignment, and variant calling. Delivered results are variant call (VCF) files.
ATAC-Seq for Open Chromatin Profiling
The eukaryotic genome is highly packaged to fit into the very limited nuclear space. As a result, access to genomic information is tightly regulated based on cellular state. What regions of the genome are accessible reveals a great deal about the state of the cell. ATAC-seq, or Assay for Transposase-Accessible Chromatin coupled with next-gen sequencing, is a technique to locate accessible chromatin regions.
As the name suggests, ATAC-seq is based on the use of an engineered, hyperactive transposase (called Tn5), which fragments DNA in open regions of the chromatin. In the same process, it simultaneously tags the ends of the fragmented DNA with sequencing adapters. This tagmentation process is a key part of ATAC-seq library construction.
Standard ATAC-seq: For input material, GMAK needs 50,000 live cells to start library prep. Extra cells are needed for cell viability and density checks before library prep. Please submit at least 60,000 cells to the Core. Since cells need to be processed immediately after delivery to the Core, the user needs to contact the Core at least one week in advance to schedule the work.
Single cell ATAC-seq using 10x Chromium: A single nuclei suspension prepared from fresh, cryopreserved, or flash frozen tissue or cell samples is needed for library prep. As with single cell RNA-seq, the single nuclei prep will first be checked for quality and concentration, and 500-10,000 nuclei can be targeted in each sample. As the single nuclei prep needs to be processed by Core personnel immediately upon arrival, the user needs to schedule the work at least two weeks in advance.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
Standard ATAC-seq: Live cells, not genomic DNA, are required as input for ATAC-seq. The number of cells is critical to project success, with 50,000 cells as a good starting point. Healthy cells in a homogeneous single-cell suspension work best. Before proceeding to library prep, GMAK staff will check cell viability and density.
Single cell ATAC-seq: Single nuclei suspension prepared from fresh, cryopreserved, and flash frozen tissue or cell samples is needed for library prep. The 10x Genomics scATAC-seq Support page lists Demonstrated Protocols for sample prep. GMAK staff will check single nuclei prep quality and concentration upon sample delivery.
It is essential to coordinate with core staff for sample delivery because samples need to be processed right away. It is advisable to schedule the work with the Core as early as possible, prior to cell preparation.
Sequencing Mode
- Standard ATAC-seq: Paired-end 50 bp reads are usually enough for mapping ATAC-seq reads to the reference genome.
- Sequencing depth: at least 50 million reads per sample are recommended.
Bioinformatics
Data analysis is provided upon request. Raw data and analysis results are usually returned to the user via FTP.
ChIP Sequencing
ChIP-seq is a genomics technology developed to map binding sites of a DNA-interacting protein across the genome. Examples of DNA-interacting proteins include transcription factors, histones, and enzymes for DNA repair and modification. A common application of ChIP-seq is to locate transcription factor binding patterns under different conditions, such as development stages or pathological conditions.
ChIP-seq starts with covalent cross-linking of DNA with interacting proteins, then shearing of chromatin into fragments, followed by enrichment of the protein of interest with its bound DNA by immunoprecipitation using an antibody specific for the protein. Subsequently, after dissociating the enriched protein-DNA complex, the released DNA fragments are subjected to sequencing. One key experimental factor in the ChIP-Seq process is the quality of the antibody used in the enrichment step, as the use of a poor-quality antibody can lead to high experimental noise due to non-specific precipitation of DNA fragments.
Sequencing Mode
A paired-end 50 bp run is sufficient for most cases. Longer reads may help read alignment, especially in repetitive regions.
Service Request
If needed, project consultation is provided free-of-charge.
Sample Submission
For library prep, 10 ng of ChIP and input DNA is required. For DNA quantification, fluorometric-based methods, such as Qubit or PicoGreen, are preferred. Spectrophotometric-based methods, such as Nanodrop, may not be accurate.
Bioinformatics
Data analysis is provided upon request. Standard ChIP-seq bioinformatics service includes sequencing data QC, alignment and peak calling.
DNA Methylation Sequencing
The methylation of cytosines is a major epigenomic mechanism that modulates the primary genomic code. This leads to the formation of 5-methylcytosines (5mCs) at select sites of the genome. Cytosine methylation regulates gene expression and chromatin remodeling, and as a result plays important roles in many biological functions including embryonic development, cell differentiation, and stem cell pluripotency. Abnormal DNA methylation can lead to diseases, such as cancer.
DNA methylation sequencing is a newer technology that is usually based on bisulfite conversion to differentiate methylated from unmethylated cytosines. Upon treatment with bisulfite, unmethylated cytosines are converted to uracils, while 5mCs are nonreactive and retained. In the sequencing step, unmethylated cytosines are read as thymines, while methylated cytosines are still read as cytosines.
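The read-out logic of bisulfite conversion can be illustrated with a small in silico sketch. It is a simplified toy model and not the analysis pipeline used by the Core: it considers the top strand only, assumes complete conversion and error-free reads, and the function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as it is read after bisulfite treatment and PCR.

    Unmethylated C is read as T; C at a methylated position remains C.
    Positions are 0-based indices into seq (top strand only).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Compare a converted read with the reference at every reference C.

    Returns {position: True/False}; True means the C was retained (methylated).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base == "C" and read_base in ("C", "T"):
            calls[i] = read_base == "C"
    return calls


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)                               # ACGTTTGA
print(call_methylation(reference, read))  # {1: True, 4: False, 5: False}
```

Real data additionally require bisulfite-aware alignment, handling of both strands, and a check of the conversion efficiency.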
Based on genomic coverage, bisulfite conversion based sequencing can be conducted in GMAK as either Whole-Genome Bisulfite Sequencing (WGBS) or Reduced Representation Bisulfite Sequencing (RRBS). WGBS costs more and the associated data analysis is much more involved. RRBS instead provides a cost effective approach to survey DNA methylation by sampling CpG-rich regions of the genome. To perform RRBS, genomic DNA is digested with a methylation-insensitive restriction enzyme, such as MspI. The digested DNA fragments are then subjected to adapter ligation, bisulfite conversion, and PCR, to generate a library for sequencing.
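The enrichment principle behind RRBS can be sketched as an in silico MspI digest. MspI recognizes CCGG and cuts after the first C (C^CGG), so every internal fragment starts and ends at a CpG-containing site. The sketch below is a minimal illustration on the top strand only; the function names and the size-selection window in the example are assumptions, not the Core's protocol parameters.

```python
import re

MSPI_SITE = "CCGG"   # MspI recognition sequence
MSPI_CUT_OFFSET = 1  # MspI cuts after the first C: C^CGG


def mspi_digest(seq):
    """Cut a sequence at every MspI site and return the fragments in order."""
    seq = seq.upper()
    cuts = [m.start() + MSPI_CUT_OFFSET for m in re.finditer(MSPI_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_bp, max_bp):
    """Keep only fragments whose length lies within [min_bp, max_bp]."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


fragments = mspi_digest("AACCGGTTCCGGAA")
print(fragments)                     # ['AAC', 'CGGTTC', 'CGGAA']
print(size_select(fragments, 5, 6))  # ['CGGTTC', 'CGGAA']
```

On a real genome, the same idea applied with a realistic size window yields a fragment pool that is strongly enriched for CpG-rich regions.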
Sequencing Mode
A paired-end 150 bp run is suggested for most cases.
Service Request
If needed, project consultation is provided free-of-charge.
Sample Submission
For library prep, 150 ng of high-quality genomic DNA is required. For DNA quantification, fluorometric-based methods, such as Qubit or PicoGreen, are preferred. Spectrophotometric-based methods, such as Nanodrop, may not be accurate.
Bioinformatics
Data analysis is provided upon request. Standard RRBS bioinformatics service includes sequencing data QC, alignment, and DNA methylation localization and quantification.
Nanopore Long-Reads Sequencing
Oxford Nanopore sequencing is a third-generation, single-molecule sequencing platform. The reads it produces are typically 10-100 kb long in long-read sequencing mode and 100-300 kb long in ultra-long-read sequencing mode. The longest read achieved so far is 4 Mb. These long reads are needed for a number of applications (see below) that short-read sequencing struggles with.
With continuous technology development, the raw-read error rate of Nanopore sequencing is now about 1%, corresponding to 99% (Q20) accuracy. Consensus accuracy can reach Q47 at 60x coverage for human DNA by combining multiple raw reads from a genomic region into a single consensus sequence. The most common sequencing errors in nanopore sequencing occur in homopolymeric regions.
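The Q values quoted here are Phred-scaled quality scores, defined as Q = -10 * log10(p), where p is the error probability. A minimal sketch of the conversion follows; the function names are illustrative.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score Q into an error probability."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Convert an error probability (0 < p <= 1) into a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


for q in (20, 30, 47):
    p = phred_to_error(q)
    print(f"Q{q}: error probability {p:.2e}, accuracy {100 * (1 - p):.4f}%")
```

Q20 thus corresponds to one expected error in 100 bases, and every additional 10 Q points reduce the error probability tenfold.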
Oxford Nanopore currently offers three main devices at different data throughput levels, i.e., MinION, GridION, and PromethION. MinION/GridION use the same flow cell type which typically produces 10-20 Gb data (30 Gb maximum). The flow cell used on the PromethION has a throughput of 50-100 Gb (170 Gb maximum).
Below are some of the major applications for Oxford Nanopore sequencing:
- Structural variation detection
- Single nucleotide variant phasing
- Full-length transcript sequencing and splicing isoform detection
- Detection of fusion transcripts
- De novo genome assembly
- Direct detection of epigenetic base modifications on DNA and RNA
Read Length and Sequencing Output
Long-reads sequencing mode: 10-100 kb in length. Good for:
- Full-length transcript sequencing and splicing isoform detection
- Detection of fusion transcripts
- Direct detection of base modifications on DNA and RNA
Ultra-long reads sequencing mode: 100-300 kb in length. Good for:
- De novo genome assembly
- Haplotype phasing
- Structural variants detection
Total data output: 10-20 Gb per MinION flow cell; up to 200 Gb per PromethION flow cell.
Service Request
Project consultation is provided free-of-charge.
Sample Submission
GMAK takes extracted total RNA for RNA-seq or enriched poly(A) mRNA for direct RNA modification sequencing. The quality of the RNA is the single most important factor determining the final outcome. After sample drop-off, core staff conduct sample QC, which includes Qubit concentration measurement and Bioanalyzer-based RNA Integrity Number (RIN) determination, prior to library construction. A RIN of at least 8 is required to proceed with mRNA-seq library construction. Submitted RNA samples also need to be DNA-free, and we suggest always including a DNase treatment step during RNA extraction. Genomic DNA contamination is visible on Bioanalyzer traces in the range of 4-10 kb.
For DNA samples used for genome assembly, DNA purity and length (in Mb) are critical to obtaining high-quality data. Please consult with the GMAK core about your project.
Bioinformatics
Data analysis is provided upon request.
Robert Geffers majored in biology at the University of Hamburg and graduated in 1996 (MSc, "Diplom"). Even during his studies, his focus was on applied genome research. He investigated the molecular mechanisms of gene regulation in his doctoral thesis at the Technische Universität Braunschweig. Even before he completed his doctorate, he transferred to BIOBASE GmbH, a bioinformatics company in Wolfenbüttel, Germany, as a project leader to direct the development of database-supported modelling of plant gene regulation networks.
In 2002, he joined the "Mucosal Immunity" junior research group at the HZI, then GBF. In 2003, he was appointed head of the group and established micro-array analysis as a centre-wide transcriptome analysis service. Since 2011, Robert Geffers has directed the "Genome Analysis" research group as part of the scientific service facilities at the HZI, providing scientists with the opportunity to use sophisticated technologies for genome, epigenome and transcriptome analysis.
In addition, he is a founding member of the Braunschweig Integrated Centre of Systems Biology BRICS.
Team
Selected Publications
Kruse B, Buzzai AC, Shridhar N, Braun AD, Gellert S, Knauth K, Pozniak J, Peters J, Dittmann P, Mengoni M, van der Sluis TC, Hohn S, Antoranz A, Krone A, Fu Y, Yu D, Essand M, Geffers R, Mougiakakos D, Kahlfuss S, Kashkar H, Gaffal E, Bosisio FM, Bechter O, Rambow F, Marine JC, Kastenmuller W, Muller AJ, Tuting T. CD4(+) T cell-induced inflammatory cell death controls immune-evasive tumours. Nature. Jun 2023;618(7967):1033-1040. DOI: 10.1038/s41586-023-06199-x
Sohail A, Iqbal AA, Sahini N, Chen F, Tantawy M, Waqas SFH, Winterhoff M, Ebensen T, Schultz K, Geffers R, Schughart K, Preusse M, Shehata M, Bahre H, Pils MC, Guzman CA, Mostafa A, Pleschka S, Falk C, Michelucci A, Pessler F. Itaconate and derivatives reduce interferon responses and inflammation in influenza A virus infection. PLoS Pathog. Jan 2022;18(1):e1010219. DOI: 10.1371/journal.ppat.1010219
Riese P, Trittel S, Akmatov MK, May M, Prokein J, Illig T, Schindler C, Sawitzki B, Elfaki Y, Floess S, Huehn J, Blazejewski AJ, Strowig T, Hernandez-Vargas EA, Geffers R, Zhang B, Li Y, Pessler F, Guzman CA. Distinct immunological and molecular signatures underpinning influenza vaccine responsiveness in the elderly. Nat Commun. Nov 12 2022;13(1):6894. DOI: 10.1038/s41467-022-34487-z
Wendisch D, Dietrich O, Mari T, von Stillfried S, Ibarra IL, Mittermaier M, Mache C, Chua RL, Knoll R, Timm S, Brumhard S, Krammer T, Zauber H, Hiller AL, Pascual-Reguant A, Mothes R, Bulow RD, Schulze J, Leipold AM, Djudjaj S, Erhard F, Geffers R, Pott F, Kazmierski J, Radke J, Pergantis P, Bassler K, Conrad C, Aschenbrenner AC, Sawitzki B, Landthaler M, Wyler E, Horst D, Deutsche C-OI, Hippenstiel S, Hocke A, Heppner FL, Uhrig A, Garcia C, Machleidt F, Herold S, Elezkurtaj S, Thibeault C, Witzenrath M, Cochain C, Suttorp N, Drosten C, Goffinet C, Kurth F, Schultze JL, Radbruch H, Ochs M, Eils R, Muller-Redetzky H, Hauser AE, Luecken MD, Theis FJ, Conrad C, Wolff T, Boor P, Selbach M, Saliba AE, Sander LE. SARS-CoV-2 infection triggers profibrotic macrophage responses and lung fibrosis. Cell. Dec 22 2021;184(26):6243-6261 e6227. DOI: 10.1016/j.cell.2021.11.033
Schulte-Schrepping J, Reusch N, Paclik D, Bassler K, Schlickeiser S, Zhang B, Kramer B, Krammer T, Brumhard S, Bonaguro L, De Domenico E, Wendisch D, Grasshoff M, Kapellos TS, Beckstette M, Pecht T, Saglam A, Dietrich O, Mei HE, Schulz AR, Conrad C, Kunkel D, Vafadarnejad E, Xu CJ, Horne A, Herbert M, Drews A, Thibeault C, Pfeiffer M, Hippenstiel S, Hocke A, Muller-Redetzky H, Heim KM, Machleidt F, Uhrig A, Bosquillon de Jarcy L, Jurgens L, Stegemann M, Glosenkamp CR, Volk HD, Goffinet C, Landthaler M, Wyler E, Georg P, Schneider M, Dang-Heine C, Neuwinger N, Kappert K, Tauber R, Corman V, Raabe J, Kaiser KM, Vinh MT, Rieke G, Meisel C, Ulas T, Becker M, Geffers R, Witzenrath M, Drosten C, Suttorp N, von Kalle C, Kurth F, Handler K, Schultze JL, Aschenbrenner AC, Li Y, Nattermann J, Sawitzki B, Saliba AE, Sander LE, Deutsche C-OI. Severe COVID-19 Is Marked by a Dysregulated Myeloid Cell Compartment. Cell. Sep 17 2020;182(6):1419-1440 e1423. DOI: 10.1016/j.cell.2020.08.001
NGS Equipment
Devices for quality check (QC)
Fragment Analyzer 5200
The Fragment Analyzer is a parallel capillary electrophoresis system used to qualify and quantify nucleic acids in 12 samples in parallel. It performs DNA and RNA quality control (QC) for a broad range of samples, including gDNA, small RNA, large DNA fragments, total RNA, and mRNA, as well as library QC for next generation sequencing (NGS).
The Fragment Analyzer has many enhanced features, including:
- Diverse quantitative sample kit options for genomic DNA (gDNA), NGS libraries, small RNA, total RNA, and messenger RNA (mRNA)
- Seamless, automated switching between applications with two gel input lines
- Multitray format holds three standard 96-well plates for automated analysis of up to 288 samples
- Minimal hands-on time required for instrument setup and sample handling
- High analytical sensitivity detects concentrations as low as 5 pg/μl for fragments and 50 pg/μl for smears
Qubit
The Qubit 4 fluorometer accurately and quickly determines the concentration of DNA, RNA or protein in a single sample. It can also be used to assess the integrity and quality of RNA. Qubit fluorometers detect fluorescent dyes that are specifically bound to the target molecule. Even at extremely low levels or in the presence of contaminants, optimized Qubit assays make it possible to distinguish dsDNA from ssDNA or intact from degraded RNA.
Different quantification kits are available for DNA (5 ng to 2 µg) and RNA (5 ng to 1.2 µg).
Devices for Sequencing
Illumina NovaSeq 6000
- SP flow cell: 400 million reads per lane, 2 lanes per flow cell, available read lengths: 2x50, 2x150 and 2x250 bp
- S1 flow cell: 800 million reads per lane, 2 lanes per flow cell, available read lengths: 2x50, 2x100 and 2x150 bp
- S2 flow cell: 2050 million reads per lane, 2 lanes per flow cell, available read lengths: 2x50, 2x100 and 2x150 bp
- S4 flow cell: 2500 million reads per lane, 4 lanes per flow cell, available read lengths: 2x100 and 2x150 bp
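For planning purposes, the per-lane read counts listed above can be turned into a rough estimate of how many samples can be multiplexed on one lane at a given depth. This is a simplified sketch with hypothetical names: it assumes perfectly even pooling and ignores reads lost to spike-ins, index balancing and QC filtering.

```python
# Reads per lane in millions, taken from the flow cell list above.
READS_PER_LANE_MILLIONS = {"SP": 400, "S1": 800, "S2": 2050, "S4": 2500}


def samples_per_lane(flow_cell, reads_per_sample_millions):
    """Return how many samples fit on one lane at the requested depth."""
    if reads_per_sample_millions <= 0:
        raise ValueError("reads_per_sample_millions must be positive")
    return int(READS_PER_LANE_MILLIONS[flow_cell] // reads_per_sample_millions)


# Example: 50 million reads per sample.
for flow_cell in READS_PER_LANE_MILLIONS:
    n = samples_per_lane(flow_cell, 50)
    print(f"{flow_cell}: {n} samples per lane")
```

In practice some headroom should be left, because pooled libraries rarely yield exactly equal read numbers.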
Illumina MiSeq
- 1, 4, or 15-25 million reads per flow cell, depending on the flow cell type; one lane only
- available read length: 2x150, 2x250 and 2x300 bp
Oxford Nanopore Technologies MinION
- Long-read sequencing with read lengths of up to 4 Mbp, typically 10-100 kb for long-read sequencing and 100-300 kb for ultra-long-read sequencing mode
- MinION: 10-20 Gb output per flow cell
NGS Laboratory Equipment
BluePippin
The BluePippin is a preparative electrophoresis platform for collection of size-selected DNA or protein samples. The system automates DNA or protein size selection using disposable, pre-cast agarose cassettes to extract fractions according to user-defined software input. Target sizes or ranges of sizes are entered in software, and fractions are collected in buffer.
Up to 5 samples can be run per gel cassette, with no possibility of cross contamination.
Roche LightCycler 480
The Roche LightCycler is used to perform qPCR for the absolute or relative quantification of nucleic acids.
Laboratory Automation
epMotion 5075
The epMotion 5075t NGS Solution is a liquid handling system and contains all the necessary dispensing tools, accessories and consumables for the preparation of NGS libraries with up to 96 samples. It automates the process, delivers reproducible and accurate results and increases laboratory productivity.
epMotion 5073
The epMotion is an automated liquid handling system that simplifies traditionally complex and labor-intensive pipetting tasks, saves time and improves the reliability and reproducibility of results.
Biomek® FXP Pipetting Robot
The Biomek from Beckman Coulter is an automated liquid handling system that simplifies labor-intensive pipetting tasks and improves the reproducibility of results.
Devices for Single-Cell Analysis
Chromium Controller
The Chromium Controller uses advanced microfluidics to perform single cell partitioning and barcoding in a matter of minutes. Powered by Next GEM technology, the Chromium Controller enables multiomic analysis of hundreds to tens of thousands of single cells.
Chromium Controller iX
The Chromium iX is an extension of the Chromium Controller and also enables single cell analysis from fixed cells.
Visium CytAssist
The Visium CytAssist is an instrument that facilitates the transfer of transcriptomic probes from standard glass slides to Visium slides. This enables whole-transcriptome spatial profiling across the entire tissue section and expands compatibility to FFPE and/or fresh frozen (FF) samples.
Bioinformatics
In addition to producing Next-Generation Sequencing (NGS) data, we offer thorough bioinformatic analyses customized for each project. Employing state-of-the-art techniques, the most recent analyses, and specialized statistically sound methods, we utilize computational approaches to address a diverse array of biological inquiries.
As with many other sequencing platforms, our initial step is to schedule an in-person or Zoom meeting to fully understand your project requirements, timelines, and budget constraints. Following this consultation, we will assess whether we can undertake the project and outline how we will deliver the results. If both parties agree to proceed, we will formalize the arrangement by signing a contract that sets out our mutual obligations. Because our work is grounded in research, not every scientific question can be resolved as anticipated. More frequent discussions may be necessary to accommodate evolving needs and circumstances.
Primary data analysis
The primary data processing stage converts the raw data in BCL/POD5/FAST5 format produced by the sequencing instruments into quality-scored nucleotide sequences, formatted as FASTQ files, which are essential for subsequent data processing steps. Furthermore, this stage demultiplexes the samples by index and generates a QC report as well as checksum files to ensure the integrity of transferred data files. This step is essential when using the sequencing service provided by the NGS platform at HZI. Primary data analysis is included in the sequencing service of the NGS platform at no additional cost.
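After a data transfer, users can recompute the checksum of each received file and compare it with the delivered value. The sketch below is a generic example using Python's standard hashlib module; the checksum algorithm actually used for the delivered files is not stated here, so the algorithm argument is an assumption that must be adapted, and the function names are illustrative.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Compute the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest matches the expected hex string."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

A call such as verify_checksum("sample_R1.fastq.gz", expected_value), with a hypothetical file name and the value taken from the delivered checksum file, returns False if the file was truncated or corrupted in transit.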
Secondary data analysis
We provide fundamental processing services for both single-cell and bulk sequencing data across various platforms and technologies, including the reprocessing of published or public datasets. Our computational pipelines execute all essential steps necessary for tertiary analyses, such as assembly, alignment, peak calling, and feature counting, and also generate a QC report.
- Data management (storage, meta-table management, ENA/GEO submission)
- DNA-seq analysis (from raw sequencing data to QC, de novo assembly and automatic annotation, SNP/variant calling from genotyping/WGS/WES data)
- RNA-seq analysis (from raw sequencing data to QC, to normalized gene expression table, group comparison, pathway analysis, and interactive reporting and visualization of the results)
- Small RNA-seq analysis (miRNAs)
- Single-cell omics analysis (10x Genomics pipeline, clustering, trajectory analysis, spatial transcriptomics)
- ChIP-seq analysis (e.g. QC, peak calling, differential binding, UCSC Genome Browser visualization)
- Other next-generation sequencing (NGS) data analysis (ATAC-seq)
Secondary data analysis is offered on request and after a project consultation. It is carried out in close cooperation with the users of the service.
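To illustrate what a normalized gene expression table means in the RNA-seq item above, the following sketch scales the raw read counts of one sample to counts per million (CPM). CPM is only one simple normalization chosen here for illustration; the method applied in a given project is agreed during consultation, and the function and gene names are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts of one sample to counts per million (CPM).

    counts -- dict mapping gene identifier to raw read count
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


raw_counts = {"geneA": 10, "geneB": 30}
print(counts_per_million(raw_counts))  # {'geneA': 250000.0, 'geneB': 750000.0}
```

Scaling by the library size makes samples with different sequencing depths comparable; group comparisons typically rely on dedicated statistical packages rather than on CPM alone.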