Series | GSE62952 |
Title | Integrated genome and transcriptome sequencing from the same cell |
---|---|
Year | 2014 |
Country | Netherlands |
Article | van Oudenaarden A, Bienko M, Spanjaard B, Kester L, Dey SS. Integrated genome and transcriptome sequencing of the same cell. Nature Biotechnology. 2015 Mar |
PMID | 25599178 |
Bio Project | BioProject: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA266282 |
Sra | SRA: http://www.ncbi.nlm.nih.gov/sra?term=SRP049500 |
Overall Design | First, hand-picked single cells are lysed and reverse transcribed using a poly-A primer including cell-specific barcodes, a 5' Illumina adapter and a T7 promoter overhang to convert mRNA to single-stranded cDNA (ss cDNA). The gDNA and single-stranded cDNA are then subjected to quasilinear whole genome amplification, as previously described, using an adapter with a defined 27-nucleotide sequence at the 5' end followed by 8 random nucleotides. After 7 rounds of amplification, the gDNA and cDNA are copied to generate a variety of different short amplicon (0.5–2.5 kb) species, with a majority of amplicons containing adapter Ad-2 at both ends and a small fraction of cDNA-derived amplicons containing Ad-2 at one end and Ad-1x at the other. Next, the sample is split into two tubes to further amplify gDNA and cDNA. The tube used to sequence gDNA is amplified using PCR. Following sonication, adapter Ad-2 removal, and cell-specific indexed Illumina library preparation, this half is used to sequence gDNA. The tube used to sequence cDNA is converted to double-stranded cDNA and amplified using in vitro transcription such that the amplified RNA (aRNA) is uniquely produced from cDNA but not gDNA. 3' Illumina adapters are then ligated to the aRNA followed by reverse transcription and PCR, allowing quantification of mRNA. |
Summary | Single-cell genomics and single-cell transcriptomics have recently emerged as powerful tools to study the biology of single cells at a genome-wide scale. Here we describe a method that allows the integration of genomic DNA and mRNA sequencing from the same cell. We use this method to correlate DNA copy number variation to transcriptome variability among individual cells. |
Experimental Protocol | NEBNext Ultra DNA Library Prep Kit; Trypsinized single cells were picked using a mouth pipet with a 30 μm glass capillary under a stereomicroscope. Picked cells were deposited in the center of the lid of a 0.2 ml PCR tube and snap frozen in liquid nitrogen. DR-Seq. First strand cDNA synthesis was performed by adding 2 μL of reaction mix containing 0.2 μL first strand buffer (MessageAmp II, Life Technologies), 0.4 μL of dNTP mix (MessageAmp II, Life Technologies), 0.1 μL Arrayscript (MessageAmp II, Life Technologies), 0.1 μL RNAse inhibitor (MessageAmp II, Life Technologies), 0.2 μL RT primer with cell-specific barcode (Ad-1x), 0.2 μL 1:500000 diluted ERCC spike-in mix 1 (Life Technologies) and 0.05% IGEPAL in water. The first strand cDNA synthesis and lysis reaction mix together with the spike-in molecules were added directly to the drop in the lid of the tube containing a single cell. Samples were incubated in a PCR machine with lid and block set to 42 °C for 15 minutes, after which the samples were spun down and incubated for another 105 minutes. After first strand synthesis, samples were incubated for 10 minutes at 80 °C. Quasilinear amplification buffer containing 6.0 μL ThermoPol buffer (NEB), 1.0 μL 10 mM dNTP mix, 26 μL water and 0.15 μL 50 μM primer mix (Ad-2) was added to each sample. Samples were incubated for 3 minutes at 94 °C to denature the DNA. Seven cycles of quasilinear amplification were performed (10 °C for 45 seconds, 15 °C for 45 seconds, 20 °C for 45 seconds, 30 °C for 45 seconds, 40 °C for 45 seconds, 50 °C for 45 seconds, 65 °C for 2 minutes, 95 °C for 20 seconds, 58 °C for 40 seconds and then immediately quenched on ice). Prior to each cycle, 0.6 μL polymerase mix containing 2 U Bst large fragment (NEB) and 0.8 U Pyrophage 3173 exo- (Lucigen) was added. Note that the 58 °C step for 40 seconds, prior to quenching the reaction on ice, is not performed for the first quasilinear amplification round.
After 7 rounds of quasilinear amplification, samples were split in two. One half of the sample was processed for gDNA sequencing; the other half was processed for mRNA sequencing. For mRNA sequencing, second strand synthesis of the quasilinear amplified cDNA was performed using the P1 primer (5' - CGATTGAGGCCGGTAATAC - 3') in a single cycle of PCR (94 °C for 20 sec, 51 °C for 20 sec, 72 °C for 7 min). After this, samples with non-overlapping barcodes were pooled and cleaned up on a cDNA purification column (MessageAmp II, Life Technologies), and eluted twice with 9 μL of water at 55 °C. Next, the volume of the sample was reduced to 6.4 μL using a SpeedVac®. In vitro transcription (IVT) mix containing 1.6 μL 10x IVT buffer, 1.6 μL ATP, 1.6 μL GTP, 1.6 μL CTP, 1.6 μL UTP and 1.6 μL enzyme mix (MessageAmp II, Life Technologies) was added to the samples, which were then incubated at 37 °C for 13 hours. After IVT, samples were cleaned up using the aRNA clean up columns (MessageAmp II, Life Technologies) and aRNA was eluted twice in 12 μL of warm water at 55 °C. After clean up, aRNA quality was assessed on a Bioanalyzer (Agilent) Eukaryote Total RNA Pico chip. Library preparation was performed as previously described in Hashimshony et al. Cell Reports 2012. For DNA sequencing, the other half of the quasilinear amplification product was amplified further by PCR. PCR mix containing 1.0 μL 10 mM dNTP, 3 μL Thermopol buffer (10X), 0.2 μL 100 μM primer P2 (5' - GTGAGTGATGGTTGAGGTAGTGTGGAG - 3') and 1.0 μL Deep VentR (exo-) polymerase (NEB) was added to each sample for a final volume of 68 μL. PCR was performed as follows: 21 cycles of (94 °C for 20 seconds, 59 °C for 20 seconds, 65 °C for 1 minute, 72 °C for 2 minutes), and 72 °C for 5 minutes at the end. After PCR, the quality of the products was assessed by agarose gel electrophoresis and the samples were cleaned up using a PCR purification column.
Next, to remove adapter Ad-2 from the PCR product prior to preparing Illumina libraries, another PCR was done starting with 80 ng of product from the previous step. PCR mix containing 0.3 μL of 50 μM primer P3 with a 5' biotinylated end (5' - GTGAGCTGGAGTTGAGGTAGTGTGGAG - 3'), 5 μL Thermopol buffer (10X), 1 μL 10 mM dNTP and 1 μL Deep VentR (exo-) polymerase (NEB) was added to each sample for a final volume of 50 μL. PCR was performed as follows: 94 °C for 2 minutes, then 4 cycles of (94 °C for 20 seconds, 46 °C for 20 seconds, 65 °C for 1 minute and 72 °C for 2 minutes) and 9 cycles of (94 °C for 20 seconds, 59 °C for 20 seconds, 65 °C for 1 minute and 72 °C for 2 minutes). The PCR product was sheared using a sonicator (Bioruptor®) on the low power setting with 15 cycles of 1 minute (30 seconds On, 30 seconds Off) with constant cooling at 4 °C. The sheared products were then cleaned up using a PCR purification column (Qiagen) and eluted in 50 μL water. The final product distribution was verified on a Bioanalyzer (Agilent) High Sensitivity DNA chip to have an average product size of approximately 300 bp. The DNA products were then added to Dynabeads MyOne Streptavidin C1 beads (Life Technologies) in 50 μL 2X BW buffer (10 mM Tris-HCl, 1mM EDTA and 2mM NaCl). After immobilizing the DNA products on the beads for 15 minutes, the biotinylated DNA was separated using a magnetic stand and the supernatant was stored. The biotinylated DNA was digested on the magnetic beads and the beads were washed twice with 50 μL 1X BW buffer. These two washes were then combined with the first supernatant and purified using a PCR purification column (Qiagen). Finally, Illumina libraries were prepared with different index primers for each single cell using the NEBNext Ultra DNA Library Prep Kit for Illumina® (NEB).; Illumina stranded mRNA kit; single cells were picked and snap frozen as described above. |
Data processing | For bulk gDNA and MALBAC libraries, paired-end sequencing reads were aligned to the genome release mm10 for mouse cells (E14) and to the genome release hg19 for human cells (SK-BR-3) using BWA with default parameters. For the single-cell gDNA libraries processed by DR-Seq, paired-end sequencing reads were aligned to a masked genome mm10 for mouse cells and to a masked genome hg19 for human cells using BWA with default parameters. The masked genomes mm10 and hg19 were created by replacing all the coding sequences within the genome with N. This is because the fraction used to sequence gDNA contains sequences that could originate from the cDNA within coding regions. By masking the coding sequences within the genome, such ambiguous reads that might arise from either gDNA or cDNA are discarded computationally, leaving only reads that arise from gDNA. This does not pose a problem for calling copy number variations since the coding region constitutes only approximately 2% of the genome.; All PCR duplicates within mapped reads from the bulk, MALBAC or DR-Seq libraries are removed. As the first step towards quantifying the gDNA data, the genome is divided into bins. To account for the masking of the genome in the DR-Seq data, the start and end coordinates of each bin are chosen such that the length of every bin is the same after excluding coding regions within each bin. Next, to further reduce amplification biases, we developed a coverage-based method to quantify the reads within bins. This coverage-based method significantly reduces bin-to-bin technical noise (see supplementary document). The reads are then corrected for GC bias. The corrected read distribution is then used to identify breakpoints using the circular binary segmentation (CBS) algorithm.
Finally, the median read counts for each segment are used to call copy number variations in single cells.; For bulk mRNA and CEL-Seq libraries, paired-end sequencing reads were aligned to the transcriptome using Burrows-Wheeler Aligner (BWA) with default parameters. For single-cell mRNA processed using DR-Seq, the Ad-2 adapter sequence was trimmed computationally from the right mate, which was then aligned to the transcriptome using BWA with default parameters. For the E14 cells, we used the RefSeq gene models based on the mouse genome release mm10. For the SK-BR-3 cells, we used the RefSeq gene models based on the human genome release hg19. For bulk mRNA sequencing, both mates of each read were mapped to the transcriptome. For CEL-Seq and DR-Seq, the right mate of each read pair was mapped to the transcriptome and the ERCC spike-ins. The left mate was used to identify the cell from which the transcript came based on the cell-specific barcode. Reads mapping to more than one region were distributed uniformly.; For the bulk mRNA sequencing libraries, PCR duplicates were then removed to obtain the dataset used in all analyses. The left mate of the CEL-Seq libraries also contained a 4-bp random sequence, introduced during reverse transcription, to count unique cDNA molecules, as previously described. Length-based identifiers were determined for each read in the single-cell mRNA libraries processed by DR-Seq using the first coordinate of the right mate after trimming off adapter Ad-2. The length-based identifiers were used to minimize amplification biases and achieve resolution close to identifying unique cDNA molecules.; Genome_build: mm10 and hg19; Supplementary_files_format_and_content: details included in the 'processed_data_files_description.txt' file. |
Platform | GPL16791;GPL17021 |
Public On | Public on Dec 11 2014 |
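
The Data processing entry describes two steps for the DR-Seq gDNA libraries: coding sequences are replaced with N, and bin boundaries are chosen so that every bin has the same length once coding regions are excluded. The sketch below is one possible way to illustrate that idea on a toy sequence; it is not the authors' pipeline. The function names `mask_coding` and `equal_effective_bins`, the half-open interval convention, and the decision to drop an incomplete trailing bin are all assumptions made for the example.

```python
from typing import List, Tuple


def mask_coding(seq: str, coding: List[Tuple[int, int]]) -> str:
    """Replace bases inside half-open coding intervals [start, end) with 'N'.

    Hypothetical helper: interval convention and name are assumptions.
    """
    chars = list(seq)
    for start, end in coding:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = "N"
    return "".join(chars)


def equal_effective_bins(masked: str, bases_per_bin: int) -> List[Tuple[int, int]]:
    """Return half-open bins that each contain exactly `bases_per_bin` unmasked bases.

    Bins therefore differ in genomic span but have equal length after
    excluding masked positions. A trailing stretch with fewer unmasked
    bases is dropped (a simplification chosen for this sketch).
    """
    if bases_per_bin <= 0:
        raise ValueError("bases_per_bin must be positive")
    bins: List[Tuple[int, int]] = []
    bin_start = 0
    count = 0
    for pos, base in enumerate(masked):
        if base != "N":
            count += 1
            if count == bases_per_bin:
                # Close the bin right after the last counted unmasked base.
                bins.append((bin_start, pos + 1))
                bin_start = pos + 1
                count = 0
    return bins


if __name__ == "__main__":
    toy = "ACGTACGTACGT"
    masked = mask_coding(toy, [(2, 6)])      # mask a toy "coding" interval
    print(masked)
    print(equal_effective_bins(masked, 4))   # bins with 4 unmasked bases each
```

In a real analysis the coding intervals would come from the gene annotation and the bins would span far more bases; the toy example only shows how a bin that overlaps a masked region is stretched to keep its effective length constant.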