Dataset View [GSE66357]

Series: GSE66357
Title: Scalable Microfluidics for Single Cell RNA Printing and Sequencing
Year: 2015
Country: USA
Article: Sims PA, Pe'er D, Vieira G, Rizvi AH, Carr A, Wan Z, Bose S. Scalable microfluidics for single-cell RNA printing and sequencing. Genome biology. 2015 Jun 6
PMID: 26047807
BioProject: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA276634
SRA: http://www.ncbi.nlm.nih.gov/sra?term=SRP055569
Overall Design: A microfluidic device that pairs sequence-barcoded mRNA capture beads with individual cells was used to barcode cDNA from individual cells, which was then pre-amplified by in vitro transcription in a pool and converted into an Illumina RNA-Seq library. Libraries were generated from ~600 individual cells in parallel, with extensive analysis of 396 cells from the U87 and MCF10a cell lines, and from ~500 individual cells, with extensive analysis of 247 cells from the U87 and WI-38 cell lines. Sequencing was done on the 3'-end of the transcript molecules. The first read contains the cell-identifying barcode that was present on the capture bead; the second read contains a unique molecular identifier (UMI) barcode, a lane-identifying barcode, and then the sequence of the transcript.
Summary: Single cell transcriptomics has emerged as a powerful approach to dissecting phenotypic heterogeneity in complex, unsynchronized cellular populations. However, many important biological questions demand quantitative analysis of large numbers of individual cells. Hence, new tools are urgently needed for efficient, inexpensive, and parallel manipulation of RNA from individual cells. We report a simple microfluidic platform for trapping single cell lysates in sealed, picoliter microwells capable of “printing” RNA on glass or capturing RNA on polymer beads. To demonstrate the utility of our system for single cell transcriptomics, we developed a highly scalable technology for genome-wide, single cell RNA-Seq. The current implementation of our device is pipette-operated, profiles hundreds of individual cells in parallel with library preparation costs of ~$0.10-$0.20/cell, and includes five lanes for simultaneous experiments. We anticipate that this system will ultimately serve as a general platform for large-scale single cell transcriptomics, compatible with both imaging and sequencing readouts.
Type: Expression profiling by high throughput sequencing
Experimental Protocol: RNA was extracted from individual cells in individual microfluidic chambers following cell lysis by Triton X-100 and freeze-thaw. mRNA from individual cells was reverse transcribed with a primer containing a cell-identifying barcode followed by oligo(dT). Following second strand synthesis using DNA Polymerase I and reagents from the MessageAmp II kit (Ambion), ds-cDNA from all barcoded individual cells was pre-amplified by in vitro transcription using T7 RNA polymerase in a pool. The pools of amplified RNA from each lane of the microfluidic device were individually reverse transcribed using barcoded random hexamers containing a unique molecular identifier (random 8-base barcode) followed by a lane-identifying barcode (6-base barcode). Illumina adapters were inserted on either end of the library during the two previous reverse transcription steps and were then used to enrich the library by PCR. The pooled library was sequenced on an Illumina NextSeq 500.
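The read 2 layout described above (8-base UMI, then 6-base lane-identifying barcode, then transcript sequence) can be illustrated with a minimal Python sketch. This is not the authors' script: the function name `split_read2`, the lane names, and the lane barcode sequences are hypothetical, and the one-mismatch tolerance follows the single-base mismatch allowance stated in the data processing notes of this record.

```python
from typing import Dict, NamedTuple, Optional

UMI_LEN = 8      # random 8-base UMI, per the protocol text
LANE_BC_LEN = 6  # 6-base lane-identifying barcode, per the protocol text


class Read2(NamedTuple):
    umi: str
    lane: str
    transcript: str


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def split_read2(seq: str, lane_barcodes: Dict[str, str],
                max_mismatch: int = 1) -> Optional[Read2]:
    """Split a read 2 sequence into UMI, lane, and transcript portion.

    Returns None if the read is too short or if the observed lane barcode
    does not match exactly one known barcode within max_mismatch.
    """
    if len(seq) <= UMI_LEN + LANE_BC_LEN:
        return None
    umi = seq[:UMI_LEN]
    observed = seq[UMI_LEN:UMI_LEN + LANE_BC_LEN]
    hits = [name for name, bc in lane_barcodes.items()
            if hamming(observed, bc) <= max_mismatch]
    if len(hits) != 1:  # no match, or ambiguous between lanes
        return None
    return Read2(umi, hits[0], seq[UMI_LEN + LANE_BC_LEN:])


# Hypothetical lane barcodes, for illustration only.
LANES = {"lane1": "ACGTAC", "lane2": "TTGGCC"}
example = split_read2("AAAACCCC" + "ACGTAA" + "GATTACAGATTACA", LANES)
```

In this example the observed lane barcode `ACGTAA` differs from `ACGTAC` at one position, so the read is assigned to the hypothetical `lane1`, and only the trailing transcript portion would be passed to the aligner.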
Data processing: We collected the set of reads that uniquely mapped to the transcriptome and assigned each read an address composed of its cell-identifying barcode, gene, UMI, and mapping position. We then filtered the reads to identify unique molecules. Reads with identical addresses were collapsed to a single molecule. In addition, reads with identical cell-identifying barcodes, genes, and mapping positions, and with UMIs having a Hamming distance less than or equal to two, were collapsed to a single molecule. Finally, because the mapping positions produced by STAR do not necessarily correspond to the beginning of a read, we further considered reads to originate from identical molecules if they had identical genes, cell-identifying barcodes, UMIs with a Hamming distance less than or equal to two, and a mapping position within five bases.
To identify barcodes that correspond to actual individual cells in our device, we filtered the observed cell-identifying barcodes by progressively downsampling the corresponding gene profiles to the same number of total reads and assessing the number of unique molecules detected from each cell-identifying barcode. After excluding cell-identifying barcodes having zero associated molecules, we found the distribution of associated unique molecules to be bimodal, with one small subpopulation having nearly as many unique molecules as reads at low read totals. We found the size of this subpopulation to be in excellent agreement with our device imaging data. We took these 598 profiles to represent the actual individual cells captured in our device with a barcoded bead.
We conducted more detailed analysis on 370 single-cell profiles with the highest coverage in our data set across all five lanes of the microfluidic device. Raw fastq data from read 2 of those 370 cells is provided here. Note that the UMI for each read appears in the comment line of each fastq entry.
The processed data files contain the number of molecules counted for each gene based on counting reads with HTSeq and filtering the UMIs to identify unique molecules. If two UMIs had a Hamming distance less than three, they were considered to be the same UMI. If two reads with identical UMIs mapped to the transcriptome to within 6 bases of each other, they were considered identical molecules.
Genome_build: hg19
Supplementary_files_format_and_content: The processed data files contain the number of molecules counted for each gene based on counting reads with HTSeq and filtering the UMIs to identify unique molecules.
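The molecule-collapsing rule above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: it uses a greedy pass in which each read is compared against already-accepted molecules for the same cell and gene, whereas the record does not say how chains of near-identical reads were resolved. The function name `count_molecules` and the input tuple layout are hypothetical. The position window is a parameter because the record gives both "within five bases" and "within 6 bases".

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, str, str, int]  # (cell_barcode, gene, umi, mapping_position)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def count_molecules(reads: Iterable[Read],
                    max_umi_dist: int = 2,
                    max_pos_dist: int = 5) -> Dict[Tuple[str, str], int]:
    """Count unique molecules per (cell, gene).

    A read is treated as a duplicate of an accepted molecule when both share
    the cell barcode and gene, their UMIs differ by at most max_umi_dist
    bases, and their mapping positions differ by at most max_pos_dist bases.
    Greedy assignment is an assumption of this sketch.
    """
    accepted: Dict[Tuple[str, str], List[Tuple[str, int]]] = defaultdict(list)
    for cell, gene, umi, pos in sorted(reads):  # sorted for determinism
        molecules = accepted[(cell, gene)]
        duplicate = any(
            hamming(umi, u) <= max_umi_dist and abs(pos - p) <= max_pos_dist
            for u, p in molecules
        )
        if not duplicate:
            molecules.append((umi, pos))
    return {key: len(molecules) for key, molecules in accepted.items()}


# Made-up reads, for illustration only.
example_reads = [
    ("cellA", "RAN", "AAAAAAAA", 100),
    ("cellA", "RAN", "AAAAAAAT", 103),  # 1 UMI mismatch, 3 bases away: duplicate
    ("cellA", "RAN", "AAAAAAAA", 200),  # same UMI, distant position: new molecule
    ("cellA", "RAN", "CCCCCCCC", 100),  # different UMI: new molecule
    ("cellB", "RAN", "AAAAAAAA", 100),  # different cell: counted separately
]
counts = count_molecules(example_reads)
```

With these made-up reads, the hypothetical `cellA` yields three molecules for RAN and `cellB` yields one.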
Read 1 of our single cell RNA-Seq data contains a cell-identifying barcode sequence followed by poly(dT), and read 2 contains an 8-base UMI followed by a 6-base lane-identifying barcode and a transcript sequence. We first demultiplex the reads based on the lane-identifying barcode while recording the corresponding UMI using a custom Python script. We then map the remainder of read 2 to the human genome and transcriptome (hg19, Ensembl annotation from Illumina iGenomes) using the STAR aligner. Mapped reads for each lane are then demultiplexed based on the cell-identifying barcodes in read 1 and assigned to a gene using HTSeq. Both the lane- and cell-identifying barcodes were allowed to have a single-base mismatch during demultiplexing. We note that >30% of reads map to the PhiX spike-in genome from the Illumina sequencing kit.
We collected the set of reads that uniquely mapped to the transcriptome and assigned each read an address composed of its cell-identifying barcode, gene, UMI, and mapping position. We then filtered the reads to identify unique molecules. Reads with identical addresses were collapsed to a single molecule. In addition, reads with identical cell-identifying barcodes, genes, and mapping positions, and with UMIs having a Hamming distance less than or equal to two, were collapsed to a single molecule. Finally, because the mapping positions produced by STAR do not necessarily correspond to the beginning of a read, we further considered reads to originate from identical molecules if they had identical genes, cell-identifying barcodes, UMIs with a Hamming distance less than or equal to two, and a mapping position within five bases.
To identify barcodes that correspond to actual individual cells in our device, we filtered the observed cell-identifying barcodes by progressively downsampling the corresponding gene profiles to the same number of total reads and assessing the number of unique molecules detected from each cell-identifying barcode. After excluding cell-identifying barcodes having zero associated molecules, we found the distribution of associated unique molecules to be bimodal, with one small subpopulation having nearly as many unique molecules as reads at low read totals. We found the size of this subpopulation to be in excellent agreement with our device imaging data. We took these 598 profiles to represent the actual individual cells captured in our device with a barcoded bead.
We conducted more detailed analysis on 396 single-cell profiles with the highest coverage in our data set across all five lanes of the microfluidic device for PS034 and 247 single-cell profiles from PS041. Raw fastq data from read 2 of those 396 cells from PS034 and 247 cells from PS041 are provided here. Note that the UMI for each read appears in the comment line of each fastq entry.
The processed data files contain the number of molecules counted for each gene based on counting reads with HTSeq and filtering the UMIs to identify unique molecules. If two UMIs had a Hamming distance less than three, they were considered to be the same UMI. If two reads with identical UMIs mapped to the transcriptome to within 6 bases of each other, they were considered identical molecules. In addition, any molecules considered to be identical by the above definition but that appeared in two different cells were eliminated to mitigate any sources of cross-talk resulting from PCR recombination.
Genome_build: hg19
Supplementary_files_format_and_content: The processed data files contain the number of molecules counted for each gene based on counting reads with HTSeq and filtering the UMIs to identify unique molecules.
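The cross-talk filter described above (dropping molecules that appear in more than one cell) might look like the following Python sketch. It is a simplification and not the authors' implementation: molecules are matched across cells by exact (gene, UMI, position) identity, whereas the record applies the tolerant definition (UMI Hamming distance and a position window). The function name `drop_cross_cell_molecules` and the tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Molecule = Tuple[str, str, str, int]  # (cell_barcode, gene, umi, mapping_position)


def drop_cross_cell_molecules(molecules: List[Molecule]) -> List[Molecule]:
    """Remove every molecule whose (gene, UMI, position) occurs in >1 cell.

    Assumes molecules were already collapsed within each cell. Matching here
    is exact, which is a simplification of the tolerant rule in the text.
    """
    cells_by_key: Dict[Tuple[str, str, int], Set[str]] = defaultdict(set)
    for cell, gene, umi, pos in molecules:
        cells_by_key[(gene, umi, pos)].add(cell)
    return [
        (cell, gene, umi, pos)
        for cell, gene, umi, pos in molecules
        if len(cells_by_key[(gene, umi, pos)]) == 1
    ]


# Made-up molecules, for illustration only.
example_molecules = [
    ("cellA", "CYCS", "AAAAAAAA", 10),   # also seen in cellB: removed
    ("cellB", "CYCS", "AAAAAAAA", 10),   # also seen in cellA: removed
    ("cellA", "PITX1", "CCCCCCCC", 50),  # unique to cellA: kept
]
filtered = drop_cross_cell_molecules(example_molecules)
```

Note that both copies of a shared molecule are dropped rather than one being kept, which matches the wording "were eliminated" in the record.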
Platform: GPL18573
Public On: May 21 2015

Cell Groups

Differential Expression Gene List

Gene Symbol                 Ensembl ID         FDR
__ambiguous[PCSK7+TAGLN]                       8.27172174270978e-09
PITX1                       ENSG00000069011    2.51974412457122e-05
H3F3B                       ENSG00000132475    3.39731684573505e-05
IGFBP4                      ENSG00000141753    3.86971604162343e-05
ANXA6                       ENSG00000197043    4.23585549372763e-05
RAN                         ENSG00000132341    7.70207279301249e-05
ATP5G3                      ENSG00000154518    9.62116127657263e-05
MTRNR2L3                    ENSG00000256222    9.62116127657263e-05
EIF3E                       ENSG00000104408    9.94997245801655e-05
CYCS                        ENSG00000172115    0.00012653031894827
Displaying 1-10 of 77 results.