As summarized in Figure 1 below, NGS-based CNV detection methods can be categorized into five strategies: (1) paired-end mapping (PEM), (2) split read (SR), (3) read depth (RD), (4) de novo assembly of a genome (AS), and (5) a combination of the above approaches (CB).
Table 1 - Summary of paired-end mapping (PEM), split read (SR), and de novo assembly (AS)-based tools for CNV detection using NGS data
Method | URL | Language | Input | Comments | Ref.
PEM-based
BreakDancer | http://breakdancer.sourceforge.net/ | Perl, C++ | Alignment files | Predicting insertions, deletions, inversions, inter- and intra-chromosomal translocations | [1]
PEMer | http://sv.gersteinlab.org/pemer/ | Perl, Python |  | Using simulation-based error models to call SVs | [2]
VariationHunter | http://compbio.cs.sfu.ca/strvar.htm | C |  | Detecting insertions, deletions and inversions | [3]
commonLAW | http://compbio.cs.sfu.ca/strvar.htm | C++ | Alignment files | Aligning multiple samples simultaneously to gain accurate SVs using maximum parsimony model | [4]
GASV | http://code.google.com/p/gasv/ | Java |  | A geometric approach for classification and comparison of structural variants | [5]
Spanner | N/A | N/A | N/A | Using PEM to detect tandem duplications | [6]
SR-based
AGE | http://sv.gersteinlab.org/age | C++ |  | A dynamic-programming algorithm using optimal alignments with gap excision to detect breakpoints | [7]
Pindel | http://www.ebi.ac.uk/~kye/pindel/ | C++ |  | Using a pattern growth approach to identify breakpoints of various SVs | [8]
SLOPE | http://www-genepi.med.utah.edu/suppl/SLOPE | C++ |  | Locating SVs from targeted sequencing data | [9]
SRiC | N/A | N/A | BLAT output | Calibrating SV calling using realistic error models | [10]
AS-based
Magnolya | http://sourceforge.net/projects/magnolya/ | Python |  | Calling CNV from co-assembled genomes and estimating copy number with Poisson mixture model | [11]
Cortex assembler | http://cortexassembler.sourceforge.net/ | C |  | Using alignment of de novo assembled genome to build de Bruijn graph to detect SVs | [12]
TIGRA-SV | http://gmt.genome.wustl.edu/tigra-sv/ | C | SV calls^c + BAM | Local assembly of SVs using the iterative graph routing assembly (TIGRA) algorithm | N/A
^a The specific input format for VariationHunter, including the reads with multiple alignments.
^b File format from MAQ mapview.
^c The file including the structural variations detected using other tools.
Table 2 - Read depth (RD)-based tools for CNV detection using whole genome sequencing data
Tool | URL | Language | Input | Comments | Ref.
SegSeq^a | http://www.broadinstitute.org/cgi-bin/cancer/publications/pub_paper.cgi?mode=view&paper_id=182 | Matlab | Aligned read positions | Detecting CNV breakpoints using massively parallel sequence data | [13]
CNV-seq^a | http://tiger.dbs.nus.edu.sg/cnv-seq/ | Perl, R | Aligned read positions | Identifying CNVs using the difference of observed copy number ratios | [14]
RDXplorer^b | http://rdxplorer.sourceforge.net/ | Python, Shell |  | Detecting CNVs through event-wise testing algorithm on normalized read depth of coverage | [15]
BIC-seq^a | http://compbio.med.harvard.edu/Supplements/PNAS11.html | Perl, R |  | Using the Bayesian information criterion to detect CNVs based on uniquely mapped reads | [16]
CNAseg^a | http://www.compbio.group.cam.ac.uk/software/cnaseg | R |  | Using flowcell-to-flowcell variability in cancer and control samples to reduce false positives | [17]
cn.MOPS^b | http://www.bioinf.jku.at/software/cnmops/ | R | BAM/read count matrices | Modelling of read depths across samples at each genomic position using mixture Poisson model | [18]
JointSLM^b | http://nar.oxfordjournals.org/content/suppl/2011/02/16/ | R |  | Population-based approach to detect common CNVs using read depth data | [19]
ReadDepth | http://code.google.com/p/readdepth/ | R | BED files | Using breakpoints to increase the resolution of CNV detection from low-coverage reads | [20]
rSW-seq^a | http://compbio.med.harvard.edu/Supplements/BMCBioinfo10-2.html | C | Aligned read positions | Identifying CNVs by comparing matched tumor and control sample | [21]
CNVnator | http://sv.gersteinlab.org/ | C++ |  | Using mean-shift approach and performing multiple-bandwidth partitioning and GC correction | [22]
CNVnorm^a | http://www.precancer.leeds.ac.uk/cnanorm | R | Aligned read positions | Identifying contamination level with normal cells | [23]
CMDS | https://dsgweb.wustl.edu/qunyuan/software/cmds | C, R | Aligned read positions | Discovering CNVs from multiple samples | [24]
mrCaNaVar | http://mrcanavar.sourceforge.net/ | C |  | A tool to detect large segmental duplications and insertions | [25]
 | N/A | N/A | N/A | Predicting CNV breakpoints in base-pair resolution | [26]
cnvHMM | http://genome.wustl.edu/software/cnvhmm | C | Consensus sequence from SAMtools | Using HMM to detect CNV | N/A
^a Tools require matched case-control samples as input.
^b Tools use multiple samples as input.
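The matched case-control read-depth idea shared by several tools in Table 2 can be illustrated with a minimal sketch. It is not the algorithm of any listed tool: the function names, the fixed window size, the pseudocount, and the gain/loss thresholds are all assumptions chosen for illustration, and real tools add steps such as GC correction and segmentation.

```python
import math


def window_counts(read_starts, genome_length, window):
    """Count aligned read start positions in fixed-size, non-overlapping windows."""
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < genome_length:
            counts[pos // window] += 1
    return counts


def log2_ratios(case_counts, control_counts, pseudocount=0.5):
    """Per-window log2(case/control), rescaled by total read count.

    The pseudocount (an illustrative choice) avoids division by zero
    in windows with no reads.
    """
    case_total = sum(case_counts)
    control_total = sum(control_counts)
    if case_total == 0 or control_total == 0:
        raise ValueError("both samples need at least one read")
    scale = control_total / case_total  # library-size normalization
    return [
        math.log2(((c + pseudocount) * scale) / (n + pseudocount))
        for c, n in zip(case_counts, control_counts)
    ]


def call_windows(ratios, gain=0.4, loss=-0.6):
    """Label each window using hypothetical log2-ratio thresholds."""
    calls = []
    for r in ratios:
        if r >= gain:
            calls.append("gain")
        elif r <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls
```

A typical use would compute `window_counts` for the case and control samples with the same `genome_length` and `window`, pass both count lists to `log2_ratios`, and then label windows with `call_windows`. Adjacent windows with the same label would still need to be merged into segments before being reported as CNVs.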
Table 3 - Summary of bioinformatics tools for CNV detection using exome sequencing data
Tool | URL | Language | Input | Comments | Ref.
Control-FREEC^a | http://bioinfo-out.curie.fr/projects/freec/ | C++ | SAM/BAM/pileup/Eland, BED, SOAP, arachne, psi (BLAT) and Bowtie formats | Correcting copy number using matched case-control samples or GC contents | [27]
CoNIFER | http://conifer.sf.net/ | Python |  | Using singular value decomposition to normalize copy number and avoiding batch bias by integrating multiple samples | [28]
XHMM | http://atgu.mgh.harvard.edu/xhmm/ | C++ |  | Using principal component analysis to normalize copy number and HMM to detect CNVs | [29]
ExomeCNV^c | http://cran.r-project.org/web/packages/ExomeCNV | R | BAM/pileup | Using read depth and B-allele frequencies from exome sequencing data to detect CNVs and LOHs | [30]
CONTRA | http://contra-cnv.sourceforge.net/ | Python |  | Comparing base-level log-ratios calculated from read depth between case and control samples | [31]
 | http://code.google.com/p/condr/ | Java | Sorted BED files | Using HMM to identify CNVs | [32]
SeqGene | http://seqgene.sourceforge.net | Python, R | SAM/pileup | Calling variants, including CNVs, from exome sequencing data | [33]
PropSeq^c | http://bioinformatics.nki.nl/ocs/ | R, C | N/A | Using the read depth of the case sample as a linear function of that of control sample to detect CNVs | [34]
VarScan2^c |  | Java | BAM/pileup | Using pairwise comparisons of the normalized read depth at each position to estimate CNV | [35]
ExoCNVTest^b | http://www1.imperial.ac.uk/medicine/people/l.coin/ | Java, R |  | Identifying and genotyping common CNVs associated with complex disease | [36]
ExomeDepth^b | http://cran.r-project.org/web/packages/ExomeDepth/index.html | R |  | Using beta-binomial model to fit read depth of WES data | [37]
^a Control-FREEC accepts either matched case-control samples or a single sample as input.
^b Tools use multiple samples as input.
^c Tools require matched case-control samples as input.
Table 4 - Combinatorial bioinformatics tools for CNV detection using NGS data
Method | URL | Language | Input | Combination^a | Ref.
NovelSeq | http://compbio.cs.sfu.ca/strvar.htm | C |  |  | [38]
HYDRA | http://code.google.com/p/hydra-sv/ | Python | Discordant paired-end mappings |  | [39]
CNVer | http://compbio.cs.toronto.edu/CNVer/ | Perl, C++ | BAM/aligned positions |  | [40]
 | http://code.google.com/p/gasv/ | C++ |  |  | [41]
Genome STRiP | http://www.broadinstitute.org/software/ | Java, R |  |  | [42]
SVDetect | http://svdetect.sourceforge.net/ | Perl |  |  | [43]
inGAP-sv | http://ingap.sourceforge.net/ | Java |  |  | [44]
SVseq | http://www.engr.uconn.edu/~jiz08001/svseq.html | C |  |  | [45]
Nord et al. | N/A | N/A | N/A |  | [46]
^a RD: read depth-based approach; PEM: paired-end mapping approach; SR: split read approach; AS: de novo assembly approach.
