DeepVISP (Deep learning for viral integration sites prediction)



  Visualization of the kernel-specific motifs of the first convolutional layer in DeepVISP and the distribution of the PWMs for each motif.


The visualizations are based on the benchmark data set. First, we need the output of the first convolutional layer for the input DNA sequences. This output is called the 'activations'. You can think of a kernel as a position-weight matrix (PWM); the activations are the result of sliding the kernel over an input sequence one position at a time. The result is a vector of numbers, and the position of the maximum activation marks the subsequence to which the kernel is most similar. If the maximum activation is above a certain threshold, we extract the subsequence of kernel length that starts at that position in the input. Doing this for all input sequences yields a set of subsequences of equal length, from which we compute a PWM. In addition, we drew heatmaps to show the distributions of the PWMs for each motif.
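
The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the DeepVISP code: the function names (one_hot, activations, kernel_pwm), the A/C/G/T column order, and the threshold value are assumptions made for the example, and the sketch leaves out any bias term or non-linearity that a trained convolutional layer would apply.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed column order of the one-hot encoding


def one_hot(seq):
    """Encode a DNA string as a (len, 4) matrix; unknown bases stay all-zero."""
    mat = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = ALPHABET.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat


def activations(seq, kernel):
    """Slide a (k, 4) kernel over the sequence, one position at a time."""
    x = one_hot(seq)
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel)
                     for i in range(len(seq) - k + 1)])


def kernel_pwm(seqs, kernel, threshold):
    """Build a PWM from the best-matching subsequence of each input sequence.

    A sequence contributes only if its maximum activation exceeds the
    (hypothetical) threshold. Returns None when no sequence qualifies.
    """
    k = kernel.shape[0]
    counts = np.zeros((k, 4))
    n_hits = 0
    for seq in seqs:
        if len(seq) < k:
            continue  # too short to scan with this kernel
        act = activations(seq, kernel)
        pos = int(np.argmax(act))
        if act[pos] > threshold:
            counts += one_hot(seq[pos:pos + k])
            n_hits += 1
    if n_hits == 0:
        return None
    # Base frequencies per position; a row sums to 1 unless unknown bases occur.
    return counts / n_hits


# Toy usage: a kernel that matches "ACG" exactly.
toy_kernel = one_hot("ACG")
toy_pwm = kernel_pwm(["TTACGTT", "ACGAAAA", "TTTTTTT"], toy_kernel, threshold=2.5)
```

In this toy run the first two sequences contain "ACG" and contribute to the PWM, while the third never exceeds the threshold and is skipped. Each row of the resulting matrix holds the base frequencies at one motif position, which is the kind of matrix a heatmap can display.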