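After reads are aligned to a reference sequence, the BAM file may contain flags such as 2048 and 2064, which indicate a supplementary alignment. Understanding this concept requires the background below.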
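To see why 2048 and 2064 both mean "supplementary", it helps to break the flag into its bits. The sketch below uses the bit values defined in the SAM specification; the dictionary name `SAM_FLAGS` and the helper `decode_flag` are my own illustrative names, not part of any library.

```python
# SAM flag bits as defined in the SAM specification.
# SAM_FLAGS and decode_flag are hypothetical names used only for illustration.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM flag."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(2048))  # ['supplementary']
print(decode_flag(2064))  # ['reverse_strand', 'supplementary']
```

So 2064 is simply 2048 + 16: a supplementary alignment that lies on the reverse strand.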
Linear Alignment
An alignment of a read to a single reference sequence that may include insertions, deletions, skips and clipping, but may not include direction changes (i.e. one portion of the alignment on forward strand and another portion of alignment on reverse strand). 1
Chimeric Alignment
An alignment of a read that cannot be represented as a linear alignment. Typically, one of the linear alignments in a chimeric alignment is considered the “representative” alignment, and the others are called “supplementary” and are distinguished by the supplementary alignment flag. 1
Chimeric reads are indicative of structural variation in DNA-seq, and they may indicate the presence of chimeric genes in RNA-seq. 2
In short, a chimeric read can be split into two or more parts, each of which is mapped to the reference (it is not hard-clipped), and the total length of the mapped parts may be longer than the read length. 3
Representative alignment
A chimeric alignment that is represented as a set of linear alignments that do not have large overlaps typically has one linear alignment that is considered the representative alignment.
I could not make sense of “representative alignment” from the way the word “representative” translates into my mother tongue, and I could not find more information (such as a figure) about it. My understanding is this: one read can align to multiple positions. Among those alignments, which do not overlap each other much, one is chosen and called the representative alignment; the alignments at the other positions are called supplementary alignments.
It seems that GATK can realign those representative reads to the correct position via RealignerTargetCreator and IndelRealigner. (WARNING: I am not sure that I understand this correctly. If you can help, please leave me a message below. Thanks.)
Supplementary Alignment
A linear alignment that is part of a chimeric alignment but is not the representative alignment.
Primary Alignment and Secondary Alignment
A read may map ambiguously to multiple locations, e.g. due to repeats. Only one of the multiple read alignments is considered primary, and this decision may be arbitrary. All other alignments have the secondary alignment flag.
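The definitions above map onto two flag bits: 0x100 (secondary) and 0x800 (supplementary). A record with neither bit set is the primary line for the read, which is also the representative alignment when the read is chimeric. Here is a minimal sketch of that classification; `classify_alignment` is a hypothetical helper name, and checking the supplementary bit before the secondary bit is my own choice for records that have both set.

```python
# Flag bits from the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_alignment(flag):
    """Classify one SAM record by its flag (hypothetical helper)."""
    if flag & UNMAPPED:
        return "unmapped"
    # Assumption: if both 0x800 and 0x100 are set, report "supplementary".
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    # Neither bit set: the primary (representative) line.
    return "primary"


for flag in (0, 16, 256, 2048, 2064):
    print(flag, classify_alignment(flag))
```

With these rules, 0 and 16 come out as primary, 256 as secondary, and 2048 and 2064 as supplementary.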