After running bismark
1. test_dataset_bismark_bt2.bam (contains all alignments plus methylation call strings)
2. test_dataset_bismark_SE_report.txt (contains alignment and methylation summary)
Use samtools view -h filename to inspect the BAM file in the directory.
Each column of the file carries the following information:
QNAME (seq-ID)
FLAG (this flag tries to take the strand a bisulfite read originated from into account (this is different from ordinary DNA alignment flags!))
RNAME (chromosome)
POS (start position)
MAPQ (calculated for Bowtie 2 and HISAT2)
CIGAR
RNEXT
PNEXT
TLEN
SEQ
QUAL (Phred33 scale)
NM-tag (edit distance to the reference)
MD-tag (base-by-base mismatches to the reference)
XM-tag (methylation call string)
XR-tag (read conversion state for the alignment)
XG-tag (genome conversion state for the alignment)
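To make the column layout above concrete, here is a minimal Python sketch that splits one alignment line (as printed by samtools view) into the eleven mandatory fields and the optional tags. The example read is made up for illustration and is not taken from a real Bismark run; the function name parse_sam_line is also just an assumption of this sketch.

```python
def parse_sam_line(line):
    """Split one tab-separated SAM alignment line into fields and tags."""
    fields = line.rstrip("\n").split("\t")
    names = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
    record = dict(zip(names, fields[:11]))
    # These mandatory columns are integers in the SAM format.
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    # Optional fields look like TAG:TYPE:VALUE, e.g. XM:Z:h..x...h..
    tags = {}
    for item in fields[11:]:
        tag, type_code, value = item.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    record["tags"] = tags
    return record


# Hypothetical alignment line, for illustration only.
example = "\t".join([
    "read_1", "0", "6", "100", "42", "10M", "*", "0", "0",
    "TGATGGATAG", "IIIIIIIIII",
    "NM:i:3", "MD:Z:0C2C3C2", "XM:Z:h..x...h..", "XR:Z:CT", "XG:Z:CT",
])
record = parse_sam_line(example)
print(record["RNAME"], record["POS"], record["tags"]["XM"])
```

Header lines (those starting with @, shown because of the -h option) are not alignment records and would need to be skipped before calling the function.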
The XM-tag, which holds the methylation call, deserves special attention. Its format typically looks like XM:Z:h..x...h.., so what does each character mean?
z - C in CpG context - unmethylated
Z - C in CpG context - methylated
x - C in CHG context - unmethylated
X - C in CHG context - methylated
h - C in CHH context - unmethylated
H - C in CHH context - methylated
u - C in Unknown context (CN or CHN) - unmethylated
U - C in Unknown context (CN or CHN) - methylated
. - not a C or irrelevant position
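The legend above can be turned into a small tally. The following Python sketch counts methylated and unmethylated calls per context in one XM string; the helper name summarize_calls and the context labels are choices made for this sketch, not part of Bismark.

```python
from collections import Counter

# Lower-case letter -> sequence context, following the legend above.
CONTEXT_OF = {"z": "CpG", "x": "CHG", "h": "CHH", "u": "Unknown"}


def summarize_calls(xm):
    """Count (context, state) pairs in a methylation call string."""
    counts = Counter()
    for ch in xm:
        context = CONTEXT_OF.get(ch.lower())
        if context is None:
            continue  # '.' marks a non-C or irrelevant position
        state = "methylated" if ch.isupper() else "unmethylated"
        counts[(context, state)] += 1
    return counts


summary = summarize_calls("h..x...h..")
for (context, state), n in sorted(summary.items()):
    print(context, state, n)
```

For the example string h..x...h.. this yields two unmethylated CHH calls and one unmethylated CHG call.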
For a more detailed walkthrough of the BAM file, see https://www.jianshu.com/p/ba89ec471dfe
After running bismark_methylation_extractor
CpG_context_test_dataset_bismark_bt2.txt.gz
CHG_context_test_dataset_bismark_bt2.txt.gz
CHH_context_test_dataset_bismark_bt2.txt.gz
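as well as a bedGraph file and a Bismark coverage file.
The output of the methylation extractor can be transformed into a bedGraph and coverage file using the option --bedGraph.
The bedGraph and Bismark coverage files are thus another, aggregated form (a transform) of the methylation extractor output.
The methylation extractor output itself, i.e. the .gz files listed above, looks like this when opened: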
HWUSI-EAS611_0006:3:1:1058:15806#0/1 - 6 91793279 z
HWUSI-EAS611_0006:3:1:1058:17564#0/1 + 8 122855484 Z
Each column carries the following information:
1. seq-ID
2. methylation state (+ for a methylated call, - for an unmethylated call)
3. chromosome
4. start position (= end position)
5. methylation call
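As a sketch of how these five columns can be read programmatically, the Python snippet below parses the two example lines shown above. It assumes the columns are separated by whitespace (tabs in the actual file) and that any header line has already been skipped; the names MethylationCall and parse_extractor_line are introduced here for illustration only.

```python
from typing import NamedTuple


class MethylationCall(NamedTuple):
    read_id: str       # column 1: seq-ID
    methylated: bool   # column 2: methylation state ('+' or '-')
    chromosome: str    # column 3
    position: int      # column 4: start position (= end position)
    call: str          # column 5: methylation call letter


def parse_extractor_line(line):
    """Parse one data line of the methylation extractor output."""
    read_id, state, chromosome, position, call = line.split()
    return MethylationCall(read_id, state == "+", chromosome,
                           int(position), call)


lines = [
    "HWUSI-EAS611_0006:3:1:1058:15806#0/1 - 6 91793279 z",
    "HWUSI-EAS611_0006:3:1:1058:17564#0/1 + 8 122855484 Z",
]
calls = [parse_extractor_line(line) for line in lines]
for c in calls:
    print(c.chromosome, c.position, c.call, c.methylated)
```

In both example lines the second and fifth columns agree: the lower-case z comes with -, the upper-case Z with +.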
bedGraph file:
Its columns are:
<chromosome> <start position> <end position> <methylation percentage>
The Bismark coverage file contains a little more: not only the methylation percentage but also the actual counts.
<chromosome> <start position> <end position> <methylation percentage> <count methylated> <count unmethylated>
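To illustrate how per-read calls relate to the coverage columns, here is a rough Python sketch that aggregates calls per position. It assumes the percentage is methylated / (methylated + unmethylated) * 100 and that start and end hold the same position in the coverage file; the input calls are made up (only the two positions come from the example lines earlier), and coverage_rows is a hypothetical helper, not a Bismark function.

```python
from collections import defaultdict


def coverage_rows(calls):
    """Aggregate (chromosome, position, is_methylated) calls per position.

    Returns rows shaped like the coverage file columns:
    (chromosome, start, end, percentage, count methylated, count unmethylated)
    """
    counts = defaultdict(lambda: [0, 0])  # [methylated, unmethylated]
    for chromosome, position, is_methylated in calls:
        counts[(chromosome, position)][0 if is_methylated else 1] += 1
    rows = []
    for (chromosome, position), (meth, unmeth) in sorted(counts.items()):
        percentage = 100.0 * meth / (meth + unmeth)
        # Assumption of this sketch: start and end are the same position.
        rows.append((chromosome, position, position, percentage, meth, unmeth))
    return rows


# Made-up calls for illustration.
calls = [
    ("6", 91793279, False),
    ("8", 122855484, True),
    ("8", 122855484, True),
    ("8", 122855484, False),
]
rows = coverage_rows(calls)
for row in rows:
    print(*row, sep="\t")
```

A bedGraph-style line would keep only the first four columns of each row; its coordinate convention may differ from the coverage file, which this sketch does not model.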
Running bismark2report
It attempts to find Bismark alignment, deduplication and methylation extraction (splitting) reports as well as M-bias files.
In short, it produces a visualization of the alignment results.
The M-bias plot
The M-bias plot can, for example, show the methylation bias at the start of reads in PBAT-Seq experiments (the underlying M-bias data is produced during the methylation extractor step).
Running bismark2summary
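It first identifies the BAM files, then uses each BAM file name to scan the current directory for the different Bismark alignment, deduplication and methylation extraction (splitting) reports.
Both bismark2report and bismark2summary generate their output as HTML files.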