Article link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6426083/#ref-11
Article title: Relative Abundance of Transcripts (RATs): Identifying differential isoform abundance from RNA-seq
Glossary
DGE: differential gene expression
DTU: differential transcript (isoform) usage
DTE: differential transcript expression
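Background
1. RATs: an R package that identifies DTU transcriptome-wide directly from transcript abundance estimates. RATs is unique in applying bootstrapping to estimate the reliability of detected DTU events and shows good performance at all replication levels (median false positive fraction < 0.05).
2. The authors then mention two other tools that can detect DTU, DRIM-Seq and SUPPA2, and list a series of advantages. Later in the paper, RATs is compared against both of them.
3. On the relationship between genes and DTU, the paper refers to 248 genes that "have been previously identified as displaying significant DTU".
4. DGE, DTE and DTU are then introduced in more detail. The key point is the difference between DGE and DTE. The original English is so clear that nothing can be trimmed from it, so it is quoted in full: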
High-throughput gene regulation studies have focused primarily on quantifying gene expression and calculating differential gene expression (DGE) between samples in different groups, conditions, treatments, or time-points. However, in higher eukaryotes, alternative splicing of multi-exon genes and/or alternative transcript start and end sites leads to multiple transcript isoforms originating from each gene. Since transcripts represent the executive form of genetic information, analysis of differential transcript expression (DTE) is preferable to DGE. Unfortunately, isoform-level transcriptome analysis is more complex and expensive since, in order to achieve similar statistical power in a DTE study, higher sequencing depth is required to compensate for the expression of each gene being split among its component isoforms. In addition, isoforms of a gene share high sequence similarity and this complicates the attribution of reads among them. Despite these challenges, several studies have shown that isoforms have distinct functions [1–3] and that shifts in individual isoform expression represent a real level of gene regulation [4–7], suggesting there is little justification for choosing DGE over DTE in the study of complex transcriptomes.
It is possible to find significant DTE among the isoforms of a gene, even when the gene shows no significant DGE.
As for DTU: "This introduces the concept of differential transcript usage (DTU), where the abundances of individual isoforms of a gene can change relative to one another, with the most pronounced examples resulting in a change of the dominant isoform (isoform switching)."
Finally, a single figure illustrates the similarities and differences among the three.
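The difference between DGE, DTE and DTU can also be sketched with a toy calculation. The snippet below is only an illustration with made-up abundances for a hypothetical two-isoform gene (the names iso1, iso2, cond_a and cond_b are invented here, not taken from the paper). It shows a gene whose total expression is unchanged between two conditions (no DGE) while each isoform changes (DTE) and the dominant isoform switches (DTU).

```python
def isoform_proportions(abundances):
    """Return each isoform's share of the gene's total abundance."""
    total = sum(abundances.values())
    if total == 0:
        raise ValueError("gene has zero total abundance")
    return {iso: value / total for iso, value in abundances.items()}


# Made-up abundances for one hypothetical gene in two conditions.
cond_a = {"iso1": 80.0, "iso2": 20.0}
cond_b = {"iso1": 20.0, "iso2": 80.0}

# Gene-level view (DGE): the totals are identical, so nothing changes.
gene_total_a = sum(cond_a.values())
gene_total_b = sum(cond_b.values())

# Isoform-level view (DTE): each isoform's own abundance changes.
dte_changes = {iso: cond_b[iso] - cond_a[iso] for iso in cond_a}

# Usage view (DTU): the proportions shift and the dominant isoform switches.
props_a = isoform_proportions(cond_a)
props_b = isoform_proportions(cond_b)
dominant_a = max(props_a, key=props_a.get)
dominant_b = max(props_b, key=props_b.get)

print("gene totals:", gene_total_a, gene_total_b)
print("per-isoform change:", dte_changes)
print("dominant isoform:", dominant_a, "->", dominant_b)
```

A real analysis would of course need replicates and a statistical test; the point here is only which quantity each of the three terms looks at.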
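Existing tools for DTU analysis
References 11, 12, 13 and 14 are, respectively:
There are too many to cover here, since the paper introduces a large number of tools. I recommend reading the original description: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6426083/#ref-11
Finally, the authors state: "We assess the accuracy of RATs in comparison to SUPPA2 and DRIM-Seq and find RATs to perform at as well as or better than its competitors".
Methods
(Middle part omitted.)
The first test dataset comes from another paper, with the following SRA accession: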
RNA-seq data were submitted to the NCBI Short Read Archive with accession number SRA048904.
The other data source is Ensembl v60:
As in the original at study, we used Ensembl v60 32 as the source of the reference human genome and its annotation, in which each of the three discussed genes features two isoforms. Unlike the original study, we used Salmon (v0.7.1, with sequence bias correction enabled, 100 bootstrap iterations and default values for the remaining parameters, using k=21 for the index) to quantify the isoform abundances.
On the format of the input data: "For consistency in the use of abundances normalised for transcript length, RATs and DRIM-Seq were also provided with TPM, but the values were scaled up to the average library size of 25M reads, as their testing methods expect counts and would be under-powered if used directly with TPMs."
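The scaling described in that quotation is simple arithmetic: TPM values sum to one million per sample, so multiplying each by (library size / 1,000,000) gives count-like values that sum to the library size. Below is a minimal sketch of that step, assuming the 25M-read average library size quoted above; the function name and the transcript IDs are hypothetical and this is not the authors' code.

```python
def scale_tpm_to_library(tpm, library_size=25_000_000):
    """Scale TPM values so that they sum to the given library size.

    TPM sums to 1e6 per sample, so the factor is library_size / 1e6.
    """
    factor = library_size / 1_000_000
    return {transcript: value * factor for transcript, value in tpm.items()}


# Made-up TPM values for a toy sample with only two transcripts.
tpm = {"tx1": 600_000.0, "tx2": 400_000.0}
scaled = scale_tpm_to_library(tpm)

print(scaled)
print("scaled total:", sum(scaled.values()))
```

The scaled values keep the transcript-length normalisation of TPM but have the magnitude of read counts, which is what count-based tests need in order not to be under-powered.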
Results
Result 1: the difference in false positives between the gene-level and transcript-level tests
Fig. 2 is described as follows:
Fig.2 shows that the gene-level test has a higher FP fraction than the transcript-level test, irrespective of replication level or effect size, although the two methods converge for highly replicated experiments or large effect sizes. Furthermore, the gene-level test only identifies the presence of a shift in the ratios of the isoforms belonging to the gene, without identifying which specific isoforms are affected. The transcript-level test, in contrast, directly identifies the specific isoforms whose proportions are changing and has fewer false positives than the gene-level test.
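To make "a shift in the ratios of the isoforms belonging to the gene" concrete, here is a rough sketch of a gene-level test on a two-isoform gene. Using a G-test of independence is my own assumption for illustration and is not stated in the passage above; the counts are made up, and the real package also handles replicates, bootstraps and multiple-testing correction, none of which is shown here.

```python
import math


def g_test_2x2(table):
    """G-test of independence on a 2x2 table of counts.

    Rows are conditions, columns are the two isoforms.
    Returns (G statistic, p-value) using a chi-square with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Made-up counts: isoform ratio flips from 30:70 to 70:30 between conditions.
shifted = [[300, 700], [700, 300]]
# Made-up counts: expression doubles but the 30:70 ratio stays the same.
stable = [[300, 700], [600, 1400]]

g_shifted, p_shifted = g_test_2x2(shifted)
g_stable, p_stable = g_test_2x2(stable)

print("shifted ratios: G = %.1f, p = %.3g" % (g_shifted, p_shifted))
print("stable ratios:  G = %.1f, p = %.3g" % (g_stable, p_stable))
```

Note that the test flags the gene as a whole. As the quoted passage says, a gene-level result by itself does not say which isoform is responsible, which is what the transcript-level test adds.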
Result 2: Comparative performance on simulated DTU
Qrep and Rrep in the figure above are explained as follows:
(RATs quantification reproducibility – Qrep, RATs inter-replicate reproducibility - Rrep)
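Result 3: Recapitulating published validated examples of DTU
Result 3 recomputes the DTU reported in a previously published paper. One point is worth mentioning: at first I thought I had misunderstood the table below, but I had not. After reading the explanation that follows, it is clear that RATs really does not perform as well as the other method, SUPPA2, yet the authors listed the result anyway.
The above shows that the number of known transcripts increases with more recent Ensembl versions.
Result 4: Comparison of DTU methods against Deng et al.
This uses the data from this paper, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3699530/, to compare the three methods once more. If you want to practise using the package, you can work with this dataset.
Summary
Takeaway 1: This paper was less comfortable to read than the previous two, perhaps because it covers too few points and its content is rather narrow. The previous paper used four cases to validate its package, whereas this one uses only one dataset, plus a repeat of the analysis across earlier and later Ensembl versions, so there is much less content. In practice, all kinds of papers can get published; only the impact factor differs. The previous paper (No. 2) was in a journal with an impact factor above 14, while this journal no longer has an impact factor (it used to be a little above 2). So there is no need to worry about whether a paper can be published: just submit to a lower-impact journal.
Takeaway 2: Next time, for a paper like this in a lower-impact journal, I can skim it and go straight to how its R package is used, locating the test dataset in the paper or in the package tutorial. Because the paper compares RATs with two other methods, it also points to how those two methods are used. For a lower-impact paper there is not much to borrow in terms of writing; the clauses did not read smoothly to me, though this is only my personal impression.
Takeaway 3: Talk without practice is worthless, so I must practise more. The place to look for uses of this R package is DTU, differential transcript usage. I will get a rough overview first and then search further later, step by step. Rome was not built in a day.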