Genome annotation methods used in the literature (excluding repeat sequences and ncRNA)

Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement (NG, 2019)

1. Homology-based annotation

For homolog evidence, 744,030 annotated protein sequences of six species (Arabidopsis thaliana, Brachypodium distachyon, Oryza sativa, Setaria italica, Sorghum bicolor, Zea mays) were aligned to the genome using exonerate, and then clustered and filtered to result in the final homolog gene set.
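The "clustered and filtered" step is not described in detail here. One plausible reading, shown as a minimal sketch, is to group overlapping protein alignments on the same strand into loci and keep the best-scoring alignment per locus. The `Hit` record and `best_hit_per_locus` function are hypothetical names, and the overlap rule is an assumption rather than the authors' published procedure.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Hit:
    """One protein-to-genome alignment (hypothetical record, 1-based inclusive)."""
    seqid: str
    start: int
    end: int
    strand: str
    score: float
    protein: str


def best_hit_per_locus(hits):
    """Cluster overlapping same-strand hits and keep the top-scoring hit of each cluster."""
    ordered = sorted(hits, key=lambda h: (h.seqid, h.strand, h.start))
    kept = []
    for _, group in groupby(ordered, key=lambda h: (h.seqid, h.strand)):
        cluster, cluster_end = [], None
        for hit in group:
            # A hit starting past the current cluster's end opens a new locus.
            if cluster and hit.start > cluster_end:
                kept.append(max(cluster, key=lambda h: h.score))
                cluster = []
            cluster_end = hit.end if not cluster else max(cluster_end, hit.end)
            cluster.append(hit)
        if cluster:
            kept.append(max(cluster, key=lambda h: h.score))
    return kept
```

Real pipelines usually add further filters (coverage, identity, intact ORF); those thresholds are not given in these notes, so they are left out.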

2. Transcriptome-based annotation

The authors generated 327,904 high-quality full-length transcripts from Iso-Seq and 1,795,841 Trinity-assembled transcripts from RNA-seq. The transcripts from RNA-seq and Iso-Seq were further validated by PASA.

3. De novo prediction

We used Augustus and FGENESH trained on 2,000 homolog genes which were supported by Iso-Seq full-length transcripts and monocots transcripts, respectively.
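How "supported by full-length transcripts" was decided is not stated in these notes. As a sketch under one common assumption, a multi-exon gene model counts as supported when its complete intron chain matches that of a transcript on the same sequence and strand. The dictionary layout, `intron_chain`, and `select_training_genes` are hypothetical names used only for illustration.

```python
def intron_chain(exons):
    """Return the introns implied by (start, end) exons, 1-based inclusive."""
    ordered = sorted(exons)
    return tuple(
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
    )


def select_training_genes(gene_models, transcripts, limit=2000):
    """Keep multi-exon gene models whose full intron chain matches a transcript."""
    supported = {
        (t["seqid"], t["strand"], intron_chain(t["exons"]))
        for t in transcripts
        if len(t["exons"]) > 1
    }
    selected = []
    for gene in gene_models:
        chain = intron_chain(gene["exons"])
        # Single-exon models have an empty chain and are skipped here.
        if chain and (gene["seqid"], gene["strand"], chain) in supported:
            selected.append(gene["id"])
            if len(selected) == limit:
                break
    return selected
```

Comparing intron chains rather than exon coordinates tolerates differences in UTR length at the transcript ends; the default `limit` of 2,000 simply mirrors the number quoted above.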

4. Integration

All the evidence was submitted to MAKER resulting in 40,936 gene models and 48,224 transcripts. The output of MAKER was refined again by PASA only retaining the validated transcripts.
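The final refinement keeps only validated transcripts. A minimal sketch of such a filter on GFF3 records is given below, assuming the set of validated transcript IDs has already been obtained (for example from the PASA comparison). The function names and the ID set are hypothetical; this is not MAKER's or PASA's own output logic.

```python
def parse_attrs(column9):
    """Parse a GFF3 attribute column ("key=value;key=value") into a dict."""
    return dict(f.split("=", 1) for f in column9.strip().split(";") if "=" in f)


def keep_validated(gff3_lines, validated_ids):
    """Keep validated mRNAs, their child features, and their parent genes."""
    rows = [
        line.rstrip("\n").split("\t")
        for line in gff3_lines
        if line.strip() and not line.startswith("#")
    ]
    kept_mrna, kept_genes = set(), set()
    for row in rows:
        attrs = parse_attrs(row[8])
        if row[2] == "mRNA" and attrs.get("ID") in validated_ids:
            kept_mrna.add(attrs["ID"])
            if "Parent" in attrs:
                kept_genes.add(attrs["Parent"])
    kept = []
    for row in rows:
        attrs = parse_attrs(row[8])
        if row[2] == "gene":
            keep = attrs.get("ID") in kept_genes
        elif row[2] == "mRNA":
            keep = attrs.get("ID") in kept_mrna
        else:
            # Exons, CDS, UTRs: keep when any parent transcript was kept.
            keep = any(p in kept_mrna for p in attrs.get("Parent", "").split(","))
        if keep:
            kept.append("\t".join(row))
    return kept
```

A gene is dropped automatically once none of its transcripts remain, which is one way the gene count can fall during this step.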

The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication

1. Homology-based annotation

2. Transcriptome-based annotation

RNA-seq and Iso-Seq reads were mapped onto the reference genome using TopHat and Bowtie 2, respectively. Hints with locations of potential intron–exon boundaries were generated from the alignment files with the software package BAM2hints in the MAKER package. MAKER with AUGUSTUS was then used to predict genes in the repeat-masked reference genome.
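To illustrate what an intron hint is, the sketch below derives intron intervals from the `N` (skipped region) operations of SAM CIGAR strings and counts how many alignments support each one. It is not the BAM2hints implementation. The tab-separated output only follows the general shape of AUGUSTUS hint lines, and the `source`, `pri`, and `src` values are placeholder assumptions.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_cigar(pos, cigar):
    """Return 1-based inclusive intron intervals implied by N operations."""
    introns, ref = [], pos
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op == "N":
            introns.append((ref, ref + n - 1))
        if op in "MDN=X":  # operations that consume reference bases
            ref += n
    return introns


def intron_hints(alignments, source="b2h", src_tag="E"):
    """Collapse (seqid, pos, cigar) alignments into hint lines with a multiplicity."""
    counts = Counter()
    for seqid, pos, cigar in alignments:
        for start, end in introns_from_cigar(pos, cigar):
            counts[(seqid, start, end)] += 1
    return [
        f"{seqid}\t{source}\tintron\t{start}\t{end}\t0\t.\t.\tmult={mult};pri=4;src={src_tag}"
        for (seqid, start, end), mult in sorted(counts.items())
    ]
```

For example, a read aligned at position 101 with CIGAR `50M200N50M` covers reference bases 101 to 150, skips 151 to 350, and resumes at 351, so the implied intron is 151 to 350.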

3. De novo prediction

AUGUSTUS, SNAP and GeneMark were used for ab initio gene prediction, using model training based on coding sequences from A. ipaensis, A. duranensis, G. max and A. thaliana.

4. Integration

Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L. (NG, 2018)

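MAKER was run for two rounds, followed by extensive manual adjustment. Only the first round is recorded here; see the paper for details.
1. Homology-based annotation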

2. Transcriptome-based annotation
Trinity assembled transcripts (genome-guided) were fed to PASA. The PASA-assembled transcripts were used for training.

3. De novo prediction

SNAP, GENEMARK and AUGUSTUS were each trained with those selected proteins.

4. Integration

MAKER pipeline was used to integrate multiple tiers of coding evidence, including ab initio gene prediction, transcript evidence and protein evidence and generate a comprehensive set of protein-coding genes.

Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense

1. Homology-based annotation

For the homolog-based approach, GeMoMa (version 1.3.1) software was applied by using protein sequences from Populus trichocarpa, Arabidopsis thaliana, Vitis vinifera, Theobroma cacao and Gossypium raimondii.

2. Transcriptome-based annotation

For the transcript-based prediction, the Hisat (version 2.0.4) and Stringtie (version 1.2.3) programs were used to carry out reference-based transcriptome assembly (data from NCBI BioProject of PRJNA248163 and PRJNA266265). TransDecoder (version 2.0; https://github.com/TransDecoder/TransDecoder/) and GeneMarkS-T (version 5.1) were used to predict genes based on transcripts. The PASA (version 2.0.2) software was used to predict genes based on unigenes and full-length transcripts from the PacBio sequencing.

3. De novo prediction

For the de novo prediction, five software programs were used, including Genscan, Augustus (version 2.4), GlimmerHMM (version 3.0.4), GeneID (version 1.4) and SNAP (version 2006-07-28) to scan the repeat-masked genome.

4. Integration

Gene models from these different approaches were combined using the EVM software (version 1.1.1).
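EVM combines evidence according to a weights file with three tab-separated columns: evidence class, source label, and weight. The sketch below renders such a file. The source labels and weight values are hypothetical and are not taken from the paper; in practice each label must match the source column of the corresponding GFF3 input.

```python
VALID_CLASSES = {
    "ABINITIO_PREDICTION",
    "PROTEIN",
    "TRANSCRIPT",
    "OTHER_PREDICTION",
}


def render_weights(entries):
    """Render (evidence_class, source, weight) tuples as weights-file text."""
    lines = []
    for evidence_class, source, weight in entries:
        if evidence_class not in VALID_CLASSES:
            raise ValueError(f"unknown evidence class: {evidence_class}")
        if weight < 0:
            raise ValueError(f"negative weight for {source}")
        lines.append(f"{evidence_class}\t{source}\t{weight}")
    return "\n".join(lines) + "\n"


# Hypothetical labels and weights, for illustration only.
EXAMPLE_WEIGHTS = [
    ("ABINITIO_PREDICTION", "augustus", 1),
    ("ABINITIO_PREDICTION", "snap", 1),
    ("PROTEIN", "gemoma", 5),
    ("TRANSCRIPT", "pasa", 10),
]
```

Giving transcript-derived models a higher weight than ab initio predictions is a common starting point, but the actual weights used for the cotton genomes are not given in these notes.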

The rubber tree genome reveals new insights into rubber production and species adaptation (NP, 2016)

1. Homology-based annotation

SPALN was used for protein homologue search with the parameters "-Q4 -O0 -M10 -H180" against proteins in Malpighiales from NRDB and UniProt.

2. Transcriptome-based annotation

The assembled transcripts from transcriptome sequencing were used to construct gene models with the PASA software for training the predictors, as well as for extracting the most likely coding sequences (CDS) with the TransDecoder program bundled with PASA.

3. De novo prediction

Four HMM based predictors for ab initio prediction were used, namely AUGUSTUS, GlimmerHMM, SNAP, and FGENESH++. The first three predictors were trained with PASA-built training sets and the FGENESH++ was run with pre-trained parameters specialized for Hevea.

4. Integration

All results from the three types of prediction were integrated by EVM software.

In addition, the authors wrote scripts to filter the above results using four criteria; see the paper for details.

Finally, all gene models were updated and curated by PASA software to confirm the UTR region and alternative splicing form. Highly repetitive genes, such as “Retro-transposon”, were manually removed from the candidates.
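The removal of repeat-like genes was done manually in the paper. A small script can at least flag candidates for that manual review, as in the sketch below. The regular expression is a hypothetical keyword match built around the one example quoted above ("Retro-transposon"); the input is assumed to be a mapping from gene ID to functional description.

```python
import re

# Hypothetical pattern: matches "Retro-transposon", "retrotransposon", "retro transposon".
REPEAT_PATTERN = re.compile(r"retro[-\s]?transposon", re.IGNORECASE)


def split_repeat_like(descriptions):
    """Split gene IDs into (flagged, kept) by their functional description."""
    flagged = sorted(
        gene_id for gene_id, text in descriptions.items() if REPEAT_PATTERN.search(text)
    )
    flagged_set = set(flagged)
    kept = sorted(gene_id for gene_id in descriptions if gene_id not in flagged_set)
    return flagged, kept
```

The flagged list is meant as input for manual inspection, not as an automatic deletion list.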
