『R画图脚本进阶』单个基因结构绘制


2020.11.10更新代码
emmm, 稍微想了下把那个辣眼睛的代码改的稍微不辣眼睛了一点...


果然需求是生产力驱动的第一要素.....以前没想过要整,现在又需求了,其实TBtools就可以很方便的绘制,不过还是手痒想用R实现一下。就用ggplot2撸出来了。国际惯例,上图

Gh_A08G039000.jpg

方法和原理

其实就是把gff文件可视化一下,看GFF文件的介绍:陈连福大佬的博客-GFF3格式介绍
我们发现GFF3包含了基因整体信息,所以就想**根据gff文件信息,提取UTR和cds信息用ggplot2geom_segment小片段一段一段的画出来。其实很简单。注意以下几个信息:

  • strand,方向性。
  • exon是否包含UTR。

所以鉴于这两点也是写了两个判断。直来直去的代码有点辣眼睛,不过没时间了,先解决问题,以后在慢慢美化ಥ_ಥ。

Script

更新后, 之前的不删除了,看最后能优化成什么样

drawGeneStructure = function(gff,size,lcolor){
  names(gff) = c("chr","source","type","start","end","score","strand","phase","attributes")
  chr = unique(gff$chr)
  strand = unique(gff$strand)
  a = filter(gff,type == "gene")$attributes
  desc = strsplit(a,split = ";")%>%unlist(.)
  tpm1 = filter(gff,type == "exon")
  tpm3 = filter(gff,type == "five_prime_UTR"| type == "three_prime_UTR")
  if (strand == "-"){
    tpm2 = data.frame(type = paste(tpm1$type,seq(1:nrow(tpm1)),sep = ""),
                      end = tpm1$start,
                      start = tpm1$end,
                      y = 1)
    xend = tpm2$end[1]-100
  } else {
    tpm2 = data.frame(type = paste(tpm1$type,seq(1:nrow(tpm1)),sep = ""),
                      start = tpm1$start,
                      end = tpm1$end,
                      y = 1)
    xend = tpm2$end[1]+100
  }
  p = ggplot(tpm2,aes(x = start,y = y))+
    geom_segment(aes(x = tpm2$start[1],y = 0, xend = tpm2$end[nrow(tpm2)], yend = 0),
                 color = lcolor)+
    geom_segment(data = tpm2,
                 mapping = aes(x = start,xend = end,y = 0, yend = 0,color = type),
                 size = size)+
    geom_segment(aes(x = tpm2$end[1],y = 0, xend = xend, yend = 0),
                 arrow = arrow(length = unit(0.2,"cm")),
                 color = "red")+
    xlab(chr)+
    theme(axis.ticks.y.left = element_blank(),
          axis.ticks.y = element_blank(),
          axis.line.y = element_blank(),
          axis.line.x.top = element_blank(),
          panel.grid = element_blank(),
          axis.text.y = element_blank(),
          plot.background = element_blank(),
          panel.background = element_blank(),
          axis.title.y = element_blank(),
          axis.line = element_line(colour = "black",size = 1),
          legend.position = "none")
  
  if(nrow(tpm3) != 0){
    p = p +
      geom_segment(data = tpm3,
                   mapping = aes(x = start,xend = end,y = 0, yend = 0),
                   size = size,
                   color = "grey")
  } else {
    p = p
  }
  return(p)
}

原始辣眼睛代码

drawGeneStructure = function(gff,size,lcolor){
  names(gff) = c("chr","source","type","start","end","score","strand","phase","attributes")
  chr = unique(gff$chr)
  strand = unique(gff$strand)
  a = filter(gff,type == "gene")$attributes
  desc = strsplit(a,split = ";")%>%unlist(.)
  tpm1 = filter(gff,type == "exon")
  tpm3 = filter(gff,type == "five_prime_UTR"| type == "three_prime_UTR")
  if (nrow(tpm3) != 0) {
    if (strand == "-") {
      tpm2 = data.frame(type = paste(tpm1$type,seq(1:nrow(tpm1)),sep = ""),
                        end = tpm1$start,
                        start = tpm1$end,
                        y = 1)
      tpm3 = data.frame(type = paste(tpm3$type,seq(1:nrow(tpm3)),sep = ""),
                        end = tpm3$start,
                        start = tpm3$end)
      p = ggplot(tpm2,aes(x = start,y = y))+
        geom_segment(aes(x = tpm2$start[1],y = 0, xend = tpm2$end[nrow(tpm2)], yend = 0),
                     color = lcolor)+
        geom_segment(data = tpm2,
                     mapping = aes(x = start,xend = end,y = 0, yend = 0,color = type),
                     size = size)+
        geom_segment(data = tpm3,
                     mapping = aes(x = start,xend = end,y = 0, yend = 0),
                     size = size,
                     color = "grey")+
        geom_segment(aes(x = tpm2$end[1],y = 0, xend = tpm2$end[1]-100, yend = 0),
                     arrow = arrow(length = unit(0.2,"cm")),
                     color = "red")+
        xlab(chr)+
        theme(axis.ticks.y.left = element_blank(),
              axis.ticks.y = element_blank(),
              axis.line.y = element_blank(),
              axis.line.x.top = element_blank(),
              panel.grid = element_blank(),
              axis.text.y = element_blank(),
              plot.background = element_blank(),
              panel.background = element_blank(),
              axis.title.y = element_blank(),
              axis.line = element_line(colour = "black",size = 1),
              legend.position = "none")
    } else {
      tpm2 = data.frame(type = paste(tpm1$type,seq(1:nrow(tpm1)),sep = ""),
                        start = tpm1$start,
                        end = tpm1$end,
                        y = 1)
      p = ggplot(tpm2,aes(x = start,y = y))+
        geom_segment(aes(x = tpm2$start[1],y = 0, xend = tpm2$end[nrow(tpm2)], yend = 0),
                     color = lcolor)+
        geom_segment(data = tpm2,
                     mapping = aes(x = start,xend = end,y = 0, yend = 0,color = type),
                     size = size)+
        geom_segment(data = tpm3,
                     mapping = aes(x = start,xend = end,y = 0, yend = 0),
                     size = size,
                     color = "grey")+
        geom_segment(aes(x = tpm2$end[1],y = 0, xend = tpm2$end[1]+100, yend = 0),
                     arrow = arrow(length = unit(0.2,"cm")),
                     color = "red")+
        xlab(chr)+
        theme(axis.ticks.y.left = element_blank(),
              axis.ticks.y = element_blank(),
              axis.line.y = element_blank(),
              axis.line.x.top = element_blank(),
              panel.grid = element_blank(),
              axis.text.y = element_blank(),
              plot.background = element_blank(),
              panel.background = element_blank(),
              axis.title.y = element_blank(),
              axis.line = element_line(colour = "black",size = 1),
              legend.position = "none")
    }
  } else{
    if (strand == "-") {
      tpm2 = data.frame(type = paste(tpm1$type,seq(1:nrow(tpm1)),sep = ""),
                        end = tpm1$start,
                        start = tpm1$end,
                        y = 1)
      p = ggplot(tpm2,aes(x = start,y = y))+
        geom_segment(aes(x = tpm2$start[1],y = 0, xend = tpm2$end[nrow(tpm2)], yend = 0),
                     color = lcolor)+
        geom_segment(data = tpm2,
                     mapping = aes(x = start,xend = end,y = 0, yend = 0,color = type),
                     size = size)+
        geom_segment(aes(x = tpm2$end[1],y = 0, xend = tpm2$end[1]-100, yend = 0),
                     arrow = arrow(length = unit(0.2,"cm")),
                     color = "red")+
        xlab(chr)+
        theme(axis.ticks.y.left = element_blank(),
              axis.ticks.y = element_blank(),
              axis.line.y = element_blank(),
              axis.line.x.top = element_blank(),
              panel.grid = element_blank(),
              axis.text.y = element_blank(),
              plot.background = element_blank(),
              panel.background = element_blank(),
              axis.title.y = element_blank(),
              axis.line = element_line(colour = "black",size = 1),
              legend.position = "none")
    } else {
      tpm2 = data.frame(type = paste(tpm1$type,seq(1:nrow(tpm1)),sep = ""),
                        start = tpm1$start,
                        end = tpm1$end,
                        y = 1)
      p = ggplot(tpm2,aes(x = start,y = y))+
        geom_segment(aes(x = tpm2$start[1],y = 0, xend = tpm2$end[nrow(tpm2)], yend = 0),
                     color = lcolor)+
        geom_segment(data = tpm2,
                     mapping = aes(x = start,xend = end,y = 0, yend = 0,color = type),
                     size = size)+
        geom_segment(aes(x = tpm2$end[1],y = 0, xend = tpm2$end[1]+100, yend = 0),
                     arrow = arrow(length = unit(0.2,"cm")),
                     color = "red")+
        xlab(chr)+
        theme(axis.ticks.y.left = element_blank(),
              axis.ticks.y = element_blank(),
              axis.line.y = element_blank(),
              axis.line.x.top = element_blank(),
              panel.grid = element_blank(),
              axis.text.y = element_blank(),
              plot.background = element_blank(),
              panel.background = element_blank(),
              axis.title.y = element_blank(),
              axis.line = element_line(colour = "black",size = 1),
              legend.position = "none")
    }
  }
  
return(p)
}

Usage

  • gff3文件提取你要画的基因的结构,如果没有gff3,按照下面的自己写一个df也行:
# 提取
grep "Gh_A08G039000" gh.criv2.gff3
# 提取结果:
A08     EVM     gene    4148620 4154669 .       -       .       ID=Gh_A08G039000;Name=RH8;description=DEAD-box ATP-dependent RNA helicase 8 ;
A08     EVM     mRNA    4148620 4154669 .       -       .       ID=Gh_A08G039000.1;Parent=Gh_A08G039000
A08     EVM     exon    4148620 4148965 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     three_prime_UTR 4148620 4148965 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     three_prime_UTR 4149091 4149130 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     exon    4149091 4149209 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4149131 4149209 .       -       1       Parent=Gh_A08G039000.1
A08     EVM     exon    4149308 4149381 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4149308 4149381 .       -       0       Parent=Gh_A08G039000.1
A08     EVM     exon    4149465 4149553 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4149465 4149553 .       -       2       Parent=Gh_A08G039000.1
A08     EVM     exon    4150111 4150291 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4150111 4150291 .       -       0       Parent=Gh_A08G039000.1
A08     EVM     exon    4150367 4150618 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4150367 4150618 .       -       0       Parent=Gh_A08G039000.1
A08     EVM     exon    4151917 4152155 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4151917 4152155 .       -       2       Parent=Gh_A08G039000.1
A08     EVM     exon    4152683 4152917 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4152683 4152917 .       -       0       Parent=Gh_A08G039000.1
A08     EVM     exon    4153404 4153464 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     CDS     4153404 4153464 .       -       1       Parent=Gh_A08G039000.1
A08     EVM     CDS     4154015 4154283 .       -       0       Parent=Gh_A08G039000.1
A08     EVM     exon    4154015 4154669 .       -       .       Parent=Gh_A08G039000.1
A08     EVM     five_prime_UTR  4154284 4154669 .       -       .       Parent=Gh_A08G039000.1
  • 画图
    参数解释:
    • gff: 就是上面你提取出来的基因gff信息
    • size: 就是图中exon的高度,size越大高度越大,建议size = 4
    • lcolor: 中间那条黑线的颜色,这个看你爱好
drawGeneStructure(gff = gff, lcolor = "black", size = 4)

完工...
为了保住小徽章我也是醉了....

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 203,271评论 5 476
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 85,275评论 2 380
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 150,151评论 0 336
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 54,550评论 1 273
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 63,553评论 5 365
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,559评论 1 281
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 37,924评论 3 395
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,580评论 0 257
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 40,826评论 1 297
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,578评论 2 320
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,661评论 1 329
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,363评论 4 318
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 38,940评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,926评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,156评论 1 259
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 42,872评论 2 349
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,391评论 2 342

推荐阅读更多精彩内容