miRBase's official explanation of the miRNA nomenclature update

In most cases, understanding the figure below is enough to answer the majority of questions about miRNA naming.


Simplified schematic of the new microRNA nomenclature

(Image found online; the original link has been lost.)

The following is excerpted from miRBase's explanation of its nomenclature; apologies for any flaws in the translation.

What’s in a name?

What does a name tell us?

As I briefly mentioned in a previous post, miRBase 17 included two conceptual changes in the miRNA nomenclature scheme, which deserve further detail and clarification.

As I mentioned in an earlier post, miRBase 17 made two conceptual changes to the miRNA naming rules, which need further explanation here.

The name of a miRNA contains some human-readable information. If you stop reading this post halfway, you’ll likely think this is a good thing. Which of course it is, as long as we recognise the limitations. Hold on to the end and hopefully you’ll see that names can create some issues.

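A miRNA name carries some human-readable information. Stop reading halfway and you will probably conclude that this is a good thing, and it is, provided its limits are recognised. Read to the end and you will see that names can also cause problems.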

Take for example, hsa-mir-20b. The “hsa” tells us it is a human miRNA. The “20” tells us that it was discovered early: it’s only the 20th family that was named. “20b” tells us that it is related to another miRNA that we can guess is probably called hsa-mir-20a. We can go further: the (lack of) capitalisation of “mir” tells us we’re talking about the miRNA precursor. Or maybe the genomic locus, or maybe the primary transcript, or maybe the extended hairpin that includes the precursor. So that’s already less useful.

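Take hsa-mir-20b. "hsa" marks it as a human miRNA. "20" says it belongs to the 20th family to be named, so it was probably discovered early. "20b" says it is related to another miRNA, presumably hsa-mir-20a. Lowercase "mir" indicates the miRNA precursor, or possibly the genomic locus, the primary transcript, or the extended hairpin that contains the precursor, which already makes the name less informative.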
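The decomposition described above can be sketched as a small parser. This is only an illustration, not miRBase code: the regular expression, the `MirnaName` type and the `parse_name` function are all hypothetical, and the pattern deliberately covers only simple names of the form `hsa-mir-20b` or `hsa-miR-20b` (it assumes a species prefix of three or four lowercase letters and ignores names with `*`, arm suffixes, or non-numeric families).

```python
import re
from typing import NamedTuple, Optional


class MirnaName(NamedTuple):
    species: str           # species prefix, e.g. "hsa"
    kind: str              # "precursor" for "mir", "mature" for "miR"
    number: int            # family number, e.g. 20
    letter: Optional[str]  # family-member suffix, e.g. "b", or None


# Hypothetical pattern: handles only simple names, by assumption.
_NAME = re.compile(r"^([a-z]{3,4})-(mir|miR)-(\d+)([a-z]?)$")


def parse_name(name: str) -> MirnaName:
    """Split a simple miRNA name into the parts the text describes."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unsupported miRNA name: {name!r}")
    species, tag, number, letter = m.groups()
    # Capitalisation is the only thing separating precursor from mature.
    kind = "mature" if tag == "miR" else "precursor"
    return MirnaName(species, kind, int(number), letter or None)


print(parse_name("hsa-mir-20b"))
```

Note how little the parser can actually say: "precursor" here stands in for several possible entities (locus, primary transcript, extended hairpin), which is exactly the ambiguity the text points out.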

hsa-mir-20b has two mature products, named hsa-miR-20b and hsa-miR-20b* (as of this moment; as you’ll see below, this will change). “miR” tells us we’re talking about a mature sequence. In this case miR-20b arises from the 5′ arm of the mir-20b hairpin, and miR-20b* arises from the 3′ arm. The “*” tells us that miR-20b* is considered a “minor” product. That means miR-20b* is found in the cell at lower concentration than miR-20b. It is often inferred that miR-20b* is non-functional, and you’ve probably noticed that miR* sequences in general magically disappear in most pictures of miRNA biogenesis, while the dominant arm is magically incorporated into the RISC complex.

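hsa-mir-20b has two mature products, hsa-miR-20b and hsa-miR-20b* (for now; as explained below, this will change). "miR" denotes a mature sequence. miR-20b comes from the 5′ arm of the mir-20b hairpin and miR-20b* from the 3′ arm. The "*" marks miR-20b* as the minor product, meaning it is present in the cell at a lower concentration than miR-20b. miR-20b* is often assumed to be non-functional, and in most diagrams of miRNA biogenesis the miR* sequence simply vanishes while the dominant arm is loaded into the RISC complex.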

But hang on a minute, a bunch of papers now tell us that miR* sequences can be functional (eg Yang et al. 2011), perhaps through binding different Argonaute proteins (a glut of papers in the past couple of years nicely reviewed by Czech and Hannon, 2011). And, of course, the miR* sequence from one hairpin might be expressed at orders of magnitude higher level than the dominant miR sequence from another hairpin. Perhaps the arm that makes the dominant product can change in different tissues, stages and species (G-J et al. 2011). Should we rename miR and miR* sequences every time someone produces an ever deeper sequencing dataset? To cap it all, the “*” character causes problems for database searches and the like.

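But wait: several papers report that miR* sequences can be functional (Yang et al. 2011), perhaps by binding different Argonaute proteins (many papers from the past two years are reviewed by Czech and Hannon, 2011). The miR* sequence from one hairpin may also be expressed at a level orders of magnitude higher than the dominant miR sequence from another hairpin, and the dominant arm may differ between tissues, developmental stages and species (G-J et al. 2011). Should miR and miR* be renamed every time a deeper sequencing dataset appears? On top of that, the "*" character causes trouble for database searches and similar operations.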
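The text does not say which searches break on "*", so the following is only one plausible illustration: in a regular-expression search, "*" is a quantifier rather than a literal character. The helper names `naive_search` and `literal_search` and the small name list are made up for this sketch.

```python
import re


def naive_search(query, names):
    """Use the query directly as a regular expression (the trap)."""
    return [n for n in names if re.search(query, n)]


def literal_search(query, names):
    """Escape the query so that '*' is matched as a literal character."""
    return [n for n in names if re.search(re.escape(query), n)]


names = ["hsa-miR-20a", "hsa-miR-20b", "hsa-miR-20b*"]

# Unescaped, "b*" means "zero or more b", so every name containing
# "hsa-miR-20" matches, including hsa-miR-20a.
print(naive_search("hsa-miR-20b*", names))

# Escaped, only the name that really ends in "*" matches.
print(literal_search("hsa-miR-20b*", names))
```

Similar surprises can occur wherever "*" acts as a wildcard, which is one reason a suffix such as -3p is easier to handle.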

We therefore intend to retire the miR/miR* nomenclature, in favour of the -5p/-3p nomenclature (the latter has been used in parallel for mature products of approximately equal expression, and will in future be applied to all sequences). We will make this transition in phases, as we can make companion data available to show the expression of mature products from each arm. In miRBase 17, all Drosophila melanogaster mature sequences are renamed as -5p/-3p, and many previously missing second mature products have been added. The available deep sequencing data makes clear which of the potential mature products is dominant. Other species will follow suit in due course.

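We therefore plan to retire the miR/miR* names in favour of the -5p/-3p scheme (until now used in parallel for mature products with roughly equal expression, and in future applied to all sequences). The switch will be made in phases, as companion data showing the expression of the mature product from each arm become available. In miRBase 17, all Drosophila melanogaster mature sequences are renamed as -5p/-3p, and many previously missing second mature products have been added. The available deep sequencing data show clearly which potential mature product is dominant. Other species will be updated in the same way.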
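As a rough sketch of what the switch means for a single name, the function below maps an old miR/miR* name to the -5p/-3p style. It is not miRBase's own procedure: `to_arm_name` is a hypothetical helper, and it assumes the caller already knows which hairpin arm the mature sequence comes from, since that cannot be read off the old name.

```python
def to_arm_name(old_name: str, arm: str) -> str:
    """Return a -5p/-3p style name for a mature miRNA.

    `arm` must be "5p" or "3p" and has to come from the hairpin
    annotation; the presence or absence of "*" says nothing about it.
    """
    if arm not in ("5p", "3p"):
        raise ValueError(f"arm must be '5p' or '3p', got {arm!r}")
    base = old_name.rstrip("*")  # drop the minor-product marker
    if base.endswith(("-5p", "-3p")):
        # Already arm-based: accept only if it agrees with the annotation.
        if not base.endswith("-" + arm):
            raise ValueError(f"{old_name!r} conflicts with arm {arm!r}")
        return base
    return f"{base}-{arm}"


# Using the arms stated in the text: miR-20b from the 5' arm,
# miR-20b* from the 3' arm.
print(to_arm_name("hsa-miR-20b", "5p"))
print(to_arm_name("hsa-miR-20b*", "3p"))
```

The arm is passed in explicitly on purpose: dominance and arm are independent facts, and only the arm survives in the new names.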

The second change in miRBase 17 concerns the small number of pairs of miRNA sequences that are transcribed from the same locus in opposite directions — that is, sense/antisense pairs. For example, the dme-mir-307 locus has been shown to be transcribed in both directions, and both transcripts are processed to produce mature miRNAs. These miRNAs were previously named dme-mir-307 and dme-mir-307-as in miRBase. The -as is confusing, because it is similar to the suffixes used to denote families of related miRNAs. The classification of sense and antisense is arbitrary. To confuse matters further, -as and -s were used in early miRNA literature to refer to mature products produced from the 5′ and 3′ arms of a hairpin precursor. From miRBase 17 onwards, the -as nomenclature is retired. Sense and antisense miRNAs will be named independently and in the same way as all other sequences: If the sequences are similar then they get a, b suffixes (eg dme-mir-307a and dme-mir-307b), and if they are not deemed similar enough then they get different numbers (eg rno-mir-151 and rno-mir-3586).

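The second change in miRBase 17 concerns the small number of miRNA pairs transcribed from the same locus in opposite directions, that is, sense/antisense pairs. The dme-mir-307 locus, for example, is transcribed in both directions, and both transcripts are processed into mature miRNAs. miRBase previously named these dme-mir-307 and dme-mir-307-as. The "-as" suffix is confusing because it resembles the suffixes used for families of related miRNAs, and the assignment of sense and antisense is arbitrary. To make matters worse, early miRNA papers also used -as and -s for the mature products from the 5′ and 3′ arms of a hairpin precursor. From miRBase 17 onwards, the -as naming is retired. Sense and antisense miRNAs are named independently, like all other sequences: similar sequences get a and b suffixes (for example dme-mir-307a and dme-mir-307b), and sequences that are not similar enough get different numbers (for example rno-mir-151 and rno-mir-3586).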

The combined result of these changes is that the name of a miRNA contains less information than previously. This may seem like a retrograde step. However, the problem with encoding information in the name is that people are tempted to use it. MicroRNA names are often pragmatic compromises, and have been overloaded with relatively complex meaning, for example, regarding family relationships and expression levels. Names should be useful, but should never be used in place of the correct analysis, for example, of sequence relationships or expression. We therefore suggest that you’ll find your miRNA life easier if you bear in mind some simple concepts:

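Together, these changes mean that a miRNA name carries less information than before, which may look like a step backwards. The trouble with encoding information in a name, however, is that people are tempted to rely on it. miRNA names are often pragmatic compromises and have been overloaded with fairly complex meaning, such as family relationships and expression levels. Names should be useful, but they should never replace proper analysis of, say, sequence relationships or expression. Keeping a few simple points in mind should make working with miRNAs easier: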

1. Be explicit. If you are referring to the mature miR-20b sequence, you could rely on the capitalisation in miR-20b to say that for you. But it is much better to say “the mature miR-20b sequence”. Even better, show the sequence along with the name; names are not formally stable, but quoting the specific sequence you’ve used in your paper will ensure the entity is traceable forever.

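1. Be explicit. To refer to the mature miR-20b sequence, you could rely on the capitalisation in miR-20b, but it is far better to write "the mature miR-20b sequence". Better still, give the sequence together with the name: names are not formally stable, whereas quoting the exact sequence used in your paper keeps the entity traceable indefinitely.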

2. Never use the name to encode or derive complex meaning. If you are interested in sequence relationships, you should do some sequence analysis. If you care about expression levels of alternate mature miRNAs, look at expression data. If you derive all your information about miRNA sequence relationships from the name, you will miss a great deal. If you rely on the name to tell you about relative expression then all hope is lost.

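2. Never use the name to encode or derive complex meaning. If you are interested in sequence relationships, do some sequence analysis. If you care about the expression levels of alternative mature miRNAs, look at expression data. Deriving sequence relationships from names alone will miss a great deal, and relying on names for relative expression is hopeless.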
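Both points can be illustrated with a minimal sketch: keep the sequence next to the name, and compare sequences rather than names. Everything here is hypothetical, including the `MatureMirna` type, the `identity` function and the names used; the sequences are invented placeholders, not real miRBase entries, and the ungapped position-by-position comparison is a crude stand-in for a proper alignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatureMirna:
    name: str      # may change between database releases
    sequence: str  # the stable thing to quote in a paper

    def describe(self) -> str:
        """Report name and sequence together, as point 1 recommends."""
        return f"the mature {self.name} sequence ({self.sequence})"


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, ungapped, aligned at the 5' end.

    Only the overlap of the two sequences is compared. This is a toy
    measure for illustration, not a substitute for real alignment.
    """
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("cannot compare an empty sequence")
    return sum(x == y for x, y in zip(a, b)) / n


# Placeholder names and sequences, invented for this example.
x = MatureMirna("xxx-miR-1a", "ACGUACGUACGUACGUACGU")
y = MatureMirna("xxx-miR-1b", "ACGUACGUACGAACGUACGU")

print(x.describe())
print(identity(x.sequence, y.sequence))
```

The a/b suffixes hint that two sequences are related, but only a comparison of the sequences themselves says how closely.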

Reference (the content above is excerpted from): http://www.mirbase.org/blog/category/nomenclature/

The translation is original work; please credit the source when reposting.
