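This is a homework exercise from the R part of the 生信技能树 best-selling introductory bioinformatics course.
Load the data
> load("matchtest.Rdata") # the loaded objects now appear in the Environment pane on the right
> dim(x) # check the dimensions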
[1] 7 2
> head(x) # show the first few rows
file_name ID
1 708a16a3-7a5e-4e27-b06b-4c3c308b11fe.htseq.counts.gz TCGA-AA-3531-01A-01R-0821-07
2 95e726db-5ccc-4836-a2ae-7feaddaf9f1b.htseq.counts.gz TCGA-A6-2678-11A-01R-A32Z-07
3 90a46dce-5762-47ec-925c-deff853069aa.htseq.counts.gz TCGA-AA-A02K-01A-03R-A32Y-07
4 587e44e4-87ba-4981-a520-d20612486f53.htseq.counts.gz TCGA-NH-A6GA-01A-11R-A37K-07
5 1b843dbb-5ef0-47ca-9783-dbeb94aa6df3.htseq.counts.gz TCGA-AZ-6600-11A-01R-1774-07
6 09796233-3f40-4deb-b77d-2267c3afff59.htseq.counts.gz TCGA-CM-6676-01A-11R-1839-07
> dim(y) # check the dimensions
[1] 7 7
> head(y) # show the first few rows
90a46dce-5762-47ec-925c-deff853069aa.htseq.counts.gz
ENSG00000000003.13 6564
ENSG00000000005.5 29
ENSG00000000419.11 2659
ENSG00000000457.12 246
ENSG00000000460.15 145
ENSG00000000938.11 37
587e44e4-87ba-4981-a520-d20612486f53.htseq.counts.gz
ENSG00000000003.13 3127
ENSG00000000005.5 0
ENSG00000000419.11 889
ENSG00000000457.12 382
ENSG00000000460.15 188
ENSG00000000938.11 749
95e726db-5ccc-4836-a2ae-7feaddaf9f1b.htseq.counts.gz
ENSG00000000003.13 6330
ENSG00000000005.5 0
ENSG00000000419.11 2428
ENSG00000000457.12 1701
ENSG00000000460.15 1009
ENSG00000000938.11 220
09796233-3f40-4deb-b77d-2267c3afff59.htseq.counts.gz
ENSG00000000003.13 3583
ENSG00000000005.5 70
ENSG00000000419.11 1436
ENSG00000000457.12 590
ENSG00000000460.15 440
ENSG00000000938.11 812
708a16a3-7a5e-4e27-b06b-4c3c308b11fe.htseq.counts.gz
ENSG00000000003.13 643
ENSG00000000005.5 17
ENSG00000000419.11 2476
ENSG00000000457.12 804
ENSG00000000460.15 496
ENSG00000000938.11 962
44f1dc34-a01e-4a7b-a7a1-a90064039fdd.htseq.counts.gz
ENSG00000000003.13 1514
ENSG00000000005.5 13
ENSG00000000419.11 876
ENSG00000000457.12 483
ENSG00000000460.15 250
ENSG00000000938.11 75
1b843dbb-5ef0-47ca-9783-dbeb94aa6df3.htseq.counts.gz
ENSG00000000003.13 11751
ENSG00000000005.5 26
ENSG00000000419.11 2494
ENSG00000000457.12 531
ENSG00000000460.15 632
ENSG00000000938.11 85
> x[1:4,1:2] # inspect rows and columns: the first line holds the column names; the row names are numbers
file_name ID
1 708a16a3-7a5e-4e27-b06b-4c3c308b11fe.htseq.counts.gz TCGA-AA-3531-01A-01R-0821-07
2 95e726db-5ccc-4836-a2ae-7feaddaf9f1b.htseq.counts.gz TCGA-A6-2678-11A-01R-A32Z-07
3 90a46dce-5762-47ec-925c-deff853069aa.htseq.counts.gz TCGA-AA-A02K-01A-03R-A32Y-07
4 587e44e4-87ba-4981-a520-d20612486f53.htseq.counts.gz TCGA-NH-A6GA-01A-11R-A37K-07
> y[1:4,1:2] # inspect rows and columns: the column names are file names; the row names are Ensembl gene IDs
90a46dce-5762-47ec-925c-deff853069aa.htseq.counts.gz
ENSG00000000003.13 6564
ENSG00000000005.5 29
ENSG00000000419.11 2659
ENSG00000000457.12 246
587e44e4-87ba-4981-a520-d20612486f53.htseq.counts.gz
ENSG00000000003.13 3127
ENSG00000000005.5 0
ENSG00000000419.11 889
ENSG00000000457.12 382
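Task: replace the column names of y with the corresponding IDs from the second column of x.
Shared elements: the column names of y and the first column of x.
Breaking the task down:
1. Reorder the first column of x to follow the column names of y, and extract the index order
> lx1 <- colnames(y) # column names of y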
> lx1
[1] "90a46dce-5762-47ec-925c-deff853069aa.htseq.counts.gz"
[2] "587e44e4-87ba-4981-a520-d20612486f53.htseq.counts.gz"
[3] "95e726db-5ccc-4836-a2ae-7feaddaf9f1b.htseq.counts.gz"
[4] "09796233-3f40-4deb-b77d-2267c3afff59.htseq.counts.gz"
[5] "708a16a3-7a5e-4e27-b06b-4c3c308b11fe.htseq.counts.gz"
[6] "44f1dc34-a01e-4a7b-a7a1-a90064039fdd.htseq.counts.gz"
[7] "1b843dbb-5ef0-47ca-9783-dbeb94aa6df3.htseq.counts.gz"
> lx2 <- x[,1] # first column of x
> lx2
[1] "708a16a3-7a5e-4e27-b06b-4c3c308b11fe.htseq.counts.gz"
[2] "95e726db-5ccc-4836-a2ae-7feaddaf9f1b.htseq.counts.gz"
[3] "90a46dce-5762-47ec-925c-deff853069aa.htseq.counts.gz"
[4] "587e44e4-87ba-4981-a520-d20612486f53.htseq.counts.gz"
[5] "1b843dbb-5ef0-47ca-9783-dbeb94aa6df3.htseq.counts.gz"
[6] "09796233-3f40-4deb-b77d-2267c3afff59.htseq.counts.gz"
[7] "44f1dc34-a01e-4a7b-a7a1-a90064039fdd.htseq.counts.gz"
> lx3 <- match(lx1,lx2) # position of each element of lx1 within lx2
> lx3
[1] 3 4 2 6 1 7 5
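To make the behaviour of `match()` concrete, here is a minimal sketch on toy vectors. The names `target` and `lookup` are made up for illustration and are not part of matchtest.Rdata. `match(a, b)` returns, for each element of `a`, its position in `b`, so indexing `b` with the result puts `b` into the order of `a`.

```r
# Toy vectors; `target` and `lookup` are hypothetical names for illustration
target <- c("c", "a", "b")   # the order we want to end up with
lookup <- c("a", "b", "c")   # the vector we search in

idx <- match(target, lookup) # position of each element of target within lookup
idx                          # 3 1 2

# Indexing lookup with idx reproduces the order of target
identical(lookup[idx], target)  # TRUE

# Elements that do not occur in lookup give NA
match("z", lookup)           # NA
```

In the exercise, `lx1` plays the role of `target` and `lx2` the role of `lookup`.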
2. Look up the corresponding IDs
> x[,2] # view the IDs
[1] "TCGA-AA-3531-01A-01R-0821-07" "TCGA-A6-2678-11A-01R-A32Z-07" "TCGA-AA-A02K-01A-03R-A32Y-07"
[4] "TCGA-NH-A6GA-01A-11R-A37K-07" "TCGA-AZ-6600-11A-01R-1774-07" "TCGA-CM-6676-01A-11R-1839-07"
[7] "TCGA-AA-3971-01A-01R-1022-07"
> # extract the IDs at the matched row indices
> x[lx3,2]
[1] "TCGA-AA-A02K-01A-03R-A32Y-07" "TCGA-NH-A6GA-01A-11R-A37K-07" "TCGA-A6-2678-11A-01R-A32Z-07"
[4] "TCGA-CM-6676-01A-11R-1839-07" "TCGA-AA-3531-01A-01R-0821-07" "TCGA-AA-3971-01A-01R-1022-07"
[7] "TCGA-AZ-6600-11A-01R-1774-07"
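Before overwriting any names, it can help to confirm that the index vector really lines the two tables up. The following is a minimal sketch on toy stand-ins for `x` and `y`; the values `f1.gz`, `S1`, `geneA` and so on are invented, and only the layout mirrors the real data.

```r
# Toy stand-ins for x and y (hypothetical values, same layout as the real objects)
x <- data.frame(file_name = c("f1.gz", "f2.gz", "f3.gz"),
                ID        = c("S1", "S2", "S3"),
                stringsAsFactors = FALSE)
y <- matrix(1:6, nrow = 2,
            dimnames = list(c("geneA", "geneB"), c("f3.gz", "f1.gz", "f2.gz")))

idx <- match(colnames(y), x$file_name)

# Every column of y must be found in x; otherwise idx contains NA
stopifnot(!anyNA(idx))

# The reordered file names must equal the column names of y exactly
stopifnot(identical(x$file_name[idx], colnames(y)))

x$ID[idx]  # "S3" "S1" "S2"
```

If either `stopifnot()` fails, the mapping table is incomplete or the names differ, and the IDs should not be assigned yet.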
3. Assign the IDs to the column names of y
> colnames(y) <- x[lx3,2]
> # view y
> y
TCGA-AA-A02K-01A-03R-A32Y-07 TCGA-NH-A6GA-01A-11R-A37K-07
ENSG00000000003.13 6564 3127
ENSG00000000005.5 29 0
ENSG00000000419.11 2659 889
ENSG00000000457.12 246 382
ENSG00000000460.15 145 188
ENSG00000000938.11 37 749
ENSG00000000971.14 77 708
TCGA-A6-2678-11A-01R-A32Z-07 TCGA-CM-6676-01A-11R-1839-07
ENSG00000000003.13 6330 3583
ENSG00000000005.5 0 70
ENSG00000000419.11 2428 1436
ENSG00000000457.12 1701 590
ENSG00000000460.15 1009 440
ENSG00000000938.11 220 812
ENSG00000000971.14 530 4173
TCGA-AA-3531-01A-01R-0821-07 TCGA-AA-3971-01A-01R-1022-07
ENSG00000000003.13 643 1514
ENSG00000000005.5 17 13
ENSG00000000419.11 2476 876
ENSG00000000457.12 804 483
ENSG00000000460.15 496 250
ENSG00000000938.11 962 75
ENSG00000000971.14 3958 272
TCGA-AZ-6600-11A-01R-1774-07
ENSG00000000003.13 11751
ENSG00000000005.5 26
ENSG00000000419.11 2494
ENSG00000000457.12 531
ENSG00000000460.15 632
ENSG00000000938.11 85
ENSG00000000971.14 319
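4. Combine the steps into one line
This single line replaces steps 1 to 3 and must be run on a freshly loaded y: once the column names have already been replaced with TCGA IDs, match() no longer finds them in x[,1] and returns NA.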
> colnames(y) <- x[match(colnames(y),x[,1]),2]
> y
TCGA-AA-A02K-01A-03R-A32Y-07 TCGA-NH-A6GA-01A-11R-A37K-07
ENSG00000000003.13 6564 3127
ENSG00000000005.5 29 0
ENSG00000000419.11 2659 889
ENSG00000000457.12 246 382
ENSG00000000460.15 145 188
ENSG00000000938.11 37 749
ENSG00000000971.14 77 708
TCGA-A6-2678-11A-01R-A32Z-07 TCGA-CM-6676-01A-11R-1839-07
ENSG00000000003.13 6330 3583
ENSG00000000005.5 0 70
ENSG00000000419.11 2428 1436
ENSG00000000457.12 1701 590
ENSG00000000460.15 1009 440
ENSG00000000938.11 220 812
ENSG00000000971.14 530 4173
TCGA-AA-3531-01A-01R-0821-07 TCGA-AA-3971-01A-01R-1022-07
ENSG00000000003.13 643 1514
ENSG00000000005.5 17 13
ENSG00000000419.11 2476 876
ENSG00000000457.12 804 483
ENSG00000000460.15 496 250
ENSG00000000938.11 962 75
ENSG00000000971.14 3958 272
TCGA-AZ-6600-11A-01R-1774-07
ENSG00000000003.13 11751
ENSG00000000005.5 26
ENSG00000000419.11 2494
ENSG00000000457.12 531
ENSG00000000460.15 632
ENSG00000000938.11 85
ENSG00000000971.14 319
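As an alternative to indexing with `match()`, the same renaming can be written as a lookup in a named vector. The sketch below is one possible way to wrap it, not part of the original exercise: the helper `rename_columns` and the toy data are hypothetical, and the column names `file_name` and `ID` are assumed to match those of `x`.

```r
# Hypothetical helper: rename the columns of `mat` using a two-column mapping table
rename_columns <- function(mat, map, from = "file_name", to = "ID") {
  lookup <- setNames(map[[to]], map[[from]])  # named vector: file name -> sample ID
  new_names <- unname(lookup[colnames(mat)])  # look up each column name by name
  if (anyNA(new_names)) {
    stop("some column names have no match in the mapping table")
  }
  colnames(mat) <- new_names
  mat
}

# Toy stand-ins for x and y (invented values, same layout as the real objects)
x <- data.frame(file_name = c("f1.gz", "f2.gz", "f3.gz"),
                ID        = c("S1", "S2", "S3"),
                stringsAsFactors = FALSE)
y <- matrix(1:6, nrow = 2,
            dimnames = list(c("geneA", "geneB"), c("f3.gz", "f1.gz", "f2.gz")))

y2 <- rename_columns(y, x)
colnames(y2)  # "S3" "S1" "S2"
```

Calling `rename_columns(y2, x)` a second time stops with an error, because the column names are no longer file names; that is safer than silently writing NA column names.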