1. Extraction based on BED files
bedtools
- The best-known toolkit for working with BED files; note that it does not come from the same developers as samtools
- http://bedtools.readthedocs.io/en/latest/index.html
- For more, see this bedtools usage guide: https://www.jianshu.com/p/6c3b87301491
bedops
- Reported to be faster than bedtools
- https://bedops.readthedocs.io/en/latest/index.html
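As a small illustration of BED-based extraction, the sketch below uses bedtools getfasta to pull one interval out of a FASTA file. The file names (demo_genome.fa, demo_regions.bed, demo_regions.fa) and the sequence are made up for the demo, and the step is skipped if bedtools is not on the PATH.

```shell
# Made-up demo input: one 10 bp sequence and one BED interval.
printf '>chr1\nACGTACGTAC\n' > demo_genome.fa
printf 'chr1\t1\t5\n' > demo_regions.bed

# BED coordinates are 0-based and half-open, so "chr1 1 5" covers bases 2 to 5.
if command -v bedtools >/dev/null 2>&1; then
    bedtools getfasta -fi demo_genome.fa -bed demo_regions.bed -fo demo_regions.fa
    cat demo_regions.fa
else
    echo "bedtools not found on PATH; install it to run this step" >&2
fi
```

bedtools getfasta also writes an index file (demo_genome.fa.fai) next to the input FASTA on first use.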
2. Sequence extraction
seqkit
seqtk
bam2fastq
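For extracting sequences by name, seqtk provides the subseq subcommand, which takes a FASTA/FASTQ file and a list of names. The sketch below is only an illustration: the file names (demo.fa, demo_ids.txt, demo_subset.fa) and records are made up, and the step is skipped if seqtk is not installed.

```shell
# Made-up demo input: three records and a list naming two of them.
printf '>gene1 first\nATGC\n>gene2\nGGCC\n>gene3\nTTAA\n' > demo.fa
printf 'gene1\ngene3\n' > demo_ids.txt

# seqtk subseq prints the records whose names appear in the list file.
if command -v seqtk >/dev/null 2>&1; then
    seqtk subseq demo.fa demo_ids.txt > demo_subset.fa
    cat demo_subset.fa
else
    echo "seqtk not found on PATH; install it to run this step" >&2
fi
```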
3. Extraction on Windows
TBtools
4. Perl script: get_fa_by_id.pl
Usage:
perl get_fa_by_id.pl id pro.fa > id.fa  # id: file of gene IDs, one per line; pro.fa: the FASTA library to search
script:
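use strict;
use warnings;

die "Usage: perl $0 <id_list> <fasta[.gz]> > output.fa\n" unless @ARGV == 2;
my ($id, $fa) = @ARGV;

# Load the wanted IDs: the first whitespace-delimited field of each line.
open(my $ids, '<', $id) or die "Cannot open $id: $!\n";
my %want;
while (<$ids>) {
    my ($key) = split;
    $want{$key} = 1 if defined $key;
}
close $ids;

# Open the FASTA file, decompressing on the fly if its name ends in "gz".
my $in;
if ($fa =~ /gz$/) {
    open($in, '-|', 'gzip', '-cd', $fa) or die "Cannot run gzip on $fa: $!\n";
} else {
    open($in, '<', $fa) or die "Cannot open $fa: $!\n";
}

# Discard everything up to and including the first ">".
{ local $/ = '>'; my $discard = <$in>; }

my %seen;
while (my $header = <$in>) {
    # The record ID is the header text up to the first whitespace.
    my ($info) = $header =~ /^(\S+)/;
    # Read the sequence lines up to the next ">".
    my $seq = do { local $/ = '>'; <$in> };
    next unless defined $info;
    $seq = '' unless defined $seq;
    $seq =~ s/[>\r*]//g;    # drop the trailing ">", carriage returns and "*"
    # Print each wanted record once; only the ID is kept in the header.
    print ">$info\n$seq" if $want{$info} && !$seen{$info}++;
}
close $in;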
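
The same ID-based filtering can also be sketched with awk, which may be handy where Perl is not available. This is an alternative illustration, not the script above: it assumes an uncompressed FASTA file and a non-empty ID list, and the demo records written to pro.fa and id are made up.

```shell
# Made-up demo input: three protein records and a two-line ID list.
printf '>gene1 first protein\nMKV\nLLA*\n>gene2\nMAA\n>gene3\nMTT\n' > pro.fa
printf 'gene3\ngene1\n' > id

awk '
    # First file (ID list): remember the first field of every line.
    NR == FNR { want[$1] = 1; next }
    # Header line: the ID is the text between ">" and the first whitespace.
    /^>/ {
        name = substr($1, 2)
        keep = (name in want) && !(name in seen)
        seen[name] = 1
        if (keep) print ">" name
        next
    }
    # Sequence line of a wanted record: drop "\r" and "*", then print it.
    keep { gsub(/\r/, ""); gsub(/\*/, ""); print }
' id pro.fa > id.fa

cat id.fa
```

Records come out in the order they appear in pro.fa, not in the order of the ID list, and a repeated ID is written only once.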