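During data analysis you often need to convert between file formats, and UCSC provides a number of tools for this.
Searching for "ucsc" in the Anaconda package repository also turns up many of these tools.
Some people first convert bigWig to WIG, and then convert WIG to BED: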
conda install -c bioconda ucsc-bigwigtowig
bigWigToWig signal.bigWig signal.wig
conda install -c bioconda bedops
wig2bed < signal.wig > signal.bed
# Filter the BED file by its score column (column 5) with awk; this keeps rows whose score is below 1:
awk '($5 < 1)' signal.bed > signal.filtered.bed
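The threshold can also be passed in as a variable instead of being hard-coded. A minimal sketch, assuming the five-column layout that wig2bed writes (chrom, start, end, id, score); filter_bed_below is a made-up helper name, not part of BEDOPS:

```shell
# Keep BED rows whose score (column 5) is below a threshold.
# Usage: filter_bed_below THRESHOLD < in.bed > out.bed
filter_bed_below() {
  # "+ 0" forces a numeric comparison on both sides.
  awk -v max="$1" '($5 + 0) < (max + 0)'
}

# Example with two made-up rows; only the first has a score below 1.
printf 'chr1\t0\t1\tid-1\t0.5\nchr1\t1\t2\tid-2\t3\n' | filter_bed_below 1
```

With a threshold of 1, filter_bed_below 1 < signal.bed > signal.filtered.bed should select the same rows as the awk one-liner above.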
There is also the convert2bed tool:
convert2bed -i wig -o bed --max-mem=8G --sort-tmpdir=./ -x < sample.wig > sample.bed
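Note: the "<" before the input file and the ">" before the output file are required, because convert2bed reads from standard input and writes to standard output. I left them out at first and kept getting errors.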
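Since convert2bed only talks to standard input and standard output, one way to avoid forgetting the redirections is to wrap them in a small shell function. This is only a sketch: wig_to_bed is a hypothetical helper name, and it passes just the -i wig -o bed options used above.

```shell
# Hypothetical wrapper that takes file names and does the redirection itself.
# Usage: wig_to_bed in.wig out.bed
wig_to_bed() {
  if [ "$#" -ne 2 ]; then
    echo "usage: wig_to_bed in.wig out.bed" >&2
    return 2
  fi
  if [ ! -r "$1" ]; then
    echo "cannot read input: $1" >&2
    return 1
  fi
  # convert2bed reads WIG from stdin and writes BED to stdout.
  convert2bed -i wig -o bed < "$1" > "$2"
}

# Example (needs BEDOPS installed):
#   wig_to_bed sample.wig sample.bed
```

Other options such as --max-mem or -x would have to be added to the convert2bed line inside the function.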
You may also see messages like the following:
Warning: WIG data contains 0-indexed element at line 2
Consider adding --zero-indexed (-x) option to convert zero-indexed WIG data
Warning: If your Wiggle data is a significant portion of available system memory,
use the --max-mem and --sort-tmpdir options, or use --do-not-sort to disable
post-conversion sorting. See --help for more information.
Usage: $ convert2bed --input=fmt [--output=fmt] [options] < input > output

  Convert BAM, GFF, GTF, GVF, PSL, RepeatMasker (OUT), SAM, VCF
  and WIG genomic formats to BED or BEDOPS Starch (compressed BED)

  Input can be a regular file or standard input piped in using the
  hyphen character ('-'):

    $ some_upstream_process ... | convert2bed --input=fmt - > output

  Input (required):

    --input=[bam|gff|gtf|gvf|psl|rmsk|sam|vcf|wig] (-i <fmt>)
        Genomic format of input file (required)

  Output:

    --output=[bed|starch] (-o <fmt>)
        Format of output file, either BED or BEDOPS Starch (optional, default is BED)

  Other processing options:

    --do-not-sort (-d)
        Do not sort BED output with sort-bed (not compatible with --output=starch)
    --max-mem=<value> (-m <val>)
        Sets aside <value> memory for sorting BED output. For example, <value> can
        be 8G, 8000M or 8000000000 to specify 8 GB of memory (default is 2G)
    --sort-tmpdir=<dir> (-r <dir>)
        Optionally sets [dir] as temporary directory for sort data, when used in
        conjunction with --max-mem=[value], instead of the host's operating system
        default temporary directory
    --starch-bzip2 (-z)
        Used with --output=starch, the compressed output explicitly applies the bzip2
        algorithm to compress intermediate data (default is bzip2)
    --starch-gzip (-g)
        Used with --output=starch, the compressed output applies gzip compression on
        intermediate data
    --starch-note="xyz..." (-e "xyz...")
        Used with --output=starch, this adds a note to the Starch archive metadata
    --help | --help[-bam|-gff|-gtf|-gvf|-psl|-rmsk|-sam|-vcf|-wig] (-h | -h <fmt>)
        Show general help message (or detailed help for a specified input format)
    --version (-w)
        Show application version
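To see ahead of time whether the zero-indexed warning is likely to apply, the WIG file can be scanned for fixedStep declaration lines with start=0. This is a rough check, not what convert2bed does internally; it ignores variableStep data, and find_zero_start is a made-up name:

```shell
# Print "LINE_NUMBER: line" for every fixedStep declaration that uses start=0.
# Usage: find_zero_start sample.wig    (or pipe the WIG data on stdin)
find_zero_start() {
  awk '/^fixedStep/ {
         for (i = 2; i <= NF; i++)
           if ($i == "start=0") print NR ": " $0
       }' "$@"
}

# Example with a tiny made-up WIG fragment.
printf 'track type=wiggle_0\nfixedStep chrom=chr1 start=0 step=1\n1.5\n' | find_zero_start
```

If this prints anything, adding -x (--zero-indexed) to the convert2bed command, as in the example earlier, is probably what the warning is asking for.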
In my case I used ucsc-bigwigtobedgraph instead, which writes bedGraph (four columns: chrom, start, end, value):
conda install -c bioconda ucsc-bigwigtobedgraph
bigWigToBedGraph in.bw out.bed
usage:
   bigWigToBedGraph in.bigWig out.bedGraph
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restrict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
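bedGraph has four columns (chrom, start, end, value), while the wig2bed route above gives five (chrom, start, end, id, score). If the five-column layout is needed, an id column can be added with awk. A sketch, assuming whitespace-separated bedGraph input; bedgraph_to_bed5 and the id-N naming are illustrative choices, not something bigWigToBedGraph provides:

```shell
# Turn bedGraph rows (chrom start end value) into five-column BED rows
# (chrom start end id value), skipping track/browser/comment lines.
# Usage: bedgraph_to_bed5 in.bedGraph > out.bed
bedgraph_to_bed5() {
  awk 'BEGIN { OFS = "\t" }
       !/^(track|browser|#)/ && NF >= 4 { print $1, $2, $3, "id-" (++n), $4 }' "$@"
}

# Example with two made-up bedGraph rows.
printf 'chr1\t0\t10\t0.5\nchr1\t10\t20\t2\n' | bedgraph_to_bed5

# With the real tools (needs ucsc-bigwigtobedgraph installed):
#   bigWigToBedGraph -chrom=chr1 in.bw chr1.bedGraph
#   bedgraph_to_bed5 chr1.bedGraph > chr1.bed
```

The value ends up in column 5, so a score filter written for the wig2bed output can be reused unchanged.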