Overview
A foolproof guide to genome assembly with Unicycler.
Paper: Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads
Journal: PLoS Comput Biol.
Year: 2017
1. Downloading and installing Unicycler and its dependencies
URL: https://github.com/rrwick/Unicycler
git clone https://github.com/rrwick/Unicycler.git
cd Unicycler
make
Try a test run:
# Unicycler hybrid assembly (short reads plus long reads)
time unicycler-runner.py -t 52 \
-1 clean_data/1B_R1_paired.fq -2 clean_data/1B_R2_paired.fq \
-l clean_data/1B.pacbio.fa -o unicyc
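The run fails with an error that lists the missing dependencies.
Download and install: SPAdes
URL: http://cab.spbu.ru/files/release3.14.0/SPAdes-3.14.0-Linux.tar.gz
wget -c http://cab.spbu.ru/files/release3.14.0/SPAdes-3.14.0-Linux.tar.gz
tar -xzf SPAdes-3.14.0-Linux.tar.gz # prebuilt binaries; spades.py is in SPAdes-3.14.0-Linux/bin/
Download and install: racon
URL: https://github.com/isovic/racon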
git clone --recursive https://github.com/isovic/racon.git racon
cd racon
mkdir build && cd build && cmake ../ # ../ is the racon source directory, where CMakeLists.txt lives
make
Download and install: Pilon
URL: https://github.com/broadinstitute/pilon/releases/download/v1.23/pilon-1.23.jar
wget -c https://github.com/broadinstitute/pilon/releases/download/v1.23/pilon-1.23.jar
java -jar /home/cheng/huty/softwares/pilon-1.23.jar --help
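# Pilon is a Java program, so no installation is needed
Make unicycler, spades, racon, and pilon available globally:
# Run each command from the directory that contains the program; ~/bin must be on PATH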
ln -s $(readlink -f ./unicycler-runner.py) ~/bin/
ln -s $(readlink -f pilon-1.23.jar) ~/bin/
ln -s $(readlink -f ./spades.py) ~/bin/
ln -s $(readlink -f ./racon) ~/bin/
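Before re-running Unicycler, it can help to confirm that the linked programs actually resolve on PATH. The sketch below is an illustration, not part of Unicycler: check_tools is a hypothetical helper name, and it assumes ~/bin is on PATH and that the linked files are executable.

```shell
# check_tools (hypothetical helper): report whether each named program
# can be found on PATH. Returns non-zero if any of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found    %s -> %s\n' "$tool" "$(command -v "$tool")"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_tools unicycler-runner.py spades.py racon java. The Pilon jar is not itself executable, so it is covered indirectly by checking java; the jar's presence can be confirmed separately with ls ~/bin/pilon-1.23.jar.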
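Installing with conda
conda create -n assembly
conda activate assembly
# Unicycler is distributed through the bioconda channel
conda install -c conda-forge -c bioconda unicycler
# If samtools then raises an error, update it
conda update samtools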
2. Hybrid assembly with Unicycler
Run it again:
# Unicycler hybrid assembly (short reads plus long reads)
time unicycler-runner.py -t 52 \
-1 clean_data/1B_R1_paired.fq -2 clean_data/1B_R2_paired.fq \
-l clean_data/1B.pacbio.fa -o unicyc
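Unicycler first reports the status of its dependencies.
Assembly steps:
- 1 Short-read error correction: SPAdes read error correction
- 2 Short-read assembly: SPAdes assemblies
- 3 Hybrid assembly: assembling contigs and long reads with miniasm
- 4 Alignment: aligning reads
- 5 Bridging: building long read bridges
- 6 Polishing: polishing assembly with Pilon
Assembly results
Listing the output directory shows the final assembly, assembly.fasta: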
-rw-rw-r-- 1 cheng WST 2933634 Jan 7 18:06 001_best_spades_graph.gfa
-rw-rw-r-- 1 cheng WST 2909841 Jan 7 18:06 002_overlaps_removed.gfa
-rw-rw-r-- 1 cheng WST 2920506 Jan 7 18:17 003_bridges_applied.gfa
-rw-rw-r-- 1 cheng WST 2906329 Jan 7 18:17 004_final_clean.gfa
-rw-rw-r-- 1 cheng WST 2906330 Jan 7 19:32 005_polished.gfa
-rw-rw-r-- 1 cheng WST 2945006 Jan 7 19:32 assembly.fasta
-rw-rw-r-- 1 cheng WST 2906330 Jan 7 19:32 assembly.gfa
-rw-rw-r-- 1 cheng WST 17815 Jan 7 19:32 unicycler.log
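File sizes only hint at the assembly size, because they also count headers and newlines. To get a quick summary, the sequences and bases can be counted directly. This is a minimal sketch, assuming an uncompressed FASTA file and GFA files in which segment lines start with S and link lines with L; fasta_stats and gfa_counts are hypothetical helper names, not Unicycler commands.

```shell
# fasta_stats (hypothetical helper): print the number of sequences and
# the total number of bases in an uncompressed FASTA file.
fasta_stats() {
    awk '
        /^>/ { n++; next }                              # header line: one more sequence
        { gsub(/[ \t\r]/, ""); total += length($0) }    # sequence line: add its length
        END { printf "sequences=%d total_bp=%d\n", n, total }
    ' "$1"
}

# gfa_counts (hypothetical helper): count segment (S) and link (L) records
# in a GFA assembly graph.
gfa_counts() {
    awk -F '\t' '
        $1 == "S" { s++ }
        $1 == "L" { l++ }
        END { printf "segments=%d links=%d\n", s, l }
    ' "$1"
}
```

For example, fasta_stats unicyc/assembly.fasta summarizes the final assembly, and running gfa_counts on each of the numbered .gfa files shows how the graph changes from one stage to the next.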
3. Assembling short-read (second-generation) data only
Quality control with kneaddata
kneaddata \
-i ./rawdata/FC2282_FDSW210258126-1r_1.fq \
-i ./rawdata/FC2282_FDSW210258126-1r_2.fq \
-o ./cleandata/ \
--trimmomatic /miniconda3/envs/kneaddata/share/trimmomatic/ \
-t 4 \
--trimmomatic-options "SLIDINGWINDOW:4:20 MINLEN:50" \
--remove-intermediate-output
Error: the paired read input files have an unequal number of reads
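The first set of kneaddata output files is meant to be the final result, but pairing failed with the error above, so the second set (the Trimmomatic-trimmed reads) is used for assembly instead.
Assembly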
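The unequal-reads error can be confirmed by counting the records in each mate file. This is a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record; count_reads and check_pairs are hypothetical helper names. Equal counts are necessary but not sufficient for correct pairing, since the read order must match as well.

```shell
# count_reads (hypothetical helper): number of records in an uncompressed
# FASTQ file, assuming four lines per record (no wrapped sequences).
count_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# check_pairs (hypothetical helper): print both read counts and succeed
# only if the two mate files hold the same number of reads.
check_pairs() {
    r1=$(count_reads "$1")
    r2=$(count_reads "$2")
    echo "R1=$r1 R2=$r2"
    [ "$r1" -eq "$r2" ]
}
```

For example, check_pairs ./rawdata/FC2282_FDSW210258126-1r_1.fq ./rawdata/FC2282_FDSW210258126-1r_2.fq checks the raw input, and the same call on the two trimmed files checks the reads passed to the assembler.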
unicycler -t 16 \
-1 ./cleandata/FC2282_FDSW210258126-1r_1_kneaddata.trimmed.1.fastq \
-2 ./cleandata/FC2282_FDSW210258126-1r_1_kneaddata.trimmed.2.fastq \
-o ./assembly_unicyc/
Results