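Introduction
As the name suggests, MitoFinder is a tool dedicated to assembling mitochondrial genomes. It runs on Linux and can annotate mitochondrial genomes as well as assemble them. Annotation requires a GenBank (.gb) file of the mitochondrial genome of a closely related species, which can be downloaded from NCBI.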
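One way to fetch a reference .gb file from the command line is the NCBI E-utilities `efetch` endpoint. The snippet below is a minimal sketch, not part of MitoFinder: the helper name `efetch_url` and the output name `reference.gb` are made up for illustration, and the accession is only an example that should be replaced with one from a species close to yours.

```shell
# Build the NCBI E-utilities efetch URL that returns a nucleotide record
# as a GenBank flat file (hypothetical helper, not part of MitoFinder).
efetch_url() {
    printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=%s&rettype=gb&retmode=text\n' "$1"
}

# Example accession only (NC_012920.1 is the human mitochondrial reference);
# substitute the accession of a closely related species.
accession=NC_012920.1
echo "would download: $(efetch_url "$accession")"

# To actually download the record, uncomment the next line:
# curl -fsSL "$(efetch_url "$accession")" -o reference.gb
```

The download itself is left commented out so the URL can be checked first; downloading the record through the NCBI website gives the same file.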
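Installation
For installation, follow the instructions on the official page. As a shortcut you can install with conda, but make sure the three dependencies are installed as well (BLAST, an assembler, and a tRNA annotation tool); otherwise the software will not run properly.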
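After installing, a quick check that the executables are reachable can save a failed run. The following is a rough sketch that assumes the tools are exposed on `PATH`. The helper `check_tools` is hypothetical, and the executable names in the last line are assumptions (BLAST+ and MEGAHIT as the assembler); adjust them to whatever assembler and tRNA annotator you actually installed.

```shell
# Report which of the given executables are missing from PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names; edit this list to match your installation.
check_tools mitofinder blastn makeblastdb megahit \
    || echo "install the missing tools before running MitoFinder"
```

If your installation configures tool locations some other way than `PATH`, this check does not apply as written.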
Input files
- Reference_file.gb: a published, already assembled mitochondrial genome in GenBank format
- left_reads.fastq.gz / right_reads.fastq.gz: clean reads from paired-end sequencing
- SE_reads.fastq.gz: reads from single-end sequencing
How to use
- for PE reads
mitofinder -j [seqid] -1 [left_reads.fastq.gz] -2 [right_reads.fastq.gz] -r [genbank_reference.gb] -o [genetic_code] -p [threads] -m [memory]
- for SE reads
mitofinder -j [seqid] -s [SE_reads.fastq.gz] -r [genbank_reference.gb] -o [genetic_code] -p [threads] -m [memory]
- Example (the bundled test case, which annotates an existing assembly passed with -a)
cd PATH/TO/MITOFINDER/test_case/
mitofinder -j Hospitalitermes_medioflavus_NCBI -a Hospitalitermes_medioflavus_NCBI.fasta -r Hospitalitermes_medioflavus_NCBI.gb -o 5
- Common parameters
-p number of threads
-r reference GenBank (.gb) file
-o genetic code of the organism (5 = invertebrate mitochondrial code)
-m maximum memory
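The paired-end and single-end templates above differ only in how the reads are passed. As a sketch, the wrapper below prints the matching command for a sample without running it. The function `build_mitofinder_cmd` and all sample and file names are placeholders, and the thread and memory values are arbitrary; only the flags shown in the templates are used.

```shell
# Hypothetical helper: print (do not run) a MitoFinder command for one sample.
# Arguments: seqid reference.gb genetic_code threads memory reads...
# Pass one read file for single-end data, two for paired-end data.
build_mitofinder_cmd() {
    seqid=$1; ref=$2; code=$3; threads=$4; mem=$5
    shift 5
    case $# in
        1) reads="-s $1" ;;
        2) reads="-1 $1 -2 $2" ;;
        *) echo "expected 1 (SE) or 2 (PE) read files, got $#" >&2
           return 1 ;;
    esac
    printf 'mitofinder -j %s %s -r %s -o %s -p %s -m %s\n' \
        "$seqid" "$reads" "$ref" "$code" "$threads" "$mem"
}

# Dry run with placeholder names; review the printed commands before running them.
build_mitofinder_cmd sample1 reference.gb 5 8 16 left_reads.fastq.gz right_reads.fastq.gz
build_mitofinder_cmd sample2 reference.gb 5 8 16 SE_reads.fastq.gz
```

Printing the command first makes it easy to spot a wrong genetic code or a swapped read file before committing to a long assembly.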
Output files
- [Seq_ID]_final_genes_NT.fasta: nucleotide sequences of the final genes selected from all contigs found by MitoFinder (one sequence per mitochondrial gene)
- [Seq_ID]_final_genes_AA.fasta: amino acid sequences of the final genes selected from all contigs found by MitoFinder (one sequence per mitochondrial gene)
- [Seq_ID]_mtDNA_contig.fasta: the sequence of a mitochondrial contig
- [Seq_ID]_mtDNA_contig.gff: the final annotation for a given contig (GFF3 format)
- [Seq_ID]_mtDNA_contig.tbl: the final annotation for a given contig (GenBank submission format; the file needed when uploading to NCBI)
- [Seq_ID]_mtDNA_contig.gb: the final annotation for a given contig (GenBank format, for visualization)
- [Seq_ID]_mtDNA_contig_genes_NT.fasta: nucleotide sequences of the annotated genes for a given contig
- [Seq_ID]_mtDNA_contig_genes_AA.fasta: amino acid sequences of the annotated genes for a given contig
- [Seq_ID]_mtDNA_contig.png: schematic representation of the annotation of the mtDNA contig
- [Seq_ID]_mtDNA_contig.infos: the initial contig name, the contig length, and the GC content
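A quick sanity check on a finished run is to count how many genes ended up in the two final gene files. The snippet below is a minimal sketch that simply counts FASTA header lines. The helper `count_fasta_records` is hypothetical, `sample1` stands in for your own Seq_ID, and the script assumes it is run from the directory that holds the final results.

```shell
# Count FASTA records (header lines starting with ">") in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Placeholder Seq_ID; replace with the value you passed to -j.
seqid=sample1
for f in "${seqid}_final_genes_NT.fasta" "${seqid}_final_genes_AA.fasta"; do
    if [ -f "$f" ]; then
        echo "$f: $(count_fasta_records "$f") sequences"
    else
        echo "$f: not found (run this from the directory holding the final results)"
    fi
done
```

A count far below the number of genes in the reference is a hint that the assembly is fragmentary or that the reference is too distant.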