minimap2
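Project page: https://github.com/lh3/minimap2. Like BLAST, it performs local alignment. For a circular reference sequence, consider doubling it (concatenating the sequence with itself) so that reads spanning the start/end junction can still align.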
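The doubling trick for circular references can be sketched with plain awk. This is a minimal illustration, not a minimap2 feature: `double_fasta` is a hypothetical helper name and `ref.fa` is a placeholder path.

```shell
# double_fasta: hypothetical helper (not part of minimap2).
# Prints every FASTA record with its sequence written twice on a single line,
# so a read that spans the origin of a circular genome can align contiguously.
double_fasta() {
  awk '
    function emit() { if (name != "") print name "\n" seq seq }
    /^>/ { emit(); name = $0; seq = ""; next }   # new record: output the previous one
    { seq = seq $0 }                             # join wrapped sequence lines
    END { emit() }
  ' "$1"
}

# Usage (placeholder file names):
#   double_fasta ref.fa > ref.doubled.fa
```

Alignments that fall in the second copy report coordinates offset by the original sequence length, so they need to be shifted back when interpreting positions.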

# Install
conda create -n minimap2 -c conda-forge -c bioconda minimap2
conda activate minimap2
# Main options
-t: number of threads
-a: output SAM (the default output is PAF)
# long sequences against a reference genome
./minimap2 -a test/MT-human.fa test/MT-orang.fa > test.sam
# create an index first and then map
./minimap2 -x map-ont -d MT-human-ont.mmi test/MT-human.fa
./minimap2 -a MT-human-ont.mmi test/MT-orang.fa > test.sam
# use presets (no test data)
./minimap2 -ax map-pb ref.fa pacbio.fq.gz > aln.sam # PacBio CLR genomic reads
./minimap2 -ax map-ont ref.fa ont.fq.gz > aln.sam # Oxford Nanopore genomic reads
./minimap2 -ax map-hifi ref.fa pacbio-ccs.fq.gz > aln.sam # PacBio HiFi/CCS genomic reads (v2.19 or later)
./minimap2 -ax asm20 ref.fa pacbio-ccs.fq.gz > aln.sam # PacBio HiFi/CCS genomic reads (v2.18 or earlier)
./minimap2 -ax sr ref.fa read1.fa read2.fa > aln.sam # short genomic paired-end reads
./minimap2 -ax splice ref.fa rna-reads.fa > aln.sam # spliced long reads (strand unknown)
./minimap2 -ax splice -uf -k14 ref.fa reads.fa > aln.sam # noisy Nanopore Direct RNA-seq
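./minimap2 -ax splice:hq -uf ref.fa query.fa > aln.sam # Final PacBio Iso-seq or traditional cDNA; -uf considers only the forward (sense) transcript strand; -C5 is more sensitive to splice sites when aligning across species; -O6,24 -B4 can recover more exons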
./minimap2 -ax splice --junc-bed anno.bed12 ref.fa query.fa > aln.sam # prioritize on annotated junctions
./minimap2 -cx asm5 asm1.fa asm2.fa > aln.paf # intra-species asm-to-asm alignment
./minimap2 -x ava-pb reads.fa reads.fa > overlaps.paf # PacBio read overlap
./minimap2 -x ava-ont reads.fa reads.fa > overlaps.paf # Nanopore read overlap
# man page for detailed command line options
man ./minimap2.1
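The commands above that omit `-a` write PAF, a tab-separated format whose first 12 columns are mandatory. As a rough sketch of how to inspect it without extra tools, the hypothetical helper below (`paf_summary` is not part of minimap2) prints query name, target name, mapping quality, and the ratio of matching bases (column 10) to alignment block length (column 11). Without `-c` the match count is an estimate, so treat the ratio as approximate.

```shell
# paf_summary: hypothetical helper that reads only the mandatory PAF columns.
# Output columns: query name, target name, MAPQ, matches / alignment block length.
paf_summary() {
  awk -F'\t' '$11 > 0 { printf "%s\t%s\t%d\t%.4f\n", $1, $6, $12, $10 / $11 }' "$1"
}

# Usage (placeholder file name):
#   paf_summary aln.paf | sort -k4,4nr | head
```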
paftools.js
Usage: paftools.js <command> [arguments]
Commands:
  view        convert PAF to BLAST-like (for eyeballing) or MAF
  splice2bed  convert spliced alignment in PAF/SAM to BED12
  sam2paf     convert SAM to PAF
  delta2paf   convert MUMmer's delta to PAF
  gff2bed     convert GTF/GFF3 to BED12
  stat        collect basic mapping information in PAF/SAM
  liftover    simplistic liftOver
  call        call variants from asm-to-ref alignment with the cs tag
  bedcov      compute the number of bases covered
  mapeval     evaluate mapping accuracy using mason2/PBSIM-simulated FASTQ
  mason2fq    convert mason2-simulated SAM to FASTQ
  pbsim2fq    convert PBSIM-simulated MAF to FASTQ
  junceval    evaluate splice junction consistency with known annotations
  ov-eval     evaluate read overlap sensitivity using read-to-ref mapping
hisat2: RNA-seq alignment; fast, with low memory use
# Install
conda create -n hisat2 -c conda-forge -c bioconda hisat2
conda activate hisat2
# Build the index
hisat2-build ref.fa ref.idx -p 20
# Align paired-end reads
hisat2 -q -x ref.idx -1 B2_1.clean.fq.gz -2 B2_2.clean.fq.gz -S out.sam -p 25
--end-to-end: full-length (end-to-end) alignment, an option inherited from bowtie2
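STAR: RNA-Seq read alignment
#   --runThreadN        number of threads
#   --genomeDir         output path for the index (a directory created beforehand)
#   --genomeFastaFiles  reference genome (the previously downloaded .fa file)
#   --sjdbGTFfile       annotation for the reference genome (the previously downloaded .gtf file)
#   --sjdbOverhang      read length - 1 (default is 100)
STAR --runThreadN 20 \
  --runMode genomeGenerate \
  --genomeDir star_index/ \
  --genomeFastaFiles ref.fa \
  --sjdbGTFfile gencode.v38.annotation.gtf \
  --sjdbOverhang 35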
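Since `--sjdbOverhang` is meant to be the read length minus 1, it can help to measure the reads rather than guess. The sketch below assumes a standard 4-line-per-record FASTQ and uses the maximum read length, which is the usual choice when lengths vary. `sjdb_overhang` is a hypothetical helper name, not a STAR command.

```shell
# sjdb_overhang: hypothetical helper (not part of STAR).
# Scans a FASTQ file (plain or .gz) and prints (maximum read length - 1).
sjdb_overhang() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }   # sequence lines only
              END { print max - 1 }'
}

# Usage (placeholder file name):
#   STAR ... --sjdbOverhang "$(sjdb_overhang reads_1.fq.gz)"
```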
IGV: visualization of read alignments
Copy the BAM file and its matching BAI index file to the machine running IGV.
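IGV needs a coordinate-sorted BAM together with its index. A minimal sketch of producing both from a SAM file with samtools follows. `sam_to_igv_bam` is a hypothetical wrapper name and the file names are placeholders; `samtools sort` and `samtools index` are the real subcommands.

```shell
# sam_to_igv_bam: hypothetical wrapper around two real samtools subcommands.
#   $1  input SAM (for example out.sam from one of the aligners)
#   $2  output BAM path
sam_to_igv_bam() {
  # Coordinate-sort the alignments and write them as BAM.
  samtools sort -o "$2" "$1" || return 1
  # Build the index; this writes "$2.bai" next to the BAM.
  samtools index "$2"
}

# Usage (placeholder file names):
#   sam_to_igv_bam out.sam out.sorted.bam
#   # then copy out.sorted.bam and out.sorted.bam.bai
```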
jBrowse2
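bwa & bwa-mem2: mapping second-generation (short) reads to a reference genome
Project pages: https://github.com/lh3/bwa and https://github.com/bwa-mem2/bwa-mem2. Both map short, low-divergence sequences against a large reference genome.
# Install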
conda create -n bwa -c conda-forge -c bioconda bwa
conda activate bwa
conda create -n bwa-mem2 -c conda-forge -c bioconda bwa-mem2
conda activate bwa-mem2
# Build the index
bwa-mem2 index ref.fa
# Align
bwa-mem2 mem -t <num_threads> ref.fa read1.fq read2.fq > out.sam
samtools
# Install
conda create -n samtools -c conda-forge -c bioconda samtools
conda activate samtools
bstr
Can be used, for example, to filter third-generation (long) reads belonging to the chloroplast genome.
BLEND
Project page: https://github.com/CMU-SAFARI/BLEND. Similar to minimap2.