bowtie examples

| category Bioinformatics  | tag NGS 

1. generate bowtie2 index

bowtie2-build ref.fasta indexDir/ref

2. read alignments

2.1 end-to-end read alignments

bowtie2 –x indexDir/ref –U seq.fastq –S out.full.sam

2.2 partial read alignments

bowtie2 –x indexDir/ref –U seq.fastq –S out.local.sam --local

3. statistics of the alignments

3.1 How many matches (alignments) were reported?

Check the SAM file to determine the number of alignment lines, excluding lines that refer to unmapped reads. A SAM line indicating an unmapped read can be recognized by a “*” in column 3 (chrom).

grep -v "^@" out.full.sam | cut -f3 | grep -v "*" | wc -l
grep -v "^@" out.local.sam | cut -f3 | grep -v "*" | wc -l

3.2 How many alignments contained insertions and/or deletions?

This information is captured in the CIGAR field (col 6), marked with ‘D’ and ‘I’, respectively.

cut -f6 out.full.sam | grep -c "[I,D]"
cut -f6 out.local.sam | grep -c "[I,D]"

or

grep -v "^@" out.full.sam | awk '$6~"I" || $6~"D"' | wc -l
grep -v "^@" out.local.sam | awk '$6~"I" || $6~"D"' | wc -l

4. call variants

4.1 converting the SAM file to BAM format

samtools view –bT ref.fasta out.full.sam > out.full.bam

4.2 sort the bam file

samtools sort out.full.bam -o out.full.sorted.bam

4.3 Determine candidate sites

samtools mpileup –f ref.fasta –g out.full.sorted.bam > out.full.mpileup.bcf
bcftools call -m -v -O v -o out.mileup.vcf out.full.mileup.bcf

5. variants statistics

5.1 How many variants were reported for Chr1?

grep -c "^Chr1" out.full.mpileup.vcf 

5.2 How many variants have ‘A’ as the reference allele?

grep -v "^#" out.full.mpileup.vcf | awk '$4=="A"' | wc -l

5.3 How many variants have exactly 20 supporting reads (read depth)?

grep -v "^#" out.full.mpileup.vcf | grep "DP=20;" | wc -l

5.4 How many variants represent indels?

grep -v "^#" out.full.mpileup.vcf | grep "INDEL" | wc -l

Previous     Next