16S rRNA Analysis (1): QC with FastQC & NanoPlot

I got a small set of 16S rRNA sequence data from my professor.
(I said small, but it's still 1.8GB!!)

As a first step, I installed the necessary packages and ran QC with FastQC and NanoPlot.

I'm running everything on my personal laptop.
(AMD Ryzen 5 4500U … & 16+4GB RAM)

It took a long time. I watched a whole movie and it was still running, so I just went to bed, and it was done by morning.

a. Mean Length
b. Median Length
c. N50

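(N50: "a weighted median statistic such that 50% of the entire assembly is contained in contigs or scaffolds equal to or larger than this value." Source: Wikipedia. For reads, that means half of all sequenced bases sit in reads of this length or longer.)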
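To make the three length statistics concrete, here is a minimal Python sketch that computes them from a list of read lengths. It is only an illustration of the definitions, not how NanoPlot computes them internally; the function names (n50, length_summary) and the toy read lengths are made up for this example.

```python
from statistics import mean, median


def n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    half_of_bases = sum(lengths) / 2
    running_total = 0
    # Walk from the longest read down, adding up bases until half the total is reached.
    for length in sorted(lengths, reverse=True):
        running_total += length
        if running_total >= half_of_bases:
            return length


def length_summary(lengths):
    """Collect mean, median, and N50 for a list of read lengths (hypothetical helper)."""
    return {
        "mean": mean(lengths),
        "median": median(lengths),
        "n50": n50(lengths),
    }


# Toy read lengths (made up): five reads near 1500 bp plus one short fragment.
toy_lengths = [1400, 1450, 1500, 1500, 1550, 300]
summary = length_summary(toy_lengths)
print(summary)
```

On this toy data the short 300 bp read pulls the mean down to about 1283, while the median (1475) and the N50 (1500) stay close to the expected amplicon length. That is why looking at all three together is useful.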

We use this to decide the --min_length value for Filtlong.

Also, a 16S amplicon should be approximately 1500 bp long, so my result (all around 1500) looks good!

My result says:

So it's low-quality data, which means the Filtlong filtering needs to be more generous (adjusted with --keep_percent).

In the next article, I’ll proceed to Filtlong with the data I got today!

SeqPT