Questions about DADA2 on PacBio data #2203

Hi Dr. Callahan,

Recently, I have been trying to run DADA2 on my drinking water and filter material samples, which are normally very low in biomass (1-10 ng/µL). I have 254 samples sequenced for the full-length 16S rRNA gene using PacBio, and I tried to analyse them with the DADA2 pipeline using the default parameters.

I am not an expert in bioinformatics, and this is my first time dealing with PacBio reads. I have a couple of questions and would appreciate your help with them.

After running DADA2 with the default parameters, the average read loss after filtering and trimming was ~25%.
After denoising, the cumulative read loss was between 40 and 60% (relative to the raw reads), which was shocking to me.
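For clarity, the percentages above are losses relative to the raw read count. A minimal sketch of that arithmetic is shown here; the counts are made-up numbers for a single hypothetical sample, not values from my data, and `percent_lost` is just an illustrative helper.

```python
def percent_lost(raw, remaining):
    """Percentage of the raw reads that are no longer present at a step."""
    if raw <= 0:
        raise ValueError("raw read count must be positive")
    return 100.0 * (raw - remaining) / raw


# Hypothetical read counts for one sample at each pipeline step.
track = {"raw": 10000, "filtered": 7500, "denoised": 5000}

# Loss at every step, always measured against the raw reads.
loss = {step: percent_lost(track["raw"], n) for step, n in track.items()}

for step, pct in loss.items():
    print(f"{step}: {pct:.1f}% of raw reads lost")
```

In this toy case, a 25% loss at filtering followed by a 50% cumulative loss after denoising means that one third of the filtered reads were dropped during denoising.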

After the taxonomy assignment, I tried to work out how many singletons I have. Out of ~53,000 ASVs, ~43,000 occurred only once (about 81%).
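"Occurring once" can be read in two ways: an ASV whose total read count across all samples is 1, or an ASV that is present in exactly one sample. The sketch below counts both on a toy sample-by-ASV table, so the two definitions can be compared. The table and the helper name `singleton_summary` are hypothetical and only illustrate the counting.

```python
def singleton_summary(table):
    """Summarise rare ASVs in a table of per-sample rows of ASV read counts."""
    n_asvs = len(table[0])
    # Total reads per ASV, summed over all samples.
    totals = [sum(row[j] for row in table) for j in range(n_asvs)]
    # Number of samples in which each ASV has at least one read.
    prevalence = [sum(1 for row in table if row[j] > 0) for j in range(n_asvs)]
    return {
        "n_asvs": n_asvs,
        "total_count_1": sum(1 for t in totals if t == 1),
        "one_sample_only": sum(1 for p in prevalence if p == 1),
    }


# Toy table: 3 samples (rows) x 4 ASVs (columns).
toy = [
    [10, 0, 1, 0],
    [5, 3, 0, 0],
    [0, 0, 0, 7],
]
summary = singleton_summary(toy)
print(summary)

# Fraction implied by the approximate counts quoted in the post.
reported_fraction = 43000 / 53000
print(f"{reported_fraction:.1%}")
```

The two counts can differ a lot: in the toy table only one ASV has a total count of 1, while three ASVs are found in a single sample.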

Are these results normal, or am I doing something fundamentally wrong?

I also tried setting minBoot = 80 in the assignTaxonomy() function, but the results did not differ from the default. I am not sure how the bootstrap threshold for taxonomy assignment should be chosen. What should we expect to see if we change it from 50 to 80?
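My understanding, which may be incomplete, is that the threshold is applied per taxonomic rank: a rank is reported only if its bootstrap support meets the threshold, and otherwise that rank and everything below it become NA. The sketch below illustrates that reading on a made-up assignment. It is not DADA2's implementation, and the taxon names, support values and the helper `apply_min_boot` are all hypothetical.

```python
def apply_min_boot(assignment, support, min_boot):
    """Keep ranks until the first one whose support falls below min_boot."""
    kept = []
    passing = True
    for taxon, boot in zip(assignment, support):
        # Once one rank fails, every lower rank is dropped as well.
        passing = passing and boot >= min_boot
        kept.append(taxon if passing else None)
    return kept


# Hypothetical assignment from kingdom to genus, with bootstrap support (0-100).
assignment = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
              "Burkholderiales", "Comamonadaceae", "Acidovorax"]
support = [100, 100, 98, 91, 72, 55]

at_50 = apply_min_boot(assignment, support, 50)
at_80 = apply_min_boot(assignment, support, 80)
print(at_50)
print(at_80)
```

Under this reading, raising the threshold from 50 to 80 would only change ASVs that have at least one rank with support between 50 and 79. If the output really is identical, either few ASVs fall in that range or the argument did not take effect.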

Thank you for your support. 
