questions about DADA2 on PacBio data #2203

@haniehshakeri94-cloud

Hi Dr. Callahan,

I have recently been trying to run DADA2 on my drinking water and filter material samples, which are typically very low in biomass (1-10 ng/µL). I have 254 samples sequenced for the full-length 16S rRNA gene on PacBio, and I analysed them with the DADA2 pipeline using the default parameters.

I am not an expert in bioinformatics, and this is my first time working with PacBio reads. I have a couple of questions and would appreciate your help with them.

After running DADA2 with the default parameters, the average read loss after filtering and trimming was ~25%.
After denoising, the read loss was between 40% and 60% (compared to the raw reads), which was shocking to me.
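
To make the two percentages above comparable, here is a minimal standalone sketch in plain Python (not DADA2 code) that separates loss relative to the raw reads from loss at each individual step. The counts and the stage names `raw`, `filtered`, and `denoised` are hypothetical placeholders, not values from my run.

```python
# Hypothetical read counts for one sample at each stage (illustrative only).
counts = {"raw": 10000, "filtered": 7500, "denoised": 5000}


def retention_vs_raw(counts):
    """Percent of the raw reads still present at each stage."""
    raw = counts["raw"]
    return {stage: round(100.0 * n / raw, 1) for stage, n in counts.items()}


def stepwise_loss(counts):
    """Percent of reads lost at each stage relative to the previous stage."""
    stages = list(counts)
    return {
        stages[i]: round(100.0 * (1 - counts[stages[i]] / counts[stages[i - 1]]), 1)
        for i in range(1, len(stages))
    }


print(retention_vs_raw(counts))
print(stepwise_loss(counts))
```

With these made-up numbers, a 50% loss relative to the raw reads corresponds to a 25% loss at filtering followed by a 33.3% loss at denoising, so the cumulative figure overstates what the denoising step alone removes.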

Then, after the taxonomy assignment, I checked how many singletons I have. Out of ~53,000 ASVs, ~43,000 occurred only once (about 81%).

Are these numbers normal, or am I doing something fundamentally wrong?
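
"Occurring once" can be read in two ways: an ASV with a total abundance of one read, or an ASV detected in only one sample. The sketch below, in plain Python on a hypothetical toy count table, shows how the two counts can differ and recomputes the fraction from the rounded figures above. The table and the function names are illustrative assumptions, not output from my data.

```python
# Hypothetical ASV-by-sample count table (rows: ASVs, columns: three samples).
table = {
    "ASV1": [120, 85, 0],
    "ASV2": [0, 14, 0],  # found in one sample only, but with 14 reads
    "ASV3": [0, 0, 1],   # one read in total
    "ASV4": [3, 0, 2],
}


def total_abundance_singletons(table):
    """ASVs whose summed abundance across all samples is exactly one read."""
    return [asv for asv, row in table.items() if sum(row) == 1]


def single_sample_asvs(table):
    """ASVs with non-zero counts in exactly one sample."""
    return [asv for asv, row in table.items() if sum(1 for n in row if n > 0) == 1]


print(total_abundance_singletons(table))
print(single_sample_asvs(table))

# Fraction implied by the rounded counts quoted above.
fraction = round(100 * 43000 / 53000, 1)
print(fraction)
```

In the toy table only `ASV3` is a singleton by total abundance, while both `ASV2` and `ASV3` are found in a single sample, so the definition used changes the count.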

I also tried minBoot = 80 in the assignTaxonomy() function, but the results did not differ from the default. I am not sure how to choose the bootstrap threshold for taxonomy assignment. What should we expect to see if we change it from 50 to 80?
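
My understanding is that ranks whose bootstrap support falls below the threshold are reported as unassigned (NA). The following plain-Python sketch illustrates only that thresholding idea and is not DADA2's implementation. The lineage, the bootstrap values, and the helper `apply_min_boot` are all hypothetical.

```python
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]

# Hypothetical assignment for one ASV with made-up bootstrap support (0-100).
taxa = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
        "Burkholderiales", "Comamonadaceae", "Aquabacterium"]
boots = [100, 100, 98, 91, 72, 55]


def apply_min_boot(taxa, boots, min_boot):
    """Keep a rank's label only if its bootstrap support is >= min_boot, else None."""
    return [t if b >= min_boot else None for t, b in zip(taxa, boots)]


at_50 = dict(zip(RANKS, apply_min_boot(taxa, boots, 50)))
at_80 = dict(zip(RANKS, apply_min_boot(taxa, boots, 80)))

print(at_50)
print(at_80)
```

In this example the stricter threshold blanks out only Family and Genus, so a difference between 50 and 80 would show up as extra unassigned entries at the lower ranks rather than as changed labels. If nearly all bootstrap values were already above 80, the two outputs would look the same.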

Thank you for your support.
