Reads Filter-1
--stdout stream passing-filters reads to STDOUT.
This option will result in interleaved FASTQ output for paired-end output.
Disabled by default.
--interleaved_in indicate that <in1> is an interleaved
FASTQ which contains both read1 and read2. Disabled by default.
--reads_to_process specify how many reads/pairs to be
processed. Default 0 means process all reads. (int [=0])
--dont_overwrite don't overwrite existing files.
Overwritting is allowed by default.
--fix_mgi_id the MGI FASTQ ID format is not compatible
with many BAM operation tools, enable this option to fix it.
-V, --verbose output verbose log information (i.e. when
every 1M reads are processed).
-A, --disable_adapter_trimming adapter trimming is enabled by default.
If this option is specified, adapter trimming is disabled
-a, --adapter_sequence the adapter for read1. For SE data, if
not specified, the adapter will be auto-detected. For PE data, this is used if
R1/R2 are found not overlapped. (string [=auto])
--adapter_sequence_r2 the adapter for read2 (PE data only).
This is used if R1/R2 are found not overlapped. If not specified, it will be the
same as <adapter_sequence> (string [=auto])
--adapter_fasta specify a FASTA file to trim both read1
and read2 (if PE) by all the sequences in this FASTA file (string [=])
--detect_adapter_for_pe by default, the auto-detection for
adapter is for SE data input only, turn on this option to enable it for PE data.
-f, --trim_front1 trimming how many bases in front for
read1, default is 0 (int [=0])
-t, --trim_tail1 trimming how many bases in tail for
read1, default is 0 (int [=0])
-b, --max_len1 if read1 is longer than max_len1, then
trim read1 at its tail to make it as long as max_len1. Default 0 means no
limitation (int [=0])
-F, --trim_front2 trimming how many bases in front for
read2. If it's not specified, it will follow read1's settings (int [=0])
-T, --trim_tail2 trimming how many bases in tail for
read2. If it's not specified, it will follow read1's settings (int [=0])
-B, --max_len2 if read2 is longer than max_len2, then
trim read2 at its tail to make it as long as max_len2. Default 0 means no
limitation. If it's not specified, it will follow read1's settings (int [=0])
-D, --dedup enable deduplication to drop the
duplicated reads/pairs
--dup_calc_accuracy accuracy level to calculate duplication
(1~6), higher level uses more memory (1G, 2G, 4G, 8G, 16G, 24G). Default 1 for
no-dedup mode, and 3 for dedup mode. (int [=0])
--dont_eval_duplication don't evaluate duplication rate to save
time and use less memory.
-g, --trim_poly_g force polyG tail trimming, by default
trimming is automatically enabled for Illumina NextSeq/NovaSeq data
--poly_g_min_len the minimum length to detect polyG in the
read tail. 10 by default. (int [=10])
-G, --disable_trim_poly_g disable polyG tail trimming, by default
trimming is automatically enabled for Illumina NextSeq/NovaSeq data
-x, --trim_poly_x enable polyX trimming in 3' ends.
--poly_x_min_len the minimum length to detect polyX in the
read tail. 10 by default. (int [=10])
-5, --cut_front move a sliding window from front (5') to
tail, drop the bases in the window if its mean quality < threshold, stop
otherwise.
-3, --cut_tail move a sliding window from tail (3') to
front, drop the bases in the window if its mean quality < threshold, stop
otherwise.
-r, --cut_right move a sliding window from front to tail,
if meet one window with mean quality < threshold, drop the bases in the window
and the right part, and then stop.
-W, --cut_window_size the window size option shared by
cut_front, cut_tail or cut_sliding. Range: 1~1000, default: 4 (int [=4])
-M, --cut_mean_quality the mean quality requirement option
shared by cut_front, cut_tail or cut_sliding. Range: 1~36 default: 20 (Q20) (int
[=20])
--cut_front_window_size the window size option of cut_front,
default to cut_window_size if not specified (int [=4])
--cut_front_mean_quality the mean quality requirement option for
cut_front, default to cut_mean_quality if not specified (int [=20])
--cut_tail_window_size the window size option of cut_tail,
default to cut_window_size if not specified (int [=4])
--cut_tail_mean_quality the mean quality requirement option for
cut_tail, default to cut_mean_quality if not specified (int [=20])
--cut_right_window_size the window size option of cut_right,
default to cut_window_size if not specified (int [=4])
--cut_right_mean_quality the mean quality requirement option for
cut_right, default to cut_mean_quality if not specified (int [=20])
-Q, --disable_quality_filtering quality filtering is enabled by default.
If this option is specified, quality filtering is disabled
-q, --qualified_quality_phred the quality value that a base is
qualified. Default 15 means phred quality >=Q15 is qualified. (int [=15])
-u, --unqualified_percent_limit how many percents of bases are allowed to
be unqualified (0~100). Default 40 means 40% (int [=40])
-n, --n_base_limit if one read's number of N base is
>n_base_limit, then this read/pair is discarded. Default is 5 (int [=5])
-e, --average_qual if one read's average quality score
<avg_qual, then this read/pair is discarded. Default 0 means no requirement (int
[=0])
-L, --disable_length_filtering length filtering is enabled by default.
If this option is specified, length filtering is disabled
-l, --length_required reads shorter than length_required will
be discarded, default is 15. (int [=15])
--length_limit reads longer than length_limit will be
discarded, default 0 means no limitation. (int [=0])
-y, --low_complexity_filter enable low complexity filter. The
complexity is defined as the percentage of base that is different from its next
base (base[i] != base[i+1]).
-Y, --complexity_threshold the threshold for low complexity filter
(0~100). Default is 30, which means 30% complexity is required. (int [=30])
--filter_by_index1 specify a file contains a list of
barcodes of index1 to be filtered out, one barcode per line (string [=])
--filter_by_index2 specify a file contains a list of
barcodes of index2 to be filtered out, one barcode per line (string [=])
--filter_by_index_threshold the allowed difference of index barcode
for index filtering, default 0 means completely identical. (int [=0])
-c, --correction enable base correction in overlapped
regions (only for PE data), default is disabled
--overlap_len_require the minimum length to detect overlapped
region of PE reads. This will affect overlap analysis based PE merge, adapter
trimming and correction. 30 by default. (int [=30])
--overlap_diff_limit the maximum number of mismatched bases to
detect overlapped region of PE reads. This will affect overlap analysis based PE
merge, adapter trimming and correction. 5 by default. (int [=5])
--overlap_diff_percent_limit the maximum percentage of mismatched
bases to detect overlapped region of PE reads. This will affect overlap analysis
based PE merge, adapter trimming and correction. Default 20 means 20%. (int
[=20])
-U, --umi enable unique molecular identifier (UMI)
preprocessing
--umi_loc specify the location of UMI, can be
(index1/index2/read1/read2/per_index/per_read, default is none (string [=])
--umi_len if the UMI is in read1/read2, its length
should be provided (int [=0])
--umi_prefix if specified, an underline will be used
to connect prefix and UMI (i.e. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). No
prefix by default (string [=])
--umi_skip if the UMI is in read1/read2, fastp can
skip several bases following UMI, default is 0 (int [=0])
--umi_delim delimiter to use between the read name
and the UMI, default is : (string [=:])
-p, --overrepresentation_analysis enable overrepresented sequence analysis.
-P, --overrepresentation_sampling one in (--overrepresentation_sampling)
reads will be computed for overrepresentation analysis (1~10000), smaller is
slower, default is 20. (int [=20])
-j, --json the json format report file name (string
[=fastp.json])
-h, --html the html format report file name (string
[=fastp.html])
-R, --report_title should be quoted with ' or ", default is
"fastp report" (string [=fastp report])
-w, --thread worker thread number, default is 3 (int
[=3])
-s, --split split output by limiting total split file
number with this option (2~999), a sequential number prefix will be added to
output name ( 0001.out.fq, 0002.out.fq…), disabled by default (int [=0])
-S, --split_by_lines split output by limiting lines of each
file with this option(>=1000), a sequential number prefix will be added to
output name ( 0001.out.fq, 0002.out.fq…), disabled by default (long [=0])
-d, --split_prefix_digits the digits for the sequential number
padding (1~10), default is 4, so the filename will be padded as 0001.xxx, 0 to
disable padding (int [=4])
--cut_by_quality5 DEPRECATED, use --cut_front instead.
--cut_by_quality3 DEPRECATED, use --cut_tail instead.
--cut_by_quality_aggressive DEPRECATED, use --cut_right instead.
--discard_unmerged DEPRECATED, no effect now, see the
introduction for merging.
-?, --help print this message
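The read-level filters described in the help text above (`-q`/`-u`, `-n`, `-l`, `-y`/`-Y`) can be illustrated with a small Python sketch. This is only an approximation built from the option descriptions: the function names, the order of the checks, and the `len(seq) - 1` denominator in the complexity calculation are assumptions, not fastp's actual implementation.

```python
def complexity(seq):
    """Percentage of bases that differ from the next base (base[i] != base[i+1])."""
    if len(seq) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    # Assumption: normalise by the number of adjacent pairs.
    return 100.0 * changes / (len(seq) - 1)


def passes_filters(seq, quals, qualified_phred=15, unqualified_percent_limit=40,
                   n_base_limit=5, length_required=15, complexity_threshold=None):
    """Return (ok, reason) for one read, using the documented default thresholds.

    seq   -- the base string
    quals -- a list of integer phred scores, one per base
    """
    # -l: reads shorter than length_required are discarded.
    if len(seq) < length_required:
        return False, "too short"
    # -n: reads with more than n_base_limit N bases are discarded.
    if seq.count("N") > n_base_limit:
        return False, "too many N"
    # -q / -u: a base is qualified if its phred score is >= qualified_phred;
    # at most unqualified_percent_limit percent of bases may be unqualified.
    unqualified = sum(1 for q in quals if q < qualified_phred)
    if unqualified * 100 > unqualified_percent_limit * len(seq):
        return False, "low quality"
    # -y / -Y: optional low complexity filter (disabled unless a threshold is given).
    if complexity_threshold is not None and complexity(seq) < complexity_threshold:
        return False, "low complexity"
    return True, "passed"
```

For example, `passes_filters("ACGT" * 10, [30] * 40)` passes with the defaults, while a 40-base homopolymer is rejected once `complexity_threshold=30` is supplied.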
[1]: ls
fastqc_results
fastqc_trimming
skesa_assemblies.ipynb
VA1638-2021
VA1701-2021
VA1832-2021
VA1833-2021
VA1842-2021
VA1887-2021
VA2040-2021
VA2041-2021
VA2042-2021
VA2043-2021
VA2092-2021
VA2093-2021
VA2095-2021
VA3232-2021
VA4087-2021
VA4090-2021
VA541-2022
VA542-2022
VA543-2022
[2]: cd VA1638-2021
[3]: ls
VA1638-2021_S3_L001_R1_001.fastq.gz
VA1638-2021_S3_L001_R2_001.fastq.gz
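Each sample directory holds one or more R1/R2 pairs that are matched by eye before running fastp. A minimal sketch of doing that pairing programmatically follows; `pair_reads` is a hypothetical helper, and it assumes that the mate of an R1 file has the same name with `_R1` replaced by `_R2`, which holds for the listings in this notebook.

```python
def pair_reads(filenames):
    """Group FASTQ file names into (R1, R2) tuples.

    Assumes the R2 file name is the R1 name with the first "_R1" replaced by "_R2".
    Files without a matching mate are skipped.
    """
    names = set(filenames)
    pairs = []
    for r1 in sorted(names):
        if "_R1" not in r1:
            continue
        r2 = r1.replace("_R1", "_R2", 1)
        if r2 in names:
            pairs.append((r1, r2))
    return pairs


listing = [
    "VA1638-2021_S3_L001_R1_001.fastq.gz",
    "VA1638-2021_S3_L001_R2_001.fastq.gz",
]
print(pair_reads(listing))
```

In a real directory the list of names could come from `os.listdir(".")`.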
↪VA1638-2021_trimmed_R2_001.fastq.gz -t 3 -T 3 -f 21 -F 21 -w 16
Q30 bases: 338487695(87.8695%)
Filtering result:
reads passed filter: 2603200
reads failed due to low quality: 620634
reads failed due to too many N: 0
reads failed due to too short: 1004
reads with adapter trimmed: 542752
bases trimmed due to adapters: 2754504
fastp -i VA1638-2021_S3_L001_R1_001.fastq.gz -o
VA1638-2021_trimmed_R1_001.fastq.gz -I VA1638-2021_S3_L001_R2_001.fastq.gz -O
VA1638-2021_trimmed_R2_001.fastq.gz -t 3 -T 3 -f 21 -F 21 -w 16
fastp v0.24.1, time used: 15 seconds
[5]: ls
fastp.html
VA1638-2021_S3_L001_R2_001.fastq.gz
fastp.json
VA1638-2021_trimmed_R1_001.fastq.gz
VA1638-2021_S3_L001_R1_001.fastq.gz
VA1638-2021_trimmed_R2_001.fastq.gz
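The same fastp invocation is repeated for every sample with only the file names and trimming values changing. As a rough sketch, the argument list could be assembled in Python as below. `fastp_command` is a hypothetical helper; it uses only the options that appear in the command fastp echoes in its log (`-i`, `-o`, `-I`, `-O`, `-t`, `-T`, `-f`, `-F`, `-w`), and the default of 16 threads simply mirrors the runs in this notebook.

```python
def fastp_command(r1, r2, out1, out2, front1, front2, tail1, tail2, threads=16):
    """Build the paired-end fastp argument list used throughout this notebook."""
    return [
        "fastp",
        "-i", r1, "-o", out1,      # read1 input and output
        "-I", r2, "-O", out2,      # read2 input and output
        "-t", str(tail1), "-T", str(tail2),    # bases trimmed from the tail of read1/read2
        "-f", str(front1), "-F", str(front2),  # bases trimmed from the front of read1/read2
        "-w", str(threads),        # worker threads
    ]


cmd = fastp_command(
    "VA1638-2021_S3_L001_R1_001.fastq.gz",
    "VA1638-2021_S3_L001_R2_001.fastq.gz",
    "VA1638-2021_trimmed_R1_001.fastq.gz",
    "VA1638-2021_trimmed_R2_001.fastq.gz",
    front1=21, front2=21, tail1=3, tail2=3,
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where fastp is installed; the sketch itself only builds and prints the command.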
[6]: cd ../VA1701-2021
[10]: ls
VA1701-2021_S4_L001_R1_001.fastq.gz
VA1701-2021_S4_L001_R2_001.fastq.gz
↪VA1701-2021_trimmed_R2_001.fastq.gz -t 4 -T 4 -f 21 -F 21 -w 16
Filtering result:
reads passed filter: 2859530
reads failed due to low quality: 438240
reads failed due to too many N: 0
reads failed due to too short: 1770
reads with adapter trimmed: 617114
bases trimmed due to adapters: 2440871
HTML report: fastp.html
fastp -i VA1701-2021_S4_L001_R1_001.fastq.gz -o
VA1701-2021_trimmed_R1_001.fastq.gz -I VA1701-2021_S4_L001_R2_001.fastq.gz -O
VA1701-2021_trimmed_R2_001.fastq.gz -t 4 -T 4 -f 21 -F 21 -w 16
fastp v0.24.1, time used: 16 seconds
[12]: ls
fastp.html
VA1701-2021_S4_L001_R2_001.fastq.gz
fastp.json
VA1701-2021_trimmed_R1_001.fastq.gz
VA1701-2021_S4_L001_R1_001.fastq.gz
VA1701-2021_trimmed_R2_001.fastq.gz
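The "Filtering result" blocks printed by fastp are easier to compare across samples as a pass rate. The sketch below parses such a block from captured log text and computes the fraction of reads that passed. `parse_filtering_result` and `pass_rate` are hypothetical helpers, and the rate assumes that the "passed" and "failed" categories together account for every input read.

```python
import re


def parse_filtering_result(log_text):
    """Return {label: count} for the lines that follow 'Filtering result:'."""
    counts = {}
    in_block = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if line == "Filtering result:":
            in_block = True
            continue
        if in_block:
            match = re.match(r"^([a-z][^:]*):\s*(\d+)$", line)
            if not match:
                break  # first line that is not 'label: integer' ends the block
            counts[match.group(1)] = int(match.group(2))
    return counts


def pass_rate(counts):
    """Fraction of reads that passed, out of all passed and failed reads."""
    total = sum(v for k, v in counts.items()
                if k.startswith("reads passed") or k.startswith("reads failed"))
    return counts["reads passed filter"] / total


log = """Filtering result:
reads passed filter: 2859530
reads failed due to low quality: 438240
reads failed due to too many N: 0
reads failed due to too short: 1770
reads with adapter trimmed: 617114
bases trimmed due to adapters: 2440871
"""
counts = parse_filtering_result(log)
print(f"{pass_rate(counts):.1%} of reads passed")
```

The same numbers are also written to `fastp.json`, which may be a more robust source than the console log.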
[13]: cd ../VA1832-2021
[16]: ls
VA1832-2021_S5_L001_R1_001.fastq.gz
VA1832-2021_S5_L001_R2_001.fastq.gz
↪VA1832-2021_trimmed_R2_001.fastq.gz -t 4 -T 4 -f 21 -F 21 -w 16
Q20 bases: 429595266(83.4731%)
Q30 bases: 382325264(74.2883%)
Filtering result:
reads passed filter: 4909812
reads failed due to low quality: 489712
reads failed due to too many N: 0
reads failed due to too short: 2216
reads with adapter trimmed: 1332290
bases trimmed due to adapters: 4226519
fastp -i VA1832-2021_S5_L001_R1_001.fastq.gz -o
VA1832-2021_trimmed_R1_001.fastq.gz -I VA1832-2021_S5_L001_R2_001.fastq.gz -O
VA1832-2021_trimmed_R2_001.fastq.gz -t 4 -T 4 -f 21 -F 21 -w 16
fastp v0.24.1, time used: 27 seconds
[4]: cd ../VA1833-2021
[5]: ls
VA1833-2021_S6_L001_R1_001.fastq.gz
VA1833-2021_S6_L001_R2_001.fastq.gz
↪VA1833-2021_trimmed_R2_001.fastq.gz -t 2 -T 2 -f 20 -F 20 -w 16
total bases: 386657672
Q20 bases: 366630140(94.8203%)
Q30 bases: 354444014(91.6687%)
Filtering result:
reads passed filter: 3761006
reads failed due to low quality: 1023640
reads failed due to too many N: 0
reads failed due to too short: 1766
reads with adapter trimmed: 1187252
bases trimmed due to adapters: 6733581
fastp -i VA1833-2021_S6_L001_R1_001.fastq.gz -o
VA1833-2021_trimmed_R1_001.fastq.gz -I VA1833-2021_S6_L001_R2_001.fastq.gz -O
VA1833-2021_trimmed_R2_001.fastq.gz -t 2 -T 2 -f 20 -F 20 -w 16
fastp v0.24.1, time used: 21 seconds
[7]: cd ../VA1842-2021
[8]: ls
VA1842-2021_S7_L001_R1_001.fastq.gz
VA1842-2021_S7_L001_R2_001.fastq.gz
↪VA1842-2021_trimmed_R2_001.fastq.gz -t 2 -T 2 -f 20 -F 20 -w 16
total reads: 1180436
total bases: 273271603
Q20 bases: 220349001(80.6337%)
Q30 bases: 194221055(71.0725%)
Filtering result:
reads passed filter: 2147142
reads failed due to low quality: 212462
reads failed due to too many N: 0
reads failed due to too short: 1268
reads with adapter trimmed: 541744
bases trimmed due to adapters: 1799333
fastp -i VA1842-2021_S7_L001_R1_001.fastq.gz -o
VA1842-2021_trimmed_R1_001.fastq.gz -I VA1842-2021_S7_L001_R2_001.fastq.gz -O
VA1842-2021_trimmed_R2_001.fastq.gz -t 2 -T 2 -f 20 -F 20 -w 16
fastp v0.24.1, time used: 12 seconds
[13]: cd ../VA1887-2021
[14]: ls
VA1887-2021_S8_L001_R1_001.fastq.gz
VA1887-2021_S8_L001_R2_001.fastq.gz
↪VA1887-2021_trimmed_R2_001.fastq.gz -t 2 -T 2 -f 20 -F 20 -w 16
Read1 before filtering:
total reads: 2624018
total bases: 613320531
Q20 bases: 566930291(92.4362%)
Q30 bases: 540969342(88.2034%)
Filtering result:
reads passed filter: 4842164
reads failed due to low quality: 404348
reads failed due to too many N: 0
reads failed due to too short: 1524
reads with adapter trimmed: 1352518
bases trimmed due to adapters: 4118103
fastp -i VA1887-2021_S8_L001_R1_001.fastq.gz -o
VA1887-2021_trimmed_R1_001.fastq.gz -I VA1887-2021_S8_L001_R2_001.fastq.gz -O
VA1887-2021_trimmed_R2_001.fastq.gz -t 2 -T 2 -f 20 -F 20 -w 16
fastp v0.24.1, time used: 25 seconds
[16]: cd ../VA2040-2021
[17]: ls
VA2040-2021_S9_L001_R1_001.fastq.gz
VA2040-2021_S9_L001_R2_001.fastq.gz
↪VA2040-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 2 -T 2 -w 16
Filtering result:
reads passed filter: 2811334
reads failed due to low quality: 315550
reads failed due to too many N: 0
reads failed due to too short: 992
reads with adapter trimmed: 708540
bases trimmed due to adapters: 2559243
fastp -i VA2040-2021_S9_L001_R1_001.fastq.gz -o
VA2040-2021_S9_trimmed_001.fastq.gz -I VA2040-2021_S9_L001_R2_001.fastq.gz -O
VA2040-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 2 -T 2 -w 16
fastp v0.24.1, time used: 16 seconds
[19]: cd ../VA2041-2021
[20]: ls
VA2041-2021_S10_L001_R1_001.fastq.gz
VA2041-2021_S10_L001_R2_001.fastq.gz
↪VA2041-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 2 -T 2 -w 16
Filtering result:
reads passed filter: 3558078
reads failed due to low quality: 643304
reads failed due to too many N: 0
reads failed due to too short: 1232
reads with adapter trimmed: 904258
bases trimmed due to adapters: 4066578
Insert size peak (evaluated by paired-end reads): 330
fastp -i VA2041-2021_S10_L001_R1_001.fastq.gz -o
VA2041-2021_trimmed_R1_001.fastq.gz -I VA2041-2021_S10_L001_R2_001.fastq.gz -O
VA2041-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 2 -T 2 -w 16
fastp v0.24.1, time used: 20 seconds
[22]: cd ../VA2042-2021
[23]: ls
VA2042-2021_S1_L001_R1_001.fastq.gz
VA2042-2021_S1_L001_R2_001.fastq.gz
↪VA2042-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 2 -T 2 -w 16
Filtering result:
reads passed filter: 4522626
reads failed due to low quality: 486762
reads failed due to too many N: 0
reads failed due to too short: 1058
reads with adapter trimmed: 1083528
bases trimmed due to adapters: 3696114
fastp -i VA2042-2021_S1_L001_R1_001.fastq.gz -o
VA2042-2021_trimmed_R1_001.fastq.gz -I VA2042-2021_S1_L001_R2_001.fastq.gz -O
VA2042-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 2 -T 2 -w 16
fastp v0.24.1, time used: 25 seconds
[2]: cd VA2043-2021/
[3]: ls
VA2043-2021_S2_L001_R1_001.fastq.gz
VA2043-2021_S2_L001_R2_001.fastq.gz
↪VA2043-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 3 -T 2 -w 16
total bases: 301630509
Q20 bases: 231189257(76.6465%)
Q30 bases: 198357439(65.7617%)
Filtering result:
reads passed filter: 2849284
reads failed due to low quality: 542520
reads failed due to too many N: 0
reads failed due to too short: 1378
reads with adapter trimmed: 672104
bases trimmed due to adapters: 3100420
fastp -i VA2043-2021_S2_L001_R1_001.fastq.gz -o
VA2043-2021_trimmed_R1_001.fastq.gz -I VA2043-2021_S2_L001_R2_001.fastq.gz -O
VA2043-2021_trimmed_R2_001.fastq.gz -f 20 -F 20 -t 3 -T 2 -w 16
fastp v0.24.1, time used: 16 seconds
[5]: cd ../VA2092-2021
[6]: ls
VA2092-2021-2_S2_L001_R1_001.fastq.gz
VA2092-2021_S1_L001_R1_001.fastq.gz
VA2092-2021-2_S2_L001_R2_001.fastq.gz
VA2092-2021_S1_L001_R2_001.fastq.gz
In this case, since there are several R1 and R2 read files, we will do the quality trimming with fastp and then
use "cat" to join all the R1 files and all the R2 files into their respective files.
[8]: fastp -i VA2092-2021-2_S2_L001_R1_001.fastq.gz -o VA2092-2021-2_S2_trimmed_R1_001.fastq.gz -I VA2092-2021-2_S2_L001_R2_001.fastq.gz -O VA2092-2021-2_S2_trimmed_R2_001.fastq.gz -f 20 -F 20 -T 1 -t 1 -w 16
total reads: 1222381
total bases: 306817631
Q20 bases: 262289307(85.487%)
Q30 bases: 243888111(79.4896%)
Filtering result:
reads passed filter: 2332528
reads failed due to low quality: 112234
reads failed due to too many N: 0
reads failed due to too short: 0
reads with adapter trimmed: 670326
bases trimmed due to adapters: 48590492
fastp -i VA2092-2021-2_S2_L001_R1_001.fastq.gz -o
VA2092-2021-2_S2_trimmed_R1_001.fastq.gz -I
VA2092-2021-2_S2_L001_R2_001.fastq.gz -O
VA2092-2021-2_S2_trimmed_R2_001.fastq.gz -f 20 -F 20 -T 1 -t 1 -w 16
fastp v0.24.1, time used: 13 seconds
[9]: ls
fastp.html
VA2092-2021-2_S2_trimmed_R1_001.fastq.gz
fastp.json
VA2092-2021-2_S2_trimmed_R2_001.fastq.gz
VA2092-2021-2_S2_L001_R1_001.fastq.gz
VA2092-2021_S1_L001_R1_001.fastq.gz
VA2092-2021-2_S2_L001_R2_001.fastq.gz
VA2092-2021_S1_L001_R2_001.fastq.gz
[10]: fastp -i VA2092-2021_S1_L001_R1_001.fastq.gz -o VA2092-2021_S1_trimmed_R1_001.fastq.gz -I VA2092-2021_S1_L001_R2_001.fastq.gz -O VA2092-2021_S1_trimmed_R2_001.fastq.gz -f 29 -F 21 -t 1 -T 1 -w 16
Filtering result:
reads passed filter: 2511012
reads failed due to low quality: 117484
reads failed due to too many N: 0
reads failed due to too short: 0
reads with adapter trimmed: 695548
bases trimmed due to adapters: 41471124
fastp -i VA2092-2021_S1_L001_R1_001.fastq.gz -o
VA2092-2021_S1_trimmed_R1_001.fastq.gz -I VA2092-2021_S1_L001_R2_001.fastq.gz -O
VA2092-2021_S1_trimmed_R2_001.fastq.gz -f 29 -F 21 -t 1 -T 1 -w 16
fastp v0.24.1, time used: 14 seconds
[11]: ls
fastp.html
fastp.json
VA2092-2021-2_S2_L001_R1_001.fastq.gz
VA2092-2021-2_S2_L001_R2_001.fastq.gz
VA2092-2021-2_S2_trimmed_R1_001.fastq.gz
VA2092-2021-2_S2_trimmed_R2_001.fastq.gz
VA2092-2021_S1_L001_R1_001.fastq.gz
VA2092-2021_S1_L001_R2_001.fastq.gz
VA2092-2021_S1_trimmed_R1_001.fastq.gz
VA2092-2021_S1_trimmed_R2_001.fastq.gz
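With both runs of this sample trimmed, the remaining step of the plan is to join all R1 files and all R2 files with "cat". Because a gzip file may consist of several concatenated members, the compressed files can be appended byte for byte without decompressing them. The sketch below does the same thing in Python; `concatenate_gzip` is a hypothetical helper, and the merged output names in the usage note are placeholders, not names used in the notebook.

```python
import shutil


def concatenate_gzip(parts, destination):
    """Append the given .gz files, in order, into one multi-member .gz file.

    Equivalent to `cat part1 part2 > destination`. The R1 and R2 lists must be
    given in the same run order so that read pairs stay in sync.
    """
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

For this sample the call could look like `concatenate_gzip(["VA2092-2021_S1_trimmed_R1_001.fastq.gz", "VA2092-2021-2_S2_trimmed_R1_001.fastq.gz"], "VA2092-2021_merged_R1.fastq.gz")`, followed by the same call for the R2 files in the same order.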
[12]: cd ../VA2093-2021
[13]: ls
VA2093-2021-2_S4_L001_R1_001.fastq.gz
VA2093-2021_S3_L001_R1_001.fastq.gz
VA2093-2021-2_S4_L001_R2_001.fastq.gz
VA2093-2021_S3_L001_R2_001.fastq.gz
↪fastq.gz -O VA2093-2021-2_S4_trimmed_R2_001.fastq.gz -f 21 -F 20 -t 1 -T 1␣
↪-w 16
Q20 bases: 224750643(87.352%)
Q30 bases: 209136476(81.2834%)
Filtering result:
reads passed filter: 2431346
reads failed due to low quality: 148242
reads failed due to too many N: 0
reads failed due to too short: 0
reads with adapter trimmed: 658818
bases trimmed due to adapters: 44733442
fastp -i VA2093-2021-2_S4_L001_R1_001.fastq.gz -o
VA2093-2021-2_S4_trimmed_R1_001.fastq.gz -I
VA2093-2021-2_S4_L001_R2_001.fastq.gz -O
VA2093-2021-2_S4_trimmed_R2_001.fastq.gz -f 21 -F 20 -t 1 -T 1 -w 16
fastp v0.24.1, time used: 14 seconds
[17]: cd ../VA2095-2021
[18]: ls
VA2095-2021-2_S6_L001_R1_001.fastq.gz
VA2095-2021_S5_L001_R1_001.fastq.gz
VA2095-2021-2_S6_L001_R2_001.fastq.gz
VA2095-2021_S5_L001_R2_001.fastq.gz
↪fastq.gz -O VA2095-2021-2_S6_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 2 -T 2␣
↪-w 16
Q30 bases: 303304491(82.1061%)
Filtering result:
reads passed filter: 2821482
reads failed due to low quality: 121988
reads failed due to too many N: 0
reads failed due to too short: 0
reads with adapter trimmed: 744196
bases trimmed due to adapters: 51151414
fastp -i VA2095-2021-2_S6_L001_R1_001.fastq.gz -o
VA2095-2021-2_S6_trimmed_R1_001.fastq.gz -I
VA2095-2021-2_S6_L001_R2_001.fastq.gz -O
VA2095-2021-2_S6_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 2 -T 2 -w 16
fastp v0.24.1, time used: 16 seconds
↪VA2095-2021_S5_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 2 -T 2 -w 16
Q20 bases: 292103599(75.3354%)
Q30 bases: 256014729(66.0278%)
Filtering result:
reads passed filter: 2760188
reads failed due to low quality: 329354
reads failed due to too many N: 0
reads failed due to too short: 0
reads with adapter trimmed: 731530
bases trimmed due to adapters: 51053894
fastp -i VA2095-2021_S5_L001_R1_001.fastq.gz -o
VA2095-2021_S5_trimmed_R1_001.fastq.gz -I VA2095-2021_S5_L001_R2_001.fastq.gz -O
VA2095-2021_S5_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 2 -T 2 -w 16
fastp v0.24.1, time used: 16 seconds
[21]: cd ../VA3232-2021
[22]: ls
VA3232-2021_S5_L001_R1_001.fastq.gz
VA3232-2021_S5_L001_R2_001.fastq.gz
↪VA3232-2021_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 4 -w 16
total bases: 301026002
Q20 bases: 284821156(94.6168%)
Q30 bases: 277725348(92.2596%)
Filtering result:
reads passed filter: 3576430
reads failed due to low quality: 177416
reads failed due to too many N: 0
reads failed due to too short: 6526
reads with adapter trimmed: 3021114
bases trimmed due to adapters: 13575259
fastp -i VA3232-2021_S5_L001_R1_001.fastq.gz -o
VA3232-2021_trimmed_R1_001.fastq.gz -I VA3232-2021_S5_L001_R2_001.fastq.gz -O
VA3232-2021_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 4 -w 16
fastp v0.24.1, time used: 16 seconds
[25]: cd ../VA4087-2021
[26]: ls
VA4087-21c3_S1_L001_R1_001.fastq.gz
VA4087-21c4_S2_L001_R1_001.fastq.gz
VA4087-21c3_S1_L001_R2_001.fastq.gz
VA4087-21c4_S2_L001_R2_001.fastq.gz
↪VA4087-21c3_S1_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
Filtering result:
reads passed filter: 4456094
reads failed due to low quality: 50746
reads failed due to too many N: 0
reads failed due to too short: 16436
reads with adapter trimmed: 2291498
bases trimmed due to adapters: 460571
fastp -i VA4087-21c3_S1_L001_R1_001.fastq.gz -o
VA4087-21c3_S1_trimmed_R1_001.fastq.gz -I VA4087-21c3_S1_L001_R2_001.fastq.gz -O
VA4087-21c3_S1_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
fastp v0.24.1, time used: 13 seconds
↪VA4087-21c4_S2_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
Filtering result:
reads passed filter: 3896120
reads failed due to low quality: 47300
reads failed due to too many N: 0
reads failed due to too short: 14842
reads with adapter trimmed: 2056876
bases trimmed due to adapters: 472512
fastp -i VA4087-21c4_S2_L001_R1_001.fastq.gz -o
VA4087-21c4_S2_trimmed_R1_001.fastq.gz -I VA4087-21c4_S2_L001_R2_001.fastq.gz -O
VA4087-21c4_S2_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
fastp v0.24.1, time used: 11 seconds
[29]: cd ../VA4090-2021
[30]: ls
VA4090-21c1_S1_L001_R1_001.fastq.gz
VA4090-21c2_S2_L001_R1_001.fastq.gz
VA4090-21c1_S1_L001_R2_001.fastq.gz
VA4090-21c2_S2_L001_R2_001.fastq.gz
↪VA4090-21c1_S1_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
Filtering result:
reads passed filter: 4409170
reads failed due to low quality: 49526
reads failed due to too many N: 0
reads failed due to too short: 5082
reads with adapter trimmed: 1662504
bases trimmed due to adapters: 376459
Duplication rate: 2.41226%
fastp -i VA4090-21c1_S1_L001_R1_001.fastq.gz -o
VA4090-21c1_S1_trimmed_R1_001.fastq.gz -I VA4090-21c1_S1_L001_R2_001.fastq.gz -O
VA4090-21c1_S1_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
fastp v0.24.1, time used: 15 seconds
↪VA4090-21c2_S2_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
Filtering result:
reads passed filter: 4938762
reads failed due to low quality: 50150
reads failed due to too many N: 0
reads failed due to too short: 5336
reads with adapter trimmed: 2009198
bases trimmed due to adapters: 416879
Duplication rate: 2.22232%
fastp -i VA4090-21c2_S2_L001_R1_001.fastq.gz -o
VA4090-21c2_S2_trimmed_R1_001.fastq.gz -I VA4090-21c2_S2_L001_R2_001.fastq.gz -O
VA4090-21c2_S2_trimmed_R2_001.fastq.gz -f 21 -F 21 -t 3 -T 3 -w 16
fastp v0.24.1, time used: 16 seconds
[34]: cd ../VA541-2022
[35]: ls
VA541-2022-2_R1.fastq.gz VA541-2022-2_R2.fastq.gz
VA541-2022_R1.fastq.gz
VA541-2022.2_R1.fastq.gz VA541-2022.2_R2.fastq.gz
VA541-2022_R2.fastq.gz
↪1 -T 1 -w 16
Filtering result:
reads passed filter: 6848902
reads failed due to low quality: 481038
reads failed due to too many N: 3316
reads failed due to too short: 7568
reads with adapter trimmed: 3307802
bases trimmed due to adapters: 69689585
↪1 -T 1 -w 16
Filtering result:
reads passed filter: 21192172
reads failed due to low quality: 1478618
reads failed due to too many N: 10298
reads failed due to too short: 22674
reads with adapter trimmed: 10329036
bases trimmed due to adapters: 217609677
↪1 -w 16
Filtering result:
reads passed filter: 14343270
reads failed due to low quality: 997580
reads failed due to too many N: 6982
reads failed due to too short: 15106
reads with adapter trimmed: 7021234
bases trimmed due to adapters: 147920092
[41]: cd ../VA542-2022
[42]: ls
VA542-2022-2_R1.fastq.gz VA542-2022-2_R2.fastq.gz
VA542-2022_R1.fastq.gz
VA542-2022.2_R1.fastq.gz VA542-2022.2_R2.fastq.gz
VA542-2022_R2.fastq.gz
↪1 -T 1 -w 16
total bases: 322117785
Q20 bases: 302971986(94.0563%)
Q30 bases: 283713637(88.0776%)
Filtering result:
reads passed filter: 5544252
reads failed due to low quality: 410706
reads failed due to too many N: 2822
reads failed due to too short: 4944
reads with adapter trimmed: 2683728
bases trimmed due to adapters: 48406330
↪1 -T 1 -w 16
total bases: 1067991438
Q20 bases: 1005506002(94.1493%)
Q30 bases: 941410401(88.1477%)
Filtering result:
reads passed filter: 18383530
reads failed due to low quality: 1352900
reads failed due to too many N: 9410
reads failed due to too short: 17466
reads with adapter trimmed: 8898500
bases trimmed due to adapters: 160485780
↪1 -w 16
total bases: 745873653
Q20 bases: 702534016(94.1894%)
Q30 bases: 657696764(88.178%)
Filtering result:
reads passed filter: 12839278
reads failed due to low quality: 942194
reads failed due to too many N: 6588
reads failed due to too short: 12522
reads with adapter trimmed: 6214772
bases trimmed due to adapters: 112079450
[47]: cd ../VA543-2022
[48]: ls
VA543-2022-2_R1.fastq.gz VA543-2022-2_R2.fastq.gz
VA543-2022_R1.fastq.gz
VA543-2022.2_R1.fastq.gz VA543-2022.2_R2.fastq.gz
VA543-2022_R2.fastq.gz
↪1 -T 1 -w 16
Read2 before filtering:
total reads: 2382669
total bases: 359783019
Q20 bases: 330784846(91.9401%)
Q30 bases: 302508577(84.0808%)
Filtering result:
reads passed filter: 4417702
reads failed due to low quality: 340998
reads failed due to too many N: 2198
reads failed due to too short: 4440
reads with adapter trimmed: 2135090
bases trimmed due to adapters: 39061341
↪1 -T 1 -w 16
Read2 before filtering:
total reads: 9056436
total bases: 1367521836
Q20 bases: 1259863586(92.1275%)
Q30 bases: 1153549476(84.3533%)
Filtering result:
reads passed filter: 16814706
reads failed due to low quality: 1273186
reads failed due to too many N: 8256
reads failed due to too short: 16724
reads with adapter trimmed: 8087750
bases trimmed due to adapters: 146753539
↪1 -w 16
Read2 before filtering:
total reads: 6673767
total bases: 1007738817
Q20 bases: 929078740(92.1944%)
Q30 bases: 851040899(84.4505%)
Filtering result:
reads passed filter: 12397004
reads failed due to low quality: 932188
reads failed due to too many N: 6058
reads failed due to too short: 12284
reads with adapter trimmed: 5952660
bases trimmed due to adapters: 107692198