SUMMARISING RUN PARAMETERS
==========================
Input filename: /sibcb2/bioinformatics2/heshutao/raw_data/cup/20201028_merged/RRBS20A041685_val_1.fq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.2
Cutadapt version: 2.6
Number of cores used for trimming: 1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
File was specified to be an MspI-digested RRBS sample. Read 1 sequences with adapter contamination will be trimmed a further 2 bp from their 3' end, and Read 2 sequences will be trimmed by 2 bp from their 5' end to remove potential methylation-biased bases from the end-repair reaction
All Read 2 sequences will be trimmed by 2 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
Output file will be GZIP compressed


This is cutadapt 2.6 with Python 3.6.7
Command line parameters: -j 1 -e 0.1 -O 1 -a AGATCGGAAGAGC /sibcb2/bioinformatics2/heshutao/raw_data/cup/BS_workflow/20201028_merged/tmp/ab50e80e-3dd8-11eb-bd4d-6c92bfc12c98/trimmed/RRBS20A041685_val_1.fq.gz_qual_trimmed.fastq
Processing reads on 1 core in single-end mode ...
Finished in 2.32 s (23 us/read; 2.63 M reads/minute).

=== Summary ===

Total reads processed:                 101,608
Reads with adapters:                    42,035 (41.4%)
Reads written (passing filters):       101,608 (100.0%)

Total basepairs processed:    10,416,134 bp
Total written (filtered):     10,286,255 bp (98.8%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 42035 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 30.3%
  C: 1.0%
  G: 21.2%
  T: 47.5%
  none/other: 0.0%

Overview of removed sequences
length  count   expect   max.err  error counts
1       29786   25402.0  0        29786
2       8753    6350.5   0        8753
3       1981    1587.6   0        1981
4       481     396.9    0        481
5       21      99.2     0        21
6       17      24.8     0        17
7       22      6.2      0        22
8       6       1.6      0        6
9       15      0.4      0        12 3
10      22      0.1      1        8 14
11      12      0.0      1        3 9
12      6       0.0      1        0 6
13      4       0.0      1        0 4
14      20      0.0      1        4 16
15      10      0.0      1        3 7
16      38      0.0      1        4 34
17      17      0.0      1        3 14
18      13      0.0      1        0 13
20      6       0.0      1        2 4
22      1       0.0      1        0 1
23      7       0.0      1        1 6
24      20      0.0      1        2 18
25      4       0.0      1        0 4
28      16      0.0      1        0 16
29      14      0.0      1        0 14
30      6       0.0      1        1 5
31      2       0.0      1        0 2
32      12      0.0      1        2 10
33      7       0.0      1        0 7
34      11      0.0      1        4 7
35      5       0.0      1        0 5
36      13      0.0      1        2 11
37      15      0.0      1        4 11
38      2       0.0      1        1 1
39      4       0.0      1        1 3
40      11      0.0      1        1 10
41      9       0.0      1        0 9
42      15      0.0      1        2 13
43      36      0.0      1        6 30
44      1       0.0      1        0 1
45      6       0.0      1        0 6
46      4       0.0      1        1 3
47      8       0.0      1        1 7
48      15      0.0      1        2 13
50      9       0.0      1        1 8
51      2       0.0      1        0 2
52      2       0.0      1        0 2
53      8       0.0      1        1 7
54      14      0.0      1        2 12
55      12      0.0      1        2 10
56      3       0.0      1        0 3
57      5       0.0      1        1 4
58      5       0.0      1        1 4
59      1       0.0      1        0 1
60      5       0.0      1        0 5
61      7       0.0      1        1 6
62      23      0.0      1        4 19
63      4       0.0      1        1 3
64      1       0.0      1        1
66      4       0.0      1        0 4
68      4       0.0      1        3 1
69      4       0.0      1        0 4
70      10      0.0      1        0 10
71      1       0.0      1        0 1
73      1       0.0      1        0 1
74      4       0.0      1        0 4
75      3       0.0      1        0 3
76      1       0.0      1        0 1
77      2       0.0      1        1 1
78      4       0.0      1        0 4
79      2       0.0      1        0 2
80      2       0.0      1        1 1
82      5       0.0      1        1 4
83      2       0.0      1        0 2
84      4       0.0      1        0 4
85      4       0.0      1        0 4
86      1       0.0      1        1
87      1       0.0      1        0 1
88      2       0.0      1        0 2
90      1       0.0      1        0 1
91      1       0.0      1        0 1
92      5       0.0      1        2 3
93      5       0.0      1        0 5
94      13      0.0      1        1 12
95      2       0.0      1        0 2
96      1       0.0      1        0 1
97      17      0.0      1        3 14
100     2       0.0      1        0 2
101     1       0.0      1        0 1
102     2       0.0      1        0 2
103     1       0.0      1        0 1
105     1       0.0      1        0 1
106     3       0.0      1        1 2
137     11      0.0      1        11
138     333     0.0      1        328 5


RUN STATISTICS FOR INPUT FILE: /sibcb2/bioinformatics2/heshutao/raw_data/cup/20201028_merged/RRBS20A041685_val_1.fq.gz
=============================================
101608 sequences processed in total
Sequences were truncated to a varying degree because of deteriorating qualities (Phred score quality cutoff: 20): 4913 (4.8%)
RRBS reads trimmed by additional 2 bp when adapter contamination was detected: 42035 (41.4%)
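
Note on the "expect" and "max.err" columns in the overview of removed sequences: the reported values are consistent with a simple model in which each base matches by chance with probability 1/4 and the allowed error count is the error rate times the matched length, rounded down. The Python sketch below reproduces a few rows under that assumption. The function names are illustrative and are not part of cutadapt, and capping the length at the adapter length (13) is an assumption inferred from max.err staying at 1 for longer removed sequences.

```python
# Sketch: reproduce the "expect" and "max.err" columns of the cutadapt
# "Overview of removed sequences" table. Function names are hypothetical.


def expected_random_matches(n_reads: int, length: int, adapter_len: int) -> float:
    """Number of reads expected to match `length` adapter bases by chance,
    assuming the four bases are equally likely (an assumption)."""
    return n_reads * 0.25 ** min(length, adapter_len)


def max_errors(length: int, adapter_len: int, error_rate: float) -> int:
    """Allowed errors for a match of `length` bases: the error rate times the
    matched length (capped at the adapter length), rounded down."""
    return int(error_rate * min(length, adapter_len))


if __name__ == "__main__":
    # Values taken from the report: total reads, adapter length, -e 0.1.
    n_reads, adapter_len, error_rate = 101_608, 13, 0.1
    for length in (1, 2, 3, 4, 9, 10, 13, 138):
        expect = round(expected_random_matches(n_reads, length, adapter_len), 1)
        print(length, expect, max_errors(length, adapter_len, error_rate))
```

Comparing "count" against "expect" is a quick way to judge whether short matches are likely real adapter remnants: with a stringency of 1 bp, the 29786 one-base removals are only modestly above the roughly 25402 expected by chance.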