SUMMARISING RUN PARAMETERS
==========================
Input filename: /sibcb2/bioinformatics2/heshutao/processing/cup/20201120/RRBS20A041604_val_2.fq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.2
Cutadapt version: 2.6
Number of cores used for trimming: 1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
File was specified to be an MspI-digested RRBS sample. Read 1 sequences with adapter contamination will be trimmed a further 2 bp from their 3' end, and Read 2 sequences will be trimmed by 2 bp from their 5' end to remove potential methylation-biased bases from the end-repair reaction
All Read 2 sequences will be trimmed by 2 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
Output file will be GZIP compressed


This is cutadapt 2.6 with Python 3.6.7
Command line parameters: -j 1 -e 0.1 -O 1 -a AGATCGGAAGAGC /sibcb2/bioinformatics2/heshutao/processing/cup/BS_workflow/20201120/tmp/a5a39dd2-5572-11eb-a07c-6c92bfc12ecc/trimmed/RRBS20A041604_val_2.fq.gz_qual_trimmed.fastq
Processing reads on 1 core in single-end mode ...
Finished in 294.85 s (23 us/read; 2.58 M reads/minute).

=== Summary ===

Total reads processed:              12,691,001
Reads with adapters:                 4,900,081 (38.6%)
Reads written (passing filters):    12,691,001 (100.0%)

Total basepairs processed: 1,318,185,118 bp
Total written (filtered):  1,259,355,957 bp (95.5%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 4900081 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 28.5%
  C: 63.6%
  G: 1.0%
  T: 7.0%
  none/other: 0.0%

Overview of removed sequences
length	count	expect	max.err	error counts
1	4437245	3172750.2	0	4437245
2	46082	793187.6	0	46082
3	9041	198296.9	0	9041
4	2920	49574.2	0	2920
5	679	12393.6	0	679
6	204	3098.4	0	204
7	643	774.6	0	643
8	409	193.6	0	409
9	156	48.4	0	143 13
10	1464	12.1	1	564 900
11	103	3.0	1	43 60
12	1014	0.8	1	226 788
13	1032	0.2	1	257 775
14	152	0.2	1	5 147
15	106	0.2	1	5 101
16	60	0.2	1	4 56
17	147	0.2	1	4 143
18	65	0.2	1	2 63
19	163	0.2	1	11 152
20	129	0.2	1	12 117
21	21	0.2	1	5 16
22	94	0.2	1	6 88
23	130	0.2	1	8 122
24	406	0.2	1	18 388
25	129	0.2	1	6 123
26	160	0.2	1	9 151
27	59	0.2	1	3 56
28	173	0.2	1	21 152
29	8	0.2	1	0 8
30	117	0.2	1	11 106
31	126	0.2	1	13 113
32	162	0.2	1	3 159
33	292	0.2	1	18 274
34	73	0.2	1	6 67
35	133	0.2	1	11 122
36	19	0.2	1	2 17
37	151	0.2	1	6 145
38	39	0.2	1	3 36
39	91	0.2	1	3 88
40	66	0.2	1	2 64
41	290	0.2	1	7 283
42	152	0.2	1	9 143
43	73	0.2	1	2 71
44	114	0.2	1	7 107
45	216	0.2	1	16 200
46	108	0.2	1	4 104
47	45	0.2	1	3 42
48	167	0.2	1	12 155
49	88	0.2	1	5 83
50	139	0.2	1	11 128
51	179	0.2	1	9 170
52	179	0.2	1	18 161
53	29	0.2	1	3 26
54	129	0.2	1	6 123
55	72	0.2	1	6 66
56	22	0.2	1	6 16
57	88	0.2	1	8 80
58	266	0.2	1	24 242
59	73	0.2	1	7 66
60	127	0.2	1	29 98
61	192	0.2	1	68 124
62	303	0.2	1	159 144
63	254	0.2	1	154 100
64	104	0.2	1	41 63
65	125	0.2	1	24 101
66	343	0.2	1	32 311
67	158	0.2	1	12 146
68	126	0.2	1	15 111
69	92	0.2	1	13 79
70	99	0.2	1	16 83
71	99	0.2	1	20 79
72	96	0.2	1	22 74
73	126	0.2	1	26 100
74	156	0.2	1	24 132
75	163	0.2	1	12 151
76	156	0.2	1	4 152
77	124	0.2	1	3 121
78	118	0.2	1	9 109
79	139	0.2	1	6 133
80	88	0.2	1	9 79
81	75	0.2	1	7 68
82	111	0.2	1	7 104
83	117	0.2	1	7 110
84	95	0.2	1	10 85
85	119	0.2	1	10 109
86	55	0.2	1	5 50
87	47	0.2	1	5 42
88	47	0.2	1	8 39
89	50	0.2	1	7 43
90	38	0.2	1	4 34
91	37	0.2	1	7 30
92	31	0.2	1	0 31
93	35	0.2	1	0 35
94	39	0.2	1	3 36
95	17	0.2	1	2 15
96	24	0.2	1	1 23
97	31	0.2	1	3 28
98	24	0.2	1	0 24
99	33	0.2	1	3 30
100	24	0.2	1	0 24
101	19	0.2	1	0 19
102	32	0.2	1	1 31
103	16	0.2	1	1 15
104	5	0.2	1	0 5
105	7	0.2	1	0 7
106	5	0.2	1	1 4
107	2	0.2	1	1 1
108	6	0.2	1	0 6
109	1	0.2	1	0 1
110	3	0.2	1	1 2
111	2	0.2	1	1 1
112	3	0.2	1	2 1
113	4	0.2	1	2 2
114	10	0.2	1	5 5
115	4	0.2	1	4
116	18	0.2	1	11 7
117	7	0.2	1	7
118	3	0.2	1	2 1
119	3	0.2	1	3
120	3	0.2	1	3
121	3	0.2	1	2 1
122	1	0.2	1	1
123	7	0.2	1	7
124	6	0.2	1	6
125	9	0.2	1	9
126	19	0.2	1	18 1
127	29	0.2	1	28 1
128	40	0.2	1	39 1
129	44	0.2	1	42 2
130	31	0.2	1	31
131	40	0.2	1	37 3
132	24	0.2	1	24
133	58	0.2	1	52 6
134	75	0.2	1	71 4
135	214	0.2	1	199 15
136	575	0.2	1	539 36
137	5231	0.2	1	5072 159
138	382593	0.2	1	376081 6512


RUN STATISTICS FOR INPUT FILE: /sibcb2/bioinformatics2/heshutao/processing/cup/20201120/RRBS20A041604_val_2.fq.gz
=============================================
12691001 sequences processed in total
Sequences were truncated to a varying degree because of deteriorating qualities (Phred score quality cutoff: 20):	615528 (4.9%)
RRBS reads trimmed by additional 2 bp when adapter contamination was detected:	0 (0.0%)

Total number of sequences analysed for the sequence pair length validation: 12691001

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 413468 (3.26%)
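
The percentages and the "expect" column in this report can be re-derived from the raw counts. The sketch below is not part of the Trim Galore or cutadapt output. It only recomputes figures from numbers copied out of the report. The constant and function names are hypothetical, and the formula for "expect" (total reads times 0.25 to the power of the match length, capped at the adapter length) is an assumption that happens to agree with the values printed above, not something the report itself states.

```python
# Counts copied verbatim from the report; the names are illustrative only.
TOTAL_READS = 12_691_001
READS_WITH_ADAPTERS = 4_900_081
BP_PROCESSED = 1_318_185_118
BP_WRITTEN = 1_259_355_957
QUALITY_TRIMMED = 615_528
PAIRS_REMOVED = 413_468
ADAPTER_LENGTH = 13  # length of AGATCGGAAGAGC


def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


def expected_random_matches(length, total_reads=TOTAL_READS,
                            adapter_length=ADAPTER_LENGTH):
    """Estimate how many reads would match `length` adapter bases by chance.

    Assumption: each base matches with probability 0.25, and the match
    length is capped at the adapter length, which is why the "expect"
    column stays at 0.2 for every row from length 13 onward.
    """
    return total_reads * 0.25 ** min(length, adapter_length)


if __name__ == "__main__":
    print("Reads with adapters (%):", percent(READS_WITH_ADAPTERS, TOTAL_READS))
    print("Basepairs written (%):  ", percent(BP_WRITTEN, BP_PROCESSED))
    print("Quality-trimmed (%):    ", percent(QUALITY_TRIMMED, TOTAL_READS))
    print("Pairs removed (%):      ", percent(PAIRS_REMOVED, TOTAL_READS, 2))
    for length in (2, 3, 10, 138):
        print(length, format(expected_random_matches(length), ".1f"))
```

Comparing "count" against "expect" is one way to read the table: at length 1 the observed count (4437245) is of the same order as the chance expectation (3172750.2), whereas at length 138 the observed count (382593) exceeds the chance expectation (0.2) by many orders of magnitude.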