SUMMARISING RUN PARAMETERS
==========================
Input filename: /sibcb2/bioinformatics2/heshutao/processing/cup/20201120/RRBS20A041687_val_2.fq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.2
Cutadapt version: 2.6
Number of cores used for trimming: 1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
File was specified to be an MspI-digested RRBS sample. Read 1 sequences with adapter contamination will be trimmed a further 2 bp from their 3' end, and Read 2 sequences will be trimmed by 2 bp from their 5' end to remove potential methylation-biased bases from the end-repair reaction
All Read 2 sequences will be trimmed by 2 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
Output file will be GZIP compressed


This is cutadapt 2.6 with Python 3.6.7
Command line parameters: -j 1 -e 0.1 -O 1 -a AGATCGGAAGAGC /sibcb2/bioinformatics2/heshutao/processing/cup/BS_workflow/20201120/tmp/a6f6756a-5572-11eb-976b-6c92bfc12788/trimmed/RRBS20A041687_val_2.fq.gz_qual_trimmed.fastq
Processing reads on 1 core in single-end mode ...
Finished in 477.56 s (25 us/read; 2.43 M reads/minute).

=== Summary ===

Total reads processed:              19,375,386
Reads with adapters:                 7,911,446 (40.8%)
Reads written (passing filters):    19,375,386 (100.0%)

Total basepairs processed: 2,011,032,043 bp
Total written (filtered):  1,919,640,686 bp (95.5%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 7911446 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.2%
  C: 51.8%
  G: 9.0%
  T: 10.0%
  none/other: 0.0%

Overview of removed sequences
length  count    expect     max.err  error counts
1       7162367  4843846.5  0        7162367
2       93140    1210961.6  0        93140
3       10160    302740.4   0        10160
4       6936     75685.1    0        6936
5       2021     18921.3    0        2021
6       1063     4730.3     0        1063
7       1745     1182.6     0        1745
8       663      295.6      0        663
9       337      73.9       0        286 51
10      3832     18.5       1        744 3088
11      152      4.6        1        15 137
12      2870     1.2        1        470 2400
13      603      0.3        1        65 538
14      801      0.3        1        53 748
15      298      0.3        1        15 283
16      355      0.3        1        26 329
17      574      0.3        1        33 541
18      261      0.3        1        12 249
19      671      0.3        1        41 630
20      484      0.3        1        28 456
21      46       0.3        1        17 29
22      392      0.3        1        39 353
23      588      0.3        1        39 549
24      1311     0.3        1        80 1231
25      454      0.3        1        25 429
26      711      0.3        1        43 668
27      266      0.3        1        24 242
28      493      0.3        1        53 440
29      54       0.3        1        2 52
30      414      0.3        1        38 376
31      441      0.3        1        44 397
32      479      0.3        1        38 441
33      1129     0.3        1        75 1054
34      277      0.3        1        32 245
35      511      0.3        1        36 475
36      97       0.3        1        7 90
37      572      0.3        1        35 537
38      155      0.3        1        4 151
39      222      0.3        1        13 209
40      189      0.3        1        16 173
41      346      0.3        1        16 330
42      439      0.3        1        23 416
43      194      0.3        1        4 190
44      305      0.3        1        23 282
45      611      0.3        1        47 564
46      301      0.3        1        16 285
47      142      0.3        1        13 129
48      528      0.3        1        49 479
49      243      0.3        1        23 220
50      445      0.3        1        46 399
51      571      0.3        1        51 520
52      489      0.3        1        42 447
53      99       0.3        1        9 90
54      252      0.3        1        21 231
55      244      0.3        1        24 220
56      54       0.3        1        3 51
57      195      0.3        1        20 175
58      431      0.3        1        33 398
59      231      0.3        1        25 206
60      296      0.3        1        43 253
61      375      0.3        1        83 292
62      453      0.3        1        158 295
63      444      0.3        1        174 270
64      169      0.3        1        40 129
65      206      0.3        1        35 171
66      499      0.3        1        49 450
67      281      0.3        1        21 260
68      230      0.3        1        19 211
69      197      0.3        1        23 174
70      228      0.3        1        28 200
71      205      0.3        1        25 180
72      182      0.3        1        23 159
73      177      0.3        1        22 155
74      177      0.3        1        15 162
75      438      0.3        1        29 409
76      460      0.3        1        50 410
77      243      0.3        1        18 225
78      134      0.3        1        13 121
79      161      0.3        1        16 145
80      188      0.3        1        12 176
81      157      0.3        1        11 146
82      165      0.3        1        14 151
83      189      0.3        1        21 168
84      217      0.3        1        20 197
85      203      0.3        1        22 181
86      132      0.3        1        13 119
87      117      0.3        1        14 103
88      115      0.3        1        6 109
89      121      0.3        1        8 113
90      112      0.3        1        10 102
91      87       0.3        1        6 81
92      86       0.3        1        6 80
93      128      0.3        1        17 111
94      113      0.3        1        11 102
95      78       0.3        1        8 70
96      60       0.3        1        3 57
97      62       0.3        1        4 58
98      50       0.3        1        4 46
99      95       0.3        1        13 82
100     65       0.3        1        4 61
101     34       0.3        1        4 30
102     78       0.3        1        7 71
103     38       0.3        1        1 37
104     23       0.3        1        1 22
105     26       0.3        1        1 25
106     14       0.3        1        1 13
107     13       0.3        1        2 11
108     19       0.3        1        2 17
109     26       0.3        1        3 23
110     19       0.3        1        3 16
111     4        0.3        1        1 3
112     3        0.3        1        0 3
113     4        0.3        1        2 2
114     5        0.3        1        1 4
115     6        0.3        1        6
116     13       0.3        1        11 2
117     4        0.3        1        3 1
118     7        0.3        1        6 1
119     6        0.3        1        5 1
120     7        0.3        1        6 1
121     2        0.3        1        2
122     6        0.3        1        6
123     8        0.3        1        8
124     10       0.3        1        9 1
125     11       0.3        1        9 2
126     13       0.3        1        13
127     21       0.3        1        20 1
128     28       0.3        1        23 5
129     30       0.3        1        27 3
130     35       0.3        1        31 4
131     34       0.3        1        30 4
132     46       0.3        1        35 11
133     76       0.3        1        61 15
134     71       0.3        1        68 3
135     251      0.3        1        224 27
136     666      0.3        1        597 69
137     18131    0.3        1        17632 499
138     579308   0.3        1        568186 11122
139     1        0.3        1        1


RUN STATISTICS FOR INPUT FILE: /sibcb2/bioinformatics2/heshutao/processing/cup/20201120/RRBS20A041687_val_2.fq.gz
=============================================
19375386 sequences processed in total
Sequences were truncated to a varying degree because of deteriorating qualities (Phred score quality cutoff: 20): 1026858 (5.3%)
RRBS reads trimmed by additional 2 bp when adapter contamination was detected: 0 (0.0%)

Total number of sequences analysed for the sequence pair length validation: 19375386

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 671325 (3.46%)