SUMMARISING RUN PARAMETERS
==========================
Input filename: /sibcb2/bioinformatics2/heshutao/processing/cup/20201104_20201106/RRBS20A041609_val_1.fq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.2
Cutadapt version: 2.6
Number of cores used for trimming: 1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
File was specified to be an MspI-digested RRBS sample. Read 1 sequences with adapter contamination will be trimmed a further 2 bp from their 3' end, and Read 2 sequences will be trimmed by 2 bp from their 5' end to remove potential methylation-biased bases from the end-repair reaction
All Read 2 sequences will be trimmed by 2 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 3 bp from their 3' end to avoid poor qualities or biases
Output file will be GZIP compressed


This is cutadapt 2.6 with Python 3.6.7
Command line parameters: -j 1 -e 0.1 -O 1 -a AGATCGGAAGAGC /sibcb2/bioinformatics2/heshutao/processing/cup/BS_workflow/20201104_20201106/tmp/5eb97c56-48bc-11eb-b85d-6c92bfc12dee/trimmed/RRBS20A041609_val_1.fq.gz_qual_trimmed.fastq
Processing reads on 1 core in single-end mode ...
Finished in 336.08 s (21 us/read; 2.80 M reads/minute).

=== Summary ===

Total reads processed:              15,695,213
Reads with adapters:                 6,363,529 (40.5%)
Reads written (passing filters):    15,695,213 (100.0%)

Total basepairs processed: 1,520,618,855 bp
Total written (filtered):  1,502,857,559 bp (98.8%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 6363529 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.5%
  C: 0.7%
  G: 22.6%
  T: 47.2%
  none/other: 0.0%

Overview of removed sequences
length	count	expect	max.err	error counts
1	4510642	3923803.2	0	4510642
2	1320172	980950.8	0	1320172
3	317330	245237.7	0	317330
4	77461	61309.4	0	77461
5	4617	15327.4	0	4617
6	2779	3831.8	0	2779
7	2002	958.0	0	2002
8	2214	239.5	0	2214
9	2725	59.9	0	2536 189
10	2654	15.0	1	1415 1239
11	1629	3.7	1	293 1336
12	673	0.9	1	131 542
13	742	0.2	1	159 583
14	1415	0.2	1	269 1146
15	1248	0.2	1	230 1018
16	2345	0.2	1	398 1947
17	3402	0.2	1	771 2631
18	1790	0.2	1	571 1219
19	76	0.2	1	14 62
20	769	0.2	1	200 569
21	73	0.2	1	16 57
22	154	0.2	1	37 117
23	725	0.2	1	113 612
24	2006	0.2	1	426 1580
25	1093	0.2	1	244 849
26	354	0.2	1	72 282
27	1018	0.2	1	202 816
28	2143	0.2	1	487 1656
29	1993	0.2	1	430 1563
30	502	0.2	1	122 380
31	130	0.2	1	18 112
32	974	0.2	1	221 753
33	1383	0.2	1	280 1103
34	1882	0.2	1	428 1454
35	694	0.2	1	140 554
36	1086	0.2	1	213 873
37	1258	0.2	1	253 1005
38	1396	0.2	1	302 1094
39	414	0.2	1	59 355
40	1163	0.2	1	229 934
41	2141	0.2	1	435 1706
42	237	0.2	1	43 194
43	2952	0.2	1	668 2284
44	406	0.2	1	71 335
45	1200	0.2	1	270 930
46	442	0.2	1	65 377
47	842	0.2	1	145 697
48	2270	0.2	1	515 1755
49	205	0.2	1	33 172
50	1165	0.2	1	220 945
51	307	0.2	1	65 242
52	264	0.2	1	46 218
53	915	0.2	1	181 734
54	2619	0.2	1	640 1979
55	1450	0.2	1	305 1145
56	663	0.2	1	132 531
57	1307	0.2	1	278 1029
58	703	0.2	1	123 580
59	288	0.2	1	41 247
60	643	0.2	1	135 508
61	867	0.2	1	141 726
62	2099	0.2	1	466 1633
63	604	0.2	1	122 482
64	65	0.2	1	5 60
65	50	0.2	1	13 37
66	242	0.2	1	58 184
67	282	0.2	1	55 227
68	1208	0.2	1	252 956
69	1338	0.2	1	261 1077
70	2307	0.2	1	490 1817
71	820	0.2	1	141 679
72	320	0.2	1	62 258
73	414	0.2	1	82 332
74	435	0.2	1	104 331
75	692	0.2	1	172 520
76	610	0.2	1	129 481
77	683	0.2	1	143 540
78	687	0.2	1	171 516
79	573	0.2	1	126 447
80	854	0.2	1	183 671
81	689	0.2	1	135 554
82	716	0.2	1	155 561
83	652	0.2	1	149 503
84	619	0.2	1	138 481
85	643	0.2	1	133 510
86	665	0.2	1	146 519
87	452	0.2	1	102 350
88	357	0.2	1	76 281
89	409	0.2	1	91 318
90	371	0.2	1	95 276
91	373	0.2	1	84 289
92	313	0.2	1	58 255
93	1322	0.2	1	369 953
94	1958	0.2	1	475 1483
95	601	0.2	1	150 451
96	351	0.2	1	90 261
97	2054	0.2	1	504 1550
98	277	0.2	1	74 203
99	174	0.2	1	34 140
100	189	0.2	1	38 151
101	172	0.2	1	35 137
102	283	0.2	1	63 220
103	104	0.2	1	29 75
104	66	0.2	1	16 50
105	73	0.2	1	17 56
106	61	0.2	1	17 44
107	34	0.2	1	10 24
108	11	0.2	1	4 7
109	19	0.2	1	5 14
110	40	0.2	1	7 33
111	8	0.2	1	1 7
112	6	0.2	1	2 4
113	1	0.2	1	1
114	3	0.2	1	1 2
115	3	0.2	1	0 3
116	6	0.2	1	2 4
119	1	0.2	1	0 1
120	1	0.2	1	1
124	1	0.2	1	1
125	1	0.2	1	1
126	1	0.2	1	1
127	1	0.2	1	1
129	4	0.2	1	3 1
130	2	0.2	1	1 1
131	2	0.2	1	1 1
132	6	0.2	1	5 1
133	13	0.2	1	7 6
134	10	0.2	1	9 1
135	21	0.2	1	17 4
136	81	0.2	1	67 14
137	9246	0.2	1	9058 188
138	26761	0.2	1	26042 719
140	5	0.2	1	0 5
141	1	0.2	1	0 1
143	1	0.2	1	0 1


RUN STATISTICS FOR INPUT FILE: /sibcb2/bioinformatics2/heshutao/processing/cup/20201104_20201106/RRBS20A041609_val_1.fq.gz
=============================================
15695213 sequences processed in total
Sequences were truncated to a varying degree because of deteriorating qualities (Phred score quality cutoff: 20):	546717 (3.5%)
RRBS reads trimmed by additional 2 bp when adapter contamination was detected:	6363527 (40.5%)