In support of instrument development and run diagnostics, errorTool can decompose per-base error rates in many different ways. The most generic partitioning of the errors is by read length and is described here.
The general report results from running errorTool with the following command line arguments:
$ errorTool --reference_file ref.fasta --read_file reads.sms \
    --analysis_type general

The report contains three sub-reports. In each, the per-base error rates are computed over a different reference context: the whole read irrespective of context, positions aligned to non-homopolymer regions, and positions aligned to homopolymer regions. Each sub-report is preceded by its header: Errors in the whole read, Errors in non-homopolymer regions, or Errors in homopolymer regions.
Generic Report Output Information
Reports are stored in a set of files with names that encode the flowcell, channel, camera, and read pass number. Specifically the file <TT>_fc<VV>_ch<XX>_pass<YY>_camera<ZZ> is associated with an error analysis of
If the --separate_pass_accounting flag is given, YY is the pass number; otherwise YY is set to ALL.
If the --by_camera flag is given, ZZ is the camera number; otherwise ZZ is set to ALL.
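
The naming convention above can be taken apart with a regular expression. The following is a minimal Python sketch, not part of errorTool. The function name parse_report_name and the sample file name in the usage note are invented for illustration, and because the text does not say what <TT> encodes or whether an extension follows, the prefix is matched loosely and any extension is ignored.

```python
import re

# Pattern for <TT>_fc<VV>_ch<XX>_pass<YY>_camera<ZZ>.
# Assumption: fields contain no underscores; <TT> is treated as an opaque prefix.
REPORT_NAME = re.compile(
    r"^(?P<prefix>.+)_fc(?P<flowcell>[^_]+)_ch(?P<channel>[^_]+)"
    r"_pass(?P<read_pass>[^_]+)_camera(?P<camera>[^_.]+)"
)


def parse_report_name(name):
    """Split a report file name into its encoded fields (hypothetical helper)."""
    match = REPORT_NAME.match(name)
    if match is None:
        raise ValueError("not a recognized report file name: %r" % name)
    fields = match.groupdict()
    # YY and ZZ are the literal string ALL unless the corresponding
    # per-pass or per-camera flag was given on the command line.
    fields["per_pass"] = fields["read_pass"] != "ALL"
    fields["per_camera"] = fields["camera"] != "ALL"
    return fields
```

For a made-up name such as general_fc1_ch3_passALL_camera2, the sketch reports read_pass as ALL (no per-pass accounting) and camera as 2 (per-camera accounting).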
If the --by_reference_accounting flag is given, each file contains one set of tables per reference. Each set begins with the name of the reference.

Example Report
The example below shows a mock report generated for reads of length 14 to 18. The sub-reports are formatted as tables here to aid viewing; the reports written by errorTool are plain ASCII text.
Errors in the whole read
Read    Number  Cumul.  Cumul.   Cumul.  Cumul.  Cumul.  Percent  Percent  Percent  Percent
Length  Reads   Length  Inserts  Missed  Substs  Error   Inserts  Missed   Substs   Error
18      1102    20144   614      922     219     1755    3.05     4.58     1.09     8.71
17      3650    62550   655      1155    321     2131    1.05     1.85     0.51     3.41
16      5768    94516   630      2858    423     3911    0.67     3.02     0.45     4.14
15      6031    95110   670      5315    425     6410    0.70     5.59     0.45     6.74
14      7121    107192  813      8311    552     9676    0.76     7.75     0.51     9.03
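
The text does not define the Percent columns, but the mock values are consistent with each cumulative count divided by the cumulative length, and with Cumul. Error being the sum of inserts, missed bases, and substitutions. The Python sketch below reproduces the read length 18 row of the whole-read sub-report under that assumption. The function name percent_rates is illustrative and is not part of errorTool.

```python
def percent_rates(cumul_length, inserts, missed, substs):
    """Return (pct_inserts, pct_missed, pct_substs, pct_error), rounded to 2 places.

    Assumption: each rate is count / cumulative length * 100, and the total
    error count is inserts + missed + substs.
    """
    total = inserts + missed + substs
    return tuple(
        round(100.0 * count / cumul_length, 2)
        for count in (inserts, missed, substs, total)
    )


# Read length 18, "Errors in the whole read":
# Cumul. Length 20144, Inserts 614, Missed 922, Substs 219.
print(percent_rates(20144, 614, 922, 219))  # (3.05, 4.58, 1.09, 8.71)
```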
Errors in non-homopolymer regions
Read    Number  Cumul.  Cumul.   Cumul.  Cumul.  Cumul.  Percent  Percent  Percent  Percent
Length  Reads   Length  Inserts  Missed  Substs  Error   Inserts  Missed   Substs   Error
18      1102    16639   614      333     84      1031    3.69     2.00     0.50     6.20
17      3650    54594   655      737     105     1497    1.20     1.35     0.19     2.74
16      5768    89030   628      2395    247     3270    0.71     2.69     0.28     3.67
15      6031    91858   670      4882    307     5859    0.73     5.31     0.33     6.38
14      7121    104848  813      8086    473     9372    0.78     7.71     0.45     8.94
Errors in homopolymer regions
Read    Number  Cumul.  Cumul.   Cumul.  Cumul.  Cumul.  Percent  Percent  Percent  Percent
Length  Reads   Length  Inserts  Missed  Substs  Error   Inserts  Missed   Substs   Error
18      986     3505    0        589     135     724     0.00     16.80    3.85     20.66
17      3363    7956    0        418     216     634     0.00     5.25     2.71     7.97
16      2276    5486    2        463     176     641     0.04     8.44     3.21     11.68
15      1287    3252    0        433     118     551     0.00     13.31    3.63     16.94
14      957     2344    0        225     79      304     0.00     9.60     3.37     12.97
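
In the mock data above, the cumulative length and the insert, missed, and substitution counts of the two context-specific sub-reports add up to the whole-read values for every read length. Whether real errorTool output always satisfies this is an assumption, not something the text states, but it makes a convenient sanity check when reading a report. The Python sketch below runs that check on values copied from the mock report; the names are illustrative. The Number Reads column is left out because it does not sum the same way.

```python
# (Cumul. Length, Cumul. Inserts, Cumul. Missed, Cumul. Substs) per read length,
# copied from the three mock sub-reports.
WHOLE = {
    18: (20144, 614, 922, 219),
    17: (62550, 655, 1155, 321),
    16: (94516, 630, 2858, 423),
    15: (95110, 670, 5315, 425),
    14: (107192, 813, 8311, 552),
}
NON_HOMOPOLYMER = {
    18: (16639, 614, 333, 84),
    17: (54594, 655, 737, 105),
    16: (89030, 628, 2395, 247),
    15: (91858, 670, 4882, 307),
    14: (104848, 813, 8086, 473),
}
HOMOPOLYMER = {
    18: (3505, 0, 589, 135),
    17: (7956, 0, 418, 216),
    16: (5486, 2, 463, 176),
    15: (3252, 0, 433, 118),
    14: (2344, 0, 225, 79),
}


def inconsistent_lengths(whole, non_hp, hp):
    """Return read lengths where the two contexts do not sum to the whole read."""
    bad = []
    for length, totals in whole.items():
        summed = tuple(a + b for a, b in zip(non_hp[length], hp[length]))
        if summed != totals:
            bad.append(length)
    return bad


print(inconsistent_lengths(WHOLE, NON_HOMOPOLYMER, HOMOPOLYMER))  # []
```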