PROGRESS REPORT - 2016
THE PROBLEM
In biology, we associate reproducibility with validity. To wit: if an experiment can be duplicated, the results are valid. Reproducibility, however, is a coin with two sides: precision (repeatability) and accuracy (truth). Herein lies a problem. Increasing precision brings individual estimates closer together, but this improvement may have nothing to do with accuracy. Even if an estimate is extremely precise, it may be completely wrong. Typically, accuracy has remained well beyond our reach in biology because estimates for a given part, taken across the literature, invariably produce a diffuse cloud of data points. The problem becomes irreconcilable. How does one choose the one correct point from a large collection of presumably correct points?
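The distinction between precision and accuracy can be made concrete with a small numerical sketch. The Python example below is an illustration only, not the reproducibility test developed in the report. The data values, the assumed true value, and the helper name `summarize` are all hypothetical. It treats bias (mean estimate minus true value) as a measure of accuracy and the sample standard deviation as a measure of precision.

```python
from statistics import mean, stdev

# Hypothetical true value; in real biology it is usually unknown,
# which is exactly the difficulty described above.
TRUE_VALUE = 10.0

# Hypothetical repeated estimates of the same quantity.
precise_but_wrong = [14.9, 15.0, 15.1, 15.0, 15.0]    # tightly clustered, far from the truth
diffuse_but_centered = [7.0, 13.0, 9.0, 11.0, 10.0]   # scattered, but centered on the truth


def summarize(estimates, true_value):
    """Return (bias, spread) for a list of estimates.

    bias   = mean estimate minus true value (smaller magnitude = more accurate)
    spread = sample standard deviation      (smaller = more precise)
    """
    bias = mean(estimates) - true_value
    spread = stdev(estimates)
    return bias, spread


if __name__ == "__main__":
    for label, data in [
        ("precise but wrong", precise_but_wrong),
        ("diffuse but centered", diffuse_but_centered),
    ]:
        bias, spread = summarize(data, TRUE_VALUE)
        print(f"{label}: bias = {bias:+.2f}, spread = {spread:.2f}")
```

The first data set has the smaller spread yet the larger bias, so tight agreement among estimates does not by itself say anything about whether they are correct.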
THE SOLUTION
In truth, reproducibility becomes a valid argument only when both precision and accuracy are in play. This being the case, it would seem advantageous to have access to a reproducibility test based on both accuracy and precision. Accordingly, identifying such a test is the main objective of this report. The reader will soon discover that a solution to this problem turns out to be a simple exercise in the application of complexity theory. What will happen? First, we’ll find biology’s solution to its reproducibility problem and then copy it. If it’s good enough for biology, it should be good enough for us. More to the point, it makes an otherwise difficult problem easy to solve.
THE PROCESS
Begin by reading the report. If interested in the mechanics of the test, work through the example given in the appendix. Don’t be intimidated by the details accompanying the worked example. They are included to simplify the task of moving data to and from several programs, which include Word, Excel, Access, and Mathematica. A word of caution: some of these programs limit the amount of data they can hold (a worksheet in current versions of Excel, for instance, is limited to 1,048,576 rows), and these limits will cause problems when working with large data sets.
THE UNDERSTANDING
Complexity is about power. Biology uses complexity to accumulate power by selecting advantageous relationships of parts to connections. Unwittingly, we strip that power from our biological data by isolating parts and discarding their connections. The consequence? Since parts get most of their power from connections, they cannot be expected – in the absence of such connections – to be very effective at solving fundamental problems in biology. And what exactly is this power? It includes mathematics and our ability to interact with biology mathematically. Thanks to literature databases, we can now access both parts and connections. This puts complex problem solving back in play. For example, by allowing two sets of connections to interact, one with and one without parts, we reactivate this power by letting biology solve the reproducibility problem for us. The result is a robust test for reproducibility – an emergent property of biology that comes to us via the biology literature.
PROGRESS REPORT - 2016
The progress report addresses the current “crisis of reproducibility” in the life sciences by introducing a reproducibility test based first on accuracy and then on precision.
APPENDIX FOR 2016
The appendix includes a worked example of the reproducibility test. It consists of a set of INSTRUCTIONS and copies of the FILES described therein.