Total Recall - C. Gordon Bell [54]
According to Vannevar Bush, “the inheritance from the master becomes, not only his additions to the world’s record, but for his disciples the entire scaffolding by which they were erected.” What he failed to see is just how elaborate this scaffolding might be.
To date, it is common for a published paper with a few tables and charts to be the only long-term survivor of a research project that once had volumes of data, “metadata” that describes how the data was gathered, copious notes, and conversations among the researchers. Vannevar Bush saw that more notes and background material might be shared. Jim Gray led the charge in proposing that everything could be shared. Think of the amazing detail and enormous volume of data that Deb Roy is collecting. His Speechome corpus need not be reduced to a few publications; the whole data set can be passed on.
Science began with a paradigm of observation and experimentation. Later came a paradigm of theory, and, more recently, a paradigm of computer simulation. The fourth paradigm of science, or the Gray paradigm, as I believe it should be called, is a paradigm of data-intensive science. Gray and his colleagues elaborate:
Traditionally scientists have had good excuses for not saving and documenting everything forever: it was uneconomic or infeasible. So, we have followed the style set by Tycho Brahe and Galileo: maintain careful notebooks and make them available; but, the source data is either not recorded at all, or is discarded after it is reduced. In some cases it is even considered private, especially when done in corporate laboratories!
It is now feasible, even economical, to store everything from most experiments. If you can afford to store some digital information for a year, you can afford to buy a digital cemetery plot that will store it forever. In the future, some fields will no doubt require public storage and access of experimental data. Astronomy is an example of a community that is in transition to this new kind of science and may well be at the forefront because it is traditionally collaborative and minimally funded. Sharing observations is critical and the norm.
Researchers from all fields, not just science, will be able to preserve and share all of their material and notes to the benefit of others. There can be enormous value in a marginal entry indicating that a historical assertion is refuted elsewhere, or a note that the thermometer was slightly moved in 1978, accounting for increased temperature readings, or an explanation of why a certain approach was abandoned. Someone may want to apply a fresh approach to the old data. Shared systems will allow many researchers to pool their material, so that for some given data, say, an economic report for 2002, you can see comments by many individuals, links to related reports, and metadata describing how the report's data was collected and its results tabulated.
Historians ought to jump on the fourth paradigm and insist on original source material being made readily available. Too many works in the past have relied on secondary sources.