Online Book Reader

Reinventing Discovery - Michael Nielsen [53]

more unusual for them to encourage their colleagues to make independent analyses, and perhaps independent discoveries. You can grasp something of what’s at stake by looking at some famous cases where data was partially revealed. For instance, earlier I mentioned Ptolemy’s Almagest, one of the great scientific works of antiquity. But I should perhaps have put “Ptolemy” in quotes, because many historians of science—not all, but many—believe that Ptolemy plagiarized many of the star positions in his catalog from the astronomer Hipparchus, who had done his own sky survey nearly 300 years earlier. In fact, the history of science is full of examples of scientists stealing data from one another. Back at the dawn of modern science the astronomer Johannes Kepler discovered that planets move in ellipses around the sun using data he stole from his deceased mentor, the astronomer Tycho Brahe. James Watson and Francis Crick discovered the structure of DNA with the aid of data they borrowed from one of the world’s leading crystallographers, Rosalind Franklin. I say borrowed, because this was done without her knowledge, although with the aid of a colleague of Franklin’s who was arguably within his rights. These are, admittedly, extreme examples, but they do show why most scientists go to some trouble to keep their data secret.

There’s a puzzle here, then: why does the SDSS share data so openly? Think about the situation from the point of view of members of the SDSS collaboration. Almost certainly there are important discoveries that they could have made, but which they were beaten to by someone outside the collaboration who used SDSS data. To put it in starkly self-interested terms, while open data may be good for science, it’s arguably bad for the careers of members of the SDSS collaboration. Why do they stand for it? Why doesn’t the SDSS lock up the data?

In fact, the SDSS does partially lock up the data. When the SDSS telescope takes images, they aren’t immediately made public. Instead, for a brief period of time—typically a few months to a little over a year—they are only available to official members of the SDSS collaboration. It’s only after that period has elapsed that the data are made freely available to everyone in the world.

There’s a similar partial openness about the membership of the SDSS collaboration. While most scientific experiments still involve only a small number of participants, the SDSS collaboration has 25 participating academic institutions, and includes also 14 additional scientists who are not at any of the participating institutions. All in all, roughly 200 scientists are official members of the collaboration, far more than was scientifically necessary to get the SDSS up and running. The home page of the website for the current phase of the SDSS (stage III) even encourages “[i]nquiries from interested parties to join the collaboration.” Astronomy is a small community, with just a few thousand professional astronomers in the world. As a result many, perhaps most, professional astronomers have a friend or colleague who is part of the SDSS collaboration, and with whom they can potentially collaborate using SDSS data, even during the initial period when the data are not open.

These explanations clarify the process the SDSS uses to share data, but they don’t answer our starting question, which is why the SDSS makes its data partially open in the first place. Why not just lock the data up for good? And why isn’t the SDSS collaboration deliberately kept as small as possible, to increase the benefits received by individual members? Before I answer these questions, I want to briefly describe several more examples of experiments that make their data openly available. Those examples will help us understand why and when scientists make their data openly available, and why open data is important.


Building the Scientific Information Commons

In September of 2009 an organization called the Ocean Observatories Initiative began building a high-speed network for data and electricity on the floor of the Pacific Ocean.
