Online Book Reader

Reinventing Discovery - Michael Nielsen [56]

the explanation is political. Think about the SDSS. A typical small astronomy project may cost “only” a few tens or hundreds of thousands of dollars. That’s a lot of money, but it’s small change out of the billions of dollars our society spends on astronomy. If the people doing the experiment keep the data to themselves, it’s not a big loss to other astronomers. Furthermore, those other astronomers aren’t in any position to complain, for they too are keeping the data from their experiments secret. It’s a stable, uncooperative state of affairs. But the SDSS’s size makes it special and different. It’s so large that it consumes much of the entire world budget for astronomy. If the data is kept secret, then to astronomers outside the SDSS collaboration it’s as though that entire chunk of money has simply disappeared from the astronomy budget. They have every reason to insist that the data be made open. And so, if large projects don’t commit to at least partial openness, their applications for funding risk being shot down by people in the same field but outside the collaboration. This motivates big scientific projects to make their data at least partially open.

There is another factor inhibiting open scientific data, which is that even if you are willing to share your data, it can be difficult to do so in a way that’s useful to others. You can take all the photographs of galaxies you like, and share them with others, but those photographs are of limited scientific use without all sorts of extra information. What color filters did you use? Has the image been processed in any way, say, to remove bad or damaged pixels? Was there any haze the night the photos were taken, which might obscure the image? And so on. In many parts of science it’s difficult to make sense of experimental data without detailed calibration information. And even with the data and the calibration information, other scientists still need an extremely detailed understanding of the experiment to make use of the data. Add on top of that problems like being sure everyone is using technical terminology in exactly the same way, file format conversion, and so on. Individually these are all soluble problems, but together they’re a formidable obstacle to sharing data in a way that’s useful.

These questions about sharing data are part of a deeper story, a story about why and when scientific knowledge is shared. Earlier in the book, I mentioned several times that scientists build their reputation and career based on the papers they’ve written. A reputation for writing great papers will get them a good scientific job, and continued grant support. Much of the challenge with data sharing is that the rewards scientists get for sharing their data are much more uncertain than the rewards for writing papers. It’s true that a few large collaborations such as the SDSS have won widespread kudos for sharing data. But in many areas of science, there are few established norms for how and when the use of someone else’s data should be acknowledged. And that means that sharing data is chancy for a scientist. It’s just not something scientists are typically well rewarded for, despite the fact that it’s enormously valuable. And so open data remains uncommon, especially in smaller laboratories. We will return to the question of how to get scientists enthused about sharing data (and other related questions) in chapters 8 and 9. For the purposes of the remainder of this chapter it’s enough that there is already a considerable (and increasing) amount of scientific data openly available, through projects such as the SDSS and the Human Genome Project.

Dreaming of the Data Web

So far in this chapter we’ve taken a concrete, near-term perspective, looking at existing projects such as the SDSS. But the internet is an infinitely flexible and extensible platform for manipulating human knowledge, with a potential that is open-ended. To understand that potential we need to expand our thinking, and move to a long view that sees the internet not as a ten- or twenty-year revolution, but as a hundred-
