Data Sharing and Climate Change Research

[Note the partial correction in the UPDATE below.]

I know next to nothing about climate change questions, which is why I almost never blog about them. But the recent discussion of the East Anglia CRU e-mails touches on a matter that I have studied a bit, though not much — standards related to the sharing of raw data by researchers.

As I understand, and please correct me if I’m wrong, the CRU researchers apparently refused to share the underlying raw data that was the basis of their research. Is that so? If so, is there a scientifically acceptable justification for this refusal, especially when the topic is as important as this one?

My inclination would be to say that data should nearly always be shared. If you share your data, this lets others check the conclusions you draw from the data, as well as verify the accuracy of the data against other available sources. They might disprove your arguments, or lead you to improve your arguments, or, if they reproduce your results, they might help prove the validity of your arguments. But in each case, science progresses better, and the decisions made based on the science are more reliable, than if you keep the data secret.

I can see some possible exceptions, for instance where there are concerns about the privacy of research subjects, or where the data was gathered as part of a commercial endeavor that requires that the data be kept proprietary for the commercial project to be viable, or some such. Even there I stress that the exceptions are merely possible; perhaps on balance the data should be shared even then, and in any event, even if there’s a good reason for the data not to be shared, people should view the research skeptically because of the lack of sharing.

But in any event, I don’t quite see why any such possible exceptions would apply here. The explanations that I’ve sometimes heard, which are that (1) the data might be misused by people who would make misleading arguments based on it, and (2) the initial gatherers of the data didn’t authorize the CRU to release it, strike me as unsound. Argument 1 does not seem acceptable for scientific research, since the proper scientific way to fight bad arguments is by responding with good ones, not by denying everyone access to the important data that would let them check your work. And argument 2 strikes me as implausible on the facts, since it’s not clear why the initial gatherers of the data would see it as any sort of secret. [UPDATE: It seems, as I note in my follow-up post, that some government agencies that gather meteorological data may indeed have insisted that CRU not share the data, apparently because they make money by selling it. But, as I argue in that post, that insistence strikes me as unjustifiable, given that the countries in which the data is gathered likely have vastly more to gain from better science that stems from more data sharing than they have to lose from lost climate data licensing revenue.]

Am I missing some important justification for keeping the data confidential in this case? Or am I misunderstanding the situation, and the data was indeed made available to everyone? I am not looking for generalities about the alleged sins or virtues of various schools of thought on climate change, or even arguments about whether climate is indeed changing. My question is limited solely to the scientific ethics of data sharing; and it would be particularly useful to hear from scientists who have actually dealt with such questions in their own work.
