More on Data Sharing and Climate Change Research

I much appreciate all the responses to my earlier question about data sharing in the climate change debate. The quantity and the detail of the responses were dauntingly high, and I’ve just managed to get through them all. I thought therefore that I’d post a bit more about the subject.

1. It looks like some of the data sources — apparently government-run meteorological services — did indeed demand (whether contractually or as an informal understanding) that the data they gave to CRU not be shared further. Apparently this is because the services sell the data commercially. But to the extent that CRU’s continuing refusal to share the data stemmed from those agencies’ refusal, this just means the problem is broader than I had thought.

The countries in which these meteorological services are located will have to, in the aggregate, spend trillions of dollars under various climate-change-fighting proposals. If I’m right that data sharing is an important part of making science accurate, those countries have much to gain from such sharing.

Even if the global warming claims are right in general, it’s nearly certain that they’re not right in all particulars; that’s not a condemnation of global warming theory, but just a reflection of what happens with nearly all emerging scientific theories. The particulars might well matter in deciding on the best way to combat global warming. An error in the models might yield errors in the recommendations; public access is more likely to uncover those errors.

Of course, if the global warming claims are wrong in important respects, the countries involved have even more to gain from their being proven wrong. And if they are largely right, the countries have much to gain from public confidence in the claims’ correctness, confidence that I think should be reinforced by data sharing and undermined by data concealment.

So if indeed the meteorological services are refusing to have data shared because they sell the data, that is astoundingly penny-wise and pound-foolish. Perhaps CRU might justify its actions by saying that the services conveyed the data under a confidentiality understanding. For that we’d need to know more details about the understandings, and also the details of English Freedom of Information law that may or may not preempt such understandings. But where’s the adequate justification for the underlying continuing insistence by government-run meteorological services that their data remain confidential?

2. Some commenters have pointed out that in many research projects, the researchers try to keep their laboriously gathered datasets private for some time in order to milk the data for more papers, without competition by others. I’m not sure whether this is on balance beneficial to science, or to what degree it is tolerated in various scientific communities.

But even if it makes sense for some researchers to trade off verifiability and credibility for extra lead time to produce novel work — and for some disciplines to trade off the benefits of public access for the benefits of extra incentives to gather data — can that possibly apply here? This isn’t just cutting-edge science in which scientists can say, “OK, if you don’t believe my first article, fine; that’s a tradeoff I’m willing to make in order to get more publications on my resume, and then in several years I’ll release the data and dispel your doubts.”

This is research that people are claiming should form the basis of multi-trillion-dollar regulations and expenditures — that should lead to compulsory restructurings of vast segments of society — and the sooner the better. The need for accuracy is so great here that I don’t think the “I want more lead time to write more papers without competition” argument can apply (nor do I think it’s even made by CRU).

3. Some commenters have argued that a great deal of climate data is indeed available from other sources; and that’s good. But as I understand it, the CRU didn’t get its standing by just being duplicative of other data gatherers. The CRU dataset has apparently formed the basis of a great deal of work, and even if it has been seen by many as entirely consistent with the other datasets, it remains important.

And to verify the CRU’s work it’s necessary to have the precise data on which the CRU relied, and not just other data from other sources that has its own limitations and characteristics. Small differences in coverage (temporal and spatial), as well as differences in the methods through which the data was gathered, can potentially make substantial differences in result. And the absence of the precise data, with its precise temporal and spatial boundaries, makes it impossible to verify the particular results that CRU reports. If you are so important, and your dataset is so important, it’s important that other scientists who want to check your work have access to your dataset and not just to other datasets.

4. Finally, one commenter asks whether “this work[s] both ways,” and whether “anti-AGW partisans” should share their data as well. Obviously so, and I take it that if someone criticized conventional scientific wisdom (which global warming has, descriptively, become) and then refused to share the underlying data, he’d have zero chance of persuading anyone, and rightly so. The question is how much extra slack should be cut to the researchers whose work forms the conventional scientific wisdom. Again, where many trillions of dollars are at stake, it seems to me the answer should be “not much.”