19 August 2008

Cherry Picking

Unfortunately I'm not talking about getting hold of a nice batch of fresh fruit. Instead, it's a particularly common dishonest tactic. It's also one that is flagrantly against the principles of doing science.

What it consists of is making a statement that is true only about a specific especially well-chosen circumstance, and then pretending that you've made a general statement about the system at hand. This is offensive to me as a scientist because in science we're trying to understand the system -- all of it. The cherry pickers abandon honesty for word games.

Suppose we're trying to understand the global mean surface air temperature. There are many other things we could try to understand, but this one is fairly often looked at. After we look a bit, we notice several things. One is that the temperature varies from year to year. As we look into this further, we see that several things happen which affect the temperatures. These include having a more active sun (warmer), having a recent major volcano erupt (cooler), having an El Nino (warmer) or La Nina (cooler). After doing our best to subtract out all those effects, we see that there is still some variation year to year. That's the 'free variability' (the scientist's way of saying 'stuff happens'). It turns out that there are also some contributions from anthropogenic aerosols (cooling), increased greenhouse gas levels (warming), and other human activity (depends).

As we try to make our honest understanding of this complex system (are there other things that affect global mean temperature? how much?) we also have to wonder about how much data we need to collect before we can tell the difference between that free variability and a trend caused by one source or another. Remember what happened when you tried my climate change detection experiment. Even with random numbers, you got runs of several consecutive 'years' of warming or cooling. Free variability does this to you. So if you're looking for trends or other systematic things, you need to look at a long enough period that the free variability can't lead you to a mistaken conclusion. Plus, of course, you have to make that allowance for all the things that you know happen and affect the variable (global mean temperature) you're interested in but are due to processes (solar variability, El Nino, volcanoes, ...) that you're not concerned about at the moment (greenhouse gas levels).
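If you'd like to see the 'runs from random numbers' effect for yourself, here is a minimal sketch. It assumes the simplest possible setup: each 'year' is an independent random draw with no trend at all. The names `longest_run` and `simulate` are made up for this illustration, and the numbers are synthetic, not any real temperature record.

```python
import random


def longest_run(values):
    """Longest streak of consecutive year-to-year changes in the same direction."""
    best = 0
    current = 0
    prev_sign = 0
    for a, b in zip(values, values[1:]):
        # +1 for a 'warming' step, -1 for a 'cooling' step, 0 for no change
        sign = (b > a) - (b < a)
        if sign != 0 and sign == prev_sign:
            current += 1
        else:
            current = 1 if sign != 0 else 0
        prev_sign = sign
        best = max(best, current)
    return best


def simulate(n_years=100, seed=0):
    """Trend-free 'temperatures': independent Gaussian draws (an assumption)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_years)]


if __name__ == "__main__":
    for seed in range(5):
        series = simulate(seed=seed)
        print(f"seed {seed}: longest same-direction run = {longest_run(series)} steps")
```

Try a few different seeds and series lengths. There is no trend anywhere in these numbers, so any streak of 'warming' or 'cooling' you see is free variability and nothing else.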

It can be very difficult to do this even when you're trying to do it all correctly. One of the first satellite sounding temperature analyses (Spencer and Christy, 1992 or 1993, if I remember rightly) showed a large cooling trend at the same time that all other data sets showed a warming. This was very puzzling. Not long after, however, Christy (same one) and McNider (1994 or so) showed that this was because the data record started near an anomalously warm period (strong El Nino in 1982-3) and ended near an anomalously cold period (after the eruption of Mount Pinatubo). It's anomalous because we're not (in looking for signs of whether human activity affects global mean temperature) concerned with El Nino and volcanoes. Once those two obvious anomalous events were taken out, the 'cooling' trend vanished. Science being a small world, I ran into McNider not long after he'd published that paper and we talked about it among other things.

One thing you can look for, even with no particular knowledge, is whether the author (blogger, commenter, ...) is considering other factors that can be involved. Even easier, and the cherry-pick which prompts me here, is to see how they selected the time spans they used and the data sets that are used. In the satellite example above, for instance, it was straightforward -- the authors used all the satellite period they had data for. Fair enough.

Since 1998, though, there's been an industry that is careful to not use all the data they could. Indeed they're aggressive about ignoring data. You don't need to be a specialist to know that this doesn't square with honest understanding of a complex system. People who are seriously trying to understand climate are continually complaining about wanting more data. Throwing away good data is inconceivable to them. But in that industry, they're not concerned with honest understanding. They wish to arrive at a conclusion and if they pick the right starting year (1998) and data set (CRU rather than GISS, for instance), then they can get the answer (a cooling 'trend') that they want.

Now to get that, they have to choose only one or two years, both from recent history, as the time to start their 'analysis'. If they choose any of the 100+ years before 1998 that we have a surface temperature record for, their conclusion is gone. If they use GISS rather than CRU, their conclusion is gone.
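The start-year test is easy to try on your own. The sketch below is only an illustration under loud assumptions: the series is synthetic (a steady 0.02 per year rise from 1979 to 2008 with one deliberately exaggerated warm spike in 1998), it is not CRU, GISS, or any real data set, and `ols_slope` and `trend_from` are names invented here. It fits an ordinary least-squares line from each candidate start year and lists which starts give a negative slope.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic 'temperatures' (an assumption, not observations):
# a steady rise plus a single warm spike standing in for a strong El Nino year.
YEARS = list(range(1979, 2009))
TEMPS = [0.02 * (y - 1979) + (0.5 if y == 1998 else 0.0) for y in YEARS]


def trend_from(start_year):
    """Least-squares trend using only the data from start_year onward."""
    pairs = [(y, t) for y, t in zip(YEARS, TEMPS) if y >= start_year]
    xs, ys = zip(*pairs)
    return ols_slope(xs, ys)


# Trend for every start year that still leaves several years of data.
TRENDS = {start: trend_from(start) for start in range(1979, 2004)}
COOLING_STARTS = [start for start, slope in TRENDS.items() if slope < 0]

if __name__ == "__main__":
    for start, slope in TRENDS.items():
        print(f"start {start}: {slope:+.4f} per year")
    print("start years giving 'cooling':", COOLING_STARTS)
```

In this made-up series the underlying rise never changes, yet the fitted trend goes negative only when the fit starts exactly on the spike year. A conclusion that survives for one start year and vanishes for every other one is telling you about the start year, not about the system.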

Further, even choosing that one year as the start would not be enough to preserve their conclusion if they were honest enough to examine the other things we know affect the climate system -- that was a year with a strong El Nino (warming) and high solar activity (warming). Instead they ignore this (either dishonest or simply not doing their homework) and make various declarations against anthropogenic climate change.

With a few questions, then, a legion of authors/sites can be pitched for being unreliable:
* Are they playing the 'global cooling since 1998' game?
* More generally, would their conclusions hold up if the start year were chosen differently?
* Are they assuming that only one thing affects global mean temperatures?

If they do the first or third, they're lying or not doing their homework. If they don't address the second, they're at least not doing their homework.

I've been aware of this particular cherry-pick for some years now, and of the popularity of cherry-picking among anti-scientific groups for even longer. So I'll let you do your own check of how many sites or sources you can find within 15 minutes that commit this error. Depending on your reading speed, you should find 5 easily, and 20 if you're a quicker reader and have a fast connection.


Anonymous said...

The newest meme is to claim there has been "no warming this entire century".


Anonymous said...

Nice post; I've often wondered why that simplistic claim - that warming stopped in 1998 - seems so convincing to large numbers of people. I don’t think people have much of an intuition for how a trend can be pushed around by random causes and still remain a trend. What’s remembered from a statement such as “On average, over and above the impact of other causes we know about, x causes y to increase” is simply the core statement “x causes y to increase,” with an assumption unconsciously added that other causes aren’t important. Add a little bit of biased attention to the original message and you have a fairly impervious audience.

For many people, it seems to me, there’s also a protean expectation that scientific findings are perfectly “clean”; that is, free from noise in the data and isolated from other influences, as if scientific findings report only perfect correspondences. In other “non-scientific” areas, noisy trends might be accepted as trends, but the noise violates an uninitiated intuition about science. So you end up with a seeming paradox – people who will confidently see an overall upward trend in, say, a stock market time series, and who will act on it, will fail to be impressed by a similar trend in temperature measures.

And, as you say, it’s immensely frustrating when people take advantage of these mistaken intuitions in order to influence public opinion.

Anonymous said...

Thanks for this... I now have a great link to point people to when they use 1998 as a baseline. Though I doubt many of 'those' people will listen to your well reasoned logic.

Robert Grumbine said...

thingsbreak: one recent winter the local media were going berserk about how it was the 'coldest winter in 10 years'. Please. I remember winters which were the coldest in recorded history, not merely 10 years. But so goes trying to scare people rather than trying to explain the best science.

ian: It's generally pretty easy to convince people of things they want to believe. The acceptance of 'it stopped' seems fastest among those whose response to scientists is 'you're just trying to take away my SUV'. What's striking is that most scientists' feelings seem to be like mine. The science is no threat, nor are scientists.

Dan: The folks whose response is 'you're just trying to take away my SUV', no. They're not deciding based on science, or reality even. No evidence is too weak to be a support for their conclusion, no evidence is strong enough to change their conclusion.

But ... most people, I think, really are open to reality and reality-based decisions. Most of the time we (all) use short cuts like doing as a friend of ours does. We think that our friend is sharp and has paid enough attention to make a good decision, so we save time by doing as they do. In the realm of climate, though, chances are good that our friend has been taken in by some of the lousy 'information' (lies and laziness) that abounds.

So I try to put out some good information that folks like you can point others to. I'm far from the only scientist doing so, but maybe my explanations will be different in a useful way.

A different side I haven't emphasized but which is present -- pointing out how using 1998 as the 'reference' year is particularly bad is not to say that other sources (IPCC, for instance) are necessarily right. Just that the people abusing 1998 like this are definitely wrong. After weeding out a bunch of definitely wrong sources, it becomes easier to have a reasonable discussion.

Anonymous said...

Please comment on this analysis:

This is an analysis of temperature trends since 2001 not 1998 and shows that the 2 deg/century put out by the IPCC is falsified.

Please refrain from calling or characterizing skeptics as "liars". Unless of course you allow your own standards to be used with members of your own organization (UCAR). Keep to the facts and arguments and do not impute motivations, character and (hard for UCAR guys) credentials.

Robert Grumbine said...

Jonathan writes:

Please comment on this analysis:

Will do, though this looks like a copy without credit of a different source I saw a while back. Or maybe the other was a copy.

This is an analysis of temperature trends since 2001 not 1998

Do you think that using a shorter period makes it more reliable for studying climate? Why?

and shows that the 2 deg/century put out by the IPCC is falsified.

There are surprisingly many errors in that statement, which I'll expand on in a later note.

I don't work at UCAR. I was a postdoc being paid by them, but that was in 1990 and 1991. I've been paid elsewhere since then. But my employer is irrelevant. They don't tell me what the company line is, and I don't speak for them. (Even at work I don't, and blogging is decidedly not at work!) In any case, your request for good behavior on my part would be more effective if you didn't slam a large group of people you thought I worked for.

I'll take up your reference in a different post. In the meantime, I'll note that the analysis was done for 2001-present. Why 2001? Is it the beginning of the data sets? No, they go back at least another 20 years. Hm. Why ignore over 2/3rds of the data? Looks like some cherries might be getting picked. But I'll take the extra time in the other post and go through the claims in some detail. Again, no great knowledge will be required even though statistical techniques are showing up.

I hope you'll be around (probably Monday) then, take a look, and play the part of a real skeptic. That is, if there are substantive reasons that you aren't persuaded, give them. And if you don't have substantive reasons, allow as how that source wasn't good.

Anonymous said...

"Why 2001?"

Lucia explained that she chose 2001 because that was the year TAR came out. She is testing TAR projections.

Robert Grumbine said...

anon: in the given link she's referring to AR4, not TAR. IPCC 4th report, not third. On Monday (25th) the long comment Jonathan invited will appear. I have some other things I want to print first, including a part 1 to set up the long comment.

Simon Evans said...

Jonathan, anonymous,

Well, penguindreams is planning a longer comment, but just a couple of points in the meantime -

1) I'm puzzled by Lucia's use of the term 'falsified' when she has said that she fully anticipates future short periods showing a trend greater than IPCC projections. The only thing that would be falsified is the statement '2001 to 2007/8 will show a 0.2C upward trend'. Nobody has made such a silly statement! I fully anticipate that Lucia will be declaring the projection 'unfalsified' in the future. If you can see that is at least theoretically possible, then you must understand how questionable is the use of the term at this stage.

2) The IPCC has not "put out" 2C/century (though they have said, in the 4th AR, that they project the next two decades to be about 0.2C/decade for all scenarios). Projected temperature rise depends upon premises adopted in the scenarios. This is why they are projections, not predictions. Science can't predict unless the inputs are known.

Anonymous said...

Why indeed leave out two-thirds of the data? That should always set off an alarm if there's no reason given. Is it just that the 80's and 90's are filled with too many bad memories for the usual suspects?

I think part of it is the hope that this decade really has been a turning point, when the natural cycle goes from a warming to a cooling phase. The science tells us they'll be sorely disappointed.

Robert Grumbine said...

Your first para is much stronger without the question of 'bad memories'. Dropping 2/3rds of the data is certainly a warning flag. There can be good reasons to do so, but since none were given, the flag stays up. We don't need to speculate about bad memories.

I think part of it is the hope that this decade really has been a turning point, when the natural cycle goes from a warming to a cooling phase. The science tells us they'll be sorely disappointed.

Which papers/books/etc. lead you to this conclusion? What evidence would have to show up for you to conclude that the 80s and 90s do just represent a natural fluctuation that started reversing some time in the last few years?

Simon Evans said...


I suspect that you have misunderstood brewster's comment. When he/she wrote "The science tells us they'll be sorely disappointed" I presume he/she meant the "they" to refer to those who are desperately trying to argue that GW has stopped. Well, that's what I understood it to mean anyway.

Robert Grumbine said...

That's what I understood as well. Two things though: a) he didn't include any sources, so I don't know what his source of confidence is; b) maybe the science is wrong. At what point would the evidence be sufficient for him (any of us) to conclude that the 80s and 90s were just an anomalous period, nothing anthropogenic to it?

Simon Evans said...

Ah, I see!

Well, personally, I would become more dubious if the next El Nino does not coincide with a rising temperature trend, having taken account of any known negative forcings such as a major volcanic eruption.

I'm quite interested in the position of the 'solar' crowd at the moment, who seem to be getting very excited about the possibility of cycle 24 being weaker. It seems to me that if it is weaker, and yet if temperatures continue to rise (albeit somewhat mitigated by a reduction in solar output), then we might lay that particular canard to rest?