Lately I've been seeing the assertion that carbon dioxide (CO2) levels have no correlation with temperature changes, particularly not over the last 100-150 years. I discussed a form of this, where the site was asking you to estimate the correlation by eye and doing a misleading job of it. It's a bizarre claim, however, given that even at the eyeball level the figures in that previous note of mine show a pretty good correlation.

[Update April 9, 2009: Some recent commentators don't believe, and didn't bother to check, that there are sources which claim no correlation. See 20 who deny co2 is correlated with temperature. It's actually 21.]

But let's be quantitative. Eyeball estimation only gets you so far, and eyeballs can be deceived fairly easily (such as by the co2sceptics presenting different periods of data, and drawing in a misleading curve for one set). Also, showing the variation through time, separately, of temperature and carbon dioxide, is not the best way to look at the correlation. Better is to plot the temperature against the CO2 level for that year. I use the temperatures and CO2 values from my prior note. For the period of the Siple ice core (before 1959) I filled in the missing values by straight line fill. Fortunately, in this period the CO2 concentration is changing only slowly.
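For anyone who wants to reproduce the straight line fill, here is a minimal sketch of the idea. The function name `fill_linear` and the sample values are made up for illustration; they are not the Siple numbers.

```python
def fill_linear(known):
    """Fill every year between known {year: value} points by straight-line interpolation."""
    years = sorted(known)
    filled = {}
    for y0, y1 in zip(years, years[1:]):
        v0, v1 = known[y0], known[y1]
        for year in range(y0, y1):
            frac = (year - y0) / (y1 - y0)      # fraction of the way from y0 to y1
            filled[year] = v0 + frac * (v1 - v0)
    filled[years[-1]] = known[years[-1]]        # keep the final known point
    return filled


# Hypothetical sparse record: two measurements ten years apart (illustrative only).
sparse = {1900: 296.0, 1910: 300.0}
annual = fill_linear(sparse)
print(annual[1905])  # halfway between the two known values
```

Because the CO2 concentration changes slowly in this period, the choice of fill method should matter little.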

Before going further, though, a couple of reminders. One is that this kind of analysis is *not* how professional climate change predictions are done. The prediction of warming from human-released CO2 was made 60 years before the first measurements of atmospheric CO2 levels. The estimates today of warming from CO2 are done, as was done originally, by examining how the laws of conservation of energy play out in the atmosphere. Second is that this analysis is one demanded by the people making the claim that there is no such correlation. They are badly wrong, as we'll see. They're either lying, or failing to do their homework. Either way, not sources to keep using.

So, for eyeball inspection, annual mean global temperature (deviation from baseline) plotted against CO2 annual average:

It's awfully hard to look at this and say that there's no correlation between CO2 and temperature. Since eyeballs can be deceived, we'll be quantitative. While I think the term 'correlation' is more or less known in common language, there are a few technical points I want to be sure we all have in hand. Correlation is a measure of how closely one variable (temperature, in our case) follows another (CO2 for us). It can range from +1 (perfect correlation in a positive sense -- that is, every time you increase CO2 by 10 ppm, you would increase temperature by the same amount, say, 0.1 degree) to -1 (perfect correlation in a negative sense -- every time CO2 increased by 10 ppm, temperature would *drop* by 0.1 degree). At 0 correlation, there's no straight line connection. But ... all of this is for straight line relationships. If the real relationship is not straight line, you have more work to do. We are going to *assume* that the relationship is straight line.
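To make the +1, -1, and 0 cases concrete, here is a small sketch of the usual (Pearson) correlation coefficient. The numbers are invented for illustration and the function name `pearson_r` is my own; the last case shows why the straight line caveat matters.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented values: CO2 rising in 10 ppm steps.
co2 = [300, 310, 320, 330]
rising = [0.0, 0.1, 0.2, 0.3]    # +0.1 degree per 10 ppm: correlation +1
falling = [0.3, 0.2, 0.1, 0.0]   # -0.1 degree per 10 ppm: correlation -1

# A perfect but curved (parabolic) relationship gives a correlation of 0.
xs = [-2, -1, 0, 1, 2]
curved = [x * x for x in xs]

print(pearson_r(co2, rising), pearson_r(co2, falling), pearson_r(xs, curved))
```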

One last bit before trotting out numbers. Correlation itself is not really our goal even if we compute that number along the way. We generally want to predict some variable given knowledge of some other variable; in our case, predict temperature given CO2 levels. Now temperature varies from year to year. If we have a good prediction method, we can explain most of that variance. If we multiply the correlation by itself (square it), that number tells us how much of the variance we can explain. If a variable explains more than 50% of the variance, we can say that it is the predominant factor. If it's over 10% of the variance, it's notable even if not predominant. Such conclusions also have to be checked for whether they're statistically significant. If I threw dice only twice, their value would 'explain' 100% of the variance in temperature for 2 years (hence the comments about 'you can always draw a straight line through 2 points'). If something isn't *even* statistically significant, then it is definitely not notable or predominant. It's also entirely possible for a relation to be statistically significant, but not notable -- physically.
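A short sketch of 'variance explained', computed directly as one minus the leftover variance after a straight line fit (which equals the squared correlation). All data here are invented, and the helper names are my own. The first example is the dice case: two points always come out at 100%.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def variance_explained(xs, ys):
    """Fraction of the variance in ys accounted for by a straight-line fit on xs."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    total = sum((y - mean_y) ** 2 for y in ys)
    leftover = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - leftover / total


# Two dice throws 'explaining' two years of temperature (invented values).
two_points = variance_explained([3, 5], [0.1, 0.4])

# Four invented points with some scatter; the correlation here is 0.8.
scattered = variance_explained([1, 2, 3, 4], [1, 3, 2, 4])

print(two_points, scattered)
```

The two-point result is why the significance check is needed before reading anything into a high percentage.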

I'm doing a little algebra and will represent the predictions as:

T = slope * (CO2 - reference)

T is the anomaly in global mean surface air temperature, slope is what we'll compute, CO2 is the annual average CO2, and reference is a reference level of CO2, the level at which the predicted anomaly is zero. (Temperature anomalies are in degrees Celsius, and CO2 is in ppmv.)
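As a sketch of how the slope and reference could be obtained, assuming an ordinary least-squares fit (the usual choice for this kind of regression, though the exact procedure isn't spelled out here): fit T = intercept + slope * CO2, then rewrite it as T = slope * (CO2 - reference) with reference = -intercept / slope. The data below are invented so that the answer is known in advance.

```python
def fit_reference_form(co2, temp):
    """Least-squares fit of temp = slope * (co2 - reference); returns (slope, reference)."""
    n = len(co2)
    mean_c = sum(co2) / n
    mean_t = sum(temp) / n
    slope = (sum((c - mean_c) * (t - mean_t) for c, t in zip(co2, temp))
             / sum((c - mean_c) ** 2 for c in co2))
    # The fitted line passes through (mean_c, mean_t), so it crosses zero
    # anomaly at mean_c - mean_t / slope, which equals -intercept / slope.
    reference = mean_c - mean_t / slope
    return slope, reference


# Invented data generated exactly from slope 0.01 C/ppmv and reference 330 ppmv.
co2 = [300, 320, 340, 360]
temp = [0.01 * (c - 330) for c in co2]

slope, reference = fit_reference_form(co2, temp)
print(slope, reference)
```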

| time span | % variance explained | slope (C per ppmv CO2) | slope (C per 100 ppmv CO2) | reference (ppmv) |
| --- | --- | --- | --- | --- |
| 1850-2007 | 78 | 0.00868 | 0.868 | 333 |
| 1850-1958 | 28 | 0.00962 | 0.962 | 329 |
| 1959-2007 | 82 | 0.00962 | 0.962 | 335 |

If we simply look over the whole period of temperature data, we see 78% of the temperature variance is explained as a linear response to CO2 changes. Perhaps you don't trust ice core CO2, or older temperature values. The more recent period shows CO2 explaining 82% of all variation. For either the record as a whole, or for the more recent period, temperature shows a very strong correlation to CO2.

Research (see the citations in the IPCC working group 1, 4th report) is showing that it is in the last half century that human-derived CO2 (and others) have been the predominant drivers of climate change. Both the full 158 year record and the recent shorter 50 year record support that -- explaining 78-82% of variance definitely qualifies (at least if it passes statistical significance, which we'll get to in a minute). On the other hand, the same report says that before about 1950, CO2 is not the major driver. Here we see 28% of the temperature variance in that period being explained by CO2, which is consistent with CO2 being a notable component of the system, but not the predominant one in that period.

Now for the tests of statistical significance, which I'm afraid are more gory in detail than I'll write up here. But the result is that all three correlations are significant at better than the 0.0005 level. In more normal language, we'd take 1 and divide by that number. The result is the number of times we'd have to collect data that were just random numbers before we'd expect to find even 1 example with this high a correlation. In this case, at least 2000. That is less than a 1 in 2000 chance of just being noise. (My statistical table only goes out this far. If I had better tables, they'd show much, much higher odds against these correlations being chance.)
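For a rough idea of the size of the numbers involved, here is a sketch using the standard t statistic for a correlation coefficient, t = r * sqrt((n - 2) / (1 - r^2)). This is an assumption about the kind of test, not a record of the calculation behind the figures above. The year counts (158, 109, 49) are simply the inclusive lengths of the three spans.

```python
import math


def t_statistic(variance_explained, n):
    """t statistic for a correlation, given r squared and the number of data points."""
    r = math.sqrt(variance_explained)  # positive root, since the slopes are positive
    return r * math.sqrt((n - 2) / (1 - variance_explained))


# (fraction of variance explained, number of years) for each span in the table.
periods = {
    "1850-2007": (0.78, 158),
    "1850-1958": (0.28, 109),
    "1959-2007": (0.82, 49),
}

t_values = {span: t_statistic(r2, n) for span, (r2, n) in periods.items()}
for span, t in t_values.items():
    print(f"{span}: t = {t:.1f}")

# Converting a significance level into 'one chance in N'.
one_in = 1 / 0.0005
print(one_in)
```

Critical values for the 0.0005 level in standard t tables are somewhere between 3 and 4 for sample sizes like these, so even the weakest of the three spans clears the bar comfortably.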

The IPCC estimate for climate sensitivity to doubling CO2 levels (which they took as 550 ppm, see, for example, the technical summary at the above link) is that it is 'likely' (see their definition of the word, it probably isn't what you think) 2 to 4.5 C, with a best estimate of 3 C. Using the highly simple-minded regression above, we get estimates of 2.4 and 2.7 C. Not only does CO2 indeed correlate with temperature, proving false the unreliable sources that started us here, but the sensitivity suggested by that correlation is in the range of what the IPCC arrives at by much more meaningful models of the physics of the climate system.
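The arithmetic behind those sensitivity estimates can be sketched as follows. I am assuming the doubling runs from 275 to 550 ppm (half of the 550 figure), so the CO2 increase is 275 ppm; a slightly different baseline would shift the results by a few hundredths of a degree.

```python
DOUBLED_PPM = 550                    # doubled CO2 level quoted in the text
BASELINE_PPM = DOUBLED_PPM / 2       # assumption: doubling starts from 275 ppm
INCREASE_PPM = DOUBLED_PPM - BASELINE_PPM

# Slopes from the table, in C per ppmv CO2.
slopes = {
    "1850-2007": 0.00868,
    "1959-2007": 0.00962,
}

# Warming for a doubling under the straight-line assumption.
sensitivity = {span: slope * INCREASE_PPM for span, slope in slopes.items()}

# IPCC 'likely' range quoted in the text, in C.
LIKELY_LOW, LIKELY_HIGH = 2.0, 4.5

for span, warming in sensitivity.items():
    in_range = LIKELY_LOW <= warming <= LIKELY_HIGH
    print(f"{span}: {warming:.2f} C per doubling, inside likely range: {in_range}")
```

With this baseline the two values come out near 2.4 and 2.65 C, close to the figures quoted above and inside the 2 to 4.5 C range.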

One last piece. Cast your eyes back up to the graph, and maybe click on it to get the full size version. Towards the right hand end, you see a dot that's far above the straight line fit. You won't be surprised that this is 1998 -- the year of the major El Niño that was concurrent with a time of high solar activity. The bit farthest below the line, around 357 ppm CO2, is 1992-3 -- the cooling effects of Mount Pinatubo. Even against a statistically strong trend, there are still other effects in the system that can give weather variations of some tenths of a degree.

There are many reasons climatologists don't approach climate this way, and they're good ones. I only did it because I've been encountering sources that say that the correlation is zero (nonexistent). Those places are wrong.