No, the delay was not because the results came out differently than I'd expected. Just the more mundane business that I've been doing a number of different things and getting the figures done up reasonably is taking some time as I change my mind about what constitutes reasonable.
In brief (in a journal paper, this would be the 'abstract'):
- You need 20-30 years of data to define a climate trend in global mean temperature
- Forward and backward trends are markedly different
- Therefore, to discuss climate trends in global mean temperature, you need to use 20-30 years of data centered on the date of interest.
As with any abstract, it's too brief to show you why any of these are true, just some simple declarations. Now, if you trust me absolutely (which I don't recommend -- and if I'm talking science, you don't need to), you can stop and move on to some other reading. But let's take a look at the whys. As before, I'm putting the data and programs on my personal web site and you can run the analysis yourself, and modify the programs to work on different assumptions, methods, data sets.
Let's consider the first point -- how long it takes to determine a climate trend in global mean temperature. We could define a trend with 2 minutes of data -- temperature at one minute, temperature at the next minute, and draw a straight line through the two numbers. We'd wind up with wildly varying trends, though, from minute to minute through the year. This is weather and turbulence. Make it daily or monthly averages, and we still have the wildly varying trends, and the magnitude of those trends will depend on what time period we chose. Rather than declare that 'this is the right period', we'll determine it by looking at the data itself.
If it is meaningful to talk about climate as opposed to weather, there has to be a time span over which our result for describing climate does not depend much on how long a time span we choose. For average climate temperature, we found 20-30 years as the appropriate time span. I didn't show the figures then, but the program and output you can pick up from my web site show that this is also the appropriate time span for deciding a climate temperature variance (how much scatter there is about the average; even if the average didn't change, we would probably consider it a climate change to have winter lows vary from -30 to +15 instead of -10 to -5).
Figure 1 here shows the trends for all the years I computed trends for (remember I'm lopping off the first 31 and last 31 years of the NCDC record), by all 3 methods, in terms of the length of data record used. So at 36 months we see a range in the computed trends between +15 C/century and -15 C/century. These are enormous values compared to what we think of for climate change. If I wanted to give you a wrong impression about climate, then, I could use such short records. The range declines as we take longer periods, and then flattens out for trend periods of 252-372 months (21-31 years -- remember I took only odd averaging periods). In this part of the display, the range is about +1.5 C/century to -1.5 C/century -- and it is independent of how long a period I used. This, then, supports that a) there is such a thing as a climate temperature trend and b) you need 21-31 years to find it (we can round to 20-30: the 19 year results are close to the 21 year ones, so we expect 20 years to be close as well).
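Here is a minimal Python sketch of the kind of calculation behind Figure 1. It is not the Fortran program from my web site, and it makes several assumptions: the data are a synthetic stand-in (a 0.5 C/century trend plus random noise, both made up), the trend is an ordinary least-squares slope, and the names `ls_slope` and `trend_range` are just for illustration. Because the noise here is uncorrelated, the range keeps shrinking with window length rather than flattening the way the real record does, so take it as showing the mechanics only.

```python
import random

def ls_slope(values):
    """Least-squares slope of values against 0, 1, 2, ... (units per step)."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def trend_range(series, n_months):
    """Min and max trend (C/century) over every n_months window in series."""
    slopes = [ls_slope(series[i:i + n_months]) * 1200.0  # per month -> per century
              for i in range(len(series) - n_months + 1)]
    return min(slopes), max(slopes)

# Synthetic monthly anomalies (an assumption, not the NCDC data):
# a 0.5 C/century trend plus Gaussian noise of 0.2 C.
random.seed(0)
series = [0.5 / 1200.0 * m + random.gauss(0.0, 0.2) for m in range(1500)]

for n_months in (36, 84, 252, 372):
    lo, hi = trend_range(series, n_months)
    print(f"{n_months:4d} months: {lo:+7.2f} to {hi:+7.2f} C/century")
```

To try it on real data, replace `series` with the monthly anomalies read from the data file.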
Statistical aside: To compute variance, we find the deviation of each observation from the mean, square it, add these squares up for all observations, and divide by the number of observations. (Or use the appropriate function in your spreadsheet.) This is a fundamentally meaningful quantity. If we then take the square root of this number, and the numbers have a normal distribution, then we have a standard deviation in the familiar sense. We can always take the square root of the variance, but it will only have that familiar interpretation when the numbers are normally distributed.
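The recipe in that aside, written out as a small Python sketch. The five trend values are made up for illustration, and `variance` is a hypothetical helper name. Python's standard `statistics.pvariance` does the same divide-by-N calculation and is used here only as a cross-check.

```python
import math
import statistics

def variance(observations):
    """Mean of squared deviations from the mean (divide by N, as in the text)."""
    mean = sum(observations) / len(observations)
    return sum((x - mean) ** 2 for x in observations) / len(observations)

# Made-up trend values in C/century, for illustration only.
trends = [1.5, -0.5, 0.0, 1.0, -2.0]

v = variance(trends)
print("variance:", v)
print("square root of variance:", math.sqrt(v))
print("library cross-check:", statistics.pvariance(trends))
```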
In figure 2, I plot instead the maximum and minimum trends, and the square root of variance -- again in terms of how long a period is used to compute the trends. This shows fundamentally the same information, but perhaps a little more clearly. Aside: It's a good idea to look at your data from several different vantages. Sometimes the display method you use in one step can hide something that's blindingly obvious in another method. Again, we see that the figures (maximum trend, minimum trend, average trend, square root of variance in trend) all stabilize once the data length used is 20-30 years. And, conversely, that for periods of 3-13 years, the figures all depend sensitively on how long an averaging period you choose.
Choose is a key word in doing science. We try to avoid having choices. Choices can be made differently by different people, for different reasons, and not all those reasons will turn out to be good ones. Finding a scientific principle and then looking for how to satisfy that principle is far better. Here, the principle is that the length of data used should not affect your conclusion about what the climate trend is. This is a strong principle. So when you see someone violating it (say by using a 7 year span without doing some real work to justify it -- work like I'm doing here), they're probably not doing good science.
Now, in figure 3, let's look at what the trends are like if we use 7 years of data, versus using 25 years. I'm computing all these by using centered information -- data evenly on either side of the time of interest. We'll get to why this is best in a minute. The main thing I think this shows is that if you use the short period, you present a false impression that climate is highly variable, with trends changing from some very high positive value to a high negative value in a span not much longer than the 7 years of data you used in either case. This makes no sense for climate, but does for weather, or for misleading people. Weather, we know, does change rapidly. We can be warmer than usual for a few days (or months, or years) and then cool a few days/months/years later. Nobody (scientific) has ever said weather was going to end. Go out to the 25 year data period trends and we see, instead, that the trends have more stable behavior. They do change, which is reasonable since we do expect that climate changes. But it's no longer large magnitude flip-flops. That, too, makes sense, as climate is a big beast and turning on a dime has to be rare, if it ever happens at all.
To look a little differently at it ... if someone shows you a trend over 3 years, about 90% of what they're showing you is weather (real trends of up to 1.5 C/century, 3 year trends of up to 15 C/century -- 90% of that 15 C/century is weather). For a 7 year trend, it's about 70% weather. Weather is interesting, but if you're interested in climate, and they're claiming to be talking about climate, then they're misleading you by those 70-90% of weather they've thrown in by using such short spans.
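The arithmetic behind those percentages, as a small Python sketch. The 1.5 and 15 C/century ranges come from the figures above. The range for 7 year trends is not quoted in the text, so the second function only back-calculates what range the 70% figure would imply, which is an inference and not a measured value. Both function names are hypothetical.

```python
def weather_fraction(short_range, climate_range=1.5):
    """Fraction of a short-period trend range not accounted for by the
    long-period (climate) range. Both ranges are in C/century."""
    return 1.0 - climate_range / short_range

def implied_short_range(fraction, climate_range=1.5):
    """Short-period range (C/century) that a given weather fraction implies."""
    return climate_range / (1.0 - fraction)

# 3 year trends: up to 15 C/century, against real trends of up to 1.5.
print("3 year weather fraction:", weather_fraction(15.0))

# 7 year trends: the text gives only the ~70% figure, so back out the range.
print("implied 7 year range:", implied_short_range(0.70), "C/century")
```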
On to which data to use for computing trends (or, for that matter, averages, but I'm focusing on trends today). Figure 4 shows trends in degrees per decade as computed with data forward from the date given, backward from the date given, and centered on the date given. It also shows the NCDC monthly anomalies, which is why I went to degrees per decade -- the magnitudes come out comparable. I computed the trends using 25 years of data (300 months).
We immediately see that, indeed, the forward and backward trends are quite different, as expected in the planning note 'deciding climate trends'. The curves themselves are actually the same -- but shifted by 25 years (the period used to compute the trend -- when I use 31 years, it's shifted by 31 years, and so on). The centered trends show, again, the same behavior, but now 12.5 years (half my data period) off from the forward or backward trends. So, look at the data for anomalies versus the different trends as computed. If we look at, say, 1945 (month 780) -- a generally warm year -- we see a modestly negative trend from the centered trends, an extremely negative trend if we look forward, and an extremely positive one if we look backward. What's happening? And which makes most sense for thinking about the 1945 climate trend?
Climate is about what is normal, our expectations. For 1945, then, would we describe the typical change as one of rapid warming? rapid cooling? a modest cooling? If we look around then, the best description of the tendency is that there's a modest cooling going on. The 1930s were warm, 1945 was a particularly warm year, but going in to the 1950s, temperatures were cooler. The large trends in opposite directions that the forward and backward computations give us mislead us, even though they use an appropriate period. The centered trend computation gives us the right idea of what is going on around 1945. Repeat this inspection for other years, and I think you'll rapidly come to the conclusion that the best description of what is going on around any given time is the one from a centered computation. If a trend is downwards (negative numbers), then we expect that times before our year of interest are generally warmer than later times. If it's upwards (positive trend), then we expect later years to be warmer. Only the centered computation consistently gives us this result.
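A small Python sketch of the three ways of choosing the data window. It is an idealized illustration, not the analysis itself: the series is a made-up 'tent' (steady warming to a peak, then steady cooling) instead of the NCDC record, the window is 301 months rather than 300 so that the centered window is exactly symmetric, and all function names are hypothetical. At the peak, the backward trend sees only warming, the forward trend only cooling, and the centered trend sees both. (The real 1945 is not a symmetric peak, which is why its centered trend is modestly negative rather than zero.)

```python
def ls_slope(values):
    """Least-squares slope of values against 0, 1, 2, ... (units per step)."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def forward_trend(series, i, n):
    """Trend from the n points starting at index i."""
    return ls_slope(series[i:i + n])

def backward_trend(series, i, n):
    """Trend from the n points ending at index i."""
    return ls_slope(series[i - n + 1:i + 1])

def centered_trend(series, i, n):
    """Trend from n points centered on index i (n must be odd)."""
    half = n // 2
    return ls_slope(series[i - half:i + half + 1])

# Made-up 'tent' series: warming to a peak at index 500, then cooling.
series = [-0.001 * abs(m - 500) for m in range(1001)]
n = 301  # odd, so the centered window is exactly symmetric

print("backward at peak:", backward_trend(series, 500, n))  # warming side only
print("forward at peak: ", forward_trend(series, 500, n))   # cooling side only
print("centered at peak:", centered_trend(series, 500, n))  # both sides

# The three curves hold the same numbers, just shifted in time:
# these three calls all use the window covering indices 100..400.
same_window = (forward_trend(series, 100, n),
               backward_trend(series, 400, n),
               centered_trend(series, 250, n))
print(same_window)
```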
We now have 2 conclusions: trends should be computed with 20-30 years of data, and they should be centered on the date of interest. Let's see how it works, applied to as much data as possible. To cover the greatest time span possible, I'll take the shortest data length reasonable for computing trends -- 20 years. Since I'm doing centered computations, this lets me get to within 10 years of the start and end of the record (remember I had been skipping the first and last 31 years so that all three methods could be used, and because I didn't know if some number shorter than 31 years would be long enough). Figure 5 gives the result, in degrees per decade, and with the NCDC monthly anomalies as well.
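The bookkeeping for how close to the ends of the record a centered trend can get, as a short Python sketch. The record limits used here are inferred rather than stated outright: 1880 follows from 1945 being month 780, and 2008 from being able to reach to within 10 years of the end with a last good year of 1998. Treat both as assumptions, and `centered_span` as a hypothetical helper.

```python
def centered_span(first_year, last_year, window_years):
    """First and last years for which a centered trend can be computed,
    given that half the window must fit on each side."""
    half = window_years // 2
    return first_year + half, last_year - half

# Inferred record limits (assumptions): 1880 through 2008.
print(centered_span(1880, 2008, 20))  # 20 year centered window
```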
The most recent year we can compute a good quality trend for is 1998. The trend then is a warming of 0.19 degrees per decade, or 1.9 per century. We see that the trend is higher towards the end of the record (i.e., towards the present) than at other times, though we'd have to do additional work to decide whether the difference was physically meaningful. The thing which surprised me about the curve is that most of the time -- from 1890 to 1998 that we can compute a quality trend for -- the climate trend has been a warming. Not a matter of sometimes up, sometimes down, but rather basically up except for a bit of down in the mid-20th century. 70% of the time, we've been seeing warming.
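A figure like that 70% is just a count of how often the trend is positive. A minimal Python sketch, with loud caveats: the list of trends below is made up purely to show the counting and is not the NCDC result, and both function names are hypothetical. To reproduce the real number you would feed in the centered 20 year trends from the program output.

```python
def warming_fraction(trends):
    """Share of trend values that are strictly positive (warming)."""
    return sum(1 for t in trends if t > 0) / len(trends)

def per_decade_to_per_century(trend):
    """Convert a trend in degrees per decade to degrees per century."""
    return trend * 10.0

# Made-up trends in degrees per decade, for illustration only.
made_up_trends = [0.10, -0.20, 0.30, 0.05, -0.01, 0.20, 0.10, 0.00, 0.19, 0.02]

print("fraction of time warming:", warming_fraction(made_up_trends))
print("0.19 per decade in per-century terms:", per_decade_to_per_century(0.19))
```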
By way of summary, or postscript, or some such ...
- Whether it is to compute the average climate temperature, the variance of climate temperature, or the trends in climate temperature, we need 20-30 years of data.
- On the other hand, it is possible to compute a climate with 20-30 years of data. This didn't have to be the case. I'll show in a later post a curve with no climate in the sense I've been talking about.
- Most of the time for the period of data, we've been experiencing a climate warming.
Conclusions are, of course, limited by how data were analyzed. In this analysis, I looked only at the global mean temperature data themselves. If I'd done something more sophisticated, like carefully removing the effects of some 'weather' (short term or from outside the climate system) processes such as El Nino-La Nina oscillations, volcanoes, and solar variability, and then looking at how long was needed to get a stable estimate of the mean, or trend, I might wind up with a shorter period. On the other hand, that shorter period would only apply when people did indeed do a careful analysis of the effects of those things on global mean temperatures. This can be done, but is more sophisticated than I wanted to start with. It's also vastly more sophisticated than the many blogs and such out there which are misleading people by doing sloppy short term analysis and pretending that it's climate they're talking about.
Update 13 February 2010: The data file I worked on and the program I used are in a tar file. You'll need a fortran compiler for this, or translate it to a language of your choice. Nothing very fortran-ish is being done and the program is short.