'Analysis' is what we call an attempt to represent the state of the atmosphere/ocean/sea ice/... given a set of observations. One such analysis is the global surface air temperature analysis. That, then, spawns efforts to find a global mean temperature, or global mean temperature trends, and so forth. Several of the recently-added blogs aim to study that, in one way or another. That particular analysis is outside my own interest for two reasons.
One is, I'm an oceanographer, so I'm more interested in a sea surface temperature (sst) analysis. The other is, most of the interest in the surface air temperature analysis seems to come from its role as a detector of climate change. On the scale of things, I consider this the second weakest climate change indicator. The only thing weaker, in my view, is the so-called 'Hockey Stick'. But enough raw opinion.
Regardless of what it is you're trying to analyze, and what your reason for doing so is, there are quite a few ways of setting about doing so objectively, each with strengths, each with weaknesses. The fact that there are many makes this the first of something like eight notes I'll be writing up on the idea.
The simplest one, if not as simple as you might think, is the 'drop in a bucket' method.
First, a bit of language. We typically divide the earth's surface into a bunch of boxes/cells. Also typically, they're some number (or fraction) of degrees of latitude by so many degrees (or fraction) of longitude. Depending on which sst analysis I'm looking at, a cell can be anything from 5 degrees on a side to 1/100th of a degree on a side.
The basic idea for drop in a bucket is very simple -- if you have temperature observations in a cell, you use them to find the temperature of (analyze) that cell. If you have no temperatures in a cell, then you have no analysis for that cell. So 0 observations in a cell is very easy -- you report no analysis. 1 observation is also very easy, your analysis temperature is the temperature from that one observation.
But what about having more than 1 observation in a cell? The very simplest thing to do is just average all of them. We just blindly treat all observations as being equally good. Hmm. That sounds a bit problematic. Some observing methods are better than others, after all. The quality of the observations is described by the standard error, which is the standard deviation of the difference between the observed value and the true value -- computed after you have many such observation-to-truth comparisons.
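Here's a minimal sketch of the blind-averaging version of drop in a bucket, just to make the bookkeeping concrete. The function name, the (lat, lon, temperature) tuple layout, and the way cells are indexed are all my own choices for illustration, not something any particular analysis is known to use.

```python
import math
from collections import defaultdict

def drop_in_bucket(observations, cell_deg):
    """Blind-average observations into latitude/longitude cells.

    observations: iterable of (lat, lon, temp_c), with lat in [-90, 90)
                  and lon in [-180, 180) (an assumed convention).
    cell_deg:     cell size in degrees on a side.
    Returns {(row, col): mean temperature}. Cells with no observations
    are simply absent, i.e. 'no analysis' for that cell.
    """
    buckets = defaultdict(list)
    for lat, lon, temp in observations:
        row = math.floor((lat + 90.0) / cell_deg)   # count cells up from the south pole
        col = math.floor((lon + 180.0) / cell_deg)  # count cells east from 180 W
        buckets[(row, col)].append(temp)
    # Every observation in a cell gets the same weight.
    return {cell: sum(temps) / len(temps) for cell, temps in buckets.items()}

# Three made-up observations: two fall in the same 5 degree cell, one elsewhere.
obs = [(10.2, -30.1, 25.0), (11.7, -33.4, 26.0), (-45.0, 100.0, 8.5)]
analysis = drop_in_bucket(obs, cell_deg=5.0)
for cell, temp in sorted(analysis.items()):
    print(cell, temp)
```

Note that the dictionary only has entries for cells that received at least one observation; everything else stays unanalyzed.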
For typical drifting buoys and satellite methods, this is about 0.5 degrees. For ships, let's say 1 degree. This being science, of course I mean degrees C. One could pursue this to substantial complexity, as it's probably the case that every type of buoy has a somewhat different standard error, different ship observing methods have different standard error, and the different satellites and satellite methods have still other standard errors. Life is probably no simpler for surface air temperature observations; and for rain it's even harder.
For the sake of illustration, let's consider a cell with a buoy that observed a temperature of 25 C, and a ship in the same area that observed 26 C. In blind averaging, we'd treat them as equal, and give our analysis as 25.5 C. But ... the buoy is a better observer than the ship. Shouldn't our analysis be closer to the buoy? Maybe we should just throw out the worse observer?
Probably not. The observations have a distribution of likelihood (which is not the vertical axis! beware!) around the value they report. For each observation, the most likely value is what is reported. But those standard errors give us a curve of likelihood. The ship is in orange, the buoy in blue, and a third thing in black:
The third curve is where we get to the creative part. What it is, is that I've multiplied the likelihood curves for the buoy and the ship. The result is a joint likelihood. The peak of the curve is our point of maximum likelihood. It's a temperature of 25.2 C. You can, in principle, do this sort of thing graphically regardless of how many observations you have. It gets tedious and ugly, of course. That's why we invented mathematics. In this case, calculus. (See bottom for the gory math details).
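If you'd rather let the computer do the graphical version, here is a rough sketch. It assumes each likelihood curve is a Gaussian centered on the reported value with the standard error as its width, uses the 0.5 C (buoy) and 1 C (ship) standard errors from above, and the grid range and 0.01 C spacing are arbitrary choices of mine.

```python
import math

def likelihood(t, observed, std_err):
    """Unnormalized Gaussian likelihood that the true temperature is t."""
    return math.exp(-((t - observed) ** 2) / (2.0 * std_err ** 2))

# (observed temperature in C, standard error in C)
buoy = (25.0, 0.5)
ship = (26.0, 1.0)

# Candidate true temperatures from 23.00 C to 28.00 C in 0.01 C steps.
grid = [23.0 + 0.01 * i for i in range(501)]

# Multiply the two curves point by point to get the joint likelihood.
joint = [likelihood(t, *buoy) * likelihood(t, *ship) for t in grid]

# The peak of the joint curve is the maximum likelihood temperature.
peak_temp = grid[joint.index(max(joint))]
print(f"maximum likelihood temperature: {peak_temp:.2f} C")
```

The peak lands at 25.2 C to within the grid spacing, much closer to the buoy than to the ship.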
We also see that our resulting estimate, the maximum likelihood curve, is narrower than the two original ones. The more observations we have, the better our resultant estimate -- even better than the original observations. This is the same sort of thing we saw in 'How can annual average temperatures be so precise?'
Our estimate based on considering the quality of the different observing platforms is 25.2, rather than the 25.5 of treating them as equally good. When we're looking to deal with climate change and detecting small differences over time, it's obviously important to pay attention to this sort of change. If you change from one to the other, there's a 0.3 C change -- not because of climate, but because you changed your methods for filling cells. (Not a mistake I think anyone has made, but a heads up if you are getting started.)
The general term for this sort of thing (treating some observations as better than others) is 'weighting'. We give more weight to some sources than others. I'll give the exact method at bottom for the mathy folks. Different methods that we'll be getting to will do their weighting in different ways.
Now let's go back to thinking about the general approach. We select boxes of some size, and then if there's no observation in the box, we say that we know nothing about what's going on in the box. There are about 5000 drifting buoy observations per day. If the cells are 5 degrees on a side, and the buoys are distributed randomly, there are about 3 observations per grid cell and we are doing pretty well. On the other hand, if our cells are 1/100 degree on a side, then probably only one cell in 80,000 has an observation. How did we go from knowing most of the globe pretty well (3 observations, obs, per cell on average) to knowing almost nothing? On the other hand, for every cell where you say you know something, you definitely have at least one observation in support, and you haven't made any assumptions (at least not past selecting the cell size).
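For anyone who wants to play with the cell-size arithmetic, here's a back-of-the-envelope sketch. The 0.7 ocean fraction is my own round-number assumption, and the count ignores the fact that latitude-longitude cells shrink toward the poles, so treat the results as ballpark figures only.

```python
def obs_per_ocean_cell(n_obs, cell_deg, ocean_fraction=0.7):
    """Average number of observations per ocean cell on a lat-lon grid.

    ocean_fraction is an assumed round number; cells are counted as if
    they all had equal area, which they do not.
    """
    n_lat = round(180 / cell_deg)   # rows from pole to pole
    n_lon = round(360 / cell_deg)   # columns around the globe
    ocean_cells = n_lat * n_lon * ocean_fraction
    return n_obs / ocean_cells

coarse = obs_per_ocean_cell(5000, 5.0)    # 5 degree cells
fine = obs_per_ocean_cell(5000, 0.01)     # 1/100 degree cells

print(f"5 degree cells:     about {coarse:.1f} obs per cell")
print(f"1/100 degree cells: about 1 obs per {1 / fine:,.0f} cells")
```

Under these assumptions the coarse grid comes out near 3 observations per cell, and the fine grid has one observation for something on the order of 100,000 cells.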
But suppose you are running a numerical weather prediction (NWP) model. You can't accept an 'I don't know' for starting your prediction. Consequently the people involved in NWP were among the first to develop more advanced methods. That's next.
Methods (so far):
1a) Drop in bucket, blind averaging
1b) Drop in bucket, maximum likelihood averaging
Gory math details:
To find the maximum likelihood estimate for temperatures given N observations, multiply together your N curves, each of the form exp( - (T-T_i)^2/(2 s_i^2)), and find the T for which this is a maximum. T_i is the ith observation, s_i is its standard error.
The maximum likelihood value is sum(T_i / s_i^2) / sum(1 / s_i^2).
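As a sketch, that formula is only a few lines of code. The function name and the (T_i, s_i) pair layout are mine, chosen for illustration.

```python
def max_likelihood_temperature(observations):
    """Return sum(T_i / s_i^2) / sum(1 / s_i^2).

    observations: iterable of (T_i, s_i) pairs, where T_i is the observed
    temperature and s_i is that observation's standard error.
    """
    pairs = list(observations)
    if not pairs:
        raise ValueError("no observations in this cell, so no analysis")
    weighted_sum = sum(t / s ** 2 for t, s in pairs)
    weight_total = sum(1.0 / s ** 2 for _, s in pairs)
    return weighted_sum / weight_total

# Buoy at 25 C (standard error 0.5 C) and ship at 26 C (standard error 1 C).
estimate = max_likelihood_temperature([(25.0, 0.5), (26.0, 1.0)])
print(f"{estimate:.2f} C")

# With equal standard errors this reduces to the blind average.
blind = max_likelihood_temperature([(25.0, 1.0), (26.0, 1.0)])
print(f"{blind:.2f} C")
```

The first call gives 25.2 C, matching the peak of the joint curve, and the second gives the 25.5 C of blind averaging.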
I'll leave it for those interested to compute the standard error of the maximum likelihood estimate itself.