Boy, blow one historic blizzard forecast and people get all cranky*.  Except, as H. Michael Mogil discusses, it was an almost perfect forecast.  For the specifics of that storm and its forecast, I refer you to Mogil's article.
I'm going to take up the narrower topic of forecast evaluation.  (Disclosure: I do work for NOAA/NWS, but, as always, this blog presents my thoughts alone.  Not least here, because I agree more with Mogil than with the head of the NWS, Louis Uccellini, about this forecast.)  One school of forecast (or model) evaluation looks at computing large scale statistics.  The most famous one for global atmospheric models is the 5 day, 500 millibar (halfway up the atmosphere), wave number 1-20 (large scale patterns), anomaly correlation.  When people refer to the ECMWF model (or 'Euro') being better than the NWS's model (GFS), this is usually the number that is being compared.  But I don't live halfway up the atmosphere, nor do most of you.  We're somewhere near the bottom of the atmosphere.  And there is much more of interest than just average temperature through a layer of the atmosphere.  So there are many other scores (dozens of them) -- see http://www.emc.ncep.noaa.gov/gmb/STATS_vsdb/ for some examples and discussion of what the scores mean.
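For the curious, here is a minimal sketch of what an anomaly correlation computes: subtract the climatological value from both forecast and observation at each point, then correlate the two sets of departures.  This is a simplified, uncentered form; it assumes equal weight for every grid point and skips the wave number 1-20 filtering, so it is not the operational calculation.  The function name and the sample heights are made up for illustration.

```python
import math

def anomaly_correlation(forecast, observed, climatology):
    """Uncentered anomaly correlation over matching grid points.

    Simplified sketch: no area weighting, no spectral (wave number) filtering.
    """
    # Anomaly = departure from the climatological value at each point.
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]

    numerator = sum(fa * oa for fa, oa in zip(f_anom, o_anom))
    denominator = math.sqrt(
        sum(fa * fa for fa in f_anom) * sum(oa * oa for oa in o_anom)
    )
    if denominator == 0:
        raise ValueError("anomalies are all zero; correlation is undefined")
    return numerator / denominator

# Hypothetical 500 mb heights (meters) at three grid points.
climatology = [5500.0, 5600.0, 5700.0]
observed = [5520.0, 5580.0, 5710.0]

print(anomaly_correlation(observed, observed, climatology))   # perfect forecast
print(anomaly_correlation([5480.0, 5620.0, 5690.0], observed, climatology))  # every anomaly has the wrong sign
```

A score of 1 means the forecast departures from normal line up perfectly with what happened; -1 means they were exactly backwards.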
Most of those scores, though, don't get to my personal -- weather forecast consumer -- interest.  Namely, I'm trying to make a decision of some kind.  NYC, which heard a forecast of 24" (61 cm) but got 9" (23 cm), presumably made decisions that it wouldn't have made had it heard the perfect forecast that hindsight now provides.  It's here, I think, that we get to the meat of forecast evaluation.  Had this same error been made over the ocean, rather than over the most populous city in the US, with the rest being as it happened, the NWS would be getting praised for its great forecast.  The important part was not the difference between reality and forecast, but the number of people who made the wrong (in hindsight) decisions.
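One way to picture that kind of score is to count the people whose decision under the forecast differs from the decision they would have made knowing the outcome.  The sketch below assumes a single yes/no decision (shut things down or not) triggered at a made-up 12" threshold, and uses a round illustrative population; only the 24" and 9" figures come from the storm itself.

```python
def shuts_down(snow_inches, threshold_inches=12.0):
    """Hypothetical decision rule: shut down at or above the threshold."""
    return snow_inches >= threshold_inches

def people_misled(locations, decide=shuts_down):
    """Sum the population of places where the forecast and the outcome
    would have led to different decisions.

    Each location is (population, forecast_inches, observed_inches).
    """
    return sum(
        population
        for population, forecast, observed in locations
        if decide(forecast) != decide(observed)
    )

# Same 24" forecast and 9" outcome, over a big city and over open ocean.
city = [(8_000_000, 24.0, 9.0)]   # illustrative round population
ocean = [(0, 24.0, 9.0)]

print(people_misled(city))
print(people_misled(ocean))
```

The forecast error in inches is identical in both cases; only the count of affected decisions changes.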
So let's explore evaluating forecasts by way of our decisions.  I don't make decisions for major metropolitan areas, and certainly not about street plowing and so forth, so I'll leave that aside.  One realm of weather-affected decisions is my running.  Let's ignore summer decisions (I'd as soon avoid thinking about what summers are like here) and follow the path as temperatures drop.  Normal gear, in pleasant weather conditions, is t-shirt and shorts.  Once it cools below 60 F (16 C), I pull on a pair of gloves for my run.
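The gloves rule is simple enough to write down, and doing so shows why the size of a temperature error is not the whole story.  A rough sketch, with made-up forecast and observed temperatures and hypothetical function names:

```python
GLOVES_BELOW_F = 60.0  # the gloves threshold described above

def wears_gloves(temp_f):
    """Gloves go on once it is cooler than 60 F."""
    return temp_f < GLOVES_BELOW_F

def forecast_led_to_right_gear(forecast_f, observed_f):
    """True if dressing for the forecast matches dressing for what happened."""
    return wears_gloves(forecast_f) == wears_gloves(observed_f)

# A 10 F miss that stays on one side of the threshold: same gear either way.
print(forecast_led_to_right_gear(72.0, 62.0))
# A 3 F miss that straddles the threshold: wrong gear.
print(forecast_led_to_right_gear(61.0, 58.0))
```

By the usual error statistics the first forecast is the worse one, yet it is the second that sends me out the door dressed wrong.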