To blend or not to blend, that is the question!

The large ECAD site of the European Climate Assessment & Dataset holds two versions of its data files: a non-blended and a blended one. The non-blended files are said to contain simply the raw data as submitted by the meteorological station, with -9999 marking missing or unacceptably bad values (a quality-code column holds 0 if the data are OK and, for instance, 9 if the data are missing). The blended series fill in missing data with measurements taken from the real-time SYNOP messages that are broadcast continuously; for this in-filling, only data from stations no further than 12.5 km away should be used.
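As a minimal sketch of how the missing-value marker and the quality code can be handled when reading such a file: the helper below parses a single data row. The names `Reading` and `parse_line` are hypothetical, and the column order (date, value, quality code as the last three comma-separated fields) is an assumption about the file layout, not something stated above.

```python
from typing import NamedTuple, Optional

MISSING = -9999  # missing-value marker described in the text


class Reading(NamedTuple):
    date: str             # YYYYMMDD string (assumed date format)
    value: Optional[int]  # raw value from the file, None if unusable
    quality: int          # 0 = ok, 9 = missing (codes named in the text)


def parse_line(line: str) -> Reading:
    """Parse one data row.

    Assumption: the last three comma-separated fields are
    DATE, VALUE, QUALITY. Any leading identifier columns are ignored.
    """
    fields = [f.strip() for f in line.split(",")]
    date = fields[-3]
    raw = int(fields[-2])
    quality = int(fields[-1])
    # Treat the value as unusable if it carries the missing marker
    # or if the quality code is anything other than 0.
    value = None if raw == MISSING or quality != 0 else raw
    return Reading(date, value, quality)


# Two made-up rows, for illustration only.
ok_row = parse_line(" 100123,20120704,  231,    0")
gap_row = parse_line(" 100123,20110101,-9999,    9")
```

Rejecting every row whose quality code is not 0 is the strictest choice; a looser filter could be swapped in if intermediate codes should be kept.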
Now, what does this mean in practice? I did some detective work using the TN and TX (= minimum and maximum daily temperature) data files of daily measurements made at the Findel airport meteorological station. As said in a previous blog post, the FINDEL series have many missing days in 2011 (the first 5 months are missing entirely) and in 2012 (September is missing). Mr. Jacques Zimmer from MeteoLUX (the organisation running the FINDEL meteorological station) kindly told me that the problem was not a lack of measurements, but long-lasting and intermittent software problems with the transmission of the data to the official European collector.

1. Differences between the ECAD TN and TX non-blended and blended long-time series

The files with the TN and TX data are most easily found on the ECA&D website by following the links “Daily data” … “Custom Search” and entering Luxembourg and Luxembourg Airport. Both types of files start on 01 January 1947 and end on 31 December 2012. A quick glance immediately shows the huge gaps in the non-blended versions during 2011 and 2012, but it is not clear how many differences exist during the previous years from 1947 to 2010. To make an easy comparison of the text files, I used the excellent freeware ExamDiff from Prestosoft.
Well, the result comes as a surprise: from 1947 up to and including 2010, the non-blended and blended files are exactly the same. The versions differ only during the last 2 years (2011 and 2012), where most (but not all) of the missing data have been filled in. Nevertheless, there remain 7 days flagged -9999 (01 to 03 January, 06 Mar, 03 Apr and 18 May). I do not understand why these few holes have not been filled in!
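The same comparison can be scripted instead of done with a diff tool. The following is only a sketch under assumptions: each series is taken to be already loaded into a dictionary mapping a YYYYMMDD date string to the raw file value, the function names are hypothetical, and the numbers are made up for illustration.

```python
from collections import Counter

MISSING = -9999  # missing-value marker used in the files


def differing_dates(non_blended, blended):
    """Return the sorted dates on which the two series disagree."""
    dates = set(non_blended) | set(blended)
    return sorted(d for d in dates if non_blended.get(d) != blended.get(d))


def differences_per_year(non_blended, blended):
    """Count the differing days per calendar year (first 4 date characters)."""
    return Counter(d[:4] for d in differing_dates(non_blended, blended))


def still_missing(series):
    """Return the sorted dates that are still flagged as missing."""
    return sorted(d for d, v in series.items() if v == MISSING)


# Toy data: identical in 2010, partly filled in for 2011 and 2012.
non_blended = {"20101231": 12, "20110101": MISSING,
               "20110102": MISSING, "20120915": MISSING}
blended = {"20101231": 12, "20110101": MISSING,
           "20110102": 35, "20120915": 148}

changed = differing_dates(non_blended, blended)
per_year = differences_per_year(non_blended, blended)
holes = still_missing(blended)
```

A per-year count of zero for every year before 2011 would correspond to the finding described above.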

Now a serious question is this: can one reasonably assume that during the whole period 1947 to 2010 there have been no missing raw data in the FINDEL series? I think not! The conclusion is that after some delay (let’s say two years), the original raw data have disappeared, and even series flagged as “raw” (as does the KNMI Climate Explorer, which leads to the same files) are in fact modified series. So any hope of going back to the “originals” for data re-evaluation is doomed.

2. Influence of blending on Findel mean yearly DTR

If we use the year 2011 and compute the DTR (diurnal temperature range, i.e. TX minus TN), we find the following:

– average DTR using the Jan. to May data sent by Mr. Jacques Zimmer and the non-blended series for the remainder of the year: DTRavg = 8.12 (8.122740)

– average DTR using the blended ECAD series (which still contain 7 missing days): DTRavg = 7.92 (7.922409)

The absolute difference is 0.30 °C, the blended value being 3.8% lower, a non-negligible difference.
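For readers who want to redo this kind of average, here is a minimal sketch of a mean DTR computed from daily TN and TX values. It assumes that days on which either value is missing are simply skipped; the post does not say how such days were treated. The function name and the toy values are hypothetical.

```python
MISSING = -9999  # missing-value marker used in the files


def mean_dtr(tn, tx, missing=MISSING):
    """Average of the daily differences TX - TN.

    tn and tx map a date string to a temperature. A day is used only
    if both values exist and neither carries the missing marker
    (an assumption, see the lead-in).
    """
    diffs = [
        tx[d] - tn[d]
        for d in tn
        if d in tx and tn[d] != missing and tx[d] != missing
    ]
    if not diffs:
        raise ValueError("no day with both TN and TX available")
    return sum(diffs) / len(diffs)


# Toy data in degrees C; the second day has no usable TN.
tn = {"20110101": 1.0, "20110102": MISSING, "20110103": 3.0}
tx = {"20110101": 6.0, "20110102": 8.0, "20110103": 11.0}

dtr_avg = mean_dtr(tn, tx)  # (5.0 + 8.0) / 2
```

Because skipped days drop out of both the numerator and the denominator, a yearly mean can shift noticeably when the missing days cluster in one season.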

Using only the blended files for the 2002 to 2012 DTR anomalies (with respect to the 2002-2011 mean), one finds a negative trend of -0.34 °C per decade. Using the non-blended data (except that the missing September 2012 days are filled in from the blended data), the trend is -0.27 °C per decade.
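The two steps behind such a figure, anomalies relative to a reference period followed by a trend, can be sketched as below. This assumes an ordinary least-squares line through the yearly values, which is a common choice but not necessarily the method used for the numbers above; the function names and the synthetic series are hypothetical.

```python
def anomalies(yearly, ref_start, ref_end):
    """Subtract the mean over the reference years from every yearly value."""
    ref = [v for y, v in yearly.items() if ref_start <= y <= ref_end]
    baseline = sum(ref) / len(ref)
    return {y: v - baseline for y, v in yearly.items()}


def trend_per_decade(series):
    """Ordinary least-squares slope of value against year, times 10."""
    years = sorted(series)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(series[y] for y in years) / n
    sxy = sum((y - mean_x) * (series[y] - mean_y) for y in years)
    sxx = sum((y - mean_x) ** 2 for y in years)
    return 10 * sxy / sxx


# Synthetic yearly mean DTR falling by 0.03 degrees C per year.
yearly_dtr = {2002 + i: 8.0 - 0.03 * i for i in range(11)}

anom = anomalies(yearly_dtr, 2002, 2011)
slope = trend_per_decade(anom)  # close to -0.3 degrees C per decade
```

Subtracting a constant baseline does not change the slope, so the trend of the anomalies equals the trend of the yearly means themselves.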

The following picture summarizes the comparison, including the BEST DTR anomalies (up to 2011). It should be noted that BEST gives only mean monthly DTR, not daily values as ECAD does; so the BEST points represent yearly averages computed from monthly means, whereas the meteoLCD (green points) and FINDEL (red and pink points) values represent yearly averages computed from daily values.


The picture in the previous blog post showed a slightly positive trend for FINDEL; the reason is that I had replaced the missing FINDEL data by those of meteoLCD, multiplied by a calibration factor. As a general rule, one should not expect DTR patterns to be the same, even for relatively close stations. The measured DTR represents the local climate, which can diverge strongly from a pattern computed using data from neighboring stations, as BEST does.

Marcel Severijnen, a regular correspondent, also made a comparison in his blog of the DTRs of some big and well-known Dutch weather stations such as de Bilt, Maastricht, Vlissingen etc. He shows the following graph:


All these stations are less than 300 km apart, yet they show really different behavior: the inland station de Bilt could reflect the influence of a ~120-year double-AMO-related oscillation, whereas the damped oscillation of coastal Vlissingen seems close to the AMO period of ~60 years. Please read the full text (in Dutch) of Marcel’s comment here.

3. Conclusion

3.1. It seems that the ECA&D non-blended files are in reality blended ones, except for the most recent years. Thus the raw data are lost in this big European dataset. This conclusion is preliminary, as it is based on a single station. A more stringent analysis is badly needed.

3.2. DTR patterns of a station reflect the local micro-climate. They can be hugely different from those of other neighboring stations, and evidently from averages computed over extended regions.
