Philip Q. Hanser of Northeastern University sent me this very nice example of DD-Type 14 dark data:

‘I was studying the question of wind turbine output variability to try to understand how much non-wind turbine back-up resources are needed to ensure the electrical system’s reliability. Of particular concern in this regard is the intra-hourly variability because of the very short time required to respond to changes in wind turbine output. As a result, we were using five-minute output data, the shortest time interval available. The data looked mostly right, but to check, I did a spectral analysis to see what the highest frequencies were. That analysis suggested a much higher degree of regularity than would be expected. I called the U.S.’s National Renewable Energy Laboratory (NREL), the data’s source, to enquire about the data. It turned out that the only data that NREL had was hourly data. To satisfy the needs of users who wished shorter time period data, they had hired an engineering firm that filled the in-between hourly data with a “typical” shape for a wind turbine, which, of course, made the data useless for our purposes. Of the thousands of data points in our data set, only about 1/12th were real data.’
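
To see why a spectral check can expose this kind of filling, here is a minimal sketch in Python. It is not Hanser’s actual analysis: the synthetic series, the sinusoidal “typical” shape, and the function name `hourly_harmonic_share` are all assumptions made for illustration. The idea is that a fixed within-hour shape repeated every hour puts its power exactly on the hourly frequency and its multiples, whereas genuine five-minute fluctuations spread their power across all the sub-hourly frequencies.

```python
import cmath
import math
import random

SAMPLES_PER_HOUR = 12  # five-minute readings


def hourly_harmonic_share(series):
    """Share of sub-hourly spectral power lying exactly on hourly harmonics."""
    hours = len(series) // SAMPLES_PER_HOUR
    n = hours * SAMPLES_PER_HOUR          # use whole hours only
    data = series[:n]
    on_harmonic = 0.0
    total = 0.0
    # Bin k completes k cycles over the record, so k == hours is one cycle per hour.
    for k in range(hours, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(data))
        power = abs(coeff) ** 2
        total += power
        if k % hours == 0:                # the hourly frequency and its multiples
            on_harmonic += power
    return on_harmonic / total


rng = random.Random(0)                    # fixed seed so the sketch is repeatable
hours = 24

# Hypothetical "filled" data: one real value per hour plus a fixed typical shape.
hourly = [5.0 + 0.1 * rng.uniform(-1, 1) for _ in range(hours)]
shape = [math.sin(2 * math.pi * m / SAMPLES_PER_HOUR)
         for m in range(SAMPLES_PER_HOUR)]
filled = [hourly[t // SAMPLES_PER_HOUR] + shape[t % SAMPLES_PER_HOUR]
          for t in range(hours * SAMPLES_PER_HOUR)]

# Hypothetical "genuine" data: independent fluctuation at every five-minute step.
genuine = [5.0 + rng.uniform(-1, 1) for _ in range(hours * SAMPLES_PER_HOUR)]

filled_share = hourly_harmonic_share(filled)
genuine_share = hourly_harmonic_share(genuine)
print(f"filled:  {filled_share:.2f}")
print(f"genuine: {genuine_share:.2f}")
```

In this toy setting nearly all of the filled series’ sub-hourly power sits on the hourly harmonics, while the genuine series shows only a small share there. Real wind data would be far messier, so a high share is a prompt to ask the data’s source, as Hanser did, rather than proof of anything.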

Philip’s example reminded me of another case I encountered, involving the same sort of idea, but in a rather different way. In the past, the standard approach to measuring pulse rates was to count the number of beats that occur in 15 seconds and multiply by four to give the rate per minute, so every figure obtained this way should be a multiple of four. In the case I saw, none of the figures were divisible by four. When asked about this, the research nurse confessed to having made up the numbers.
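
The arithmetic behind this check is simple enough to sketch in Python. The recorded values below are hypothetical, invented purely for illustration, and the function names are my own. A rate obtained by multiplying a 15-second count by four must be a multiple of four. Even if rates had been measured some other way, roughly one value in four would be expected to be a multiple of four by chance, assuming the values are independent.

```python
def share_divisible_by_four(rates):
    """Fraction of recorded pulse rates that are multiples of four."""
    return sum(1 for rate in rates if rate % 4 == 0) / len(rates)


def chance_of_no_multiples(n):
    """Probability that none of n independent values is a multiple of four,
    assuming each has a one-in-four chance of being one."""
    return 0.75 ** n


# Hypothetical recorded rates (beats per minute), none divisible by four.
recorded = [71, 66, 82, 77, 69, 73, 86, 62, 79, 75]

# Hypothetical 15-second counts, multiplied by four as the standard method requires.
counted = [4 * beats for beats in [18, 17, 20, 19, 16, 21, 18, 17, 19, 20]]

print(share_divisible_by_four(recorded))   # 0.0
print(share_divisible_by_four(counted))    # 1.0
print(round(chance_of_no_multiples(len(recorded)), 3))  # 0.056
```

With only ten values, seeing no multiples of four at all would already be fairly unlikely by chance, and the probability shrinks quickly as the number of recorded values grows.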