The data are available at the Data Repository of the IDSC – Research Data Center of IZA. The paper in which we first studied the properties of these data is published in the Journal of Forecasting.

The Toll Index has been widely covered in the national and international press (selection):

On September 21 I published the Toll Index for the month of August. How much less surprised would you have been had you used the Toll Index to forecast exports?

If $X_t$ are monthly exports (in billions of euros), $T_t$ is the Toll Index (border-crossing trucks), $\Delta$ is the monthly difference operator and $D_{m,t}$ are monthly dummies, and you had estimated a model like this:
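One specification consistent with this description is sketched here. It is an assumption, not necessarily the exact model estimated: the symbols $X_t$ (exports), $T_t$ (Toll Index) and $D_{m,t}$ (monthly dummies), the coefficients $\alpha$, $\beta$, $\gamma_m$, and the use of plain first differences rather than log differences are all illustrative choices.

$$\Delta X_t = \alpha + \beta\,\Delta T_t + \sum_{m=2}^{12} \gamma_m D_{m,t} + \varepsilon_t$$

A forecast for the latest month then follows from plugging the newly published Toll Index change into the fitted equation.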

you would have come up with +5.7%, which implies you would have been much less surprised. There is still some difference to the 7.2%, but first, this is a preliminary estimate; second, there is some noise in the world; and third, the model was built and run in under 5 minutes. Of course you would be almost as well informed if you just saw this graph:

Before we go on to the results, let me remind you of the Diversity Prediction Theorem, a fact that is very easy to verify. If $n$ individuals are asked to guess a true measurement $\theta$, $s_i$ are their single guesses and $c = \frac{1}{n}\sum_{i=1}^{n} s_i$ is the mean of their guesses (also called the crowd guess), then the theorem states:

$$(c - \theta)^2 = \frac{1}{n}\sum_{i=1}^{n}(s_i - \theta)^2 - \frac{1}{n}\sum_{i=1}^{n}(s_i - c)^2$$

which simply reads that the squared error of the crowd’s guess equals the mean squared individual error minus the variance of the individual guesses (equivalently, of the individual errors). This means that the crowd never does worse than its average member, and that its advantage grows as the variance increases.
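The identity is easy to check numerically. The following is a minimal sketch in Python; the function name `diversity_decomposition` and the five guesses are made up for illustration and are not taken from the election data.

```python
def diversity_decomposition(guesses, truth):
    """Return (crowd squared error, mean individual squared error, diversity)."""
    n = len(guesses)
    crowd = sum(guesses) / n  # the crowd guess is the plain average
    crowd_sq_error = (crowd - truth) ** 2
    mean_sq_error = sum((g - truth) ** 2 for g in guesses) / n
    diversity = sum((g - crowd) ** 2 for g in guesses) / n  # variance of the guesses
    return crowd_sq_error, mean_sq_error, diversity


# Made-up guesses of a quantity whose true value is assumed to be 30.0
guesses = [28.0, 33.0, 35.0, 26.0, 31.0]
crowd_err, mean_err, diversity = diversity_decomposition(guesses, truth=30.0)

# The theorem: crowd error = mean individual error - diversity
print(crowd_err, mean_err - diversity)
```

Up to floating-point rounding the two printed numbers agree, and because the diversity term cannot be negative, the crowd's squared error can never exceed the mean individual squared error.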

This works well when the errors of the crowd members are uncorrelated and fails miserably when they are not, as manias, information cascades and bubbles show.

On a party-by-party basis we can thus compute the mean individual squared error versus the crowd’s squared error, and we can do so for the crowd of laymen as well as for the crowd of the 13 pollsters, whose last polls can be found here together with the guesses of the laypeople.

In each cell the left-hand number is the mean individual (polling firm or layperson) squared error, and on the right you see the squared error of the crowd (of firms or laypeople). Clearly, with the exception of Die Grünen and Die Linke, the crowd of laypeople guessed much better than the pollsters.
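The pair of numbers in each cell can be computed as in the following sketch. The party labels, results and forecasts are hypothetical placeholders, not the actual polls, guesses or election results, and the helper name `cell` is an assumption made for this example.

```python
def cell(forecasts, result):
    """Return (mean individual squared error, crowd squared error) for one party."""
    n = len(forecasts)
    crowd = sum(forecasts) / n  # crowd guess = average forecast
    mean_individual = sum((f - result) ** 2 for f in forecasts) / n
    return mean_individual, (crowd - result) ** 2


# Hypothetical vote shares in percent, for illustration only
results = {"Party A": 33.0, "Party B": 20.5}
forecasts = {
    "Party A": [36.0, 37.0, 35.0, 34.0],
    "Party B": [22.0, 21.0, 23.0, 20.0],
}

# One (left, right) pair per party, as in the table described above
table = {party: cell(forecasts[party], results[party]) for party in results}
for party, (individual, crowd) in table.items():
    print(f"{party}: mean individual = {individual:.2f}, crowd = {crowd:.2f}")
```

Running the same function once on the pollsters' forecasts and once on the laypeople's guesses gives the two groups' cells for a party.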

One can legitimately ask whether the pollsters should ask people to predict the elections instead of (or in addition to) asking them to reveal their preference. The former is immune to strategic response whereas the latter is not. Greetings from Galton.

Caveats: The laymen were still guessing on Sunday, perhaps giving them an advantage over the pollsters. The crowd of pollsters is 13 strong whereas the laymen number 4,144, also an advantage for the laypeople. On the other hand, the pollsters have the entire statistical toolkit at their disposal and a larger sample than each of the laymen.

For the convenience of Stata users here are the datasets:

- Preliminary official results (final.dta)
- Laymen’s guesses (btw2017.dta)
- Pollsters’ last forecasts (pollsters.dta)
