In May 2015 opinion polling took a hammering. Most of the polls showed the General Election as very close throughout the whole year, never mind just the election period. There were a few outliers in Labour’s favour, especially from TNS, whose methodology was very different from that of other pollsters, including a self-selecting element, and appeared to be the reason. There were also a few outliers in the Tories’ favour, which in reality were the more accurate ones; most came from Lord Ashcroft, but a couple came from ICM. Why did they get the more accurate outliers? And why were the rest wrong?
Let’s deal with Scotland first. This is the plus point for all polls, whether Scotland-specific or national polls with small Scottish sub-samples. In short, the polls for Scotland were pretty much bang on. The average SNP score across all polls going into the General Election, including detailed Scotland-specific polls and national sub-samples, was 49%. The SNP scored just over 50% in the real election. When it came to Scotland, the polls, in spite of some inevitable sub-sample outliers due to low sample sizes, were overall bang on.
Throughout the campaign I was producing regular seat projections, based not on uniform national swing but on the swing in each region against the 2010 result. It is a method I still use today (albeit against the 2015 result now, of course). What was interesting was the discrepancy between the English Midlands and Wales on the one hand, and the North and South of England on the other. Going into the campaign a trend started emerging: in the Midlands and in Wales, poll after poll showed, on the whole, no swing at all. Indeed, as often as not there was a small swing to the Tories, and I flagged up again and again on my Twitter feed that pro-Tory swings and gains in these regions were possible (which proved correct, with gains from Labour in Derby North, Telford, Gower and Vale of Clwyd). The polls showed Labour doing better in the North, which they were, and poll after poll was also showing a big move to Labour in the South of England (not just in London where, as in Scotland, the polls were pretty much bang on), often with swings as high as 7-8%.
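To make the difference between the two approaches concrete, here is a minimal sketch of a regional swing projection. It is not my actual model: it is cut down to a straight Tory versus Labour contest, the seats and poll figures are made up for illustration, and the names `Seat`, `butler_swing` and `project` are hypothetical. The swing is the conventional two-party one, half the change in the gap between the two parties.

```python
from dataclasses import dataclass


@dataclass
class Seat:
    name: str
    region: str
    con: float  # Con share at the previous election, in percentage points
    lab: float  # Lab share at the previous election, in percentage points


def butler_swing(con_prev, lab_prev, con_now, lab_now):
    """Conventional two-party swing; positive means a swing to the Tories."""
    return ((con_now - con_prev) - (lab_now - lab_prev)) / 2


def project(seats, swing_by_region):
    """Apply each region's own swing to every seat in that region."""
    winners = {}
    for seat in seats:
        swing = swing_by_region[seat.region]
        con = seat.con + swing
        lab = seat.lab - swing
        winners[seat.name] = "Con" if con > lab else "Lab"
    return winners


# Made-up seats and made-up regional poll figures, for illustration only.
seats = [
    Seat("Midlands marginal", "Midlands", 40.5, 42.0),
    Seat("Southern marginal", "South", 45.0, 40.0),
]

regional = {
    "Midlands": butler_swing(40, 30, 41, 29),  # small swing to the Tories
    "South": butler_swing(50, 17, 45, 24),     # large swing to Labour
}
uniform = {"Midlands": -2.0, "South": -2.0}     # one national swing everywhere

regional_result = project(seats, regional)
uniform_result = project(seats, uniform)
print(regional_result)
print(uniform_result)
```

With these invented numbers the two methods call both seats differently, which is the whole point of working region by region: a regional sub-sample that is off by a few points flips seats that a national figure would not.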
The other noticeable thing was that if you assumed the swing in the South of England was more in line with what was going on in the Midlands and Wales, taking into account the Lib Dem collapse, my seat projections would regularly have put the Tories above 300 seats, with many showing an overall majority. Time and again it was the southern results dragging down the Tory seat score, with the Lib Dems holding on to 20-30 seats because a small number of Lib Dem respondents were holding up their percentage, and Labour gaining seats on paper again and again that proved in May never to have been on the cards.
Nationally the polls were just within the 3% margin of error, underestimating the Tories on average by 3% and overestimating Labour by 3%. The way Labour votes piled up also made it look closer than it was. As I said earlier, Labour were doing better in the North; indeed, in the North-West Labour achieved a 5% swing from the Tories across the region as a whole. However, they only made a net gain of 2 seats. The Tories did much better in marginals like Warrington South, while Labour piled up extra votes in the cities of Manchester and Liverpool, where they did not matter. The North is the first area that skewed the perception of the election. It was not that the polls were wrong, but that the effect on seats was wrongly assumed: on an even 5% swing Labour would have taken a number more seats in the North-West, and they didn’t.
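The point about votes piling up can be shown with a toy example. The four seats below are invented, not real North-West constituencies, and the function name `seats_won_by_labour` is hypothetical. Assuming seats of equal size, both scenarios have the same 5% average swing to Labour, but one spreads it evenly and the other concentrates it in seats Labour already hold.

```python
def seats_won_by_labour(seats, swings):
    """Count Labour wins when each seat gets its own Con-to-Lab swing (points)."""
    won = 0
    for (con, lab), swing in zip(seats, swings):
        if lab + swing > con - swing:
            won += 1
    return won


# Invented region: (Con share, Lab share) at the previous election.
seats = [
    (30.0, 55.0),  # safe Labour city seat
    (28.0, 58.0),  # safe Labour city seat
    (44.0, 38.0),  # Tory-held marginal
    (45.0, 37.0),  # Tory-held marginal
]

even = [5.0, 5.0, 5.0, 5.0]    # 5-point swing in every seat
uneven = [9.0, 9.0, 1.0, 1.0]  # same 5-point average, piled up in the safe seats

print(seats_won_by_labour(seats, even))    # 4
print(seats_won_by_labour(seats, uneven))  # 2
```

A regional poll reports the same 5% swing in both cases, so it cannot tell you which of the two outcomes you are heading for.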
But the biggest thing was the South of England. I suspect the problem lay in the size of the Tory lead there. On a small sample, where a small number of Liberal Democrat or Labour voters can skew the numbers, how can you consistently and accurately model a South of England Tory lead of 27% over Labour? Remember that YouGov were producing a poll every day, and most pollsters at least one a week, during the long campaign. This failure was producing a regular swing to Labour in the South, based on a handful too many Labour voters being found. In terms of my regional method, this suppressed the Tory swing against the Lib Dems by reducing the Tory vote, which also saw the Lib Dems holding on to too many seats under my method.
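As a rough guide to how noisy a lead is on a regional sub-sample, here is a small sketch using the standard formula for the sampling error of the difference between two shares from the same sample. It assumes simple random sampling, which real weighted polls are not, and the 50% and 23% shares and the sample sizes are assumptions chosen only to give a 27% lead; the function name `lead_margin_of_error` is hypothetical.

```python
import math


def lead_margin_of_error(p_a, p_b, n, z=1.96):
    """Approximate 95% margin of error on the lead p_a - p_b.

    Both shares come from the same sample of size n, so the variance of
    the lead is (p_a + p_b - (p_a - p_b) ** 2) / n.
    """
    lead = p_a - p_b
    variance = (p_a + p_b - lead ** 2) / n
    return z * math.sqrt(variance)


# Assumed shares giving a 27-point Tory lead; sample sizes are illustrative.
subsample = lead_margin_of_error(0.50, 0.23, n=300)
full_poll = lead_margin_of_error(0.50, 0.23, n=1000)

print(f"sub-sample of 300: +/- {subsample * 100:.1f} points on the lead")
print(f"full poll of 1000: +/- {full_poll * 100:.1f} points on the lead")
```

Because swing is half the change in the lead against a known previous result, the uncertainty on the swing is roughly half these figures. This only covers random sampling noise; it does not by itself explain a consistent lean in one direction.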
So how did Lord Ashcroft and ICM get the majority of the outliers? The answer lies in how they put their regions together. Most pollsters report the South of England and Greater London separately. Lord Ashcroft’s polls were different: they reported the South-East with London, and the South-West with Wales. Note that both London and Wales had Labour leads in 2010, so putting these regions together reduced the starting point for the Tory lead in what was being reported. The balance provided by London and Wales made it easier to produce a credible starting sample and increased the chances of finding something more accurate. ICM did not report Greater London separately, but in with the South. This does not dilute the lead across the whole South by as much, hence they had fewer outliers than Lord Ashcroft, but again it increased the chances of finding something more accurate, which occasionally they did.
There are enough votes in the South alone to account for the 2-3% errors in most of the polls. Yes, odd outliers in other regions will have accounted for a small amount too, but the pollsters’ southern problem was the primary cause of the polls being inaccurate last May. With no swing in the South in 2015, the problem will remain going forward to 2020, as the Tories still start with a 27% lead across the South excluding London. Can they solve it? Time will tell, but in the meantime the lesson is to be cautious when you see the South going clearly the opposite way to the rest of England and Wales. This may well be the reason for it.
So in reality, no, the polls were not that wrong. They were right in Scotland and in London, the trends were correct in the Midlands and in Wales, and in percentage terms they were right in the North. National polls cannot detect differences between safe and marginal seats, so you can’t blame the polls for that. They fell foul of an anomaly that, on a small sub-sample, is not that easy to solve when one side has such a big lead. A few extra Labour or Lib Dem supporters found in the South of England accounted in the main for the small underestimation of the Tories and the small overestimation of Labour. The polls were only wrong by a similar amount to most polls in 1997, which was ignored because Labour were so far ahead that they won anyway despite being overestimated. This time they were hammered because it was close and the outcome itself was not called. The hammering they took was, in my view, unfair, but the challenge of the South of England is a fascinating one, and it will be interesting to see if they can put it right.