Our model combines data provided by YouGov with all publicly released national and constituency polls, historical election results, and data from the UK Census. Daily updates to the website are posted by Jack Blumenau, London School of Economics, most recently on 05 May 2015. To read commentary on the election using these forecasts, follow Election4castUK on Twitter. If you would like to give us feedback on this forecast, please email us at
Will any party have 326 or more seats?
Which party will have the most seats?
Our GB vote forecasts, starting one year before the election. Why are these forecasts different from the current polls?
Our current prediction is that there will be no overall majority, but that the Conservatives will be the largest party with 281 seats. However, based on the historical relationships between the sources of information we are using in our forecast and the outcome of UK elections, we know there is substantial uncertainty in our forecast. The sidebar at right includes predictive probabilities of the key outcomes of the election, as well as vote and seat forecasts for each party with 90% uncertainty intervals.
When reading our seat predictions, please keep in mind that our model may not know as much about your specific seat of interest as you do. The model knows how the general patterns of support across the UK have changed in constituencies with different kinds of political, geographic and demographic characteristics. The model uses the Ashcroft constituency polls where available, plus smaller samples of polling data for every constituency, extracted from pooling many national-level polls. However, the model does not know whether your MP is beloved by constituents or embroiled in scandal, nor does it know whether Boris Johnson or Nigel Farage is standing in your constituency, let alone what the implications of that might be. Some of this might be picked up in the polls, but not all of it will be, and we do not have much polling data to go on when it comes to constituencies. In the aggregate, these aspects of constituency-specific competition tend to average out across parties, but they certainly matter in individual constituencies. Think of our seat-level projections as a baseline for what you might expect from past election results, geography and demography, plus a little bit of polling data.
|Sortable table of predicted vote share for every party in every seat.|
|Sortable table of 90% prediction intervals for vote share for every party in every seat.|
|Sortable table of predicted probability of victory for every party in every seat.|
The following tables focus on potential seat gains and losses for each of the parties, including only those seats for which the probability of a change of control is estimated at over 10%. If a table is blank, there are currently no such seats.
The following table provides the individual seat predictions, aggregated up to England, Scotland and Wales. Please note that these may not exactly match the totals in the main forecast table, as they are based on the individual seat forecasts.
The following table provides the individual seat predictions (columns), aggregated by the party that won the seat at the 2010 general election (rows). Please note that these may not exactly match the totals in the main forecast table, as they are based on the individual seat forecasts.
The votes and seats totals shown elsewhere on this site are forecasts. They are predictions about what will happen on 7 May 2015. These forecasts are based on where we think the polls are today, combined with historical evidence about how support for parties evolves as elections approach. As such, they will have significant uncertainty until just before the election. Because they are predictions about what will happen on 7 May, there is no way to evaluate them until the election occurs.
This tab provides our estimates of what GB and constituency polls, conducted today, would show. This allows us to get a sense of what to expect when new polls come out from the various pollsters, as well as providing a way to check whether certain aspects of our model are working properly.
For GB polls, we have computed 90% prediction intervals for all of the pollsters, for all of the GB parties. We start from our pooled estimate of national vote intention. We then take into account that pollster's general tendency to have certain parties at higher or lower vote share relative to other parties (that is, house effects, see the current estimates of these here). Finally, to estimate how much variation we can expect from poll to poll, we also take into account the typical number of respondents giving a party preference in that pollster's polls, which varies substantially between pollsters (telephone polls typically have samples in the 450-750 range, online polls in the 750-1500 range). We use the number of weighted respondents reported by each pollster in their data tables, before any allocation of respondents who said they did not know how they would vote.
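A minimal sketch of how such a range could be approximated, assuming simple random sampling and a normal approximation. It is an illustration rather than the model itself: it ignores uncertainty in the pooled estimate and in the house effect, and the function name and all input values are hypothetical.

```python
import math

Z_90 = 1.645  # standard normal quantile for a central 90% interval


def poll_prediction_interval(pooled_share, house_effect, n_respondents):
    """Approximate 90% range (in percentage points) for one party in one pollster's next poll."""
    # Expected share for this pollster: pooled estimate shifted by its house effect.
    expected = pooled_share + house_effect
    p = expected / 100.0
    # Binomial sampling error for a simple random sample of this size (an assumption).
    standard_error = 100.0 * math.sqrt(p * (1.0 - p) / n_respondents)
    return expected - Z_90 * standard_error, expected + Z_90 * standard_error


# Hypothetical inputs: a pooled share of 34%, a +0.5 point house effect,
# and two typical sample sizes (one online-sized, one telephone-sized).
online = poll_prediction_interval(34.0, 0.5, 1290)
phone = poll_prediction_interval(34.0, 0.5, 530)
print("online: %.1f to %.1f" % online)
print("phone:  %.1f to %.1f" % phone)
```

The smaller telephone-sized sample gives a visibly wider range, which is the main reason the intervals differ so much between pollsters.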
If a new poll comes out, and it is inside the range we show for that pollster for all parties, this is an indication that the poll is not surprising given that pollster's house effects and where the polls are more generally. This can occur even if the pollster shows a substantial swing from their previous poll, as there is substantial uncertainty in every poll. One of our goals in posting this information is to make it clearer how much variation to expect from poll to poll for each pollster.
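If a new poll comes out, and it is outside the range we show for that pollster for one or more parties, there are two possible interpretations: either the poll is an outlier of the kind that sampling variation will occasionally produce, or public opinion has genuinely shifted.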
While the latter is the more exciting interpretation, the former will usually be the correct one.
So, if a new GB poll is inside our prediction interval for that pollster, it probably indicates nothing has changed, and if it is outside our prediction interval for that pollster, it still probably indicates nothing has changed. This reflects the fact that, in general, no single poll is going to give a very strong indication that public opinion has actually changed. Individual polls are too uncertain, relative to the magnitude of real swings in voting intention. The only way we can be confident that change has occurred is if we see evidence from many polls.
|Pollster||Con||Lab||LD||SNP||PC||Grn||UKIP||Oth||N|
|YouGov||32 - 37||32 - 36||7 - 9||3 - 5||0 - 1||4 - 6||11 - 14||1 - 2||1290|
|Ashcroft||30 - 37||28 - 35||6 - 10||3 - 7||0 - 2||5 - 8||10 - 15||1 - 3||530|
|ComRes||31 - 37||30 - 36||7 - 11||3 - 5||0 - 1||4 - 7||10 - 15||1 - 3||650|
|ComRes Online||31 - 36||32 - 36||7 - 9||4 - 6||0 - 1||3 - 5||13 - 16||0 - 1||1340|
|ICM||32 - 39||30 - 37||6 - 10||3 - 6||0 - 2||4 - 8||8 - 13||0 - 2||490|
|Ipsos MORI||31 - 38||29 - 36||6 - 11||4 - 7||0 - 1||5 - 9||8 - 12||0 - 2||560|
|Opinium||32 - 36||31 - 36||7 - 9||3 - 5||0 - 1||4 - 6||12 - 15||1 - 2||1460|
|Panelbase||29 - 35||31 - 37||6 - 9||3 - 5||0 - 2||3 - 6||14 - 19||1 - 2||740|
|Populus||30 - 35||31 - 36||8 - 11||4 - 6||0 - 1||4 - 6||13 - 16||0 - 1||1220|
|Survation||30 - 35||30 - 36||7 - 10||3 - 6||0 - 1||3 - 5||15 - 19||0 - 1||910|
|TNS||29 - 35||30 - 37||6 - 9||2 - 5||0 - 1||3 - 7||13 - 18||1 - 3||610|
For a graphical presentation of the prediction intervals for each pollster, for each party, click on the links below. These also show all polls released in the last 14 days (as horizontal marks) for each pollster in comparison to the predicted range.
For constituency polls, since Lord Ashcroft Polls is currently publishing the vast majority of constituency polls, the estimates below specifically aim to predict where those polls would find each constituency in the question “Thinking specifically about your own Parliamentary constituency at the next General Election and the candidates who are likely to stand for election to Westminster there, which party's candidate do you think you will vote for in your own constituency?”
These estimates allow us to evaluate the performance of our model at interpolating the likely situation in unpolled constituencies every time Lord Ashcroft releases polls of previously unpolled constituencies. While there are many other aspects of our forecasting model, one important component is to correctly determine what is likely to be happening right now in unpolled constituencies, or in constituencies that have not been polled in many months, and so the degree to which we are successful at that is important. Every time new constituency polls are released, we will write a post at the LSE General Election blog evaluating how well these constituency poll estimates predicted the newly published polls.
Here are our estimates of what these hypothetical constituency polls would look like in every GB seat today:
|Sortable table of current vote share for every party in every seat.|
If we aggregate the vote shares to the regional level, we get these totals:
|Region||Con||Lab||LD||SNP||PC||Grn||UKIP||Oth|
|East of England||41||23||13||0||0||4||18||0|
|Yorkshire and The Humber||26||43||9||0||0||4||16||2|
An unambiguous majority in the House of Commons requires 326 seats. Our forecast is that there is currently a 1.00 probability that no single party will reach 326. This tab provides information on the probability that different combinations of two parties will pass a lower 323-seat threshold. We use the lower threshold here, as surviving a confidence vote, given Sinn Fein's policy of abstention, will require fewer than 326 votes. Here, we make the assumption that certain pairs of parties will never work together to support a government. We exclude Labour partnering with the Conservatives, UKIP, DUP or Sinn Fein; we exclude Conservatives partnering with Labour, SNP, Greens, Plaid Cymru, SDLP or Sinn Fein.
We can divide the range of possible seat distributions into a set of coalition formation scenarios. The following is based on an adapted version of a typology presented here. Given the current forecast, there are four major types of coalition scenario that might occur after the 2015 election:
|I||The largest party has at least 323 seats.|
|II||The largest party has more than one possible two-party coalition partner.|
|III||The largest party has only one possible two-party coalition partner.|
|IV||The largest party has no possible two-party coalition partner.|
The table below indicates the probability of observing each of these 'types' after the election, breaking down some of the types into subtypes that depend on exactly which parties are involved:
|Type||Lab largest||Probability||Con largest||Probability|
|I||with majority||0.00||with majority||0.01|
|II||with any one of three or more parties||0.00||with any one of three or more parties||0.00|
|II||with Liberal Democrats or SNP||0.05||with DUP or Liberal Democrats||0.03|
|III||with SNP only||0.34||with Liberal Democrats only||0.13|
|IV||with no possible two party coalition||0.01||with no possible two party coalition||0.43|
These probabilities are predictions of what majorities are mathematically possible. They are not predictions of what government will form. First, legislative arithmetic does not determine governing arrangements: just because a party has the seats to govern alone does not mean it will govern alone, and just because a party could reach a majority with another party does not mean it will be able to reach an agreement with that party. Second, these mathematically possible majorities do not help us distinguish between a formal coalition (such as the one formed between the Conservatives and the Liberal Democrats in 2010) and a confidence and supply agreement (where one party relies on the support of another for budget votes and votes-of-confidence).
The typology does, however, help us to understand the relative strength of Labour or the Conservatives after the next election. For example, if Labour finds itself as the lead party in a type II scenario, then it will be able to form a coalition with more than one potential partner. This puts it in a stronger bargaining position than, for instance, if it were in a type III scenario and could only form a coalition with a single partner. But this would still be a stronger position than if Labour were in a type IV scenario where it required agreement with multiple partners.
Note that the cells in our table are not logically exhaustive of all possibilities, even if we assume that Labour and the Conservatives are guaranteed to be the two largest parties in terms of seats. For example, in 2010, had Labour won somewhat more seats than they did, either a Labour-Liberal Democrat or a Conservative-Liberal Democrat coalition would have commanded a majority of seats. This is an additional type that we do not list in the table above. Given likely Liberal Democrat losses and the fact that the SNP has ruled out supporting a Conservative government, this situation where a third party can choose between a two-party arrangement with either the Conservatives or Labour is very unlikely to occur in 2015.
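The four scenario types described above can be sketched as a small classifier over one seat distribution. This is an illustrative reading of the typology, not the code behind the table: the party labels, the treatment of "Other" as a non-partner, and the example seat totals (apart from the 281 Conservative seats quoted in the forecast) are assumptions.

```python
THRESHOLD = 323  # lower threshold used for two-party combinations

# Pairings the text rules out for each of the two largest parties.
EXCLUDED_PARTNERS = {
    "Lab": {"Con", "UKIP", "DUP", "SF"},
    "Con": {"Lab", "SNP", "Grn", "PC", "SDLP", "SF"},
}


def coalition_type(seats):
    """Return (type, largest party, feasible two-party partners) for one seat distribution."""
    # "Other" is a residual group, not a party, so it cannot be a partner (assumption).
    parties = {party: n for party, n in seats.items() if party != "Other"}
    largest = max(parties, key=parties.get)  # ties resolve to the first party listed
    if parties[largest] >= THRESHOLD:
        return "I", largest, []
    excluded = EXCLUDED_PARTNERS.get(largest, set())
    partners = [
        party
        for party, n in parties.items()
        if party != largest and party not in excluded and parties[largest] + n >= THRESHOLD
    ]
    if len(partners) > 1:
        return "II", largest, partners
    if len(partners) == 1:
        return "III", largest, partners
    return "IV", largest, partners


# Hypothetical seat distribution summing to 650.
example = {"Con": 281, "Lab": 267, "SNP": 51, "LD": 26, "DUP": 8,
           "SDLP": 3, "PC": 4, "Grn": 1, "UKIP": 1, "Other": 8}
print(coalition_type(example))
```

Applying a function like this to every simulated seat distribution and tallying the results is one way the probabilities in the table could be produced.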
We combine several sources of information. We have past election results, and we have information on a variety of constituency characteristics. For Northern Ireland (NI) we have very limited polling data, but for Great Britain (GB) we also have frequent current and historical polling, which gives us information on what proportion of people intend to vote for each party. For a small number of these polls we have the raw data, and can break down the results by constituency. Here is what we can learn from each of these kinds of information...
The outcomes of recent elections are useful in two ways. First, they set some rough bounds on what outcomes are likely in this election (e.g. the Conservatives are very unlikely to get less than 25% or more than 45% of the national vote). Second, and more importantly, past election results can help us calibrate how predictive our other sources of information are likely to be in 2015. If we know how useful a particular kind of information was in predicting past elections, that can help us determine how to use it in the present.
Many pollsters poll GB voting intention continuously, whether there is an election soon or not. You can see lists of polls on UK Polling Report or Wikipedia. If all polling companies produced a poll every day with the same methods and the same sample size, we could take a simple average of these polls, and use this as our best guess of the true support for each party. Unfortunately, polls are carried out using different methods by different companies at varying intervals and with smaller or larger samples. We therefore pool the polls to get an estimate of relative party support across Great Britain for every day during the year before the election, using an assumption that relative party support is changing slowly to smooth out the gaps between the polls.
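One simple way to express the "support changes slowly" assumption is a random-walk filter that carries the previous day's estimate forward and nudges it towards each new poll in proportion to that poll's precision. The sketch below does this for a single party. It is a stand-in for illustration, not the pooling model itself: it ignores house effects, treats parties independently, only filters forwards in time, and the `drift_sd` value, starting uncertainty and polls are all hypothetical.

```python
def pool_polls(polls, n_days, start_share, drift_sd=0.2):
    """Daily pooled estimate (percentage points) from (day, share, sample_size) polls."""
    by_day = {}
    for day, share, n in polls:
        by_day.setdefault(day, []).append((share, n))

    mean, var = start_share, 25.0  # vague starting uncertainty (assumption)
    estimates = []
    for day in range(n_days):
        # Support is assumed to drift only slightly from one day to the next.
        var += drift_sd ** 2
        for share, n in by_day.get(day, []):
            p = share / 100.0
            obs_var = 100.0 ** 2 * p * (1.0 - p) / n  # sampling variance of this poll
            gain = var / (var + obs_var)  # larger samples pull the estimate harder
            mean += gain * (share - mean)
            var *= 1.0 - gain
        estimates.append(mean)
    return estimates


# Hypothetical polls: (day index, share in %, respondents giving a party preference).
polls = [(0, 34.0, 1000), (3, 32.0, 500), (5, 35.0, 1500)]
daily = pool_polls(polls, n_days=7, start_share=33.0)
print([round(x, 1) for x in daily])
```

On days without polls the estimate is simply carried forward, which is how the gaps between polls get smoothed over.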
We use a variant of an idea developed by Stephen Fisher following Erikson and Wlezien for determining how to use current pooled polling to predict the election day vote share for each party nationwide. The basic principle is that polling has some systematic biases, in particular a tendency to overstate changes from the previous election. We used historical polling data starting with the 1979 election compiled by UK Polling Report to calibrate how much weight we should put on past electoral performance relative to current polling performance, and how those weights should change as we approach the election.
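As a rough illustration of this weighting, the forecast share can be written as the previous result plus a fraction of the change the polls currently show. The weights below are hypothetical placeholders, not the calibrated values; the 23% past share is the 2010 Liberal Democrat result mentioned elsewhere on this page, and the 8% polling figure is a hypothetical value in line with the ranges shown for that party.

```python
def forecast_share(past_share, pooled_poll_share, weight_on_polls):
    """Blend the last election result with current pooled polling.

    weight_on_polls = 0 reproduces the last result; 1 reproduces the polls.
    """
    return past_share + weight_on_polls * (pooled_poll_share - past_share)


# Hypothetical weights: lower far from the election, higher close to it.
far_out = forecast_share(past_share=23.0, pooled_poll_share=8.0, weight_on_polls=0.6)
close_in = forecast_share(past_share=23.0, pooled_poll_share=8.0, weight_on_polls=0.8)
print(far_out, close_in)
```

Because the weight on polls is below one, a party polling below its last result is forecast to finish somewhat above its polls, and a party polling above its last result somewhat below them.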
Aggregate polling is enough to forecast parties' national vote shares, but what actually matters is how many seats each party secures, which in turn depends on how well each party performs in each constituency. Both history and polling can help predict how vote shares in each constituency will deviate from vote shares nationally. A traditional seat forecast would use Uniform National Swing (UNS), which relies entirely on each constituency's historical deviation from national vote shares. We also use this historical information to inform our estimates of vote shares in each constituency, but we combine these estimates with the limited information we get from our raw polling data. This raw polling data does not give us very many responses per constituency, and so we cannot simply tally up how many respondents support each party in each constituency. Instead, we have to use these observations to infer broader patterns of support across constituencies.
The way we infer these broader patterns is by modelling how individual respondents' voting intention varies as a function of the characteristics of their constituency. By constituency characteristics, we mean things like past vote and incumbent party, as well as population density, region, average age, distribution of religious affiliation and many other characteristics made available by the 2011 UK Census. We use a multilevel model that we designed specifically for this project in order to estimate how these characteristics correlate with polling responses. The more strongly these characteristics are related to individuals' vote choice, the more confident we can be in estimating constituency vote shares, even for constituencies where we only have a few observations in the raw data. For the purposes of prediction, we don't need these characteristics to cause people to vote in any particular way: correlation is enough.
Because the forecast has a lot of inertia -- as it should. Polls have sampling error. Pollsters also have systematic biases, because surveying a random sample of the people who will choose to turn out to vote at some point in the future is very difficult. Different pollsters make different choices about how to best approximate this, which is why our model includes house effects. So the estimate of where current polling puts the parties will only change noticeably if changes are evident across multiple polls from multiple pollsters. In addition to requiring many polls to show a shift in party support, the forecast puts weight on both past vote share as well as current polling, with the weight on the latter increasing as the election approaches. We estimated the optimal weighting of past vote share and current polling based on polling leading up to elections from 1979 forward. This means that even when all the polls show a change, if it is far from the election, the change in our forecast vote share will be substantially smaller than the change in pooled polls.
We do know some things that are difficult to incorporate into our model. At the level of the overall election outcome, there are a number of factors that could cause us to miss badly. The two that we worry about most are how Liberal Democrat losses are distributed, and how UKIP gains are distributed. While we have some evidence from polling about how support for these parties varies across the UK right now, generic party preference polls do not always capture the features of local competition that are likely to be vital as the Liberal Democrats fight to hold onto their seats and UKIP tries to win some.
For the Liberal Democrats, past performance is useful. In contrast, UKIP's performance in the last general election may not be a good predictor given the recent increases in the party's popularity. Where possible, we've used information from the 2014 European Parliament elections, but because EP election results are only available at the local authority level rather than at the level of the Westminster constituency, this may be a poor guide. If UKIP had competed in elections from 1979 onwards, and had performed near their current polling level, we would have good historical evidence on the link between current polling and eventual electoral performance. As we get closer to the election, this should become less of a problem. However, if our forecast is badly off, it will probably be related to either the Liberal Democrats' or UKIP's performance in the election, and the knock-on consequences for Labour and the Conservatives.
More generally, it is difficult to predict the performance of parties that compete only in a part of the UK, particularly the Scottish National Party and Plaid Cymru. The number of seats that these parties will win can vary a lot based on very small changes in their national vote share, because their votes are concentrated in a few constituencies. This is why our forecasts for seat totals for these two parties have a wide range: national-level polls do not provide very precise information about how many seats they will win. In that sense, we may not be wrong about these parties, but we cannot provide very precise predictions.
At the level of individual seats, there are many factors that may matter but that we are not measuring. We don't know whether we'll see a particularly strong performance for the Bus Pass Elvis Party, or unduly heavy rain in that region on election day, or whether the local MP is embroiled in a scandal. If there is something systematic that might affect the results across a range of constituencies, and which can be measured, let us know.
Our forecast is based on a Bayesian model that incorporates the various sources of information described above. The model reflects what we believe are reasonable assumptions about how to combine these sources of information, but we could be wrong. The intervals we report are the central 90% of the posterior distribution for seats or for vote share. This means that our model, given the data we have so far, indicates that there is a 5% chance that the quantity in question will fall below the lower bound (Lo) and a 5% chance that the quantity in question will fall above the upper bound (Hi). Thus, there is a 90% chance that the true figure will fall between the numbers marked Lo and Hi.
These intervals, as well as the mean posterior estimates that we report as our primary prediction, are derived from an MCMC estimate of the entire distribution of possible outcomes for each of the parties.
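In practice, summaries like these are read off the simulated draws directly. The sketch below shows the calculation on made-up draws from a normal distribution, which stand in for real MCMC output; the seed, the spread and the sample size are arbitrary assumptions.

```python
import random
import statistics


def summarise_draws(draws):
    """Posterior mean plus the central 90% interval (Lo, Hi) from simulation draws."""
    # With n=20, quantiles() returns 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(draws, n=20)
    return {"mean": statistics.mean(draws), "lo": cuts[0], "hi": cuts[-1]}


# Hypothetical stand-in for MCMC draws of one party's seat total.
rng = random.Random(2015)
draws = [rng.gauss(281, 15) for _ in range(4000)]
summary = summarise_draws(draws)
print({name: round(value) for name, value in summary.items()})
```

Roughly 5% of the draws fall below Lo and 5% above Hi, which is exactly what the reported interval is meant to convey.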
Most of the uncertainty in our predictions comes from the fact that, even immediately before election day, general election polls in the UK have not been very accurate. Whereas the average of all the polls taken in US presidential elections has been highly accurate in recent elections, in the UK the election day polls have recently missed party vote shares by as much as 4 percentage points. There is a good reason for this: polls tend to ask a generic question about support for different parties, rather than asking about the specific candidates in a survey respondent's constituency. This means they tend to miss the local strategic concerns that arise in multiparty campaigns.
One consequence of this is that even on election day, we will have substantial uncertainty in our estimates. The forecasts will get more precise, but not until very close to election day. The relative weight on current polling as opposed to past election results will start rising around January 2015; however, the forecasting uncertainty will not decline until the final fortnight before the election. When we run the same model on data available immediately prior to the 2010 election, the 90% intervals in seats for Labour and the Conservatives are both about 100 seats wide, so even on election day we are not going to be able to make a very precise prediction.
Our procedure would have produced a better prediction than any other forecast available at the time -- but we developed our procedure by testing it on the 2010 results, and so this isn't a fair comparison. The biggest difficulty in predicting the 2010 election outcome is that the Liberal Democrats averaged 27% across all pollsters on the day before the election, but only received 23% of the vote. That 4-point error is not especially large by historical standards (in 1992 both Labour and Conservative vote shares were off from the final polling average by that much), but it did make it difficult for forecasts to avoid overpredicting Liberal Democrat seats substantially: the best forecast published before the election predicted 81 Lib Dem seats, versus the 57 they actually secured.
Our procedure, applied to the 2010 results, still overpredicts Liberal Democrat seats, but not as much as others did at the time. Our model yields a better prediction for Liberal Democrat seats because it takes into account the historical tendency for the Conservatives, Labour and Liberal Democrats to underperform the polls when the polls indicate gains on the last election and to overperform the polls when the polls indicate losses from the last election. For 2010, this means that the forecast for the Liberal Democrat vote comes in lower than what the final polls indicated; for 2015 it means that the forecast for the Liberal Democrat vote comes in above their current polling.
Our Northern Ireland forecast is entirely separate from our forecast for the rest of the UK, and is based on much more limited data. Northern Ireland is allocated 18 of the 650 seats in the House of Commons. Northern Ireland has a different party system than the rest of the UK, and none of the major UK parties have recently won seats there. Political polls of the "UK" are nearly always polls of Great Britain: they exclude Northern Ireland from the sampling frame. There is very little dedicated Northern Irish polling by comparison to the rest of the UK.
Parties from Northern Ireland have not recently entered into government coalitions in the UK; however, in the case of a hung parliament, the major UK parties may look to Northern Irish parties to make up the gap to a parliamentary majority. In our table of seats, we explicitly list the two Northern Irish parties generally viewed as most likely to support a UK government. The Democratic Unionist Party (DUP, 8 seats in 2010) might support a Conservative government. The Social Democratic and Labour Party (SDLP, 3 seats in 2010) might support a Labour government. MPs from other Northern Irish parties with fewer seats, as well as Sinn Fein MPs following a policy of abstentionism (5 seats in 2010), are currently grouped under "Other".
We use 326 as the standard for a majority, even though the non-voting Speaker plus the abstaining Sinn Fein MPs reduce the number of votes required to survive a confidence vote to 323 (given the current number of Sinn Fein MPs).
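The arithmetic behind the two thresholds is short enough to write out. The function name is hypothetical; the inputs are the figures given above (650 seats, one non-voting Speaker, five abstaining Sinn Fein MPs).

```python
def votes_needed(total_seats=650, non_voting=0, abstaining=0):
    """Smallest number of votes that beats everyone else who actually votes."""
    voting_members = total_seats - non_voting - abstaining
    return voting_members // 2 + 1


majority_of_all_seats = votes_needed()                        # all 650 seats counted
confidence_threshold = votes_needed(non_voting=1, abstaining=5)  # Speaker and Sinn Fein removed
print(majority_of_all_seats, confidence_threshold)
```

If the number of Sinn Fein MPs changes, the lower threshold moves with it, which is why the text qualifies the 323 figure.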
The house effects describe systematic differences in support for the various parties that do not reflect sampling variability, but instead appear to reflect the different decisions that pollsters make about how to ask about support for smaller parties, about weighting, and about modelling voter turnout. Here are the current estimates of the house effects for each polling company, for each party.
The classic approach for translating vote shares into seats in the UK is to apply Uniform National Swing (UNS). In this approach, for every constituency we take the results from the last election for each party, and shift those results by the national shift in vote share from the last election. Thus, if we somehow could figure out the national vote shares for the parties for the upcoming election, we could make a prediction for every seat.
We could therefore produce a different forecast by taking our forecast vote shares and applying a uniform national swing. Some web pages allow you to make predictions in this way.
We don't do this, because we believe we can improve on UNS by using raw polling information. If UNS is "right", then raw polling information should lead us to make similar predictions.
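For comparison, the Uniform National Swing calculation described above can be sketched in a few lines. All vote shares and seat names below are hypothetical, and flooring projected shares at zero is an added assumption rather than part of the classic method.

```python
def uniform_national_swing(last_by_seat, last_national, forecast_national):
    """Shift each seat's previous result by the national change for each party."""
    swing = {party: forecast_national[party] - last_national[party]
             for party in forecast_national}
    projected = {}
    for seat, shares in last_by_seat.items():
        projected[seat] = {
            # Floor at zero so a large negative swing cannot give a negative share.
            party: max(0.0, share + swing.get(party, 0.0))
            for party, share in shares.items()
        }
    return projected


def predicted_winners(projected):
    """Party with the highest projected share in each seat."""
    return {seat: max(shares, key=shares.get) for seat, shares in projected.items()}


# Hypothetical national shares (%) and two hypothetical constituencies.
last_national = {"Con": 37.0, "Lab": 30.0, "LD": 24.0}
forecast_national = {"Con": 34.0, "Lab": 33.0, "LD": 10.0}
last_by_seat = {
    "Seat A": {"Con": 40.0, "Lab": 38.0, "LD": 20.0},
    "Seat B": {"Con": 45.0, "Lab": 25.0, "LD": 28.0},
}
projected = uniform_national_swing(last_by_seat, last_national, forecast_national)
print(predicted_winners(projected))
```

Every seat receives the same shift, so the only thing that differs between seats is where each party started from at the last election.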
US election forecasts have received a lot of attention, in part because they have performed so well in recent presidential elections. In 2012, the forecasts of Nate Silver, Drew Linzer, Sam Wang, and Simon Jackman all performed extremely well, correctly predicting either 50 or 51 of the 51 states/districts. All of these models depend fundamentally on the availability of state-level polling data in the US, combining that data with other sources of data and assumptions. The reason all these models worked well and yielded similar predictions is that publicly reported polls of individual US states were plentiful and (on average) accurate, and the electoral college system makes US states the units where presidential elections are decided. Unfortunately, constituency-level polls in the UK are rare, and even when we get raw, individual-level data from national polls, the sample sizes for constituencies are tiny. This, plus the presence of more than two parties, makes forecasting UK elections fundamentally more difficult than US presidential elections.
Imagine there are 3 constituencies that we estimate each have a 2/3 chance of going to Labour, with the remaining 1/3 for the Conservatives. If we want to make our best guess for each constituency individually, we would predict Labour in all three constituencies. However, if we wanted to make our best guess as to the total number of Labour seats, we would predict 2 total Labour seats rather than 3. The discrepancy between our individual seat predictions and our aggregate seat predictions arises from this kind of difference, across many constituencies, with varying and non-independent probabilities, across many parties.
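The three-constituency example can be checked by enumerating every possible outcome. The sketch assumes the three seats are independent, which the real forecast does not; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product


def seat_count_distribution(win_probs):
    """Probability of each total number of wins, assuming independent seats."""
    distribution = {}
    for outcome in product([True, False], repeat=len(win_probs)):
        prob = Fraction(1)
        for won, p in zip(outcome, win_probs):
            prob *= p if won else 1 - p
        total = sum(outcome)
        distribution[total] = distribution.get(total, Fraction(0)) + prob
    return distribution


labour_win_probs = [Fraction(2, 3)] * 3

# Seat-by-seat best guess: call each seat for whoever is more likely to win it.
seats_called_for_labour = sum(1 for p in labour_win_probs if p > Fraction(1, 2))
# Aggregate best guess: the expected number of Labour seats.
expected_labour_seats = sum(labour_win_probs)
distribution = seat_count_distribution(labour_win_probs)

print(seats_called_for_labour, expected_labour_seats)
print({k: str(v) for k, v in sorted(distribution.items())})
```

Under independence, winning exactly two of the three seats is also the single most likely total, even though each seat on its own is called for Labour.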
We use data starting in 1979 for two reasons. In 1974 there were UK general elections in both February and October due to a hung parliament after the February election and the inability of any set of parties to form a majority coalition. Having two elections in 1974 makes studying the trajectories of the polls in the run-up to the October election difficult. Second, the further we go back, the greater risk we have that polling performance has changed fundamentally, and so it makes sense to stop at some point.
Not as confident as the model says we are, for many of the reasons noted above. UKIP performance is uncertain in a way that is very difficult to model. Because UKIP has such a limited record in parliamentary elections, it is difficult to predict where they will perform well. Moreover, we just have no idea from recent UK history how well a party like UKIP can be expected to do in the general election compared to, for example, the EP election in 2014. The fact that our estimates put weight on both lagged vote from 2010 and current polling means that the model is skeptical about UKIP's poll numbers, and will remain so until shortly before the election. If UKIP support holds at its current levels through election day and polling begins to indicate that UKIP support is concentrated in certain constituencies rather than inefficiently spread across most constituencies, the forecast may begin to indicate seat gains.
This scale comes from "Quantitative meanings of verbal probability expressions" by Reagan, Mosteller and Youtz.
The core of our system for estimation and reporting of our forecasts is the R programming language. Our pooling-the-polls model and our multilevel model for constituency deviations are implemented in JAGS, called directly from R scripts. Our reports are generated using ggplot2 and pandoc. The pipeline is automated: each day we drop in new data, and then a master script re-estimates the model, re-generates the report you are currently reading, and uploads it to this web site.
Our primary interest is in providing a high quality forecast so we do not embarrass ourselves. We are not gambling on the basis of these predictions. None of us are working for a political party or candidate.
We have revised how we use older constituency polls in constituencies with multiple constituency polls. We now exclude constituency polls that are more than four months old, if there are more recent polls in that constituency. See this article on FiveThirtyEight.com for a discussion of some of the relevant issues.
We have incorporated information about which parties are standing candidates in which constituencies. Vote shares for parties which are not standing a candidate in a given constituency are set to zero.
We have added 90% prediction intervals for all the major GB pollsters to the site. These are under the "Interpreting New Polls" tab, which used to be called the "Constituency Polls" tab, and before that the "Nowcast" tab. That tab now has information predicting both the national and constituency-level polls.
The corrections to the Ashcroft constituency polls do not change our results at all, as we were using the polling figures reported before allocating "don't knows".
We have added in the East Belfast constituency poll for Northern Ireland, bringing the total polling for Northern Ireland up to 1 NI poll and 1 constituency poll. We still forecast that the DUP is likely to end up with the same 8 seats they have now: they have enough constituencies where they have some risk of losing a seat to balance out the likely gain in East Belfast.
We added a plot with the history of our seat forecasts since the end of August.
We updated the text in the Nowcast tab to clarify what those estimates correspond to. It was slightly misleading to say they are estimates of what would happen in an election if held today. It is more accurate to say they are estimates of what a Lord Ashcroft constituency poll would say in the constituency if it were fielded today.
Today we added a Coalitions tab to our forecast. This provides the probabilities that Labour and the Conservatives will be able to form two-party majorities with politically plausible partners among the other parties. This does not constitute a forecast of which coalitions will form, but rather the situation that the largest parties will find themselves in regarding which coalitions are mathematically possible.
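One way to compute such a probability from simulated seat outcomes is sketched below in Python. It assumes the forecast is available as a list of simulated seat totals per party, and it uses the 326-seat majority threshold. The function name, the data layout and the numbers are illustrative, not taken from the actual pipeline.

```python
MAJORITY = 326  # seats needed for an overall majority in the 650-seat Commons

def coalition_probability(draws, parties):
    """Share of simulated outcomes in which `parties` jointly reach a majority.

    `draws` is a list of dicts mapping party name -> seats in one simulation.
    """
    hits = sum(
        1 for d in draws
        if sum(d.get(p, 0) for p in parties) >= MAJORITY
    )
    return hits / len(draws)

# Three made-up simulation draws, for illustration only.
draws = [
    {"Con": 290, "LD": 30},   # 320 combined: short of a majority
    {"Con": 280, "LD": 50},   # 330 combined: majority
    {"Con": 300, "LD": 26},   # 326 combined: majority
]
p_con_ld = coalition_probability(draws, ["Con", "LD"])
```

The result describes only whether the arithmetic works for a pairing, which matches the caveat above that this is not a forecast of which coalition will form.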
In the last few days, in order to take into account the fact that Ashcroft and YouGov have added UKIP to the first stage of their question prompt, we have introduced new house effects for those two pollsters, which are referred to as YouGov 2 and Ashcroft 2 in the house effects plot. These house effects will be relatively uncertain until more polls come in from those two pollsters using their new methods.
Today we added Northern Ireland to our forecast. For the moment, we are only showing the seat forecasts for the DUP and SDLP in the seat table; all other parties are grouped into "Other", as the list of parties becomes too long otherwise. Sinn Fein currently has more seats than the SDLP; however, their policy of abstentionism means that they do not enter into coalition calculations, which is why we have grouped them into "Other". We plan to provide further details in future updates.
Our Northern Ireland model is relatively simple compared to the GB model, because there is very little polling data to use. Since May 2014, there has been only a single Belfast Telegraph poll of Northern Ireland vote intention. As a result, the model we have primarily reflects historical evidence. We use historical patterns in how much vote share for the Northern Irish parties changes from election to election across constituencies. We also use demographic data from the 2011 Census to model where particular parties are likely to be competitive in 2015, and where they are not.
For further discussion, see our FAQ item on Northern Ireland.
Today, we added a new feature to the website and made a change to the model.
The new feature is the "Nowcast" tab, which includes information on what we think the current polls imply about what would happen in an election today, if those polls are unbiased. We still think there are strong historical reasons not to take this as the best guess of what will happen on 7 May, but we have received many questions about this and so we thought we should include it. There is a table showing the seats and votes for each party across the UK, and a sortable table with the vote share for each seat.
The change to the model concerns the SNP. The lack of any constituency polls in Scotland has made it difficult for us to treat the SNP the same way we treat the other parties. In particular, part of our model uses Ashcroft's constituency polls to calibrate the difference between the generic "how would you vote in an election tomorrow" question and the specific "thinking about your constituency... how will you vote next May" question. We have been unable to do this for the SNP, because of the lack of Ashcroft polls in Scotland. Unfortunately, we think this was being a little unfair to the SNP's chances, because the effect of the more specific question is to make parties stronger where they are strong and weaker where they are weak, as survey respondents think more about the specific strategic situation in their constituency. We have arrived at a better solution, which is to take the typical recalibration of other parties as our guess for the SNP. This adjustment has important consequences for our seat forecast. Because the SNP primarily gains seats that Labour might otherwise have won, we now have a slight advantage for the Conservatives in the forecast, whereas previously we had a slight advantage for Labour. It is not a large change: we are still on course for a very close election. We hope that Ashcroft polls some of these constituencies soon, so we have a bit more data and a bit less guesswork to go on for Scotland.
As part of a systematic check of our codebase that we are doing as part of writing up documentation on our model, we found an unfortunate programming error today. This was causing us to mis-apply our turnout estimates for individual constituencies when we reconcile the UK vote with individual constituency vote shares. Some errors have trivial consequences, but this was not one of those. The error was reducing our seat estimates for Labour by about 10 seats, because Labour tends to perform better in constituencies with less turnout, and we were not modelling this as we intended.
Because we use the same code for the 2010 retrocast, fixing this also changed our retrospective test of how the model would have performed in 2010, entirely for the better. Before, we 'predicted' the Conservatives on 319 seats, Labour on 238, and the Liberal Democrats on 66 seats. This gives a total error of 44 seats. We now 'predict' the Conservatives on 297 seats, Labour on 259, Lib Dems on 66. This gives a total error of 19 seats. At the level of individual seats, fixing this error helped as well: our retrocast now 'predicts' 9 more individual seats correctly than does applying uniform national swing using the true national vote figures from the 2010 election (information that was not available before the 2010 election, but which yields the best case prediction for UNS).
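To see why turnout matters when reconciling constituency vote shares with the UK vote, consider the following Python sketch. It is only an illustration of the general point, with made-up numbers and a hypothetical function name. It does not reproduce the actual error or its fix.

```python
def national_share(seats, weight_by_turnout=True):
    """Aggregate one party's constituency vote shares into a national share.

    `seats` is a list of (vote_share, votes_cast) pairs.
    """
    if weight_by_turnout:
        total_votes = sum(v for _, v in seats)
        return sum(s * v for s, v in seats) / total_votes
    # Ignoring turnout treats every constituency as equally large.
    return sum(s for s, _ in seats) / len(seats)

# Made-up example: the party is stronger in the lower-turnout seat.
seats = [(0.50, 30000), (0.30, 50000)]
weighted = national_share(seats)                             # 0.375
unweighted = national_share(seats, weight_by_turnout=False)  # 0.40
```

A party that does better where fewer votes are cast has a turnout-weighted national share below the simple average of its constituency shares. If the reconciliation ignores this, the constituency shares that appear consistent with a given national vote are too low for that party, which is the direction of the Labour effect described above.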
Today we made a change to the way we weight the microdata we have been receiving from YouGov. Since their data is weighted for the UK, not for individual constituencies, we have to use our own weighting scheme to make sure that we have representative constituency-level samples. Because YouGov surveys the same individuals multiple times over many months, we had repeated observations of the same individuals that were being erroneously treated as independent observations. We have now fixed this problem. This leads to very little change in aggregate seat counts for the Conservatives and Labour, but it does have consequences for the other parties and for many individual seats. We think that most of our more puzzling individual seat predictions are now more reasonable. Thanks to everyone who pointed out weird constituency-level predictions; they were useful in helping us to identify that this overweighting of repeated observations really had serious consequences for the forecasts.
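One simple way to stop repeat respondents from being over-counted is to give each observation a weight of one over the number of times that person appears. The Python sketch below shows that idea only. The text above does not say how the problem was actually fixed, so this scheme and the function name are assumptions.

```python
from collections import Counter

def panel_weights(respondent_ids):
    """Weight each observation by 1/n, where n is how often that
    respondent appears, so every person counts once in total."""
    counts = Counter(respondent_ids)
    return [1.0 / counts[r] for r in respondent_ids]

# Respondent "a" was surveyed three times, "b" once (illustrative IDs).
weights = panel_weights(["a", "a", "a", "b"])
```

With these weights the four observations carry the total weight of two people, rather than four independent respondents.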
Reflecting the fact that we are now more confident that our polling data is reliably weighted, we are now also putting full weight on the polls for estimating constituency-level variation around the national-level party support. For our 2010 retrocast, the individual-level data is not properly weighted to constituency (and cannot easily be), so it made sense to put some weight on history as well. But given the substantial swings in LD, SNP and UKIP support, historical vote shares are less likely to be predictive going forward. The past vote still informs the model we use to interpolate from the limited polling data, as it always has; we just no longer take that final polling estimate and weight it with last election's results, before reconciling it with the predicted national vote. As a matter of our modelling approach, we are going further from (the safety of) UNS with this change.
We have not made any changes to the model since the beginning of September, and we were certainly hoping those changes would be our last before the election. However, we value being right more than being consistent. Watching the forecasts over the last month and a half has indicated that we could generate a better forecast with a few changes, so we are making a few changes.
We have also been generating ideas for a more fundamental overhaul of the model design for 2020. But maybe we will get lucky: if our current forecast is spot on, and the parties cannot figure out how to form a viable coalition, we might get another election to forecast much sooner.
The new Clacton constituency poll from Lord Ashcroft asks a question about vote intention for the general election. Since that is the target of our forecast, we are now using that question rather than Ashcroft or Survation's questions about the by-election.
Our seat predictions are now posted under the "Seat Predictions" tab. In the preparation for posting these, we have made several minor bug fixes as well.
We have decided to include the Clacton by-election poll conducted by Survation in our analysis for the moment, but we have downweighted it substantially to reflect the fact that it is about the by-election rather than the general election. The consequence of using it is that UKIP is now favoured to win Clacton. When more polls of the Clacton by-election become available, ideally once all candidates are declared, we will remove this poll.
We are planning to release our full seat predictions on Monday, 1 September. In doing our final sanity checks on these, we found several errors/inconsistencies in the data source we are using for past election results. For whatever reason, about 25 constituencies were labeled as being in the wrong region, with some consequences for our model. We also discovered that the turnout variable we were using was not the right one. All of this is now fixed.
Today's forecast shows a swing back towards a very narrow lead for the Conservatives. Some of this is due to the aforementioned fixes, and some of it is due to recent YouGov and Populus polls suggesting the current Labour polling lead is somewhat smaller than previously indicated.
We have made what we hope will be our final two tweaks to the model. One change was to improve the pre-processing of polling micro-data to incorporate information about respondents' past voting behavior in our reweighting procedure. The more noticeable change was reducing the weight we place on historical variation in support across constituencies for UKIP, which was suppressing their chances of winning a seat. While we do not like making exceptions for single parties, in this case it seemed that the argument for using history was consistency rather than any real expectation that history would be informative. This change has the effect of increasing the expected number of UKIP seats to 1 seat, and there is a non-trivial chance of UKIP winning multiple seats. At the same time, the most likely outcome is still zero UKIP seats and there is no particular seat that we predict going to UKIP. To understand how all these statements can be true, it is helpful to look at the predictive distribution of UKIP seats.
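The Python sketch below shows how an expected value of one seat, a most likely outcome of zero seats, and no individual seat being favoured can all hold at once. It is a toy construction, not the model's predictive distribution. It assumes 10 target seats and two shared scenarios (a likely poor night and a less likely good night), and every number in it is invented for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent seats."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def seat_distribution(n_seats, scenarios):
    """Mixture of binomials over shared national scenarios.

    `scenarios` is a list of (scenario_probability, per_seat_win_probability).
    """
    return [
        sum(w * binom_pmf(k, n_seats, p) for w, p in scenarios)
        for k in range(n_seats + 1)
    ]

# Invented numbers: 80% chance of a poor night, 20% chance of a good one.
scenarios = [(0.8, 0.02), (0.2, 0.42)]
dist = seat_distribution(10, scenarios)

expected = sum(k * pk for k, pk in enumerate(dist))         # about 1 seat
most_likely = max(range(len(dist)), key=dist.__getitem__)   # 0 seats
per_seat = sum(w * p for w, p in scenarios)                 # 0.10 in every seat
p_multiple = 1 - dist[0] - dist[1]                          # chance of 2+ seats
```

Because the seats rise and fall together, the distribution has a large spike at zero and a long right tail. The mean is pulled up to one seat even though each seat is individually unlikely to be won.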
We are planning to freeze the model on 1 September, and only add new data after that date.
Today's forecast incorporates a new batch of data at the constituency level, leading to some noticeable changes in the map, although not in the overall seat totals. The most visually obvious change is the constituency of Caithness, Sutherland and Easter Ross at the northern tip of Scotland, currently held by the Liberal Democrats, which we now forecast as most likely to be a Labour pickup. The vote in this constituency at the last election was 41.4 Liberal Democrat, 24.6 Labour, 19.2 SNP. The model is anticipating a close three-way result: 49% probability of Labour victory, 26% chance of the Liberal Democrats retaining the seat and a 25% chance of an SNP pickup. The predicted votes for the three parties are: 26.4 Labour, 24.6 Liberal Democrat, 24.2 SNP.
More comprehensive seat-level details coming soon...
Today we added data from 12 constituency polls fielded by Lord Ashcroft. Because we only use alternate days of YouGov releases, so as not to double-count when they provide samples from overlapping sets of days, we had no new national polling to add. This gives a relatively clean illustration of the consequences of adding constituency-level data to our model. In the aggregate, not a lot changed: none of the constituency polls released today were very surprising results. Labour gained Weaver Vale and Dewsbury in the forecast from the Conservatives, both of which were polled constituencies. We already had Labour ahead in the other polled constituencies, so the polling data mostly reinforced what the model already expected.
Labour did not gain in the aggregate seat predictions though. If Labour is doing a little better than we thought in Weaver Vale and Dewsbury, they might be doing a little better everywhere. But more likely, given how much evidence we have from the many national polls, they are just doing a little worse somewhere else in the UK than we thought. In the end, all the constituency results have to add up to the UK results, and this shapes how the model interprets the new constituency polling results.
A small change to the methodology today. We realized yesterday, when we switched from the generic party support question to the "thinking about your own constituency" questions in the Ashcroft constituency surveys, that we could use systematic patterns in the support for the parties under those two different questions to recalibrate other constituency survey data where we only have the generic question. We implemented that calibration yesterday, and the forecasts posted today reflect that change.
The obvious change is that this brings the Liberal Democrats up by another 5 seats from 20 to 25. While we apply recalibration to all parties, by far the biggest discrepancies between the two survey questions in the Ashcroft data are for the Liberal Democrats. In the Ashcroft polls fielded since the beginning of May, the average support for the Liberal Democrats is 62% higher in the "thinking about your own constituency" question than in the generic question, and the relationship is almost perfectly proportional, regardless of the level of Liberal Democrat support in a given constituency.
All the national polls ask the generic question, so there is good reason to believe that the national polls are understating Liberal Democrat support. Looking at our national numbers, the Liberal Democrats are currently at about 9% in our polling average and at 15% in our forecast, which very nearly matches the 62% difference we see in the Ashcroft constituency polls between the generic and the specific question. We expect that difference to close over the course of the campaign, but this is exactly the kind of problem with generic polling that leads us to put weight on past election results as well as current polls.
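The arithmetic behind this comparison can be sketched in Python as follows. The example fits a single proportional factor between the two questions and applies it to a generic-question figure. The paired shares are constructed to be exactly proportional for illustration, and the function name is hypothetical. The actual recalibration is part of a larger model and is not reproduced here.

```python
def proportional_factor(generic, specific):
    """Least-squares slope through the origin for specific ~ k * generic."""
    num = sum(g * s for g, s in zip(generic, specific))
    den = sum(g * g for g in generic)
    return num / den

# Constructed paired shares (percent), not real polling data.
generic = [5.0, 10.0, 20.0]
specific = [8.1, 16.2, 32.4]

k = proportional_factor(generic, specific)   # 1.62, i.e. 62% higher
recalibrated = k * 9.0                       # about 14.6, close to 15
```

Applying a factor of 1.62 to a 9% polling average gives roughly 14.6%, which is why the 15% forecast is described above as very nearly matching the gap between the two questions.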
We are hoping to stop changing our forecasting methodology soon, but we have received a lot of useful questions and feedback since we posted the site, which have provided ideas about how to improve the model. We may have a few more changes to make in the coming week or two.
In today's update, we changed which survey question we use from the Ashcroft constituency polls. We were previously using the "If there was a general election tomorrow, which party would you vote for?" question that is generally asked in national surveys. We have changed over to using the question: "Thinking specifically about your own parliamentary constituency at the next General Election and the candidates who are likely to stand for election to Westminster there, which party's candidate do you think you will vote for in your own constituency?" We think this question better captures local conditions, which is the aim of using these polls in our model. For example, the Ashcroft poll of 5-12 June for the Green-held seat of Brighton Pavilion has Labour 39% to Green 27% on the generic question, while the constituency question has Labour 33% to Green 32%. We see similar improvements in Liberal Democrat and UKIP standing on this question in their stronger constituencies. Compared with using the generic data, the Liberal Democrats and Greens have higher seat forecasts as a result of this change.