Social Housing Cost per Unit - Predictive Model

Overview

One of the challenges in completing a VFM submission is identifying appropriate comparisons. There are many ways to identify comparator groups, but in 2018 the Regulator of Social Housing published a Technical Regression Report explaining how it built a model for the VFM metrics, from which unique predicted values can be generated.

Since then, it is understood that this analysis has been extended, but the equations have not been published; only the main differential cost drivers have. These cost drivers are the percentages of supported housing, housing for older people and reinvestment, along with major works expenditure and the regional wage index.

Clearly these factors operate together and not in isolation, so the best approach is a multivariate model that includes all of them. A unique ‘predicted’ value can then be generated for each provider from its individual values of these core explanatory factors. This is what we have done, and the predicted value shown is based on our updated model. The model is fitted to the latest published VFM submission and estimates how much each factor affects the overall social housing cost per unit (SHCU).

It is very important to note that this model only generates a predicted value based on these known factors. It explains about 65% of the variation between providers and the other 35% is made up of a mixture of factors that may be historic, external or the result of different staff and management performance. All that the model will tell us is the level of social housing costs based on the five explanatory factors, all other factors being equal. Other potential differential cost drivers such as the total stock size or the percentage of rural stock did not have any significant impact.

Using this predicted value, further analysis can focus on the other factors and set aside the percentages of supported housing, housing for older people and reinvestment, along with major works expenditure and the regional wage index, as these have already been taken into account in generating the predicted value.

The actual model generated, based on the VFM 2023 data published in February 2024, is a log-linear model. This makes it harder to interpret and understand, but it yields better predictions and was more robust in the Ramsey RESET test for model specification. The dependent variable is the log of SHCU, and the parameter values are shown in the table below:

Parameter                        Value
(Intercept)                      8.22
Social Housing (%)               1.36
Housing for Older People (%)     0.83
Regional Wage Index              1.39
All Major Repairs per Unit       0.13
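
Because the dependent variable is the log of SHCU, each parameter acts as a multiplier on the predicted cost rather than as an amount added to it. The short Python sketch below illustrates this general property of log-linear models using the values in the table. The units in which each cost driver is entered are not stated here, so the size of the "change" used for each driver is an assumption for illustration only, and the function name is hypothetical.

```python
import math

# Parameter values copied from the table above.
COEFFICIENTS = {
    "Social Housing (%)": 1.36,
    "Housing for Older People (%)": 0.83,
    "Regional Wage Index": 1.39,
    "All Major Repairs per Unit": 0.13,
}


def cost_multiplier(coefficient, change=1.0):
    """Factor by which predicted SHCU is multiplied when a driver rises by `change`."""
    return math.exp(coefficient * change)


# Assumption: a change of 0.1 in each driver, purely to show the arithmetic.
for name, coefficient in COEFFICIENTS.items():
    factor = cost_multiplier(coefficient, change=0.1)
    print(f"{name}: x{factor:.3f} ({(factor - 1) * 100:.1f}% change in predicted SHCU)")
```

A multiplier above 1 means the predicted cost rises as the driver rises, all other factors being equal.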

If we then choose a value for each of the cost drivers, we multiply each value by the corresponding parameter and add the results together with the intercept to give the log of the dependent variable. This will be between about 8 and 10. We then take the exponential of that number to get the SHCU (e.g. exp(8) = 2,981 and exp(10) = 22,026).
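
The calculation described above can be sketched in a few lines of Python. The parameter values are those in the table; the function name and the example inputs are hypothetical, and the scale on which each driver should be entered is an assumption, so the result only illustrates the arithmetic and is not a real prediction for any provider.

```python
import math

# Parameter values copied from the table above.
INTERCEPT = 8.22
COEFFICIENTS = {
    "Social Housing (%)": 1.36,
    "Housing for Older People (%)": 0.83,
    "Regional Wage Index": 1.39,
    "All Major Repairs per Unit": 0.13,
}


def predict_shcu(drivers):
    """Multiply each driver by its parameter, add the intercept, then exponentiate."""
    log_shcu = INTERCEPT + sum(
        COEFFICIENTS[name] * drivers[name] for name in COEFFICIENTS
    )
    return math.exp(log_shcu)


# Hypothetical inputs; the correct input scale for each driver is an assumption.
example_drivers = {
    "Social Housing (%)": 0.05,
    "Housing for Older People (%)": 0.10,
    "Regional Wage Index": 0.10,
    "All Major Repairs per Unit": 1.0,
}

print(f"Predicted SHCU: {predict_shcu(example_drivers):,.0f}")
```

With every driver set to zero the function returns exp(8.22), which is simply the exponential of the intercept.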

Different values of each of the cost drivers can be entered into our scenario modeller to generate each expected value.

As noted above, this is an estimate that does not explain all of the variability between providers. We have therefore computed prediction intervals, which give a sense of how robust these estimates are, with upper and lower bounds at the 90% and 95% levels. If the model holds, about 90% of providers with the same values of the explanatory factors would be expected to have an actual value inside the 90% range. An actual value outside that range is therefore unusual and suggests that the provider is unusually cost effective (below the range) or expensive (above the range) for reasons the model does not capture. These ranges are quite wide, as this sort of analysis rarely provides a high level of precision. The closer the actual value is to the predicted value, the more typical (“average”) the performance.
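
To show how such ranges behave, the Python sketch below builds an approximate interval on the log scale and converts it back to a cost. It is a simplification and not necessarily how the intervals in the report were computed: it assumes normally distributed errors and ignores the uncertainty in the estimated parameters. The residual standard error, the predicted value and the function names are all hypothetical.

```python
import math
from statistics import NormalDist


def prediction_interval(predicted_shcu, residual_se, level):
    """Approximate (lower, upper) bounds, built on the log scale and exponentiated."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.645 for 90%
    log_prediction = math.log(predicted_shcu)
    lower = math.exp(log_prediction - z * residual_se)
    upper = math.exp(log_prediction + z * residual_se)
    return lower, upper


def classify(actual, lower, upper):
    """Say where an actual value sits relative to the range."""
    if actual < lower:
        return "below range"
    if actual > upper:
        return "above range"
    return "within range"


RESIDUAL_SE = 0.25  # hypothetical value on the log scale; not taken from the model
PREDICTED = 4000    # hypothetical predicted SHCU

low90, high90 = prediction_interval(PREDICTED, RESIDUAL_SE, 0.90)
low95, high95 = prediction_interval(PREDICTED, RESIDUAL_SE, 0.95)
print(f"90% range: {low90:,.0f} to {high90:,.0f}")
print(f"95% range: {low95:,.0f} to {high95:,.0f}")
print(classify(7000, low90, high90))
```

Because the interval is symmetric on the log scale, the upper bound sits further from the predicted value than the lower bound does, and the 95% range is always wider than the 90% range.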

Further Analysis

The report provides the opportunity to dig into these numbers a little more by including more of the VFM entity-level data.

Compare Results

On the Compare Results page a range of comparator providers can be selected. To select a group, choose the other providers from the drop-down list (there is a search facility which works dynamically as you start to type a name). Use CTRL + Click to select multiple providers. At this point you may wish to add a personal bookmark using the Power BI menu so that this selection is available to you next time.

The selected list of providers will be used to populate the charts on this page. The main chart shows the predicted and actual cost for the selected providers / comparators.

Note that the columns can be sorted by clicking on the three dots at the top right to bring up the Power BI menu.

The next set of charts shows the values for each of the explanatory factors for the comparator group.

The dotted line is the median value: half of the group are above this line and half below. In other words, it is the middle value, not the arithmetic average of the values.
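
The difference between the median and the arithmetic average can be seen in a small Python example. The costs below are made-up figures chosen to include one high outlier; they are not taken from the report.

```python
from statistics import mean, median

# Hypothetical costs per unit for five providers, one of them an outlier.
costs = [3200, 3600, 4100, 4500, 9800]

print(median(costs))  # 4100, the middle value
print(mean(costs))    # 5040, pulled upwards by the outlier
```

The single high value moves the average well above the middle of the group, while the median is unaffected.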

To identify a particular provider just click on the appropriate column and it will be highlighted in all the charts.

Also, if you move the mouse pointer over a particular value and pause, a popup chart will show the value for that provider over the last five years.

There is also an option to display these charts as scatterplots, via the button above the charts.

The final chart is the histogram. This shows the distribution of SHCU by counting the number of providers in the selected group of comparators that fall into each cost grouping. The groupings are set at £500 intervals; in other words, the height of each column shows how many providers have costs between, for example, £3,500 and £4,000.

This gives a quick impression of the distribution of costs between providers, and if a particular provider has been selected, it shows which ‘bucket’ they are in.
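
The grouping behind the histogram can be sketched in Python as follows. The costs are made-up figures, the function name is hypothetical, and the treatment of values that fall exactly on a boundary (placed in the higher bucket here) is an assumption that may differ from the chart.

```python
from collections import Counter

BIN_WIDTH = 500  # the £500 interval used by the histogram


def bucket_costs(costs, width=BIN_WIDTH):
    """Count how many costs fall into each [lower, lower + width) bucket."""
    counts = Counter(int(cost // width) * width for cost in costs)
    return {(lower, lower + width): n for lower, n in sorted(counts.items())}


# Hypothetical costs per unit for a selected comparator group.
costs = [3520, 3980, 4000, 4250, 5100]

for (lower, upper), n in bucket_costs(costs).items():
    print(f"£{lower:,} to £{upper:,}: {n}")
```

In this example two providers fall in the £3,500 to £4,000 bucket, two in the £4,000 to £4,500 bucket and one in the £5,000 to £5,500 bucket.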

Underlying Costs

The natural next step in any analysis is to try and understand why there is variation and which factors are driving that variation. This is where the Underlying Costs page comes in.

Each of the contributory costs is shown in a separate column chart.

Again, click on a column to highlight one of the providers. Any outliers in terms of costs can quickly be identified as a possible explanation for a wide variance between predicted and actual costs.

Note, however, that our model includes the impact of expenditure on Major Repairs and Capitalised Major Repairs in generating a predicted value, so it is the other factors that should be reviewed, for example Other Activities.

All VfM Metrics

The final reporting page shows all the VfM metrics, with several years of historic data. The metrics are grouped into categories in the original data, so an appropriate category has to be selected first. This creates a second list of the metrics available in that category. Simply select any metric of interest to display its values in the column charts, compared across the same, previously selected, comparator group. Once again a timeline will appear on mouse-over, and an average is also shown in the bottom right.

On all the pages the ? can be clicked to bring up a help screen over the top of the page, to explain the functionality.