From one-off data collection, to continual updates. From constant city boundaries, to sector-specific functional clusters. From thousands of data points, to millions and growing. The UK Tech Innovation Index 2.0 is ready to scale, powered by The Data City.
The UK Tech Innovation Index 2.0 is unconstrained by standardised geographies. In deciding where clusters are, we consider only the geography of, and links between, businesses, universities, institutions, event venues, and people.
Clusters are different sizes and shapes for different categories and they will change over time. Some will look very strange. We show the approximate geographical extent of the clusters on the map. We include names of cities within them so that they are recognisable.
With millions of rows of data, and thousands more arriving every week, we can't classify events by hand. Instead we use machine learning.
The new UK Tech Innovation Index is produced by The Data City with support from the Open Data Institute (ODI). The project is part of the ODI's innovation programme, a three-year, £6m programme to support and build upon the UK’s strengths in data and data analytics, funded by Innovate UK.
The results show the absolute size of each cluster, so bigger places score higher simply because they're bigger. We also have experimental scores divided by population; please get in touch if you'd like them.
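To illustrate what dividing a score by population means, here is a minimal sketch. The function name, the scaling to 100,000 residents, and all the figures are assumptions made for illustration; they are not the experimental scores themselves or the method used to produce them.

```python
def per_capita_scores(scores, populations, per=100_000):
    """Divide each cluster's absolute score by its population.

    `per` rescales the result to a score per `per` residents so the
    numbers stay readable. The scaling factor is an assumption.
    """
    return {name: scores[name] / populations[name] * per for name in scores}


# Hypothetical figures, for illustration only.
example_scores = {"Cluster A": 30.0, "Cluster B": 6.0}
example_populations = {"Cluster A": 3_000_000, "Cluster B": 300_000}

adjusted = per_capita_scores(example_scores, example_populations)
for name, value in adjusted.items():
    print(f"{name}: {value:.2f} per 100,000 residents")
```

In this made-up example Cluster A has the larger absolute score, but Cluster B scores higher once population is taken into account.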
Probably not. We think that our clustering algorithm gives good results about nine times out of ten. It struggles because one of the main factors it uses to assign points to clusters is physical proximity. It doesn't understand what water is, for example, so two places on opposite sides of an estuary can look closer to it than they are in practice. The reliance on physical proximity is reduced when we have data on links between businesses, events, and papers. Away from big cities that data is often poor. An improvement that we're working on is to consider travel time rather than physical proximity.
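The weakness of physical proximity can be seen with a straight-line distance calculation. The sketch below uses the haversine formula, a standard way to compute great-circle distance; we are assuming it as a stand-in here, not describing the distance measure the clustering algorithm actually uses. The coordinates are illustrative and do not refer to real places.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Two illustrative points 0.03 degrees of latitude apart (roughly 3.3 km).
# The function returns the same distance whether the gap between them is
# a street or a stretch of open water.
north_bank = (53.73, -0.45)
south_bank = (53.70, -0.45)
print(f"{haversine_km(*north_bank, *south_bank):.1f} km in a straight line")
```

A travel-time measure would separate these two points if the only crossing were a long drive away, which is why it is a natural replacement for straight-line distance.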
We've considered manually correcting clusterings that look wrong, and decided against it for three main reasons.
The scores add up to 100. Each score is that cluster's percentage share of the UK's total ecosystem.
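As a minimal sketch of scores that sum to 100, each cluster's raw size can be divided by the total across all clusters and multiplied by 100. How raw size is measured is not covered here, so the input figures and the function name are hypothetical.

```python
def ecosystem_shares(raw_sizes):
    """Convert raw cluster sizes into percentage shares that sum to 100."""
    total = sum(raw_sizes.values())
    if total == 0:
        raise ValueError("cannot compute shares when the total size is zero")
    return {name: 100 * size / total for name, size in raw_sizes.items()}


# Hypothetical raw sizes, for illustration only.
raw = {"Cluster A": 600, "Cluster B": 300, "Cluster C": 100}
shares = ecosystem_shares(raw)
for name, share in shares.items():
    print(f"{name}: {share:.1f}")
```

Because every score is a share of the same total, one cluster's score can fall even when that cluster grows, if the rest of the UK grows faster.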
This comes back to how we define and name clusters. Below is an example from our map for the urban North-East of England in two categories, AI & Data and Clean Growth. In AI & Data our algorithm measures sufficient collaboration and proximity between Teesside, Wearside, and Tyneside to put them into a single cluster. We then name the cluster after the three largest places within it.
In Clean Growth the industry on Teesside is sufficiently strong and independent of Tyneside and Wearside that our algorithm places it in its own cluster.
Because of this, we don't have consistent geographies across industrial categories. We can't sensibly give a score for each category for each cluster.
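The naming rule mentioned above, where a cluster takes the names of the three largest places within it, can be sketched as follows. What "largest" is measured by, how the names are joined, and the place names and sizes are all assumptions made for illustration.

```python
def name_cluster(place_sizes, top_n=3):
    """Name a cluster after its `top_n` largest places, largest first."""
    largest = sorted(place_sizes, key=place_sizes.get, reverse=True)[:top_n]
    return ", ".join(largest)


# Hypothetical places and sizes. The same place can fall into different
# clusters in different categories, so cluster names differ by category too.
category_one = {"Place A": 120, "Place B": 340, "Place C": 90, "Place D": 15}
category_two = {"Place A": 200}

print(name_cluster(category_one))
print(name_cluster(category_two))
```

Here the first category groups four places into one cluster named after the three largest, while in the second category Place A stands alone, so there is no shared geography on which to compare scores across the two.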
We mean all the events, businesses and papers in The Data City. It's a tiny sample of the UK as a whole, but growing quickly.