This afternoon sees the start of the “Rankings and the Visibility of Quality Outcomes in the European Higher Education Area” conference in Dublin Castle, part of the events associated with Ireland’s Presidency of the EU.  A good chunk of today’s proceedings focuses on the adoption and roll-out of the EU’s new university ranking exercise, called U-Multirank, which aims to be live by 2014.

Since the initial global university ranking in 2003, a plethora of ranking systems has been developed, with the big three being the ARWU (Shanghai) ranking, the QS ranking, and the Times Higher Education ranking.  These rankings have become key benchmarks for comparing universities within regions and across the globe, seized upon by some universities for marketing, and by the media and government to either promote or denigrate institutions.  They are undoubtedly being used to shape education policy and the allocation of resources, and yet they are routinely criticised for being highly flawed in their methodology.

Somewhat ironically, a sector devoted to measurement and science has been evaluated to date by weak science. There are several noted problems with existing rankings.

The rankings use surrogate, proxy measures to assess the things they purport to be measuring, and involve no site visits or peer assessment of outputs (but rather judgements of reputation, alongside indicators such as citation rates).  Examples of such proxies include using the number of staff with PhDs as a measure of teaching quality, or the citation rate to judge quality of scholarship.  The relationship in both cases is tangential, not synonymous.

The rankings are highly volatile, especially outside the top 20, with universities sliding up and down the rankings by dozens of places on an annual basis.  If the measures were valid and reliable we would expect them to have some stability – after all, universities are generally stable entities, and the performance and quality of programmes and research do not dramatically alter on a yearly basis.  And on close examination some of the results are just plain nonsense.  For example, several of the universities listed in the top 20 institutions for geography programmes in the QS rankings in 2011 do not have a geography department/programme (e.g. Harvard, MIT, Stanford, Chicago, Yale, Princeton, Columbia; note the link automatically redirects to 2012 results for some reason).  Other rankings barely correspond to much more thorough assessments, such as the evaluation of UK departments in the UK research assessment exercise (very few geographers would rank Oxford University as being the best department in the UK, let alone the world).  Such nonsense casts doubt on all the results.
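One rough way to put a number on this volatility is to take the average absolute change in rank position from one year to the next.  The sketch below is only an illustration of that idea: the universities, the rank values and the function name `mean_abs_rank_change` are all invented for the example and are not drawn from any actual ranking.

```python
from statistics import mean

# Hypothetical rank positions over four consecutive years.
# These figures are made up for illustration, not real ranking data.
ranks_by_year = {
    "University A": [12, 14, 11, 13],
    "University B": [85, 120, 97, 143],
    "University C": [210, 171, 230, 188],
}


def mean_abs_rank_change(ranks):
    """Average absolute year-on-year change in rank position."""
    return mean(abs(later - earlier) for earlier, later in zip(ranks, ranks[1:]))


volatility = {name: mean_abs_rank_change(r) for name, r in ranks_by_year.items()}

# Print institutions from most stable to most volatile.
for name, places in sorted(volatility.items(), key=lambda item: item[1]):
    print(f"{name}: {places:.1f} places per year")
```

If the underlying measures were reliable, we would expect this figure to be small for most institutions, not just those at the very top.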

The measures do not simply measure performance but also reputation as judged by academics.  The latter is highly subjective, being based on opinion (often little informed by experience or on-the-ground knowledge of the relative performance of universities in other country systems).  It is skewed by a range of factors such as the size of the alumni body, resources and heritage (past reputation as opposed to present, or simply name recognition), and is inflected by wider country reputation.  The sample of academics who return scores is also skewed to certain countries.

Because the measures add weight to data such as citation and research income, they favour universities that are technical and scientific in focus, and work against those with large social science or humanities faculties (whose outputs, such as books, are not captured by citation, and which require less research funding to do high-quality research).  They also favour universities that have large endowments and are well resourced.  The citation scores are highly skewed towards English-language institutions.

The rankings take no account of the varying national roles or systems of universities, but look at more global measures.  Universities in these systems are working towards different ends and are in no way failing by not having the same kind of profile as a large, research-orientated university.

None of the rankings standardise by resourcing, so there is no attempt to see who is performing the best with respect to inputs; they simply look at the scale and reputation of outputs and equate these to quality and performance.  This conflation raises some serious questions concerning the ecological fallacy of the studies.
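To see why standardising by inputs matters, consider a minimal sketch in which the same three institutions are ordered first by raw output and then by output per unit of budget.  The names and numbers are hypothetical, and "outputs per million of budget" is just one assumed way of standardising; it is not a method used by any of the rankings discussed here.

```python
# Hypothetical figures for illustration only.
# Each entry: (name, research outputs per year, annual budget in millions).
universities = [
    ("Large U", 9000, 3000),
    ("Mid U", 2400, 400),
    ("Small U", 900, 120),
]

# Ordering by raw scale of output, as the existing rankings effectively do.
by_output = sorted(universities, key=lambda u: u[1], reverse=True)

# Ordering by output relative to input (outputs per million of budget).
by_efficiency = sorted(universities, key=lambda u: u[1] / u[2], reverse=True)

print("By raw output:       ", [name for name, _, _ in by_output])
print("By output per budget:", [name for name, _, _ in by_efficiency])
```

With these invented figures the order reverses completely once resourcing is taken into account, which is exactly the kind of information a scale-based ranking hides.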

These failings favour certain kinds of institutions in certain places, with the top 100 universities in the three main rankings all dominated by US/UK institutions, particularly those which are science and technology orientated.  There is clearly an Anglo-Saxon, English-language bias at work, hence the new EU ranking.  Very few people who work in academia believe that the UK has many more high-quality universities than Germany, or France, or the Netherlands, or Sweden, or Japan, or Australia, etc.  Yet only a handful of universities in these countries appear in the top 100, and hardly any at all in the top 50.

Whether the U-Multirank system will provide a more valid and robust ranking of universities, time will tell.  The full final report on its feasibility suggests a wider vision and methodology and some concerted attempts to address some of the issues associated with the other rankings.  One thing is certain: rankings will not disappear, as flawed as each of them is, because they serve a useful media and political function.  However, they should be viewed with very healthy scepticism, mindful of the criticisms noted above.

Rob Kitchin

For an interesting set of blog posts and links to media stories re. university rankings see these collections at Global Higher Ed and Ninth Level Ireland.