How we’re building a team for better data at Our World in Data

Shifting scientific research towards a more specialized, collaborative approach is not just about increasing productivity. There are also benefits for the quality of the research in terms of its reliability and value for others.

Increased reliability

In the one-man-band approach to organizing research, common in many disciplines, often only one pair of eyes ever looks at many of the computational steps involved in processing the data and producing the final analyses and outputs before they are published.

That’s inherently fragile. And tellingly, it’s very different to what professional programmers do in the software industry, where code review is the norm.

Moreover, it makes it harder for researchers to justify taking the trouble to produce and publish their data in a replicable, well-documented way that’s easy for others to understand and check. And because it’s often hard to scrutinise the data and code underlying research, people rarely do.

Overall, it means mistakes are more likely to occur and are less likely to be spotted, contributing to wider concerns about the replicability of published scientific research.

Increased value

Many of the poor practices that make data hard to scrutinize also make it harder for others to build on.

The value of good data work lies not just in the initial research it informs but in the chain of subsequent innovations and learning that it can enable. Data and code shared in an open, accessible and well-documented way can be used by others over and over to conduct their own analyses and build new tools. Publishing data is not only an end in itself; it is also the input into other people’s work.

The one-man-band approach to research makes it harder to realise these positive externalities, by encouraging data and code that are poorly documented, poorly structured, written in idiosyncratic programming languages, or published without regard to standards and norms that increase reusability.