Library of missing datasets: Are you being digitally excluded?

A file cabinet with four drawers, one of them is opened and empty. At the right of the file cabinet, there is the sentence “whose data are we missing?” with an arrow pointing to the empty drawer.
Image by OpenClipart-Vectors from Pixabay, adapted by Patricia Gestoso.

(7 min read)

Data protection and privacy regulations like GDPR, the pervasiveness of social media, and the boom of artificial intelligence have prompted debates among academic, governmental, commercial, and non-profit organisations about our rights to own our data and how that data is used to sell us stuff and surveil us. These discussions often forget to ask whose data, and which data, we are missing.

My research on the effect of covid-19 on the unpaid work of professional women made me painfully aware of the gap between intent and impact when we talk about collecting data. The dataset that constitutes the basis of the report came from 1,300+ responses from mostly White women to a survey. We had relied on snowballing – our network – to get more women to answer the survey. Unsurprisingly, our network looked like us!

This mishap prompted my interest in the harms of missing or incomplete datasets – both in general and in the case of children.

Recently, I found somebody who has done a great job of using art to raise awareness of missing datasets.

The Library of Missing Datasets

Mimi Ọnụọha is a Nigerian-American artist and researcher whose work highlights the social relationships and power dynamics behind data collection.

She has created a Library of Missing Datasets. In her words:

“Missing data sets” are my term for the blank spots that exist in spaces that are otherwise data-saturated. My interest in them stems from the observation that within many spaces where large amounts of data are collected, there are often empty spaces where no data live. Unsurprisingly, this lack of data typically correlates with issues affecting those who are most vulnerable in that context.

Mimi Onuoha

Why should we care? Onuoha believes that “what we ignore reveals more than what we give our attention to. It’s in these things that we find cultural and colloquial hints of what is deemed important. Spots that we’ve left blank reveal our hidden social biases and indifferences.”

She compiles a list of missing or incomplete datasets. Some examples are:

  • People excluded from public housing because of criminal records.
  • Trans people killed or injured in instances of hate crime (note: existing records are notably unreliable or incomplete).
  • Poverty and employment statistics that include people who are behind bars.
  • Muslim mosques/communities surveilled by the FBI/CIA.
  • Mobility for older adults with physical disabilities or cognitive impairments.
  • Undocumented immigrants currently incarcerated and/or underpaid.
  • Firm statistics on how often police arrest women for making false rape reports.

Onuoha has created a version 2.0, in which she focuses on Blackness. She says, “Black folks are both over-collected and under-represented in American datasets, featuring strongly as objects of collection but rarely as subjects with agency over collection, ownership, and power.”

I found the images of the file cabinets, with their drawers open to show the tagged empty folders, very thought-provoking. You can see them for yourself in the initial project and the 2.0 version.

Some of the datasets I’m missing, or for which existing records are incomplete

  • Women who, because of bias, have not been promoted despite meeting all the requirements.
  • Disabled people who have been discriminated against by hiring algorithms.
  • People who have unfairly been denied work permits and residence visas.
  • Children with long covid.
  • LGBTQ+ people who fear coming out because of backlash.
  • People in Venezuela who have endured “express” kidnapping.

Back to you

  • Which datasets are you missing?
  • Which datasets are missing you?

Before I go

For reflection

Diversity is not the magic bullet to fix inequity. For those still doubting it, in this edition of The Flock with Jennifer Crichton newsletter, Gemma Doswell reflects on the relatively broad gender and ethnic diversity of the candidates for the Tory leadership in the UK, and on how we assume that this diversity should automatically translate into advocacy for the groups their visible identities represent.

A boost of energy

Mastercard now links all employee bonuses to ESG goals!

In 2021, the company introduced a compensation model for executives tied to three main Environmental, Social and Corporate Governance priorities: carbon neutrality, financial inclusion, and gender pay parity. This year they have rolled the scheme out to all employees globally.

News from me

Earlier this year, I went to Edinburgh to deliver a workshop at the Scottish AI Summit called Goodbye shiny robots & glowing brains: Why Better Images of AI matter. This was part of my work as Head of Diversity, Equity, and Inclusion at We and AI and my participation in the Better Images of AI project.

The workshop was delivered both in-person and online with Tania Duarte, Co-Founder and CEO of We and AI, and Tristan Ferne, executive producer at BBC Research & Development. You can watch it on the summit’s website.

Do you prefer a podcast? You can listen to Tania and me discussing with Steph Wright why better images of AI matter and the reasons we need trustworthy, ethical, and inclusive AI on this episode of Scotland’s AI Strategy podcast, Turing’s Triple Helix.


As I mentioned in a previous post, I’m writing a book and I need your help!

[ASK] I’d be immensely grateful if you could complete and/or share with your network of women in tech this short survey about your/their experiences at work.

What do I mean by “Women in Tech”? Women working in any function (R&D, HR, services, finance, CXO) in the tech sector (software, hardware…) or in tech-related functions in other sectors (e.g. IT, cybersecurity…).

Whilst the survey is anonymous, you’ll have the option to get involved in the project before submitting the form. Thanks for your support!


Inclusion is a practice, not a certificate!