AI tool helps journalists detect passports hidden in massive offshore data leaks

24.05.2025

In the fast-evolving landscape of investigative journalism, where massive troves of leaked data hold the keys to uncovering hidden corruption and financial secrecy, precision and efficiency have become critical. Recently, the Global Investigative Journalism Network (GIJN) published an article titled “Passports Are Key to Uncovering Offshore Secrecy – We Use Machine Learning to Find Them Efficiently,” highlighting how passports, seemingly mundane travel documents, have become one of the most essential tools in exposing offshore financial wrongdoing.

At the heart of this investigative revolution is the International Consortium of Investigative Journalists (ICIJ), a global network that has led some of the most significant journalistic breakthroughs of the 21st century, including the Panama Papers and Pandora Papers investigations. These monumental efforts exposed the complex financial webs spun by elites, politicians, and public officials worldwide to hide wealth and evade scrutiny. What many might not realize is that passports serve as vital identifiers in these investigations, helping to connect shadowy companies and trusts to real individuals.

Passports provide crucial data points (names, dates of birth, nationalities, and unique passport numbers) that allow journalists to pierce through layers of offshore anonymity. In jurisdictions where corporate ownership can remain opaque behind a veil of shell companies, trusts, and nominee directors, a passport scan is often the only way to link these entities back to actual people.

However, finding a passport scan in a massive leak is a daunting challenge: the material can span millions of files, including PDFs, emails, images, and scanned documents. Passports rarely have obvious filenames, and Optical Character Recognition (OCR) software struggles with the poor quality of many scans. Journalists previously relied on keyword searches using ICIJ’s open-source search engine, Datashare, filtering for terms like “passport” or “visa” and specific file types...
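To make the keyword approach concrete, here is a minimal Python sketch of filtering a document set by search terms and file type. It is an illustration only and does not use Datashare or its API: the `Document` record, the `keyword_candidates` function, the sample files, and the list of extensions are all hypothetical.

```python
from dataclasses import dataclass


# Hypothetical record type for illustration; not Datashare's data model.
@dataclass
class Document:
    path: str
    text: str  # extracted or OCR text; may be empty when a scan is unreadable


KEYWORDS = ("passport", "visa")                       # terms named in the article
FILE_TYPES = (".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff")  # assumed list


def keyword_candidates(docs, keywords=KEYWORDS, file_types=FILE_TYPES):
    """Return documents of a wanted file type whose name or text mentions a keyword."""
    hits = []
    for doc in docs:
        # Skip files whose extension is not in the wanted set.
        if not doc.path.lower().endswith(file_types):
            continue
        # Case-insensitive substring match over filename and extracted text.
        haystack = f"{doc.path} {doc.text}".lower()
        if any(keyword in haystack for keyword in keywords):
            hits.append(doc)
    return hits


docs = [
    Document("scan_0001.pdf", "PASSPORT / PASSEPORT  Surname ..."),
    Document("IMG_4471.jpg", ""),                        # scan with no usable OCR text
    Document("memo.docx", "visa application attached"),  # file type not in the filter
]

for doc in keyword_candidates(docs):
    print(doc.path)
```

In this sketch only `scan_0001.pdf` is returned. The image with no readable text and a nondescript filename slips through, which mirrors the weakness of keyword search described above.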

© Blitz