Since 2020 we have been scraping Dutch sex advertisement websites to find scammers. We analyze this data for phone numbers that occur in different advertisements, people who seem to travel a lot, and copy-pasted advertisement text. We’ve built an automated analysis process that finds interesting leads in these advertisements, so we can research each lead separately.
This resulted in a series of articles, tv episodes and interactives about online scammers in sex advertisements and on fake dating websites. During this investigation, we used data journalism, OSINT and AI, and we went undercover to find out who is behind these scams.
This project tells the stories of people who are not seen as victims in our society. When you are scammed while paying for online sex or trying to find love, the first reaction you get is: ‘Well, it’s your own fault.’ The criminals behind these scams don’t have to be afraid of getting caught, because embarrassment and little attention from the police lead to few police reports. Meanwhile, the victims can suffer tens of thousands of euros in damages. And that’s not even considering the loss of trust these victims experience.
By uncovering these scams, we made people aware of fraudulent dating websites and of safe dating practices. For sex workers, the scammers pose a danger to their safety and welfare: clients can be chastised for the fraud of a scammer.
We started our investigation by scraping five different websites with sex advertisements. These scrapers are written in Python and run as a cron job: we visit these websites every week to collect new advertisements.
After that, a cron job runs an analysis in R to find the most suspicious advertisements. We look at several clues, like phone numbers that occur in multiple advertisements, people who seem to travel a lot, and advertisement text that is nearly identical to the text of other advertisements. After this analysis, we receive a list of the most promising leads.
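The three clues above can be sketched in a few lines. This is a minimal illustration in Python, not our production analysis (which runs in R). The sample advertisements, the field names (`id`, `phone`, `city`, `text`) and the thresholds are all assumptions made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the field names are assumptions for illustration.
ADS = [
    {"id": "a1", "phone": "0612345678", "city": "Amsterdam",
     "text": "Sweet student, new in town, call me today"},
    {"id": "a2", "phone": "0612345678", "city": "Rotterdam",
     "text": "Sweet student, new in town, call me tonight"},
    {"id": "a3", "phone": "0612345678", "city": "Groningen",
     "text": "Relaxing massage by appointment only"},
    {"id": "a4", "phone": "0687654321", "city": "Utrecht",
     "text": "Experienced lady, discreet private address"},
]


def group_by_phone(ads):
    """Collect advertisements per phone number."""
    groups = defaultdict(list)
    for ad in ads:
        groups[ad["phone"]].append(ad)
    return groups


def reused_phones(ads, min_ads=2):
    """Phone numbers that appear in at least `min_ads` advertisements."""
    return {
        phone: [ad["id"] for ad in group]
        for phone, group in group_by_phone(ads).items()
        if len(group) >= min_ads
    }


def travelling_phones(ads, min_cities=3):
    """Phone numbers advertised in at least `min_cities` different cities."""
    result = {}
    for phone, group in group_by_phone(ads).items():
        cities = sorted({ad["city"] for ad in group})
        if len(cities) >= min_cities:
            result[phone] = cities
    return result


def near_duplicates(ads, threshold=0.85):
    """Pairs of advertisements whose text similarity is at least `threshold`."""
    pairs = []
    for a, b in combinations(ads, 2):
        ratio = SequenceMatcher(None, a["text"].lower(), b["text"].lower()).ratio()
        if ratio >= threshold:
            pairs.append((a["id"], b["id"], round(ratio, 2)))
    return pairs


if __name__ == "__main__":
    print("Reused phone numbers:", reused_phones(ADS))
    print("Phones seen in many cities:", travelling_phones(ADS))
    print("Near-duplicate texts:", near_duplicates(ADS))
```

Comparing every pair of advertisements is fine for a small sample, but it grows quadratically. With thousands of advertisements a real pipeline would probably need to narrow down the candidates first, for example by only comparing advertisements from the same week.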
Often, analyzing the photographs is enough to find suspicious advertisements. With reverse image search, we can determine that most of these advertisements are fake. We also collect the fake advertisements in a Google Spreadsheet, which feeds into the next analysis loop. This way, we can find a lot more scammers in the next round.
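A rough sketch of such a feedback loop is shown here. It assumes the spreadsheet of confirmed fakes is exported as CSV, and that new advertisements are matched on phone number or on an exact image hash. The column names, the sample rows and the matching rules are hypothetical and only illustrate the idea.

```python
import csv
import io

# Stand-in for a CSV export of the spreadsheet with confirmed fakes.
# The column names and rows are assumptions for illustration.
CONFIRMED_FAKES_CSV = """ad_id,phone,image_sha256
a1,0612345678,aaa111
a7,0699999999,bbb222
"""

# Hypothetical advertisements collected in the next scraping round.
NEW_ADS = [
    {"id": "n1", "phone": "0612345678", "image_sha256": "ccc333"},
    {"id": "n2", "phone": "0600000000", "image_sha256": "bbb222"},
    {"id": "n3", "phone": "0611111111", "image_sha256": "ddd444"},
]


def load_known_fakes(csv_text):
    """Return the sets of phone numbers and image hashes of confirmed fakes."""
    phones, images = set(), set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        phones.add(row["phone"])
        images.add(row["image_sha256"])
    return phones, images


def flag_new_ads(new_ads, phones, images):
    """Map each suspicious new advertisement to the reasons it was flagged."""
    flagged = {}
    for ad in new_ads:
        reasons = []
        if ad["phone"] in phones:
            reasons.append("known phone")
        if ad["image_sha256"] in images:
            reasons.append("known image")
        if reasons:
            flagged[ad["id"]] = reasons
    return flagged


if __name__ == "__main__":
    known_phones, known_images = load_known_fakes(CONFIRMED_FAKES_CSV)
    print(flag_new_ads(NEW_ADS, known_phones, known_images))
```

An exact hash only matches byte-identical image files, so a cropped or recompressed photo would slip through. That is one reason a manual check with reverse image search remains necessary.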
After a while, we got a tip about fake dating websites. This person claimed to have worked at a dating website, where she was paid to play the role of several dating profiles. Her only job was to keep genuine customers talking (because they’re paying per sent message). We decided to go undercover as a so-called chat agent. We could confirm that thousands of fake dating websites are active in the Netherlands. We also obtained the manuals that some websites give to new employees. We made an interactive about the tactics these chat agents deploy to keep you talking.
After that, we got another tip about fake profiles on Tinder. People who are trying to find someone there are being lured to these fake dating websites. We hired the world record holder for Tinder matches to build a Tinder scraper. We trained an AI to find hundreds of fake profiles.
Context about the project:
In 2018, our colleagues uncovered that a lot of fake dating websites are online in the Netherlands. You can’t find real profiles on these websites: instead, you’re connected to a so-called chat agent, a person who’s being paid to keep you talking indefinitely. These websites don’t communicate that they’re fake.
But a lot of readers still think that the customer is to blame. The general consensus is that they should have taken more care when signing up for an account. We wanted to get a peek behind the scenes at how these chat agents are trained for the job. We successfully got a job as a chat agent to confirm our suspicions. From that experience, we made a tv episode and an interactive story on how people are being scammed. These websites use a variety of smart tricks to keep customers engaged and chatting.
Our Tinder research was one of our first experiments with Lobe.AI: software that allows you to train an AI to categorize images. We trained our AI to recognize fake profiles on Tinder, so that we didn’t have to manually look through hundreds of thousands of images.
For our investigation into sex advertisements, we built our own scraping and analysis process. If we had had to look through thousands of advertisements by hand, we probably wouldn’t have found anything. But because we automated most of the work, we had a lot more time for verification and additional research.
For our investigation into webcam scamming, we had to use a bit of OSINT to verify the location of a hotel in Warsaw (Poland). We had a lot of images from this fraudulent webcam company, and we could verify the exact hotel room thanks to the wood grain in a wooden beam.
What can other journalists learn from this project?
This investigation shows that data journalism skills, OSINT and AI can be used to investigate a very difficult topic. When it’s hard to find victims or suspects, these techniques can provide a lot of information to write about. One of my firm beliefs is that stories make stories: when you write about a topic for long enough, you will get more and better tips. So at a certain moment, these hard-to-find victims came to us by themselves. They wouldn’t have done that if we hadn’t published our first few stories, which were mostly made with data.
During this project, we also had the feeling that we could experiment a bit. Our Tinder research could’ve failed at any moment, because it was a learning process all along. But that was okay. Nobody knew that we were working on it, so we were the only ones who would have known if we had failed. It’s cool that there are still opportunities for journalists to tinker away with new technologies and techniques. And it’s extra rewarding if those risks pay off with some unique, exciting and important stories about scams and frauds that aren’t talked about that much.