Since early in the COVID-19 pandemic, America’s Frontline Doctors has been lying about public health, including spreading disinformation about vaccines and promoting the unproven medicines hydroxychloroquine and ivermectin as miracle cures for the virus. Our investigation, based on hacked records that an anonymous source sent us from AFLDS’s telehealth partner SpeakWithAnMD and online pharmacy Ravkoo, showed that during a two-month period, 72,000 patients paid $6.7 million for phone consultations alone. The vast majority of the 340,000 prescriptions filled were for ineffective COVID-19 medications, costing patients $8.5 million for these bogus drugs during 11 months of the pandemic.
Following our story, the House Select Subcommittee on the Coronavirus Crisis, a committee in the U.S. House of Representatives, announced an investigation into AFLDS and SpeakWithAnMD, citing our work extensively. The committee also wrote a letter to the chair of the Federal Trade Commission requesting that the agency “investigate the deceptive conduct of companies promoting and profiting from misinformation” about the pandemic.
In response to our reporting, and similar reporting by other newsrooms about AFLDS, the Medical Board of California is investigating Dr. Simone Gold, the founder of AFLDS, who was arrested for storming the Capitol on January 6, 2021. We have confirmed the investigation, which could lead to the revocation of her medical license.
An anonymous hacker sent The Intercept health care records from SpeakWithAnMD’s partner Cadence Health and from the online pharmacy Ravkoo. The Cadence Health data was in hundreds of thousands of JSON files full of patient data, and the Ravkoo data was in enormous CSV files full of prescription data.
Because there was so much data and it was all in machine-readable formats, roughly half of the time spent on the story went to writing Python code to parse the data and convert it into usable components. These included spreadsheets containing useful fields related to patients, simplified versions of the prescription data that could be graphed in a pie chart, and spreadsheets listing city names, number of patients, and geolocation coordinates to build a map.
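As an illustration of this kind of conversion, here is a minimal Python sketch, not the code we actually ran. It assumes a directory containing one JSON file per patient and a CSV of prescriptions. The function names and the field and column names passed to them (such as a city field or a drug column) are hypothetical, since the real schemas are not described here.

```python
import csv
import json
from collections import Counter
from pathlib import Path


def patients_to_rows(json_dir, fields):
    """Yield one flat dict per JSON file, keeping only the requested fields.

    The field names are supplied by the caller; missing fields become "".
    """
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            record = json.load(f)
        yield {field: record.get(field, "") for field in fields}


def write_spreadsheet(rows, fields, out_path):
    """Write the flattened rows to a CSV file that opens as a spreadsheet."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)


def count_by_column(csv_path, column):
    """Count rows per value of one column, e.g. prescriptions per drug.

    Values are lowercased and stripped so minor spelling variants merge.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row[column].strip().lower()] += 1
    return counts
```

Everything here uses only the Python standard library and runs on a local machine. The counts returned by `count_by_column` are small enough to paste into a charting tool without exposing any individual record.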
We also used several other tools and technologies for this project. To untangle the history and ownership of domain names belonging to a network of LLCs, we made extensive use of the service WhoisXML API to look up historical domain name records. The Internet Archive’s Wayback Machine was extremely useful in building a timeline of AFLDS’s activities. We also used the OSINT tool Maltego to build a graph showing the links between the various organizations, individuals, and domain names.
What was the hardest part of this project?
The hardest part of this project was making sense of massive, unintelligible datasets in a short period of time, all while protecting the privacy of individuals. Because the data we were working with consisted of sensitive patient and prescription records that are subject to HIPAA regulations, we had to take precautions to protect it carefully, including not uploading any of it to third-party services.
The anonymous source provided hundreds of thousands of JSON files and several CSV files, each containing hundreds of thousands of rows, but without any documentation. It wasn’t until we began looking closely at the data, and parsing it all using Python scripts, that what we actually had became clear: a network of companies was making millions of dollars off pandemic misinformation. The fact that these were two separate datasets, one from the pharmacy and the other from the telehealth company, each covering a different period of time, made it all the more complicated.
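One way to get oriented in undocumented JSON is to survey which fields exist and how often they appear. The sketch below is an assumption about how such a survey could be done, not a description of our scripts. The function names `iter_key_paths` and `survey_fields` are hypothetical, as is the layout of one record per file.

```python
import json
from collections import Counter
from pathlib import Path


def iter_key_paths(value, prefix=""):
    """Yield dotted key paths found in a nested JSON value.

    List elements are marked with "[]" so every item in a list maps to
    the same path regardless of its position.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path
            yield from iter_key_paths(child, path)
    elif isinstance(value, list):
        for child in value:
            yield from iter_key_paths(child, f"{prefix}[]")


def survey_fields(json_dir):
    """Count, for each key path, how many files in json_dir contain it."""
    counts = Counter()
    for path in Path(json_dir).glob("*.json"):
        with open(path, encoding="utf-8") as f:
            record = json.load(f)
        # A set ensures each path is counted once per file.
        counts.update(set(iter_key_paths(record)))
    return counts
```

Sorting the result with `survey_fields(some_dir).most_common()` puts the fields present in nearly every file at the top, which is usually a reasonable place to start reading.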
What can others learn from this project?
Journalists have a responsibility toward the public and toward the data they use and publish. We found ways to responsibly investigate a large hack of medical records without compromising the privacy of individuals: we compared the records against public social media posts by people who appeared in the hacked data to confirm that they were patients, and we conducted all of the data analysis locally, without sharing the data with third parties.