2022 Shortlist

Fashion brands aren’t keeping their Instagram diversity promises

Country/area: United States

Organisation: Quartz

Organisation size: Big

Publication date: 16/03/2021

Credit: Amanda Shendruk, Marc Bain, David Yanofsky


Amanda Shendruk is a visual journalist on Quartz’s Things team. She reports at the intersections of code, data and design.

Marc Bain was Quartz’s fashion reporter. He covered anything and everything related to clothes and footwear, whether sneakers or luxury, business, or design.

David Yanofsky is the editor of Quartz’s Things team, the publication’s cohort of journalists who use code-based methods to originate and execute their stories.

Project description:

A year after fashion and beauty companies took to Instagram en masse to show support for the Black community and the Black Lives Matter movement, our analysis of 27,000 images posted by 34 brands showed that while many did increase the diversity of skin tones in their Instagram images, the increases were often only marginal. Light-skinned models still prevail.

We made this readily apparent with interactive and static data visualization.

Impact reached:

The piece was one of the more widely read items on our site and was especially well read by members of the fashion and beauty industry. Researchers of inequity and company representatives reached out asking us to share our data and methods so that they could bring better accountability to their organizations and study the issue further. Influencers shared the story and graphics with their followers. Fashion influencer Bryanboy called it “very essential reading.” Later in the year, our data and graphics were included in an episode of The BoF Show on Bloomberg TV.

Techniques/technologies used:

First, we used custom-built tools written in Python and Node.js to collect and store Instagram posts. Then we constructed a database front end that allowed us to evaluate and categorize every image we collected. That piece of software was written in Node.js.

We then analyzed our data using the Python library pandas. We visualized the data using HTML, CSS, and D3.js, and added interactivity using JavaScript.
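
As an illustration only (this is not the code used for the story), the before-and-after comparison might be sketched as follows. The sketch uses Python's standard library instead of pandas so that it runs on its own; in pandas the same tally would be a group-by on brand and period. The record layout, the "light" and "dark" labels, the sample rows, and the function name `light_share_by_period` are all assumptions. The cutoff is the date of Blackout Tuesday, 2 June 2020.

```python
from collections import defaultdict
from datetime import date

BLACKOUT_TUESDAY = date(2020, 6, 2)


def light_share_by_period(posts):
    """Return {brand: {"before": share, "after": share}} of light-toned images.

    `posts` is an iterable of (brand, posted_on, tone) tuples. The tuple
    layout and the tone labels are hypothetical, chosen for this sketch.
    """
    counts = defaultdict(lambda: {"before": [0, 0], "after": [0, 0]})
    for brand, posted_on, tone in posts:
        # Posts made on Blackout Tuesday itself are counted as "after".
        period = "before" if posted_on < BLACKOUT_TUESDAY else "after"
        tally = counts[brand][period]
        tally[0] += tone == "light"  # images labelled light
        tally[1] += 1                # all images in the period
    return {
        brand: {
            period: (light / total if total else None)
            for period, (light, total) in periods.items()
        }
        for brand, periods in counts.items()
    }


# Made-up rows, for demonstration only.
sample = [
    ("brand_a", date(2020, 3, 1), "light"),
    ("brand_a", date(2020, 4, 1), "light"),
    ("brand_a", date(2020, 7, 1), "light"),
    ("brand_a", date(2020, 8, 1), "dark"),
]
shares = light_share_by_period(sample)
```

A period with no posts yields `None` rather than a share, so a brand that went silent is not mistaken for one that posted no light-toned images.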

The visualizations have three modes that allow readers to explore the data: a timeline view, a clustered gradient view, and a combination of the two, which shows two clusters split by whether the post was from before or after Blackout Tuesday. These three modes deftly showed how long brands stopped posting to Instagram during the US unrest, the distribution of skin tones depicted on a brand’s account, and how that distribution changed after Blackout Tuesday. In all three views, each dot can be tapped or moused over to reveal the image it represents.
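
The posting pause that the timeline view makes visible can also be expressed as a number. The following is a minimal sketch, not the story's code: the function name `longest_gap_days` and the sample dates are invented for illustration.

```python
from datetime import date


def longest_gap_days(post_dates):
    """Return the longest gap, in days, between consecutive posts."""
    ordered = sorted(post_dates)
    if len(ordered) < 2:
        return 0  # a gap needs at least two posts
    return max(
        (later - earlier).days
        for earlier, later in zip(ordered, ordered[1:])
    )


# Made-up posting dates for one hypothetical brand.
dates = [date(2020, 5, 30), date(2020, 6, 2), date(2020, 6, 12), date(2020, 6, 13)]
gap = longest_gap_days(dates)
```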

We size-optimized the photographs using the command-line tool ImageMagick.
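
A size-optimization step of this kind could look like the sketch below, which builds an ImageMagick command from Python. The exact settings used for the story are not stated, so the 600-pixel bounding box, the quality value of 80, the file names, and the helper names `resize_command` and `optimize` are assumptions. The `-resize`, `-strip`, and `-quality` options are standard ImageMagick options; the `magick` executable name assumes ImageMagick 7 (version 6 uses `convert`).

```python
import subprocess


def resize_command(src, dst, max_px=600, quality=80):
    """Build an ImageMagick command that shrinks and recompresses one image."""
    return [
        "magick", src,
        "-resize", f"{max_px}x{max_px}>",  # shrink only if larger than the box
        "-strip",                          # drop metadata such as EXIF
        "-quality", str(quality),          # JPEG compression level
        dst,
    ]


def optimize(src, dst):
    """Run the command; requires ImageMagick 7 to be installed."""
    subprocess.run(resize_command(src, dst), check=True)


cmd = resize_command("post.jpg", "post_small.jpg")
```

Passing the arguments as a list avoids shell quoting problems with the `>` in the geometry string.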

What was the hardest part of this project?

Collecting this data was extremely hard. Instagram does not have an API and the site will block IP addresses that it perceives as trying to harvest data. Nevertheless, we devised ways to collect the data without violating the site’s terms of use and without being blocked.

But that was just the start. We then used the software we wrote to evaluate each image by hand, establishing the number of people in the image, their skin colors, and whether or not the image was suitable for inclusion in our analysis. 
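
The three things recorded for each image (how many people appear, their skin colors, and whether the image is suitable for analysis) suggest a simple record structure. The sketch below is hypothetical: the class name `ImageReview`, its field names, and the tone labels are assumptions, not the schema of the tool described above.

```python
from dataclasses import dataclass, field


@dataclass
class ImageReview:
    """One hand-made evaluation of a single image (hypothetical schema)."""
    post_id: str
    people_count: int
    skin_tones: list = field(default_factory=list)  # one label per person
    suitable: bool = True  # False if the image is excluded from analysis

    def is_consistent(self):
        """Check that there is exactly one tone label per person counted."""
        return self.people_count >= 0 and len(self.skin_tones) == self.people_count


complete = ImageReview("post_1", 2, ["light", "dark"])
incomplete = ImageReview("post_2", 1)
```

A check such as `is_consistent` is one way to catch data-entry slips when every image is categorized by hand.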

What can others learn from this project?

First, our project is a great example of how to hold organizations accountable through data. Second, it shows the opportunity for journalists to create data where none previously existed. There was no dataset of the skin tones of models promoted by fashion and beauty brands, despite the information being in plain view. We were willing to put in the work and made a first-of-its-kind dataset.

Project links: