To find out how women of all shades were represented in Vogue magazine over the last 19 years, we programmatically calculated how light a model’s skin tone looked in each photograph. At a glance, you could argue that Vogue covers are diverse, or at least that they have gotten more diverse in recent years. But when we really look, it’s easy to see that the majority of the black women are light-skinned and the majority of dark-skinned women are actually a single person.
The project made waves on social media, was shared by The Guardian, and spurred conversations around colorism, tokenism, and representation in the fashion industry.
The covers were downloaded from the Vogue archive and then fed through a Python script that identified female faces. For each face, several k-means clustering models were fit. The models varied in which features they used (some combination of the RGB and HSL color values) and in how many clusters they formed (two or three). Because the style of the covers varied so widely, different clustering models worked best at identifying the skin on different covers. The script kept only the pixels that the model classified as skin, then calculated and stored their median RGB color, as well as the corresponding lightness value.
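To make the pipeline concrete, here is a minimal sketch of the clustering and lightness steps in Python. It is not the project's actual script. It assumes the face has already been cropped to a list of RGB pixels, clusters on RGB values only (the real models also tried HSL features), and guesses that the largest cluster is skin, a selection rule the write-up does not specify. The names `kmeans` and `skin_tone` are hypothetical.

```python
import colorsys
from statistics import median


def kmeans(points, k, iters=20):
    """Tiny k-means on tuples of floats (illustrative, not optimized)."""
    # Deterministic start: pick k points evenly spaced through the sorted data.
    ordered = sorted(points)
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers


def skin_tone(pixels, k=2):
    """Return (median RGB, lightness in [0, 1]) for a face crop.

    pixels: list of (r, g, b) tuples with values 0..255.
    k: number of clusters (the write-up mentions two or three).
    """
    labels, _ = kmeans([tuple(float(v) for v in p) for p in pixels], k)
    # Assumption: in a tight face crop, the largest cluster is skin.
    skin_label = max(range(k), key=labels.count)
    skin = [p for p, lab in zip(pixels, labels) if lab == skin_label]
    # Per-channel median of the skin pixels.
    med = tuple(median(channel) for channel in zip(*skin))
    # colorsys returns (hue, lightness, saturation) for inputs scaled to 0..1.
    _, lightness, _ = colorsys.rgb_to_hls(*(v / 255 for v in med))
    return med, lightness


# Toy face crop: six skin-like pixels and two dark "hair" pixels.
face = [
    (200, 150, 120), (202, 152, 122), (198, 148, 118),
    (201, 151, 121), (199, 149, 119), (200, 150, 120),
    (20, 15, 10), (22, 17, 12),
]
median_rgb, lightness = skin_tone(face, k=2)
```

The median is used rather than the mean so that a few stray highlight or shadow pixels inside the skin cluster do not shift the stored tone. In practice a library implementation such as scikit-learn's `KMeans` would likely replace the hand-rolled loop.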
What was the hardest part of this project?
The most challenging part of the project was making sure that we didn’t gloss over the more nuanced findings. Overall, Vogue models’ skin tones have gotten more diverse, but we needed to effectively communicate that a single person, Lupita Nyong’o, was pulling the trend toward the darker end of the spectrum and that you still saw curious overlaps within skin tone ranges. For example, Anne Hathaway’s darkest tone overlapped Rihanna’s lightest tone.
What can others learn from this project?
This project is a great example of what The Pudding does best: taking a common occurrence like magazine covers and turning it into a thoughtful and thorough data-backed critique of deeper cultural issues. It’s also proof that data viz by women and for women is important, powerful, and deserves space in this industry.