How gendered is the range of children’s clothing on offer, and what does this mean for the kids? For our data analysis, we examined 20,000 shirts and shorts that
H&M, Zalando and About You offer online for children under 10. The results show that current children’s fashion cements gender stereotypes. According to the prints, boys should surf and girls should dream. And on average, a pair of 30 cm wide shorts for girls is six centimetres shorter than a comparable pair for boys. This data journalism project is about more than colours and cuts; it is about whole worlds that are withheld from some and imposed on others.
The reactions were overwhelming. The piece is behind the paywall and led to an unusually high number of subscriptions. On social media, where we published the key findings and data visualizations in threads, including in English, the analysis was well received, liked, shared and discussed. One of the English Twitter threads now has over 8,000 likes. Domestic and foreign media interviewed our analysis team, and podcasts picked up our findings. We also know of at least three adaptations by French, Norwegian and Swiss journalists. Excitingly, analysts from Zalando contacted us and asked about our methods because they wanted to evaluate how gender-stereotypical their site is.
For the analyses and charts, the team programmed in R and used machine learning models from Google Cloud Vision.
Trouser length: Since the dimensions of the pants on offer were not specified consistently, we calculated the length-to-width ratio from the product images. To do this, the images were first converted into silhouettes, from which outlines could then be extracted. The sample consists of 1,700 pants for girls and 2,106 pants for boys. For the graphical representation, the outlines were set to the same width and the trouser cuffs were superimposed. The middle pair of pants highlighted in the graphs is the one with the median length-to-width ratio. All steps were done in R.
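To make the ratio calculation concrete, here is a minimal sketch. It is written in Python for illustration, not in the R the team used, and it is not the team's code. It assumes the silhouette is already available as a grid of 0/1 pixels and approximates length and width by the bounding box of the silhouette, which is a simplification of the outline-based approach described above. The function names are hypothetical.

```python
from statistics import median_low


def length_to_width_ratio(mask):
    """Height divided by width of the bounding box around all non-zero pixels.

    `mask` is a list of rows, each a list of 0/1 values (assumed input format).
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no silhouette pixels")
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return height / width


def length_at_width(ratio, width_cm=30.0):
    """Length a garment would have if scaled to a common width (30 cm here)."""
    return ratio * width_cm


def median_garment(ratios):
    """Name of the garment whose ratio is the (low) median of all ratios.

    `ratios` maps a garment name to its length-to-width ratio.
    """
    target = median_low(ratios.values())
    return next(name for name, ratio in ratios.items() if ratio == target)
```

Scaling every ratio to the same width is what makes garments of different sizes comparable: a ratio of 0.75 corresponds to a length of 22.5 cm at a width of 30 cm.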
Colors: To find out how the colors of shirts offered for girls and boys differ, we evaluated more than 17,000 product images for their dominant colors using machine learning models from Google Cloud Vision. The ten most dominant colors in each product image were then mapped in R to a palette of 139 shades of the main colors. For the treemaps, the occurrences of each color were summed by gender: the more often a color occurs as a dominant color, the larger the area it occupies in the visualization.
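The mapping and counting step can be sketched as follows. This is an illustration in Python rather than the team's R code. It assumes the dominant colors are already available as RGB triples, uses a tiny made-up palette instead of the 139 colors, and picks the nearest palette entry by squared distance in RGB space, which is an assumption since the text does not say how colors were matched. All names are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny illustrative palette; the actual analysis used 139 colors.
PALETTE = {
    "pink": (255, 192, 203),
    "navy": (0, 0, 128),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "green": (0, 128, 0),
}


def nearest_palette_color(rgb, palette=PALETTE):
    """Return the palette name closest to `rgb` by squared RGB distance."""
    def squared_distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))
    return min(palette, key=squared_distance)


def color_counts_by_gender(items, palette=PALETTE):
    """Sum how often each palette color appears as a dominant color per gender.

    `items` is an iterable of (gender, [rgb, rgb, ...]) pairs, one per product.
    """
    counts = defaultdict(Counter)
    for gender, colors in items:
        for rgb in colors:
            counts[gender][nearest_palette_color(rgb, palette)] += 1
    return counts
```

The resulting per-gender counts are the kind of totals a treemap can be sized by: a color with a higher count gets a larger tile.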
Collecting the patterns and details on shirts was difficult. For an overview, we used a pattern recognition algorithm from the Google Cloud Vision API. But because its output is quite unspecific (for example, it returns only “vertebrate” rather than “dog” or “cat”), we relied more on the information provided by the retailers for details and motifs.
Slogans: To find out which terms distinguish boys’ and girls’ clothing, the word clouds focus on so-called distinctive terms, which we calculated in R: terms that occur particularly frequently on clothing for one gender and (almost) not at all on clothing for the other.
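One simple way to operationalize "distinctive terms" is sketched below, in Python rather than the team's R. It is an assumption, not the team's actual calculation: a term counts as distinctive if it appears at least `min_count` times in one group's slogans and at most `max_other` times in the other group's. The thresholds, function names and example slogans are invented for illustration.

```python
import re
from collections import Counter


def term_counts(slogans):
    """Count lowercase word occurrences across a list of slogan strings."""
    counts = Counter()
    for slogan in slogans:
        counts.update(re.findall(r"[a-z']+", slogan.lower()))
    return counts


def distinctive_terms(own, other, min_count=2, max_other=0):
    """Terms frequent in `own` slogans and (almost) absent from `other`.

    Returns (term, count) pairs sorted by descending count, then alphabetically.
    """
    own_counts = term_counts(own)
    other_counts = term_counts(other)
    terms = [
        (term, n)
        for term, n in own_counts.items()
        if n >= min_count and other_counts[term] <= max_other
    ]
    return sorted(terms, key=lambda pair: (-pair[1], pair[0]))


# Invented example slogans, not taken from the dataset.
boys = ["Surf the wave", "Surf club", "Best day ever"]
girls = ["Dream big", "Sweet dream", "Best friends"]
print(distinctive_terms(boys, girls))
print(distinctive_terms(girls, boys))
```

Raising `max_other` above zero allows for the "(almost) not at all" case, where a term shows up a handful of times on the other side but is still overwhelmingly tied to one gender.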
Context about the project:
The topic of stereotypes in children’s clothing, especially with regard to color selection, is not a new one, but it is one that is always current. Those of us who have children have had the feeling that the gender division at clothing brands like H&M is stronger than ever, despite the gender debate, although we actually know better and role thinking should no longer be as strongly anchored in society as it was 50 years ago. This summer, we were also particularly struck by the fact that young girls wear much shorter shorts than boys of the same age. And this at a time when the LGBTQ movement is being accused of the early sexualization of children. With our methods, we data journalists can examine or prove a perceived truth with data. We wanted to verify our impression with data journalistic methods and feed the debate with numbers. We wanted to know: Is it not just individual taste, but are parents and very young children already being purposefully steered into stereotypes by an offer that is pre-sorted by gender? And what does that do to the children?
This has never been analyzed so systematically; there are no databases on the cuts and colors of children’s fashion. That is why we collected the data ourselves. One particular aspect of the analysis was the comparison of shorts lengths, for which we worked with product images and outline analysis techniques.
What can other journalists learn from this project?
The project and its audience success show what an important part data journalists play in verifying perceived truths and providing facts to ongoing debates.