We looked at the gender and race/ethnicity of the people who appeared in clues and answers across five major crossword publications.
Direct from the story: “Crosswords tell us something about what we think is worth knowing. A puzzle that subtly promotes the idea that white men are the standard, the people everyone should know about, is a problem for all of us (yes, even the white men).”
The piece was a first-of-its-kind analysis and prompted many crossword devotees to change which puzzles they play and subscribe to.
The data collection and cleaning were done manually and in Python. The site was built using HTML/CSS/JS, including React and D3.js.
What was the hardest part of this project?
The hardest part of the project was manually collecting gender and race/ethnicity data. We had a team that combed through a sample of crossword clues and answers from each puzzle and manually researched how each person identified. Then we cross-checked each other’s work. It was by far the most time-consuming part of the project, but it was also a foundation that we had to get right. When we were collecting the data, you could instantly feel the changes in the people who were included as you moved from publication to publication. We wanted to recreate that feeling for our readers using a playable crossword.
What can others learn from this project?
- Big data doesn’t have to mean programmatically collected data. Sometimes you have to get your hands dirty.
- Just because you anecdotally know what the data will show doesn’t mean that it’s not worth showing to others in a new way.