It’s officially summer and you’re looking for your next read. But you don’t want to read what everyone else is reading. No, you want something more obscure. You’ll take the books that no one is reading. Better yet, how about the books no one has touched in years?
We (programmatically) sifted through over 100 million checkout records from the Seattle Public Library to find fiction books that haven’t been checked out in over a decade*. Sounds like they’re just your speed.
This project puts a new spin on a publicly available dataset. Instead of showcasing the MOST checked-out books, we looked at the LEAST, hoping to highlight some forgotten gems within the Seattle Public Library’s collection.
Both the checkout and inventory data are publicly available thanks to the Seattle Public Library. You can view our processed data and the R scripts used to process it here. All rating data comes from Goodreads.
What was the hardest part of this project?
The biggest challenge for this project was working within the data caveats, detailed below:
Every book discussed in the article is a fiction title that appeared in the Seattle Public Library’s physical book inventory for the entire span between September 2017 and May 2019 (the earliest and latest dates available for the inventory data), is still available, and was not checked out at any time between September 2005 and May 2019.
Since we don’t have inventory data from 2005 to 2017, some of the books may have entered the library’s collection during that span. To minimize this likelihood, we excluded any books published after 2005. At the absolute minimum, every book on our list has been in the inventory, and has not been checked out, since September 2017.
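The caveats above amount to a four-part filter, which can be sketched roughly as follows. This is an illustrative Python sketch, not the R scripts used for the project: the `Book` record, its field names, the `is_forgotten` helper, and the idea that inventory arrives as monthly snapshots are all assumptions made for the example, and the sample titles are invented.

```python
from dataclasses import dataclass, field
from datetime import date

# Window boundaries taken from the caveats above.
CHECKOUT_START = date(2005, 9, 1)
CHECKOUT_END = date(2019, 5, 31)
PUBLICATION_CUTOFF = 2005


def month_range(start, end):
    """Return every (year, month) pair from start to end, inclusive."""
    months = []
    year, month = start
    while (year, month) <= end:
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# Assumption: one inventory snapshot per month, September 2017 to May 2019.
INVENTORY_MONTHS = set(month_range((2017, 9), (2019, 5)))


@dataclass
class Book:
    """Hypothetical record combining inventory and checkout data."""
    title: str
    is_fiction: bool
    publication_year: int
    inventory_months: set = field(default_factory=set)   # (year, month) pairs
    checkout_dates: list = field(default_factory=list)   # datetime.date values


def is_forgotten(book):
    """Apply the four criteria: fiction, published by 2005, present in
    every inventory snapshot, and no checkout inside the window."""
    in_inventory_throughout = INVENTORY_MONTHS <= book.inventory_months
    checked_out_in_window = any(
        CHECKOUT_START <= d <= CHECKOUT_END for d in book.checkout_dates
    )
    return (
        book.is_fiction
        and book.publication_year <= PUBLICATION_CUTOFF
        and in_inventory_throughout
        and not checked_out_in_window
    )


# Invented sample records, for illustration only.
books = [
    Book("Untouched Novel", True, 1998, set(INVENTORY_MONTHS)),
    Book("Borrowed Once", True, 1998, set(INVENTORY_MONTHS), [date(2011, 3, 14)]),
    Book("Late Arrival", True, 2012, set(INVENTORY_MONTHS)),
]
forgotten_titles = [b.title for b in books if is_forgotten(b)]
```

Note that the publication-year check only reduces the chance that a book joined the collection after 2005; a pre-2005 title acquired later would still pass this filter.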
What can others learn from this project?
A data story doesn’t have to include traditional charts or graphics to be successful. You can communicate data with whimsy without losing integrity.