I performed an in-depth analysis of the Himalayan Database to find out why there were such crowds on the summit of Mount Everest this year. I followed two main threads. First, I looked at how ascent and summit patterns have changed over the last decades. Second, I explored how the character of expeditions has shifted towards older members who are more thoroughly assisted by hired staff and whose climbs predominantly end with a successful summit bid. The data tell a story of impressive technical progress in high-altitude mountaineering, but also one of commercialization of an activity once considered available only to highly skilled explorers.
The project provides new insights to the mountaineering community and empowers it to develop better policies for dealing with the problem of overcrowding.
I used Python and Pandas for data extraction from the Himalayan Database, as well as subsequent cleaning and manipulation. I used Matplotlib with Seaborn to create custom data visualizations.
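As an illustration of the kind of Pandas aggregation described above, here is a minimal sketch that computes a summit rate and median age per decade for non-hired members. It is not the code used in the project: the column names (`year`, `age`, `hired`, `summited`), the helper `summarize_by_decade`, and the sample rows are hypothetical and do not reflect the actual Himalayan Database schema or its contents.

```python
import pandas as pd

# Hypothetical sample of expedition member records. Column names and values
# are made up for illustration, not taken from the Himalayan Database.
members = pd.DataFrame(
    {
        "year": [1985, 1988, 1996, 1999, 2007, 2008, 2018, 2019],
        "age": [29, 34, 38, 41, 45, 39, 52, 47],
        "hired": [False, False, False, True, False, True, False, False],
        "summited": [False, True, False, True, True, True, True, True],
    }
)


def summarize_by_decade(df):
    """Return member count, summit rate and median age per decade,
    counting only non-hired members."""
    paying = df.loc[~df["hired"]].copy()
    # Integer-divide the year to bucket records into decades (1985 -> 1980).
    paying["decade"] = (paying["year"] // 10) * 10
    return paying.groupby("decade").agg(
        members=("summited", "count"),
        summit_rate=("summited", "mean"),  # mean of booleans = share of True
        median_age=("age", "median"),
    )


summary = summarize_by_decade(members)
print(summary)
```

A table like `summary` could then be passed to Matplotlib or Seaborn, for example as a line chart of `summit_rate` against `decade`.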
What was the hardest part of this project?
The Himalayan Database is a unique and unfortunately quite overlooked dataset. It contains records of most of the mountaineering activities in the Himalayas since 1910 (all expeditions excluding the Pakistani area). It has been created and maintained by a handful of passionate people, and the amount of detailed data it contains is invaluable and unmatched on a global scale. However, due to the closed nature of the project (it was open-sourced only in 2017) and the somewhat obscure way of accessing it (the authors provide only proprietary software built on top of an SQL database), it hasn’t gotten the analytic attention it deserves. Some analysis has been done already, but not in enough depth to tell a meaningful story: mostly simple aggregate statistics with relatively shallow commentary, used to illustrate a more qualitative article. I approached the dataset from the perspective of a data scientist and carved out a more detailed and multifaceted story, tightly connected to the data rather than to stereotypes. Diving into the ecosystem of mountaineering blogs and mainstream articles, I identified the main questions and claims the public is discussing right now, such as: is it very dangerous to climb Everest nowadays because of the crowds, is crowding on the summit a very recent phenomenon, and who are the people taking part in the expeditions. I confronted them with the data. In some cases I confirmed the common knowledge present in the community, but in others I invalidated it or shed new light on the topic. After publication, I received a lot of feedback from the community, thanking me for the in-depth insights and their balanced interpretation.
What can others learn from this project?
In a way, the story hidden in this dataset is a story of the commercialization of adventure. Only 70 years ago, summiting Everest was an unbelievable achievement at the limits of human skill and endurance. Even in the 80s, it was mostly professional mountaineers who came to the Himalayas. Now the crowds on Everest are composed in large part of older men who can afford the trip in terms of both time and money, no matter their country of origin, and the rise in arrivals is driven consistently by ever older population groups. This behavior is not exclusive to mountaineering: a very similar phenomenon has been described in the cycling world as the rise of MAMILs (Middle Aged Men In Lycra). While this says nothing about the motivation or passion of the expedition members, the data show that the summit of Everest, with a summit rate of 80%, is currently pretty much available to everyone who can afford to come (there are even “full-service” expeditions for people who are not at their peak fitness, available for fees of around 100 000 USD). Interestingly, the data also hint that this shift has been made possible by the accumulated knowledge about the mountain and by great technical improvements in mountaineering gear, ranging from axes and boots to lightweight insulation and oxygen masks. This makes it a nice illustration of the intimate coupling between nature, technology and culture, visible only through the analysis of this unique dataset.