The project is about the phenomenon of ‘Oscar Bait’: movies produced mainly to win major film awards. We used Gabriel Rossman and Oliver Schilke’s algorithm to predict how many Oscar nominations a movie gets using only formal variables such as release date, genre and IMDB keywords.
We discovered that most Oscar ‘Best Picture’ winners since 1990 have had the highest Oscar Bait scores. This probably means that an Oscar-winning film doesn’t have to be truly great: it’s enough that it satisfies the jury’s tastes.
Our media aren’t interested in promoting our work, so we have only a small readership.
Python, JS, API, WebGL, Machine Learning, Tableau (prototypes), Big Data, Parallel Downloading
What was the hardest part of this project?
There were four hard parts in this work:
1. Data downloading. We used the IMDB API to download about 2 GB of data (more than 100,000 movies). It took about a week to download it all.
2. Understanding the algorithm. The calculations in this work are based on Gabriel Rossman and Oliver Schilke’s paper, Close, But No Cigar: The Bimodal Rewards to Prize-Seeking. It was really challenging to get the binomial regression formula right to compute the Oscar Bait score.
3. Machine learning. It was our first project using ML, so it was pretty hard for us.
4. Data visualisation with OpenGL. There are a lot of objects in the visualisation (about 15,000), so we had to learn OpenGL to build an interactive visualisation without bugs.
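The week-long download in point 1 is the kind of job a thread pool speeds up. Below is a minimal sketch of a parallel downloader; the endpoint URL is a placeholder and the `fetch_movie`/`fetch_all` names are our own illustration, not the actual IMDB API client we wrote:

```python
import concurrent.futures
import json
import urllib.request

# Placeholder endpoint, not the real API URL.
API_URL = "https://example-movie-api.org/title/{}"

def fetch_movie(movie_id):
    """Download metadata for one movie and parse the JSON response."""
    with urllib.request.urlopen(API_URL.format(movie_id), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))

def fetch_all(movie_ids, fetch=fetch_movie, workers=16):
    """Fetch many movies concurrently with a thread pool.

    Failed downloads are recorded as None so a later retry pass can pick
    them up instead of aborting a multi-day job.
    """
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch, mid): mid for mid in movie_ids}
        for fut in concurrent.futures.as_completed(futures):
            mid = futures[fut]
            try:
                results[mid] = fut.result()
            except Exception:
                results[mid] = None  # keep going; retry failures later
    return results
```

With 16 workers, a fetch-bound job runs roughly 16 times faster than a serial loop, subject to API rate limits.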
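The score in point 2 boils down to a regression over formal features. Here is a minimal illustrative sketch of that shape: a linear predictor over genre/keyword/release-date dummies passed through a logistic link. The coefficient values are made up for illustration; the real ones come from fitting the regression on historical nomination data:

```python
import math

# Made-up coefficients for a few formal variables, purely for illustration.
# In the real model these are estimated from past Oscar nominations.
COEFS = {
    "intercept": -4.0,
    "genre_drama": 1.2,
    "genre_biopic": 1.5,
    "q4_release": 0.9,     # released October-December, in time for awards season
    "kw_war": 0.8,         # IMDB keyword signal
}

def oscar_bait_score(features):
    """Sum the coefficients of the features a film has, then squash
    through a logistic function into a 0..1 'bait' score."""
    z = COEFS["intercept"] + sum(COEFS[f] for f in features if f in COEFS)
    return 1.0 / (1.0 + math.exp(-z))
```

Each extra bait feature pushes the score up monotonically, which is what makes a ranked Oscar Bait list possible.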
What can others learn from this project?
1. A use case of ML in journalism
2. A use case of visualising thousands of objects
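On the second point, the usual trick for drawing ~15,000 objects interactively is to pack them all into one flat vertex buffer and render it in a single draw call, instead of issuing thousands of separate draws. A minimal Python sketch of just the packing step (the GL renderer itself is out of scope, and `pack_points` is our own illustrative helper):

```python
import array
import random

def pack_points(points):
    """Flatten (x, y, size) tuples into one contiguous float32 buffer,
    ready to upload as a single vertex buffer object."""
    buf = array.array("f")  # typecode "f" = 4-byte float
    for x, y, size in points:
        buf.extend((x, y, size))
    return buf

# ~15,000 points become one buffer of 45,000 floats.
points = [(random.random(), random.random(), 2.0) for _ in range(15000)]
vbo_data = pack_points(points)
```

One upload plus one draw call keeps the frame rate steady where 15,000 individual draws would stutter.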