Since I started this “pointless” research project, something has happened that has made my efforts ever so slightly less pointless. Last Monday (June 14), US regulators approved the first movie futures contract, to be offered on the movie Takers (Aug 20). The idea could still be scrapped in the financial reform, but for now, this means that people can place legitimate bets on how much revenue a movie will gross (oh my god, that’s like the point of my project!). I actually finished up everything on Tuesday, so I guess that was some auspicious timing.
Now, this project was not pursued in a rigorous, academic manner, and the results should be taken lightly. It’s unlikely that the model would be much use for predicting gross revenue figures of movies for purposes like the one mentioned above.
Methodology
Dependent variable and sample size
As the intention was to see how factors affected a movie’s gross revenues, Worldwide Gross Revenue was the dependent variable. I decided to use worldwide as opposed to domestic because it would provide a bigger range of revenues.
I defined “High Grossing Movie” as any movie from 2005 until now that has grossed $150MM or more. I wanted to get a decently sized sample within the past 5-6 years to limit the impact of inflation. A $150MM cutoff happened to work out perfectly to a sample size of 200 movies.
Quantitative independent variables
Rotten Tomatoes Rating – proxy for movie quality, because presumably, the better a movie, the more people will go see it, right? This is the main reason I wanted to do this project: I was curious if critic ratings actually meant anything.
Budget – this number covers everything – cast, director, writers, production, effects, music, marketing, copyrights, etc. The more effort and resources you invest into a movie, the higher a return you would expect. This also helped me solve the issue of incorporating the number of “stars” that were cast. To measure that directly would have been both difficult and subjective, and budget indirectly covers that aspect of a movie.
Qualitative independent variables – These are factors that are simply measured by a yes or a no.
IMAX/3D – if a movie is offered in either/both formats, there’s usually more hype. Coupled with the fact that there is a premium on IMAX/3D ticket prices, you would expect revenues to increase.
Franchise – movies that are based on previously published material already have a fan base, which is often enormous (Harry Potter). People also like to watch sequels to good originals (Shrek 2/3/4), so you would expect these to generate higher revenues.
Summer – people have more time to watch movies regularly during the summer, and the industry tends to schedule blockbusters meant to earn money and not Oscars during the summer, so revenues should be higher for summer movies.
Coloured leads – this was included purely as a curiosity. People can argue about the moral issues of casting leads of colour, but is there any incentive from a profitability standpoint to cast a certain race? Every movie has a list of “lead” actors (the number of them varies from movie to movie); if any were of colour, I ticked yes.
Some other variables that would probably impact revenues are genre (comedy films generally do well, but rarely top the box office) and film rating (more people can watch PG movies than R). But these were more complicated than “yes” or “no” classifications and I was lazy, so I didn’t bother. =)
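To make the setup concrete, here is a rough sketch of how each movie could be turned into a row of numbers: the two quantitative variables stay as they are, and each yes/no variable becomes a 1 or a 0. The `Movie` class, the field names, and the two placeholder movies are all made up for illustration; they are not rows from the actual spreadsheet.

```python
from dataclasses import dataclass

THRESHOLD_MM = 150  # the post's cutoff for a "High Grossing Movie", in $MM


@dataclass
class Movie:
    # Hypothetical record layout; field names are assumptions, not the real spreadsheet's.
    title: str
    worldwide_gross_mm: float  # the variable being explained
    rt_rating: float           # Rotten Tomatoes rating, 0-100
    budget_mm: float
    imax_3d: bool
    franchise: bool
    summer: bool
    coloured_lead: bool


def to_row(m):
    """Return (predictors, response), with each yes/no encoded as 1/0."""
    predictors = [
        m.rt_rating,
        m.budget_mm,
        int(m.imax_3d),
        int(m.franchise),
        int(m.summer),
        int(m.coloured_lead),
    ]
    return predictors, m.worldwide_gross_mm


# Placeholder movies with invented numbers, purely to exercise the code.
movies = [
    Movie("Placeholder A", 400.0, 75, 150, True, True, True, False),
    Movie("Placeholder B", 90.0, 60, 40, False, False, False, True),
]

# Keep only movies at or above the $150MM cutoff, then encode them.
sample = [m for m in movies if m.worldwide_gross_mm >= THRESHOLD_MM]
rows = [to_row(m) for m in sample]
```

Only "Placeholder A" clears the cutoff here, so `rows` ends up with a single encoded movie.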
Here’s the spreadsheet for the raw data showing the 25 highest grossing movies since 2005:
Just taking a look at this, you can see how abundant IMAX/3D, franchise, and summer movies are in the top 25. Most tend to be high budget movies and the ratings are generally positive, though not overwhelmingly.
Results
Here’s a summary table of some statistics for all the variables:
There are some interesting trends that are quite apparent. The raw number of movies grossing over $150MM has been increasing, with mean and median revenues on an upward trend. Average ratings on high grossing movies seem to be declining. There’s a definite increase in 3D/IMAX movies, and 2010 has started out with some strong franchise movies.
Here are the regression results. It’s pretty intuitive that franchise movies will gross more than originals, but how much more is what actually matters:
The first row shows the single-variable models (each variable regressed against revenue on its own), while the other rows are multivariable models (this is all covered in the first year stats course). The 2006 and 2009 data weren’t very consistent, so I ran another regression excluding those years. In the single-variable regressions, all the variables were actually strongly correlated with revenues, except coloured leads (well, there goes that), so I ran another regression excluding that variable.
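For anyone who wants to see what a regression like this does under the hood, here is a minimal sketch assuming ordinary least squares (the post doesn't say which tool was actually used). The function `fit_ols`, the made-up "true" coefficients, and the randomly generated data are all assumptions for illustration; none of it is the real movie data or the real results.

```python
import random


def fit_ols(X, y):
    """Least-squares fit with an intercept, via the normal equations.

    X is a list of predictor rows, y the list of responses.
    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented normal equations: (A'A | A'y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(k):
            if i != col:
                f = aug[i][col] / aug[col][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Synthetic data only: invented coefficients plus a little noise.
random.seed(0)
# [intercept, rating, budget, imax_3d, franchise, summer, coloured_lead]
true_beta = [50.0, 2.0, 1.5, 80.0, 30.0, 50.0, -5.0]
X, y = [], []
for _ in range(200):
    row = [random.uniform(20, 100), random.uniform(20, 300)]
    row += [random.randint(0, 1) for _ in range(4)]  # the four yes/no variables
    X.append(row)
    noise = random.gauss(0, 1)
    y.append(true_beta[0] + sum(b * v for b, v in zip(true_beta[1:], row)) + noise)

beta = fit_ols(X, y)                         # multivariable model
single = fit_ols([[r[0]] for r in X], y)     # single-variable model: rating only
```

With this little noise, `beta` lands close to the invented `true_beta`. The `single` fit returns just an intercept and one slope, which is the kind of one-variable-at-a-time model the first row of the results table reports.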
The way you read the numbers is this: take, for instance, “2006&2009 out” (which was the best model). On average, and holding the other variables constant, a one-point increase in the Rotten Tomatoes rating should result in the movie grossing an extra $1.8MM, and for every extra $1MM invested in the budget, the movie will gross an additional $1.9MM. On average, a movie shown in IMAX/3D will gross $83.3MM more than one that’s not, a franchise movie will gross $28.3MM more than an original, a summer movie will gross $49.8MM more than one released at other times of the year, and casting a coloured lead will reduce revenues by $5MM.
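As a quick worked example of reading those coefficients, the sketch below compares two movies. The post doesn't give the intercept, so it only computes the predicted difference in gross between them (the intercept cancels out). The coefficients are the ones quoted above for “2006&2009 out”; the function name, the dictionary keys, and the two movies are hypothetical.

```python
# Coefficients quoted in the post for the "2006&2009 out" model, in $MM.
COEFFS = {
    "rating": 1.8,          # per Rotten Tomatoes point
    "budget": 1.9,          # per $1MM of budget
    "imax_3d": 83.3,
    "franchise": 28.3,
    "summer": 49.8,
    "coloured_lead": -5.0,
}


def predicted_gross_difference(movie_a, movie_b):
    """Predicted gross of movie_a minus movie_b, in $MM (intercept cancels)."""
    return sum(c * (movie_a[k] - movie_b[k]) for k, c in COEFFS.items())


# Two hypothetical movies, not taken from the data set.
a = {"rating": 80, "budget": 150, "imax_3d": 1, "franchise": 1, "summer": 1, "coloured_lead": 0}
b = {"rating": 70, "budget": 100, "imax_3d": 0, "franchise": 0, "summer": 1, "coloured_lead": 0}

diff = predicted_gross_difference(a, b)
print(round(diff, 1))  # 224.6
```

The 224.6 breaks down as 1.8 × 10 = 18 from the rating gap, 1.9 × 50 = 95 from the budget gap, 83.3 for IMAX/3D, and 28.3 for being a franchise movie; both are summer releases, so that term contributes nothing.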
So that’s my first pointless research project. Sorry for the long post and congrats if you made it all the way to the bottom. The next one will probably be something to do with hockey or basketball (or maybe even track some random thing that’s going on with the World Cup).




