Answers
This analysis uses statistical tools including descriptive statistics, ANOVA, correlation and regression.
Descriptive statistics
The major findings from the descriptive statistics calculations are:
The mean opening gross is $30,219,993.96, with a standard error of $3,316,653.72.
The mean total gross is $100,348,852.44, with a standard error of $9,582,798.35.
The mean number of theatres is 3,130.05, with a standard error of 61.74.
The mean number of weeks in the top 60 is 15.73, with a standard error of 0.56.
The 95% confidence interval for the mean opening gross runs from $23,719,472.20 to $36,720,515.80, which is the mean plus or minus about 1.96 standard errors.
The 95% confidence interval for the mean total gross runs from $81,566,912.81 to $119,130,792.07.
The 95% confidence interval for the mean number of theatres runs from 3,009.04 to 3,251.06.
The 95% confidence interval for the mean number of weeks in the top 60 runs from 14.64 to 16.82.
Opening gross ranges from a low of $31,610 to a high of $207,438,708. Total gross ranges from $28,835,528 to $623,357,910. The number of theatres ranges from 924 to 4,404, and the number of weeks in the top 60 ranges from 5 to 35.
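The lower and upper bounds reported for each variable agree with the mean plus or minus roughly 1.96 standard errors, which is the usual normal-approximation 95% confidence interval for a mean. The following is a minimal sketch that recomputes those bounds from the reported means and standard errors. The function name `confidence_interval` and the 95% level are assumptions, since the source does not state how the bounds were calculated.

```python
from statistics import NormalDist

# Means and standard errors as reported in the descriptive statistics.
REPORTED = {
    "opening gross": (30_219_993.96, 3_316_653.72),
    "total gross": (100_348_852.44, 9_582_798.35),
    "number of theatres": (3_130.05, 61.74),
    "weeks in top 60": (15.73, 0.56),
}


def confidence_interval(mean, std_error, level=0.95):
    """Return (lower, upper) for a normal-approximation confidence interval.

    The 95% default is an assumption; the source does not name the level.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    margin = z * std_error
    return mean - margin, mean + margin


for name, (mean, std_error) in REPORTED.items():
    lower, upper = confidence_interval(mean, std_error)
    print(f"{name}: {lower:,.2f} to {upper:,.2f}")
```

Small differences in the last decimal place are expected, because the reported means and standard errors are themselves rounded.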
The findings can be interpreted as follows:
The movies vary widely in how successful they were. Total gross is a good example.
The lowest-grossing movie earned roughly half the median, whereas the highest-grossing movie earned almost ten times the median.
Many of the movies generated a good amount of income, while a few stood out as blockbusters.
Only a few movies sit at the upper end of the distribution; most cluster towards the bottom, so the distribution is skewed to the right.
Outliers
For each variable, cutoff points were calculated at two standard deviations above and below the mean. A value outside this range was treated as an outlier. High performers are marked in green and low performers in red.
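The two-standard-deviation rule described above can be sketched in Python as follows. This is an illustration only: the function name `two_sd_outliers` and the sample figures are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which formula was used.

```python
from statistics import mean, stdev


def two_sd_outliers(values):
    """Return the cutoffs and the values falling outside mean +/- 2 SD.

    Uses the sample standard deviation (an assumption; the source does
    not say whether the sample or population formula was applied).
    """
    centre = mean(values)
    spread = stdev(values)
    lower_cutoff = centre - 2 * spread
    upper_cutoff = centre + 2 * spread
    low = [v for v in values if v < lower_cutoff]   # low performers (red)
    high = [v for v in values if v > upper_cutoff]  # high performers (green)
    return lower_cutoff, upper_cutoff, low, high


# Hypothetical opening-gross figures in millions of dollars, not the real data.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]
lower_cutoff, upper_cutoff, low, high = two_sd_outliers(sample)
print(f"cutoffs: {lower_cutoff:.2f} to {upper_cutoff:.2f}")
print("low outliers:", low)
print("high outliers:", high)
```

A single extreme value inflates both the mean and the standard deviation, so this rule tends to flag only the most extreme observations.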
For opening gross, the analysis reveals four outliers on the high side. Each of these movies had an opening gross above the upper cutoff of two standard deviations above the mean. The four outliers are:
Marvel’s The Avengers
The Dark Knight Rises
The Hunger Games
Skyfall
If total gross is considered instead of opening gross, the same four movies are outliers, along with two more. For the number of theatres, the outliers lie on the low side: three movies fell below the lower cutoff point. For the number of weeks, five movies are outliers in the green zone, since they exceeded the upper cutoff value.
Relationships
The relationship between variables may be one of correlation or one of cause and effect. The output gives the following summary:
The regression analysis treats total gross sales as the dependent variable and the remaining variables as independent variables, which establishes the relationship between them. The resulting regression equation is:
Total gross sales = 37,292,147.85 + 2.53X + 3,565,527.013Y
where
X = opening gross sales
Y = number of weeks
The number of theatres is not statistically significant, so it does not appear in the equation. The equation means that, with the number of weeks held constant, each additional dollar of opening gross sales is associated with an increase of about $2.53 in total gross sales. Likewise, with opening gross sales held constant, each additional week is associated with an increase of about $3.57 million in total gross sales.
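To make the reading of the coefficients concrete, the regression equation can be evaluated directly. The sketch below uses the coefficients exactly as they appear in the equation. The function name `predict_total_gross` and the example movie (a $20 million opening and 10 weeks) are hypothetical and are not taken from the data.

```python
# Coefficients as reported in the regression equation.
INTERCEPT = 37_292_147.85
COEF_OPENING = 2.53         # change in total gross per extra dollar of opening gross
COEF_WEEKS = 3_565_527.013  # change in total gross per extra week


def predict_total_gross(opening_gross, weeks):
    """Evaluate the reported regression equation for one movie."""
    return INTERCEPT + COEF_OPENING * opening_gross + COEF_WEEKS * weeks


# Hypothetical movie: $20 million opening, 10 weeks (illustrative values only).
base = predict_total_gross(20_000_000, 10)
one_more_dollar = predict_total_gross(20_000_001, 10)
one_more_week = predict_total_gross(20_000_000, 11)

print(f"predicted total gross: ${base:,.2f}")
print(f"effect of one extra opening dollar: ${one_more_dollar - base:,.2f}")
print(f"effect of one extra week: ${one_more_week - base:,.2f}")
```

Each difference isolates one coefficient because the other variable is held fixed, which is what the interpretation of a multiple regression coefficient assumes.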
Correlation Calculation:
Finding: The analysis suggests that all three variables are positively correlated with total gross sales. Other observations may be summarized as follows:
The highest correlation is between total gross sales and opening sales. This is not much of a surprise, since opening sales are a good predictor of overall sales and are themselves included in the total.
The second highest correlation is between total gross sales and the number of theatres. A likely reason is that popular movies attract more theatres to carry them, and the more theatres that carry a movie, the greater its opportunity to earn revenue.
The third, moderate correlation is between total gross sales and the number of weeks in the top 60. A longer-running movie has more chances to earn revenue, and the longer a movie keeps making money, the longer theatres will want to keep showing it.
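For reference, a correlation of this kind between two variables can be calculated as the Pearson correlation coefficient. The sketch below is a minimal implementation under that assumption. The source does not say which correlation measure was used, and both the function name `pearson_r` and the sample figures are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    squares_x = sum((x - mean_x) ** 2 for x in xs)
    squares_y = sum((y - mean_y) ** 2 for y in ys)
    if squares_x == 0 or squares_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(squares_x * squares_y)


# Hypothetical figures in millions of dollars, not the real data.
opening = [5, 12, 20, 35, 60]
total = [30, 45, 70, 110, 200]
print(f"r = {pearson_r(opening, total):.3f}")
```

A value close to +1 indicates a strong positive linear relationship, a value close to 0 indicates little linear relationship, and a value close to -1 indicates a strong negative one.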