Purpose:
As a finance enthusiast, I chose the recent closing prices of the S&P 500 index for the period September 8th to September 19th. I selected this data in order to examine the financial figures using statistical techniques ranging from the histogram to a normal distribution study. This project will also deepen my understanding of various statistical techniques.
Data:
The closing prices of the S&P 500 for these dates are sourced from Yahoo Finance. Note that we use only the adjusted closing prices, which incorporate the effect of dividends and stock splits. Below is a snapshot of the data to be used:
Frequency Distribution and Histogram:
*Refer to the spreadsheet for the bin and upper limit data
Median:
To calculate the median, we first arrange the data in ascending order, as presented below:
Since the number of observations is even (10), the median is the average of the 5th and 6th observations:
Median= (1997.45+1998.98)/2= 1998.21
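The median rule described above can be sketched in Python. This is a minimal illustration rather than the spreadsheet calculation itself; the function name `median` and the variable `middle_pair` are introduced here for the example. Because only the 5th and 6th ordered observations are quoted in the text, the demonstration uses just that pair, which is itself an even-sized data set.

```python
def median(values):
    """Return the middle value for odd n, or the mean of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even number of observations: average the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Only the 5th and 6th ordered observations are quoted in the text,
# so the demonstration is limited to that middle pair.
middle_pair = [1997.45, 1998.98]
print(f"Median = {median(middle_pair):.3f}")
```

Passing the full list of ten closing prices from the spreadsheet to `median` would give the same result, since sorting and picking the 5th and 6th values is exactly what the even-n branch does.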
Mean:
The mean (average) of the given data is calculated using the following formula:
Mean= Sum of observations/ Number of observations
Mean= 19975.1/10= 1997.51
Range:
The range of the given data is the difference between the maximum value (2011.36) and the minimum value (1984.13):
Range= Maximum value in the data set- Minimum value in the data set
Range= 2011.36-1984.13
Range= 27.23
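The mean and range arithmetic above can be checked with a few lines of Python. This sketch uses only the figures quoted in the text (the sum, the count, the maximum, and the minimum) and does not recompute them from the raw prices; the variable names are chosen for this example.

```python
# Figures quoted in the text above (not recomputed from the raw prices).
total_of_closes = 19975.1   # sum of the 10 adjusted closing prices
n_observations = 10
highest_close = 2011.36
lowest_close = 1984.13

# Mean = sum of observations / number of observations
mean_close = total_of_closes / n_observations

# Range = maximum value - minimum value
price_range = highest_close - lowest_close

print(f"Mean  = {mean_close:.2f}")
print(f"Range = {price_range:.2f}")
```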
Sample Variance and Standard Deviation:
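The sample variance and standard deviation can be computed along the following lines. This is a sketch under the assumption that the reported standard deviation is the sample version, which divides by n - 1 rather than n, as the heading suggests. The function names and the toy data are introduced for illustration only; the toy numbers are not the S&P 500 closing prices.

```python
import math


def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Toy numbers for illustration only; these are NOT the S&P 500 closes.
toy_data = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"s^2 = {sample_variance(toy_data):.3f}, s = {sample_std_dev(toy_data):.3f}")
```

The same two functions could be applied to the ten adjusted closing prices from the spreadsheet in place of `toy_data`.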
Data within one standard deviation of the mean fall in the following interval:
i)Mean= 1997.51
ii) Standard Deviation= 9.432
iii) (x-s, x+s) = (1997.51-9.43, 1997.51+9.43) = (1988.08, 2006.94)
iv) In the interval 1988.08-2006.94, there are 8 observations, so 8/10 = 80% of the observations fall within 1 standard deviation
Data within two standard deviations of the mean fall in the following interval:
i)Mean= 1997.51
ii) Standard Deviation= 9.432
iii) (x-2s, x+2s) = (1997.51-2*9.43, 1997.51+2*9.43) = (1978.65, 2016.37)
iv) In the interval 1978.65-2016.37, there are 10 observations, so 10/10 = 100% of the observations fall within 2 standard deviations
Data within three standard deviations of the mean fall in the following interval:
i)Mean= 1997.51
ii) Standard Deviation= 9.432
iii) (x-3s, x+3s) = (1997.51-3*9.43, 1997.51+3*9.43) = (1969.22, 2025.80)
iv) In the interval 1969.22-2025.80, there are 10 observations, so 10/10 = 100% of the observations also fall within 3 standard deviations
So, 80% of the closing prices fall within one standard deviation of the mean, 100% fall within two standard deviations, and 100% fall within three standard deviations. For a bell-shaped distribution, the respective percentages are approximately 68%, 95%, and 99.7%. For the given S&P 500 closing prices, the percentages do not closely match these values, and the mean (1997.51) is not equal to the median (1998.21), so the data is not normally distributed, although it is somewhat close to it.
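The interval checks above follow one pattern, which can be sketched in Python. The helper names `k_sigma_interval` and `share_within` are introduced for this example, and the standard deviation is rounded to 9.43 to match the arithmetic shown in the steps. The raw prices are not reproduced here, so `share_within` is defined but only the interval bounds are printed.

```python
def k_sigma_interval(mean, std_dev, k):
    """Return the interval (mean - k*s, mean + k*s)."""
    return (mean - k * std_dev, mean + k * std_dev)


def share_within(values, low, high):
    """Fraction of observations lying in the closed interval [low, high]."""
    inside = [x for x in values if low <= x <= high]
    return len(inside) / len(values)


# Summary figures quoted in the text; s is rounded to 9.43 as in the steps above.
mean_close = 1997.51
std_dev = 9.43

for k in (1, 2, 3):
    low, high = k_sigma_interval(mean_close, std_dev, k)
    print(f"Within {k} standard deviation(s): ({low:.2f}, {high:.2f})")
```

With the ten closing prices from the spreadsheet in a list, `share_within(prices, low, high)` would return the fraction to compare against the 68%, 95%, and 99.7% benchmarks for a bell-shaped distribution.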
Conclusion:
On observing the statistical behavior of the S&P 500 closing prices, we find that the range of prices for the given dates was 27.23 points. The mean price was 1997.51 and the median price was 1998.21, with a standard deviation of 9.43.
In addition, the distribution showed that 80% of the data falls within 1 standard deviation and 100% of the data falls within 2 and 3 standard deviations. Hence, the results do not match the percentages expected for a normal distribution. However, since our data consisted of only 10 observations, I believe including more data would have given a distribution closer to the normal distribution.