- A bookmaker owns a chain of betting shops where customers bet on horse races. Sample data were collected from 150 shops; in the sample, 50 days are Saturdays and 150 days are weekend days. Expenditure per customer was plotted against the number of races and against the square root of the number of races. The scatter plot of expenditure per customer against the square root of races follows a straight line more closely than the plot against the number of races, so it is the better candidate for a linear regression. The slope of a line measures how many units y goes up or down for every unit of x, in other words the change in the y value divided by the change in the x value. For paired data (x, y) we denote the standard deviation of the x data as Sx and the standard deviation of the y data as Sy, and the slope of the least-squares regression line is r(Sy/Sx), where r is the correlation coefficient. The closer r is to +1 or -1, the more closely the points follow a straight line; an r near zero indicates little or no linear relationship. A standard deviation is the positive square root of the variance, so Sx and Sy are non-negative and the sign of the slope is the sign of r. Taking the square root of the number of races compresses the larger values, which can straighten a relationship that levels off as the number of races grows and so move r further from zero. In this analysis the scatter plot of expenditure per customer vs. the square root of races therefore gave the best-fitting straight line.
Scatter Plot
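The slope formula r(Sy/Sx) and the comparison between the two scatter plots can be checked with a short calculation. The sketch below is only an illustration: the `races` and `spend` values are made up (they are not the shop data), and the helper names `pearson_r`, `slope_from_r` and `least_squares_slope` are hypothetical. It assumes expenditure grows with the square root of races, which is the situation described above.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def slope_from_r(xs, ys):
    """Regression slope written as r * (Sy / Sx)."""
    return pearson_r(xs, ys) * statistics.stdev(ys) / statistics.stdev(xs)


def least_squares_slope(xs, ys):
    """Regression slope from the usual least-squares formula."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Made-up illustration data (NOT the shop data): spend is exactly
# linear in the square root of the number of races.
races = [4, 9, 16, 25, 36, 49]
spend = [3.0 + 2.0 * math.sqrt(n) for n in races]
sqrt_races = [math.sqrt(n) for n in races]

r_raw = pearson_r(races, spend)        # plot against number of races
r_sqrt = pearson_r(sqrt_races, spend)  # plot against square root of races

print("r against races:      ", round(r_raw, 4))
print("r against sqrt(races):", round(r_sqrt, 4))
print("slope r*(Sy/Sx):      ", round(slope_from_r(sqrt_races, spend), 4))
```

With data of this shape the correlation is higher for the square-root plot, and the two slope functions agree, which is the sense in which r(Sy/Sx) is the slope of the regression line.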
- The fifteen values from the two models show a significant relationship between the number of customers and the independent variable, because all fifteen values have the same expenditure per customer and the same independent variable.
- Model 2 gives the better analysis because its standard error is smaller: 1.5226 for model 2 against 1.6354 for model 1. The points in model 2 are less scattered than in model 1 and lie within a narrower range. Model 2 also shows smaller negative deviations from the fitted line.
- The expenditure per customer for 10, 15, and 20 races is found to reach €2800. The prediction of this value is expected to be more accurate. When 10 to 15 races were available the expenditure per customer was €2500, and when 20 races were available it increased to €2800.
- Of the four plots, the plot of the unstandardized residuals against the number of staff provides the most marked evidence, and the plot of the unstandardized residuals against the number of customers provides the least, because the points on that plot are more scattered. The plot of the unstandardized residuals against the number of customers can be improved by reducing the number of unstandardized coefficients. Taking the square root of the number of customers, to avoid negative or zero correlation coefficients, may also improve the plot.
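The standard errors compared for model 1 and model 2, and the unstandardized residuals used in the plots, can be computed as in the sketch below. It is a minimal illustration under stated assumptions: the data are made up (not the shop data), the helper names `fit_line`, `residuals` and `standard_error` are hypothetical, and "standard error" is taken to mean the standard error of the estimate, sqrt(SSE / (n - 2)), for a regression with one predictor.

```python
import math
import statistics


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(xs, ys):
    """Unstandardized residuals: observed y minus fitted y."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def standard_error(xs, ys):
    """Standard error of the estimate, sqrt(SSE / (n - 2))."""
    res = residuals(xs, ys)
    return math.sqrt(sum(e * e for e in res) / (len(res) - 2))


# Made-up illustration data (NOT the shop data).
races = [4, 9, 16, 25, 36, 49]
spend = [7.2, 8.9, 11.1, 12.8, 15.1, 16.9]
sqrt_races = [math.sqrt(n) for n in races]

se_model_1 = standard_error(races, spend)       # predictor: races
se_model_2 = standard_error(sqrt_races, spend)  # predictor: sqrt(races)

print("standard error, model 1:", round(se_model_1, 4))
print("standard error, model 2:", round(se_model_2, 4))
print("residuals, model 2:", [round(e, 3) for e in residuals(sqrt_races, spend)])
```

The model with the smaller standard error has points lying closer to its fitted line. Plotting the residual lists against another variable (for example staff or customers) gives the kind of residual plot discussed above; a plot with no visible pattern suggests that variable adds little to the model.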