- A 95% confidence interval for the mean income of shop assistants in a certain city is found to be (£12,000, £15,000). Explain briefly what this means. Would a 99% confidence interval be better than a 95% one? Justify your answer.
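ANS – It means that, if samples were drawn repeatedly and an interval were calculated from each in the same way, about 95% of those intervals would contain the true mean income of all shop assistants in the city. We are therefore 95% confident that the population mean lies between £12,000 and £15,000. It is a statement about the mean, not about individual incomes: it does not say that 95% of shop assistants earn between £12,000 and £15,000. A 99% confidence interval is not automatically better. Calculated from the same sample, it would be wider than (£12,000, £15,000), so it is more likely to contain the true mean but gives a less precise estimate. Which level is preferable depends on whether the analysis needs greater confidence or greater precision.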
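The trade-off between confidence and precision can be illustrated numerically. The sketch below assumes the quoted interval is a normal-based (z) interval, which the question does not state; the function name `rescale_interval` is introduced here purely for illustration.

```python
from statistics import NormalDist


def rescale_interval(lower, upper, old_level, new_level):
    """Re-express a normal-based confidence interval at another level.

    Assumption: the interval has the form centre +/- z * standard error.
    """
    centre = (lower + upper) / 2
    half_width = (upper - lower) / 2
    # Two-sided critical values for the old and new confidence levels.
    z_old = NormalDist().inv_cdf(0.5 + old_level / 2)
    z_new = NormalDist().inv_cdf(0.5 + new_level / 2)
    new_half_width = half_width * z_new / z_old
    return centre - new_half_width, centre + new_half_width


low99, high99 = rescale_interval(12_000, 15_000, 0.95, 0.99)
print(f"99% interval: ({low99:.0f}, {high99:.0f})")
```

Under that assumption the 99% interval comes out wider than the 95% one on both sides, which is the loss of precision described in the answer.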
- A charity believes that when it puts out an appeal for charitable donations the donations it receives will be normally distributed with a mean of £50 and a standard deviation of £6.
- Find the probability that the first donation it receives will be less than £40.
- Find the value x such that 5% of donations are more than £x.
a) 1. Find the z-score for 40: z = (40 - 50)/6 = -1.67 (to two decimal places).
2. Find P(Z < -1.67) = 0.0475.
b) 1. Find y such that P(Z < y) = 0.95. The value of y is 1.645.
2. Find the z-value for x: z = (x - 50)/6.
3. Solve the equation (x - 50)/6 = 1.645:
x = 50 + 1.645 × 6 = 59.87, so 5% of donations are more than £59.87.
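Both parts can be checked without tables. This is a minimal sketch using `NormalDist` from the Python standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Donations are modelled as normal with mean 50 and standard deviation 6.
donations = NormalDist(mu=50, sigma=6)

# Part (a): probability that a donation is below 40.
p_below_40 = donations.cdf(40)

# Part (b): the value exceeded by only 5% of donations (the 95th percentile).
x_top_5 = donations.inv_cdf(0.95)

print(f"P(donation < 40) = {p_below_40:.4f}")
print(f"x with 5% above  = {x_top_5:.2f}")
```

The exact probability differs slightly from the table value because tables round the z-score to two decimal places.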
- A consultant for Apple was investigating computer usage among students at a particular university. 200 undergraduates and 100 postgraduates were chosen at random and asked if they owned a laptop. It was found that 81 of the undergraduates and 63 of the postgraduates owned a laptop. The consultant calculated that 48% (144 out of 300) of the students interviewed owned a laptop. Explain, with reasons, whether the figure of 48% will be a good estimate of the proportion of all students who own a laptop.
No. 48% will not be a good estimate of the proportion of all the students who own a laptop, for the following reasons:
- The ownership rates in the two groups differ markedly: 40.5% of the undergraduates sampled (81 out of 200) against 63% of the postgraduates (63 out of 100).
- The two groups are therefore different populations as far as laptop ownership is concerned, and pooling them is valid only if each group is represented in the sample in the same proportion as in the university.
- The sample contains two undergraduates for every postgraduate. Unless the university has the same 2:1 ratio, the pooled figure gives one group too much weight. Each group's rate should instead be weighted by its share of the student population.
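The effect of the group weights can be sketched as follows. The population shares used here (80% undergraduates, 20% postgraduates) are hypothetical, since the question does not give the university's actual make-up.

```python
def pooled_proportion(owners, sizes):
    """Proportion obtained by lumping all sampled students together."""
    return sum(owners) / sum(sizes)


def weighted_proportion(owners, sizes, population_shares):
    """Weight each group's sample rate by its share of the population."""
    rates = [o / n for o, n in zip(owners, sizes)]
    return sum(share * rate for share, rate in zip(population_shares, rates))


owners = [81, 63]    # undergraduates, postgraduates who own a laptop
sizes = [200, 100]   # students sampled in each group

pooled = pooled_proportion(owners, sizes)

# Hypothetical shares: NOT given in the question, chosen only to illustrate.
hypothetical_shares = [0.8, 0.2]
weighted = weighted_proportion(owners, sizes, hypothetical_shares)

print(f"pooled = {pooled:.3f}, weighted = {weighted:.3f}")
```

The pooled figure agrees with the weighted one only when the population shares match the 2:1 sampling ratio.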
- For a certain variable, the standard deviation in a large population is equal to 12.5.
How big a sample is needed to be 95% sure that the sample mean is within 1.5 units of the population mean?
n = (z × standard deviation / error margin)²
= (1.96 × 12.5/1.5)²
= 266.78, so a sample of at least 267 is needed.
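The same formula can be wrapped in a small helper that rounds up to a whole number of observations. The function name `required_sample_size` is illustrative, and the sketch assumes a known population standard deviation, as in the question.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return math.ceil((z * sigma / margin) ** 2)


n = required_sample_size(sigma=12.5, margin=1.5)
print(n)
```

Rounding up rather than to the nearest integer keeps the margin of error within the stated 1.5 units.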
- ANS
Any result obtained by statistical methods might have arisen by chance alone. The significance level of a test is the probability of rejecting the null hypothesis when it is in fact true, that is, of declaring an effect when the result is only a statistical accident. A result is called significant at a given level when, if the null hypothesis were true, a result at least as extreme would occur with probability no greater than that level.
- What conclusions would you draw from a test which is significant at the 1% level?
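Significance at the 1% level means that, if the null hypothesis were true, a result at least as extreme as the one observed would occur less than 1% of the time. We can therefore reject the null hypothesis with strong evidence. This does not prove that the result is correct: there remains up to a 1% risk of rejecting a null hypothesis that is actually true.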
- What conclusions would you draw from a test which is significant at the 10% level, but not the 5% level?
A test that is significant at the 10% level but not at the 5% level has a p-value between 0.05 and 0.10. If the null hypothesis were true, a result this extreme would arise by chance in between 1 in 20 and 1 in 10 samples.
Thus the evidence against the null hypothesis is only weak. We would reject the null hypothesis if we were prepared to accept a 10% risk of a false rejection, but not at the conventional 5% level, so the result is suggestive rather than conclusive.
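The relationship between a p-value and the conventional significance levels can be sketched in a few lines. The p-values used below are hypothetical and serve only to illustrate the two cases discussed.

```python
def significant_levels(p_value, levels=(0.10, 0.05, 0.01)):
    """Return the conventional levels at which the result is significant."""
    return [level for level in levels if p_value <= level]


# Hypothetical p-values, chosen only to illustrate the two questions.
weak_evidence = significant_levels(0.07)     # significant at 10% only
strong_evidence = significant_levels(0.004)  # significant at 10%, 5% and 1%

print(weak_evidence, strong_evidence)
```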
- An accounting firm wishes to test the claim that no more than 5% of a large number of transactions contains errors. In order to test this claim, they examine a random sample of 225 transactions and find that exactly 20 of these are in error. What conclusion should the firm draw? Use a 5% significance level.
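Here the hypotheses are H0: p ≤ 0.05 (the claim) against H1: p > 0.05, where p is the proportion of transactions in error. The sample proportion is 20/225 = 0.0889 (8.89%). Under H0 the standard error is √(0.05 × 0.95/225) = 0.0145, so the test statistic is z = (0.0889 - 0.05)/0.0145 = 2.68. This exceeds the one-tailed 5% critical value of 1.645, so H0 is rejected: the firm should conclude that more than 5% of the transactions contain errors.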
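One way to check the claim formally is a one-sided z-test for a proportion, using the normal approximation to the binomial. The sketch below is an illustration under that assumption; the function name is not from the question.

```python
import math
from statistics import NormalDist


def one_sided_proportion_test(errors, n, p0):
    """z-test of H0: p <= p0 against H1: p > p0 (normal approximation)."""
    p_hat = errors / n
    # Standard error is computed under the null hypothesis value p0.
    std_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


z_stat, p_value = one_sided_proportion_test(errors=20, n=225, p0=0.05)
reject_h0 = p_value < 0.05  # 5% significance level

print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The approximation is reasonable here because both n × p0 = 11.25 and n × (1 - p0) = 213.75 exceed 5.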
- A profit-maximising retailer can obtain cameras from the manufacturer at a cost of £50 per camera. The retailer has been selling the cameras at a price of £80, and at this price consumers have been buying 40 cameras per month. The retailer is planning to lower the price to stimulate sales and knows that for each £5 reduction in the price, 10 more cameras will be sold each month. Assuming price is a multiple of £5, what price should the retailer charge and what will the monthly profits be?
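At a price of £80 (40 cameras):
cost per camera = £50
purchase cost of 40 cameras = 40*50 = £2000
income on sale of 40 cameras @ £80 = 80*40 = £3200
Net Profit = £3200-£2000 = £1200
At a price of £75 (50 cameras):
cost per camera = £50
purchase cost of 50 cameras = 50*50 = £2500
income on sale of 50 cameras @ £75 = 75*50 = £3750
Net Profit = £3750-£2500 = £1250
At a price of £70 (60 cameras):
cost per camera = £50
purchase cost of 60 cameras = 60*50 = £3000
income on sale of 60 cameras @ £70 = 70*60 = £4200
Net Profit = £4200-£3000 = £1200
Each further £5 reduction lowers the profit again (for example, at £65 the profit is 15*70 = £1050), so the retailer should charge £75, giving a monthly profit of £1250.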
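The comparison can be extended to every permitted price with a short search. This is a sketch under the question's assumptions (cost of £50, 40 sales at £80, 10 extra sales per £5 reduction); prices below cost are not considered.

```python
COST = 50  # purchase cost per camera, in pounds


def monthly_profit(price):
    """Monthly profit at a price that is a multiple of 5 between 50 and 80."""
    reductions = (80 - price) // 5   # number of 5-pound price cuts from 80
    quantity = 40 + 10 * reductions  # each cut sells 10 more cameras
    return (price - COST) * quantity


candidate_prices = range(50, 85, 5)  # 50, 55, ..., 80
best_price = max(candidate_prices, key=monthly_profit)
best_profit = monthly_profit(best_price)

print(best_price, best_profit)
```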
- Explain briefly the purpose of:
- hypothesis testing
- sampling.
- hypothesis testing – We use hypothesis tests to check whether a claim made about a population is supported by the data (for example, a claim that 48% of students in a university own laptops). To test a statistical hypothesis, we take a sample, collect the data, calculate a sample statistic, standardize it into a test statistic so that the result can be read on a single standard scale, and decide whether the test statistic is consistent with the claim.
- sampling – Sampling provides statistical information, quantitative or qualitative, about a population without examining every unit. It works on the principle of drawing conclusions about the whole population from the statistics of a few selected units. It is a scientific method of selecting the sampling units so that they give the most accurate estimates, together with the range of uncertainty that arises from examining only a part and not the whole.
- The prospective operator of a shoe store has the opportunity to locate in an established and successful shopping centre. Alternatively, at lower cost, he can locate in a new centre, whose development has recently been completed. If the new centre turns out to be very successful, it is expected that annual store profits from
profits would be £60,000. If the new centre is unsuccessful, an annual loss of £10,000 would be expected. The profits to be expected from location in the established centre will also depend to some extent on the degree of success of the new centre, as potential customers may be drawn to it. If the new centre was unsuccessful, annual profits for the shoe store located in the established centre would be expected to be £90,000. However, if the new centre was moderately successful, the expected profits would be £70,000, while they would be only £30,000 if the new centre turned out to be very successful. All profits are inclusive of location cost. The probability that the new shopping centre will be very successful is 0.4 and the probability it will be moderately successful is also 0.4.
- Draw the decision tree for this problem.
- According to the expected monetary value criterion, where should the shoe store be located? Assume a risk-neutral decision-maker.
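The probability that the new centre is unsuccessful is 1 - 0.4 - 0.4 = 0.2, so EMV(established centre) = 0.4 × £30,000 + 0.4 × £70,000 + 0.2 × £90,000 = £58,000. The EMV of the new centre is higher, so the store should be located at the new centre.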
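The expected monetary value comparison can be sketched as below. The profit for the new centre when it is very successful is not legible in the question as reproduced here, so it is left as a parameter rather than filled in. Reading the £60,000 figure as the moderately successful case is also an assumption.

```python
P_VERY = 0.4
P_MODERATE = 0.4
P_UNSUCCESSFUL = 1 - P_VERY - P_MODERATE  # 0.2, by complement


def emv(payoffs):
    """Expected monetary value for payoffs ordered (very, moderate, unsuccessful)."""
    probabilities = (P_VERY, P_MODERATE, P_UNSUCCESSFUL)
    return sum(p * x for p, x in zip(probabilities, payoffs))


emv_established = emv((30_000, 70_000, 90_000))


def emv_new(very_successful_profit):
    """EMV of the new centre; 60,000 for the moderate case is an assumption."""
    return emv((very_successful_profit, 60_000, -10_000))


# Very-successful profit at which the two locations have equal EMV.
break_even = (emv_established - emv_new(0)) / P_VERY

print(round(emv_established), round(break_even))
```

On these figures a risk-neutral decision-maker prefers the new centre whenever its very-successful profit exceeds the break-even value.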
- Explain briefly how a perfect forecast of shopping centre success changes the order of the decision tree in ‘(a)'.
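With a perfect forecast the order of the tree is reversed: the chance node (how successful the new centre will be) comes first, and the location decision is made afterwards for each forecast outcome.
- if the new centre is forecast to be very successful, the established centre earns only £30,000 and the new centre is the clear winner
- if the new centre is forecast to be moderately successful, the difference is smaller (£70,000 in the established centre against £60,000 in the new one), but the established centre is better.
- if the new centre is forecast to be unsuccessful, the established centre earns £90,000 against a loss of £10,000 in the new centre.
So locating in the new centre is the right decision only if it will be very successful; if it will be moderately successful or unsuccessful, the established centre is the better choice.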
- The vice president of purchasing for a large national retailer has asked you to prepare an analysis of retail sales by state. Data are available for the following variables:
- Y (retsal) = Per capita retail sales in £
- X1 (perinc) = Per capita personal income in £
- X2 (unempl) = Unemployment rate in %
- X3 (totpop) = State population in 000s
Excel regression output of a potential model is:
SUMMARY OUTPUT
- Comment on the effects of unemployment and per capita personal income.
ANS – In this model the variables "unempl" and "perinc" have opposite effects on retail sales. The coefficient of unempl is negative, so a rise in unemployment lowers per capita retail sales, and its small p-value indicates that this effect is statistically significant.
By contrast, perinc has a high p-value and small values in the lower and upper 95% columns, which indicates that a change in perinc alone does not produce a very significant change in predicted sales.
The model does not give a very clear picture of the scenario, as the size of the sample is not taken into account.
- You think the prediction equation can be improved by adding state population as an additional explanatory variable. You obtained the following output:
ANS – Adding this explanatory variable improves the model, as the results show that the standard error has decreased. The negative effect of the unempl variable is partly offset by the addition of the totpop variable, and a rise or fall in per capita income now produces a comparable variation in the results. The p-value of unempl goes up, and the upper 95% bound of its coefficient moves to the positive side.
- Is this model better? Why/why not?
Yes, this model is better, as it spreads the values over a larger number of samples within the same range. The model is more exhaustive, and the significance level also falls from 5% to 1.1%.
- For this model, write out an expression for sales.
Y = 0.27x1 - 71.3x2 - 0.000024x3
- For this model, calculate a 95% confidence interval for predicted sales, if unemployment is 8.1%, per capita income is £15,000 and the state's population is 6 million. Use a z-value of 1.96.
ANS – 3329
95% confidence interval = 1699 to 3329
- Write down two additional explanatory variables which you think could help to explain sales. Give a brief justification for each.
- exposure to media – it should increase sales significantly.
- literacy rate – it should raise retail sales in some groups and lower them in others.
- Time series are usually considered to have a combination of four components. What are these components? For each of them, give one example of data for which you would expect that component to be present.
- TREND – e.g. newspaper sales
- SEASONAL – e.g. temperature variation
- CYCLIC – e.g. insurance sales
- IRREGULAR (random) – e.g. sales disrupted by a one-off event such as a strike
- The following table gives average UK household electricity demand in kilowatt hours (kWh) over the last five years. Quarter 1 represents Spring.
ANS -
- demand increases steadily over the period
- the largest increase is in Q2 and the smallest increase in demand is in Q4.
- Show that the 4-point centred moving average for Quarter 3 in 2007 is 5.025.
ANS -
4.4+4.6+4.8+5.1 = 18.7
avg = 4.67
next group average = 4.925
4.925-4.7 = 0.225
avg = 0.1
avg for 2007 = 4.925+0.1 = 5.025
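For reference, a 4-point centred moving average is the mean of the two consecutive 4-quarter averages that straddle the quarter in question. The sketch below shows that calculation. The demand figures in it are hypothetical, because the question's data table is not reproduced here, so it illustrates the method rather than the 5.025 result.

```python
def centred_moving_average(values, index):
    """4-point centred moving average at 0-based position `index`.

    Averages the 4-quarter mean ending one quarter after `index` with the
    4-quarter mean ending two quarters after it.
    """
    first = sum(values[index - 2 : index + 2]) / 4
    second = sum(values[index - 1 : index + 3]) / 4
    return (first + second) / 2


# Hypothetical quarterly demand (kWh); NOT the figures from the question.
demand = [4.0, 4.2, 4.4, 4.6, 4.8, 5.0, 5.2]

cma = centred_moving_average(demand, 3)
ratio_to_moving_average = demand[3] / cma

print(f"CMA = {cma:.3f}, R2MA = {ratio_to_moving_average:.3f}")
```

The ratio-to-moving-average for a quarter is then the actual value divided by its centred moving average.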
- Calculate the ratio-to-moving-average (R2MA) for Quarter 3 in 2007.
0.995
- Compute the four seasonal indices using the following table of R2MA values. Replace `?' with your answer to part `iii.'.
seasonal average for autumn = 0.969127
over time average = (0.969127+1.079403+1.012759+0.954331)/4
= 1.003905
seasonal indices (seasonal average*100/over time average)–
- autumn – 96.53572
- winter – 107.52043
- spring – 100.8819
- summer – 95.06188
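The normalisation step above can be reproduced directly from the four seasonal averages. This is a minimal sketch; the dictionary and variable names are illustrative.

```python
# Seasonal averages of the ratio-to-moving-average values, as listed above.
seasonal_averages = {
    "autumn": 0.969127,
    "winter": 1.079403,
    "spring": 1.012759,
    "summer": 0.954331,
}

# Mean of the four seasonal averages.
overall_average = sum(seasonal_averages.values()) / len(seasonal_averages)

# Rescale so that the four indices average to 100.
seasonal_indices = {
    season: 100 * average / overall_average
    for season, average in seasonal_averages.items()
}

for season, index in seasonal_indices.items():
    print(f"{season}: {index:.4f}")
```

Dividing by the overall average forces the four indices to sum to 400, so seasonal effects cancel out over a full year.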
- The estimated trend line is found to be:
where x is the Quarter number (Q1 of 2005 corresponds to x = 1). Provide a forecast, to three decimal places, for average UK household electricity demand for the summer of 2015. Do you have any comment to make about this forecast?
The summer for 2015 falls in the 44th quarter
Thus, y = 6.661 (demand for summer of 2015)
The trend line does not fit every quarter exactly; it is a smoothed line that gives the expected level of demand. The forecast also extrapolates well beyond the five years of data, so it assumes that the trend continues unchanged and should be treated with caution.