Q1. (a) What is the difference between descriptive statistics and inferential statistics?
Descriptive statistics refers to statistical procedures used to organize, summarize, and describe the data collected in a study. The data may come from a whole population or from a sample of it; either way, the description is limited to the observations actually obtained.
Inferential statistics is the process of drawing inferences or making predictions about a population from observations and analysis of a sample. The sample results are then generalized to the whole population from which the sample was obtained.
(b) Provide an example for each.
Descriptive statistics: reporting the average productivity of each age group in a study of how age affects productivity
Inferential statistics: testing the effectiveness of a new drug on a sample of patients before its release to the market
Q2. For the following scores: 3, 7, 6, 5, 5, 9, 6, 4, 6, 8, 10, 2, 7, 4, 9, 5, 6, 3, 8.
- Develop a frequency distribution with the following columns: X, F, fx, cf, c%
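The frequency distribution requested above can be tabulated with a short script. This is a minimal sketch, assuming cf and c% are accumulated from the lowest score upward; the function name `frequency_table` is illustrative and not part of the assignment.

```python
from collections import Counter

scores = [3, 7, 6, 5, 5, 9, 6, 4, 6, 8, 10, 2, 7, 4, 9, 5, 6, 3, 8]


def frequency_table(data):
    """Return rows of (X, f, fX, cf, c%) ordered from the lowest score up."""
    counts = Counter(data)
    n = len(data)
    rows = []
    cumulative = 0
    for x in sorted(counts):
        f = counts[x]
        cumulative += f  # running total of frequencies (cf)
        cumulative_pct = round(100 * cumulative / n, 2)  # c%
        rows.append((x, f, f * x, cumulative, cumulative_pct))
    return rows


table = frequency_table(scores)
print(f"{'X':>3} {'f':>3} {'fX':>4} {'cf':>4} {'c%':>7}")
for x, f, fx, cf, cpct in table:
    print(f"{x:>3} {f:>3} {fx:>4} {cf:>4} {cpct:>7.2f}")
```

The fX column sums to the total of all scores, which gives a quick check against the sum used for the mean.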
- What is a measure of central tendency, and what are the common measures of central tendency?
Measures of central tendency are single numbers that describe what is typical of a given data distribution.
Common measures of central tendency:
Mode, Median, Mean
- For the data set that you already have, calculate each measure of central tendency.
- The mean
$\bar{X} = \frac{\sum X}{N}$
$\bar{X} = \frac{113}{19} = 5.95$
- The median: the value that divides the data set, arranged in ascending or descending order, into two equal halves, with 50% of the scores on each side.
2, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 9, 9, 10
With N = 19, the median is the 10th score in the ordered list. It is 6
- The mode: most frequent value in data set
It is 6
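The three values above can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

scores = [3, 7, 6, 5, 5, 9, 6, 4, 6, 8, 10, 2, 7, 4, 9, 5, 6, 3, 8]

mean = statistics.mean(scores)      # sum of scores divided by N
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequent score

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```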
- What is a measure of variability, and what are the common measures of variability?
Measures of variability are indexes that show how widely the scores in a distribution are spread out, that is, how much they differ from one another and from the center.
Common measures of variability:
Standard deviation, Variance, Range
- For the data set that you already have, calculate the measures of variability.
Range
$R = X_h - X_l + I$
Where, R = range
$X_h$ = highest value
$X_l$ = lowest value
I = interval
Therefore, $R = 10 - 2 + 1 = 9$
Variance
$\sigma^2 = \frac{\sum x^2}{N}$, where $x = X - \bar{X}$ is the deviation score
$\sigma^2 = \frac{88.9475}{19} = 4.68$
Standard deviation
$\sigma = \sqrt{\frac{\sum x^2}{N}}$
$\sigma = \sqrt{4.68} = 2.16$
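The range, variance, and standard deviation above can be reproduced as follows. This sketch assumes the population formulas (dividing by N rather than N - 1), as used in the calculation above, and an interval of I = 1.

```python
import statistics

scores = [3, 7, 6, 5, 5, 9, 6, 4, 6, 8, 10, 2, 7, 4, 9, 5, 6, 3, 8]

# Inclusive range: highest minus lowest, plus the interval width (I = 1).
inclusive_range = max(scores) - min(scores) + 1

# Sum of squared deviation scores, the numerator of the variance.
mean = statistics.mean(scores)
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Population variance and standard deviation (divide by N).
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print(f"range = {inclusive_range}")
print(f"sum of squares = {sum_of_squares:.4f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```

If the scores were treated as a sample used to estimate a population, `statistics.variance` and `statistics.stdev` (which divide by N - 1) would be the usual alternative.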
Q3. (a) When is the median preferred over the mean?
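The median is preferred when the distribution is skewed or contains extreme scores, because it then illustrates typical performance better than the mean, which is pulled toward the extremes.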
(b) Provide an example for this.
Annual income earned by a group of individuals: a few very high earners raise the mean, so the median better represents the typical income.
(c) If the mean is much greater than the median, are the data skewed to left or skewed to the right?
When the mean is much greater than the median, the data are skewed to the right (positively skewed): a few extreme high scores pull the mean above the median.
(d) Please draw a distribution where you can show this skew.
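As a rough illustration of how the mean and median separate in a skewed distribution, the sketch below uses a small set of hypothetical incomes (in thousands; invented for illustration, not taken from the assignment) and prints a text histogram. Comparing the mean with the median is a rule of thumb for the direction of skew, not a formal test.

```python
import statistics
from collections import Counter

# Hypothetical annual incomes in thousands; one extreme high value.
incomes = [20, 22, 25, 25, 28, 30, 32, 35, 40, 143]

mean = statistics.mean(incomes)
median = statistics.median(incomes)

# Rule of thumb: the mean is pulled toward the longer tail.
if mean > median:
    skew_direction = "right"
elif mean < median:
    skew_direction = "left"
else:
    skew_direction = "none"

print(f"mean = {mean}, median = {median}, skew = {skew_direction}")

# Text histogram with bins of width 20; the long tail extends to the right.
bin_width = 20
bins = Counter((x // bin_width) * bin_width for x in incomes)
for start in range(min(bins), max(bins) + bin_width, bin_width):
    print(f"{start:>3}-{start + bin_width - 1:<3} {'#' * bins[start]}")
```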
Q4. (a) If a set of data is normally distributed, how many of the cases fall within one standard deviation? How many fall within two standard deviations? How many fall within three standard deviations?
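Within one standard deviation of the mean = about 68% of cases
Within two standard deviations of the mean = about 95% of cases
Within three standard deviations of the mean = about 99.7% of cases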
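These percentages follow from the cumulative distribution function of the normal curve. A minimal check, using `statistics.NormalDist` from the Python standard library (the helper name `proportion_within` is illustrative):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: proportion_within(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} SD: {p:.1%}")
```

The result does not depend on the particular mean or standard deviation, because any normal distribution can be rescaled to the standard normal.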
(b) Show this graphically.
Figure 2: Values falling within 1, 2, and 3 standard deviations
Q5. What is the importance of a z-score, and how do you calculate a z-score? For a sample with $\bar{X} = 60$ and $\sigma = 12$, find the
- z-score
- T-score, and
- Stanine score corresponding to each of the following.
X values: 66, 78, 57
Importance of z-score: it expresses a score as the number of standard deviations it lies above or below the mean, so it measures the relative position of an individual in a group and allows scores from different distributions to be compared on a common scale.
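Z-score formula:
$z = \frac{x}{\sigma}$
Where, $x = X - \bar{X}$ = deviation score
$\sigma$ = standard deviation of the distribution
- Z-score, $z_1 = \frac{x}{\sigma} = \frac{66 - 60}{12} = 0.5$
$z_2 = \frac{78 - 60}{12} = 1.5$
$z_3 = \frac{57 - 60}{12} = -0.25$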
- T-score, $T = 10z + 50 = \frac{10(X - \bar{X})}{\sigma} + 50$
$T_1 = 10z_1 + 50 = 10 \times 0.5 + 50 = 55$
$T_2 = 10 \times 1.5 + 50 = 65$
$T_3 = 10 \times (-0.25) + 50 = 47.5$
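- Stanine score, $St = 2z + 5$
$St_1 = 2z_1 + 5 = 2 \times 0.5 + 5 = 6$
$St_2 = 2 \times 1.5 + 5 = 8$
$St_3 = 2 \times (-0.25) + 5 = 4.5 \approx 5$ (rounded to the nearest whole stanine)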
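The three conversions above can be scripted as a check. This sketch assumes that stanines are rounded half up (so 4.5 becomes 5, as in the answer above) and clamped to the 1 to 9 scale; the function names are illustrative.

```python
import math

MEAN = 60  # sample mean given in the question
SD = 12    # standard deviation given in the question


def z_score(x, mean=MEAN, sd=SD):
    """Deviation from the mean, expressed in standard deviation units."""
    return (x - mean) / sd


def t_score(z):
    """T-score: z rescaled to a mean of 50 and a standard deviation of 10."""
    return 10 * z + 50


def stanine(z):
    """Stanine: 2z + 5, rounded half up and limited to the range 1..9."""
    rounded = math.floor(2 * z + 5 + 0.5)  # round half up (assumption)
    return max(1, min(9, rounded))


for x in (66, 78, 57):
    z = z_score(x)
    print(f"X = {x}: z = {z}, T = {t_score(z)}, stanine = {stanine(z)}")
```

Python's built-in `round` uses round-half-to-even and would turn 4.5 into 4, which is why the sketch rounds explicitly.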
Q6. A researcher is examining the effects of a computer-based training program designed to teach algebra. The researcher randomly selects subjects for two groups and gives one group the computer training and the other standard teaching methods to see if the results of the two methods differ. The following scores are from the subjects’ final tests.
- Which test is the most appropriate for this data and hypothesis? Explain why.
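Independent-samples t-test, two-tailed
The two groups are made up of different, randomly selected subjects, and the hypothesis concerns a difference between their mean scores. The researcher's interest is not limited to one direction of difference but in whether the two methods differ at all, so that the better of the two can be identified.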
- What is the null hypothesis? Write it in scientific terms.
The null hypothesis is the default position that there is no relationship between the measured variables. Here it states that the mean final test score is the same under both teaching methods: $H_0: \mu_1 = \mu_2$.
- The degrees of freedom for the test are? Explain why.
Degrees of freedom
n1 + n2 -2 = 8+ 8 -2 = 14
This is so because the two groups are independent, and one degree of freedom is lost in each group when its mean is estimated.
- At what level would you test the significance and in how many tails? Explain why.
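The significance test would compare the scores of the two groups, conventionally at the .05 level.
It would be done in two tails, because the researcher asks only whether the results of the two methods differ, without predicting which method is better.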
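As a sketch of how such a comparison could be computed, the function below calculates a pooled-variance t statistic with $n_1 + n_2 - 2$ degrees of freedom, matching the degrees of freedom given above. The score table is not reproduced in this text, so the two lists are hypothetical placeholders, and the function name `independent_t` is illustrative.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance comparison of two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # Sums of squared deviations within each group.
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical final test scores, eight subjects per group (not the real data).
computer = [85, 90, 78, 92, 88, 76, 95, 89]
standard = [80, 75, 82, 70, 78, 85, 72, 77]

t_value, df = independent_t(computer, standard)
print(f"t({df}) = {t_value:.2f}")
```

The computed t would then be compared with the two-tailed critical value for the chosen significance level at 14 degrees of freedom.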
Q7. For the following scores; (a) calculate Pearson r and r2; (b) sketch a scatterplot of your data.
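$\sum X = 10$, $\bar{X} = 2$, $\sum Y = 15$, $\bar{Y} = 3$, $N = 5$
- Pearson $r = \frac{SP}{\sqrt{SS_x SS_y}} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{SS_x SS_y}} = \frac{-11}{\sqrt{8 \times 18}} = \frac{-11}{12} = -0.917$
Pearson $r^2 = \left(\frac{-11}{12}\right)^2 = \frac{121}{144} = 0.840$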
- Scatterplot
$\bar{X} = 2$; $\bar{Y} = 3$; $\sigma_x = 1.26$; $\sigma_y = 1.90$
Figure 3: Scatterplot for scores, z
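The Pearson calculation in part (a) can be checked from the summary quantities alone. This sketch takes SP = -11, SSx = 8, SSy = 18, and N = 5 from the working above; the raw X and Y scores are not listed in this text, so they are not reconstructed here.

```python
import math

N = 5
SP = -11   # sum of products of deviation scores
SS_X = 8   # sum of squared deviations for X
SS_Y = 18  # sum of squared deviations for Y


def pearson_r(sp, ss_x, ss_y):
    """Pearson r from the sum of products and the two sums of squares."""
    return sp / math.sqrt(ss_x * ss_y)


r = pearson_r(SP, SS_X, SS_Y)
r_squared = r ** 2

# Population standard deviations used to describe the scatterplot.
sd_x = math.sqrt(SS_X / N)
sd_y = math.sqrt(SS_Y / N)

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"sd_x = {sd_x:.2f}, sd_y = {sd_y:.2f}")
```

Squaring the unrounded r avoids the small rounding error that appears when a truncated value of r is squared.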