Suppose my research question concerns the average income of people living in different areas. Specifically, I want to compare the average annual income of urban residents with that of rural residents. In this case, the pair of hypotheses is as follows.
Null hypothesis: there is no difference in the population average income between urban and rural residents. Alternative hypothesis: there is a difference in the population average income between urban and rural residents ("Hypothesis Tests", 2016).
My next task is to gather sample data on income levels in the two areas. I have to survey enough rural and urban individuals to satisfy the assumptions of a two-sample hypothesis test. Since I want to compare the mean values of a quantitative variable between two groups, this will be a parametric test. In most cases, the population standard deviation is unknown. Hence, I cannot use a z-test and have to use Student’s t-test for independent samples ("Independent T-Test in SPSS Statistics - Procedure, output and interpretation of the output using a relevant example | Laerd Statistics", 2016). One of the assumptions of this test is that the dependent variable is approximately normally distributed in both groups. So, the sample sizes should be large enough for the sampling distribution of the mean to be approximately normal.
Then I calculate the t-statistic and the p-value (the probability of obtaining a result at least as extreme as the observed one if the null hypothesis is true) and compare them with the critical t-value and the level of significance alpha, respectively. The most common level of significance for such tests is 5% (or 0.05); alpha is the probability of a type I error, that is, of rejecting a null hypothesis that is actually true. If the p-value is less than 0.05 (or the absolute value of the observed t is greater than the critical t), we reject the null hypothesis and conclude that there is enough evidence of a significant difference in income levels between urban and rural residents (at the 5% level of significance). Otherwise, we fail to reject the null hypothesis, meaning that the sample does not provide enough evidence of a difference in the average income of the two groups.
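The decision rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the income figures (in thousands of dollars) are made up, the helper name `pooled_t_test` is hypothetical, and the pooled-variance form of Student's t-test used here additionally assumes that the two groups have roughly equal variances. The critical value 2.101 is the two-tailed t-table entry for alpha = 0.05 and 18 degrees of freedom.

```python
import math
import statistics


def pooled_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Student's
    independent-samples t-test with a pooled variance estimate."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the n - 1 (sample) denominator
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical annual incomes in thousands of dollars (illustrative only)
urban = [52, 61, 58, 47, 66, 55, 59, 63, 50, 57]
rural = [41, 38, 45, 36, 48, 40, 43, 39, 44, 37]

t_stat, df = pooled_t_test(urban, rural)

# Two-tailed critical value for alpha = 0.05 and df = 18, from a t-table
T_CRITICAL = 2.101
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

With real survey data, a statistics package would normally report the exact p-value as well, so the comparison with alpha could be made directly instead of through a table lookup.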
Now, let’s discuss the issues that may arise with the sampling procedure in this case. Representativeness is the property of a sample to reflect the general characteristics of a population ("Representative Sample Definition | Investopedia", 2010). If the sample does not match the population, we speak of an error of representativeness, that is, a deviation of the sample's structure from the structure of the population. Assume that the population average annual income in the rural area is $40,000, but in our sample we obtained a mean value of $100,000. This suggests that we have interviewed mostly the prosperous part of the rural residents.
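A small simulation may make the error of representativeness more concrete. It is a sketch, not a model of real incomes: the rural population is assumed to be 10,000 residents with normally distributed incomes centred on $40,000 (the standard deviation of $8,000 and the sample size of 500 are arbitrary choices), and the "biased" survey is assumed to reach only the highest earners.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical rural population: 10,000 incomes centred on $40,000
population = [rng.gauss(40_000, 8_000) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Simple random sample: every resident is equally likely to be surveyed
random_sample = rng.sample(population, 500)

# Biased sample: only the 500 most prosperous residents are surveyed
biased_sample = sorted(population)[-500:]

# Error of representativeness: sample mean minus population mean
random_error = statistics.mean(random_sample) - population_mean
biased_error = statistics.mean(biased_sample) - population_mean

print(f"Population mean:     {population_mean:,.0f}")
print(f"Random sample error: {random_error:,.0f}")
print(f"Biased sample error: {biased_error:,.0f}")
```

The random sample's mean should land close to the population mean, while the biased sample overstates it by a wide margin, which is the same pattern as the $40,000 versus $100,000 example.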
References
Hypothesis Tests. (2016). Stattrek.com. Retrieved 24 April 2016, from http://stattrek.com/hypothesis-test/hypothesis-testing.aspx
Independent T-Test in SPSS Statistics - Procedure, output and interpretation of the output using a relevant example | Laerd Statistics. (2016). Statistics.laerd.com. Retrieved 24 April 2016, from https://statistics.laerd.com/spss-tutorials/independent-t-test-using-spss-statistics.php
Representative Sample Definition | Investopedia. (2010). Investopedia. Retrieved 24 April 2016, from http://www.investopedia.com/terms/r/representative-sample.asp