- Why is the coefficient of correlation an important tool for statisticians? Provide some specific examples from real life.
Correlation (from Latin correlatio, "relationship") is a statistical relationship between two or more random variables (or variables that can, with a reasonable degree of accuracy, be treated as such). In such a relationship, changes in the values of one or more of these variables are accompanied by systematic changes in the values of the others. The mathematical measure of correlation between two random variables is the correlation ratio or the correlation coefficient r. If a change in one random variable does not lead to a systematic change in the other, but does lead to a change in some other statistical characteristic of that variable, the relationship is statistical but is not considered a correlation.
A significant correlation between two random variables is evidence of a statistical relationship in the given sample, but that relationship does not have to hold in other samples and does not have to be causal. The tempting simplicity of correlation analysis often encourages researchers to draw false intuitive conclusions about cause and effect between pairs of traits, whereas correlation coefficients establish only statistical relationships. For example, looking at fires in a particular city, you can find a very high positive correlation between the damage caused by a fire and the number of firefighters involved in extinguishing it. It must not be inferred from this that "an increase in the number of firefighters leads to an increase in damage," and still less that fire damage could be minimized by eliminating fire brigades. At the same time, a lack of correlation between two variables does not mean that there is no connection between them. For example, the relationship may be complex and nonlinear, which the correlation coefficient does not reveal.
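The last point can be checked numerically. The sketch below is only an illustration: it assumes r is the Pearson correlation coefficient, and the helper name pearson_r and the data are made up for this example. It computes r for a perfectly linear relationship and for a perfectly determined but nonlinear one (y = x²) over values of x that are symmetric around zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: x values symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # y depends linearly on x
quadratic = [x ** 2 for x in xs]   # y is fully determined by x, but not linearly

r_linear = pearson_r(xs, linear)
r_quadratic = pearson_r(xs, quadratic)
print(f"linear:    r = {r_linear:.3f}")
print(f"quadratic: r = {r_quadratic:.3f}")
```

In the linear case r comes out as 1, while in the quadratic case it comes out as 0 even though y is completely determined by x. A value of r near zero therefore rules out only a linear relationship, not a relationship of any kind.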
- What are some terms related to hypothesis testing with which you are already familiar? Why do null and alternative hypotheses have to be mutually exclusive?
I’m very familiar with hypothesis testing for comparing the means of two samples. In this case, the null hypothesis is:
H0: μ1=μ2
which means that there is no difference between the means of the two populations from which the samples are drawn.
The alternative hypothesis can be one of the following:
Ha: μ1>μ2
Ha: μ1<μ2
Ha: μ1≠μ2
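The null hypothesis and the chosen alternative are always mutually exclusive. This is because we divide the range of possible values of the test statistic into mutually exclusive regions and then check whether the estimated value falls inside the confidence interval (the null hypothesis is not rejected) or outside it (the null hypothesis is rejected in favor of the alternative). If both hypotheses could be true at the same time, rejecting one would tell us nothing about the other.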
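As a rough illustration of how the three alternatives relate to each other, here is a minimal sketch of a two-sample test of H0: μ1=μ2. It assumes a normal (z) approximation, which is reasonable only for fairly large samples; for small samples a t-distribution would normally be used instead. The function name two_sample_z and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def two_sample_z(sample1, sample2):
    """Approximate z-test for the difference of two means (large samples assumed)."""
    # Standard error of the difference between the two sample means.
    se = math.sqrt(variance(sample1) / len(sample1)
                   + variance(sample2) / len(sample2))
    z = (mean(sample1) - mean(sample2)) / se
    std_normal = NormalDist()
    p_greater = 1 - std_normal.cdf(z)         # Ha: mu1 > mu2
    p_less = std_normal.cdf(z)                # Ha: mu1 < mu2
    p_two_sided = 2 * min(p_greater, p_less)  # Ha: mu1 != mu2
    return z, {"greater": p_greater, "less": p_less, "two-sided": p_two_sided}

# Hypothetical measurements; the first sample is shifted upward by 1.
sample1 = [10 + 0.1 * i for i in range(40)]
sample2 = [9 + 0.1 * i for i in range(40)]

z, p_values = two_sample_z(sample1, sample2)
print(f"z = {z:.2f}")
for alternative, p in p_values.items():
    print(f"Ha ({alternative}): p = {p:.5f}")
```

The two one-sided p-values always add up to 1, because the regions "statistic above the observed value" and "statistic below the observed value" do not overlap. With this data the test rejects H0 in favor of μ1>μ2 at the 5% level and finds no support at all for μ1<μ2.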
- Discuss what regression models can do for decision making.
An adequate linear model has the form of a first-degree polynomial. The coefficients of the polynomial are the partial derivatives of the response function with respect to the corresponding variables. Geometrically, each coefficient is the tangent of the angle at which the fitted hyperplane is inclined to the corresponding axis. A larger absolute value of a coefficient corresponds to a steeper angle and therefore to a more substantial change in the response when that factor changes.
For example, in business, we can create a regression model to predict the price of a house based on its location, area and other conditions. Such a model can then be used to forecast the price of a house with given values of area, location, and so on.
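A minimal sketch of such a model with a single factor (area) is shown below. It assumes ordinary least squares as the fitting method; the function name fit_line and all of the numbers are invented for illustration and are not real market data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: area in square metres, price in thousands.
areas = [50, 70, 90, 110, 130]
prices = [172, 228, 291, 352, 407]

intercept, slope = fit_line(areas, prices)

# Forecast the price of a house whose area is not in the data.
predicted = intercept + slope * 100
print(f"price = {intercept:.1f} + {slope:.2f} * area")
print(f"forecast for 100 square metres: {predicted:.1f}")
```

The slope is the coefficient discussed above: it estimates how much the price changes when the area grows by one unit. A decision maker can use it both to forecast prices and to judge how strongly each factor matters; with several factors (location, condition, and so on) the same idea extends to multiple regression.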