Introduction
According to medical studies, caffeine, a stimulant drug contained in coffee, affects the human heart rate. Premiere Hospital, a state-of-the-art medical facility, wants to determine whether to serve caffeinated beverages to patients in its waiting rooms, and would therefore like to establish whether a relationship exists between coffee and heart rate. The hospital served coffee to a sample of fifteen patients and recorded each patient’s heart rate before and after the caffeinated beverage. This analysis seeks to establish whether there is a significant relationship between coffee and heart rate using three data analysis techniques: descriptive statistics, hypothesis tests, and correlation analysis. Descriptive statistics comprise measures of central tendency and dispersion that help a researcher identify the characteristics of a data set. Hypothesis tests assess whether the data support a hypothesized phenomenon, while correlation analysis investigates the existence of a relationship between the dependent and independent variables under study.
Primary Data Analysis
The data obtained from the fifteen patients is experimental data generated by measurement. The medical practitioners of Premiere Hospital, in this case, measured each patient’s heart rate before and after taking coffee. This study, therefore, is experimental research which seeks to investigate the cause-and-effect relationship between caffeine intake and the human heart rate. Experimental data is mostly used in clinical research to establish relationships between variables. Clinical researchers using experimental data vary some variables while holding others constant in order to isolate the effect of interest. Experimental data is beneficial to a researcher because it helps them determine the cause and effect of the variables under study; for example, experimental data can be used to determine whether or not nicotine affects an individual’s ability to drive.
The data generated for this study is numerical data measured on a scale with equal intervals. Interval data consists of logically ordered values in which the differences between values are meaningful; temperature in degrees Celsius is a typical example, because it has no true zero point. Heart rate, conventionally measured in beats per minute, resembles age in years and distance in kilometers in that it also has a true zero; strictly speaking, it is therefore ratio-level data, which supports every operation that interval data does. The random sampling technique was used to select the participants of this experiment. The hospital tested fifteen random patients who took caffeinated beverages in the waiting rooms on a single day. This is random sampling, in which every element in a given population has an equal chance of selection. The random sampling technique facilitates generalization, whereby the characteristics of the sample are deemed to represent those of the population. The fifteen patients selected from the hospital’s waiting rooms are the sample for this study, while all patients taking caffeinated beverages in the hospital represent the population.
Caffeine, which is contained in coffee, is the independent variable for this study. The human heart rate, in contrast, is the dependent variable. An independent variable is one that a researcher controls and can manipulate to establish its effects on other variables. A dependent variable is one expected to change when the independent variable is changed. Dependent variables, therefore, represent an output or effect, while independent (explanatory) variables are the causes or inputs in a study (Graham, 2003). The data omits one potential confounding variable, the heart condition of the patients, which could partly account for any increase in heart rate after taking coffee. Additionally, there is reason to believe that the population is normally distributed because the recorded values vary regularly and fall within ranges close to each other.
Examination of Descriptive Statistics
Three graphical representations of the data, namely a scatter plot, a bar graph, and a line graph, are used to determine whether the data generated by the study is normally distributed. The shapes generated by the diagrams are as follows:
Figure 1: Scatter plot
Scatter plots describe the relationship between two variables under study. This scatter plot shows a concentration of values at one point, which is an indication that the data might be normally distributed.
Figure 2: Bar graph
The bar graph also shows a concentration of values in the range of 60 to 100. This concentration suggests that the data is normally distributed.
Figure 3: Line graph
Line graphs depict existing relationships between an independent and a dependent variable. This line graph likewise suggests that the data set is normally distributed.
Measures of Central Tendency
Descriptive statistics are used to determine the mean, median, and mode of the heart rate both before and after coffee. The following output is generated for both data sets.
Heart rate before coffee
The mean is the sum of all sample values divided by the number of those values; it is the average (Graham, 2003). The mean is computed using the formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ is the $i$-th observed value and $n$ is the number of observations.
The mean heart rate of the participants before taking coffee is 78.27. The median is the middle value in a given data set; in this case, 78 is the middle value of the heart rates before coffee intake when the values are arranged in ascending or descending order. The mode is the most frequently occurring value in a given data set. A mode of 90 means that the most common heart rate among the patients before taking a caffeinated beverage is 90; it does not mean that a majority of patients had that heart rate.
Heart rate after coffee
The average heart rate of a patient after taking coffee is 82.67. The median is 84, the middle value of the heart rates after coffee intake when the values are arranged in ascending or descending order. The mode shows that the most common heart rate after taking coffee is 95.
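The three measures of central tendency described in this section can be illustrated with a short sketch. It uses Python's standard statistics module. The list heart_rates holds made-up values chosen only for illustration; it is not the hospital's data, so the results will not match the figures reported above.

```python
import statistics

# Hypothetical heart rates (assumed beats per minute), for illustration only.
# These are NOT the Premiere Hospital measurements.
heart_rates = [72, 80, 90, 65, 90, 78, 84]

# Mean: sum of the values divided by the number of values.
mean_rate = statistics.mean(heart_rates)

# Median: middle value once the data is sorted.
median_rate = statistics.median(heart_rates)

# Mode: most frequent value(s); multimode returns a list so ties are visible.
modes = statistics.multimode(heart_rates)

print(f"mean   = {mean_rate:.2f}")
print(f"median = {median_rate}")
print(f"mode   = {modes}")
```

Returning every tied mode as a list is a deliberate choice here, since a small sample can easily have more than one most frequent value.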
Measures of Dispersion and Variation
Measures of dispersion and variation include the variance, the range, the standard deviation, and the standard error (Graham, 2003). The descriptive statistics analysis output generated the values of these measures. The standard deviation is a measure of how widely the values of a data set are dispersed from the mean. The variance is the square of the standard deviation and is likewise a measure of data dispersion from the mean. The range represents the difference between the highest and lowest values in a set of data; however, it is a crude measure of dispersion because it depends only on the two extreme values and tends to increase as the sample size increases. The standard error is a measure of how precisely the sample mean estimates the population mean.
The standard deviation of the patients’ heart rates before taking coffee is 11.42. This implies that the values typically lie about 11.42 away from the mean of 78.27. The difference between the highest and lowest heart rate before taking coffee is 40. A standard error of 2.95 suggests that the mean heart rate before taking coffee is a reasonably precise estimate. The standard deviation of the patients’ heart rates after taking coffee is 12.55, which indicates a typical spread of about 12.55 around the mean of 82.67. The difference between the highest and lowest heart rate after taking coffee is also 40. The larger standard error of 3.24 implies that the mean heart rate after taking coffee is slightly less reliable than the mean before coffee intake.
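The reported standard errors can be checked against the reported standard deviations. The sketch below assumes the usual relationship, standard error = standard deviation / square root of the sample size, and uses only the figures stated in the text (n = 15, standard deviations 11.42 and 12.55). The helper names standard_error and variance_from_sd are illustrative and do not come from the original analysis.

```python
import math

n = 15  # sample size reported in the study

# Standard deviations reported in the text.
sd_before = 11.42
sd_after = 12.55


def standard_error(sd, sample_size):
    """Standard error of the mean: sd divided by the square root of n."""
    return sd / math.sqrt(sample_size)


def variance_from_sd(sd):
    """Variance is the square of the standard deviation."""
    return sd ** 2


se_before = standard_error(sd_before, n)
se_after = standard_error(sd_after, n)

print(f"SE before coffee: {se_before:.2f}")
print(f"SE after coffee:  {se_after:.2f}")
print(f"Variance before:  {variance_from_sd(sd_before):.2f}")
print(f"Variance after:   {variance_from_sd(sd_after):.2f}")
```

Under these assumptions the computed standard errors round to 2.95 and 3.24, in agreement with the reported output.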
Outliers
An outlier is a value which differs significantly from the mean value. When an outlier is identified in a data set, the value might be invalid (Graham, 2003). A researcher, therefore, should consider disregarding it, as it may compromise the findings. An outlier, however, might not affect the findings if the outcome of an analysis does not change even after it is discarded. The mean and standard deviation of a set of data can be used to detect outliers through the calculation and comparison of residuals. A value is identified as an outlier if it lies far from the mean. According to the test for outlier identification performed for the heart rate before coffee intake, 100 and 60 are the outliers for this data set. The value 60 is the outlier identified for the heart rate after coffee intake.
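One way to express how far a value lies from the mean is to divide its residual by the standard deviation. The sketch below does this for the values named above, using the means and standard deviations reported earlier. The function name sd_distance is illustrative, and the text does not state which cutoff the original test applied, so no cutoff is assumed here.

```python
def sd_distance(value, mean, sd):
    """Residual (value - mean) expressed in standard deviation units."""
    return (value - mean) / sd


# Summary statistics reported in the text.
before = {"mean": 78.27, "sd": 11.42}
after = {"mean": 82.67, "sd": 12.55}

# Values the text identifies as outliers in each data set.
flagged_before = [60, 100]
flagged_after = [60]

for value in flagged_before:
    d = sd_distance(value, before["mean"], before["sd"])
    print(f"before coffee: {value} lies {d:+.2f} SD from the mean")

for value in flagged_after:
    d = sd_distance(value, after["mean"], after["sd"])
    print(f"after coffee:  {value} lies {d:+.2f} SD from the mean")
```

Whether such distances count as far from the mean depends on the cutoff the analyst adopts, which should be stated alongside the result.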
Corrections
Based on the inspection of outliers above, the mean and standard deviation of each data set are recalculated with these outliers excluded. The mean heart rate before taking coffee after excluding the outliers is 63.60, while the standard deviation is 32.49. The mean and standard deviation of the heart rate after taking coffee are 78.67 and 24.32 respectively. This implies that additional errors could have occurred when calculating the measures of dispersion and variation for the data. These errors would be corrected through a recalculation of these measures. The distribution of the data may also have been wrongly identified. This error would be corrected by redrawing the charts to establish the type of distribution.
Inferential Statistics
Hypothesis Testing
Null hypothesis: There is no significant difference in the average heart rate before and after taking coffee.
Alternative hypothesis: There is a significant difference in the average heart rate before and after taking coffee.
Assumption: The population is normally distributed.
Test = z-test: Two samples for means
Level of significance = 0.05
Test statistic = -1.00
Decision rule: Reject the null hypothesis if the p-value is less than 0.05.
The following output was generated:
The two-sample z-test for means is used to investigate the validity of the hypotheses. A z-test is used because the data is assumed to be normally distributed. This is a two-tailed test, since we are testing whether there is a difference, in either direction, in the mean heart rate of a patient before and after coffee intake. The p-value (0.32) is greater than 0.05; hence we do not reject the null hypothesis. We therefore conclude that there is no significant difference in the average heart rate before and after taking coffee.
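The test statistic and p-value can be reproduced from the summary statistics reported earlier. The sketch below is a minimal version of a two-sample z-test for means. It assumes that the sample standard deviations stand in for the population standard deviations and that the two sets of readings are treated as separate samples of fifteen, as this kind of test does. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 15  # patients in each set of readings

# Means and standard deviations reported in the text.
mean_before, sd_before = 78.27, 11.42
mean_after, sd_after = 82.67, 12.55

# Standard error of the difference between the two means.
se_diff = math.sqrt(sd_before ** 2 / n + sd_after ** 2 / n)

# z statistic for the difference (before minus after).
z = (mean_before - mean_after) / se_diff

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
reject_null = p_value < alpha

print(f"z = {z:.2f}, p = {p_value:.2f}, reject null: {reject_null}")
```

With these inputs the statistic rounds to -1.00 and the p-value to 0.32, so the decision rule leaves the null hypothesis in place.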
Correlation
A correlation test measures the association between the independent and dependent variables of given data. A correlation value ranging between 0.8 and 1 implies that the variables have a positive and strong relationship. A correlation coefficient with a negative sign depicts an inverse association between the explained and explanatory variables; for example, a correlation coefficient ranging between -0.8 and -1 indicates a strong but inverse relationship between the variables (Graham, 2003). A correlation value whose magnitude is less than 0.7 implies that the variables under study are not strongly associated. The following output was generated by the correlation analysis:
The correlation coefficient generated by the test is 0.9063. This implies that there is a direct and strong relationship between the heart rate of a patient before and after taking coffee.
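A correlation coefficient of this kind can be computed as in the sketch below, which assumes the Pearson coefficient is the one intended. The function pearson_r is an illustrative helper, and the paired readings are made-up values, not the hospital's data, so the result will differ from 0.9063.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired heart rates for five patients, for illustration only.
before = [70, 75, 80, 85, 90]
after = [74, 78, 86, 88, 96]

r = pearson_r(before, after)
print(f"r = {r:.4f}")
```

A coefficient close to 1 here only says that patients with higher readings before coffee also tend to have higher readings afterwards.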
Conclusion
The hypothesis test indicates that the average heart rate of a patient before and after taking coffee does not differ significantly. This suggests that the effect of caffeine on heart rate may be insignificant. The correlation test, in contrast, shows a strong and direct relationship between heart rate before and after coffee intake, which may suggest that caffeine affects an individual’s heart rate to some extent. These results, however, do not establish that caffeine is the principal cause of changes in a patient’s heart rate; while it might play a role, other factors are likely involved. Additional information such as a patient’s health conditions and age would be useful in establishing the extent to which caffeine affects a patient’s heart rate. Individual demographic factors such as age and a history of health conditions are the variables missing in this study. In a follow-up study, I would collect qualitative and quantitative data on individual aspects such as the frequency of caffeine intake, physical fitness, and diseases from which a patient might suffer. My null hypothesis would be that there is no relationship between caffeine intake and the health of an individual.
Recommendations
The medical practitioners at Premiere Hospital should investigate additional factors that may cause changes in their patients’ heart rates, because the results obtained suggest that caffeine may be an insignificant cause of those changes. Additionally, they should consider conducting a controlled experiment in which they also measure the heart rates of individuals who take coffee but are not their patients. A controlled experiment is one in which some variables are held constant while others are under study. A comparison between the heart rate changes of the two groups, patients and non-patients, should then be made to establish the validity of the set hypothesis. This way, the medical practitioners would be able to determine the real effects of caffeine on patients’ heart rates.
References
Graham, A. (2003). Statistics. Blacklick, Ohio: McGraw-Hill.