Introduction
In this paper, we describe and discuss the application of statistics and probability theory to a real-world problem. Our goal is to develop a research question, collect the appropriate data, and examine them by applying various statistical techniques. The topic of our research is body fat percentage.
Most people consider weight the most important indicator when assessing the state of their figure. However, judging the effectiveness of a diet or the intensity of physical activity by weight or body mass index alone is not always correct. Another measurement is much more informative: the percentage of body fat.
Sometimes a person has a body weight that seems normal to him, yet his body looks fat. Sometimes a person thinks he needs to lose weight because the scales show a figure he dislikes, yet his appearance says otherwise: he has no belly and no double chin, and the extra weight comes entirely from well-developed muscles and dense bones.
Of course, weight is a very important factor in the process of improving your figure. But other criteria for evaluating the effectiveness of diets and workouts should also be taken into consideration. The easiest way is to look in the mirror and honestly admit to yourself whether you like your figure, whether the stomach sticks out, whether the muscles are visible, and so on. However, a more accurate, though more sophisticated, criterion is the percentage of body fat.
In our study, we will try to determine the key factors that significantly impact the percentage of body fat.
Literature Review, Research Question and Hypothesis
Body fat percentage varies considerably from person to person. It may be 7-10% among malnourished people or athletes, but can reach 50% or more of total body weight. In women, the percentage of body fat is usually higher than in men ("Fat Differences In Men Vs. Women", 2016). This is because men tend to have more developed muscles and thicker, denser bones.
The standards for body fat percentage are given in reference tables organized by sex and age. On average, body fat percentage is distributed across sex and age groups as follows ("Normal ranges of body weight and body fat", 2016):
Male, up to 30 years – 9-15%;
Male, 30-50 years – 11-17%;
Male, 50+ years – 12-19%;
Female, up to 30 years – 14-21%;
Female, 30-50 years – 15-23%;
Female, 50+ years – 16-25%.
Those whose body fat percentage falls within the normal range are generally considered to have the most attractive figure. People should aim for this range, so it is necessary to be able to determine the amount of fat in the body. There are several techniques for determining this indicator.
The simplest method of measuring body fat percentage is to use a special body composition analyzer (a scale with built-in sensors). The analyzer assesses body composition by measuring the electrical resistance of various body tissues. The individual places his hands and feet on the sensors, and a small current passes through his body. Special software then calculates all the necessary figures.
Another way of calculating body fat percentage is hydrostatic weighing. The technique determines body composition by measuring the weight and volume of the human body. The density of the body is then calculated from these two measurements and converted to a body fat percentage using a formula. The higher the density, the lower the percentage of body fat.
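The paper does not name the formula that converts density into a body fat percentage. As a minimal sketch, assuming the widely used Siri equation (an assumption on our part, not something stated in the source), the conversion might look like this. The function name `siri_body_fat` is hypothetical.

```python
def siri_body_fat(density_g_per_cm3: float) -> float:
    """Estimate body fat percentage from whole-body density.

    Assumption: Siri's equation, BF% = 495 / D - 450, with D in g/cm^3.
    The paper itself does not say which formula was applied.
    """
    if density_g_per_cm3 <= 0:
        raise ValueError("density must be positive")
    return 495.0 / density_g_per_cm3 - 450.0


# Illustrative densities only: higher density gives a lower estimate.
denser = siri_body_fat(1.08)
less_dense = siri_body_fat(1.04)
print(f"D=1.08 -> {denser:.1f}%   D=1.04 -> {less_dense:.1f}%")
```

Under this assumed formula the relationship is monotonic, which matches the statement that higher density means a lower percentage of body fat.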
In 2016, MyNetDiary published an article that highlights the most common factors associated with body fat percentage. They write: “According to Dan Benardot, PhD, RD, FACSM in Advanced Sports Nutrition, 2nd edition there are 7 factors that play a main role in body composition” ("Factors That Affect Body Fat | MyNetDiary", 2016). These factors are genetics, age, sex, menopause, type of activity, amount of activity, and nutrition.
Another study, published on the BioMed Central website, examined the association of body fat percentage (BFP) with body mass index (BMI), age, and other health-related, demographic, and work-related factors. The researchers reported a significant relationship between BMI and BFP across the various age and gender groups (Garza et al., 2015).
Given that the literature reviewed has covered the association between body fat percentage and body mass index among various age groups, my paper expands the field of research: I examine the relationship between body fat percentage and various body measures, such as bone density and neck, chest, biceps, and forearm measurements, among others. The full list of factors included in my research is given in the Methods section.
Null hypothesis: There is no significant association between the body fat percentage and various body measures.
Alternative hypothesis: The body fat percentage is related to some of the body measures that are included in this research.
Methods
In this analysis, I use a data set published on the Journal of Statistics Education website. The data represent a sample of 252 men for whom body fat percentage and various body size measurements were recorded. Body fat percentage was determined by hydrostatic weighing (individuals were weighed underwater, which is a cumbersome procedure). The following variables participate in the research study ("Dr. John Rasp's Statistics Website - Data Sets for Classroom Use", 2016):
Dependent variable: BODYFAT – represents the body fat percentage.
Independent variables: DENSITY (bone density), AGE (age of the participant), WEIGHT (weight of the participant, in pounds), HEIGHT (height of the participant, in inches), ADIPOSITY (the level of adiposity), NECK (neck circumference, in cm), CHEST (chest circumference, in cm), ABDOMEN (abdomen circumference, in cm), HIP (hip circumference, in cm), THIGH (thigh circumference, in cm), KNEE (knee circumference, in cm), ANKLE (ankle circumference, in cm), BICEPS (biceps circumference, in cm), FOREARM (forearm circumference, in cm), WRIST (wrist circumference, in cm).
In order to simplify the analysis, I reduce the number of independent variables by introducing a new independent variable: body mass index (BMI). The body mass index is weight in pounds divided by the square of height in inches, multiplied by 703 ("BMI Formula", 2016):
BMI = 703 × WEIGHT / HEIGHT²
The new variable is an index that depends on both HEIGHT and WEIGHT, so I remove the WEIGHT and HEIGHT variables from the list of independent variables. Next, since I have to create at least one dummy variable, I divide ADIPOSITY into two groups: 0 if ADIPOSITY is below the average value of 25.4, and 1 if it is equal to or above 25.4.
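The two derived variables described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure actually run for this paper: it assumes the standard imperial formula (703 times weight in pounds over height in inches squared), the function names are hypothetical, and the sample rows are made up.

```python
def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """BMI from pounds and inches: 703 * weight / height**2."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return 703.0 * weight_lb / height_in ** 2


def adiposity_dummy(adiposity: float, cutoff: float = 25.4) -> int:
    """Return 1 if adiposity is at or above the cutoff, otherwise 0."""
    return 1 if adiposity >= cutoff else 0


# Made-up rows (weight in lb, height in inches, adiposity); not from the data set.
rows = [(150.0, 68.0, 22.8), (210.0, 70.0, 30.1)]
for weight, height, adiposity in rows:
    print(round(bmi_imperial(weight, height), 2), adiposity_dummy(adiposity))
```

The cutoff of 25.4 is the sample average quoted in the text. Passing it as a parameter keeps the coding rule visible if the average changes.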
The purpose of the analysis is to examine the relationship between the independent variables and the dependent variable using multiple regression analysis. In general, multiple regression analysis examines the relationship between a number of independent variables and one dependent variable. A useful regression equation can be developed only if the independent variables are correlated with the dependent variable. Regression analysis works with quantitative indicators; qualitative indicators cannot be entered directly and must first be coded numerically, for example as dummy variables. Finally, regression analysis can be used only if the relevant assumptions are met.
The distributions of the variables are examined by histograms and boxplots:
Histograms and boxplots indicate that the distributions of AGE, ADIPOSITY, NECK, CHEST, ABDOMEN, HIP, ANKLE, and BMI variables are significantly different from a normal distribution.
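A boxplot flags values that lie more than 1.5 interquartile ranges beyond the quartiles. The sketch below applies that rule numerically. It is an illustration only: the helper name and sample values are hypothetical, and quartile conventions differ between software packages, so the fences may not match those drawn by SPSS.

```python
import statistics


def boxplot_outliers(values):
    """Return the values outside the 1.5 * IQR fences used by a boxplot."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Made-up ankle circumferences in cm, with one extreme value.
ankle = [21.9, 22.4, 22.8, 23.1, 23.4, 23.9, 24.2, 33.9]
print(boxplot_outliers(ankle))
```

A variable with several flagged values, or with flagged values on one side only, is a candidate for the skewed group listed above.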
Analysis
I start my analysis with descriptive statistics. Since the distributions of the variables vary significantly, it is not possible to choose one measure of central tendency or one measure of variability for all quantitative variables. The mean and standard deviation are usually used for symmetrical data, while the median and interquartile range (IQR) are the best measures for skewed data. In the table below, I present the full descriptive summary for all quantitative variables, along with a frequency distribution for the categorical data (ADIPOSITY):
The average body fat percentage is 18.94%, with a standard deviation (SD) of 7.75 and a range of 45.10. The average density is 1.06 (SD = 0.02, range = 0.11). The average age is 44.88 years (SD = 12.6 years, range = 59 years). The average circumferences are as follows: neck, 37.99 cm (SD = 2.43 cm, range = 20.10 cm); chest, 100.82 cm (SD = 8.43 cm, range = 56.90 cm); abdomen, 92.56 cm (SD = 10.78 cm, range = 78.70 cm); hip, 99.90 cm (SD = 7.16 cm, range = 62.70 cm); thigh, 59.41 cm (SD = 5.25 cm, range = 40.10 cm); knee, 38.59 cm (SD = 2.41 cm, range = 16.10 cm); ankle, 23.10 cm (SD = 1.69 cm, range = 14.80 cm); biceps, 32.27 cm (SD = 3.02 cm, range = 20.20 cm); forearm, 28.66 cm (SD = 2.02 cm, range = 13.90 cm); and wrist, 18.23 cm (SD = 0.93 cm, range = 5.60 cm). The average body mass index is 25.94 (SD = 9.56, range = 147.59). There are 137 participants (54.4%) with adiposity below the average value and 115 participants (45.6%) with adiposity equal to or above it.
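The summary measures reported above (mean, standard deviation, median, IQR, and range) can be computed with Python's standard `statistics` module. The sketch below is illustrative: `describe` is a hypothetical helper, the sample values are made up, and the quartile method may differ from the one SPSS uses.

```python
import statistics


def describe(values):
    """Return mean, SD, median, IQR, and range for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }


# Made-up wrist circumferences in cm, for illustration only.
wrist = [17.1, 18.2, 16.6, 18.8, 17.7, 19.4, 18.0]
summary = describe(wrist)
print({name: round(value, 2) for name, value in summary.items()})
```

Reporting both pairs of measures side by side makes it easy to pick the mean and SD for roughly symmetrical variables and the median and IQR for skewed ones.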
The relationship between the quantitative variables was first examined using correlation coefficients. The matrix of Pearson’s correlation coefficients is given in the table below:
Legend:
1= Body fat; 2= density; 3 = age; 4 = neck; 5 = chest; 6 = abdomen; 7 = hip; 8 = thigh; 9 = knee; 10 = ankle; 11 = biceps; 12 = forearm; 13 = wrist; 14 = BMI
The interpretation of the correlation coefficient:
We are interested only in the correlations between the BODYFAT variable and each of the other variables. Since the p-values of all of these correlation coefficients (highlighted in yellow) are less than 0.05, I conclude that there is a significant association between BODYFAT and each of the other variables. The relationship is described mathematically using multiple regression analysis:
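The significance check for a single Pearson coefficient can be sketched as follows. The p-value here approximates the t distribution by the standard normal, which is a simplification that is reasonable only for large samples such as the 252 cases in this study. The function names and sample data are hypothetical, and a statistics package will report slightly different p-values.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for H0: no linear association.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and replaces the t
    distribution with the standard normal (large-sample shortcut).
    """
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Made-up paired measurements, for illustration only.
abdomen = [85.2, 83.0, 87.9, 86.4, 100.0, 94.4]
bodyfat = [12.3, 6.1, 25.3, 10.4, 28.7, 20.9]
r = pearson_r(abdomen, bodyfat)
print(round(r, 3), approx_p_value(r, n=252) < 0.05)
```

With 252 cases even a modest coefficient is significant at the 0.05 level, so the size of each coefficient matters as much as its p-value.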
A multiple regression analysis was conducted to test the relationship between body fat percentage and the independent variables. The results indicate that the coefficients of the model are jointly significant (F = 778.214, p < 0.05). The coefficient of determination is 0.989, indicating that approximately 98.9% of the variance in the BODYFAT variable is explained by this model. However, almost none of the coefficients is individually significant in predicting BODYFAT: only the coefficient of the DENSITY variable is significant (t = -50.893, p < 0.05), while p > 0.05 for all the others.
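The overall F statistic and the adjusted R-squared both follow from R-squared, the number of cases n, and the number of predictors k. A minimal sketch of the two standard formulas is given below. The input values are hypothetical and are not results from this paper.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted to n cases."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2: float, n: int, k: int) -> float:
    """F statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Hypothetical model: R^2 = 0.75 with 3 predictors and 104 cases.
print(round(adjusted_r_squared(0.75, n=104, k=3), 4))
print(round(overall_f(0.75, n=104, k=3), 2))
```

Adjusted R-squared penalizes each extra predictor, so it is the fairer figure when comparing a model with many variables against a smaller one. A large F means the predictors jointly explain far more variance than would be expected by chance.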
In order to resolve this issue, I removed all the variables except DENSITY and then ran a series of multiple regression models to find the best one. Finally, I found the following combination of independent variables, which explains approximately the same percentage of variance:
This regression equation has approximately the same value of R-squared as the full model. This means that the two chosen factors explain the variance of the response variable to almost the same extent as all the variables together. The model is also significant (F = 839.078, p < 0.001), and all of its coefficients are significant (p < 0.001). This regression is therefore considered the best linear regression model for predicting body fat percentage.
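Comparing a smaller model with a larger one by R-squared can be sketched with a small least-squares routine. This is an illustration under stated assumptions, not the SPSS procedure used for the paper: `fit_ols` is a hypothetical helper that solves the normal equations directly, and the data are made up.

```python
def fit_ols(predictors, y):
    """Least squares with an intercept, solved via the normal equations.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    design = [[1.0] + [float(v) for v in row] for row in predictors]
    p = len(design[0])
    # Augmented system [X'X | X'y].
    aug = [
        [sum(row[i] * row[j] for row in design) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(design, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda idx: abs(aug[idx][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for idx in range(p):
            if idx != col:
                factor = aug[idx][col] / aug[col][col]
                aug[idx] = [a - factor * b for a, b in zip(aug[idx], aug[col])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Made-up data with two predictors, for illustration only.
x = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 2)]
y = [3.1, -2.0, 0.2, 1.9, 1.0]
_, r2_two = fit_ols(x, y)
_, r2_one = fit_ols([(row[0],) for row in x], y)
print(round(r2_one, 3), round(r2_two, 3))
```

If dropping predictors leaves R-squared almost unchanged, the smaller model is usually preferred because it is easier to interpret. That is the reasoning applied to the reduced model above.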
Conclusion
In this paper, I tried to determine the key factors that significantly impact the percentage of body fat. I worked with a sample of 252 men for whom body fat percentage and various body size measurements were recorded. Body fat percentage was determined by hydrostatic weighing. Bone density proved to be the most important factor in the comparison. In the full model, the other factors were not individually significant in predicting body fat percentage.
However, there are a number of limitations in this study. The first limitation is that we examined only men; hence, the results cannot be extended to women. The second limitation is that we know almost nothing about the common factors that may “unite” the individuals who participated in the research. Higher levels of bone density and chest volume may also be associated with athletic bodies that do not have a high body fat percentage. In addition, a number of variables are not normally distributed. These variables violate the assumption that all independent variables are normally distributed.
Further research may include a qualitative study (searching for new factors and new kinds of relationships between body fat percentage and these factors), the inclusion of female participants, and the division of the sample by body type (athletic, non-athletic, etc.).
References
BMI Formula. (2016). Bmi-calculator.net. Retrieved 5 March 2016, from http://www.bmi-calculator.net/bmi-formula.php
Dr. John Rasp's Statistics Website - Data Sets for Classroom Use. (2016). Www2.stetson.edu. Retrieved 5 March 2016, from http://www2.stetson.edu/~jrasp/data.htm
Factors That Affect Body Fat | MyNetDiary. (2016). Mynetdiary.com. Retrieved 5 March 2016, from http://www.mynetdiary.com/factors-that-affect-body-fat.html
Fat Differences In Men Vs. Women. (2016). MedicineNet. Retrieved 5 March 2016, from http://www.medicinenet.com/script/main/art.asp?articlekey=8519
Garza, J., Dugan, A., Faghri, P., Gorin, A., Huedo-Medina, T., & Kenny, A. et al. (2015). Demographic, health-related, and work-related factors associated with body mass index and body fat percentage among workers at six Connecticut manufacturing companies across different age groups: a cohort study. BMC Obesity, 2(1). http://dx.doi.org/10.1186/s40608-015-0073-1
Normal ranges of body weight and body fat. (2016). human-kinetics. Retrieved 5 March 2016, from http://www.humankinetics.com/excerpts/excerpts/normal-ranges-of-body-weight-and-body-fat
Testing for Normality using SPSS Statistics when you have only one independent variable. (2016). Statistics.laerd.com. Retrieved 6 March 2016, from https://statistics.laerd.com/spss-tutorials/testing-for-normality-using-spss-statistics.php