Math Studies
Introduction:
In a recent Time Magazine article, Leon Botstein, the president of Bard College, wrote that the SAT is biased and that “the only persistent statistical result from the SAT is the correlation between high income and high test scores” (Botstein 2014). Statistical research has found an observable relationship between SAT scores and test-takers’ family income: generally, the wealthier a student’s family is, the higher the SAT score. Does this hold true for average state income and education expenditure per student?
The strong correlation has been attributed to a number of factors: better school districts, more learning opportunities and the high cost of SAT prep courses. This seems to indicate that the more economic resources a student has, the higher their SAT score will be.
However, there is also research suggesting that spending more money on public education does not raise academic outcomes. A study from the Cato Institute reported that, despite a tripling of nationwide per-student state education expenditure, student performance has slightly declined in mathematics and verbal skills. The study concluded that there is “no relationship, effectively, between spending and academic outcomes” (Cato Institute 2014).
Questions:
Since we already know that a family’s income has a strong relationship with SAT scores, it may be interesting to look at how average state income (what the average person earns in a state) and average state educational expenditure (how much the state spends on education per student) relate to SAT scores.
1) What is the correlation between average state SAT scores and average state income?
– Outcome (Y) variable – Verbal SAT scores
– Predictor (X) variables - Mean state incomes, State expenditure per student
2) What is the R² for (a) state mean income and (b) state expenditure per student?
If there is a relationship, is it positive (more income and more educational expenditure go with higher SAT scores) or negative? The discussion and analysis will examine whether states that are wealthier in general have higher SAT scores, and what effect state educational expenditure has on one specific educational outcome: SAT scores.
Plan:
First, I will gather the data on average state SAT scores, average state income and educational expenditure, and use Microsoft Excel to create the data sets and two scatterplots.
To find the correlations between variables, I will graph scatterplots of the paired data and calculate correlation coefficients to determine whether there are any statistically significant relationships between the variables.
I am interested in looking at state averages. Specifically, what effect do average state income and average state educational expenditure have on SAT scores? I will be looking for correlation, a relationship or interdependence between variable quantities. The data will allow for an analysis of the relationship between average state incomes, educational expenditures and SAT performance. Data has been included for all fifty states; the District of Columbia is excluded. I will use two methods of bivariate correlation to determine the relationship between two quantitative variables. The scatterplots will give me a visual representation of the data. The Pearson correlation coefficient (r) will give me an idea of the direction and strength of the linear relationships, and the Spearman rank correlation will show the strength of the relationship between the rank orders of the variables. The scatterplots and the two mathematical processes should offer some insight into the relationship between income/educational investment per state and average SAT scores. I am not sure average incomes fluctuate enough between states to produce a statistically significant relationship, but I expect to see that the more a state spends on education, the higher its SAT scores.
Data Collection:
SAT data came from the annual “College-Bound Seniors” reports. State average income data was retrieved from the U.S. Department of Commerce Bureau of Economic Analysis. State educational expenditures per student were available from US Census data (Appendix A, B and C). I researched background information on economic investment and student SAT performance using JSTOR and trustworthy media outlets.
Data and Works Cited:
Botstein, L. (2014). College president: SAT is part hoax, part fraud. Time. Retrieved March 25, 2014
2013 College-Bound Seniors: Total Group Profile Report. (n.d.): n. pag. The College Board. Web.
"State Personal Income, Second Quarter, 2014." US Bureau of Economic Analysis. Web. 2 Dec. 2014. <http://www.bea.gov/newsreleases/regional/spi/
"Education Spending Per Student by State." Education Spending Per Student by State. N.p., n.d. Web. 01 Dec. 2014
Methodology:
First, in Excel, I compiled three sets of data: (A) average state SAT scores, (B) average state educational expenditures and (C) average state income.
I determined each mean by adding up the 50 state values and dividing by 50 (the number of states):
Average State Verbal SAT: 535.28
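Written out as a formula, the mean calculation described above is (a sketch, assuming each x_i is one state's average verbal score):

```latex
\bar{x} = \frac{1}{50}\sum_{i=1}^{50} x_i
```

A mean of 535.28 therefore corresponds to a total of 50 × 535.28 = 26,764 across the fifty states.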
Mathematical Processes:
I will use the Pearson (r) and Spearman (ρ) correlation coefficients. I want to explore the data thoroughly, so I will use both. Besides the information each provides, I can compare the Spearman and Pearson results to increase the validity of my conclusion. Spearman is computed on ranks, does not assume that the data are normally distributed, and is good for depicting monotonic relationships. Pearson is the more traditional method, relies on the actual values, and is good for depicting linear relationships.
Scatterplots:
The scatterplots above show visual representations of “Average State Income vs. SAT” and “State Per Capita Educational Expenditure vs. SAT.” Looking for trends, I can see a slightly negative relationship in each, and the two look about equally strong. Both have a cluster in the middle and some outliers. Scatterplots only show the data visually, so I also need a mathematical measure of the two relationships. Correlation coefficients can help me understand the strength of each relationship.
Pearson Correlation Coefficient:
Next, I calculated the Pearson correlation coefficient (r) between each predictor (income and educational expenditure) and SAT scores, where N is the number of pairs of scores, Σxy is the sum of the products of the paired scores, Σx is the sum of the x scores, Σy is the sum of the y scores, Σx² is the sum of the squared x scores, and Σy² is the sum of the squared y scores.
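The quantities defined here match the standard computational form of Pearson's r. Assuming that is the formula intended, it can be written as:

```latex
r = \frac{N\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[N\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[N\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```

The value of r always lies between −1 and +1, and its sign gives the direction of the linear relationship.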
Both results are negative. The linear correlation coefficient r has a negative value, so the linear correlation between the two variables is negative. A negative correlation means that as one of the variables increases, the other tends to decrease, and vice versa.
In general, the states with higher average incomes have lower SAT scores, and the states that spend the most on education per student also have lower SAT scores. In both cases, however, it is a low negative correlation.
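As a rough illustration of how a negative r arises from the sums, here is a minimal Python sketch of the calculation. The numbers are made up for the example and are not the state data from the appendices; the function and variable names are also just assumptions for illustration. It also squares r to give R², which for a single predictor is simply r².

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the raw sums N, sum(x), sum(y), sum(xy), sum(x^2), sum(y^2)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical values for illustration only (NOT the real state data):
# spending in thousands of dollars per student, and a verbal score.
spending = [8, 10, 12, 14, 16]
scores = [560, 540, 550, 520, 530]

r = pearson_r(spending, scores)
r_squared = r ** 2  # share of variation in scores explained by a straight line

print(round(r, 3), round(r_squared, 3))  # prints: -0.8 0.64
```

In this toy example the scores tend to fall as spending rises, so r is negative; the real data would be passed in the same way as two lists of 50 values.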
Spearman’s Rank Order Coefficient:
Since I was able to detect a slight negative trend using the Pearson process, I would also like to see whether the variables covary, that is, whether the other variable tends to increase or decrease when one variable increases. Spearman’s rank correlation assigns a rank to each state’s income/educational investment and to each state’s SAT score. This is a nonparametric correlation. It ranks the data and measures how strong the relationship between the ranks is, which is useful for checking whether the results are valid and meaningful. If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each variable is a perfect monotone function of the other. Values closer to 0 indicate less connection between the variables.
In Spearman’s formula, n is the number of paired ranks and d is the difference between the two ranks in each pair. Degrees of freedom are the number of values in the final calculation of a statistic that are free to vary; for a correlation computed from n pairs, there are n − 2.
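With n and d defined this way, the usual form of Spearman's coefficient, assuming no tied ranks, is:

```latex
\rho = 1 - \frac{6\sum d^{2}}{n\left(n^{2} - 1\right)}
```

With n = 50 states, the denominator is 50(50² − 1) = 124,950, so ρ moves away from 1 as the sum of squared rank differences grows.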
AVERAGE STATE INCOME VS AVERAGE SAT SCORE
Earlier, the Pearson correlation coefficient gave me a measure of the strength of the linear relationship between SAT scores and income/state educational investment. When the variables are not normally distributed or the relationship between the two variables is not linear, it may be better to also test the relationship using the Spearman rank correlation method. The results show weak monotonic relationships, but both are negative, which confirms the negative relationship seen in the Pearson correlation. The ρ values are small in magnitude, so there may be little correlation. If there is one, it is negative.
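To show why the two coefficients can differ, here is a small Python sketch on made-up numbers (not the state data; all names are hypothetical). It computes Spearman's ρ as Pearson's r applied to the ranks, which gives the same answer as the d² formula when there are no ties and also copes with tied values by giving them their average rank.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's r in mean-deviation form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always falls as x rises, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [1000, 500, 333, 250, 200, 167]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 2))
```

Here ρ is exactly −1 because the rank order is perfectly reversed, while r is weaker than −1 because the points do not lie on a line. When the two coefficients come out close together, as they did for the state data, that suggests the relationship is not strongly curved.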
Validity:
I used governmental sources and College Board numbers for my data. The sources are trustworthy and unbiased, so there should be few errors in the data. I used some advanced math processes, including the Pearson correlation coefficient test for a linear relationship and Spearman’s rank order correlation coefficient. Using the Pearson test, I noticed a negative relationship between SAT scores and both average income by state and average state educational expenditure. The Spearman process showed the variables are not strongly associated.
I did not measure the math or essay writing sections of the SAT. It may have been useful to see if there was also a negative correlation for the other sections. The Pearson and Spearman tests seemed to validate each other; they were not very different. Finally, I believe that geography may play a bigger role than average state income or state educational expenditures. If you group SAT scores by region, there seem to be large regional differences that may show much larger correlations with SAT scores.
Discussion/Analysis:
My initial assumption, that SAT scores would be higher in states with higher average incomes and educational investment, was incorrect. I expected to see a positive relationship, but there appears to be a slightly negative trend. I was aware there was a strong correlation between a family’s income and SAT scores, and I was expecting to see similar results here. However, I believe the higher SAT scores in high-income families may be a result of expensive test-preparation courses. Everyday investments in education may not correlate with higher SAT scores. Likewise, average income may not say much about trends in education. Many of the incomes were clustered around the 40-50k mark. In the studies of family income, SAT scores really spike above the 100-thousand-dollar-a-year income level. The difference between 30 thousand dollars and 50 thousand dollars may not correlate with scores in a significant way.
There were some interesting outliers: Iowa spends the least on education per student in the U.S., yet the average Iowa SAT score is well above average (568 vs. 535). Maryland and Alaska both have very high average incomes and below-average SAT verbal scores. States that I would assume to have high SAT scores, like Connecticut (509) and Massachusetts (512), have rather low scores. Overall, I feel the data is likely accurate: average income and educational expenditure have very little relationship (or a slightly negative one) with SAT scores. There are so many factors that you could evaluate when it comes to academic performance, from student-teacher ratios and teacher pay to whether schools have art and music programs.
The most surprising conclusion was the correlation between state expenditures and SAT scores. While the Pearson correlation coefficient calculation revealed a relationship for both state average income and educational investment vs. SAT scores, the relationship was negative, matching the downward slope of the scatterplots. In the end, I was disappointed because I thought there would be data that backed up the conclusions about individual family income. The Spearman results had such a low rho that I think these variables have little to do with each other. These results were a little surprising. There are no significant relationships between the data sets that allow for a decisive conclusion, but the negative trend seems clear enough to warrant further examination. I would assume that the kind of income that would statistically drive higher SAT scores would be much higher than average. This data could be analyzed in a number of different ways. It could be used to argue for less investment in schools, since the added investment does not seem to help students prepare for college. On the other hand, it could also be used to show the income and educational inequality of the first chart in this paper, where students with family income over 100k seem to see the highest scores. However, with the loose correlation and interdependence of the data, it really is difficult to reach any kind of conclusion.
Appendix A: State SAT scores
Appendix B: State Expenditures per Student
Appendix C: Mean Income by State