MODULE 1
2. Displays
Histograms show the distribution of a single variable, while bar charts compare categories. A histogram plots quantitative data, with the range of the data grouped into bins (intervals); a bar chart plots categorical data.
For example, a bar chart could show what percentage of people with a certain hair type have a certain eye color.
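A minimal Python sketch of the difference, using made-up values for illustration only: quantitative data are grouped into equal-width bins (as a histogram would do), while categorical data are simply counted per category (as a bar chart would do). The heights, the eye colors, and the helper name `bin_counts` are hypothetical and not part of the assignment.

```python
from collections import Counter


def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each equal-width bin [lo, lo + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical quantitative data (heights in cm) -> histogram-style bins.
heights = [152, 158, 161, 163, 167, 168, 171, 174, 179, 185]
height_bins = bin_counts(heights, start=150, width=10, n_bins=4)

# Hypothetical categorical data (eye colors) -> bar-chart-style counts.
eye_colors = ["brown", "blue", "brown", "green", "blue", "brown"]
color_counts = Counter(eye_colors)

print(height_bins)
print(dict(color_counts))
```

The bins have a natural order and a width, whereas the categories have neither, which is why the two displays are not interchangeable.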
3. Linear regression
We are comparing the birth rate and the life expectancy in developing countries such as Somalia, Tanzania, and Zambia, with a few developed countries such as France and the Netherlands added at the end. The data show a negative relationship: the higher the life expectancy, the lower the birth rate.
4. Trends
The more people run on the track, the warmer it should get. However, runners are not constantly in the same spot, so the track has time to cool down. Another influence on the track's temperature is the weather: if it is sunny, the track will get warmer with or without people running on it. The relationship between the two variables could be examined with a scatterplot, for example. If the fitted line slopes upward, there is a positive correlation between the temperature at an outdoor track and the number of people using the track.
5. Misrepresented
The National Security Agency was broadly collecting domestic Internet communications of Americans and misrepresenting the scope of that effort to the court. It was not only collecting foreign intelligence and trying to prevent terrorism, but also violating the law by collecting domestic information.
http://edition.cnn.com/2013/08/21/politics/nsa-fisa-court/
6. Outliers
The argument is invalid because it relies on a hidden premise. It does not say how, and on what, the increased law enforcement budget was spent. Therefore, it cannot be concluded that crime will necessarily decrease just because spending on law enforcement increased.
7. Correlation coefficients
Tutorial: How to make a histogram in Excel
- Open Excel
- Enter the data that you want to be in the histogram
- Click on the Data tab and then click on Data Analysis (this requires the Analysis ToolPak add-in to be enabled)
- Click on Histogram to select the histogram option and click OK
- Select the input range that contains your data
- Check the Chart Output box
- Click OK and your histogram will appear
MODULE 2
8. Displays
There are many situations in which we want to create bins of different sizes. In schools this happens when we score a test, for example, on a scale from 0 to 100 points, and we assign grades as follows: A: 91-100, B: 76-90, C: 61-75, D: 51-60, and F: 0-50.
When bins are of the same width, the heights of the bars are proportional to their areas, so either one can represent frequency. But with bins of different widths, we have to choose whether to represent frequency by the height of the bar or by its area = width*height, and the wrong choice can misrepresent the data.
Example: In a classroom of 30 students the grade distribution on a test was the following:
A: 5 (17%), B: 6 (20%), C: 8 (27%), D: 7 (23%), F: 4 (13%)
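In the graph below, the widths of the bars represent the range of scores for each grade (10, 15, 15, 10, 51), and their heights represent the frequencies. This is not a good representation: the F bar is so wide that its area suggests far more students than the 4 it actually stands for.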
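One common remedy, sketched here in Python as an assumption rather than as the method used for the original graph, is to let the area of each bar represent the frequency, so that the height becomes frequency divided by bin width. The counts and widths are the ones from the classroom example; the variable names are illustrative.

```python
# Grade counts and score-range widths from the classroom example.
grades = ["A", "B", "C", "D", "F"]
counts = [5, 6, 8, 7, 4]
widths = [10, 15, 15, 10, 51]

# Height chosen so that area (width * height) equals the frequency.
densities = [count / width for count, width in zip(counts, widths)]

for grade, count, width, density in zip(grades, counts, widths, densities):
    print(f"{grade}: count={count}, width={width}, height={density:.3f}")

# Each bar's area recovers its frequency, so the areas sum to the class size.
total_area = sum(w * d for w, d in zip(widths, densities))
print(total_area)
```

With these heights the F bar is wide but low (4/51, about 0.078 students per point), so it no longer dominates the picture.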
9. measures of spread
Tutorial: Standard deviation
The standard deviation is the most common measure of variability, measuring the spread of the data set and the relationship of the mean to the rest of the data. If the data points are close to the mean, indicating that the responses are fairly uniform, then the standard deviation will be small. Conversely, if many data points are far from the mean, indicating that there is a wide variance in the responses, then the standard deviation will be large. If all the data values are equal, then the standard deviation will be zero. The standard deviation is calculated using the square root of the following formula:
$$S^2 = \frac{\sum (X - M)^2}{n - 1}$$
Where Σ = Sum of
X = Individual score
M = Mean of all scores
n = Sample size (number of scores)
Therefore the standard deviation formula looks like this:
$$S = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$$
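As a rough check of the calculation, here is a short Python sketch that applies the sample formula (sum of squared deviations from the mean, divided by n - 1, then the square root) and compares it with the standard library's `statistics.stdev`. The scores are hypothetical and chosen only to illustrate the steps; `sample_std` is an illustrative name.

```python
import math
import statistics


def sample_std(scores):
    """Sample standard deviation: sqrt(sum((X - M)^2) / (n - 1))."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    squared_deviations = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical scores, used only to illustrate the calculation.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
s = sample_std(scores)

print(round(s, 3))
# The hand-written formula should agree with the library implementation.
print(math.isclose(s, statistics.stdev(scores)))
```

If all the scores are equal, every deviation is zero and the function returns 0, matching the statement above.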
10. distributions
Intervals of equal width are necessary for a histogram whose bar heights represent frequencies to display the distribution accurately, because only then can the counts in different intervals be compared directly. If the intervals are not of equal width, the bar heights alone give a misleading picture of the data.
11. central tendency
Exercise: For the following sets of data, find to one decimal place
- the mean
- the median, and
- the mode
- (a) 0 – 0 – 0 – 0 – 1 – 0 – 0 – 0 – 0 – 0 – 0
- (b) 2 – 1 – 2 – 3 – 1 – 3 – 0 – 2 – 4 – 2 – 2
- (c) 2.4 – 3.9 – 1.8 – 1.7 – 4.0 – 2.1 – 3.9 – 1.5 – 3.9 – 2.6
- (d) 153.8 – 154.7 – 156.9 – 154.3 – 152.3 – 156.1 – 152.3
Solution:
- Mean = sum of all the observation values/number of observations
Mean (a) = (0+0+0+0+1+0+0+0+0+0+0)/11 = 1/11 ≈ 0.1
Mean (b) = (2+1+2+3+1+3+0+2+4+2+2)/11 = 22/11 = 2.0
Mean (c) = (2.4+3.9+1.8+1.7+4.0+2.1+3.9+1.5+3.9+2.6)/10 = 27.8/10 = 2.78 ≈ 2.8
Mean (d) = (153.8+154.7+156.9+154.3+152.3+156.1+152.3)/7 = 1080.4/7 ≈ 154.3
- Median = the middle value of a set of ordered data
- 0 – 0 – 0 – 0 – 0 – 0 – 0 – 0 – 0 – 0 – 1
Median (a) = 0 (the 6th of the 11 ordered values)
- 0 – 1 – 1 – 2 – 2 – 2 – 2 – 2 – 3 – 3 – 4
Median (b) = 2 (the 6th of the 11 ordered values)
- 1.5 – 1.7 – 1.8 – 2.1 – 2.4 – 2.6 – 3.9 – 3.9 – 3.9 – 4.0
Median (c) = (2.4+2.6)/2 = 5/2 = 2.5
- 152.3 – 152.3 – 153.8 – 154.3 – 154.7 – 156.1 – 156.9
Median (d) = 154.3
- Mode = the most frequently observed data value
Mode (a) = 0
Mode (b) = 2
Mode (c) = 3.9
Mode (d) = 152.3
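The exercise can be cross-checked with Python's standard `statistics` module. This is only a sketch for verification; the data sets are copied from the exercise, and the dictionary name `datasets` is illustrative.

```python
import statistics

# The four data sets from the exercise.
datasets = {
    "a": [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    "b": [2, 1, 2, 3, 1, 3, 0, 2, 4, 2, 2],
    "c": [2.4, 3.9, 1.8, 1.7, 4.0, 2.1, 3.9, 1.5, 3.9, 2.6],
    "d": [153.8, 154.7, 156.9, 154.3, 152.3, 156.1, 152.3],
}

results = {}
for label, values in datasets.items():
    # Mean and median rounded to one decimal place; the mode is a data value.
    results[label] = (
        round(statistics.mean(values), 1),
        round(statistics.median(values), 1),
        statistics.mode(values),
    )
    print(label, len(values), results[label])
```

Printing the number of values next to each result makes it easy to confirm which divisor belongs in each mean and whether the median is a single middle value or the average of two.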
MODULE 3
12. Venn
- Two mutually exclusive events cannot happen at the same time. An example is getting heads and getting tails on a single coin toss: the outcome can be either heads or tails, not both at the same time.
- Two events that are not mutually exclusive can happen at the same time. An example is turning left and scratching your head: you can do both at the same time.
13. tree diagrams
A dependent event has an outcome that is affected by previous outcomes. An example is removing colored marbles from a bag without putting them back: each time you remove a marble, the chances of drawing a certain color change.
An independent event is not affected by previous events. For example, a coin does not "know" it came up heads before, so each toss of a coin is a completely isolated event.
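A small Python sketch of the contrast, using exact fractions. The bag contents (3 red and 2 blue marbles, drawn without replacement) are a hypothetical choice made only for illustration.

```python
from fractions import Fraction

# Hypothetical bag: 3 red and 2 blue marbles, drawn without replacement.
red, blue = 3, 2
total = red + blue

p_red_first = Fraction(red, total)
# After one red marble is removed, both the red count and the total shrink.
p_red_second_given_red_first = Fraction(red - 1, total - 1)
p_two_reds = p_red_first * p_red_second_given_red_first

# Coin tosses are independent: the second toss ignores the first.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads

print(p_red_first, p_red_second_given_red_first, p_two_reds)
print(p_heads, p_two_heads)
```

For the marbles the second probability differs from the first (dependence), while for the coin the probability of heads is 1/2 on every toss (independence).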
14. conditional probability
Definition: If P(F) > 0, then the probability of E given F is defined to be $P(E \mid F) = \frac{P(E \cap F)}{P(F)}$
Example 1: A machine produces parts that are either good (90%), slightly defective (2%), or obviously defective (8%). Produced parts get passed through an automatic inspection machine, which is able to detect any part that is obviously defective and discard it. What is the quality of the parts that make it through the inspection machine and get shipped?
Let G (resp., SD, OD) be the event that a randomly chosen produced part is good (resp., slightly defective, obviously defective). We are told that P(G) = 0.90, P(SD) = 0.02, and P(OD) = 0.08. We want to compute the probability that a part is good given that it passed the inspection machine (i.e., it is not obviously defective), which is:
$$P(G \mid OD^c) = \frac{P(G \cap OD^c)}{P(OD^c)} = \frac{P(G)}{1 - P(OD)} = \frac{0.90}{1 - 0.08} = \frac{90}{92} \approx 0.978$$
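The same calculation can be sketched in Python with exact fractions. The probabilities are the ones given in Example 1; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities for a produced part, as given in Example 1.
p_good = Fraction(90, 100)
p_slightly_defective = Fraction(2, 100)
p_obviously_defective = Fraction(8, 100)

# A part is shipped when it is not obviously defective.
p_shipped = 1 - p_obviously_defective

# Every good part is shipped, so P(G and shipped) = P(G).
p_good_given_shipped = p_good / p_shipped
p_slight_given_shipped = p_slightly_defective / p_shipped

print(p_good_given_shipped, round(float(p_good_given_shipped), 3))
print(p_slight_given_shipped, round(float(p_slight_given_shipped), 3))
```

The two conditional probabilities add up to 1, since a shipped part is either good or slightly defective.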
15. probability and combinations
A combination is a set of objects in which order does not matter. A permutation is an ordered combination – a set of objects in which order does matter.
nPr = n!/(n-r)!
A permutation is the choice of r things from a set of n things without replacement, where the order matters. A fixed number r of items is taken from the given set of n items.
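A short Python sketch of the formula nPr = n!/(n-r)!, checked against a brute-force count of ordered selections. The example of choosing 3 items from 5 is hypothetical, and `n_permute_r` is an illustrative name.

```python
import math
from itertools import permutations


def n_permute_r(n, r):
    """Number of ordered selections of r items from n, without replacement."""
    if not 0 <= r <= n:
        raise ValueError("require 0 <= r <= n")
    return math.factorial(n) // math.factorial(n - r)


# Hypothetical example: ordered arrangements of 3 items chosen from 5.
by_formula = n_permute_r(5, 3)
# Brute force: list every ordered selection and count them.
by_enumeration = sum(1 for _ in permutations(range(5), 3))

print(by_formula, by_enumeration)
```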
16. probability permutations
A combination is a collection of things, in which the order does not matter.
Example: How many different committees of 4 students can be chosen from a group of 15?
The formula for the number of combinations is:
$$C = \frac{n!}{r!\,(n-r)!}$$
n is the total number of objects, and
r is the number of objects chosen.
In this problem, n is the total number of students (15), and r is the number of students chosen (4).
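$$C = \frac{15!}{4! \times (15-4)!} = \frac{15!}{4! \times 11!} = \frac{15 \times 14 \times 13 \times 12 \times 11!}{4! \times 11!} = \frac{15 \times 14 \times 13 \times 12}{4 \times 3 \times 2 \times 1} = 1365$$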
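The committee count can be double-checked in Python, both with the factorial formula and by enumerating every possible committee. This is a sketch for verification only; `n_choose_r` is an illustrative name.

```python
import math
from itertools import combinations


def n_choose_r(n, r):
    """Number of unordered selections of r items from n: n! / (r! (n - r)!)."""
    if not 0 <= r <= n:
        raise ValueError("require 0 <= r <= n")
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


committees = n_choose_r(15, 4)
# Brute force: count every 4-student committee drawn from 15 students.
enumerated = sum(1 for _ in combinations(range(15), 4))

print(committees, enumerated)
```

Each committee of 4 can be ordered in 4! = 24 ways, which is why the count is 24 times smaller than the number of ordered selections of 4 students from 15.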
17. baby boys
We toss a coin 5 times to simulate how many boys a family of five children will have. A girl is considered to be "heads" (1) and a boy is "tails" (2). We repeat the five tosses for 25 simulated families in order to have a larger sample. The outcome is that only one family (number 17) has five male children. Therefore the estimated probability is 1/25 = 0.04, compared with the theoretical probability of (1/2)^5 = 1/32 ≈ 0.03.
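The coin experiment can also be simulated in Python. This is a sketch under the assumption that boys and girls are equally likely and that births are independent; the function name `simulate_families`, the seed, and the larger run of 100,000 families are illustrative choices, not part of the original experiment.

```python
import random
from fractions import Fraction


def simulate_families(n_families, n_children=5, seed=0):
    """Return the fraction of simulated families in which every child is a boy.

    Each child is one fair coin toss; True stands for a boy.
    """
    rng = random.Random(seed)
    all_boys = 0
    for _ in range(n_families):
        children = [rng.random() < 0.5 for _ in range(n_children)]
        if all(children):
            all_boys += 1
    return all_boys / n_families


# Exact probability for five independent fair tosses: (1/2)^5.
exact = Fraction(1, 2) ** 5

small_run = simulate_families(25)
large_run = simulate_families(100_000)

print(float(exact), small_run, large_run)
```

With only 25 families the estimate can only be a multiple of 1/25, so it jumps around from run to run; the larger run should land much closer to the exact value.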
MODULE 4
18. Pascal
The first step decomposes the binomial coefficients into factorial fractions, which gives the sum of two different fractions. The second step expands the brackets in the denominator of the second fraction. The third step multiplies each of the two fractions by the factor needed to bring them to a common denominator. The fourth step absorbs these factors into r! and (n-r-1)! in the respective denominators, so that both fractions get the same denominator (r+1)!(n-r)!. The fifth and sixth steps add the fractions together and factor n! out in front of the numerator. Finally, the factor (n+1) is combined with n!, which turns n!(n+1) into (n+1)! in the numerator. The final result is equal to the expanded and simplified right-hand side of the equation.
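The equation itself is not reproduced in these notes. Assuming it is Pascal's rule, C(n, r) + C(n, r+1) = C(n+1, r+1), the steps described above can be sketched as follows; the exact grouping of steps in the original derivation may differ.

```latex
\begin{align*}
\binom{n}{r} + \binom{n}{r+1}
  &= \frac{n!}{r!\,(n-r)!} + \frac{n!}{(r+1)!\,(n-r-1)!} \\
  &= \frac{n!\,(r+1)}{(r+1)!\,(n-r)!} + \frac{n!\,(n-r)}{(r+1)!\,(n-r)!} \\
  &= \frac{n!\,\bigl[(r+1) + (n-r)\bigr]}{(r+1)!\,(n-r)!} \\
  &= \frac{n!\,(n+1)}{(r+1)!\,(n-r)!} \\
  &= \frac{(n+1)!}{(r+1)!\,(n-r)!} = \binom{n+1}{r+1}
\end{align*}
```

The second line uses r!(r+1) = (r+1)! and (n-r-1)!(n-r) = (n-r)!, which is what gives both fractions the same denominator.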
19. probability distribution
Game: Spindice
20. confidence interval
A confidence interval is an interval in which a measurement or trial falls corresponding to a given probability. Usually, the confidence interval of interest is symmetrically placed around the mean, so a 50% confidence interval for a symmetric probability density function would be the interval [-a,a] such that
$$\frac{1}{2} = \int_{-a}^{a} P(x)\,dx$$
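To make the definition concrete, here is a Python sketch that finds a for a 50% interval. It assumes that P(x) is the standard normal density (mean 0, standard deviation 1), which the text above does not specify; `symmetric_interval` is an illustrative name.

```python
from statistics import NormalDist

# Assumption: the density P(x) is the standard normal (mean 0, sd 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def symmetric_interval(confidence, dist=standard_normal):
    """Return a such that [-a, a] holds the given probability.

    Valid for a distribution that is symmetric around 0.
    """
    # Half of the leftover probability lies in each tail.
    upper_point = 0.5 + confidence / 2
    return dist.inv_cdf(upper_point)


a = symmetric_interval(0.5)
# Probability actually covered by [-a, a]; should come back as 0.5.
covered = standard_normal.cdf(a) - standard_normal.cdf(-a)

print(round(a, 4), round(covered, 4))
```

Under this assumption a is about 0.67 standard deviations, so half of all measurements fall within roughly two thirds of a standard deviation of the mean.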
21. Z sources
https://www.youtube.com/watch?v=fXOS4Q3nJQY
I chose this video because it is easy to follow and the voiceover is clear. The graphics help you understand the material better and visualize the distribution as well.
22. Bernoulli
At the age of 7, Daniel expressed his desire to study mathematics, but his father encouraged him to study business instead. Daniel agreed on the condition that his father would tutor him in mathematics. His relationship with his father turned sour because of the shame his father felt after the two of them placed equally in a scientific contest at the University of Paris. His father went as far as plagiarizing Daniel's work. Despite Daniel's efforts at reconciliation, his father held the grudge until he died.
MODULE 5
23. sampling
In a systematic sample, the elements of the population are put into a list and then every kth element in the list is chosen (systematically) for inclusion in the sample. For example, if the population of study contained 2,000 students at a high school and the researcher wanted a sample of 100 students, the students would be put into list form and then every 20th student would be selected for inclusion in the sample.
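The high school example can be sketched in Python. Starting from a random position within the first k elements is a common convention and an assumption here, since the text does not say where the selection starts; `systematic_sample` is an illustrative name and the student IDs are placeholders.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th element after a random start, k = len(population) // sample_size."""
    k = len(population) // sample_size
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    # Assumption: the first pick is chosen at random among the first k elements.
    start = random.Random(seed).randrange(k)
    return population[start::k][:sample_size]


# Placeholder list of 2,000 student IDs; a sample of 100 means k = 20.
students = list(range(1, 2001))
sample = systematic_sample(students, 100, seed=1)

print(len(sample), sample[:5])
```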
A stratified sample is a probability sampling technique in which the researcher divides the entire target population into different subgroups, or strata, and then randomly selects the final subjects proportionally from the different strata. This type of sampling is used when the researcher wants to highlight specific subgroups within the population.
For example, to obtain a stratified sample of university students, the researcher would first organize the population by college class and then select appropriate numbers of freshmen, sophomores, juniors, and seniors.
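A proportional stratified sample can be sketched in Python as follows. The class sizes are hypothetical, `stratified_sample` is an illustrative name, and rounding each stratum's share is a simplification that can make the total differ slightly from the requested sample size.

```python
import random


def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each stratum, proportional to its size.

    strata maps a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum contributes in proportion to its share of the population.
        share = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, share)
    return sample


# Hypothetical class sizes for a university of 1,000 students.
sizes = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
strata = {
    name: [f"{name}-{i}" for i in range(size)] for name, size in sizes.items()
}
chosen = stratified_sample(strata, 100, seed=1)

print({name: len(members) for name, members in chosen.items()})
```

Unlike a simple random sample of 100 students, this guarantees that every class is represented in proportion to its size.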
24. search
Search: "Canadian mathematics"
First result: Canadian Mathematical Society
https://cms.math.ca/
Second result: Sun Life Financial Canadian Mathematical Olympiad
http://cms.math.ca/Competitions/CMO/
Search: illicit drugs -cocaine -heroin
First result: illicit - definition of illicit by the Free Online Dictionary
http://www.thefreedictionary.com/illicit
Second result: Policy – European Commission
http://ec.europa.eu/health/drugs/policy/index_en.htm
25. questionnaire
1) What is your gender?
2) Do you participate in extracurricular activities?
3) If yes, what kind of extracurricular activities?
4) What is your ethnicity?
5) What is your unweighted GPA?
6) What is your weighted GPA?
26. characteristics
What is the influence of above-the-line (ATL) marketing campaigns on the purchasing behavior of Hispanics related to dairy products?
27. thesis
The article outlines the problem that the world's population is increasing rapidly and will continue to rise. Moreover, if it keeps rising, as many resources will have to be produced in the next 40 years as were produced over the past 800 years.
The thesis of the article is that if the population reaches nine billion in 2050, there will not be enough resources to support it.