Discussion 1
If you were given a large data set such as the sales over the last year of our top 1,000 customers, what might you be able to do with this data? What might be the benefits of describing the data?
Given such a large data set, many useful conclusions can be drawn using descriptive statistics alone. First of all, the average check can be found. This value is useful for planning purposes, for example for estimating how much profit advertising will bring. For the same purpose, the median check can be found and compared with the average check. This comparison shows whether the customers are uniform in their spending or whether a few strategic customers account for most of it. Next, the mode of all the goods purchased can be found, which shows which goods are sold most often and which are not. This gives an opportunity to improve the assortment by removing items that are seldom sold and adding more of those that are sold most often. It also reveals the customers' preferences and suggests what can additionally be offered to customers who choose specific goods. The same analysis can be performed for each season separately to identify seasonal patterns in demand, which makes it possible to create seasonal offers or change the assortment accordingly.
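As a rough illustration of the mean, median, and mode comparison described above, here is a minimal Python sketch. The sales figures, the list of goods, and the "twice the median" threshold are all made-up assumptions for the example, not real customer data or an established rule.

```python
from statistics import mean, median, mode

# Hypothetical yearly sales totals (in dollars) for a handful of customers.
# The real data set would hold 1,000 customers; these values are invented.
sales = [1200, 1500, 1300, 1400, 25000, 1250, 1350]

# Hypothetical list of goods purchased, one entry per sale.
goods = ["pens", "paper", "pens", "ink", "pens", "paper"]

average_check = mean(sales)    # pulled upward by the one very large customer
median_check = median(sales)   # the "typical" customer, robust to outliers
most_sold = mode(goods)        # the good that appears most often

# Arbitrary illustrative threshold: a mean more than twice the median
# hints that a few large customers dominate total spending.
few_big_spenders = average_check > 2 * median_check

print(f"average check: {average_check:.2f}")
print(f"median check:  {median_check}")
print(f"most sold:     {most_sold}")
print(f"dominated by a few big spenders: {few_big_spenders}")
```

Running the same calculation on the sales of each season separately would give the seasonal comparison.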
Discussion 2
A social security number has 9 digits. Each digit can take one of 10 values (0-9), so the probability of guessing one digit right is 1/10. Since there is only one guess in which all the digits must be right, and each digit is guessed independently, the probabilities of getting each digit right should be multiplied: 1/10 * 1/10 * 1/10 * 1/10 * 1/10 * 1/10 * 1/10 * 1/10 * 1/10 = 1/10^9. So the chance of guessing a social security number at random is very small: 1 in a billion. Such odds make guessing it at random practically impossible and ensure a high level of security.
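The multiplication can be checked with exact fractions. This is only a sketch under the same assumption as the calculation above, namely that every digit is equally likely and independent of the others; the function name is made up for the example.

```python
from fractions import Fraction

def guess_probability(unknown_digits: int) -> Fraction:
    """Chance of guessing `unknown_digits` decimal digits in a single try,
    assuming each digit is independent and equally likely (0-9)."""
    return Fraction(1, 10) ** unknown_digits

p_full_number = guess_probability(9)
print(p_full_number)  # 1/1000000000
```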
In the past, many teachers posted grades along with the last four digits of the students' social security numbers. If someone already knows the last four digits of your social security number, what is the probability that if they randomly generated the other digits, they would match yours? Is there something wrong with that?
If the last four digits of the social security number are known, then randomly guessing the rest gets much easier. Using the calculation strategy from the previous question, we obtain: 1/10 * 1/10 * 1/10 * 1/10 * 1/10 = 1/10^5, or 1 in a hundred thousand. From one point of view, this is still a small chance for someone to simply guess the number, so using the last four digits for identification when posting grades might not be the worst idea. On the other hand, social security numbers are not totally random, which makes it much easier to guess the actual number. A social security number consists of three parts: the first three digits are the area number, which represents the geographical region where the number was issued; the next two digits are the group number; and the last four are a straight numerical sequence within the group. The last four digits are in fact the hardest to guess because they are just a straight numerical sequence. The first three digits can be guessed by looking up the code of the area where a person lives. So, if the first three digits, the area number, are also identified, only the two digits of the group number remain unknown, and the chance of guessing the number rises to 1/100. This is already a real possibility for someone to guess the number and use it for unethical purposes. So, in my opinion, no part of the social security number should be revealed publicly.
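To see how quickly the number of possibilities shrinks as more digits become known, the three cases discussed above can be put side by side. This is a minimal sketch that treats every unknown digit as equally likely and independent, which is a simplifying assumption; the names used here are made up for the example.

```python
TOTAL_DIGITS = 9  # length of a social security number

def candidates_left(known_digits: int) -> int:
    """How many numbers remain possible once `known_digits` digits are known,
    assuming every unknown digit can be any of 0-9."""
    return 10 ** (TOTAL_DIGITS - known_digits)

# Scenario label -> number of digits the guesser already knows.
scenarios = {
    "nothing known": 0,
    "last four digits known": 4,
    "last four digits and area number known": 7,
}

for label, known in scenarios.items():
    print(f"{label}: 1 chance in {candidates_left(known):,}")
```

With only 100 candidates left in the last case, trying every one of them is easy.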