IM Alg1.1.14 Lesson: Outliers

The histogram and box plot show the average amount of money, in thousands of dollars, spent on each person in the country (per capita spending) for health care in 34 countries.
[img][/img][br][br]One value in the set is an [b]outlier[/b]. Which one is it?[br]
What is its approximate value?
By one rule for deciding, a value is an outlier if it is more than 1.5 times the IQR greater than Q3. Show on the box plot whether or not your value meets this definition of outlier.
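Written as an inequality, the rule above can be sketched as follows. The symbol x (the value being checked) and the abbreviation IQR as a formula term are notation chosen here for illustration.

```latex
\[
x > Q_3 + 1.5 \times \mathrm{IQR}, \qquad \text{where } \mathrm{IQR} = Q_3 - Q_1
\]
```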
Here is the data set used to create the histogram and box plot from the warm-up.
[table][tr][td]1.0803[/td][td]1.0875[/td][td]1.4663[/td][td]1.7978[/td][td]1.9702[/td][td]1.9770[/td][td]1.9890[/td][td]2.1011[/td][td]2.1495[/td][td]2.2230[/td][/tr][tr][td]2.5443[/td][td]2.7288[/td][td]2.7344[/td][td]2.8223[/td][td]2.8348[/td][td]3.2484[/td][td]3.3912[/td][td]3.5896[/td][td]4.0334[/td][td]4.1925[/td][/tr][tr][td]4.3763[/td][td]4.5193[/td][td]4.6004[/td][td]4.7081[/td][td]4.7528[/td][td]4.8398[/td][td]5.2050[/td][td]5.2273[/td][td]5.3854[/td][td]5.4875[/td][/tr][tr][td]5.5284[/td][td]5.5506[/td][td]6.6475[/td][td]9.8923[/td][td][/td][td][/td][td][/td][td][/td][td][/td][td][/td][/tr][/table]
Use technology to find the mean, standard deviation, and five-number summary.
[size=150]The maximum value in this data set represents the spending for the United States.[br][/size][br]Should the per capita health spending for the United States be considered an outlier? Explain your reasoning.[br]
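One possible way to carry out the "use technology" step is a short Python script. This is a minimal sketch under stated assumptions, not the only valid approach: it takes Q1 and Q3 to be the medians of the lower and upper halves of the sorted data, and it uses the population standard deviation. Other tools may follow different conventions and report slightly different values. The names `spending` and `five_number_summary` are illustrative.

```python
from statistics import mean, median, pstdev

# Per capita health spending (thousands of dollars) for the 34 countries.
spending = [
    1.0803, 1.0875, 1.4663, 1.7978, 1.9702, 1.9770, 1.9890, 2.1011, 2.1495, 2.2230,
    2.5443, 2.7288, 2.7344, 2.8223, 2.8348, 3.2484, 3.3912, 3.5896, 4.0334, 4.1925,
    4.3763, 4.5193, 4.6004, 4.7081, 4.7528, 4.8398, 5.2050, 5.2273, 5.3854, 5.4875,
    5.5284, 5.5506, 6.6475, 9.8923,
]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle value when the count is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


minimum, q1, med, q3, maximum = five_number_summary(spending)
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # the rule from the warm-up

print("mean:", round(mean(spending), 4))
print("standard deviation (population, assumed):", round(pstdev(spending), 4))
print("five-number summary:", minimum, q1, med, q3, maximum)
print("Q3 + 1.5 * IQR:", round(upper_fence, 4))
print("maximum is beyond the fence:", maximum > upper_fence)
```

The last printed line only reports whether the maximum meets the 1.5 times IQR rule; explaining what that means for the United States is still part of the question.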
[size=150]Although outliers should not be removed without considering their cause, it is important to see how influential outliers can be for various statistics. Remove the value for the United States from the data set.[/size]
Use technology to find the mean, standard deviation, and five-number summary.
How do the mean, standard deviation, median, and interquartile range of the data set with the outlier removed compare to the same summary statistics of the original data set?[br]
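The comparison can be sketched in Python by summarizing the data twice, once with and once without the maximum value. The same assumptions apply as for any such sketch: quartiles are the medians of the lower and upper halves, and the standard deviation is the population version. The helper name `summarize` is hypothetical.

```python
from statistics import mean, median, pstdev

# Per capita health spending (thousands of dollars), sorted in increasing order.
spending = [
    1.0803, 1.0875, 1.4663, 1.7978, 1.9702, 1.9770, 1.9890, 2.1011, 2.1495, 2.2230,
    2.5443, 2.7288, 2.7344, 2.8223, 2.8348, 3.2484, 3.3912, 3.5896, 4.0334, 4.1925,
    4.3763, 4.5193, 4.6004, 4.7081, 4.7528, 4.8398, 5.2050, 5.2273, 5.3854, 5.4875,
    5.5284, 5.5506, 6.6475, 9.8923,
]


def summarize(values):
    """Mean, population SD, median, and IQR (quartiles as medians of halves)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return {
        "mean": mean(data),
        "standard deviation": pstdev(data),
        "median": median(data),
        "IQR": q3 - q1,
    }


with_us = summarize(spending)
# The list is sorted, so dropping the last entry removes the maximum.
without_us = summarize(spending[:-1])

for name in with_us:
    change = without_us[name] - with_us[name]
    print(f"{name}: {with_us[name]:.4f} -> {without_us[name]:.4f} (change {change:+.4f})")
```

Comparing the size of each printed change shows which statistics are more sensitive to the single removed value.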
The number of property crime (such as theft) reports is collected for 50 colleges in California.
[size=150]Some summary statistics are given:[br]15 17 27 31 33 39 39 45 46 48 49 51 52 59 72 72 75 77[br]77 83 86 88 91 99 103 112 136 139 145 145 175 193 198 213[br]230 256 258 260 288 289 337 344 418 424 442 464 555 593 699 768[/size][br][br][list][size=150][*]mean: 191.1 reports[/*][*]minimum: 15 reports[/*][*]Q1: 52 reports[/*][*]median: 107.5 reports[/*][*]Q3: 260 reports[/*][*]maximum: 768 reports[/*][/size][/list][br]Are any of the values outliers? Explain or show your reasoning.[br]
If there are any outliers, why do you think they might exist? 
Should they be included in an analysis of the data?
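For the property crime data, the 1.5 times IQR rule can be checked with a few lines of Python. This sketch uses the Q1 and Q3 values given in the activity and assumes the usual two-sided version of the rule (values below Q1 minus 1.5 times the IQR also count). The variable names are illustrative.

```python
# Property crime reports for the 50 colleges, as listed in the activity.
reports = [
    15, 17, 27, 31, 33, 39, 39, 45, 46, 48, 49, 51, 52, 59, 72, 72, 75, 77,
    77, 83, 86, 88, 91, 99, 103, 112, 136, 139, 145, 145, 175, 193, 198, 213,
    230, 256, 258, 260, 288, 289, 337, 344, 418, 424, 442, 464, 555, 593, 699, 768,
]

# Quartiles taken from the summary statistics given in the activity.
q1, q3 = 52, 260
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Values strictly beyond either fence meet the rule.
outliers = [r for r in reports if r < low_fence or r > high_fence]

print("IQR:", iqr)
print("fences:", low_fence, high_fence)
print("values beyond the fences:", outliers)
```

The script only identifies which values meet the rule. Why such values exist, and whether to keep them, depends on the context of the colleges and cannot be decided by the calculation alone.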
The situations described here each have an outlier.
[size=150]For each situation, how would you determine if it is appropriate to keep or remove the outlier when analyzing the data? [br][/size][i]Discuss your reasoning with your partner.[/i][br][br]A number cube has sides labelled 1–6. After rolling 15 times, Tyler records his data:[br]1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 20
The dot plot represents the distribution of the number of siblings reported by a group of 20 people.[br][img][/img]
In a science class, 11 groups of students are synthesizing biodiesel. At the end of the experiment, each group recorded the mass in grams of the biodiesel they synthesized. The masses of biodiesel are[br]0, 1.245, 1.292, 1.375, 1.383, 1.412, 1.435, 1.471, 1.482, 1.501, 1.532
Look back at some of the numerical data you and your classmates collected in the first lesson of this unit.
Are any of the values outliers? Explain or show your reasoning.
If there are any outliers, why do you think they might exist?
Should they be included in an analysis of the data?

IM Alg1.1.14 Practice: Outliers

The number of letters received in the mail over the past week is recorded.
2 3 5 5 5 15
Elena collects 112 specimens of beetle and records their lengths for an ecology research project. When she returns to the laboratory, Elena finds that she incorrectly recorded one of the lengths of the beetles as 122 centimeters (about 4 feet).
What should she do with the outlier, 122 centimeters, when she analyzes her data?
Mai took a survey of students in her class to find out how many hours they spend reading each week.
[size=150]Here are some summary statistics for the data that Mai gathered:[br][/size][br][table][tr][td]mean: 8.5 hours[/td][td][/td][td]standard deviation: 5.3 hours[/td][/tr][tr][td]median: 7 hours[/td][td]Q1: 5 hours[/td][td]Q3: 11 hours[/td][/tr][/table][br]Give an example of a number of hours larger than the median which would be an outlier. Explain your reasoning.[br]
Are there any outliers below the median? Explain your reasoning.[br]
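The two cutoffs implied by Mai's summary statistics can be worked out with a small Python sketch. It assumes the 1.5 times IQR rule applied on both sides of the box. The function name `is_outlier` is hypothetical.

```python
# Quartiles (in hours) from Mai's summary statistics.
q1, q3 = 5, 11
iqr = q3 - q1

high_fence = q3 + 1.5 * iqr  # values above this are outliers
low_fence = q1 - 1.5 * iqr   # values below this are outliers


def is_outlier(hours):
    """True when the value lies strictly beyond either fence."""
    return hours < low_fence or hours > high_fence


print("IQR:", iqr)
print("high fence:", high_fence)
print("low fence:", low_fence)
print("is 21 hours an outlier?", is_outlier(21))
```

Note that a number of hours spent reading cannot be negative, which matters when interpreting the low fence.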
The box plot shows the statistics for the weight, in pounds, of some dogs.
[img][/img][br]Are there any outliers? Explain how you know. 
The mean exam score for the first group of twenty examinees applying for a security job is 35.3 with a standard deviation of 3.6.
[size=150]The mean exam score for the second group of twenty examinees is 34.1 with a standard deviation of 0.5. Both distributions are close to symmetric in shape.[br][/size][br]Use the mean and standard deviation to compare the scores of the two groups.
The minimum score required to get an in-person interview is 33. Which group do you think has more people get in-person interviews?
A group of pennies made in 2018 are weighed. The mean is approximately 2.5 grams with a standard deviation of 0.02 grams.
Interpret the mean and standard deviation in terms of the context. 
These values represent the expected number of paintings a person will produce over the next 10 days.
[size=150]0, 0, 0, 1, 1, 1, 2, 2, 3, 5[/size][br][br]What are the mean and standard deviation of the data?[br]
The artist is not pleased with these statistics. If the 5 is increased to a larger value, how does this impact the median, mean, and standard deviation? [br]
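A quick way to explore both questions is to compute the statistics before and after changing the largest value. This is a sketch only: it assumes the population standard deviation, and the replacement value 20 is a made-up example, since any value larger than 5 could be tried.

```python
from statistics import mean, median, pstdev

paintings = [0, 0, 0, 1, 1, 1, 2, 2, 3, 5]


def describe(values):
    """Return (median, mean, population standard deviation)."""
    return median(values), mean(values), pstdev(values)


original = describe(paintings)

# Hypothetical change: replace the 5 (the last entry) with 20.
changed = describe(paintings[:-1] + [20])

labels = ("median", "mean", "standard deviation")
for label, before, after in zip(labels, original, changed):
    print(f"{label}: {before} -> {after}")
```

Trying a few different replacement values shows which of the three statistics respond to the change and which do not.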
List the four dot plots in order of variability from least to greatest.
