# Chapter 3 STATS

question
The ______ is another term for the arithmetic average.
mean note: The mean is another term for the arithmetic average. It can be thought of as the balancing point of a distribution of data.
question
3.1.RA-2 Fill in the blank below. The mean of a collection of data is located at the​ ______ of a distribution of data.
The mean of a collection of data is located at the​ "balancing point" of a distribution of data.
question
The symbol ∑ stands for which of the​ following? Multiplication Summation Division Finding the mean
note: The symbol ∑ stands for summation. If x represented a single​ observation, then ∑x would mean that all the values should be added together.
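As a minimal sketch of what ∑x means in practice, the mean can be computed by adding every observation and dividing by the number of observations. The function name `mean` and the sample values are illustrative assumptions, not part of the original card.

```python
def mean(values):
    """Arithmetic mean: the sum of all observations (∑x) divided by their count (n)."""
    return sum(values) / len(values)

# Hypothetical observations, chosen only for illustration.
data = [2, 4, 6, 8]
print(mean(data))  # 5.0
```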
question
3.1.RA-4 The mean represents the typical value in a set of data for what type of​ distribution? A. For distributions that are roughly symmetric B. For distributions that are bimodal C. For distributions that are skewed D. For all distributions
A. For distributions that are roughly symmetric
question
The​ ______ is a number that measures how far away the typical observation is from the mean.
answer: standard deviation note: For most​ distributions, a majority of observations are within one standard deviation of the mean value.
question
3.1.RA-6 In a​ symmetric, unimodal​ distribution, about​ two-thirds of the observations are​ where? A. Within three standard deviations of the mean B. Within two standard deviations of the mean C. More than one standard deviation from the mean D. Within one standard deviation of the mean
D. Within one standard deviation of the mean
question
3.1.RA-7 To compute the variance, what should one do? A. Double the standard deviation. B. Square the standard deviation. C. Divide the standard deviation by n − 1. D. Take the square root of the standard deviation.
note: The variance is the square of the standard deviation. It is represented symbolically by $s^2$.
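The relationship between the two measures can be sketched as follows. This assumes the sample versions (dividing by n − 1), which is what the symbol s usually denotes; the function names and data are hypothetical.

```python
import math

def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1 (assumed)."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical data for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s = sample_sd(data)
print(s ** 2, sample_variance(data))  # squaring s recovers the variance (up to rounding)
```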
question
For most applications, why is the standard deviation preferred over the variance? A. The units for the variance are always squared. B. The standard deviation is easier to calculate than the variance. C. The standard deviation more accurately measures the variability in a distribution. D. All of the above
A. The units for the variance are always squared. note: The units for the variance are always squared. Measuring spread with the variance would therefore put spread in different units from the center, whereas the standard deviation is in the same units as the data and the mean.
question
3.1.1 A sociologist​ says, "Typically, men in a certain country still earn more than​ women." What does this statement​ mean? A. The center of the distribution of salaries for men in the country is greater than the center for women. B. The highest paid people in the country are men. C. All​ women's salaries in the country are less varied than all​ men's salaries. D. All men make more than all women in the country.
A. The center of the distribution of salaries for men in the country is greater than the center for women. note: In a distribution of​ values, the typical value is given by the mean. In this​ case, the average salary of a man is higher than that of a​ woman, so when comparing the distribution of​ men's salaries to​ women's salaries, the center of the distribution for men is greater than the center of the distribution for the women.
question
3.1.19 In a recent​ competition, do you think the standard deviation of the running times for all men who ran the​ 100-meter race would be larger or smaller than the standard deviation of the running times for the​ men's marathon? Explain. A. The standard deviation for the​ 100-meter event would be less. All the runners come to the finish line within a few seconds of each other. In the​ marathon, the runners can be quite widely spread after running that long distance. B. The standard deviation for the marathon event would be less. Many more runners compete in a marathon rather than a​ 100-meter event.​ Therefore, the average time will be determined with greater precision. C. The standard deviation for the marathon event would be less. All the runners finish the race in a matter of seconds. In the​ marathon, the runners can be quite widely spread after running that long distance. D. The standard deviation for the​ 100-meter event would be less. All the runners finish the race in a matter of seconds. In the​ marathon, the runners take at least a few hours to complete the course.
note: Since the running times in the 100-meter event will be within a few seconds of each other, the running times will have small variation. In the marathon, since the running times are likely to be minutes apart, the times will have greater variation. Thus, the marathon running times will have a greater standard deviation.
question
3.2.RA-1 According to the Empirical​ Rule, ________ will be within two standard deviations of the mean.
note: According to the Empirical Rule, if a distribution is unimodal and symmetric, approximately 95% of the observations will be within two standard deviations of the mean.
question
3.2.RA-2 The Empirical Rule applies to distributions that are​
answer: symmetric and unimodal. note: According to the Empirical​ Rule, if a distribution is unimodal and​ symmetric, approximately​ 68% of the observations will be within one standard deviation of the​ mean, approximately​ 95% of the observations will be within two standard deviations of the​ mean, and nearly all the observations will be within three standard deviations of the mean.
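The Empirical Rule can be checked informally by simulation. The sketch below draws from a normal distribution (an assumed stand-in for "unimodal and symmetric") and counts the fraction of observations within k standard deviations of the mean. The seed, sample size, and parameters are arbitrary choices.

```python
import random
import statistics

def fraction_within(values, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return sum(1 for x in values if abs(x - m) <= k * s) / len(values)

random.seed(0)  # fixed seed so the run is repeatable
sample = [random.gauss(100, 15) for _ in range(10_000)]

for k in (1, 2, 3):
    # Expect roughly 0.68, 0.95, and nearly 1.0 respectively.
    print(k, round(fraction_within(sample, k), 3))
```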
question
3.2.RA-3 A standard unit measures which of the​ following? A. How many standard deviations away an observation is from the mean B. How many standard deviations away an observation is from the median C. The magnitude of the standard deviation D. The interval within which approximately​ 68% of the observations fall
A. How many standard deviations away an observation is from the mean note: A standard unit is how many standard deviations away an observation is from the mean. A measurement converted to standard units is called a​ z-score.
question
3.2.RA-4 If an observation has a​ z-score of​ 0, this means which of the​ following? Choose the correct answer below. A. The observation is equal to the standard deviation. B. The observation is equal to the median. C. The​ z-score was computed incorrectly. D. The observation is equal to the mean.
D. The observation is equal to the mean. note: If an observation has a​ z-score of​ 0, then it is equal to the mean. The mean is 0 standard deviations away from​ itself, so it has a​ z-score of 0.
question
3.2.RA-5 Which of the following can be used to compare values measured in different​ units, such as inches and​ pounds? ​z-score standard deviation standard error interquartile range
answer: ​z-score note: The​ z-score measures distance from a mean in terms of standard​ deviations, so it can be used to compare values measured in different​ units, such as inches and pounds.
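A short sketch of the z-score calculation, using made-up means and standard deviations for height (inches) and weight (pounds). The numbers are hypothetical and serve only to show that z-scores put both measurements on a common, unit-free scale.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical population values, for illustration only.
height_z = z_score(70, mean=66, sd=4)     # inches
weight_z = z_score(180, mean=150, sd=20)  # pounds

print(height_z)  # 1.0
print(weight_z)  # 1.5
print(z_score(66, mean=66, sd=4))  # 0.0, an observation equal to the mean
```

Here the weight is the more unusual value, since it is farther from its mean in standard units.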
question
The value that would be right in the middle if you were to sort the data from smallest to largest is called the
median note: The median is the value that would be right in the middle if you were to sort the data from smallest to largest. About​ 50% of the observations are below it and about​ 50% of the observations are above it.
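A minimal sketch of finding the median by sorting. When the number of observations is even, it assumes the usual convention of averaging the two middle values; the function name and data are illustrative.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

print(median([3, 1, 2]))     # 2
print(median([4, 1, 3, 2]))  # 2.5
```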
question
For what purpose is the median​ used? A. To give the spread of a distribution B. To measure the variation of a data set C. To give a typical value of a data set D. None of these
C. To give a typical value of a data set note: The median is a typical value of a data set. It is used particularly when the distribution is skewed.
question
The median is often used for which of the following types of​ distribution? Uniform Skewed Symmetric Bimodal
Skewed note: The median is often used for skewed distributions. The mean is not often used for skewed distributions because skew affects the mean more than it affects the median.
question
When a distribution is​ skewed, the​ _______ is used to measure the center and the​ _______ is used to measure variation.
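answer: median; interquartile range note: When a distribution is skewed, the median and interquartile range are used to measure the center and variation, respectively. The mean and standard deviation are used when a distribution is symmetric.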
question
The interquartile range tells us how much space the​ _____ of the data occupy.
note: The interquartile range tells us how much space the middle 50% of the data occupy. It is found by subtracting the first quartile from the third quartile.
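The interquartile range is the third quartile minus the first quartile (IQR = Q3 − Q1). The sketch below uses Python's `statistics.quantiles` with its default method. Quartile conventions differ between textbooks and software, so the exact cut points here are an assumption and may not match a given course's method.

```python
import statistics

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical data for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(iqr(data))  # 6.0 (Q1 = 3, Q3 = 9 under this convention)
```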
question
Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set. What are two measures of the center of a​ distribution? interquartile range and standard deviation first quartile and third quartile median and mean
median and mean
question
Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set. Under what conditions is the median​ preferred? A. The median is preferred when there are few data points. B. The median is preferred when the data is strongly skewed or has outliers. C. The median is preferred when there are many data points. D. The median is preferred when the data is relatively symmetric.
B. The median is preferred when the data is strongly skewed or has outliers. note: The median provides a better measure of center when the data is skewed or has outliers because the presence of an outlier has a much greater effect on the mean.
question
Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set. Under what conditions is the mean​ preferred? A. The mean is preferred when the data is relatively symmetric. B. The mean is preferred when the data is strongly skewed or has outliers. C. The mean is preferred when there are few data points. D. The mean is preferred when there are many data points.
A. The mean is preferred when the data is relatively symmetric.
question
In a​ right-skewed distribution, which of the following is​ true? A. The mean and median are approximately the same. B. The mean tends to be greater than the median. C. The mean tends to be less than the median. D. None of these
B. The mean tends to be greater than the median. note: The mean tends to be greater than the median in a​ right-skewed distribution. This is because the higher values to the right of the center pull the mean up more than they affect the median.
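A small made-up data set with a long right tail illustrates the effect. The values are hypothetical.

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is far to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

mean_value = statistics.mean(right_skewed)
median_value = statistics.median(right_skewed)

print(mean_value)    # 4.75
print(median_value)  # 3.0
```

The single large value pulls the mean above the median, while the median stays with the bulk of the data.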
question
If the mean and the median of a distribution are approximately the​ same, then the shape of the distribution is likely to be​ _______.
note: If the mean and the median of a distribution are approximately the​ same, then the shape of the distribution is likely to be symmetric.
question
When a distribution contains​ outliers, which of the following is the best choice for a measure of​ center? Choose the correct answer below. Interquartile range Mean Median Standard deviation
note: The median is resistant to​ outliers, so when a distribution contains​ outliers, the median is the best choice for a measure of center.
question
Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme​ value, we say the median is​ _____ to outliers.
note: The median is resistant to outliers. This makes it a good choice for a measure of center when a distribution is skewed.
question
When comparing​ groups, if one group is strongly skewed or has outliers and the other is​ symmetric, which of the following should be used to compare the​ groups? A. The median and interquartile range for the skewed group and the mean and standard deviation for the symmetric group B. The mean and standard deviation for the skewed group and the median and interquartile range for the symmetric group C. The means and standard deviations D. The medians and interquartile ranges
D. The medians and interquartile ranges note: When comparing two​ distributions, one should always use the same measures of center and spread for both distributions. Since the mean will be affected by the skew or outliers in the first​ distribution, use the median and interquartile range for both distributions.
question
In your own​ words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What​ action(s) should be taken with an​ outlier?
Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are​ mistakes, they might be removed or corrected. If they are not​ mistakes, you might do the analysis​ twice, once with and once without the outliers. note: Outliers are observed values that lie outside the range of the main group of data. When an outlier is​ present, the observer needs to consider it more closely. Sometimes it is just a mistake that happened while collecting the data and can be corrected or discarded. Other times it is a legitimately observed value. In those​ cases, the analysis needs to be presented once with outliers and once without outliers to give a better idea of what a typical value is. Note that in​ statistics, potential outliers are defined as observations that are more than 1.5 interquartile ranges below the first quartile or above the third​ quartile, not above or below the median. Also note that a potential outlier is not the same thing as an outlier.
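The 1.5 IQR rule for potential outliers can be sketched as below. The quartiles come from `statistics.quantiles` with its default method, which is an assumption; a textbook may compute quartiles slightly differently. The function name and data are hypothetical.

```python
import statistics

def potential_outliers(values):
    """Values more than 1.5 IQRs below the first quartile or above the third quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

# Hypothetical data with one value far from the main group.
data = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 40]
print(potential_outliers(data))  # [40]
```

A flagged value is only a potential outlier; whether to correct it, remove it, or keep it still depends on context.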
question
Which measure of the center​ (mean or​ median) is more resistant to​ outliers, and what does​ "resistant to​ outliers" mean?
The median is more resistant, which indicates that it usually changes less than the mean when comparing data with and without outliers. note: The median is more resistant to outliers than the mean, especially when the outliers have extreme values. An extreme value pulls the mean heavily in its direction. The median depends only on the position of the middle value after ordering all the observations from lowest to highest, so how far it shifts depends on the number of observations rather than on the size of the outlier. If there is only one outlier with an extremely large value, the median will shift very slightly, while the mean will change significantly.
question
A dieter recorded the number of calories he consumed at lunch for one week. As you can​ see, a mistake was made on one entry. The calories are listed in increasing order below. 349, 371, 386, 398, 412, 4190 When the error is corrected by removing the extra​ 0, will the mean​ change? Will the​ median? Explain without doing any calculation.
note: The mean will change, but the median will not. The median is resistant to outliers and extreme values because it depends only on the middle of the data after ordering from lowest to highest. Correcting the highest value does not change the order, and so it does not change the median. The mean is the balancing point for the data set. When looking at the shape of a histogram, the mean is the point which balances the weight on both sides. If an extreme value is placed on one side of the mean, the mean has to shift in that direction to keep everything balanced.
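The question asks for reasoning without calculation, but the conclusion can be confirmed with the calorie values from the card. This is only a check; the variable names are illustrative.

```python
import statistics

recorded = [349, 371, 386, 398, 412, 4190]  # as listed, with the extra 0
corrected = [349, 371, 386, 398, 412, 419]  # after removing the extra 0

# The median is the average of the two middle values, which are untouched.
print(statistics.median(recorded), statistics.median(corrected))  # 392.0 392.0

# The mean uses every value, so the correction pulls it down sharply.
print(round(statistics.mean(recorded), 2))   # 1017.67
print(round(statistics.mean(corrected), 2))  # 389.17
```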
question
Why is the mean different from the​ median?
note: The median gives a better measure of center for this distribution because the​ professor's age is an outlying observation. The median also tends to give a better representation of a typical observation in a skewed distribution.
question
In a​ boxplot, the vertical line inside the box marks the location of the
median.
question
The length of the box in a boxplot is proportional to which of the​ following? Choose the correct answer below. IQR Mean Median Standard deviation
note: The length of the box in a boxplot is proportional to the IQR. The left edge of the box is at the first quartile and the right edge is at the third quartile.
question
In a​ boxplot, potential outliers are points that are more than​ ___ IQRs from the edges of the box.
In a​ boxplot, potential outliers are points that are more than 1.5 IQRs below the first quartile or above the third quartile.
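The pieces of a boxplot can be sketched numerically. This assumes the common convention that each whisker ends at the most extreme observation that is not a potential outlier, and it uses the default quartile method of `statistics.quantiles`, which may differ from a textbook's. The function name and data are hypothetical.

```python
import statistics

def boxplot_summary(values):
    """Box edges, median, IQR, and whisker ends under the 1.5 IQR rule (assumed convention)."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme values that are still inside the fences.
    inside = [x for x in values if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }

# Hypothetical data with one potential outlier (40).
summary = boxplot_summary([2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 40])
print(summary)
```

In this example the upper whisker ends at 12, not at 40, because 40 lies beyond the upper fence and would be drawn as a separate point.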
question
In a​ boxplot, the whiskers extend to which of the​ following? Choose the correct answer below. A. The smallest and largest values in the data set B. To the most extreme values that are not potential outliers C. To the first and third quartiles D. None of these