question

The ______ is another term for the arithmetic average.

answer

mean
note: The mean is another term for the arithmetic average. It can be thought of as the balancing point of a distribution of data.

question

3.1.RA-2 Fill in the blank below.
The mean of a collection of data is located at the ______ of a distribution of data.

answer

The mean of a collection of data is located at the "balancing point" of a distribution of data.

question

The symbol ∑ stands for which of the following?
Multiplication
Summation
Division
Finding the mean

answer

note: The symbol ∑ stands for summation. If x represented a single observation, then ∑x would mean that all the values should be added together.
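As a minimal sketch of the notation, assuming a small made-up data set (the values are illustrative and not from the source), ∑x is the sum of all observations, and dividing it by the number of observations gives the mean:

```python
# Hypothetical observations, chosen only for illustration.
x = [2, 4, 6, 8]

total = sum(x)          # what ∑x denotes: 2 + 4 + 6 + 8
mean = total / len(x)   # arithmetic average: ∑x / n

# The mean is the balancing point: deviations from it sum to zero.
deviations = [value - mean for value in x]

print(total, mean, sum(deviations))
```

The zero sum of the deviations is one way to read the "balancing point" description of the mean.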

question

3.1.RA-4 The mean represents the typical value in a set of data for what type of distribution?
A. For distributions that are roughly symmetric
B. For distributions that are bimodal
C. For distributions that are skewed
D. For all distributions

answer

A. For distributions that are roughly symmetric

question

The ______ is a number that measures how far away the typical observation is from the mean.

answer

answer: standard deviation
note: For most distributions, a majority of observations are within one standard deviation of the mean value.

question

3.1.RA-6 In a symmetric, unimodal distribution, about two-thirds of the observations are where?
A. Within three standard deviations of the mean
B. Within two standard deviations of the mean
C. More than one standard deviation from the mean
D. Within one standard deviation of the mean

answer

D. Within one standard deviation of the mean

question

3.1.RA-7 To compute the variance, what should one do?
A. Double the standard deviation.
B. Square the standard deviation.
C. Divide the standard deviation by n − 1.
D. Take the square root of the standard deviation.

answer

note: The variance is the square of the standard deviation. It is represented symbolically by s².
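The relationship can be checked with Python's standard `statistics` module. This is a sketch on a hypothetical sample (the data values are assumptions for illustration); `stdev` and `variance` both use the sample (n − 1) convention:

```python
import math
import statistics

# Hypothetical sample, for illustration only.
data = [4, 8, 6, 5, 3, 7]

s = statistics.stdev(data)            # sample standard deviation
variance = statistics.variance(data)  # sample variance

# Squaring the standard deviation recovers the variance.
print(s, variance, math.isclose(s ** 2, variance))
```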

question

For most applications, why is the standard deviation preferred over the variance?
A. The units for the variance are always squared.
B. The standard deviation is easier to calculate than the variance.
C. The standard deviation more accurately measures the variability in a distribution.
D. All of the above

answer

A. The units for the variance are always squared.
note: The units for the variance are always squared. Measuring spread with the variance means that the units for spread (squared units) differ from the units for center, which makes the variance harder to interpret than the standard deviation.

question

3.1.1 A sociologist says, "Typically, men in a certain country still earn more than women." What does this statement mean?
A. The center of the distribution of salaries for men in the country is greater than the center for women.
B. The highest paid people in the country are men.
C. All women's salaries in the country are less varied than all men's salaries.
D. All men make more than all women in the country.

answer

A. The center of the distribution of salaries for men in the country is greater than the center for women.
note: In a distribution of values, the typical value is given by the mean. In this case, the average salary of a man is higher than that of a woman, so when comparing the distribution of men's salaries to women's salaries, the center of the distribution for men is greater than the center of the distribution for the women.

question

3.1.19 In a recent competition, do you think the standard deviation of the running times for all men who ran the 100-meter race would be larger or smaller than the standard deviation of the running times for the men's marathon? Explain.
A. The standard deviation for the 100-meter event would be less. All the runners come to the finish line within a few seconds of each other. In the marathon, the runners can be quite widely spread after running that long distance.
B. The standard deviation for the marathon event would be less. Many more runners compete in a marathon rather than a 100-meter event. Therefore, the average time will be determined with greater precision.
C. The standard deviation for the marathon event would be less. All the runners finish the race in a matter of seconds. In the marathon, the runners can be quite widely spread after running that long distance.
D. The standard deviation for the 100-meter event would be less. All the runners finish the race in a matter of seconds. In the marathon, the runners take at least a few hours to complete the course.

answer

note: Since the running times in the 100-meter event will be within a few seconds of each other, they will have small variation. In the marathon, since the running times are likely to be minutes apart, the times will have greater variation. Thus, the marathon running times will have a greater standard deviation.

question

3.2.RA-1
According to the Empirical Rule, ________ will be within two standard deviations of the mean.

answer

note: According to the Empirical Rule, if a distribution is unimodal and symmetric, approximately 95% of the observations will be within two standard deviations of the mean.

question

3.2.RA-2
The Empirical Rule applies to distributions that are

answer

answer: symmetric and unimodal.
note: According to the Empirical Rule, if a distribution is unimodal and symmetric, approximately 68% of the observations will be within one standard deviation of the mean, approximately 95% of the observations will be within two standard deviations of the mean, and nearly all the observations will be within three standard deviations of the mean.
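The percentages in the rule can be reproduced approximately under one assumption: that a Normal model stands in for "unimodal and symmetric". The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Assumption: a standard Normal model represents the unimodal, symmetric case.
model = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of the model within k standard deviations of the mean."""
    return model.cdf(k) - model.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
# prints 0.683, 0.954, and 0.997 for k = 1, 2, 3
```

Real data sets that are only roughly symmetric will match these figures only approximately.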

question

3.2.RA-3
A standard unit measures which of the following?
A. How many standard deviations away an observation is from the mean
B. How many standard deviations away an observation is from the median
C. The magnitude of the standard deviation
D. The interval within which approximately 68% of the observations fall

answer

A. How many standard deviations away an observation is from the mean
note: A standard unit is how many standard deviations away an observation is from the mean. A measurement converted to standard units is called a z-score.

question

3.2.RA-4
If an observation has a z-score of 0, this means which of the following?
Choose the correct answer below.
A. The observation is equal to the standard deviation.
B. The observation is equal to the median.
C. The z-score was computed incorrectly.
D. The observation is equal to the mean.

answer

D. The observation is equal to the mean.
note: If an observation has a z-score of 0, then it is equal to the mean. The mean is 0 standard deviations away from itself, so it has a z-score of 0.

question

3.2.RA-5 Which of the following can be used to compare values measured in different units, such as inches and pounds?
z-score
standard deviation
standard error
interquartile range

answer

answer: z-score
note: The z-score measures distance from a mean in terms of standard deviations, so it can be used to compare values measured in different units, such as inches and pounds.
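A short sketch of the comparison, with hypothetical means and standard deviations (none of these numbers come from the source): converting each measurement to standard units removes the original units, so a height in inches and a weight in pounds can be placed on the same scale.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical summary statistics, for illustration only.
height_z = z_score(74, mean=69, sd=3)      # height in inches
weight_z = z_score(200, mean=180, sd=25)   # weight in pounds

# The height is more unusual than the weight, relative to its own distribution.
print(round(height_z, 2), round(weight_z, 2))

# An observation equal to the mean has a z-score of 0.
print(z_score(69, mean=69, sd=3))
```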

question

The value that would be right in the middle if you were to sort the data from smallest to largest is called the

answer

median
note: The median is the value that would be right in the middle if you were to sort the data from smallest to largest. About 50% of the observations are below it and about 50% of the observations are above it.

question

For what purpose is the median used?
A. To give the spread of a distribution
B. To measure the variation of a data set
C. To give a typical value of a data set
D. None of these

answer

C. To give a typical value of a data set
note: The median is a typical value of a data set. It is used particularly when the distribution is skewed.

question

The median is often used for which of the following types of distribution?
Uniform
Skewed
Symmetric
Bimodal

answer

Skewed
note: The median is often used for skewed distributions. The mean is not often used for skewed distributions because skew affects the mean more than it affects the median.

question

When a distribution is skewed, the _______ is used to measure the center and the _______ is used to measure variation.

answer

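note: The median and interquartile range are used to measure the center and variation, respectively, when a distribution is skewed. The mean and standard deviation are used to measure the center and variation, respectively, when a distribution is symmetric.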

question

The interquartile range tells us how much space the _____ of the data occupy.

answer

note: The interquartile range tells us how much space the middle 50% of the data occupy. It is found by subtracting the first quartile from the third quartile (IQR = Q3 − Q1).
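A minimal sketch of the calculation on hypothetical data follows. Textbooks and software use slightly different conventions for locating quartiles; the choice of `method="inclusive"` in `statistics.quantiles` is an assumption here, and other conventions can give slightly different values.

```python
import statistics

# Hypothetical data, for illustration only.
data = [12, 15, 17, 20, 22, 25, 28, 31, 40]

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1   # third quartile minus first quartile
print(q1, q3, iqr)
```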

question

Name two measures of the center of a distribution, and state the conditions under which each is preferred for describing the typical value of a single data set.
What are two measures of the center of a distribution?
interquartile range and standard deviation
first quartile and third quartile
median and mean

answer

median and mean

question

Name two measures of the center of a distribution, and state the conditions under which each is preferred for describing the typical value of a single data set.
Under what conditions is the median preferred?
A. The median is preferred when there are few data points.
B. The median is preferred when the data is strongly skewed or has outliers.
C. The median is preferred when there are many data points.
D. The median is preferred when the data is relatively symmetric.

answer

B. The median is preferred when the data is strongly skewed or has outliers.
note: The median provides a better measure of center when the data is skewed or has outliers because the presence of an outlier has a much greater effect on the mean.

question

Name two measures of the center of a distribution, and state the conditions under which each is preferred for describing the typical value of a single data set.
Under what conditions is the mean preferred?
A. The mean is preferred when the data is relatively symmetric.
B. The mean is preferred when the data is strongly skewed or has outliers.
C. The mean is preferred when there are few data points.
D. The mean is preferred when there are many data points.

answer

A. The mean is preferred when the data is relatively symmetric.

question

In a right-skewed distribution, which of the following is true?
A. The mean and median are approximately the same.
B. The mean tends to be greater than the median.
C. The mean tends to be less than the median.
D. None of these

answer

B. The mean tends to be greater than the median.
note: The mean tends to be greater than the median in a right-skewed distribution. This is because the higher values to the right of the center pull the mean up more than they affect the median.
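The pull of a long right tail can be seen in a small hypothetical data set (values assumed for illustration), where one large value raises the mean well above the median:

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is very large.
values = [20, 22, 25, 27, 30, 35, 120]

mean = statistics.mean(values)
median = statistics.median(values)

# The large value on the right pulls the mean up; the median stays at 27.
print(round(mean, 2), median)
```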

question

If the mean and the median of a distribution are approximately the same, then the shape of the distribution is likely to be _______.

answer

note: If the mean and the median of a distribution are approximately the same, then the shape of the distribution is likely to be symmetric.

question

When a distribution contains outliers, which of the following is the best choice for a measure of center?
Choose the correct answer below.
Interquartile range
Mean
Median
Standard deviation

answer

note: The median is resistant to outliers, so when a distribution contains outliers, the median is the best choice for a measure of center.

question

Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme value, we say the median is _____ to outliers.

answer

note: The median is resistant to outliers. This makes it a good choice for a measure of center when a distribution is skewed.

question

When comparing groups, if one group is strongly skewed or has outliers and the other is symmetric, which of the following should be used to compare the groups?
A. The median and interquartile range for the skewed group and the mean and standard deviation for the symmetric group
B. The mean and standard deviation for the skewed group and the median and interquartile range for the symmetric group
C. The means and standard deviations
D. The medians and interquartile ranges

answer

D. The medians and interquartile ranges
note: When comparing two distributions, one should always use the same measures of center and spread for both distributions. Since the mean will be affected by the skew or outliers in the first distribution, use the median and interquartile range for both distributions.

question

In your own words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What action(s) should be taken with an outlier?

answer

Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are mistakes, they might be removed or corrected. If they are not mistakes, you might do the analysis twice, once with and once without the outliers.
note: Outliers are observed values that lie outside the range of the main group of data. When an outlier is present, the observer needs to consider it more closely. Sometimes it is just a mistake that happened while collecting the data and can be corrected or discarded. Other times it is a legitimately observed value. In those cases, the analysis needs to be presented once with outliers and once without outliers to give a better idea of what a typical value is. Note that in statistics, potential outliers are defined as observations that are more than 1.5 interquartile ranges below the first quartile or above the third quartile, not above or below the median. Also note that a potential outlier is not the same thing as an outlier.

question

Which measure of the center (mean or median) is more resistant to outliers, and what does "resistant to outliers" mean?

answer

The median is more resistant, which indicates that it usually changes less than the mean when comparing data with and without outliers.
note: The median is more resistant to outliers than the mean, especially when the outliers have extreme values. The presence of an extreme value can distort the mean because the mean shifts heavily in the direction of the extreme value. The amount the median shifts by is based on the number of data observations, because it is determined by the middle value after ordering all the observations from lowest to highest. If there is only one outlier with an extremely large value, the median will shift very slightly, while the mean will change significantly.

question

A dieter recorded the number of calories he consumed at lunch for one week. As you can see, a mistake was made on one entry. The calories are listed in increasing order below.
349, 371, 386, 398, 412, 4190
When the error is corrected by removing the extra 0, will the mean change? Will the median? Explain without doing any calculation.

answer

note: The median is resistant to outliers and extreme values because it orders the data from lowest to highest and looks at the middle value. The highest value does not change the order, and so it does not change the median. The mean is the balancing point for the data set. When looking at the shape of a histogram, the mean is the point which balances the weight on both sides. If an extreme value is placed on one end of the mean, it has to shift in that direction to keep everything balanced.
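Although the question asks for an explanation without calculation, the reasoning can be checked afterwards. This sketch uses the calorie values from the question and assumes the correction turns 4190 into 419:

```python
import statistics

recorded = [349, 371, 386, 398, 412, 4190]   # values as listed, with the mistake
corrected = [349, 371, 386, 398, 412, 419]   # extra 0 removed from the last entry

# The corrected value is still the largest, so the middle values are unchanged.
print(statistics.median(recorded), statistics.median(corrected))   # 392.0 392.0

# The mean shifts toward the rest of the data once the extreme value is fixed.
print(round(statistics.mean(recorded), 2), round(statistics.mean(corrected), 2))
```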

question

Why is the mean different from the median?

answer

note: The median gives a better measure of center for this distribution because the professor's age is an outlying observation. The median also tends to give a better representation of a typical observation in a skewed distribution.

question

In a boxplot, the vertical line inside the box marks the location of the

answer

median.

question

The length of the box in a boxplot is proportional to which of the following? Choose the correct answer below.
IQR
Mean
Median
Standard deviation

answer

note: The length of the box in a boxplot is proportional to the IQR. The left edge of the box is at the first quartile and the right edge is at the third quartile.

question

In a boxplot, potential outliers are points that are more than ___ IQRs from the edges of the box.

answer

In a boxplot, potential outliers are points that are more than 1.5 IQRs below the first quartile or above the third quartile.

question

In a boxplot, the whiskers extend to which of the following?
Choose the correct answer below.
A. The smallest and largest values in the data set
B. To the most extreme values that are not potential outliers
C. To the first and third quartiles
D. None of these

answer

note: In a boxplot, the whiskers extend to the most extreme values that are not potential outliers. Potential outliers are then represented by other markers, such as dots.
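The 1.5 IQR rule and the whisker ends can be sketched as follows. The data are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption; plotting software may compute quartiles slightly differently.

```python
import statistics

# Hypothetical data with one unusually large value, for illustration only.
data = sorted([12, 15, 17, 20, 22, 25, 28, 31, 70])

q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences: 1.5 IQRs below the first quartile and above the third quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [x for x in data if lower_fence <= x <= upper_fence]
potential_outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Whiskers stop at the most extreme values that are not potential outliers.
whisker_low, whisker_high = min(inside), max(inside)
print(whisker_low, whisker_high, potential_outliers)
```

A value flagged this way is only a potential outlier; it still needs to be investigated in context.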

question

What is the first step to do with potential outliers?
Choose the correct answer below.
A. Eliminate them from the data set
B. Assume there was an error in the sampling process
C. Assume there was an error in entering the data
D. Investigate further

answer

The first step with potential outliers is always to investigate. A potential outlier might not be an outlier at all. Or a potential outlier might tell an interesting story, or it might be the result of an error in entering data.

question

Boxplots are NOT recommended for use with which of the following types of distributions?
Unimodal
Skewed
Symmetric
Bimodal or any multimodal distribution

answer

Correct answer: Bimodal or any multimodal distribution
Boxplots are best used only for unimodal distributions because they hide bimodality or any multimodality.

question

Which of the following is NOT one of the five numbers needed to make a boxplot?
Q1
The maximum
Median
Mean

answer

The mean is not shown in a boxplot, so it is not used to construct boxplots.
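For reference, a sketch of the five numbers a boxplot is built from (minimum, Q1, median, Q3, maximum), using hypothetical data. The function name `five_number_summary` and the quartile convention are assumptions for illustration; note that the mean does not appear.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical data, for illustration only.
summary = five_number_summary([12, 15, 17, 20, 22, 25, 28, 31, 40])
print(summary)
```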