The Statistical Process

Section 4.1 The Statistical Process

Note: this section is still in development.

Objectives

Students will be able to:

Define and identify the population, parameter, sample and statistic
Identify sampling methods: simple random, stratified, systematic, and convenience
Identify and discuss types of bias associated with sampling
Distinguish between experimental, quasi-experimental, and observational studies

Subsection 4.1.1 Statistical Process Overview

Statistics is a branch of mathematics that involves collecting, describing, and interpreting data. It is used to test claims, such as whether a new drug is effective, and to make predictions, such as how many people will vote for a certain candidate in an election. A statistical study typically involves the following steps:

Question: Identify the question of interest.
Design: Determine what data are needed to answer the question and how to collect the data.
Data: Collect the data.
Analysis: Perform calculations and draw conclusions from the collected data.
Report: Communicate the results of the study to others.

Example 4.1.1.

A dentist wants to know if flossing daily results in healthier gums than brushing alone. To answer this question, they can use the statistical process:

Question: Does flossing daily result in healthier gums than brushing alone?
Design: To answer this question, the dentist decides to survey their patients about their flossing habits and then compare the results of the survey with their clinical observations.
Data: To gather the data, the dentist creates a survey and asks patients to fill it out at the beginning of their appointment. The survey includes questions about how often they floss, how often they brush, and their gum health. The dentist does not look at the survey answers until after the appointment, so they can make their clinical observations without bias.
Analysis: After collecting the data, the dentist compares the survey answers with their clinical observations to see if there is a correlation between flossing habits and gum health.
Report: The dentist writes a report summarizing the findings of the study and shares it with their colleagues and patients.

You may have noticed some potential flaws in the dentist’s study. For example, the patients may not answer the survey questions truthfully because they might be embarrassed about their flossing habits. There may also be confounding variables, such as the patients’ overall health or age, that affect both flossing and gum health. For instance, the patients who participate may include young people who do not floss regularly but have healthy gums due to their age, and older people who do floss regularly but have unhealthy gums due to their age. We will explore ways to improve the quality of the data later in this section. First, we will introduce some terminology.

Subsection 4.1.2 Statistics Terminology

Some of the more commonly used terms in statistics are explained below.

Subsubsection 4.1.2.1 Population and Sample

Before we begin gathering any data to analyze, we need to identify the population we are studying. The population of a study is the group we want to know something about. For example, if we want to know the average height of 10-year-olds in the United States, the population is all 10-year-olds in the United States.

It isn’t often feasible to collect data from the entire population, so we usually select a smaller group to study. The sample is the subset of the population that is studied.

The intended population is also called the target population. We make this distinction because, if we design our study badly, the collected data might actually represent a different group (the real population of the study) rather than the intended population.

Example 4.1.2.

A newspaper posts a poll on their website asking people who they plan to vote for in the upcoming election, and 129 people respond. What are the target population, the real population of the study, and the sample?

Solution.

While the target (intended) population may have been all voters, the real population of the survey is readers of the website.

The sample is the 129 people who responded to the poll.

Be careful when identifying the sample; the sample consists of all the individuals in the study, not just those meeting a certain criterion. For example, if the newspaper poll asked, "Do you plan to vote for Candidate A?", the sample would still be the 129 people who responded to the poll, not just the people who said they planned to vote for Candidate A.

Example 4.1.3.

A wildlife biologist heard reports of dark charcoal-colored squirrels in Bremerton, Washington. The biologist knows that these charcoal-colored squirrels are a rare genetic mutation of the eastern gray squirrel, called black morph, and they want to know what percentage of the squirrels in Bremerton are black. To find out, they set up a trail camera in a park and take pictures of 100 squirrels that come by. They find that 5 of the squirrels are black. What are the target population, the real population of the study, and the sample?

Solution.

The target population is all squirrels in Bremerton.

The real population of the study is the squirrels that come by the trail camera.

The sample is the 100 squirrels that come by the trail camera.

In the squirrel example, the biologist observed that 5 out of the 100 squirrels were charcoal-colored and 95 were of the more common gray color. This data can be expressed as a percentage, with 5% of the squirrels being charcoal-colored and 95% being gray. The biologist can then use this data to estimate the percentage of all squirrels in Bremerton that are charcoal-colored. Because the percentage was calculated based on a sample, we call it a statistic, which brings us to the next set of terms.

Subsubsection 4.1.2.2 Parameters and Statistics

A parameter is a number that describes a population. In the squirrel example, the parameter is the percentage of all squirrels in Bremerton that are charcoal-colored.

A statistic is a number that describes a sample. In the squirrel example, the statistic is the percentage of squirrels in the sample that are charcoal-colored, which is 5%.

We use statistics to estimate parameters. In the squirrel example, we are using the statistic of 5% to estimate the parameter of the percentage of all squirrels in Bremerton that are charcoal-colored.

Example 4.1.4.

A medical researcher is testing whether a new vaccination is effective against a certain strain of rhinovirus, a virus that causes the common cold. They recruit 20,000 volunteers of various ages, genders, and health conditions. They give the vaccine to 10,000 of the volunteers and a placebo to the other 10,000 volunteers. After a year, they find that 500 of the vaccinated volunteers got sick with the rhinovirus, while 1,000 of the placebo volunteers got sick. What are the population, sample, parameters, and statistics in this study?

Solution.

The population is all people who could potentially get the rhinovirus.

The sample is the 20,000 volunteers who participated in the study.

The parameters are the percentage of all people who would still get sick with the rhinovirus if they were vaccinated and the percentage of all people who would get sick if they were not vaccinated.

The statistics are the percentages of volunteers in each group who got sick (5% and 10%).
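
The statistics in Example 4.1.4 come from simple arithmetic: divide the number of volunteers who got sick by the size of their group. As a minimal sketch in Python (the function name `percent_sick` is our own choice for illustration), the calculation might look like this:

```python
def percent_sick(num_sick, group_size):
    """Return the percentage of a group that got sick."""
    return 100 * num_sick / group_size

# Counts taken from Example 4.1.4
vaccinated_pct = percent_sick(500, 10_000)
placebo_pct = percent_sick(1_000, 10_000)

print(f"Vaccinated group: {vaccinated_pct}%")  # Vaccinated group: 5.0%
print(f"Placebo group: {placebo_pct}%")        # Placebo group: 10.0%
```

These two numbers describe only the 20,000 volunteers, so they are statistics. The corresponding parameters, which describe all people who could get the rhinovirus, remain unknown and are only estimated by these values.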

Subsubsection 4.1.2.3 Types of Studies

An observational study is based on observations or measurements. These observations may be solicited, like in a survey or poll. Or, they may be unsolicited, such as studying the percentage of cars that turn right at a red light even when there is a “no turn on red” sign.

As a general rule, there is no impactful change or intervention involved in an observational study. So a study that compares the pass rates of all the sections of Math in Society at a given college for three quarters is observational. However, a study that compares the pass rates of all sections of Math in Society before and after switching to a new textbook would not be observational because it is testing for differences that might be due to the intervention of using a different book. (It would be quasi-experimental, which is described later in this section.)

An experimental study or experiment measures or assesses the effects of a treatment. In an experiment, some kind of treatment is applied to a randomized sample of the subjects, who make up the treatment group, and the results are compared with those of the subjects who do not receive the treatment. The subjects who do not receive the treatment make up the control group. Assignment to the treatment group and the control group must be random.
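
Random assignment is usually done with software rather than by hand. The following is a minimal sketch, assuming 20 subjects identified by hypothetical ID numbers 1 through 20; the function name `randomly_assign` and the seed value are ours, chosen only so the example can be repeated.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the original list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject ID numbers 1 through 20
subjects = list(range(1, 21))
treatment, control = randomly_assign(subjects, seed=1)

print("Treatment group:", sorted(treatment))
print("Control group:", sorted(control))
```

Because chance alone decides who lands in each group, characteristics such as age or overall health tend to be spread evenly between the treatment group and the control group.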

In a placebo-controlled experiment, the control group receives a placebo, which is a dummy treatment designed to be indistinguishable from the real treatment. This is done to control for the placebo effect, which occurs when a patient’s belief in a treatment influences the outcome, so an improvement might be seen even if the treatment itself is ineffective. If the control group receives no treatment or receives the standard treatment, the experiment is not placebo-controlled. However, you may see a study in which the control group does not receive a placebo described as a controlled experiment.

To further prevent human perception from influencing results, some studies are blind, meaning that the subjects do not know whether they are in the treatment group or the control group. A double-blind study is one in which neither the subjects nor the investigators know who is in the treatment group and who is in the control group until after the data has been analyzed.

Example 4.1.5.

A study is conducted to test the effectiveness of a new pain medication. How could this study be designed as an observational study, a controlled experiment, and a placebo-controlled experiment?

Solution.

An observational study might use a survey of patients in a treatment center. The survey would ask patients about their pain levels and whether they are taking the new medication. The results would be analyzed to see if there is a correlation between taking the medication and pain levels. However, since this is an observational study, we would not be able to determine whether the medication is actually effective or if there are confounding variables that are responsible for the observed correlation.

A controlled experiment would randomly assign patients to either a treatment group that receives the new medication or a control group that receives the standard treatment. After a certain period of time, the pain levels of both groups would be measured and compared to see if there is a significant difference between the two groups. This would allow us to determine whether the new medication is more or less effective than the standard treatment.

A placebo-controlled experiment would randomly assign patients to either a treatment group that receives the new medication or a control group that receives a placebo. The patients would not know which group they are in, and the investigators would not know either (double-blind). After a certain period of time, the pain levels of both groups would be measured and compared to see if there is a significant difference between the two groups. This would allow us to determine whether the new medication is more effective than no treatment at all, while controlling for the placebo effect.

Many studies that are published as experiments are not true experiments because they do not have random assignment to the treatment and control groups. These are called quasi-experiments. A quasi-experiment is a study intended to assess the effects of a treatment or intervention that lacks random assignment to the treatment and control groups. There are three main types of quasi-experimental studies:

Pre-post intervention study: data are gathered before and after an intervention is applied. This is commonly used in medicine and education. For example, a psychologist may ask patients to assess their sense of self-worth before and after undergoing talk therapy. A form of pre-post intervention study design in which data is gathered multiple times before and after the intervention is often referred to as interrupted time series design.
Nonequivalent groups: a treatment or intervention is applied to a group that was not randomly assigned, so the treatment and control groups may differ in ways beyond the intervention being studied. For example, a college that offers four sections of a class may assign a teaching assistant to two of the sections to determine if students learn better with a teaching assistant in the class. Since students were not randomly assigned to the sections, the sections may not represent the same student population. Further, if the sections have different instructors with different teaching styles, there will be additional differences in the experiences of the students.
Natural experiment: a natural experiment occurs when a researcher studies the impacts of an intervention that was not made or done by the researcher. For example, if a state enacts new stricter gun laws, researchers can study violent crime rates before and after the laws are enacted. They also might compare these rates with those of other states that did not enact similar laws.

Example 4.1.6.

Classify each of the following studies as an observational study, a controlled experiment, a placebo-controlled experiment, or a quasi-experiment.

A study that compares the test scores of students in a class before and after switching to a new textbook.
A study that randomly assigns patients to either a treatment group that receives a new medication or a control group that receives a fake treatment, with neither the patients nor the investigators knowing which group they are in until after the data has been analyzed.
A study that surveys people about their exercise habits and then analyzes the correlation between exercise and heart health.

Solution.

This is a quasi-experiment, specifically a pre-post intervention study, because it compares test scores before and after an intervention (switching to a new textbook) without random assignment to treatment and control groups.
This is a placebo-controlled experiment because it involves random assignment to a treatment group that receives the new medication and a control group that receives a placebo, with blinding of both patients and investigators.
This is an observational study because it surveys people about their exercise habits and analyzes the correlation between exercise and heart health without any intervention or random assignment.

Of these three studies, the placebo-controlled experiment is the only one that can establish causation. The quasi-experiment can suggest causation but cannot establish it due to potential confounding variables. The observational study can only establish correlation, not causation.

Subsection 4.1.3 Sampling and Bias

As we mentioned in a previous section, the first thing we should do before conducting a survey is to identify the population that we want to study. Suppose we are hired by a politician to determine the amount of support they have among the electorate should they decide to run for another term. What population should we study? Every person in the district? Eligible voters might be better, but what if they don’t register? Registered voters may not vote. What about “likely voters?”

This is the criterion used in a lot of political polling, but it is sometimes difficult to define a “likely voter.” Here is an example of the challenges of political polling.

Example 4.1.7.

In November 1998, former professional wrestler Jesse "The Body" Ventura was elected governor of Minnesota. Up until right before the election, most polls showed he had little chance of winning. There were several contributing factors to the polls not reflecting the actual intent of the electorate:

Ventura was running on a third-party ticket and most polling methods are better suited to a two-candidate race.
Many respondents to polls may have been embarrassed to tell pollsters that they were planning to vote for a professional wrestler.
The mere fact that the polls showed Ventura had little chance of winning might have prompted some people to vote for him in protest to send a message to the major-party candidates.

But one of the major contributing factors was that Ventura recruited a substantial amount of support from young people, particularly college students, who had never voted before and who registered specifically to vote in that election. The polls did not deem these young people likely voters (since in most cases young people have a lower rate of voter registration and a lower turnout rate for elections) so the polling samples were subject to sampling bias: they omitted a portion of the electorate that was weighted in favor of the winning candidate.

So, identifying the population can be a difficult job, but once we have identified the population, how do we choose a good sample? We want our statistic to estimate the parameter we are interested in, so we need to have a representative sample. Returning to our hypothetical job as a political pollster, we would not anticipate very accurate results if we drew all of our samples from customers at a Starbucks, or from our list of TikTok followers. How do we get a sample that resembles our population?

Subsubsection 4.1.3.1 Sampling Methods

One way to get a representative sample is to use randomness. We will look at three types of sampling that use randomness and one that does not.

A simple random sample, abbreviated SRS, is one in which every member of the population has an equal probability of being chosen and every possible group of the chosen size is equally likely to be the sample.

Example 4.1.8.

If we could somehow identify all likely voters in the state, put each of their names on a piece of paper, toss the slips into a (very large) hat and draw 1000 slips out of the hat, we would have a simple random sample.

In practice, computers are better suited for this sort of endeavor. It is always possible, however, that even a random sample might end up not being totally representative of the population. If we repeatedly take samples of 1000 people from among the population of likely voters in the state of Oregon, some of these samples might tend to have a slightly higher percentage of Democrats (or Republicans) than the general population; some samples might include more older people and some samples might include more younger people; etc. This is called sampling variation. If there are certain groups that we want to make sure are represented, we might instead use a stratified sample.
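
To see sampling variation in action, we can let a computer draw several simple random samples from the same population. The sketch below is only an illustration: the population of 100,000 likely voters and the 45% support for a candidate "A" are made-up values, and the function name `srs_percent_for_a` is ours.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 likely voters; 45% support candidate A
population = ["A"] * 45_000 + ["other"] * 55_000

def srs_percent_for_a(population, n, rng):
    """Draw a simple random sample of size n and return the percent supporting A."""
    sample = rng.sample(population, n)   # sampling without replacement
    return 100 * sample.count("A") / n

# Five separate samples of 1000 people each give five statistics
results = [srs_percent_for_a(population, 1000, rng) for _ in range(5)]
print(results)
```

Each value in `results` is a statistic. The values should land near the parameter of 45%, but they will typically differ from it and from each other by a point or two, which is exactly the sampling variation described above.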

In stratified sampling, a population is divided into a number of subgroups (or strata). Random samples are then taken from each subgroup. It is often desirable to make the sample sizes proportional to the size of each subgroup in the population.

Example 4.1.9.

Suppose that data from voter registrations in the state indicated that the electorate was composed of 39% Democrats, 37% Republicans and 24% Independents. In a proportional stratified sample of 1000 people, a polling team would then want 390 Democrats, 370 Republicans and 240 Independents. To accomplish this, they could randomly select 390 people from among those voters known to be Democrats, 370 from those known to be Republicans, and 240 from those with no party affiliation.

A way to remember stratified sampling is to think about having a piece of layer cake. Each layer represents a stratum or subgroup, and a slice of the cake represents a sample of each layer.
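
The proportional allocation in Example 4.1.9 can be sketched in Python. The party percentages come from the example; the voter rolls (lists of made-up ID strings), the seed, and the function names `proportional_allocation` and `stratified_sample` are assumptions added for illustration.

```python
import random

def proportional_allocation(shares, sample_size):
    """Return how many people to draw from each stratum, given percentage shares."""
    return {group: round(sample_size * pct / 100) for group, pct in shares.items()}

def stratified_sample(strata, allocation, rng):
    """Draw a simple random sample of the allocated size from each stratum."""
    return {group: rng.sample(members, allocation[group])
            for group, members in strata.items()}

shares = {"Democrat": 39, "Republican": 37, "Independent": 24}
allocation = proportional_allocation(shares, 1000)
print(allocation)  # {'Democrat': 390, 'Republican': 370, 'Independent': 240}

# Hypothetical voter rolls: made-up ID strings for each party
strata = {
    "Democrat": [f"D{i}" for i in range(3900)],
    "Republican": [f"R{i}" for i in range(3700)],
    "Independent": [f"I{i}" for i in range(2400)],
}
sample = stratified_sample(strata, allocation, random.Random(7))
print({group: len(chosen) for group, chosen in sample.items()})
```

With other percentages, rounding could make the stratum counts add up to slightly more or less than the intended sample size, so a real study would need a rule for adjusting the counts.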

In systematic sampling, every \(n^{\text{th}}\) member of the population is selected to be in the sample. The starting position is often chosen at random.

Example 4.1.10.

To select a systematic sample, Portland Community College could use their database to select a random student from the first 100 student ID numbers. Then they would select every 100th student ID number after that. Systematic sampling is not as random as a simple random sample (if your ID number is right next to your friend’s because you applied at the same time, you could not both end up in the same sample) but it can yield acceptable samples. This method can be useful for people waiting in lines, parts on a manufacturing line, or plants in a row.
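
The selection described in Example 4.1.10 can be sketched as follows. The list of 5,000 student ID numbers is hypothetical, as are the seed and the function name `systematic_sample`; the sketch assumes the ID numbers are stored in database order.

```python
import random

def systematic_sample(ordered_population, step, rng):
    """Pick a random start among the first `step` members, then take every `step`-th member."""
    start = rng.randrange(step)          # an index from 0 to step - 1
    return ordered_population[start::step]

# Hypothetical list of 5,000 student ID numbers in database order
student_ids = list(range(100_000, 105_000))
sample = systematic_sample(student_ids, 100, random.Random(3))

print(len(sample))   # 50
print(sample[:3])    # the first three selected ID numbers, spaced 100 apart
```

Only the starting position is random. Once it is chosen, the rest of the sample is fixed, which is why two students with neighboring ID numbers can never both be selected.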

Convenience sampling is when samples are chosen by selecting whoever is convenient. This is the least reliable of the four methods because it does not use randomness, so the sample is unlikely to be representative of the population.

Example 4.1.11.

A pollster stands on a street corner and interviews the first 100 people who agree to speak to them. This is a convenience sample.

Subsubsection 4.1.3.2 Sampling Bias

The word "bias" as used in statistics is not the same as personal bias. Sampling bias occurs whenever the data collected is not representative of the intended population.

There is no way to correct for biased data, so it is very important to think through the entire study and data analysis before we start. We talked about sampling or selection bias, which is when the sample is not representative of the population. One example of this is voluntary response bias, which is bias introduced by only collecting data from those who volunteer to participate. This can lead to bias if the people who volunteer have different characteristics than the general population. Here is a summary of some additional sources of bias.

Common sources of bias include:

Voluntary response bias – the sampling bias that can occur when the sample is made up of volunteers

Self-interest study – bias that can occur when the researchers have an interest in the outcome

Response bias – when the responder gives inaccurate responses for any reason

Perceived lack of anonymity – when the responder fears giving an honest answer might negatively affect them

Loaded questions – when the question wording influences the responses

Non-response bias – when people refuse to participate in a study or drop out of an experiment, we can no longer be certain that our sample is representative of the population

Sources of bias may be conscious or unconscious. They may be innocent or as intentional as pressuring by a pollster. Here are some examples of the types of bias.

Example 4.1.12.

Consider a recent study which found that chewing gum may raise math grades in teenagers (Reuters. news.yahoo.com/s/nm/20090423/od_uk_nm/oukoe_uk_gum_learning. Retrieved 4/27/09). This study was conducted by the Wrigley Science Institute, a branch of the Wrigley chewing gum company. This is an example of a self-interest study: one in which the researchers have a vested interest in the outcome of the study. While this does not necessarily mean the study was biased, we should subject the study to extra scrutiny.
Consider online reviews of products and businesses. Customers tend to leave reviews if they are very satisfied or very dissatisfied. While you can look for overall patterns and get useful information, these reviews suffer from voluntary response bias and likely capture more extreme views than the general population.
A survey asks participants a question about their interactions with people of different ethnicities. This study could suffer from response bias. A respondent might give an untruthful answer to not be perceived as racist.
An employer puts out a survey asking their employees if they have a drug abuse problem and need treatment help. Here, answering truthfully might have serious consequences; responses might not be accurate if there is a perceived lack of anonymity and employees fear retribution.
A survey asks, “Do you support funding research on alternative energy sources to reduce our reliance on high-polluting fossil fuels?” This is an example of a loaded or leading question – questions whose wording leads the respondent towards a certain answer.
A poll was conducted by phone with the question, “Do you often have time to relax and read a book?” Fifty percent of the people who were called refused to participate in the survey (probably because they didn’t have the time). It is unlikely that the results will be representative of the entire population. This is an example of non-response bias.

Loaded questions can be written intentionally by pollsters with an agenda, or accidentally through poor question wording. Also of concern is question order, where the order of questions changes the results. Here is an example from a psychology researcher (Schwarz, Norbert. umich.edu/~newsinfo/MT/01/Fal01/mt6f01.html. Retrieved 3/31/2009).

Example 4.1.13.

“My favorite finding is this: we did a study where we asked students, ’How satisfied are you with your life? How often do you have a date?’ The two answers were not statistically related - you would conclude that there is no relationship between dating frequency and life satisfaction. But when we reversed the order and asked, ’How often do you have a date? How satisfied are you with your life?’ the statistical relationship was a strong one. You would now conclude that there is nothing as important in a student’s life as dating frequency.”

Subsection 4.1.4 Confounding

Confounding occurs when there are two or more potential variables that could have caused the outcome and it is not possible to determine which one actually caused the result.

Example 4.1.14.

A drug company study about a weight loss pill might report that people lost an average of 8 pounds while using their new drug. However, in the fine print you find a statement saying that participants were encouraged to also diet and exercise. It is not clear in this case whether the weight loss is due to the pill, to diet and exercise, or a combination of both. In this case confounding has occurred.
Researchers conduct an experiment to determine whether students will perform better on an arithmetic test if they listen to music during the test. They first give each student a test without music, then give a similar test while the student listens to music. In this case, the students might perform better on the second test, regardless of the music, simply because it was the second test and they were warmed up.

There are a number of measures that can be introduced to help reduce the likelihood of confounding. The primary measure is to use a control group.

Exercises 4.1.5 Exercises

1.

Describe the difference between a sample and a population.

2.

Describe the difference between a statistic and a parameter.

3.

The ASPCC randomly selects 200 students from PCC Cascade campus to participate in a childcare survey in order to determine the demand for additional childcare options for PCC students.

Who is the intended population?
What is the sample?
Is the collected data representative of the intended population? Why or why not?

4.

A local research firm randomly selects 1200 homes in Washington County to determine support for adding compost pick up to residents’ regular garbage service.

Who is the intended population?
What is the sample?
Is the collected data representative of the intended population? Why or why not?

5.

A political scientist surveys 28 of the current 106 representatives in a state’s congress. Of them, 14 said they were supporting a new education bill, 12 said they were not supporting the bill, and 2 were undecided.

Who is the population of this survey?
What is the size of the population?
What is the size of the sample?
Give the statistic for the percentage of representatives surveyed who said they were supporting the education bill.
If the margin of error was 5%, give the confidence interval for the percentage of representatives we might expect to support the education bill and explain what the confidence interval tells us.

6.

The city of Raleigh has 9,500 registered voters. There are two candidates for city council in an upcoming election: Brown and Feliz. The day before the election, a telephone poll of 350 randomly selected registered voters was conducted. 112 said they’d vote for Brown, 207 said they’d vote for Feliz, and 31 were undecided.

Who is the population of this survey?
What is the size of the population?
What is the size of the sample?
Give the statistic for the percentage of voters surveyed who said they’d vote for Brown.
If the margin of error was 3.5%, give the confidence interval for the percentage of voters that we might expect to vote for Brown and explain what the confidence interval tells us.

7.

To determine the average length of trout in a lake, researchers catch 20 fish and measure them. Describe the population and sample of this study.

8.

To determine the average diameter of evergreen trees in a forested park, researchers randomly tag 45 specimens and measure their diameter. Describe the population and sample of this study.

9.

A college reports that the average age of their students is 28 years old. Is this a parameter or a statistic?

10.

A local newspaper reports that among a sample of 250 subscribers, 45% are over the age of 50. Is this a parameter or a statistic?

11.

A recent survey reported that 64% of respondents were in favor of expanding the BIKETOWN bike share system to the greater Portland area. Is this a parameter or a statistic?

12.

Which sampling method is being described?

In a study, the sample is chosen by separating all cars by size and selecting 10 of each size grouping.
In a study, the sample is chosen by writing everyone’s name on a playing card, shuffling the deck, then choosing the top 20 cards.
Every 4th person on the class roster was selected.

13.

Which sampling method is being described?

A sample was selected to contain 25 people aged 18-34 and 30 people aged 35-70.
Viewers of a new show are asked to respond to a poll on the show’s website.
To survey voters in a town, a polling company randomly selects 100 addresses from a database and interviews those residents.

14.

Identify the most relevant source of bias in each situation.

A survey asks the following: Should the mall prohibit loud and annoying rock music in clothing stores catering to teenagers?
To determine opinions on voter support for a downtown renovation project, a surveyor randomly questions people working in downtown businesses.
A survey asks people to report their actual income and the income they reported on their IRS tax form.
A survey randomly calls people from the phone book and asks them to answer a long series of questions.
The Beef Council releases a study stating that consuming red meat poses little cardiovascular risk.
A poll asks, “Do you support a new transportation tax, or would you prefer to see our public transportation system fall apart?”

15.

Identify the most relevant source of bias in each situation.

A survey asks the following: Should the death penalty be permitted if innocent people might die?
A study seeks to investigate whether a new pain medication is safe to market to the public. They test by randomly selecting 300 people who identify as men from a set of volunteers.
A survey asks how many sexual partners a person has had in the last year.
A radio station asks listeners to phone in their response to a daily poll.
A substitute teacher wants to know how students in the class did on their last test. The teacher asks the 10 students sitting in the front row to state their latest test score.
High school students are asked if they have consumed alcohol in the last two weeks.

16.

Identify whether each situation describes an observational study or an experiment.

The temperature on randomly selected days throughout the year was measured.
One group of students listened to music and another group did not while they took a test and their scores were recorded.
The weights of 30 randomly selected people are measured.

17.

Identify whether each situation describes an observational study or an experiment.

Subjects are asked to do 20 jumping jacks, and then their heart rates are measured.
Twenty coffee drinkers and twenty tea drinkers are given a concentration test.
Potato chip bags are weighed on the production line before they are put into boxes.

18.

A team of researchers is testing the effectiveness of a new vaccine for human papillomavirus (HPV). They randomly divide the subjects into two groups. Group 1 receives the new HPV vaccine, and Group 2 receives the existing HPV vaccine. The patients in the study do not know which group they are in.

Which is the treatment group?
Which is the control group (if there is one)?
Is this study blind, double-blind, or neither?
Is this best described as an experiment, a controlled experiment, or a placebo-controlled experiment?

19.

Studies are often done by pharmaceutical companies to determine the effectiveness of a treatment. Suppose that a new cancer treatment is under study. Of interest is the average length of time in months patients live once starting the treatment. Two researchers each follow a different set of 40 cancer patients throughout this new treatment.

What is the population of this study?
Would you expect the data from the two researchers to be identical? Why or why not?
If the first researcher collected their data by randomly selecting 8 patients from each of 5 local hospitals, which sampling method did they use?
If the second researcher collected their data by choosing 40 patients they knew, what sampling method did they use? What concerns would you have about this data set, based upon the data collection method?

20.

For the clinical trials of a weight loss drug containing Garcinia Cambogia, the subjects were randomly divided into two groups. The first received an inert pill along with an exercise and diet plan, while the second received the test medicine along with the same exercise and diet plan. The patients do not know which group they are in, nor do the fitness and nutrition advisors.

Which is the treatment group?
Which is the control group (if there is one)?
Is this study blind, double-blind, or neither?
Is this best described as an experiment, a controlled experiment, or a placebo-controlled experiment?

21.

A study is conducted to determine whether people learn better with routine or crammed studying. Subjects volunteer from an introductory psychology class. At the beginning of the semester, 12 subjects volunteer and are assigned to the routine studying group. At the end of the semester, 12 subjects volunteer and are assigned to the crammed studying group.

Identify the target population and the sample.
Is this an observational study or an experiment?
This study involves two kinds of non-random sampling: 1. Subjects are not randomly sampled from a specified population and 2. Subjects are not randomly assigned to groups. Which problem is more serious? What effect on the results does each have?
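The two kinds of randomness named in this exercise can be told apart with a small sketch. The Python below is only an illustration under stated assumptions: the roster of 100 student ID numbers, the sample size of 24, and the variable names are all hypothetical and do not come from the study described above. Random sampling decides who is in the study; random assignment decides which group each subject joins.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 students labeled by ID number (an assumption)
population = list(range(1, 101))

# Random sampling: every student has the same chance of being chosen
sample = rng.sample(population, 24)

# Random assignment: shuffle the chosen subjects, then split them in half
shuffled = sample[:]
rng.shuffle(shuffled)
routine_group = shuffled[:12]
crammed_group = shuffled[12:]

print("Routine group:", sorted(routine_group))
print("Crammed group:", sorted(crammed_group))
```

In the study described in this exercise, neither step happens: the subjects volunteer, and the group each one joins depends on when they volunteered.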

22.

To test a new lie detector, two groups of subjects are given the new test. One group is asked to answer all the questions truthfully. The second group is asked to tell the truth on the first half of the questions and lie on the second half. The person administering the lie detector test does not know what group each subject is in. Does this experiment have a control group? Is it blind, double-blind, or neither? Explain.

23.

A poll found that 30%, plus or minus 5%, of college freshmen prefer morning classes to afternoon classes.

What is the margin of error?
Write the survey results as a confidence interval.
Explain what the confidence interval tells us about the percentage of college freshmen who prefer morning classes.

24.

A poll found that 38% of U.S. employees are engaged at work, plus or minus 3.5%.

What is the margin of error?
Write the survey results as a confidence interval.
Explain what the confidence interval tells us about the percentage of U.S. employees who are engaged at work.

25.

A recent study reported a confidence interval of (24%, 36%) for the percentage of U.S. adults who plan to purchase an electric car in the next 5 years.

What is the statistic from this study?
What is the margin of error?

26.

A recent study reported a confidence interval of (44%, 52%) for the percentage of two-year college students who are food insecure.

What is the statistic from this study?
What is the margin of error?
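Exercises 23 through 26 rely on the same arithmetic: the interval runs from the statistic minus the margin of error to the statistic plus the margin of error, so the statistic is the midpoint of the interval and the margin of error is half its width. A minimal Python sketch of both directions follows. It uses a made-up poll result of 50% plus or minus 4% rather than any value from the exercises, and the function names are illustrative assumptions.

```python
def interval_from_estimate(statistic, margin_of_error):
    """Return (lower, upper) from a statistic and its margin of error, in percent."""
    return (statistic - margin_of_error, statistic + margin_of_error)


def estimate_from_interval(lower, upper):
    """Return (statistic, margin_of_error) from interval endpoints, in percent."""
    statistic = (lower + upper) / 2        # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the width of the interval
    return (statistic, margin_of_error)


# Hypothetical poll: 50% plus or minus 4%
print(interval_from_estimate(50, 4))   # (46, 54)
print(estimate_from_interval(46, 54))  # (50.0, 4.0)
```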

27.

A farmer believes that playing Barry Manilow songs to his peas will increase their yield. Describe a controlled experiment the farmer could use to test his theory.

28.

A sports psychologist believes that people are more likely to be extroverted as an adult if they played team sports as a child. Describe two possible studies to test this theory. Design one as an observational study and the other as an experiment. Which is more practical?

29.

Find a newspaper or magazine article, or the online equivalent, describing the results of a recent study (not a simple poll). Give a summary of the study’s findings, then analyze whether the article provided enough information to determine the validity of the conclusions. If not, produce a list of things that are missing from the article that would help you determine the validity of the study. Look for the things discussed in the text: population, sample, randomness, blind, control, margin of error, etc.

30.

Use a polling website such as pewresearch.org or gallup.com and search for a poll that interests you. Find the result, the margin of error, and the confidence level for the poll, and write the confidence interval.
