7964
Lecture 2
This lecture builds on lecture 1 by focusing on issues of gathering data.
I. Descriptive versus Inferential Statistics
One distinction that researchers make is
between summary statistics and statistics that
are used to draw conclusions about a larger (population) group.
- Descriptive statistics, or information about variables, refers to statistics describing
or summarizing variables, one by one--for example, averages (or means), modes,
medians, ranges, standard deviations.
- Inferential statistics are statistics used not merely to summarize, but to make inferences
from sample data to draw conclusions about the population.
Inferential statistics are used to draw conclusions about a population. A population is
a relatively large group of cases that represents all the cases that the researcher is
interested in. For instance, if the researcher is exploring what factors influence
legislative behavior, the population may be all legislators. Of course, populations
can be defined in a multitude of ways--a researcher may only want to draw conclusions about
members of the U.S. Congress, or about state legislators, or about individuals elected to
European parliaments. Pollsters may want to draw conclusions about all citizens in the U.S.--
or about all U.S. citizens who are eligible to vote--or about all voters in the U.S.--
or about all voters in Louisiana.
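To make the distinction concrete, here is a minimal sketch using Python's standard library. The list `ages` is a made-up sample (it does not come from any real survey), and the interval uses a rough normal approximation; treat it as an illustration rather than a recommended procedure.

```python
import math
import statistics

# Hypothetical sample: ages of 10 surveyed voters (illustrative values only).
ages = [23, 35, 35, 41, 47, 52, 58, 58, 58, 63]

# Descriptive statistics: summarize the sample itself.
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "range": max(ages) - min(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation (n - 1)
}

# Inferential statistics: use the sample to say something about the population.
# Rough 95% interval for the population mean using the normal approximation
# (1.96 standard errors); a t-based interval would be wider for n this small.
std_error = summary["stdev"] / math.sqrt(len(ages))
ci_low = summary["mean"] - 1.96 * std_error
ci_high = summary["mean"] + 1.96 * std_error

print(summary)
print(f"approx. 95% interval for the population mean: ({ci_low:.1f}, {ci_high:.1f})")
```

The first part only describes the ten people in the sample. The second part makes a claim about the larger group they were drawn from, which is only reasonable if the sample is representative of that group.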
II. Samples
- A. Samples and Populations
In order to draw conclusions about the population in question, researchers generally
draw a sample. The sample is a smaller number of cases or units; generally, the
sample needs to be "representative" of the population in order for the researcher to
draw conclusions.
Sometimes, we talk about the researcher "generalizing" from the sample to the population--
that is, drawing conclusions about the population from data analysis of the sample.
What if you are studying crime statistics, and you happen to have a database of all
arrests in Baton Rouge? You wish to generalize only to arrests in Baton Rouge--you're
not trying to make any general conclusions about arrests elsewhere. Is your dataset
a population or a sample? Well, in some senses it's a population--you have all the
cases in which you are interested. But in another, very important, sense it's a sample--
you want to generalize to all possible, even hypothetical arrests in Baton Rouge--and
you are using the sample of arrests that actually occurred. Likewise, if you have a
dataset of all members of the U.S. House of Representatives in 2005, and are drawing
conclusions only about the behavior of U.S. House Representatives, the data set is
still a sample--because you're trying to draw more general conclusions about all
hypothetical members of the U.S. Congress.
When employing inferential statistics, we generally use sampling techniques.
- B. Sampling Techniques
- random samples: every member of the population has an equal chance
of being selected. So, for instance, if the population is "undergraduate students
enrolled full time at LSU", one could randomly select 200 students out of a full-time
enrollment list.
Note that random samples are not always representative--it's possible to randomly
select a non-representative sample. So, researchers often describe the procedure
in which they selected their sample (i.e., was it randomly selected?) and then
describe whether it is representative.
- stratified random sampling: sometimes used to make a sample
more representative. In stratified sampling, you divide the population up into
particular groups (e.g., based on sex, or income, or occupation) and then
randomly sample within those groups.
- random assignment: in experiments, a specified group of individuals is
defined (e.g., overweight individuals) and a random sample is taken from this population.
The sample is then randomly divided into two groups: one set of individuals is assigned
to the treatment condition (e.g., weight-loss treatment) and one set of
individuals is assigned to the control condition (no weight-loss treatment--or placebo).
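The three techniques above can be sketched in a few lines of Python. Everything here is hypothetical: the enrollment list `population`, the strata (class year), and the helper names `simple_random_sample`, `stratified_sample`, and `random_assignment` are invented for illustration.

```python
import random
from collections import defaultdict

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical enrollment list: (student_id, class_year) pairs.
years = ["freshman", "sophomore", "junior", "senior"]
population = [(i, years[i % 4]) for i in range(2000)]

def simple_random_sample(units, n):
    """Every unit has the same chance of being selected."""
    return rng.sample(units, n)

def stratified_sample(units, n_per_stratum, key):
    """Split units into strata by `key`, then sample randomly within each."""
    strata = defaultdict(list)
    for unit in units:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def random_assignment(sample):
    """Shuffle the sample and split it into treatment and control halves."""
    shuffled = list(sample)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

srs = simple_random_sample(population, 200)
strat = stratified_sample(population, 50, key=lambda unit: unit[1])
treatment, control = random_assignment(srs)

print(len(srs), len(strat), len(treatment), len(control))
```

Note the difference in purpose: random sampling decides who gets into the study, while random assignment decides which condition each person in the study receives.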
III. Experiments
- A. Overview
Randomized experiments are those in which individuals
are randomly assigned to two groups--one which has a test condition, and one which
does not (this group may have a placebo). The participants in the experiment generally
do not know which group they have been assigned to. Indeed, in "double-blind
experiments", neither the researchers nor the participants know who has been
assigned to which group. The researchers manipulate the
test condition between the two groups--that is, one group is exposed to the
condition, and the other group is not.
An example would be media experiments--that is, researchers could explore whether
exposure to negative political advertising makes individuals less likely to turn out to
vote. One group would be exposed to the test condition, which is negative political
advertising; the other group would not be exposed to that condition. After the
experiment, researchers could either ask individuals whether they intended to vote--
or they could track the behavior of individuals after the experiment.
The advantage of experiments is that if the two groups are large enough, other influences
on the behavior in question (in this case, voting turnout) would "randomize out"--
the only consistent difference between the two groups would be the test condition (in
this case exposure to negative advertising), and so if turnout (or intention to
vote) is significantly lower among the test condition participants than among the
control/placebo participants, the researchers could conclude that exposure to
negative advertising causes lower turnout. In other words, experiments are
a very useful way to sort out cause and effect.
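A small simulation can show what "randomize out" means. This is only a sketch with invented data: each hypothetical participant has a background trait (here called `interested`, standing in for something like strong interest in politics) that could affect turnout on its own. After random assignment, the trait should be about equally common in both groups.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical participants: about 40% have a background trait that could
# influence turnout regardless of the advertising condition.
participants = [{"interested": rng.random() < 0.4} for _ in range(10_000)]

# Random assignment: shuffle, then split in half.
rng.shuffle(participants)
half = len(participants) // 2
test_group, control_group = participants[:half], participants[half:]

def share_interested(group):
    """Fraction of a group that has the background trait."""
    return sum(p["interested"] for p in group) / len(group)

gap = abs(share_interested(test_group) - share_interested(control_group))
print(f"test group:    {share_interested(test_group):.3f}")
print(f"control group: {share_interested(control_group):.3f}")
print(f"gap:           {gap:.3f}")
```

With groups this large the gap is small, so a difference in turnout between the groups is hard to blame on the background trait. With very small groups the gap can be sizable just by chance, which is why the argument requires the groups to be "large enough".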
- B. Limitations of Experiments
- 1. Often experiments rely on volunteers, and questions can be raised about how
representative that sample of volunteers is of a larger population. For example, if
researchers offer financial incentives for
participation in experiments, those from lower socioeconomic backgrounds may be
more likely to participate--and therefore may be a significantly different group than
the population in question. If the group is different in such a way that affects how
it responds to the test condition, then one can't draw conclusions about the population.
So, for instance, those of lower socioeconomic status are perhaps more likely to
be recruited to participate in an experiment--and less likely to vote. That isn't
necessarily a problem for the experiment described above, as long as socioeconomic
status doesn't influence how one reacts to negative advertising. But consider an
experiment that relies on undergraduate students, asked to participate as a condition of
receiving a grade in a particular course. Undergraduate students may react differently
to negative advertising than the population as a whole--younger individuals, and
specifically individuals in college, may be less influenced by negative advertising.
In this case, one couldn't draw conclusions from the sample used in the experiment to
the larger population of TV viewers.
- 2. There are also ethical considerations in experiments. Occasionally, for instance,
you will hear of medical experiments being suspended because it was becoming increasingly
clear that either a test condition (medical treatment) was harmful for the test condition
group--or that it was so helpful for the test condition group that the researchers
decided to stop the experiment to allow wider access to the treatment.
- 3. A third issue with experiments is that it is difficult to determine how long any effect
of the test condition persists. So, for instance, experiments are sometimes used to
explore whether exposure to violence in media makes individuals more aggressive. If
researchers find that the group that was exposed to violence in media does indeed express
more aggressive opinions in post-experiment surveys, it is still unclear whether this
effect persists over time. Experiments often rely on self-reporting of behavior
after the experiments, which raises issues of memory and truthfulness (particularly
because the behavior is often seen as undesirable--such as smoking).
- 4. A fourth issue is the Hawthorne effect, in which participants respond
differently than they would otherwise simply because they are in an experiment.
For instance, medical treatment may look more effective in clinical trials than
in reality, because experiment participants may be more motivated to carry
out treatment protocol. A similar concept that you may have heard about is
the placebo effect.
- 5. If experiments are not double-blind, and the experimenter knows
who has been assigned to the test condition, and who has been assigned to the
placebo, there can be an experimenter effect; that is, the experimenter
may treat individuals differently based on which group they were assigned to,
or (if the results depend on subjective coding of the data) may be biased when
observing the outcome.
- 6. A sixth issue is that experiments still ideally should take into account
interacting variables. When the effect of the treatment condition on the outcome
depends on some other variable, the two variables are seen as "interacting"
with each other. So, for instance, many experiments have been done on treatments
to encourage smokers to quit. The nicotine patch is the treatment condition; individuals
in the placebo / control group are given a placebo patch. Researchers have found in
these experiments that whether a smoker lives in a household with other smokers is
another important factor in determining whether a smoker can successfully quit--
in fact, although the nicotine patch has a positive effect on discouraging smoking,
the size of that effect depends on whether there are other smokers in the house.
Having other smokers in the house is an additional variable--a variable that
interacts with the variable of "using the nicotine patch".
- C. A few other terms that are relevant to experiments:
- A matched-pair design is an experimental design that uses either two
matched individuals, or uses the same individual to receive each of two treatments.
It is important that the order of the two treatments be randomized--that
is, that in some cases the individual receives treatment A first, and then treatment
B, and in other cases the individual receives treatment B first, and then treatment A.
Again, these are often done in a double-blind manner, so that neither the
participant nor the researcher knows which order was used.
- In block designs, individuals or units are divided into homogeneous groups
called "blocks" and each treatment is randomly assigned to one or more units in each
block. This method actually originated in agricultural research, where the "blocks"
were plots of land. The treatment might be exposure to particular chemical treatments
for agricultural growth.
- In repeated-measures designs for the social sciences, individuals are repeatedly
measured under different conditions (e.g., drivers could be tested after exposure to
alcohol, marijuana, or no substance). It's important that the order of the
test conditions vary across individuals, so that the effect of time on outcome
"randomizes out". For example, the drivers mentioned above could simply learn to
do better on a driving test after repeated tries, so if the same order of exposure was
used--say, always first alcohol, then sobriety, then marijuana--the effect of marijuana
might look less powerful simply because by the third try, drivers had learned something
about the driving test.
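Two of the designs above lend themselves to a short sketch: assigning treatments within blocks, and varying the order of conditions across individuals. The plot names, driver names, and helper functions below are all hypothetical, and cycling through the orderings is just one simple way to spread them evenly.

```python
import itertools
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

def assign_within_blocks(blocks, treatments):
    """Randomly assign treatments to units separately inside each block."""
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for position, unit in enumerate(shuffled):
            # Cycle through treatments so each block gets a balanced mix.
            assignment[unit] = treatments[position % len(treatments)]
    return assignment

def counterbalanced_orders(participants, conditions):
    """Give each participant an ordering, using all orderings equally often."""
    all_orders = list(itertools.permutations(conditions))
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {
        person: all_orders[i % len(all_orders)]
        for i, person in enumerate(shuffled)
    }

# Hypothetical plots of land grouped into two blocks.
blocks = {"north": ["n1", "n2", "n3", "n4"], "south": ["s1", "s2", "s3", "s4"]}
plot_assignment = assign_within_blocks(blocks, ["fertilizer", "none"])

# Hypothetical drivers tested under the three conditions from the example.
drivers = [f"driver{i}" for i in range(12)]
driver_orders = counterbalanced_orders(
    drivers, ["alcohol", "marijuana", "no substance"]
)

print(plot_assignment)
print(driver_orders["driver0"])
```

Because every ordering of the three conditions is used equally often, any practice effect from repeated driving tests is spread evenly across conditions instead of always favoring the condition tested last.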
IV. Observational studies
- A. Overview
In observational studies, researchers observe but do not control the explanatory variables.
It becomes much more difficult to draw conclusions about cause and effect. On the other
hand, the advantage of observational studies is that researchers can measure
participants in their natural setting.
An example of an observational study is one conducted by the Boston University School
of Medicine. They compared 665 men who had been admitted to the hospital with their
first heart attack to 772 men in the same age group (21-54) who had been admitted to the
same hospitals for other reasons. There were 35 hospitals involved, all in Massachusetts
and Rhode Island. The study found that the percentage of men that showed some degree
of pattern baldness was higher (42%) among those who had a heart attack than among
those who did not (34%). They found as well that the increase in risk was more
severe with increasing severity of baldness, after adjusting for age and other relevant
factors.
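The strength of this association can be summarized from the two reported percentages. The sketch below computes the simple difference in proportions and an odds ratio, a measure often used when comparing cases with controls. It uses only the unadjusted percentages quoted above, so it will not match any age-adjusted figure the researchers may have reported.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)

# Percentages reported in the study described above.
p_bald_heart_attack = 0.42  # among the 665 men admitted for a first heart attack
p_bald_other = 0.34         # among the 772 men admitted for other reasons

difference = p_bald_heart_attack - p_bald_other
odds_ratio = odds(p_bald_heart_attack) / odds(p_bald_other)

print(f"difference in proportions: {difference:.2f}")
print(f"unadjusted odds ratio: {odds_ratio:.2f}")
```

An odds ratio of roughly 1.4 says the odds of pattern baldness were about 40% higher among the heart attack patients. It describes an association only.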
Of course, association is not causality. What other factor--what confounding
variable--might cause both a tendency to heart disease and pattern baldness?
This was an observational study because the researchers couldn't randomly "assign"
baldness to some participants and not to others. Because of that, they know that there
is quite possibly some other factor that is associated with baldness and with
heart disease--it's not that baldness causes heart disease, but that something else
causes both baldness and heart disease. (If they had been able to randomly
assign baldness to some but not other participants, then the other thing
presumably would have randomized out--bald participants would be no more likely
to have that other (hormonal?) condition than non-bald participants.)
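A simulation can illustrate how a confounding variable produces an association without causation. All numbers below are invented: a hypothetical hidden factor raises the chance of both baldness and heart disease, while baldness itself has no effect on heart disease at all.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

def simulate_person():
    """One hypothetical person; baldness has NO direct effect on heart disease."""
    hidden = rng.random() < 0.5                      # unobserved factor
    bald = rng.random() < (0.6 if hidden else 0.2)   # hidden factor raises baldness
    heart = rng.random() < (0.3 if hidden else 0.1)  # and also heart disease
    return hidden, bald, heart

people = [simulate_person() for _ in range(50_000)]

def heart_rate(rows):
    """Share of a group with heart disease."""
    return sum(heart for _, _, heart in rows) / len(rows)

bald = [p for p in people if p[1]]
not_bald = [p for p in people if not p[1]]
overall_gap = heart_rate(bald) - heart_rate(not_bald)
print(f"overall gap (bald minus not bald): {overall_gap:.3f}")

# Holding the hidden factor fixed, the gap shrinks toward zero.
stratum_gaps = {}
for value in (True, False):
    in_stratum_bald = [p for p in bald if p[0] == value]
    in_stratum_not = [p for p in not_bald if p[0] == value]
    stratum_gaps[value] = heart_rate(in_stratum_bald) - heart_rate(in_stratum_not)
    print(f"gap when hidden factor is {value}: {stratum_gaps[value]:.3f}")
```

In the full data, bald individuals show noticeably more heart disease. Once people are compared only with others who share the same value of the hidden factor, the difference nearly vanishes. In a real observational study the researcher usually cannot see the hidden factor, which is exactly the problem.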
- B. Types of Observational Studies
- case-control study
In this design, "cases" who have a particular condition or attribute are
compared to "controls" who do not. An example would be comparing individuals
who had lung cancer with those who had not. Often, the control group should
be somewhat similar to the condition group--for example, medical researchers
often employ case-control studies that recruit participants for both the
condition group and the control group from hospitalized individuals. In that
way, case-control studies can control for the health of the participants--
they are all in similar health, but some have a particular condition, and others
do not.
Case-control studies are useful because they can account for the effects
of other variables. They also avoid any ethical considerations
that arise from "assigning" potentially harmful conditions or attributes
to subjects. They are also very efficient--while it would be hypothetically
possible to (for instance) track smokers versus non-smokers over time to see
if they developed lung cancer, the amount of time that this would take would
be prohibitive.
V. Recognizing the Effect of Other Variables
In both observational studies and experimental studies (but particularly in
observational studies) it's important to take into account other variables. For
instance, wearing a nicotine patch is associated with successfully quitting smoking--
but how strongly depends on whether there are other smokers in the house. An experimental
design would randomly assign the patch to some participants (and a placebo patch to others).
And, the likelihood of having other smokers in the house will randomize out --
since the patch was randomly assigned to participants, those with the patch are no more
likely to have other smokers in the house than those without the patch. After this
experiment, almost half of the patch group had quit smoking, while only about 20% of
the placebo group had quit. However, the researchers recognized that when there were
no other smokers in the home, the eight-week success rate was actually 58% for
the patch group--when there were smokers in the home, the eight-week success rate for
the patch group was 31%. So it would be misleading to claim that 50% of smokers
quit with the patch--the effect of the patch depended on whether there were smokers
in the home. When the effect of one variable depends on the value of another--that
is, the effect of the patch depended on "no other smokers in home"--we say that
the variables interact with each other.
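One way to see why a single overall figure can mislead is to treat the overall success rate as a weighted average of the two household rates. In the sketch below, the 58% and 31% figures come from the study as described above. The argument `share_no_other_smokers` is a hypothetical input, since the household mix of the patch group is not given here.

```python
# Eight-week success rates for the patch group, as reported above.
RATE_NO_OTHER_SMOKERS = 0.58
RATE_OTHER_SMOKERS = 0.31

def overall_patch_rate(share_no_other_smokers):
    """Overall patch-group success rate for a given household mix.

    `share_no_other_smokers` is a hypothetical input: the fraction of patch
    users living with no other smokers. It is not reported in these notes.
    """
    share = share_no_other_smokers
    return share * RATE_NO_OTHER_SMOKERS + (1 - share) * RATE_OTHER_SMOKERS

for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    rate = overall_patch_rate(share)
    print(f"share without other smokers = {share:.2f} -> overall rate = {rate:.3f}")
```

The overall rate moves anywhere between 31% and 58% depending on who happens to be in the group, so reporting it alone hides the fact that the patch works quite differently in the two kinds of households.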
This experiment also illustrates what we've covered regarding representative samples
and problems of timing. The subjects recruited for the sample had to be
smokers (at least a pack a day for the last year--and they were physiologically tested
to make sure that they were smokers), they had to be in good health, and (most important)
they had to be motivated to quit. So, conclusions can only be drawn to a similar population.
In terms of timing, "quitting" was defined as "self-reported abstinence (not even a
puff) since the last visit and an expired air carbon monoxide level of 8 ppm
or less." This is often an issue when measuring the effectiveness of alcohol treatment
programs--it is difficult to accurately physiologically measure exposure to alcohol,
and often treatment programs rely on self-reports.
VI. Questions
(You can fill these out on the discussion board on Blackboard.)
- 1. Describe an experiment (hypothetical or real) that you believe was designed well.
Explain why you think it was a well-designed experiment. What are the limits of experiments?
- 2. Describe an observational study (hypothetical or real) that you believe was
designed well. Explain why you think it was a well-designed observational study. What
are the limits of observational studies?
- 3. Describe and explain the significance of the following terms. If possible,
give an example.
- descriptive versus inferential statistics
- sample; population
- representative sample
- random samples
- stratified random samples
- double-blind
- control groups
- placebo effect
- interaction between two variables
- Hawthorne effect