
An experiment is a study that imposes one or more treatments on the participants and controls their environment in such a way that clear comparisons [e.g. to the lack of treatment] can be made. After the treatments are applied, the responses (i.e. effects or the lack thereof) are recorded.[1,pp.12,60] Unlike observational studies, which aim not to affect the individuals being studied, experiments aim to affect them in order to study the consequence or degree of change.

An individual participating in a study is called a subject. A response is the variable whose outcome is of interest in the experiment. A factor is a variable whose effect on the response is being studied. A level is a possible value of a factor; each factor has a certain number of levels. A treatment is a combination of levels of the factors being studied. If there is only one factor, then levels and treatments are the same thing. If there are several factors, then each combination of levels of the factors is called a treatment.[1,pp.276-277]

At the end of the experiment, the responses of participants in the treatment group are compared with the responses from the control group. When designed correctly, an experiment can establish a cause-and-effect relationship between treatment and response, provided the difference in responses between the treatment group and the control group is statistically significant.[1,pp.12,60-61] Unfortunately, if an experiment was done poorly, nothing can be done after the fact other than to ignore the results.[1,p.292]
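As a minimal sketch of the terminology above (the factor names and levels are made up for illustration), the treatments of a multi-factor experiment are simply the combinations of one level per factor:

```python
from itertools import product

# Hypothetical experiment with two factors and their levels.
factors = {
    "dose": ["low", "high"],           # factor 1: two levels
    "frequency": ["daily", "weekly"],  # factor 2: two levels
}

# Each combination of one level per factor is a treatment.
treatments = list(product(*factors.values()))
print(treatments)  # 2 x 2 = 4 treatments

# With a single factor, the treatments would simply be that factor's levels.
```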

Subjects who are chosen to participate in the experiment are typically divided into two groups, a treatment group and a control group:[1,p.61]

  • Treatment group -
    consists of participants who receive the experimental treatment, the effect of which is being studied. More than one treatment group is possible.
  • Control group -
    consists of participants who receive a placebo (a fake treatment), a standard (non-experimental) treatment, or no treatment at all.

A blind experiment is one in which the subjects of the study do not know whether they are in the treatment group or the control group; it attempts to control for bias on the part of the participants.[1,p.62] In a double-blind experiment, neither the subjects nor the researchers know who received which treatment or who is in the control group. This controls for potential bias on the part of both the subjects and the researchers: researchers cannot treat subjects differently by expecting or not expecting certain responses from certain individuals based on their group. It also makes it possible to control for the placebo effect in both groups.[1,p.288] Typically a third party, who is not otherwise involved in the experiment, puts the results together independently. A double-blind study is best, because researchers often have a special interest in the results.[1,p.62] In some cases it is known which subjects are in which group because group membership cannot be concealed. However, bias can still be reduced by not telling the subjects the precise purpose of the study. This type of study would have to be reviewed by an institutional review board to make sure it is ethical to carry out.[1,p.289]

Potential problems that can occur with experiments include:[1,p.12]

  • Researchers and/or subjects who know which treatment they got.
  • Factors not controlled for that affect the outcome.
  • Lack of a control group.

An experiment must meet the following criteria to be credible:[1,p.278]

  • Making comparisons - to establish the real effect that a factor has on the response, there has to be a baseline to compare against. This baseline is called the control. Without a control group there is nothing to compare the results to, and it is impossible to know whether the treatment was the real cause of any differences found in the response. Common ways of providing a control are to administer a fake treatment, a standard treatment, or no treatment at all.[1,p.279] A fake treatment (also called a placebo) is indistinguishable from the "real" treatment. A placebo establishes a baseline measure for the responses that would have occurred anyway, in the absence of any treatment. It also takes into account the placebo effect - a response that people have (or report having) because they know they are getting some type of "treatment" (regardless of whether it is real or fake). If the placebo effect is not taken into account, any reported side effect has to be assumed to be due to the treatment. This would give an artificially high number of reported side effects, because at least some of those reports are likely due to the placebo effect and not the treatment itself. When there is a control group to compare with, the percentage of people reporting a side effect in the control group is subtracted from the percentage of people in the treatment group reporting the same side effect, allowing researchers to examine the magnitude of the difference that remains.[1,p.280] In some situations, such as when the subjects have very serious diseases, offering a fake treatment may be unethical. When ethical reasons bar the use of a fake treatment, the new treatment is compared to at least one existing or standard treatment that is known to be effective. In situations where the control group receives no treatment, researchers must make sure that the groups of subjects are similar in as many ways as possible.[1,p.281]
  • Choosing a large enough sample size for accurate results - the larger the sample size, the more accurate the results.[1,p.281] Detecting true statistically significant results in a large population using a small sample is difficult, because small data sets have more variability from sample to sample. Statisticians want at least five subjects per treatment, but (much) more is (much) better. Sometimes researchers draw conclusions based only on those subjects who completed the study. This can be misleading, because the data then contains no information about those who dropped out and why, which can lead to biased results.[1,pp.282-283]
  • Choosing subjects that most accurately represent the target population - in most cases it is not feasible to randomly select subjects for an experiment. In the setting of experiments, a sample means the group of subjects who have volunteered to participate. [In this sense the sample is not entirely random, because the subjects are self-selecting.] However, statisticians can build techniques into the design of an experiment to help minimise the potential bias that can occur.[1,pp.282-283]
  • Assigning subjects to treatment and control groups randomly - once the sample of participants has been decided upon, the subjects should be divided into the treatment group and the control group randomly. Making random assignments of subjects to treatments is a critical step towards minimising bias in an experiment. Subjects must be randomly assigned to groups by a third party and not be allowed to choose which group they will be in. The goal of random assignment is to create homogeneous groups, where any unusual characteristics have an equal chance of appearing in any of the groups.[1,pp.283-284]
  • Controlling for confounding variables - a confounding variable is a variable or factor that was not controlled for in the study, but can have an influence on the results. Researchers try to control for as many confounding variables as they can anticipate, trying to minimise their possible effect on the response. In experiments involving human subjects, there are usually many confounding variables. Some, such as timing, can be controlled for in the design of the study, but others depend entirely on the individual in the study. The ultimate control of subject-specific confounding variables is to use pairs of people who are matched according to important variables, or simply to record responses from the same person twice: once with/after the treatment and once without/before it. This type of experiment is called a matched-pairs design.[1,pp.285-286]
  • Respecting ethical issues - some experiment designs that are theoretically possible are not ethical in practice, for example forcing a subject to engage in a specific behaviour, such as smoking, to study its adverse effects on health. In such cases it is only possible to study people who already have a particular condition and work backwards to see what factors may have caused it. But since it is impossible to control for historical confounding variables, singling out one particular cause becomes difficult.[1,p.286] Although causes of diseases cannot be determined ethically by conducting experiments, new treatments can be tested using experiments. Medical studies that involve experiments are called medical trials. New treatments must pass an extensive series of tests that can take years to carry out. One reason the cost of prescription drugs is so high is the massive amount of time and money needed to research and develop new drugs, most of which fail to pass the tests.[1,pp.286-287]
  • Collecting good data - there are three criteria for evaluating data quality: reliability, validity and absence of bias. Data is reliable if subsequent measurements give repeatable results. Unreliable data comes from unreliable measurement instruments (such as poor calibration) or unreliable data collection methods (such as ambiguous questions). Data is valid if it measures what it is supposed to measure; using an appropriate measurement instrument is important, and the data collected must be appropriate for the goal of the study. Data is unbiased if it contains no systematic errors that either add to or subtract from the true values. Biased data systematically over-measures or under-measures the true results. Bias can be caused by instrument error, leading questions or preconceived expectations of the researchers.[1,pp.287-288]
  • Analyzing data properly - the choice of analysis is just as important for the quality of the results as any other aspect of the study. A proper analysis should be planned in advance, during the design phase of the study. The analysis has to legitimately and correctly answer the questions initially posed by the study. Some basic types of statistical analyses (also known as statistical inference) include confidence intervals, hypothesis tests, and correlation and regression analyses.[1,p.289]
  • Making appropriate conclusions - the biggest mistakes made when drawing conclusions from studies are overstating the results, making connections that are not backed up by the statistics, and generalising the results beyond the scope of the study, commonly onto a broader population than was represented in the sample.[1,pp.290-291]
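The side-effect comparison described under "Making comparisons" amounts to a simple subtraction; a minimal sketch with made-up percentages:

```python
# Hypothetical reported side-effect rates (percent of each group).
treatment_group_pct = 12.0  # % reporting the side effect after the real treatment
control_group_pct = 7.0     # % reporting the same side effect on a placebo

# Subtracting the placebo baseline isolates the excess that is
# attributable to the treatment itself rather than to the placebo effect.
excess_pct = treatment_group_pct - control_group_pct
print(f"Excess side-effect rate attributable to treatment: {excess_pct:.1f}%")
```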
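The sample-to-sample variability behind the "large enough sample size" criterion can be simulated; in this sketch (population, response rate and seed are made up) small samples scatter far more around the true rate than large ones:

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population in which the true response rate is 50%.
def sample_proportion(n: int) -> float:
    """Proportion of 'successes' observed in a random sample of size n."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

small = [sample_proportion(10) for _ in range(5)]
large = [sample_proportion(1000) for _ in range(5)]
print("n=10:  ", small)   # typically a wide spread around 0.5
print("n=1000:", large)   # typically a tight spread around 0.5
```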
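The random-assignment step above can be sketched as a shuffle-and-split performed by a third party (the subject names and seed are arbitrary):

```python
import random

# Hypothetical pool of volunteer subjects.
subjects = [f"subject_{i}" for i in range(10)]

# Shuffle the pool and split it in half, so every subject has an
# equal chance of landing in either group.
rng = random.Random(42)  # fixed seed only to make the sketch reproducible
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]
print(treatment_group, control_group)
```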
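In a matched-pairs design, each subject can serve as their own control; a minimal sketch with made-up before/after measurements shows why the within-pair differences are what gets analysed:

```python
# Hypothetical matched-pairs data: the same subjects measured
# before and after the treatment (e.g. blood pressure).
before = [140, 132, 155, 128, 147]
after = [135, 130, 148, 126, 140]

# Analysing the within-pair differences removes subject-specific
# confounding variables (age, lifestyle, etc.) from the comparison.
differences = [a - b for a, b in zip(after, before)]
mean_difference = sum(differences) / len(differences)
print(f"Mean change: {mean_difference:.1f}")
```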