When the AAMC revised the MCAT in April 2015, there was much fanfare and hype about two new subjects on the test: biochemistry and psychology. Seemingly under the radar, the AAMC actually added another new topic to the exam that most students are unaware of: experimental design.

The MCAT has always included questions that tangentially asked about how an experiment was put together, but never before has this area been so explicitly and so heavily tested. The two new “scientific inquiry and reasoning skills” on the exam – “reasoning about the design and execution of research” and “data-based and statistical reasoning” – comprise between 35 and 40 questions on the new test. To put that in perspective, that’s more questions than physics and organic chemistry, combined. So, what do they actually mean when they talk about testing experimental design? Broadly, that means three things:

 

Looking for proper controls

Every researcher knows that an experiment has to have proper controls. There’s no way to tell if your treatment had any effect if you don’t have a baseline to compare against. While this isn’t new for the MCAT, the degree to which it is being tested is.

We also need to be aware of the difference between positive and negative controls. A negative control is how we typically think of a “control group”. That is, a group that gets no treatment at all. Researchers need to see what happens when we don’t use the treatment or experimental effect. By contrast, a “positive” control is a group that does get a treatment, but not an experimental one. The positive control group gets a treatment that we already know works, sometimes referred to as the “gold standard.” Researchers need to do both positive and negative controls when doing biomedical research, especially when the research has a psychosocial component.

As an example, consider a case in which researchers are investigating a new pain medication. The negative control group would receive the standard sugar pill placebo. The experimental group would receive the new drug. And the positive control group would receive a drug already proven to reduce pain – aspirin, for example. On the MCAT, test takers would be expected to look for some sort of discussion of these control groups in the passage. If the discussion is missing, then there’s a pretty serious flaw in the study. Which leads us to:

Spotting limitations in experimental procedures

Questions on the MCAT will often have researchers drawing conclusions that are much too broad for the given data. To succeed on the exam, you’ll need to spot when researchers reach too far.

As mentioned earlier, the simplest example is a lack of controls. Missing a control group is such a serious flaw in an experimental design that almost no conclusions can be drawn from the data.

More common, however, is when the data set is reasonable, yet not strong enough to make the sorts of conclusions the questions will present. The two big issues the MCAT likes to test are p-values and correlation/causation flaws.

In the latter case, we must remember that data showing correlations (whenever A shows up, B also does) do not directly prove causation (A causes B). In a recent official sample passage from the AAMC, the data showed that parents with anxiety problems tended to have children with anxiety problems. The question then asked whether or not we can conclude that there is a genetic causal link here. Needless to say, the answer is “no” (I won’t say more to avoid spoiling that question).

The p-value is a statistical tool that tells us how likely it is that we would see results at least as extreme as ours if chance alone were at work (that is, if the null hypothesis were true). Suppose we are testing the pain medication mentioned earlier. We find that users of the new experimental drug report, on average, 10% more pain relief than users of aspirin. That sounds good, but we have to make sure this is significant, so we calculate the p-value for our data. If the value is, say, 0.01, that means that if the new drug were really no better than aspirin, a difference this large would turn up by chance only 1% of the time. That makes random chance an unlikely explanation for our findings.

Finally, we should note that when it comes to the p-value, smaller is better. A p-value of 0.0001 means that results this extreme would occur only 0.01% of the time if the null hypothesis (the effect is just random variation) were true, which is strong evidence against the null hypothesis. The rule of thumb here is that the p-value must be 0.05 or smaller for a result to count as statistically significant. The AAMC is fond of giving us an experiment that has a p-value of 0.1 (or higher!) and expecting us to recognize that the data doesn’t show a statistically significant link. Without that validation, even if there appears to be a relationship, the numbers don’t support it and we cannot say it is true.
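
To make the idea concrete, here is a minimal Python sketch of one way a p-value can be estimated: a permutation test. This is an illustration only, not the method used in any AAMC passage. The pain-relief scores and the function name permutation_p_value are invented for this example. The idea is to shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one we observed.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Hypothetical helper for illustration; not from any official source.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Shuffling simulates the null hypothesis: labels do not matter.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Invented pain-relief scores on a 0-10 scale (assumed data).
new_drug = [7, 8, 6, 9, 7, 8, 7, 9]
aspirin = [5, 6, 5, 7, 6, 5, 6, 6]

p = permutation_p_value(new_drug, aspirin)
verdict = "significant" if p <= 0.05 else "not significant"
print(f"p = {p:.4f} ({verdict} at the 0.05 cutoff)")
```

On test day you will never compute a p-value yourself. The point of the sketch is only to show what the number means: the fraction of chance-only shuffles that look as impressive as the real data.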

Solving “calculation” questions without doing any calculation

Finally, data-based reasoning often involves doing calculation-based questions. The test may ask us to do calculations related to physics, general chemistry, biochemistry, or even biology. Although we won’t see many, they can be exceptionally time-consuming and frustrating. After all, you could spend several precious minutes solving a math problem, only to get a number that isn’t even among the answer choices!

So when it comes to managing tough calculations, you need to do two things: look at the units and estimate aggressively. When it comes to the units, the MCAT loves to give us questions with choices using two different units. For example, a question might ask about the tension in a muscle and give two answer choices in newtons and two answer choices in joules. While the unprepared test-taker would just dive right into equations, the savvy student will first eliminate the two choices given in joules. We know that joules are a measure of energy, but tension is a force. It’s measured in N, not J. At that point, it’s often the best test-management choice to simply take your best guess and move on, so as to conserve precious time. Your goal is to work smarter, not harder on test day.
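
The unit check can be written out as a tiny Python sketch. The answer choices, their numbers, and the helper name eliminate_by_units are all made up for illustration; only the newton-versus-joule distinction comes from the example above.

```python
# Hypothetical answer choices for a muscle-tension question: (value, unit).
choices = {
    "A": (15, "N"),
    "B": (150, "N"),
    "C": (15, "J"),
    "D": (150, "J"),
}

# What each unit measures (only the two units from the example).
UNIT_MEASURES = {"N": "force", "J": "energy"}


def eliminate_by_units(answer_choices, wanted_quantity):
    """Keep only the choices whose unit measures the quantity asked for."""
    return {
        label: (value, unit)
        for label, (value, unit) in answer_choices.items()
        if UNIT_MEASURES.get(unit) == wanted_quantity
    }


# Tension is a force, so the joule choices drop out before any math is done.
remaining = eliminate_by_units(choices, "force")
print(sorted(remaining))
```

With two choices gone, even a pure guess now has a 50% chance of being right.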

Second, the calculations should only be done after estimating aggressively. If you’re doing a limiting reagent problem with stoichiometry and they tell you that you have 364 g of sucrose, don’t round that off to 365 g or even 360 g. Instead, recognize that the molecular weight of sucrose is around 342 g/mol, and round off “364 g of sucrose” to “about 1 mole”. Numerical answer choices are always spread fairly far apart, allowing us to do that kind of rounding off without sacrificing accuracy.
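
To see how little the aggressive rounding costs, here is a short Python check of the sucrose example. The variable names are my own, and the molar mass of 342.3 g/mol is the standard value for sucrose (C12H22O11); the 10% tolerance is just an assumed stand-in for widely spaced answer choices.

```python
SUCROSE_MOLAR_MASS = 342.3  # g/mol for C12H22O11
mass_g = 364.0              # mass given in the example problem

exact_moles = mass_g / SUCROSE_MOLAR_MASS
estimated_moles = round(exact_moles)  # the "about 1 mole" shortcut

# How far off is the shortcut, relative to the exact answer?
relative_error = abs(estimated_moles - exact_moles) / exact_moles

print(f"exact: {exact_moles:.3f} mol, estimate: {estimated_moles} mol")
print(f"relative error: {relative_error:.1%}")
```

The estimate is off by only a few percent, which is far smaller than the gap between typical answer choices.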

So how do I practice these skills?

The only way to develop the expertise you need is to practice, practice, practice! To that end, you should start by going to http://www.aamc.org and picking up the official practice materials that the AAMC offers. These include the Official Guide as well as the Sample Test.

Most students find, however, that the limited resources offered by the AAMC are not enough. They want the opportunity to practice multiple full exams and full sections under timed conditions. To that end, students often find success with the packages of full length exams available at nextstepmcat.com.

About the author

Bryan Schnedeker is the National MCAT Director at Next Step Test Preparation, a company that specializes in 1-on-1 tutoring for the MCAT. Bryan has taught the MCAT for over a decade and has scored a 44 on the test himself.