Can You Recover From Sleep Deprivation?

How would a statistician answer:

Does sleep deprivation impair performance even after a full night of recovery sleep?

Experimental design

Student-aged volunteers were recruited and divided into two groups:

The "Deprived" group had a sleepless night, then a recovery night.
The "Unrestricted" group had good sleep for two nights.

Both groups were tested on a cognitive task before and after the two-night ordeal, and their performance was assessed with a quantitative measure.

The improvement (positive or negative) in their score after the sleepless/sleepful experience was recorded as data.

Results: Back-to-back stem plot

Results: Same data, dot plot

Question 1: Convinced yet?

Homework A. Question #1: Looking at the data in the stem plot and in the dot plot on the previous slides, are you convinced that sleep deprivation impairs performance even after a single night's recovery?

Alternatives for assigning groups

How should we assign groups?

The "nice" way: allow each subject to decide which group they will be in.

The "evil" way: ask subjects to state a preference, then assign them to the opposite group.

Nice versus evil?

HW #2a: Would you expect the results of the "nice" study to be any different from the "evil" one?

HW #2b: Would you expect the results of the "nice" study to be any different from the "evil" one? How?

Assigning groups, best practice

The "random" way: let each possible group assignment have an equal chance of being used.
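One way to picture the "random" way is shuffling a list of subjects and cutting it at a fixed point. The sketch below is only an illustration: the subject IDs, the function name `assign_groups`, and the seed are made up, and the group sizes (21 subjects, 11 Deprived) are the ones quoted elsewhere in these slides.

```python
import random

def assign_groups(subjects, n_deprived, seed=None):
    """Shuffle the subjects, then cut the list: every grouping is equally likely."""
    rng = random.Random(seed)      # a seed only makes the demo repeatable
    shuffled = list(subjects)      # copy, so the caller's list is left alone
    rng.shuffle(shuffled)
    return shuffled[:n_deprived], shuffled[n_deprived:]

# Hypothetical subject IDs; the real study's roster is not given in the slides.
subjects = [f"S{i:02d}" for i in range(1, 22)]
deprived, unrestricted = assign_groups(subjects, n_deprived=11, seed=0)
print(len(deprived), len(unrestricted))
```

Neither the subjects' preferences nor the experimenter's hunches enter the assignment; only the shuffle does.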

The problem with random assignment

With random group assignment, the single most "evil" grouping is just as likely as any other grouping.

Can we prove anything? No, not like a mathematician can prove the Pythagorean Theorem.

But with a well-shuffled deck we can say something precise about likelihood. For example…

HW #3: …if you figure out that there are 20,000 ways of shuffling the deck, each equally likely, and you decide that 2,000 of those shuffles are "evil," how likely are you to get an evil shuffle? In other words, how frequently, as a percentage, would evil occur, if you repeated the experiment many times?

Main question

How can we assess the strength of the evidence seen in the experiment?

Hint 1: The null hypothesis

Most studies that try to show an "effect" start by framing a statement, called the null hypothesis, that asserts that "there is no effect," i.e. sleep deprivation has no effect on performance after a night of recovery.

Our null hypothesis: All subjects' scores would have been exactly the same, no matter what group they had been assigned to.

If the null hypothesis is correct, then any evidence of an effect that we think we see in the data is actually due to nothing more than an unlikely grouping of subjects whose scores vary naturally.

Hint 2: The test statistic

The test statistic is a value used to assess the strength of the evidence for or against the null hypothesis. The test statistic is always easy to compute from data, but not necessarily easy to directly interpret.

Our test statistic will be "Mean of Unrestricted Improvement" minus "Mean of Deprived Improvement." In the experiment, our test statistic has a value of 19.82 - 3.9 = 15.92.

More positive separation between the two groups' performances (i.e. a greater test statistic) implies stronger evidence that the null hypothesis is false, and that sleep deprivation does impair performance.
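As a rough sketch, the test statistic is just a difference of two means. The function name `mean_difference` and the two short score lists are invented for illustration (the slides do not list the individual scores); only the group means 19.82 and 3.9 come from the experiment.

```python
from statistics import mean

def mean_difference(unrestricted, deprived):
    """Mean of Unrestricted improvement minus mean of Deprived improvement."""
    return mean(unrestricted) - mean(deprived)

# Made-up improvement scores, NOT the study's data.
unrestricted_demo = [10.0, 20.0, 30.0]   # mean 20.0
deprived_demo = [0.0, 5.0, 10.0]         # mean 5.0
demo_statistic = mean_difference(unrestricted_demo, deprived_demo)
print(demo_statistic)

# The value reported in the slides, from the two group means.
reported = round(19.82 - 3.9, 2)
print(reported)
```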

Hint 3: Permutations

A permutation means we move the subjects around from group to group—so that some of the Deprived become Unrestricted and some of the Unrestricted become Deprived, but the number of subjects in each group remains the same.

In practice, we would have to repeat the experiment with each permutation, but if the null hypothesis were true, we would know that each subject's score, in their new group, would be the same as in their original one. With that knowledge we could compute the new test statistic for any permutation.
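To make this concrete, here is a minimal sketch of recomputing the test statistic for a new grouping while every subject keeps the same score, which is exactly what the null hypothesis allows. The four scores and the function name `statistic_for_grouping` are hypothetical.

```python
from statistics import mean

def statistic_for_grouping(scores, deprived_idx):
    """Test statistic when the subjects at deprived_idx form the Deprived group.

    Under the null hypothesis each score is fixed, so only the labels move.
    """
    deprived_idx = set(deprived_idx)
    deprived = [s for i, s in enumerate(scores) if i in deprived_idx]
    unrestricted = [s for i, s in enumerate(scores) if i not in deprived_idx]
    return mean(unrestricted) - mean(deprived)

# Made-up scores for four subjects, NOT the study's data.
scores = [4.0, 8.0, 15.0, 21.0]
original = statistic_for_grouping(scores, {0, 1})  # Deprived: 4 and 8
swapped = statistic_for_grouping(scores, {0, 2})   # swap subjects 1 and 2
print(original, swapped)
```

The group sizes stay the same after the swap; only the membership, and therefore the test statistic, changes.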

Hint 4: A computer can cycle through all permutations

A computer can cycle through all possible permutations and calculate the test statistic for each possible grouping, assuming the null hypothesis is true. For dividing 21 subjects into two groups of 11 Deprived and 10 Unrestricted, there are only 352,716 unique groupings.
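The count of groupings can be checked directly: choosing which 11 of 21 subjects are Deprived fixes the whole grouping. The sketch below uses Python's standard library; the helper name `groupings` is an assumption made for illustration, and the small 4-subject case is there only so the enumeration is easy to inspect.

```python
from itertools import combinations
from math import comb

n_subjects, n_deprived = 21, 11

# Number of ways to choose the Deprived group; the rest are Unrestricted.
n_groupings = comb(n_subjects, n_deprived)
print(n_groupings)

def groupings(n_subjects, n_deprived):
    """Lazily yield every possible set of Deprived subject indices."""
    return combinations(range(n_subjects), n_deprived)

# Tiny case: 4 subjects, 2 Deprived.
small = list(groupings(4, 2))
print(small)
```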

Hint 5: Evil versus extreme

An "evil" grouping places subjects most susceptible to sleep deprivation into the Deprived group and the rest into the Unrestricted group.

An "extreme" grouping places subjects scoring lower (with or without sleep deprivation) in the Deprived group, and the rest in the Unrestricted group.

Under the null hypothesis, no grouping is evil, but the test statistic tells us how extreme a grouping is.

Hint 6: The most extreme grouping

The most extreme grouping places the 11 worst scorers (with or without sleep deprivation) into the Deprived group, and the 10 best scorers into the Unrestricted group. With the scores seen in the experiment, assuming the null hypothesis is true, the maximum possible test statistic is 23.27, compared to the 15.92 actually seen.
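A short sketch of how the most extreme grouping could be found: sort the scores and put the lowest ones in the Deprived group. The function name `max_test_statistic` and the four demo scores are made up; reproducing 23.27 would need the study's actual 21 scores, which these slides do not list.

```python
from statistics import mean

def max_test_statistic(scores, n_deprived):
    """Largest possible test statistic: the lowest scorers go to Deprived."""
    ordered = sorted(scores)
    deprived = ordered[:n_deprived]       # the n_deprived worst scorers
    unrestricted = ordered[n_deprived:]   # everyone else
    return mean(unrestricted) - mean(deprived)

# Made-up scores, NOT the study's data.
demo_scores = [21.0, 4.0, 15.0, 8.0]
demo_max = max_test_statistic(demo_scores, 2)
print(demo_max)
```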

If the null hypothesis were correct, then there would be only a 1 in 352,716 chance of getting the maximum possible test statistic with a random grouping.

HW #4: If you got the maximum possible test statistic, what could you say about the null hypothesis?

Hint 7: Our objective

We want to define a number that will tell us how likely it would be to see similarly extreme data, if the null hypothesis were true.

If we find it unlikely, we interpret that result as strong evidence against the null hypothesis and for the statement that performance is impaired even after a night of recovery.

The hard question + tools to answer it:

You have:

the test statistic seen in the actual experiment, 15.92, and
a computer that can run through all possible permutations, and compute the test statistic under different groupings.

HWA #5: What are we going to tell the computer to do with the results? And how will we interpret what we get?

These slides are located at:

http://stats.seancarver.org/sleep.html