5 NHST: Binomial test and One-Sample t-test

5.1 Overview

This chapter will introduce you to Null Hypothesis Significance Testing (NHST): In quantitative research we collect a sample, calculate a summary statistic about that sample, and use probability to establish how likely that statistic would be under certain assumptions. The aim is to draw inferences from a sample to a population.

However, these concepts and ideas are hard to grasp at first, and it takes playing around with data a few times to get a better understanding of them. As such, to demonstrate these ideas further, and to start introducing commonly used tests and approaches to reproducible science, we will look at data related to sleep. The study we will use to explore NHST, by one of our team members, makes use of a well-known task in Psychology, the Posner Paradigm: Woods et al. (2009). The clock as a focus of selective attention in those with primary insomnia: An experimental study using a modified Posner paradigm.

In this chapter, through the activities, we will:

  1. Introduce testing a hypothesis through null hypothesis significance testing (NHST).
  2. Learn about Binomial tests.
  3. Learn about One-sample t-tests.

5.2 Brief Introduction to NHST

This will be a very short introduction to the concept of NHST before we dive into the more practical aspects. The main idea of NHST is that you test the assumption that there is no difference between two values - let's say two means for the moment. The null hypothesis states that there is no difference between the two means (or groups) of interest. As such, any test that you do on the difference between the means of those two groups is trying to determine the probability of finding a difference of the size you found, or larger, in your sample, if there is actually no real difference between the two groups in the population.

Let's say we ran an experiment and collected a sample for it: In the experiment, we have two groups, A and B, and we calculated the difference between the means of those two groups to be \(D_{diff} = 7.39\). Putting that in terms of the Null Hypothesis (\(H_{0}\)) we now want to know: What is the probability of finding a difference between means of 7.39 (or larger) if there is no real difference between the two groups in the population? In order to test this question (our Null Hypothesis) we need to compare our observed difference against a distribution of possible differences to see how likely the observed difference is in that distribution - extreme values, i.e., large differences between groups, are located in the tails of the distribution.

The p-value is the probability of finding a difference equal to or greater than the one we found in our sample if there is no difference in the population. Thus, let's say our p-value is p = .017. This indicates a very small probability of finding a difference equal to or greater than ours if there were no difference in the population. The obtained p-value is also smaller than the standard cut-off that we use in Psychology of \(p \le .05\). As such we would reject our null hypothesis and suggest that there is a significant difference between the two groups.

The following resource by Daniel Lakens is very helpful in deepening the understanding of the p-value and we recommend taking a look at it: Understanding common misconceptions about p-values

5.3 Background of data: Sleep

Woods and colleagues (2009) were interested in how the attention of people with poor sleep (Primary Insomnia - PI) was more tuned towards sleep-related stimuli than the attention of people with normal sleep (NS). Woods et al. hypothesised that participants with poor sleep would be more attentive to images related to a lack of sleep (i.e., an alarm clock showing 2AM) than participants with normal sleep would be. To test their hypothesis, the authors used a modified Posner paradigm, shown in Figure 1 of the paper, where images of an alarm clock acted as the cue on both valid and invalid trials, with the symbol ( .. ) being the target.

As can be seen in Figure 3 of Woods et al. (2009), on valid trials, whilst Primary Insomnia participants were numerically faster in responding to the target, suggesting a slight increase in attention to the sleep-related cue compared to the Normal Sleepers, there was no significant difference between groups. In contrast, for invalid trials, where poor sleep participants were expected to be distracted by the cue, the authors did indeed find a significant difference between groups consistent with their alternative hypothesis \(H_{1}\). Woods et al. concluded that poor sleepers (Primary Insomnia participants) were slower to respond to the target on invalid trials, compared to Normal Sleepers, due to the attention of the Primary Insomnia participants being drawn to the misleading cue (the alarm clock) on those trials. This increased attention to the sleep-related cue led to an overall slower response to the target on invalid trials.

Now, imagine you are looking to replicate this finding from Woods et al. (2009). As a pilot study, to test recruitment procedures, you gather data from 22 Normal Sleepers. It is common to use only the control participants in a pilot (in this case the NS participants) as they are more plentiful in the population than the experimental group (in this case PI participants), and this saves recruiting participants from the PI group, who may be harder to obtain in the long run.

After gathering your data, you want to check the recruitment process: Whether or not you were able to draw a sample of normal sleepers similar to the sample drawn by Woods et al. To keep things straightforward, allowing us to understand the analyses better, we will only look at valid trials today, in NS participants, but in effect you could perform this test on all groups and conditions.

Are These Participants Normal Sleepers (NS)?

Below is the data from the 22 participants you have collected in your pilot study. Their mean reaction time for valid trials (in milliseconds) is shown in the right hand column, valid_rt.

Table 5.1: Pilot Data for 22 Participants on a Sleep-Related Posner Paradigm. ID is shown in the participant column and mean reaction time (ms) on valid trials is shown in the valid_rt column.
participant valid_rt
1 631.2
2 800.8
3 595.4
4 502.6
5 604.5
6 516.9
7 658.0
8 502.0
9 496.7
10 600.3
11 714.6
12 623.7
13 634.5
14 724.9
15 815.7
16 456.9
17 703.4
18 647.5
19 657.9
20 613.2
21 585.4
22 674.1

If you look at Figure 3 of Woods et al. (2009) you will see that, on valid trials, the mean reaction time for NS participants was 590 ms with an SD of 94 ms. As above, as part of our pilot study, we want to confirm that the 22 participants we have gathered are indeed Normal Sleepers. We will use the mean and SD from Woods et al. to confirm this. Essentially we are asking if the participants in the pilot are responding in a similar fashion to NS participants in the original study.

When using NHST we are working with both a null hypothesis (\(H_{0}\)) and an alternative hypothesis (\(H_{1}\)). Thinking about this experiment it makes some logical sense to think about it in terms of the null hypothesis (\(\mu_1 = \mu_2\)). So we could phrase our hypothesis as: "We hypothesise that there is no significant difference in mean reaction times to valid trials on the modified Posner task between the participants in the pilot study and the participants in the original study by Woods et al."

There are actually a few ways to test this null hypothesis. Today we will show you two of them: In Tasks 1-3 we will use a binomial test and in Tasks 4-8 we will use a one-sample t-test.

The Binomial test is a very simple test that classifies each participant as being either above or below a cut-off point, e.g. a mean value, and then looks at the probability of finding that number of participants above the cut-off.

The One-sample t-test is similar in that it compares participants to a cut-off, but it compares the mean of the collected sample to an ideal mean, taking the variability of the sample into account. By dividing the difference between the means by the standard error of the sample mean, we can determine whether the sample is similar or not to the ideal mean.

5.4 The Binomial Test

The Binomial test is one of the most "basic tests" in null hypothesis testing in that it uses very little information. The binomial test is used when a study has two possible outcomes (success or failure) and you have an idea about what the probability of success is. This will sound familiar from the work we did on the binomial distribution in Chapter 4.

A Binomial test tests whether an observed result is different from what was expected. For example, is the number of heads in a series of coin flips different from what was expected? Or, in our case for this chapter, we want to test whether our normal sleepers are giving reaction times that are the same as or different from those measured by Woods et al. The following tasks will take you through the process.
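To make that logic concrete before we turn to the sleep data, here is a minimal sketch using a hypothetical fair coin (the numbers are made up purely for illustration): what is the probability of obtaining 8 or more heads in 10 flips?

```r
# Hypothetical coin example, not the chapter data:
# probability of observing 8 or more heads in 10 flips of a fair coin

# Summing the exact probabilities of 8, 9 and 10 heads:
sum(dbinom(8:10, size = 10, prob = .5))

# Or via the cumulative distribution: P(X > 7), i.e. P(X >= 8)
pbinom(7, size = 10, prob = .5, lower.tail = FALSE)

# Both return approximately 0.0547
```

Since .0547 is just above the conventional .05 cut-off, in this made-up example we would (narrowly) fail to reject the null hypothesis that the coin is fair.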

5.4.1 Task 1: Creating a Dataframe

First we need to create a tibble with our data so that we can work with it.

  • Enter the data for the 22 participants displayed above into a tibble and store it in ns_data. Have one column showing the participant number (called participant) and another column showing the mean reaction time (called valid_rt).

ns_data <- tibble(participant = c(NULL,NULL,...), valid_rt = c(NULL,NULL,...))

  • You could type each value out or copy and paste them from the hint below.


The values are: 631.2, 800.8, 595.4, 502.6, 604.5, 516.9, 658.0, 502.0, 496.7, 600.3, 714.6, 623.7, 634.5, 724.9, 815.7, 456.9, 703.4, 647.5, 657.9, 613.2, 585.4, 674.1


5.4.2 Task 2: Comparing Original and New Sample Reaction Times

Our next step is to establish how many participants from our pilot study are above the mean in the original study by Woods et al.

  • In the original study the mean reaction time for valid trials was 590 ms. Store this value in woods_mean.
  • Now write code to calculate the number of participants in the new sample (ns_data created in Task 1) that had a mean reaction time greater than the original paper's mean. Store this single value in n_participants.
  • The function nrow() may help here.
  • nrow() is similar to count() or n(), but nrow() returns the number as a single value and not in a tibble.
  • Be sure whatever method you use you end up with a single value, not a tibble. You may need to use pull() or pluck().
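If the distinction between these functions is unclear, here is a quick sketch with a made-up tibble (the name demo is purely hypothetical):

```r
library(tidyverse)

# A small made-up tibble to contrast nrow() with count()
demo <- tibble(x = 1:3)

nrow(demo)                    # a single value: 3
count(demo)                   # a 1x1 tibble containing n = 3
demo %>% count() %>% pull(n)  # pull() extracts the single value 3 from the tibble
```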

Part 1

woods_mean <- value

Part 2

  • There are a few ways to achieve this. Here are a couple you could try:

ns_data %>% filter(x ? y) %>% count() %>% pull(?)

or

ns_data %>% filter(x ? y) %>% summarise(n = ?) %>% pull(?)

or

ns_data %>% filter(x ? y) %>% nrow()

or

ns_data %>% filter(x ? y) %>% dim() %>% pluck(?)


Quickfire Questions

  • The number of participants that have a mean reaction time for valid trials greater than that of the original paper is:

5.4.3 Task 3: Calculating Probability

Our final step for the binomial test is to compare our value from Task 2, 16 participants, to our hypothetical cut-off. We will work under the assumption that the mean reaction time from the original paper, i.e. 590 ms, is a good estimate for the population of good sleepers (NS). If that is true then each new participant that we have tested should have a .5 chance of being above this mean reaction time (\(p = .5\) for each participant).

To phrase this another way, the expected number of participants above the cut-off would be \(.5 \times N\), where \(N\) is the number of participants, or \(.5 \times 22 = 11\) participants.

  • Calculate the probability of observing at least 16 participants out of your 22 that had a valid_rt greater than the Woods et al. (2009) mean value.
  • Hint: We looked at very similar questions in Chapter 4 using dbinom() and pbinom().
  • Hint: The key thing is that you are asking about obtaining X or more successes. You will need to think back to cut-offs and lower.tail.

Think back to Chapter 4 where we used the binomial distribution. This question can be phrased as, what is the probability of obtaining X or more successes out of Y trials, given the expected probability of Z.

  • How many Xs? (see question)
  • How many Ys? (see question)
  • What is the probability of being either above or below the mean/cut-off? (see question)
  • You can use a dbinom() %>% sum() for this or maybe a pbinom()


Quickfire Questions

  • Using the Psychology standard \(\alpha = .05\), do you think these NS participants are responding in a similar fashion as the participants in the original paper? Select the appropriate answer:
  • According to the Binomial test would you accept or reject the null hypothesis that we set at the start of this test?

The probability of obtaining 16 participants with a mean reaction time greater than the cut-off of 590 ms is p = .026. This is smaller than the field norm of p = .05. As such we can say that, according to the binomial test, the new sample appears to be significantly different from the old sample, as there is a significantly larger number of participants above the cut-off (M = 590 ms) than would be expected if the new sample and the old sample were responding in a similar fashion. We would therefore reject our null hypothesis!

5.5 The One-Sample t-test

The binomial test suggested that there was a significant difference in mean reaction times to valid trials on the modified Posner task between the participants in the pilot study and the participants in the original study by Woods et al. However, the binomial test did not use all the available information in the data, because each participant was simply classified as being above or below the mean of the original paper, i.e. yes or no. Information about the magnitude of the discrepancy from the mean was discarded. This information is interesting and important, however, and if we want to retain it we need to use a One-sample \(t\)-test.

In a One-sample \(t\)-test, you test the null hypothesis \(H_0: \mu = \mu_0\) where:

  • \(H_0\) is the symbol for the null hypothesis,
  • \(\mu\) (pronounced mu - like few with an m) is the unobserved population mean. It is the unobserved true mean of all possible participants. We don't know this value. Our best guess is the mean of the sample of 22 participants, so we will use that mean here. As such we will substitute this value, which we call \(\bar{X}\) (pronounced X-bar), into our formula instead of \(\mu\).
  • and \(\mu_0\) (mu-zero) is some other mean to compare against (which could be an alternative population or sample mean or a constant). For us this is the mean of the original paper which we observed to be 590 ms.

And we will do this by calculating the test statistic \(t\) which comes from the \(t\)-distribution - more on that distribution below and in the lectures. The formula to calculate the observed test statistic \(t\) for the one-sample \(t\)-test is:

\[t = \frac{\mu - \mu_0}{s / \sqrt{n}}\]

  • \(s\) is the standard deviation of the sample collected,
  • and \(n\) is the number of participants in the sample.


So, we are testing the null hypothesis that \(H_0: \bar{X} =\) 590. As such the formula for our one-sample \(t\)-test becomes:

\[t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}\]

Now we just need to fill in the numbers.
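Before filling in the real numbers, here is a sketch of the formula with made-up values (a hypothetical sample mean of 605 ms and SD of 90 ms from 25 participants, compared against a \(\mu_0\) of 590 ms):

```r
# Made-up values purely for illustration - not the chapter data
x_bar <- 605   # hypothetical sample mean (ms)
mu_0  <- 590   # hypothetical comparison mean (ms)
s     <- 90    # hypothetical sample standard deviation (ms)
n     <- 25    # hypothetical number of participants

t_obs_example <- (x_bar - mu_0) / (s / sqrt(n))
t_obs_example
# 15 / (90 / 5) = 15 / 18, i.e. about 0.83
```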

5.5.1 Task 4: Calculating the Mean and Standard Deviation

  • Calculate the mean and standard deviation of valid_rt for our 22 participants (i.e., for all participant data).
  • Store the mean in ns_data_mean and store the standard deviation in ns_data_sd. Make sure to store them both as single values!

In the below code, replace NULL with the code that would find the mean, m, of ns_data.

ns_data_mean <- summarise(NULL) %>% pull(NULL)

Replace NULL with the code that would find the standard deviation, sd, of ns_data.

ns_data_sd <- summarise(NULL) %>% pull(NULL)

5.5.2 Task 5: Calculating the Observed Test Statistic

From Task 4, you found out that \(\bar{X}\), the sample mean, was 625.464 ms, and \(s\), the sample standard deviation, was 94.307 ms. Now, keeping in mind that \(n\) is the number of observations/participants in the sample, and \(\mu_0\) is the mean from Woods et al. (2009):

  • Use the One-sample t-test formula above to compute your observed test statistic. Store the answer in t_obs .
  • t_obs <- (x - y)/(s/sqrt(n))


Quickfire Questions

Answering this question will help you in this task as you'll also need these numbers to substitute into the formula:

  • The mean from Woods et al. (2009) was , and the number of participants in our sample is: (type in numbers) .
  • Remember the solutions at the end of the chapter if you are stuck. To check that you are correct without looking at the solutions though - the observed \(t\)-value in t_obs, to two decimal places, is

Remember BODMAS and/or PEMDAS when given more than one operation to calculate (i.e. Brackets/Parentheses, Orders/Exponents, Division, Multiplication, Addition, Subtraction).

t_obs <- (sample mean - woods mean) / (sample standard deviation / square root of n)

5.5.3 Task 6: Comparing the Observed Test Statistic to the t-distribution using pt()

Now you need to compare t_obs to the t-distribution to determine how likely the observation (i.e. your test statistic) is under the null hypothesis of no difference. To do this you need to use the pt() function.

  • Use the pt() function to get the \(p\)-value for a two-tailed test with \(\alpha\) level set to .05. The test has \(n - 1\) degrees of freedom, where \(n\) is the number of observations contributing to the sample mean \(\bar{X}\). Store the \(p\) value in the variable pval.
  • Do you reject the null?
  • Hint: The pt() function works similar to pbinom() and pnorm().
  • Hint: Because we want the p-value for a two-tailed test, multiply pt() by two.

Remember to get help you can enter ?pt in the console.

The pt() function works similar to pbinom() and pnorm():

  • pval <- pt(test statistic, df, lower.tail = FALSE) * 2
  • Use the absolute value of the test statistic; i.e. ignore minus signs.
  • Remember, df is equal to n-1.
  • Use lower.tail = FALSE because we want to know the probability of obtaining a value higher than the one we got.
  • Compare the resulting p-value against the field standard of p < .05 to decide whether to reject the null.
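As a generic illustration of pt() with made-up numbers (not the chapter data): for a hypothetical observed t of 2.086 with 20 degrees of freedom, the two-tailed p-value comes out at almost exactly .05, because 2.086 is the critical t-value for \(\alpha = .05\) at df = 20.

```r
# Hypothetical values for illustration only - not the chapter data
t_example  <- 2.086   # made-up observed t-value
df_example <- 20      # made-up degrees of freedom (n - 1 for n = 21)

# Two-tailed p-value: probability of a value above |t|, doubled
pt(abs(t_example), df_example, lower.tail = FALSE) * 2
# approximately 0.05
```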


5.5.4 Task 7: Comparing the Observed Test Statistic to the t-distribution using t.test()

Now that you have done this by hand, try using the t.test() function to get the same result. Take a moment to read the documentation for this function by typing ?t.test in the console window. No need to store the t-test output in a dataframe, but do check that the p-value matches the pval in Task 6.

  • The structure of the t.test() function is t.test(column_of_data, mu = mean_to_compare_against)

The function requires a vector, not a table, as the first argument. You can use the pull() function to pull out the valid_rt column from the tibble ns_data with pull(ns_data, valid_rt).

You also need to include mu in the t.test(), where mu is equal to the mean you are comparing to.


Quickfire Questions

To make sure you are understanding the output of the t-test, try to answer the following questions.

  1. To three decimal places, type in the p-value for the t-test in Task 7

  2. As such this One-sample t-test is

  3. The outcome of the binomial test and the one sample t-test produce answer

5.5.5 Task 8: Drawing Conclusions about the new data

Given these results, what do you conclude about how similar these 22 participants are to the original participants in Woods et al (2009) and whether or not you have managed to recruit sleepers similar to that study?

  • Think about which test used more of the available information.
  • Also, how reliable is the finding if the two tests give different answers?

We have given some of our thoughts at the end of the chapter.

5.6 Practice Your Skills

This activity uses open data from Experiment 1 of Mehr, Song, and Spelke (2016). This exercise is taken from the Open Stats Lab at Trinity University, US.

In order to complete these tasks you will need to download the data .csv file and the .Rmd file, which you need to edit, titled Ch5_PracticeSkills_Template.Rmd. These can be downloaded within a zip file from the link below. Once downloaded and unzipped, you should create a new folder that you will use as your working directory; put the data file and the .Rmd file in that folder and set your working directory to that folder through the drop-down menus at the top. Download the Exercises .zip file from here.

Now open the .Rmd file within RStudio. You will see there is a code chunk for each task. Follow the instructions on what to edit in each code chunk. The exercises in this section will guide you to do some data wrangling, run One-Sample t-tests on published data, and visualise data. Thus, you can check the published paper to see if you were able to replicate their results.

5.6.1 Does Music Convey Social Information to Infants?

Parents often sing to their children and, even as infants, children listen to and look at their parents while they are singing. Research by Mehr, Song, and Spelke (2016) sought to explore the psychological function that music has for parents and infants, by examining the hypothesis that particular melodies convey important social information to infants. Specifically, melodies convey information about social affiliation.

The authors argue that melodies are shared within social groups. Whereas children growing up in one culture may be exposed to certain songs as infants (e.g., “Rock-a-bye Baby”), children growing up in other cultures (or even other groups within a culture) may be exposed to different songs. Thus, when a novel person (someone who the infant has never seen before) sings a familiar song, it may signal to the infant that this new person is a member of their social group.

To test this hypothesis, the researchers recruited 32 infants and their parents to complete an experiment. During their first visit to the lab, the parents were taught a new lullaby (one that neither they nor their infants had heard before). The experimenters asked the parents to sing the new lullaby to their child every day for the next 1-2 weeks.

Following this 1-2 week exposure period, the parents and their infant returned to the lab to complete the experimental portion of the study. Infants were first shown a screen with side-by-side videos of two unfamiliar people, each of whom were silently smiling and looking at the infant. The researchers recorded the looking behavior (or gaze) of the infants during this ‘baseline’ phase. Next, one by one, the two unfamiliar people on the screen sang either the lullaby that the parents learned or a different lullaby (that had the same lyrics and rhythm, but a different melody). Finally, the infants saw the same silent video used at baseline, and the researchers again recorded the looking behavior of the infants during this ‘test’ phase. For more details on the experiment’s methods, please refer to Mehr et al. (2016) Experiment 1.

For the exercise here we will be looking at the baseline data and test trial data for gazing proportion specifically.

Before starting let's check:

  1. The .csv file is saved into a folder on your computer and you have manually set this folder as your working directory.

  2. The .Rmd file is saved in the same folder as the .csv file.

5.6.2 Load in the data

Call tidyverse to the library() and load in the data (socialmelodies_exp1.csv), storing it in the object melodydata.

library("tidyverse")

melodydata <- NULL

View the data

  1. It is always a good idea to familiarise yourself with the layout of the data that you have just loaded in. You can do this through using glimpse() or View() in the Console window. Note, you will not analyse all of these variables. Try to find the variables that are relevant to the study description above.

The Tasks:

Now that we have the data loaded, tidyverse attached, and have viewed our data, you should try to complete the following tasks. Go through the tasks, replacing only the NULL with what each question asks for, and then make sure that the file knits at the end so that you have fully reproducible code.

5.6.3 Task 1 - Filter data

This data file includes the variables for all 5 experiments reported in the paper. We only want to analyse the data for Experiment 1. Using the filter() function, create a new data frame melodydata_exp1 that only contains data from Experiment 1 (a value of 1 in the variable exp1 indicates all data for Experiment 1). Hint: The new data frame should have 32 observations.

melodydata_exp1 <- NULL

5.6.4 Task 2 - Select data

As noted above, we will not be using all variables from melodydata_exp1 and in order to get a better overview of the variables we are particularly interested in, we want to create an object that only contains the variables we are focusing on here.

Some context:

  • First, you want to show that infants' looking behaviour did not differ from chance during the baseline trial (Baseline_Proportion_Gaze_to_Singer). In other words, the infants did not show an attentional bias prior to hearing the unfamiliar others sing the song.
  • Second, you want to examine whether the proportion of infants' looking behaviour toward the singer of the familiar melody (Test_Proportion_Gaze_to_Singer) was higher than chance at the test phase.

Thus, create a new data frame called melodydata_exp1_reduced that only contains the variables id, Baseline_Proportion_Gaze_to_Singer, and Test_Proportion_Gaze_to_Singer.

Hint: The new data frame contains 32 observations and 3 variables.

melodydata_exp1_reduced <- NULL

5.6.5 Task 3 - Testing baseline gazing proportion

Perform a One-sample t-test to examine whether the proportion of time spent looking at the person singing the familiar song at baseline differed from chance (0.5).

t.test(pull(NULL, NULL), mu = NULL)

5.6.6 Task 4 - Testing test trial gazing proportion

Now, perform a One-sample t-test to examine whether the proportion of infants' looking behaviour toward the singer of the familiar melody was higher than chance at the test phase (0.5).

t.test(pull(NULL, NULL), mu = NULL)

5.6.7 Task 5 - Gathering data into a different format

We want to create a boxplot to depict the proportion of time infants spent looking at the singer of the familiar song at the baseline and test trials. However, before we can do that, we need to slightly change the format of our melodydata_exp1_reduced data frame: Because we need to represent the same cases (individuals) twice in the same figure, we need to reorganise the data so it has all of the gaze proportions as one variable (Proportion), but with a separate variable indicating whether it belongs to the baseline or test trial (TrialType). Put differently, each participant should be represented by two rows; one row with the baseline proportion and another row with the test trial proportion.

Store the new data frame in object melodydata_exp1_wide.

Hint: You will need the pivot_longer() function for this step.
Hint: The new data frame has 64 observations and 3 variables (id, TrialType, Proportion).

melodydata_exp1_wide <- melodydata_exp1_reduced %>% 
  pivot_longer(cols = NULL,
               names_to = NULL,
               values_to = NULL)
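If you are unsure how pivot_longer() reshapes a data frame, here is a toy example with made-up scores (the tibble and column names are purely hypothetical):

```r
library(tidyverse)

# A made-up wide-format tibble: one row per participant,
# with before and after scores in separate columns
toy <- tibble(id = 1:2, before = c(10, 12), after = c(14, 16))

# Reshape so that each participant has two rows: one per phase
toy_long <- toy %>%
  pivot_longer(cols = before:after,
               names_to = "Phase",
               values_to = "Score")

toy_long
# A tibble of 4 rows and 3 columns: id, Phase, Score
```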

5.6.8 Task 6 - Visualise baseline and test trial data

Generate a boxplot to depict the proportion of time infants spent looking at the singer of the familiar song at the baseline and test trials.

  • Turn off the legend using guides() as it isn't needed because the x-axis tells you which trial is which.

ggplot(NULL)

5.6.9 Task 7 - Compare your analyses with the analyses reported in the published paper

Finally, check the results section for Experiment 1 in the published paper and compare the results from your analyses with the ones reported in the paper: Mehr, Song, and Spelke (2016). What do you find?

Well done, you are finished! Make sure to knit this .Rmd file. You have successfully replicated analyses and data visualisation from a published paper. Check your answers against the solutions at the end of the chapter, too.

5.7 Solutions to Questions

Below you will find the solutions to the questions for the Activities for this chapter. Only look at them after giving the questions a good try and speaking to the tutor about any issues.

5.7.1 The Binomial Test

5.7.1.1 Task 1

ns_data <- tibble(participant = 1:22,
                  valid_rt = c(631.2,800.8,595.4,502.6,604.5,
                               516.9,658.0,502.0,496.7,600.3,
                               714.6,623.7,634.5,724.9,815.7,
                               456.9,703.4,647.5,657.9,613.2,
                               585.4,674.1))

Return to Task

5.7.1.2 Task 2

woods_mean <- 590

n_participants <- ns_data %>%
  filter(valid_rt > woods_mean) %>%
  nrow()
  • Giving an n_participants value of 16

Return to Task

5.7.1.3 Task 3

  • You can use the density function:
sum(dbinom(n_participants:nrow(ns_data), nrow(ns_data), .5))
## [1] 0.0262394
  • Or, the cumulative probability function:
pbinom(n_participants - 1L, nrow(ns_data), .5, lower.tail = FALSE)
## [1] 0.0262394
  • Or, If you were to plug in the numbers directly into the code:
sum(dbinom(16:22,22, .5))
## [1] 0.0262394
  • Or, finally, remembering that, because lower.tail = FALSE gives the probability of values above the one specified, we need to specify one less than our observed number of participants.
pbinom(15, 22, .5, lower.tail = FALSE)
## [1] 0.0262394

It is better practice to use the first two solutions, which pull the values straight from ns_data, as you run the risk of entering an error into your code if you plug in the values manually.

Return to Task

5.7.2 The One-Sample t-test

5.7.2.1 Task 4

# the mean
ns_data_mean <- ns_data %>%
  summarise(m = mean(valid_rt)) %>%
  pull(m)  

# the sd
ns_data_sd <- ns_data %>%
  summarise(sd = sd(valid_rt)) %>%
  pull(sd)

NOTE: You could print them out on the screen if you wanted to; "\n" is the end-of-line symbol so that they print on different lines:

cat("The mean reaction time was", ns_data_mean, "\n")
cat("The standard deviation was", ns_data_sd, "\n")
## The mean reaction time was 625.4636 
## The standard deviation was 94.30693

Return to Task

5.7.2.2 Task 5

t_obs <- (ns_data_mean - woods_mean) / (ns_data_sd / sqrt(nrow(ns_data)))
  • Giving a t_obs value of 1.7638067

Return to Task

5.7.2.3 Task 6

If using values straight from ns_data, and multiplying by 2 for a two-tailed test, you would do the following:

pval <- pt(abs(t_obs), nrow(ns_data) - 1L, lower.tail = FALSE) * 2L
  • Giving a pval of 0.0923092

But you can also get the same answer by plugging the values in yourself - though this method runs the risk of error, and you are better off using the first calculation as those values come straight from ns_data:

pval2 <- pt(t_obs, 21, lower.tail = FALSE) * 2
  • Giving a pval2 of 0.0923092

Return to Task

5.7.2.4 Task 7

The t-test would be run as follows, with the output shown below:

t.test(pull(ns_data, valid_rt), mu = woods_mean)
## 
##  One Sample t-test
## 
## data:  pull(ns_data, valid_rt)
## t = 1.7638, df = 21, p-value = 0.09231
## alternative hypothesis: true mean is not equal to 590
## 95 percent confidence interval:
##  583.6503 667.2770
## sample estimates:
## mean of x 
##  625.4636

Return to Task

5.7.2.5 Task 8

According to the one-sample t-test these participants are responding in a similar manner as the participants from the original study, and as such, we may be inclined to assume that the recruitment process of our pilot experiment is working well.

However, according to the binomial test the participants are responding differently from the original sample. So which test result should you take as the finding?

Keep in mind that the binomial test is very rough and categorises participants into yes or no. The one-sample t-test uses much more of the available data and to some degree would give a more accurate answer. However, the fact that two tests give really different answers may give you reason to question whether or not the results are stable and potentially you should look to gather a larger sample to get a more accurate representation of the population.

Return to Task

5.7.3 Practice Your Skills

5.7.3.1 Load in the data

library("tidyverse")

melodydata <- read_csv("socialmelodies_exp1.csv")

Return to Task

5.7.3.2 Task 1 - Filter data

melodydata_exp1 <- melodydata %>%
  filter(exp1 == 1)

Return to Task

5.7.3.3 Task 2 - Select data

melodydata_exp1_reduced <- melodydata_exp1 %>%
  select(id, Baseline_Proportion_Gaze_to_Singer, Test_Proportion_Gaze_to_Singer)

Return to Task

5.7.3.4 Task 3 - Testing baseline gazing proportion

t.test(pull(melodydata_exp1_reduced, Baseline_Proportion_Gaze_to_Singer), mu = 0.5)

Return to Task

5.7.3.5 Task 4 - Testing test trial gazing proportion

t.test(pull(melodydata_exp1_reduced, Test_Proportion_Gaze_to_Singer), mu = 0.5)

Return to Task

5.7.3.6 Task 5 - Gathering data into a different format

melodydata_exp1_wide <- melodydata_exp1_reduced %>% 
  pivot_longer(cols = Baseline_Proportion_Gaze_to_Singer:Test_Proportion_Gaze_to_Singer,
               names_to = "TrialType",
               values_to = "Proportion")

Return to Task

5.7.3.7 Task 6 - Visualise baseline and test trial data

ggplot(data = melodydata_exp1_wide, 
       aes(x = TrialType, 
           y = Proportion, 
           fill = TrialType)) + 
  geom_boxplot() +
  guides(fill = FALSE)

Return to Task

5.7.3.8 Task 7 - Compare your analyses with the analyses reported in the published paper

When looking at the results section you should see that the two one-sample t-tests reported in the published paper are the same as the ones you have produced as part of this exercise. In addition, Fig 2a in the published paper should resemble yours. Well done!

Return to Task