Edit the code chunks below and knit the document. You can pipe your
objects to glimpse()
or print()
to display
them.
Set the vector v1
equal to the following: 11, 13, 15,
17, 19, …, 99, 101 (use a function; don’t just type all the
numbers).
v1 <- NULL
Set the vector v2
equal to the following: “A” “A” “B”
“B” “C” “C” “D” “D” “E” “E” (note the letters are all uppercase).
v2 <- NULL
Set the vector v3
equal to the words “dog” 10 times,
“cat” 9 times, “fish” 6 times, and “ferret” 1 time.
v3 <- NULL
Use apply()
or map()
functions to create a
list of 11 vectors of 100 numbers sampled from 11 random normal
distributions with means of 0 to 1.0 (in steps of 0.1) and SDs of 1.
Assign this list to the object samples
. Set the seed to
321
before you generate the random numbers to ensure
reproducibility.
samples <- NULL
Use apply()
or map()
functions to create a
vector of the sample means from the list samples
in the
previous question.
sample_means <- NULL
Write a function called my_add
that adds two numbers
(x
and y
) together and returns the
results.
my_add <- NULL
Create a vector testing your function my_add
. Every item
in the vector should evaluate to TRUE
if your function is
working correctly.
my_add_test <- NULL
Copy the function my_add
above and add an error message
that returns “x and y must be numbers” if x
or
y
are not both numbers.
my_add <- NULL
Create a tibble called dat
that contains 20 rows and
three columns: id
(integers 101 through 120),
pre
and post
(both 20-item vectors of random
numbers from a normal distribution with mean = 0 and sd = 1). Set seed
to 90210
to ensure reproducible values.
dat <- NULL
Run a two-tailed, paired-samples t-test comparing
pre
and post
. (check the help for
t.test
)
t <- NULL
Use broom::tidy
to save the results of the t-test in
question 8 in a table called stats
.
stats <- NULL
Create a function called report_t
that takes a data
table as an argument and returns the result of a two-tailed,
paired-samples t-test between the columns pre
and
post
in the following format:
“The mean increase from pre-test to post-test was #.###: t(#) = #.###, p = 0.###, 95% CI = [#.###, #.###].”
Hint: look at the function paste0()
(simpler) or
sprintf()
(complicated but more powerful).
NB: Make sure all numbers are reported to three decimal places (except degrees of freedom).
report_t <- NULL
Use inline R to include the results of report_t()
on the
dat
data table in a paragraph below.
Write a function to simulate data with the form.
\(Y_i = \beta_0 + \beta_1 X_i + e_i\)
The function should take arguments for the number of observations to
return (n
), the intercept (b0
), the effect
(b1
), the mean and SD of the predictor variable X
(X_mu
and X_sd
), and the SD of the residual
error (err_sd
). The function should return a tibble with
n
rows and the columns id
, X
and
Y
.
sim_lm_data <- function(n) {
# add code here and define arguments above
}
dat12 <- sim_lm_data(n = 10) |> print() # do not edit
## NULL
Use the function from Question 12 to generate a data table with 10000 subjects, an intercept of 80, an effect of X of 0.5, where X has a mean of 0 and SD of 1, and residual error SD of 2.
dat13 <- NULL
Analyse the data with lm()
. Find where the analysis
summary estimates the values of b0
and b1
.
What happens if you change the simulation values?
mod13 <- NULL
Use the function from Question 6 to calculate power by simulation for the effect of X on Y in a design with 50 subjects, an intercept of 80, an effect of X of 0.5, where X has a mean of 0 and SD of 1, residual error SD of 2, and alpha of 0.05.
Hint: use broom::tidy()
to get the p-value for the
effect of X.
power <- NULL
Calculate power (i.e., the false positive rate) for the effect of X on Y in a design with 50 subjects where there is no effect and alpha is 0.05.
false_pos <- NULL
Make a histogram of the p-values from the simulation above. Use
geom_histogram with binwidth=0.05
and
boundary=0
. What kind of distribution is this?
ggplot()