function()
Function{}
return()
\[ \bar{x} = \frac{\sum x}{n} \]
A vector of numeric values (x)
\[ \bar{x} = \frac {\sum \color{red}{x}} {n} \]
Sum the values in x
\[ \bar{x} = \frac {\color{red}{\sum x}} {n} \]
Count the number of values in x
\[ \bar{x} = \frac {\sum x} {\color{red}{n}} \]
Divide the sum of x by the number of values in x
\[ \bar{x} = \color{red}{\frac {\sum x} {n}} \]
Return the value contained in output
\[ \color{red}{\bar{x}} = \frac {\sum x} {n} \]
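The steps above can be sketched as an R function. This is a minimal sketch: the name mean_function() and the object output appear later in these slides, while the intermediate names sum_x and n are illustrative assumptions.

```r
# Sketch of the mean, one line per step of the formula.
# sum_x and n are assumed names, not taken from the slides.
mean_function <- function(x) {
  sum_x <- sum(x)       # top part: add up all values in x
  n <- length(x)        # bottom part: count the values in x
  output <- sum_x / n   # divide the sum by the count
  return(output)        # return the value contained in output
}

mean_function(c(2, 4, 6))  # 4
```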
Now that we have made our function, we can test it by comparing the output it gives us to the regular old mean() function in base R!
Let’s simulate some random values to test our function with:
Let’s see how our function compares to mean() in base R
We can use test_data as a test case:
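A possible version of this comparison is sketched below. The seed, sample size, and distribution parameters are assumptions made for illustration; only the name test_data comes from the slides.

```r
# Compact version of the mean function so this block runs on its own
mean_function <- function(x) sum(x) / length(x)

set.seed(123)                                 # assumed seed, for reproducibility
test_data <- rnorm(100, mean = 50, sd = 10)   # assumed size and parameters

mean_function(test_data)   # our version
mean(test_data)            # base R version

# all.equal() compares the two results while tolerating tiny
# floating-point differences
all.equal(mean_function(test_data), mean(test_data))
```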
It wasn’t magic after all!!
\[\sigma^2 = \frac{\sum(x - \bar{x})^2}{n-1}\]
A vector of numeric values (x)
\[ \sigma^2 = \frac {\sum(\color{red}{x} - \bar{x})^2} {n-1} \]
Calculate the mean of x using our mean_function()
\[ \sigma^2 = \frac {\sum(x - \color{red}{\bar{x}})^2} {n-1} \]
Calculate the top part (numerator) of the formula: subtract the mean from each value in x
\[ \sigma^2 = \frac {\sum(x \color{red}{-} \bar{x})^2} {n-1} \]
Calculate the top part (numerator) of the formula: square each difference
\[ \sigma^2 = \frac {\sum(x - \bar{x})^{\color{red}{2}}} {n-1} \]
Calculate the top part (numerator) of the formula: sum the squared differences
\[ \sigma^2 = \frac {\color{red}{\sum}(x - \bar{x})^2} {n-1} \]
\[ \sigma^2 = \frac {\sum(x - \bar{x})^2} {\color{red}{n-1}} \]
\[ \sigma^2 = \color{red}{ \frac {\sum (x - \bar{x})^2} {n-1} } \]
Return the resulting value from Step 4 in the process
\[ \color{red}{\sigma^2} = \frac {\sum (x - \bar{x})^2} {n-1} \]
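Put together, the variance steps might look like the sketch below. The name var_function() is used later in these slides; the intermediate names (mean_x, numerator, denominator) are assumptions chosen to mirror the parts of the formula.

```r
# Compact version of the mean function so this block runs on its own
mean_function <- function(x) sum(x) / length(x)

# Sketch of the sample variance, one line per step of the formula
var_function <- function(x) {
  mean_x <- mean_function(x)           # Step 1: mean of x
  numerator <- sum((x - mean_x)^2)     # Step 2: sum of squared differences
  denominator <- length(x) - 1         # Step 3: n - 1
  output <- numerator / denominator    # Step 4: divide top by bottom
  return(output)                       # return the value from Step 4
}

var_function(c(2, 4, 6))  # 4
```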
Let’s use our test_data again to see if our function works:
[1] TRUE
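A check of this kind could be written as follows. This is a sketch: the simulated test_data uses assumed parameters, and all.equal() is one reasonable way to compare with var() in base R, not necessarily the call used on the original slide.

```r
# Compact versions of the earlier functions so this block runs on its own
mean_function <- function(x) sum(x) / length(x)
var_function <- function(x) sum((x - mean_function(x))^2) / (length(x) - 1)

set.seed(123)                                 # assumed seed
test_data <- rnorm(100, mean = 50, sd = 10)   # assumed size and parameters

# TRUE when our result matches base R up to floating-point tolerance
all.equal(var_function(test_data), var(test_data))
```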
The mystery is disappearing before our eyes!!
mean and var
\[ SE = \frac{s}{\sqrt{n}} \]
Where \(s\) is the standard deviation of the sample (i.e. the square root of the variance)
A vector of numeric values (x)
\[ SE = \frac{s}{\sqrt{n}} \]
Calculate the standard deviation of x by taking the square root of our var_function()
\[ SE = \frac {\color{red}{s}} {\sqrt{n}} \]
Calculate the square root of the number of values in x
\[ SE = \frac {s} {\color{red}{\sqrt{n}}} \]
Divide the standard deviation by the square root of the number of values in x
\[ SE = \color{red}{\frac {s} {\sqrt{n}} } \]
Return the resulting value from this division
\[ \color{red}{SE} = \frac {s} {\sqrt{n}} \]
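The standard error steps can be sketched the same way, building on the earlier functions. The name se_function() and the intermediate names s and root_n are hypothetical; they do not appear in the slides.

```r
# Compact versions of the earlier functions so this block runs on its own
mean_function <- function(x) sum(x) / length(x)
var_function <- function(x) sum((x - mean_function(x))^2) / (length(x) - 1)

# Sketch of the standard error; se_function is a hypothetical name
se_function <- function(x) {
  s <- sqrt(var_function(x))    # standard deviation: square root of the variance
  root_n <- sqrt(length(x))     # square root of the number of values
  output <- s / root_n          # divide s by the square root of n
  return(output)                # return the result of the division
}

x <- c(2, 4, 6, 8)
se_function(x)
sd(x) / sqrt(length(x))   # the same quantity using base R's sd()
```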
And of course you can start to simplify things, e.g.
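One possible simplification is to collapse the steps into a single expression once learners are comfortable with each part. The name se_short() is hypothetical, and using base R's sd() here is an assumption about what "simplifying" might mean.

```r
# One-line sketch of the standard error: s / sqrt(n)
se_short <- function(x) sd(x) / sqrt(length(x))

x <- c(2, 4, 6, 8)
se_short(x)
```

The trade-off is that the one-liner hides the intermediate steps, so it works best after the longer version has been walked through.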
Demystification is a philosophy, not a lesson
The core idea is emboldening and empowering learners
Programming has a special capacity to overwhelm:
You go to check in on a student and see the following:
Error in tibble(Group = rep(c("G1", "G2"), each = 10), ReactionT = rnorm(20, : could not find function "tibble"
We realise that they did not load ‘tidyverse’, so
Students are used to high-stakes assessments, not to a ‘move fast and break things’ mentality
Red is scary!
Language is often convoluted and technical
We need to actively help create the right relationship with errors
Welch Two Sample t-test
data: ReactionT by Group
t = 0.78859, df = 17.395, p-value = 0.441
alternative hypothesis: true difference in means between group G1 and group G2 is not equal to 0
95 percent confidence interval:
-77.8362 171.0075
sample estimates:
mean in group G1 mean in group G2
554.0498 507.4641
Error in t.test.formula(formula = Group ~ ReactionT, data = d, paired = FALSE) : grouping factor must have exactly 2 levels
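An error like the one above can be reproduced on purpose and then fixed, which may make a useful live-coding moment. The sketch below assumes a data frame d built like the earlier example (base R's data.frame() is used here instead of tibble(), the seed is arbitrary, and the paired argument from the original call is left out).

```r
set.seed(1)  # assumed seed, for reproducibility
d <- data.frame(
  Group = rep(c("G1", "G2"), each = 10),
  ReactionT = rnorm(20, 500, 150)
)

# Formula sides swapped: ReactionT is treated as the grouping factor,
# and it has far more than 2 distinct values, so t.test() stops.
wrong <- tryCatch(
  t.test(Group ~ ReactionT, data = d),
  error = function(e) e
)
conditionMessage(wrong)

# Outcome on the left, grouping variable on the right
right <- t.test(ReactionT ~ Group, data = d)
right$p.value
```

Capturing the error with tryCatch() lets the script keep running, so the broken and the corrected call can sit side by side.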
?help (documentation)
Pre-plan helpful errors for live coding
For instance:
?help (documentation)
Error: unexpected input in: " "Group" = rep(c("G1", "G2"), each = 10), "ReactionT" = rnorm(20, 500, 150)) %>"
But we can only think of so many errors…
… so get help from your students!
“My favourite error” activity:
This can help form the foundation for a community error library
So when trying the off-the-shelf mean() function:
Warning: argument is not numeric or logical: returning NA
Let’s compare to our function from earlier:
Error in sum(x) : invalid ‘type’ (character) of argument
Let’s write an error message:
Now let’s test it:
[1] 2
Error in mean_function(test_chr) : mean_function must take an array of numbers
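A function that produces this kind of message might look like the sketch below. The message text is taken from the output above; the is.numeric() check and the contents of test_chr are assumptions, since the original code is not shown.

```r
# Sketch of mean_function() with a friendlier, hand-written error
mean_function <- function(x) {
  if (!is.numeric(x)) {
    # stop() halts the function and shows our own message
    stop("mean_function must take an array of numbers")
  }
  sum(x) / length(x)
}

test_chr <- c("1", "2", "3")   # assumed contents: numbers stored as text

mean_function(c(1, 2, 3))      # 2

# Capture the error message instead of halting the script
result <- tryCatch(
  mean_function(test_chr),
  error = function(e) conditionMessage(e)
)
result
```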
When starting from realistic raw data, nearly 80% of the data analytic effort for this task involves skills not commonly taught—namely, importing, manipulating, and transforming tabular data.