16 O
16.1 object
A word that identifies and stores the value of some data for later use.
Sometimes objects are also called variables. An object in R:
 contains only letters, numbers, full stops, and underscores
 starts with a letter or a full stop and a letter
 distinguishes uppercase and lowercase letters (rickastley is not the same as RickAstley)
The following are valid and different objects:
 songdata
 SongData
 song_data
 song.data
 .song.data
 never_gonna_give_you_up_never_gonna_let_you_down
The following are not valid objects:
 _song_data
 1song
 .1song
 song data
 song-data
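The naming rules above can be checked in R itself. The sketch below is only an illustration: `is_valid_name()` is a hypothetical helper (not part of R or of this glossary) that relies on base R's `make.names()`, which returns its input unchanged only when the input is already a syntactically valid name.

```r
# Names that differ only in case or punctuation are separate objects
songdata   <- 1
SongData   <- 2
song_data  <- 3
song.data  <- 4
.song.data <- 5

# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(x) x == make.names(x)

valid   <- c("songdata", "SongData", "song_data", "song.data", ".song.data")
invalid <- c("_song_data", "1song", ".1song", "song data", "song-data")

is_valid_name(valid)    # all TRUE
is_valid_name(invalid)  # all FALSE
```

Invalid names can still be used if they are wrapped in backticks, but this is usually best avoided.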
16.2 observation
All of the data about a single trial or question.
In a tidy dataset, each row contains only one observation.
Each row contains 3 observations:
library(dplyr)
library(tidyr)
untidy <- data.frame(
  id = 1:5,
  score_1 = sample(1:7, 5),
  score_2 = sample(1:7, 5),
  score_3 = sample(1:7, 5),
  rt_1 = rnorm(5, 800, 100) %>% round(),
  rt_2 = rnorm(5, 800, 100) %>% round(),
  rt_3 = rnorm(5, 800, 100) %>% round()
)
id  score_1  score_2  score_3  rt_1  rt_2  rt_3 

1  6  4  2  679  923  908 
2  7  5  6  884  821  701 
3  5  1  7  696  951  1011 
4  4  3  5  774  751  805 
5  3  6  3  814  1047  636 
Now each row contains 1 observation:
tidy <- untidy %>%
  gather(var, val, score_1:rt_3) %>%
  separate(var, c("var", "trial")) %>%
  spread(var, val)
id  trial  rt  score 

1  1  679  6 
1  2  923  4 
1  3  908  2 
2  1  884  7 
2  2  821  5 
2  3  701  6 
3  1  696  5 
3  2  951  1 
3  3  1011  7 
4  1  774  4 
4  2  751  3 
4  3  805  5 
5  1  814  3 
5  2  1047  6 
5  3  636  3 
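The same reshaping can also be sketched with `pivot_longer()`, which superseded `gather()` and `spread()` in tidyr 1.0.0. This is an alternative shown for illustration, not the code that produced the table above. It assumes tidyr 1.0.0 or later and reuses the first two rows of the untidy table as fixed values.

```r
library(tidyr)

# First two rows of the untidy table above, entered as fixed values
untidy <- data.frame(
  id = 1:2,
  score_1 = c(6, 7), score_2 = c(4, 5), score_3 = c(2, 6),
  rt_1 = c(679, 884), rt_2 = c(923, 821), rt_3 = c(908, 701)
)

# ".value" keeps the part before "_" (score, rt) as column names;
# the part after "_" goes into a new column called trial
tidy <- pivot_longer(
  untidy,
  cols = score_1:rt_3,
  names_to = c(".value", "trial"),
  names_sep = "_"
)

nrow(tidy)  # 6: one row per id and trial
```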
16.3 one-tailed
A statistical test for which the critical region consists of all values of the test statistic greater than a given value or all values less than a given value, but not both.
See p-value for a comparison of one-tailed and two-tailed tests.
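As a rough illustration of the critical region, the sketch below assumes a test statistic that follows a standard normal distribution under the null hypothesis and an alpha of .05. The observed value `z` is hypothetical.

```r
alpha <- 0.05

# One-tailed tests put the whole critical region in one tail
crit_upper <- qnorm(1 - alpha)      # reject if z is greater than about 1.645
crit_lower <- qnorm(alpha)          # reject if z is less than about -1.645

# A two-tailed test splits alpha across both tails
crit_two <- qnorm(1 - alpha / 2)    # reject if |z| is greater than about 1.960

z <- 1.8  # hypothetical observed test statistic

z > crit_upper     # TRUE: inside the upper one-tailed critical region
abs(z) > crit_two  # FALSE: not inside the two-tailed critical region
```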
16.4 operator
A symbol that performs some mathematical or comparative process.
Arithmetic operators in R
Operator  Definition  Example 

+  Addition: adds two numbers  3+2 = 5 
-  Subtraction: subtracts the second number from the first  3-2 = 1 
*  Multiplication: multiplies two numbers  3*2 = 6 
/  Division: divides the first number by the second  3/2 = 1.5 
%%  Modulus: returns the remainder after dividing the first number by the second  3%%2 = 1 
^  Exponent: raises the first number to the power of the second  3^2 = 9 
Relational operators in R
Operator  Definition  Example 

==  Equal to  1 == 1 or "A" == "A" 
!=  Not equal to  1 != 2 or "A" != "B" 
>  Greater than  2 > 1 or "B" > "A" 
>=  Greater than or equal to  2 >= 1 or "B" >= "A" 
<  Less than  1 < 2 or "A" < "B" 
<=  Less than or equal to  1 <= 2 or "A" <= "B" 
%in%  Match operator  "A" %in% LETTERS 
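Relational operators work on whole vectors and return one TRUE or FALSE per element, which makes them useful for filtering. A small sketch with made-up values:

```r
x <- c(1, 5, 10)

over_three <- x > 3      # FALSE TRUE TRUE
kept <- x[over_three]    # 5 10

# %in% checks each element on the left against the set on the right
in_letters <- c("A", "z", "7") %in% LETTERS   # TRUE FALSE FALSE

# == compares exactly, so rounding error in decimals can be surprising
0.1 + 0.2 == 0.3                   # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: compares within a small tolerance
```

Comparisons of text such as "B" > "A" follow the sort order of the current locale, so results for mixed-case or accented text can differ between systems.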
Logical operators in R
Operator  Definition  Example 

&  AND (compares each element of vectors)  c(T, T, F, F) & c(T, F, T, F) == c(T, F, F, F) 
|  OR (compares each element of vectors)  c(T, T, F, F) | c(T, F, T, F) == c(T, T, T, F) 
&&  AND (compares single values only; older versions of R used only the first element of longer vectors, but R 4.3.0 and later give an error)  TRUE && TRUE == TRUE 
||  OR (compares single values only; older versions of R used only the first element of longer vectors, but R 4.3.0 and later give an error)  TRUE || FALSE == TRUE 
!  NOT  !TRUE == FALSE 
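The difference between the element-wise and single-value operators is easiest to see side by side. The sketch below uses made-up values: `&` and `|` return one result per element, while `&&` and `||` return a single result and stop evaluating as soon as the answer is known.

```r
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

both   <- a & b   # TRUE FALSE FALSE FALSE
either <- a | b   # TRUE TRUE TRUE FALSE
not_a  <- !a      # FALSE FALSE TRUE TRUE

# any() and all() collapse a logical vector to a single value
any(both)    # TRUE
all(either)  # FALSE

# && stops at the first FALSE, so x > 0 is never evaluated when x is NULL
x <- NULL
is_positive <- !is.null(x) && x > 0   # FALSE
```

This is why `&&` and `||` are the usual choice inside `if ()` conditions, where exactly one TRUE or FALSE is needed.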
16.5 ordinal
Discrete variables that have an inherent order, such as level of education or dislike/like.
Ordinal variables are not necessarily evenly spaced: the distance between consecutive values can vary along the scale. For example, you can assume that 3 is higher than 2 on a Likert scale, but not by how much, and you cannot assume that 2 is just as far away from 1 as it is from 3. Therefore, think carefully before averaging ordinal values, since the average of 2 and 4 is not necessarily equal to 3.
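In R, ordinal data can be stored as an ordered factor, which keeps the order without implying even spacing. The sketch below uses made-up responses and a hypothetical set of level labels. The median is shown only as one summary that uses the order alone, not as a recommendation for every analysis.

```r
# Hypothetical Likert-style labels, from lowest to highest
likert_levels <- c("strongly disagree", "disagree", "neutral",
                   "agree", "strongly agree")

# Made-up responses stored as an ordered factor
responses <- factor(
  c("agree", "neutral", "strongly agree", "disagree", "agree"),
  levels = likert_levels,
  ordered = TRUE
)

responses[1] > responses[2]  # TRUE: "agree" is higher than "neutral"
highest <- max(responses)    # strongly agree

# The integer codes give each response's position in the order
codes <- as.integer(responses)            # 4 3 5 2 4
middle <- likert_levels[median(codes)]    # "agree"
```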
16.6 outer join
A mutating join that lets you join up rows in two tables while keeping all of the information from both tables (full_join).
The term "outer join" is more commonly used in SQL. See full_join for the R version.
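A minimal sketch of an outer join using `full_join()` from dplyr. The two tables and their values are made up for illustration. Rows that appear in only one table are kept, with NA filling the columns that have no match.

```r
library(dplyr)

# Two made-up tables that share ids 2 and 3
scores <- data.frame(id = c(1, 2, 3), score = c(6, 7, 5))
rts    <- data.frame(id = c(2, 3, 4), rt = c(821, 951, 751))

# Keep every id from both tables
all_rows <- full_join(scores, rts, by = "id")

nrow(all_rows)  # 4: ids 1, 2, 3 and 4
# id 1 has no rt and id 4 has no score, so those cells are NA
```

In base R, `merge(scores, rts, by = "id", all = TRUE)` gives a similar result.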
16.7 outlier
A data point that is extremely distant from most of the other data points.
Outliers can be clear errors (e.g., a value of 1.56 cm for human height), fully random extreme values (e.g., 0.27% of values from a normal distribution are expected to be more than 3 SD from the mean), or reflect potential moderators (e.g., reaction times when paying attention versus being distracted).
The unthinking "rule" to label all data points more than 3 SD from the mean as outliers is not considered to be a good way to deal with outliers. The paper below contains useful suggestions.
Leys, C., Delacre, M., Mora, Y. L., Lakens, D., & Ley, C. (2019). How to Classify, Detect, and Manage Univariate and Multivariate Outliers, With Emphasis on Pre-Registration. International Review of Social Psychology, 32(1), 5. DOI: 10.5334/irsp.289
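One reason the 3 SD rule performs poorly is that an extreme value inflates the very mean and SD used to judge it. The sketch below uses made-up data. It compares the SD rule with a cutoff based on the median absolute deviation (MAD), shown here as one commonly used robust alternative rather than a full procedure. The cutoff of 3 is an assumption for illustration.

```r
# Proportion of a normal distribution more than 3 SD from the mean
2 * pnorm(-3)  # about 0.0027, i.e. 0.27%

# Made-up data with one clearly extreme value
x <- c(10, 11, 12, 11, 10, 12, 11, 50)

# Distance from the mean in SD units; the value 50 inflates both mean and SD
z_sd <- abs(x - mean(x)) / sd(x)

# Distance from the median in MAD units; both are robust to the value 50
z_mad <- abs(x - median(x)) / mad(x)

flagged_sd  <- x[z_sd > 3]    # nothing is flagged
flagged_mad <- x[z_mad > 3]   # 50 is flagged
```

Flagging a value is only the first step: whether to keep, correct or exclude it still depends on why it is extreme.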