How to use this book

Welcome to the Fundamentals of Quantitative Analysis

We wrote and designed this book to support RM1 and RM2 on the MSc Psychology Conversion programme, where you will learn core quantitative data skills using R and R Studio. In addition to this book, the course team will support you with demonstration videos and we encourage you to use Teams or office hours to ask any questions.

The ability to work with quantitative data is a key skill for psychologists and by using R and R Studio as our tool, we can also promote reproducible research practices. Although at first it may seem like writing a programming script is more time-consuming than other point-and-click approaches, you speed up with practice. Once you have written a script that does what you need it to do, you can easily re-run your analysis without having to go through each step again manually which is easier and less likely to result in errors if you do something slightly different or forget one of the steps.

Crucially, with an analysis script, you can demonstrate to other researchers how you got from the raw data to the statistics you report in your final paper. Sharing analysis scripts alongside published articles on sites such as the Open Science Framework is now an important open science practice. Even if you do not continue with quantitative research yourself, the skills you develop throughout these courses will allow you to evaluate quantitative research and to understand what goes on behind the scenes to produce their numbers and conclusions, allowing you to become a much more confident and competent consumer and user of research.

How to use this book

We follow a scaffolding approach in this book to build your skills from following along to independently being able to apply these skills to new scenarios.

In each chapter, we first guide you through a set of demonstrations focused on different data skills, highlighting key parts of the output and what it means (just keep in mind we focus on the practical side of doing and interpreting in this book, the course lectures cover the conceptual background). We strongly encourage you to type out the code yourself, as this is good practice for learning to code, but remember you can copy and paste from the book if you need to. Typing the code will seem much slower at first and you will make lots errors, but you will learn much more quickly this way.

As you work through the book, you will see technical terms highlighted like console which link to a glossary we have developed as a team. If you hover the cursor over the highlighted word, it will show you a little definition, and you will be able to see a full list of words we highlighted at the bottom of each chapter. There are also different colour-coded boxes to emphasise different content. These provide information, warn you about something, highlight if something is dangerous and you should really pay attention, and try it yourself boxes when we have activities for you to complete.

Note

These boxes have little interesting - but not critical - bits of information.

Important

These boxes warn you about something important in R / R Studio, so you pay attention when you use it.

Warning

These boxes highlight where you need to be cautious when using or interpreting something, as it might be easy to make an error.

Try this

These boxes will invite you to try something yourself, like complete independent activities or answer questions.

Solution

These boxes will include small hints or solutions to check your understanding. Here we show the explanations by default, but normally they will be collapsed.

After these demonstrations, we give you little activities and independent tasks using a new data set to test your understanding using interactive questions. This gives you instant feedback on whether you could apply the skills or if there is something you need to check again.

Some of these activities are what we call “error mode”. You never stop making errors when coding - we still make them all the time - but you get faster at recognising and fixing common sources of error. Hoffman & Elmi (2021) demonstrated incorporating errors in learning materials can be useful to students, so we will give you a few segments of code containing errors that you need to fix. Seeing errors can be one of the most intimidating parts of learning to code, so this activity will normalise making errors and develop your problem solving skills.

Finally, there are four data analysis journeys we will direct you to at key points throughout the book. We will tell you the end product you are aiming for from a new raw data set we provide, and your job is to break that end product down and identify a list of tasks to get there. Completing an independent data analysis task is the skill set you will need once it comes to assessments and your dissertation, as this is the point where you are the one making decisions.

In contrast to assignments, for all of these activities and data analysis journeys we provide solutions at the end of each chapter. No one is going to check whether you tried to figure out an activity yourself, rather than going straight to the solution, but if you copy and paste without thinking, you will learn nothing. Developing data skills and the knowledge that underpins those skills is like learning a language: the more you practice and the more you use it, the better you become.

Accompanying videos

Most of the chapters of this book have an associated video that you can access via Moodle. These videos are there to support you as you get comfortable in your data skills as you can see how someone else interacts with R / R Studio. However, it is important that you use them wisely. You should always try to work through each chapter of the book on your own first, and only then watch the video if you get stuck, or for extra information.

Finally, this book is a living document. That means occasionally we will make updates to the book such as fixing typos and including additional detail or activities. When we make substantial changes, we will create new support materials such as the videos. However, it would be impossible to record a new video every time we make a minor change to an activity, so you may notice slight differences between the videos and the content of this book. Where there are differences between the book and the video, the book should always be considered the definitive version.

Data set acknowledgements

Almost all the demonstrations and activities in this book use real data from published research to reinforce how these skills are the same ones researchers use behind the scenes of their articles. We use a range of data from our own research, open data that researchers share with their papers, and some we adapted from the Open Stats Lab who created activities using open data.

Intended learning outcomes (ILOs)

By the end of the courses associated with this book, you will be able to:

Write reproducible reports using R Markdown.
Clean and wrangle data into appropriate forms for analysis.
Visualise data using a range of plots.
Conduct and interpret a core set of statistical tests from the general linear model (regression, ANOVA).

Acknowledgements

We put a lot of effort into creating the resources in this book, but occasionally typos or broken links will slip through. When a student helps us and highlights an error or makes a suggestion, we like to acknowledge it. We would like to thank the following students for helping us:

Andrés Rajiv Arora, Liam Caldwell, John Gleeson, Ria Manwar, Verity Rawlins, Jacquelyn Scott, Yee Lam So.

Hoffman, H. J., & Elmi, A. F. (2021). Do Students Learn More from Erroneous Code? Exploring Student Performance and Satisfaction in an Error-Free Versus an Error-full SAS® Programming Environment. Journal of Statistics and Data Science Education, 0(0), 1–13. https://doi.org/10.1080/26939169.2021.1967229