13  Assessment 3

A project demonstrating data wrangling skills

13.1 Instructions

Your task is to replicate this html report using these datasets and code, and to generate an appropriate visualisation for the final plot. The requirements are nearly identical to those of Assessment 2; the submission must contain the following:

In addition, your code should demonstrate the following skills:

13.2 Hints and Tips

  • The html theme used by the demo report is flatly
  • The ggplot theme is theme_bw()
  • You will not be asked to do anything ‘tricky’ in this assessment, so if you find yourself needing 20 lines of code just to customise the labels in a plot, try looking for more efficient alternatives.
  • You will definitely need to join data, and convert between wide and long. Make sure you do this efficiently, creating the minimum number of extra data tables needed, and not re-creating the same table for different tasks.
  • If you find yourself doing the same thing to many columns, you could probably do it more efficiently by first reshaping the data to long format.
  • Use pipes to avoid creating many single-use tables.
  • The figure width and height of Figure 1 are 10 and 7 (it is fine if your fonts are a slightly different size)
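
The hints about joining, reshaping, and piping can be illustrated with a minimal sketch. The tables and column names below (`scores_wide`, `groups`, `q1` to `q3`, `condition`) are hypothetical and are not the assessment datasets; the sketch assumes the dplyr, tidyr, and ggplot2 packages are installed.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical wide table: one row per participant, one column per question
scores_wide <- tibble(
  id = 1:3,
  q1 = c(4, 2, 5),
  q2 = c(3, 3, 4),
  q3 = c(5, 1, 4)
)

# Hypothetical lookup table with one row per participant
groups <- tibble(
  id = 1:3,
  condition = c("control", "treatment", "treatment")
)

# Reshape longer first, so one mutate() handles what were three columns,
# then join the lookup table; the pipe avoids single-use intermediate tables
scores_long <- scores_wide |>
  pivot_longer(cols = q1:q3, names_to = "question", values_to = "score") |>
  mutate(score_centred = score - 3) |>
  left_join(groups, by = "id")

# Summarise once and reuse the result for both the table and the plot
mean_scores <- scores_long |>
  group_by(condition, question) |>
  summarise(mean_score = mean(score), .groups = "drop")

# Convert back to wide for a display table
mean_wide <- mean_scores |>
  pivot_wider(names_from = question, values_from = mean_score)

# Plot from the long table, using the ggplot theme named in the hints
fig <- ggplot(mean_scores, aes(x = question, y = mean_score, fill = condition)) +
  geom_col(position = "dodge") +
  theme_bw()
```

In a Quarto code chunk, the figure size given in the hints could be set with the chunk options `#| fig-width: 10` and `#| fig-height: 7` placed at the top of the chunk that draws the plot.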

13.3 Submission

  • Covers: chapters 1-7, emphasising 5-7
  • Worth: 30%
  • Do not put your name in your report; use your student ID as the author.
  • Please submit a zip file containing:
    1. the .rproj file
    2. your reproducible script, named report3_studentID.qmd
    3. any additional files necessary to reproduce your report (e.g., images or bibliography files)
    4. the rendered html report, named report3_studentID.html.
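
As a rough sketch, the YAML header of a script that meets these requirements might look like the fragment below. The title and the student ID `1234567` are placeholders (assumptions for illustration); the `flatly` theme is the one named in the hints. A script saved as report3_1234567.qmd would render to report3_1234567.html.

```yaml
---
title: "Report 3"        # placeholder title
author: "1234567"        # placeholder student ID, used instead of your name
format:
  html:
    theme: flatly        # html theme used by the demo report
---
```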

13.4 Marking Rubric

Each criterion is graded on a five-band scale: A (Excellent), B (Very Good), C (Good), D (Satisfactory), E (Poor). Criteria are grouped below by ILO.
Research & Knowledge
Skills from Chapters 1-3: You demonstrate the skills needed to create reproducible reports and visualise data
Data Import and Joining: You import and join together data clearly and correctly
Data Reshaping: You can reshape data between long and wide formats where appropriate
Data Wrangling: You demonstrate the ability to select and filter data, create new data columns, and edit existing data columns
Evaluation
Original Plot: The original plot is appropriate to the question asked
Communication
Code Clarity: Your code is organised cleanly and clearly in the Quarto script, using separate code chunks to intersperse text and relevant code. Your code chunks contain comments that clarify the purpose of the code without over-explaining each step. The names you use for objects are clear, consistent, and concise.
Code Efficiency: While there are many ways to do the same things in R, some ways are more efficient than others. Efficient code avoids unnecessary steps (e.g., loading packages you do not use) and redundancy (e.g., loading or processing the data the same way in several places).