Tidy data transformation and visualization with R (online) - November 2024

Online - Postgraduate course

Tidy data transformation and visualization with R (online)

Monday 4, Friday 8, Monday 11, Friday 15 November 2024

Scope

It is often said that 80% of the effort in a data analysis pipeline goes into the tedious process of cleaning and preparing data so that they can be used for analysis and visualization (Dasu & Johnson, 2003). Tidy data makes data transformation and visualization easier. It works hand in hand with the tools provided by the tidyverse collection of R packages, in a way that promotes reproducibility and efficiency. ggplot2 (Wickham, 2009) is one of the core members of the tidyverse and one of the most widely used R packages for data visualization. In this workshop, participants will learn the principles of tidy data, how to transform and combine datasets using the tools from the tidyverse, and how to generate advanced visualizations with the ggplot2 package.
Dasu, T., & Johnson, T. (2003). Exploratory Data Mining and Data Cleaning. https://doi.org/10.1002/0471448354
Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. Retrieved from http://ggplot2.org

Is this workshop for me?
  • Do you routinely spend long days transforming and cleaning your Excel files to get them ready for analysis and/or making plots?
  • Do you work with tricky data files (measurements from different types of equipment, normally provided as raw text files)?
  • Do you often struggle to make sense of big, complex datasets?
  • Do you often have to combine different datasets in order to perform your research?
  • Do you want to communicate your findings in a beautiful and reproducible way by generating publication-ready plots?

Then this workshop will equip you with the skills to tackle the above use-cases and many more!

Assumed knowledge

Participants should be familiar with the concepts taught in the course “Introduction to Data Science with R and R Studio” and be comfortable working with:

  • Vectors, lists and data frames
  • Importing and saving data
  • Using functions
Learning Outcomes
  • Participants will be taught the principles of tidy data, how to best structure their data using the tools of the tidyverse and the concept of data analysis pipelines.
  • Participants will learn the basic concepts of relational data and how to combine different datasets in a reproducible and efficient manner.
  • Participants will learn the syntax and philosophy of the grammar of graphics as implemented by the ggplot2 package.
  • Participants will learn how to make different types of visualizations using the ggplot2 package. They will be able to explore their own datasets using scatter plots, boxplots, bar charts, smooth fitted lines in scatter plots, etc.
  • Participants will learn how to customize their figures to achieve publication-level quality by adjusting the labels, legends, colors, and coordinate systems, among others.
  • Participants will be introduced to interactive data visualization and to the visualization of geographical data using R interfaces to state-of-the-art web technologies.
Methods
  • The workshop consists of interactive presentations that are frequently interrupted by short do-it-yourself periods, during which participants solve exercises of increasing complexity. Participants will spend at least half of the time writing R code and thinking about data science problems.
  • To practice all the skills taught in the workshop, participants will solve a small project on the last day using a dataset from a real research project. The different challenges faced by the participants will then be discussed at the end of the course as a wrap-up.
Course setup

The course is spread across four days over two consecutive weeks and takes place in a dedicated Microsoft Teams group that will be created for the course. Each day of the course will be broken into three sections by a lunch break (1 hour) and two shorter breaks (20 minutes). Each of the first two sections of a course day will be chaired by one of the instructors, who will share their computer screen with the rest of the participants. During these sections, theoretical concepts will be taught via a presentation, mixed with live coding practice (i.e. the instructor writes the code to solve a problem), interaction with participants, and short exercises (2 – 3 minutes each) performed by participants on their own. The other instructor will answer questions in the chat of the Teams group. The course instructors will organise a 2-hour session 1 week before the start of the course to help students solve any technical difficulties.

General information
Target Group: The course is aimed at PhD candidates, postdocs, and academic staff
Group Size: 24 participants
Course duration: 4 days (two days a week from 9:00 to 17:00)
Language of instruction: English
Frequency of recurrence: To be determined
Number of credits: 1.2 ECTS
Lecturers: Ioannis Baltzakis, Alejandro Morales Sierra
Prior knowledge: Participants should be familiar with all the concepts taught in the course “Introduction to R” offered by PE&RC
Location: Online (Microsoft Teams)

 

Fees (1)
Participant category | Early-bird fee | Regular fee
PE&RC/WIMEK/EPS/WASS/VLAG/WIAS PhD candidates with an approved TSP and EngD candidates | € 150,- | € 200,-
PE&RC postdocs and staff | € 300,- | € 350,-
All other academic participants | € 340,- | € 390,-
Non-academic participants | € 490,- | € 540,-

(1) The Early-Bird Fee applies to anyone who REGISTERS ON OR BEFORE 9 SEPTEMBER 2024.

Note:

  • If you need an invoice to complete your payment, please send an email to office.pe@wur.nl, including ALL relevant details that should be mentioned on the invoice (e.g., purchase order no., specific addresses, attendees, etc.).
  • Under the Early-Bird policy, the moment of REGISTRATION (and not payment) determines which fee applies to you.
  • Please make sure that your payment is arranged within two weeks after your registration.
  • It is the participant's responsibility to make sure that they (or their secretary) complete the payment correctly and on time.
PE&RC Cancellation Conditions
  • Up to 4 (four) weeks prior to the start of the course, cancellation is free of charge.
  • Up to 2 (two) weeks prior to the start of the course, 50% of the participation fee will be charged.
  • In case of cancellation within two weeks prior to the start of the course or no show, 100% of the participation fee will be charged.

Note: If you would like to cancel your registration, ALWAYS inform us (and note that you will be held to the cancellation conditions).

More information

Inka Bentum (PE&RC)
Phone: +31 (0) 317 845414
Email: inka.bentum@wur.nl
