Tidy data transformation and visualization with R

Try Out - Postgraduate course

Tidy data transformation and visualization with R
tidyverse and ggplot2

Monday 24 - Wednesday 26 February 2020

Scope

It is often said that 80% of a data analysis pipeline consists of the tedious process of cleaning and preparing data so that they can be used for analysis and visualization (Dasu & Johnson, 2003). Tidy data makes data transformation and visualization easier, and works hand in hand with the tools provided by the tidyverse collection of R packages in a way that promotes reproducibility and efficiency. ggplot2 (Wickham, 2009), one of the core members of the tidyverse, is among the best and most widely used R packages for data visualization. In this workshop, participants will learn the principles of tidy data, how to transform and combine datasets using the tools of the tidyverse, and how to generate advanced visualizations with the ggplot2 package.
Dasu, T., & Johnson, T. (2003). Exploratory Data Mining and Data Cleaning. https://doi.org/10.1002/0471448354
Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. Retrieved from http://ggplot2.org

Is this workshop for me?
  • Do you routinely spend long days transforming and cleaning your Excel files to get them ready for analysis and/or making plots?
  • Do you work with tricky data files (measurements from different types of equipment, normally provided as raw text files)?
  • Do you have to combine small or big datasets from different experiments/sources to get to your final datasets?
  • Do you want to communicate your findings in a beautiful and reproducible way by generating publication-ready plots?

Then this workshop will equip you with the skills to tackle the above use-cases and many more!

Assumed knowledge

Participants should be familiar with the concepts taught in the course “Introduction to Data Science with R and R Studio” (https://www.pe-rc.nl/data-science-r) and be comfortable in working with:

  • Vectors, Lists and Data.Frames
  • Importing and saving data
  • Using functions
Learning Outcomes
  • Participants will be taught the principles of tidy data, how to best structure their data using the tools of the tidyverse and the concept of data analysis pipelines.
  • Participants will learn the basic concepts of relational data and how to combine different datasets in a reproducible and efficient manner.
  • Participants will learn the syntax and philosophy of the grammar of graphics as implemented by the ggplot2 package.
  • Participants will learn how to make different types of visualizations using the ggplot2 package. They will be able to explore their own datasets using scatter plots (optionally with smoothed fitted lines), boxplots, bar charts, etc.
  • Participants will learn how to customize visualizations to achieve publication-level quality by adjusting the labels, legends, colors, and coordinate systems of their plots.
Methods
  • The workshop consists of interactive presentations that are frequently interrupted by 5-minute do-it-yourself periods, during which participants solve exercises of increasing complexity. Participants will spend at least half of the time writing R code and thinking about data science problems.
  • On the afternoon of the last day, participants will work on a small project to practice all the skills taught in the workshop with a real dataset from a research project. The different challenges faced by the participants will then be discussed at the end of the course as a wrap-up.
General information
Target Group: The course is aimed at PhD candidates, postdocs, and academic staff
Group Size: 16 participants
Course duration: 3 days
Language of instruction: English
Frequency of recurrence: To be determined
Number of credits: 0.9 ECTS
Lecturers: Ioannis Baltzakis, Alejandro Morales Sierra
Prior knowledge: Participants should be familiar with the concepts taught in the course “Introduction to Data Science with R and R Studio” (https://www.pe-rc.nl/data-science-r) and be comfortable in working with "Vectors, Lists and Data.Frames", "Importing and saving data", and "Using functions".
Location: Wageningen University Campus
Options for accommodation: Accommodation is not included in the course fee, but there are several possibilities in Wageningen. For information on B&Bs and hotels in Wageningen, please visit proefwageningen.nl. Another option is Short Stay Wageningen. Airbnb also offers several rooms in the area. Finally, there are a number of Facebook groups where students advertise rooms to sublet, including Wageningen Room Subrent, Wageningen Room Sublets, Room Rent Wageningen, and Wageningen Student Plaza. Note that, besides the restaurants in Wageningen, there are also options to have dinner on Wageningen Campus.

 

Fees 1
Participant category | EARLY-BIRD FEE 2 | REGULAR FEE 2
PE&RC / WIMEK / EPS / WASS / VLAG / WIAS PhD candidates with an approved TSP | € 150,- | € 200,-
All other PhD candidates; postdocs and staff of the above-mentioned Graduate Schools | € 290,- | € 340,-
All others | € 440,- | € 490,-

1 The course fee includes coffee/tea and lunches. It does not include accommodation (NB: options for accommodation are given above).
2 The Early-Bird Fee applies to anyone who REGISTERS ON OR BEFORE 10 FEBRUARY 2020

Note:

  • If you need an invoice to complete your payment, please send an email to office.pe@wur.nl, including ALL relevant details that should be mentioned on the invoice (e.g., purchase order no., specific addresses, attendees, etc.).
  • Under the Early-Bird policy, the date of REGISTRATION (not payment) determines the fee that applies to you.
  • Please make sure that your payment is arranged within two weeks of your registration.
  • It is the participant's responsibility to make sure that he/she (or his/her secretary) completes the payment correctly and on time.
PE&RC Cancellation Conditions
  • Up to 2 (two) weeks prior to the start of the course, cancellation is free of charge.
  • Between 2 (two) weeks and 1 (one) week prior to the start of the course, a fee of € 150,- will be charged.
  • In case of cancellation within 1 (one) week prior to the start of the course, a fee of € 290,- will be charged.
  • If you do not show up at all, a fee of € 440,- will nevertheless be charged.

Note: If you would like to cancel your registration, ALWAYS inform us (and do note that you will be held to the cancellation conditions).

More information

Dr. Lennart Suselbeek (PE&RC)
Phone: +31 (0) 317 485426
Email: lennart.suselbeek@wur.nl

Registration

To register, please enter your details below and click "Register".