Machine learning plays an increasingly important role in many scientific areas, including geo-information science and remote sensing, ecology, biosystems engineering, and bioinformatics. Today, scientific data are growing in complexity, size, and resolution, and scientists are challenged to leverage available data to inform decision making. In this course, you will learn how to model patterns and structures contained in data, and how to evaluate data-driven models, i.e. models that learn about the phenomena under study directly from observations.
The course will focus on the following topics:
- The machine learning methodology, and framing scientific problems as machine learning tasks
- Data preparation and representation
- Key algorithms for regression, classification, and clustering
- Qualitative and quantitative comparison of characteristics, (dis)advantages, and performance of a number of key algorithms
- Design and implementation of effective solutions based on chosen algorithms to solve practical problems
Through a series of lectures and practical exercises (in Python), participants will learn about different strategies and their suitability for specific problems in the environmental sciences, while the course content remains general enough for a broader audience. Participants are encouraged to bring their own problems to class and to analyse data from their own research.
| Day 1 | Morning: introduction to machine learning, methodology, and best practices. Afternoon: introduction to Python; practical on data preparation and representation, cross-validation, and training/test splits |
| Day 2 | Morning: lecture on regression methods (linear, LASSO, feature selection, trees, neural networks). Afternoon: practical on regression methods |
| Day 3 | Morning: lectures on classification methods (Bayesian, kNN, logistic regression, SVMs, ensembles, forests). Afternoon: practical on classification methods |
| Day 4 | Morning: lectures on unsupervised analysis (hierarchical clustering, k-means, EM, PCA, t-SNE). Afternoon: practical on unsupervised analysis |
| Day 5 | Morning: bring your own data: frame your science question as a learning task and work with your own data. Afternoon: feedback and discussion; outlook on advanced and current topics (e.g. deep learning) |
Useful links
Dr Ricardo da Silva Torres (Artificial Intelligence Group, Wageningen University & Research)
Dr Ioannis Athanasiadis (Artificial Intelligence Group, Wageningen University & Research)
| Target Group | The course is aimed at PhD candidates, postdocs, and other academics who are interested in machine learning applied to environmental data |
| Group Size | Min. 15 / Max. 20 participants |
| Course duration | 5 days |
| Prior knowledge | Basic skills in statistics are a plus. Practicals will be in Python. A short introduction to Python will be provided on the first day, but previous programming experience in R or Python is required |
| Accommodation | Accommodation is not included in the course |
| Location | Wageningen University, Forum building |
| Room number | 01-06-2026: B0106; 02-06-2026: B0106; 03-06-2026: B0767; 04-06-2026: B0107; 05-06-2026: B0106 |
| Fees¹ | Fee |
| PE&RC/WIMEK/EPS/WASS/VLAG/WIAS PhD candidates with approved TSP and WU EngD candidates | € 330,- |
| PE&RC postdocs and staff | € 660,- |
| All other academic participants | € 700,- |
| Non-academic participants | € 1360,- |
¹ The course fee includes a reader, coffee/tea, and lunches. It does not include accommodation.
Note:
PE&RC Cancellation Conditions
IMPORTANT: ALWAYS read the Cancellation conditions for PE&RC courses and activities.
- PE&RC
Email: office.pe@wur.nl