In this course, you will learn how and why certain principles, such as immutability and pure functions, enable parallel data processing (‘divide and conquer’), which is necessary to manage big data. Building on this foundation, you will learn how to recognize and put into practice the scalable solution that is right for your situation. The insights and tools of this course apply regardless of programming language, but user-friendly examples are provided in Python, Hadoop HDFS, and Apache Spark. Although these principles can also be applied in other sectors, we will use examples from the agri-food sector.
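To give a flavour of why immutability and pure functions make ‘divide and conquer’ possible, here is a minimal Python sketch. It is not taken from the course material: the milk-yield figures and the function names (`split`, `summarize`, `combine`, `parallel_mean`) are made up for illustration, and a thread pool merely stands in for the cluster that Hadoop or Spark would provide.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical daily milk yields in litres. A tuple is immutable,
# so no worker can change the data while another one is reading it.
YIELDS = (28.5, 31.0, 25.5, 30.0, 27.0, 33.0, 29.5, 26.5)


def split(values, n_chunks):
    """Divide: cut the sequence into at most n_chunks contiguous chunks."""
    size = -(-len(values) // n_chunks)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def summarize(chunk):
    """Pure function: the result depends only on the input chunk."""
    return (sum(chunk), len(chunk))


def combine(a, b):
    """Conquer: merge two partial (total, count) results into one."""
    return (a[0] + b[0], a[1] + b[1])


def parallel_mean(values, n_chunks=4):
    """Compute the mean by summarizing chunks independently, then merging."""
    chunks = split(values, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(summarize, chunks))
    total, count = reduce(combine, partials)
    return total / count
```

Because `summarize` never modifies shared state, the chunks can be processed in any order or on different machines, and `parallel_mean(YIELDS)` returns the same value (28.875) however the data is split.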
Agri-food deserves special focus when it comes to choosing robust data management technologies because of its inherent variability and uncertainty. Wageningen University & Research’s knowledge domain is healthy food and the living environment. That makes our data experts especially well equipped to bridge the gap between the agri-food business on the one hand and data science and artificial intelligence (AI) on the other.
Combining data from the latest sensing technologies with machine learning and deep learning methodologies allows us to unlock insights we did not have access to before. In the areas of smart farming and precision agriculture, this allows us to:
Better manage dairy cattle by combining animal-level data on behaviour, health and feed with milk production and composition from milking machines.
Reduce the amount of fertilizers (nitrogen), pesticides (chemicals), and water used on crops by monitoring individual plants with a robot or drone.
More accurately predict crop yields on a continental scale by combining current with historic data on soil, weather patterns and crop yields.
In short, this course’s foundational knowledge and skills for big data prepare you for the next step: to find more effective and scalable solutions for smarter, innovative insights.
Making Data Machine-Readable Webinar
This webinar provides a basic overview of machine-readable data. We cover what machine-readable data is, why we prefer machine-readable data over other digital formats where possible, the characteristics of machine-readable data and how to convert tabular data to a machine-readable format.
Data Dictionaries Webinar
This webinar provides a basic overview of data dictionaries and the role they play in making data more easily understandable. We talk about what a data dictionary is, why we recommend data be accompanied by a data dictionary, the different types of data dictionaries and how to create a data dictionary.
#shareEGU20: Handling your data efficiently from planning to reuse – tips & tools to save time & nerves (webinar)
This online Short Course introduces you to useful tools and best practices that will make your work with research data much easier, more efficient, and more enjoyable. It covers Data Management Plans, reproducible data manipulation in R, version control with Git and GitHub, working with clusters and large climate data, and publishing data. This Short Course is relevant to all geoscience fields, and the tools presented can be applied to all kinds of data sets.
Forschungsdatenmanagement für Agrarwissenschaftler und Biologen (Research Data Management for Agricultural Scientists and Biologists)
Presentation slides for a workshop on research data management for students and researchers in agricultural sciences and biology. Workshop held at Humboldt-Universität zu Berlin, 12 May 2016.
BonaRes Repositorium Intro Video
In the BonaRes repository, which is operated at ZALF (Germany), digital research data from soil and agricultural sciences are published and made available to scientists and other interested parties for reuse. This film explains the basic steps necessary to make use of data publication in the repository.