ECOL6002P, the second semester of an academic year
Instructor: Fanglin Liu
Teaching assistants:
When & where: Wednesday and Friday, periods 3-5 (9:45-12:10), 3A409
Some notes by Fanglin Liu: https://github.com/flliu315
Ecology is the study of the relationships between living organisms, including humans, and their physical environment. It seeks to understand the vital connections between organisms and the world around them. Ecology also provides information about the benefits of ecosystems and how we can use Earth’s resources in ways that leave the environment healthy for future generations.
Distributed sensor networks allow the acquisition of huge volumes of data on many relevant aspects, ranging from soil and vegetation characteristics and abiotic conditions such as weather to animal behavior. Collecting such large amounts of data shifts the emphasis away from frequentist hypothesis testing and towards analytics focused on prediction, classification, pattern recognition or anomaly detection. To this end, machine learning techniques are often used, usually supported by high-performance computing.
In the digital era, researchers are embracing data science, i.e., the unification of data processing, statistics, artificial intelligence and their related algorithms to extract knowledge from data. Data science is therefore becoming an integral part of decision making in many fields, including ecology and wildlife conservation. To keep up with these developments, young scientists and students need to become acquainted with the terms, concepts and methodology of data science, including the integration and pre-processing of data from different sources, and the engineering of informative and discriminating features for creating effective algorithms.
This class covers the main elements of a data science approach to solving ecological problems. Students will be guided through the main concepts and skills required to become a successful data scientist. The class builds upon and expands the understanding and skills developed in other courses, and focuses on combining them in an interdisciplinary way to solve ecological problems with a data-driven approach. Students will gain knowledge and skills that will benefit their future careers in academia.
Experience with programming in R is needed to follow and successfully complete this course. Students without prior experience in R are expected to acquire the necessary skills during the class, including:
the main types of R objects (vector, matrix, data frame, list), reading and exporting data, manipulating data (sorting, merging), and creating fully reproducible R scripts;
drawing effective scientific figures: choosing an appropriate figure type, drawing it, and understanding the difference between raster and vector graphics.
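As a rough illustration of the R skills listed above, the following minimal sketch uses base R only. The data, object names and file paths are hypothetical and are not taken from the course materials; the sketch simply shows one way the listed operations can fit together in a reproducible script.

```r
# Hypothetical example data; none of these names come from the course materials.
counts  <- c(12, 7, 30)                         # numeric vector
m       <- matrix(1:6, nrow = 2)                # 2 x 3 matrix
sites   <- data.frame(site = c("A", "B", "C"), count = counts)   # data frame
habitat <- data.frame(site = c("C", "A", "B"),
                      type = c("forest", "meadow", "wetland"))
info    <- list(project = "demo", n_sites = nrow(sites))         # list

# Export a data frame to CSV and read it back (a temporary file keeps the script self-contained)
csv_path <- tempfile(fileext = ".csv")
write.csv(sites, csv_path, row.names = FALSE)
sites_in <- read.csv(csv_path)

# Sort rows by count, largest first
sorted <- sites_in[order(sites_in$count, decreasing = TRUE), ]

# Merge the two tables on their shared key column
merged <- merge(sites_in, habitat, by = "site")

# Vector graphics (PDF): scales without loss of quality
pdf_path <- tempfile(fileext = ".pdf")
pdf(pdf_path, width = 4, height = 3)
barplot(merged$count, names.arg = merged$site, ylab = "Count")
dev.off()

# Raster graphics (PNG): fixed pixel grid, so the resolution must be chosen up front
if (capabilities("png")) {
  png_path <- tempfile(fileext = ".png")
  png(png_path, width = 800, height = 600, res = 200)
  barplot(merged$count, names.arg = merged$site, ylab = "Count")
  dev.off()
}
```

Because every input is created inside the script and every output goes to a temporary file, running the script from a fresh R session should reproduce the same objects each time.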
After successful completion of this class, students are expected to understand the significance of data science in solving typical ecological problems, and to be able to:
understand how key features of ecological data influence the selection, training, validation and evaluation of algorithms;
identify and select machine learning algorithms appropriate to specific ecological problems;
apply data science skills (data processing, feature engineering, and machine learning algorithms) to analyse ecological data;
critically evaluate the results and performance of trained algorithms, and assess the reliability and adequacy of trained algorithms in predicting ecological phenomena;
create ecological insight from data using a data science approach.
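To make the training and evaluation steps named in these outcomes more concrete, here is a minimal sketch in base R. It assumes simulated (not real) presence/absence data, a logistic regression standing in for a machine learning algorithm, and a simple 70/30 hold-out split; all variable names are hypothetical, and the course itself may use other algorithms and validation schemes.

```r
set.seed(42)  # fixed seed so the simulated example is reproducible

# Simulated data: probability of species presence increases with temperature
n <- 200
temperature <- runif(n, min = 5, max = 25)
presence <- rbinom(n, size = 1, prob = plogis(-4 + 0.3 * temperature))
dat <- data.frame(temperature = temperature, presence = presence)

# Hold out 30% of the rows for evaluation; train on the remaining 70%
test_idx <- sample(n, size = round(0.3 * n))
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Train: logistic regression as a simple stand-in classifier
fit <- glm(presence ~ temperature, family = binomial, data = train)

# Evaluate on data the model has not seen
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)
accuracy <- mean(pred == test$presence)

print(table(predicted = pred, observed = test$presence))  # confusion matrix
print(accuracy)
```

Evaluating on held-out rows rather than on the training data is what allows the accuracy to say something about how reliably the model predicts new observations.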
The final evaluation consists of three parts: