CRSS 8030
Data Science and Statistical Programming Applied to Ag.
(UGA Spring 2026)

Hi there!

This is the welcome page for the Spring 2026 offering of CRSS 8030, Data Science and Statistical Programming Applied to Ag, taught by Dr. Leo Bastos.

Prep material before first lab

To be ready for the next class (Intro to R), please follow the link below and complete all tasks:

Lab 01 prep

If you have questions or issues, email me before class.

Schedule

Chalkboard

date | time | topic
Tue, Jan 13 | 8:35 to 10:45 | Welcome and Intro
Thu, Jan 15 | 8:35 to 10:45 | R & RStudio intro; HW Assignment #1 - data viz
Tue, Jan 20 | 8:35 to 10:45 | HW Assignment #1 - ggplot; Reproducible tools pt 1; RStudio Projects
Thu, Jan 22 | 8:35 to 10:45 | Rmarkdown; Data wrangling; HW Assignment #2 - data wrangling
Tue, Jan 27 | 8:35 to 10:45 | Reproducible tools pt 2; git/GitHub; Installing git; Connecting git and GitHub; Creating a GitHub repo; First push
Thu, Jan 29 | 8:35 to 10:45 | Regression and optimum pt 1
Tue, Feb 03 | 8:35 to 10:45 | Regression and optimum pt 2
Thu, Feb 05 | 8:35 to 10:45 | DS: Iteration
Tue, Feb 10 | 8:35 to 10:45 | DS: Iteration pt 2
Thu, Feb 12 | 8:35 to 10:45 | ML: open data APIs
Tue, Feb 17 | 8:35 to 10:45 | ML: feature engineering
Thu, Feb 19 | 8:35 to 10:45 | Multivariate models and multicollinearity
Tue, Feb 24 | 8:35 to 10:45 | ML: Dimensionality reduction
Thu, Feb 26 | 8:35 to 10:45 | Review
Tue, Mar 03 | 8:35 to 10:45 | Mid-term
Thu, Mar 05 | 8:35 to 10:45 | Mid-term discussion
Tue, Mar 10 | 8:35 to 10:45 | No class - Spring Break
Thu, Mar 12 | 8:35 to 10:45 | No class - Spring Break
Tue, Mar 17 | 8:35 to 10:45 | ML: Conditional inference tree
Thu, Mar 19 | 8:35 to 10:45 | ML: Random forest
Tue, Mar 24 | 8:35 to 10:45 | ML: XGBoost
Thu, Mar 26 | 8:35 to 10:45 | Final project introduction
Tue, Mar 31 | 8:35 to 10:45 | DS: Cloud computing with GACRC (Sapelo2)
Thu, Apr 02 | 8:35 to 10:45 | ML: Explainable AI
Tue, Apr 07 | 8:35 to 10:45 | ML: Random Forest classification
Thu, Apr 09 | 8:35 to 10:45 | DS: Shiny Dashboard
Tue, Apr 14 | 8:35 to 10:45 | Final project presentations
Thu, Apr 16 | 8:35 to 10:45 | Final project presentations
Tue, Apr 21 | 8:35 to 10:45 | Final project presentations
Thu, Apr 23 | 8:35 to 10:45 | Final exam

Course Syllabus

Course information

General information

  • CRSS 8030 - Data Science and Statistical Programming Applied to Agriculture
  • Spring Semester 2026
  • 3 credit hours

Meeting times and locations

  • Lectures: Tuesday at 8:35 - 10:45
  • Labs: Thursday at 8:35 - 10:45
  • Location:
    • Athens campus: in person at 1203 Miller Plant Sciences
    • Tifton campus: in person at 601 NESPAL South OR remote
    • Griffin campus: in person at 217 SLC OR remote

Prerequisites

STAT 6315 – Statistical Methods for Researchers

Co-requisites

None.

Instructor information

General information

Dr. Leonardo M. Bastos, Assistant Professor
Crop & Soil Sciences Dept.
4101 Miller Plant Sciences Building, Athens Campus
University of Georgia
Email: lmbastos@uga.edu
URL: leombastos.github.io/bastoslab/

Anish Bhattarai, PhD student
Department of Crop & Soil Sciences
Miller Plant Sciences Building, Athens Campus
University of Georgia
Email: ab68010@uga.edu

Office hours

TA office hours are Thursdays from 1 to 3 pm, by appointment. Please email Anish to schedule a time within that window.

Course description and details

Description

This course will expose students to common machine learning data analysis workflows in agriculture while utilizing data science principles. To that end, students will learn how to develop workflows that include finding and importing data, exploratory data analysis, data wrangling and processing, fitting a model to the data, assessing model quality, extracting model information, and creating publication-ready figures. This type of workflow, including regression and machine learning algorithms, will be applied mostly to observational data commonly found in agricultural sciences. Students will learn how to access publicly available data sets for crop, soils, and weather information, and train machine learning models on these data. All of the above will be performed while learning and using data science tools for reproducibility like version control, R statistical programming, APIs to publicly available data sets, task automation, and creating online interactive dashboards.

Course learning outcomes

The general course objective is to provide students with hands-on applied experience in analyzing agricultural data using modern reproducible tools. That involves:

  • Learning and applying analytical workflows that involve importing data, processing, analyzing, assessing model fit, extracting model information, and producing publication-ready figures for different analyses, including regression.
  • Conducting linear and non-linear regression workflows.
  • Learning and applying machine learning concepts (bias-variance trade-off, data split, hyper-parameter optimization, predictive metrics) and algorithms to agricultural observational data (soils, weather, yield).
  • Doing all the above while learning and using data science tools for reproducibility like version control, statistical programming, APIs to publicly available data sets, task automation, and creating online interactive dashboards.

Topical Outline

  1. Intro to R and RStudio (R script, Rmarkdown, Quarto, RStudio Projects)
  2. Version control with git and GitHub
  3. R APIs to publicly available data (USDA NASS, weather, soil)
  4. Data wrangling with dplyr, tidyr, pipe operator
  5. Data visualization with ggplot2, gganimate
  6. Automating repetitive tasks through iteration with purrr
  7. Linear regression
  8. Non-linear regression
  9. Regression for finding optimum
  10. Dimensionality reduction
  11. Machine learning concepts
    1. Bias-variance trade-off
    2. Data split
    3. Hyperparameter optimization
    4. Predictive assessment
  12. Machine learning models
    1. K-means (unsupervised)
    2. Conditional inference tree/Random forest (supervised, regression and classification)
    3. XGBoost
  13. Cloud computing
  14. Explainable AI
  15. Dashboards
    1. Creating a simple dashboard with shiny apps
    2. Publishing a dashboard online

The topical outline is a general plan for the course; deviations announced to the class by the instructor may be necessary.

Course materials

Textbook

A textbook is not required. Reading materials will be supplied by the instructor and will include benchmark research articles, manuals, and other materials.

Technology and software requirements

Students will need to have access to:

  • A computer (to install software, code along with instructors)
  • A second screen (main screen to code along, second screen to watch class if not in person)

If a student does not have access to these resources (personal laptop/desktop and a second screen), please let instructors know to ensure proper accommodations can be made.

Course website

Important links related to this course:

Assessment and Grading

Grading categories

The grade you receive in this course will be determined from your performance on a mid-term project, a mid-term exam, periodic quizzes, homework assignments and lab reports, a final project, and class participation. These factors will be weighted as follows:

Activity Grade
Mid-term project 10%
Mid-term exam 10%
Homework assignments 35%
In-class quizzes 15%
Final project: Machine learning 20%
Class participation 10%

Written assignment quality

Up to thirty percent of the grade on written assignments (mini-project, homework, final project) will be based on quality of communication.

Spelling, grammar, punctuation, and clarity of writing are evidence of written communication quality.

Class participation

Active class participation is important for you to achieve the learning goals of the class. To receive maximum credit for class participation you must:

  • attend every class period (lecture or laboratory)
  • arrive on time and remain for the entire class period
  • be actively engaged and attentive throughout the class period
  • participate in the class discussion and ask and answer questions

Grading scale

Final grades will be assigned as follows:

Letter Grade
A 93 and above
A- 90-92
B+ 87-89
B 83-86
B- 80-82
C+ 77-79
C 73-76
C- 70-72
D+ 67-69
D 63-66
D- 60-62
F 59 and below
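
As an illustration of how the category weights and the grading scale combine, here is a minimal sketch in Python. The weights and letter bands are taken from the tables above; the category names, the example scores, and the treatment of fractional scores (each band's lower limit used as a cutoff, with no rounding) are assumptions made for this sketch, not course policy.

```python
# Weights from the "Grading categories" table, as fractions of the final grade.
WEIGHTS = {
    "midterm_project": 0.10,
    "midterm_exam": 0.10,
    "homework": 0.35,
    "quizzes": 0.15,
    "final_project": 0.20,
    "participation": 0.10,
}

# Lower limit of each letter band from the "Grading scale" table.
# Assumption: a score earns the highest band whose lower limit it reaches
# (no rounding); the syllabus does not specify how fractions are handled.
LETTER_FLOORS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-"),
]


def final_score(scores):
    """Weighted average of category scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a 0-100 score to a letter using the band lower limits."""
    for floor, letter in LETTER_FLOORS:
        if score >= floor:
            return letter
    return "F"


# Hypothetical category scores for one student.
example = {
    "midterm_project": 90,
    "midterm_exam": 80,
    "homework": 95,
    "quizzes": 85,
    "final_project": 90,
    "participation": 100,
}

score = final_score(example)  # 9 + 8 + 33.25 + 12.75 + 18 + 10, about 91
print(round(score, 2), letter_grade(score))
```

With these hypothetical scores the weighted total is about 91, which falls in the A- band.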

Extra credit opportunities

Extra credit opportunities may be made available during projects, homework assignments, and exams, at the discretion of the instructor.

Course statements and policies

Academic honesty

UGA Student Honor Code: “I will be academically honest in all of my academic work and will not tolerate academic dishonesty of others.” A Culture of Honesty, the University’s policy and procedures for handling cases of suspected dishonesty, can be found at www.uga.edu/ovpi.

For this course, all lab reports, projects, and other assignments can be discussed with your classmates but any work you turn in must be your own.

Students can work together through coding exercises, but direct copying and pasting from a colleague will be considered plagiarism.

If using code from an online source, it is ok to copy and paste IF proper credit is given (e.g., showing the website source from where the code was obtained).

Unless explicitly stated, artificial intelligence-based technologies, such as ChatGPT, must not be used to generate responses for student assignments.

Attendance policy

Students are expected to attend every class period.

Students on the Athens campus must attend class in person. If a special circumstance arises (illness, travel, etc.), students must inform the instructors of their absence or remote attendance prior to that class period.

Students on the Tifton and Griffin campuses may attend class in person on their campuses or remotely using the Zoom link. Students must inform the instructors of any absence prior to that class period.

Per Board of Regents policy, I reserve the right to drop students from the class roll who miss more than 5 class periods unexcused. Such students will be given a WF grade.

Disclaimer

The course syllabus is a general plan for the course; deviations announced to the class by the instructor may be necessary.

Make-up procedures

  • There will be no make-ups for missed quizzes. Any missed quiz will be recorded as a zero.
  • Exams can be made up only with a note from a doctor or if you can document extenuating circumstances. Any unexcused missed exam will be recorded as a zero.
  • Homework assignments will be accepted up to one week beyond the due date. The penalty for submitting a late assignment is one letter grade.
  • Homework assignments may be submitted late without penalty in case of illness, extenuating circumstances, or if prior arrangements are made with the instructors. All late assignments are due within a week of the original due date or within a week of when a student returns from an illness.

Mental Health and Wellness Resources

  • If you or someone you know needs assistance, you are encouraged to contact Student Care and Outreach in the Division of Student Affairs at 706-542-7774 or visit https://sco.uga.edu/. They will help you navigate any difficult circumstances you may be facing by connecting you with the appropriate resources or services.
  • UGA has several resources for a student seeking mental health services (https://www.uhs.uga.edu/bewelluga/bewelluga) or crisis support (https://www.uhs.uga.edu/info/emergencies).
  • If you need help managing stress, anxiety, relationships, etc., please visit BeWellUGA (https://www.uhs.uga.edu/bewelluga/bewelluga) for a list of FREE workshops, classes, mentoring, and health coaching led by licensed clinicians and health educators in the University Health Center.
  • Additional resources can be accessed through the UGA App.

Disability statement

If you plan to request accommodations for a disability, please register with the Disability Resource Center. They can be reached by visiting Clark Howell Hall, calling 706-542-8719 (voice) or 706-542-8778 (TTY), or by visiting https://sitedrc.uga.edu

Resources

Below are some resources for students to further their knowledge of topics including Quarto files, vector and raster manipulation in R, data visualization, and geostatistics.