
SCINet Geospatial Research Workshop 2020

Harnessing SCINet computational resources in geospatial data science to further sustainable and intensified agriculture.

Hosted By: SCINet Geospatial Research Working Group with support from USDA ARS, SCINet Scientific Computing Initiative

Dates: 8/25/2020 - 9/1/2020


Goals

The 2020 SCINet Geospatial Workshop continues the efforts outlined at the 2019 workshop held in Las Cruces, NM. The two overarching goals of this workshop are to:

  1. provide hands-on learning experiences (tutorials) on workflows to access the Ceres high-performance computing (HPC) system and conduct geospatial and machine learning research at scale,
  2. foster research efforts that had previously been unattainable due to computational limitations or technical bottlenecks. This includes developing infrastructure and exploring state-of-the-art machine learning methods applicable to geospatial sciences.


Organizing Committee

Rowan Gaffney, Physical Scientist, Ft Collins, CO
Kerrie Geil, SCINet Postdoc, Las Cruces, NM
Amy Hudson, SCINet Postdoc, Las Cruces, NM
Yanghui Kang, SCINet Postdoc, Beltsville, MD
Suzy Stillman, SCINet Postdoc, Las Cruces, NM


How to Participate

All members of the working group as well as non-members from USDA ARS are welcome to participate! We also welcome our University collaborators who have USDA SCINet accounts. We hope that everyone will attend the general session of the working group (Session 1) and then choose other sessions to attend based on their own interests and skill level.

The workshop is split over 6 separate Zoom sessions (as well as a pre-meeting assistance session), each described in the schedule below.

To follow along with the tutorials you need a SCINet account (apply for one if you do not already have one) and must be able to log in to it successfully. We recommend applying for an account by 8/12/2020 at the latest, as the process can take 1-2 weeks for final approval. Please note, if you need help accessing your SCINet account you should plan on attending the pre-meeting login assistance session on 8/19/2020 (Session 0), but make sure you have applied for an account well in advance of this session.

To follow along with the Session 4 Tutorial: Computational Reproducibility Tools, make sure you create a free personal GitHub account for yourself and remember your GitHub username and password. You will also, of course, need a SCINet account as described above.

Please register for each session individually using the registration links below so we can have an idea of how many people will be present at each event. Note, each session will have a separate Zoom link and password so you must register for each session you would like to attend.

Lastly, review the pre-meeting checklist and background information on the Pre-meeting page to ensure you are prepared for the workshop sessions.


Schedule / Registration

Note: All workshop sessions are open to all scientists and scientific staff at USDA ARS. We also welcome ARS contractors and University collaborators who have a SCINet account. Please make sure to register separately for each session you plan on attending (the Zoom join details are different for each session).



Session 0: Pre-meeting SCINet Account Login Assistance

Wednesday August 19, 11am - 1pm MDT
No registration required, just show up at 11am MDT: session completed
Prerequisites: None

For those who plan on participating in any of the Sessions 2-5 tutorials, this pre-meeting session with the SCINet Virtual Research Support Core (VRSC) is to help anyone who is having trouble accessing their SCINet account.

Please ensure that you have applied for a SCINet account well in advance of this pre-meeting session, as there are multiple approvals (including your supervisor's) that a new account must pass through before it receives final approval. The suggested final date for applying for a new account in order to be ready for this pre-meeting session is Wednesday Aug 12. Go to https://scinet.usda.gov/signup/ to start the account application process.


Session 1: Annual Meeting of the SCINet Geospatial Research Working Group

Tuesday August 25, 11am - 2pm MDT
Registration Required: session completed
Prerequisites: None

We encourage everyone to attend this general session: members and non-members from USDA ARS.

AGENDA (MDT)  
11-11:10 Welcome and Session Rules
11:10-11:30 Review of the 2019 workshop
11:30-11:45 Details on the upcoming 2020 sessions
11:45-12 Introduction to the SCINet postdocs
12-12:15 break
12:15-1:15 Working Session: SCINet common data library
1:15-1:45 Working Session: geospatial workbook
1:45-2 Proposals for new working group initiatives


Session 2: Tutorial: Introduction to the Ceres High-Performance Computing System Environment (SSH, JupyterHub, Basic Linux, SLURM batch script)

Thursday August 27, 11am - 1pm MDT
Registration Required: session completed
Prerequisites: have a SCINet account and be able to log in (apply for an account at https://scinet.usda.gov/signup/)

This interactive follow-along session will demonstrate how to access the SCINet Ceres HPC system by using Secure Shell (SSH) at the command line as well as by using the JupyterHub web interface. We will also cover how to access JupyterLab and RStudio on the Ceres HPC through the JupyterHub web interface, basic Linux commands, and how to write a SLURM batch script to submit a compute job on the Ceres HPC.
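For readers unfamiliar with the term, a SLURM batch script is a shell script whose leading `#SBATCH` comment lines request resources from the scheduler. The sketch below is not taken from the session materials; it simply renders a minimal script from Python to show the general shape. The helper name `render_sbatch`, the job name, and the partition name `short` are assumptions chosen for illustration, so check the Ceres documentation for the partitions actually available.

```python
def render_sbatch(job_name, command, partition="short",
                  time_limit="00:10:00", nodes=1, ntasks=1):
    """Return the text of a minimal SLURM batch script (illustrative only)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",      # assumed partition name
        f"#SBATCH --time={time_limit}",          # wall-clock limit, HH:MM:SS
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --output={job_name}_%j.out",   # %j is replaced by the job ID
        "",
        command,                                  # the work the job performs
    ]
    return "\n".join(lines) + "\n"


script = render_sbatch("hello_ceres", 'echo "Hello from $(hostname)"')
print(script)
```

Saving the printed text to a file such as `hello_ceres.sh` and running `sbatch hello_ceres.sh` on a login node would submit it to the scheduler.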

We will not troubleshoot individual SCINet account access problems during this session. If you are having trouble accessing your account please plan to attend Session 0.


Session 3: Tutorial: Introduction to Distributed Computing on the Ceres HPC System Using Python and Dask

Thursday August 27, 1:30pm - 2:30pm MDT
Registration Required: session completed
Prerequisites: basic Python or other basic programming skill helpful (expertise not required), have a SCINet account and be able to log in (apply for an account at https://scinet.usda.gov/signup/)

This session will be an interactive follow-along about how to compute in parallel on the Ceres HPC system using Python tools. Participants will use their own SCINet account to walk through a Jupyter Notebook and execute Python code on the Ceres HPC system.
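As a rough illustration of the idea behind computing in parallel, the sketch below splits a dataset into chunks, processes the chunks concurrently, and combines the partial results. It deliberately uses only Python's standard library rather than Dask, so it is a stand-in for the pattern and not the Dask API taught in the session; the function names and chunk size are assumptions made for the example. Threads are used only to keep the sketch portable, so it shows the structure of the computation rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(values, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum_of_squares(piece):
    """Work done independently on one chunk."""
    return sum(v * v for v in piece)


def parallel_sum_of_squares(values, chunk_size=250, workers=4):
    """Map the per-chunk function over all chunks, then reduce the results."""
    pieces = chunk(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, pieces))
    return sum(partials)


data = list(range(1000))
total = parallel_sum_of_squares(data)
print(total)
```

A distributed library follows the same split, apply, and combine structure but schedules the chunks across many cores or nodes instead of a handful of local threads.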

We will not cover how to log in to your SCINet account or troubleshoot individual account access problems during this session. If you are having trouble accessing your account please plan to attend Session 0. If you are new to working in an HPC environment, attending Session 2 first will be helpful but is not required.


Session 4: Tutorial: Computational Reproducibility Tools (Git/GitHub, Conda, Docker/Singularity containers)

Friday August 28, 10:30am - 12:30pm MDT
Registration Required: session completed
Prerequisites: basic Linux, create a free GitHub account for yourself and remember your username/password, have a SCINet account and be able to log in (apply for an account at https://scinet.usda.gov/signup/)

This interactive follow-along session will demonstrate how to use Git/GitHub, the Conda package/environment management system, and Docker/Singularity containers on the Ceres HPC system. During the Git/GitHub portion we will cover how to copy an existing GitHub repo to your SCINet/Ceres account, make a change to the repo locally, push the repo online to your own GitHub account, and open a pull request to get your changes incorporated into the original repo. The Conda portion will cover how to access or install Conda on Ceres, how to use Conda to install software on Ceres, how to use Conda environments to document all the software you are using and eliminate dependency issues, and how to save your Conda environment details to a specification file so that you can quickly recreate your complete software environment for any project. We will also cover how containers can allow your code to run successfully on different operating systems, how to use (and create) a Docker image, and how to use Singularity on the Ceres HPC to run a container from a Docker image.
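To make the idea of a Conda specification file concrete, the sketch below renders a small `environment.yml` from Python. It is only an illustration of the file's layout (a name, a list of channels, and a list of dependencies); the environment name, channel, and package pins are assumptions picked for the example and are not the workshop's actual environment.

```python
def render_environment_yml(name, channels, dependencies):
    """Return the text of a minimal Conda environment specification file."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}" for package in dependencies]
    return "\n".join(lines) + "\n"


spec = render_environment_yml(
    name="geo_workshop",                            # hypothetical environment name
    channels=["conda-forge"],
    dependencies=["python=3.8", "dask", "xarray"],  # example packages only
)
print(spec)
```

In practice such a file is usually written by Conda itself with `conda env export > environment.yml`, and the environment is recreated elsewhere with `conda env create -f environment.yml`.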

We will not cover basic Linux, how to log in to your SCINet account, or troubleshoot individual account access problems during this session. If you are having trouble accessing your account please plan to attend Session 0. If you need basic Linux help or are new to working in an HPC environment please plan to first attend Session 2.


Session 5: Tutorial: Distributed Machine Learning: Using Gradient Boosting to Predict NDVI Dynamics

Friday August 28, 1:00pm - 2:30pm MDT
Registration Required: session completed
Prerequisites: basic Python and basic HPC skill helpful (expertise not required), have a SCINet account and be able to log in (apply for an account at https://scinet.usda.gov/signup/)

This interactive follow-along tutorial uses a machine learning gradient boosting model (XGBoost) to predict NDVI (Harmonized Landsat Sentinel) from daily weather (PRISM) and physiologic variables (soil properties) at the Central Plains Experimental Range (CPER) Long Term Agro-ecosystem Research station. Participants will use their own SCINet account to walk through a Jupyter Notebook and execute Python code on the Ceres HPC system.
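For readers new to gradient boosting, the sketch below shows the core idea in pure Python: start from a constant prediction, then repeatedly fit a very simple model (here a one-split "stump") to the current residuals and add a damped version of it to the prediction. This is not XGBoost, which adds regularization, second-order gradient information, and distributed training, and the data are a synthetic sine curve rather than the NDVI, PRISM, or soil datasets used in the tutorial. All function names and parameter values are assumptions made for illustration.

```python
import math


def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, rounds=50, learning_rate=0.1):
    """Boost stumps on squared error: each round fits the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [i / 10 for i in range(40)]    # synthetic predictor
ys = [math.sin(x) for x in xs]      # synthetic response
model = fit_boosted(xs, ys)
baseline_mse = mse(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mse(ys, [predict(model, x) for x in xs])
print(f"baseline MSE: {baseline_mse:.4f}, boosted MSE: {boosted_mse:.4f}")
```

The comparison at the end contrasts the training error of the constant baseline with that of the boosted ensemble on the same toy data.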

The workflow involves:

We will not cover basic Python, basic distributed/parallel computing, how to log in to your SCINet account, or troubleshoot individual account access problems during this session. If you are having trouble accessing your account please plan to attend Session 0. If you have limited experience working on an HPC system we recommend first attending Sessions 2 and 3.


Session 6: Symposium: Challenges and opportunities in leveraging machine learning techniques to further sustainable and intensified agriculture

Tuesday September 1, 11am - 2pm MDT
Registration Required: session completed
Prerequisites: None

This session is for USDA ARS scientists, scientific staff, and University collaborators who are interested in learning about how machine learning is being used in agricultural research. We will have 4 invited speakers from outside of USDA ARS give talks about using machine learning for a range of agricultural research questions, followed by a panel discussion.

AGENDA (MDT)

11-11:10 Drs Yanghui Kang & Amy Hudson, USDA-ARS SCINet Postdocs

11:10-11:40 Dr Matthew Jones, University of Montana

11:45-12:15 Dr Liheng Zhong, Descartes Labs

12:20-12:50 Dr Vasit Sagan, Saint Louis University

12:55-1:25 Dr Jingyi Huang, University of Wisconsin

1:30-1:40 Short Break

1:40-2:15 Panel Discussion


More information about our invited speakers can be found on our Session 6 page.


Website Content Metadata

Website Content: CC BY-SA Rowan Gaffney / Kerrie Geil 2020 (get source code). Creative Commons License

Website Theme: workshop-template-b by evanwill is built using Jekyll on GitHub Pages. The site is styled using Bootstrap with FontAwesome icons.