Tutorial 1a: Global Simulations#

This tutorial is an introduction to running global scale simulations of the Community Terrestrial Systems Model (CTSM). It will guide you through running a simulation and provides example visualization of simulation results. Once you complete the global simulation in this tutorial, you can use the Day1b_Global_Visualization tutorial to explore the model data further.

In the previous Day0b_NEON_Simulation_Tutorial example, many steps to run CLM were condensed into a single function. Here you’ll learn how to create your own case, then set up, build, and run the simulation. These are the basic steps required to run CLM and to modify the code for model experiments.

A few notes about the model:#

  • There are several configuration options of CTSM. Throughout this tutorial we will use the Community Land Model (CLM) configuration, which is the climate mode of CTSM. For the rest of this tutorial, we refer to the model as CLM and use version 5.1 with satellite phenology (CLM5.1-SP).

  • Satellite phenology (or SP) simulations prescribe the distribution of vegetation and its leaf area index (LAI). These SP simulations are useful for understanding the water and energy fluxes simulated by the model, as well as gross primary productivity (GPP). SP simulations focus on the biogeophysics of the model. Accordingly, there is no biogeochemistry (i.e., carbon fluxes) in SP simulations downstream of GPP calculations. This simplification removes potential feedbacks or biases on simulated LAI that may arise from plant allocation, autotrophic respiration, and turnover. SP simulations can be conducted with CLM5.1-SP or with CLM_FATES-SP.

  • FATES - the Functionally Assembled Terrestrial Ecosystem Simulator - more realistically represents terrestrial vegetation by simulating forest size structure, disturbance, and competition. FATES is an external module that can run within a given “Host Land Model”; here we are using CLM. This example uses the satellite phenology scheme in FATES, one of the reduced-complexity modes of FATES that helps to isolate the representation of canopy processes like radiative transfer and photosynthesis. Several FATES configurations are available in CLM.

Additional information:#

Additional information about CTSM and CLM is available on the website, including technical documentation, a user’s guide, and a quickstart guide for running various model configurations beyond what is covered in this tutorial.

Additional information about FATES is available on the FATES GitHub site.

Questions about this tutorial?#


TIP: Before we get started, make sure you’re in a bash kernel

  • Switch kernel (upper right of your current notebook)
  • Select either one of the Bash Kernels from the pop-up window
  • Click select

HINT: Most of the examples in this tutorial can be run directly from the code cells of this notebook.

It’s also helpful to have a terminal window open so you can run commands from the command line.

To open a terminal window within CESM-Lab:

  1. Click on the + symbol in the upper left for a New Launcher
  2. Click on the Terminal icon

In this tutorial#

The tutorial has several components. Below you will find steps to:

  1. Run an out-of-the-box CLM simulation using CLM_FATES-SP

  2. Locate model history files

  3. Take a basic look at model history files

NOTICE: This tutorial assumes that you’ve done your homework!

If you haven’t downloaded CTSM from the GitHub repository, you need to go back to the Day0a_GitStarted tutorial and do this first!


1. Set up and run a simulation

CTSM can be run in 4 simple steps.#

1.1 create a new case

  • This step sets up a new simulation. It is the most complicated of these four steps because it involves making choices to set up the model configuration

1.2 invoke case.setup#

  • This step configures the model so that it can compile

1.3 build the executable#

  • This step compiles the model

1.4 submit your run to the batch queue#

  • This step submits the model simulation to the cloud


1.1 create a new case

Set up a new simulation


1.1.1 create a directory for cases#

This is a one-time step to create a directory to store your experiment cases:

mkdir -p ~/clm_tutorial_cases

1.1.3 Create a new case#

Run create_newcase from the cime/scripts directory of your CTSM checkout:

./create_newcase --case ~/clm_tutorial_cases/I2000_CTSM_FATESsp --res f45_g37 --compset I2000Clm51FatesSpRsGs --run-unsupported

./create_newcase#

NOTE: There is a lot of information that goes into creating a case.

You can learn more about the options by typing ./create_newcase --help on the command line.

We’ll briefly go over some of the highlights here.

Required arguments to create a new case#

Three arguments are required to create a new case:

  1. --case, which specifies the location and name of the case being created

  • ~ = your home directory

  • clm_tutorial_cases = the subdirectory we created to store your cases in 1.1.1

  • I2000_CTSM_FATESsp = the name of the case you’re creating

  • Recommendation: Use meaningful names, including model version, type of simulation, and any additional details to help you remember the configuration of this simulation

  2. --res, which defines the model resolution, or grid

  • f45_g37 is an alias for a 4x5 degree grid on the atmosphere and land (f45). This coarse resolution is more efficient for testing. Note that g37 refers to the resolution of the ocean model. Don’t worry, the ocean model won’t be used for this case.

  • Commonly, the land model is run at a nominal 1 degree (f09_g17) or 2 degree (f19_g17) resolution

  • Using ./query_config --grids provides a list of supported model resolutions

  3. --compset, which defines the component set for your case

  • The Component set specifies the default configuration for the case which includes:

    • Component models (e.g. active vs. data vs. stub),

    • Time period of simulations and forcing scenarios (e.g. 1850 vs 2000 vs. HIST) and

    • Physics options (e.g. CLM5.1 vs CLM5.0).

  • I2000Clm51FatesSpRsGs is an alias that stands for a much longer set of components used for this case.

We’ll come back to compsets, but one other optional flag used to create this new case is worth touching on briefly here:

  1. --run-unsupported avoids an error when using compsets that are not scientifically supported

NOTE: You may notice an error about project codes when you create your case. The project code isn’t important for these simulations. But you may need to change this if you’re running on Cheyenne.
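As a hedged sketch, the three required arguments and the optional flag can be kept in shell variables and assembled into the command, which makes it easier to reuse the same settings for another case. The variable names (CASE_ROOT, CASE_NAME, RES, COMPSET, CMD) are illustrative and not part of CIME, and the snippet only prints the command rather than running it.

```shell
# Illustrative variable names; the values are the ones used in this tutorial.
CASE_ROOT="$HOME/clm_tutorial_cases"
CASE_NAME="I2000_CTSM_FATESsp"
RES="f45_g37"
COMPSET="I2000Clm51FatesSpRsGs"

# Assemble the command as a string and print it (dry run: no case is created).
CMD="./create_newcase --case $CASE_ROOT/$CASE_NAME --res $RES --compset $COMPSET --run-unsupported"
echo "$CMD"
```

Pasting the printed line into a terminal in the directory that contains create_newcase would then create the case.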


More on component sets#

  • All CLM-only compsets start with “I”.

  • Using ./query_config --compsets clm provides examples of other CLM compsets

You can try this here:

./query_config --compsets clm

The long name for the compset used here is 2000_DATM%GSWP3v1_CLM51%FATES-SP_SICE_SOCN_SROF_SGLC_SWAV, which defines

  • time = 2000_ (vs. 1850, HIST, SSP, etc)

  • data atmosphere DATM%GSWP3v1_, here from GSWP3, as opposed to model atmosphere (CAM)

  • land model CLM51%FATES-SP_ CLM5.1 physics package with FATES-SP

  • stub sea ice model SICE_

  • stub ocean model SOCN_ (see there’s no active ocean model!)

  • stub river model SROF_ (other options include MOSART, MIZUROUTE or RTM)

  • stub glacier SGLC_

  • stub wave model SWAV_

Key Definitions:

  • Active: Simulation is using the code from the model during the run

  • Data: Simulation is reading in data from a file for this component

  • Stub: Component is not being used
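To make the structure of a compset long name concrete, the sketch below splits the long name on underscores and counts the stub components. Treating every field that starts with "S" as a stub is a heuristic inferred from the names listed above, not an official rule, and the variable names are illustrative.

```shell
# Long name of the compset, with the stub river model (SROF) described above.
LONGNAME="2000_DATM%GSWP3v1_CLM51%FATES-SP_SICE_SOCN_SROF_SGLC_SWAV"

# Print one component per line by turning each underscore into a newline.
printf '%s\n' "$LONGNAME" | tr '_' '\n'

# Count all fields, then the fields that start with "S" (heuristic for stubs).
N_FIELDS=$(printf '%s\n' "$LONGNAME" | tr '_' '\n' | wc -l | tr -d ' ')
N_STUBS=$(printf '%s\n' "$LONGNAME" | tr '_' '\n' | grep -c '^S')
echo "fields: $N_FIELDS, stub components: $N_STUBS"
```

The fields appear in the same order as the list above: time period, atmosphere, land, sea ice, ocean, river, glacier, and wave.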

HINT: Some compsets are “scientifically supported” and others are not. A scientifically supported compset just means that we’ve done some additional testing and evaluation with that compset. You can use an unsupported compset (which is encouraged!), but you will need to add the option --run-unsupported at the end of the create_newcase command line.

TIP: More information on model Configurations and Grids can be found on the CESM website (see Configurations and Grids subheading at the bottom of the page)


1.2 Invoke case.setup

This step configures the model so that it can compile


1.2.1 Move to your case directory#

cd ~/clm_tutorial_cases/I2000_CTSM_FATESsp

1.2.2 Set up your case#

The ./case.setup script:

  1. configures the model

  2. creates files to modify input data and run options

./case.setup

Using this command, we just configured the model and created the files to modify options & input data.


1.3 Build the executable

This step compiles the model

It also takes a long time, so be patient.


The ./case.build script:

  1. Checks input data

  2. Creates a build/run directory with model executable and namelists

qcmd -- ./case.build

You can read on, but wait for the model to build before executing any more code blocks in the notebook. This can take a while, especially while you’re waiting for your qcmd job to start and as the code for the land model compiles.

You’ll see text stating MODEL BUILD HAS FINISHED SUCCESSFULLY when it’s finished.

NOTE: The command qcmd -- ./case.build is specific to NCAR environments, including Cheyenne and cloud configurations. It runs the command on a compute node, reducing the load on the login node. You must include qcmd -- when running on Cheyenne, and it’s highly advised on shared cloud systems too. On single-user cloud systems, it isn’t needed, though it may speed up builds.
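One way to write a build step that works both where qcmd exists and where it does not is to test for the command first. This is a minimal sketch under that assumption; BUILD_CMD is an illustrative variable name, and the snippet only prints the command it would use.

```shell
# Use qcmd when it is on the PATH (NCAR-style systems), otherwise build directly.
if command -v qcmd >/dev/null 2>&1; then
  BUILD_CMD="qcmd -- ./case.build"
else
  BUILD_CMD="./case.build"
fi

# Dry run: show which command would be used from inside the case directory.
echo "Build command: $BUILD_CMD"
```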


1.4 Submit your run

This step submits the model simulation

But first we need to check on a few things.


Case Customization Checks#

The model is now compiled and ready to run!

There are a few things we should check before submitting the run.

Customizing your case can happen in a few ways. Here we’ll introduce:

  1. XML change (*.xml files), and

  2. Namelist modifications (user_nl_* files)

1.4.1 XML files#

You may want to customize a number of features for your simulation. For example:

  1. How many days or years will the model simulate?

  2. How much time does the computer need for this simulation?

  3. Which computing project account is the model charging to?

These options are specified in the env_*.xml files in your case directory

The XML files can be modified directly, but we recommend that you use the xmlchange script. Next, we’ll review how to check and modify variables in XML files.

xmlchange#

Using the xmlchange script is the preferred method to modify XML files, although you can also edit them by hand.

Benefits of using xmlchange:

  1. Allows changing variables in env_*.xml files using a command-line interface

  2. Won’t let you mess up the syntax! The script checks the setting immediately for validity.

  3. Settings are copied into the CaseStatus file, providing documentation of your changes.

The env_batch.xml and env_run.xml files include most of the variables you will need to modify to set up and run simulations.

These files can be changed at any time before running the simulation.

A few useful tips for using the xml scripts:

  1. Use ./xmlquery --listall to list variables and their values in the .xml files
  2. To modify a variable in an .xml file, use ./xmlchange
  3. For help, type ./xmlchange --help
  4. To list variables in a specific file (here env_run.xml), use ./xmlquery --file env_run.xml
  5. You can also query multiple variables at once by separating the variable names with a comma (“,”), or search by partial name: ./xmlquery -p {string} lists every variable whose name includes this string. Try it for “STOP” or “CLM”.

Example:#

./xmlchange {variable to be changed}={value to change to}


Many runtime variables are found in the env_run.xml file. The variables in this file control the mechanics of the run (length, resubmits, and archiving).

Common variables in env_run.xml to change include:

  1. STOP_OPTION sets the run-time interval type, i.e. nmonths, ndays, nyears

  2. STOP_N sets the number of run-time intervals to run the model during the specified wallclock time.

  • Wallclock time is set in the env_batch.xml file and is the actual (real-world) time the job is allowed to run.

  3. RESUBMIT sets the number of times to resubmit the run

To Do:#

Use xmlquery to find the values of the variables listed above. Feel free to play around with options

./xmlquery STOP_OPTION
./xmlquery STOP_N
./xmlquery RESUBMIT

From the list above, you should find these values:

  1. STOP_OPTION -> ndays

  2. STOP_N -> 5

  • By default, the model is set to run for 5 days.

  3. RESUBMIT -> 0

  • Resubmitting your job allows for longer runs (when you would otherwise run out of wallclock time) by continuing the run from a restart file.


We will not see any model output from a 5-day model run, because history files with model output are only written once a month by default.

  • Let’s change the length of the simulation to 1 year.

./xmlchange STOP_OPTION=nyears
./xmlchange STOP_N=1

This changes the run-time interval from 5 days to one year.

We won’t resubmit this case, but setting RESUBMIT=1 would allow you to run for a second year.

NOTE: If you try to change a variable to a value that isn’t an option, you will get an error message with a list of valid values.

Verify that the changes worked as you expected:

./xmlquery STOP_OPTION,STOP_N
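The relationship between these three variables can be written as a one-line calculation: each submission runs STOP_N intervals, and the run is submitted once plus RESUBMIT more times. The sketch below uses plain shell variables that mirror the XML variable names; it does not read or change your case.

```shell
# Values mirroring the XML variables discussed above (RESUBMIT=1 is the
# "second year" example, not the setting used in this tutorial).
STOP_OPTION="nyears"
STOP_N=1
RESUBMIT=1

# Total simulated length = intervals per submission x number of submissions.
TOTAL=$(( STOP_N * (RESUBMIT + 1) ))
echo "Total simulated length: $TOTAL $STOP_OPTION"
```

With RESUBMIT=0, as in this tutorial, the same expression gives a single year.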

NOTE: If you’re running on Cheyenne or another HPC machine, changing JOB_WALLCLOCK_TIME may help you get through the queue more quickly. For example, it won’t take 12 hours to run a 1-year simulation, and your simulation will get into the queue more quickly if you set a shorter run time. This is not important when running in the cloud.

You can find out timing information for some standard simulations here. Click on a compset similar to the one you will run and look for the “model throughput” value. This is how many simulated years the computer can run in a 24-hour timeframe. If you’re running on Cheyenne, you can estimate the amount of time you’ll need based on this.
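As a worked example of that estimate, the wallclock time needed is the number of simulated years divided by the throughput, converted from days to hours. The throughput value below is purely hypothetical and chosen only to make the arithmetic easy to follow; substitute the value reported for a compset similar to yours.

```shell
# HYPOTHETICAL throughput in simulated years per 24-hour day (not a measured value).
THROUGHPUT=120
# Length of the planned run in simulated years.
YEARS=1

# hours = years / (years per day) * 24 hours per day
HOURS=$(awk -v y="$YEARS" -v t="$THROUGHPUT" 'BEGIN { printf "%.2f", y / t * 24 }')
echo "Estimated wallclock time: $HOURS hours"
```

In practice you would probably request somewhat more than this estimate, to leave room for initialization and for writing output.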


1.4.2 Namelist changes#

You may want to customize additional features for your simulation. These can be controlled with namelist changes.

  • Not all changes can be made with ./xmlchange

  • Additional changes are made using namelist files: user_nl_<model>

  • user_nl_<model> files are created in the case directory after ./case.setup

The compset that you created your case with sets up initial, or default namelist options. These can be found in CaseDocs/lnd_in

  • user_nl_clm modifies the namelist lnd_in options

  • Important: Don’t modify the namelist file (lnd_in) directly. Instead, make changes in user_nl_clm.

  • Some namelist variables can also be changed in the env_run.xml file.

  • This website has CLM5.0 namelist variables

We already modified this when we created your case with the --user-mods-dirs fates_sp flag. This user-mods directory sets the user_nl_clm options correctly for a FATES-SP case.

Check that it worked by opening the file in an editor.

TIP: You can do this with any editor. The two options below use emacs or vi, but there are lots of other options, as well as online resources on how to use them.


To Do:

Open your user_nl_clm file to see what’s inside.

You can navigate to your case directory in a few different ways:

  1. Open it via the navigation sidebar

  • Click on the clm_tutorial_cases directory

  • Click on the I2000_CTSM_FATESsp case directory

  • Click on the user_nl_clm file in the sidebar of your JupyterLab window.

  2. Open the same file with an editor from the command line in your terminal window.

    • emacs ~/clm_tutorial_cases/I2000_CTSM_FATESsp/user_nl_clm

or

  • vi ~/clm_tutorial_cases/I2000_CTSM_FATESsp/user_nl_clm

  3. Alternatively, you can just cat the file from this notebook, but you won’t be able to make changes

These options were set up for you when you created the FATES-SP case using user_mods_dirs

cat ~/clm_tutorial_cases/I2000_CTSM_FATESsp/user_nl_clm

Is use_fates_sp set to .true.?

If not, from your case directory you can type on the command line: echo "use_fates_sp = .true." >> user_nl_clm. This is a simplified way of writing information to user_nl_clm.

The other text you see in user_nl_clm turns off some of the biogeochemistry and fire options that are not needed for an SP case. Other options reduce the output that’s written to history files.
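Because echo with >> appends a new line every time it runs, re-running the command leaves duplicate entries in the file. A minimal sketch of one way to avoid that is shown below; add_setting is a hypothetical helper (not part of CIME), and a temporary file stands in for user_nl_clm so that the snippet does not touch your case.

```shell
# Temporary stand-in for user_nl_clm so the sketch can be run anywhere.
NL_FILE="$(mktemp)"

# Hypothetical helper: append "name = value" unless the name is already set.
add_setting() {
  name="$1"; value="$2"; file="$3"
  if ! grep -q "^ *${name} *=" "$file"; then
    echo "${name} = ${value}" >> "$file"
  fi
}

add_setting use_fates_sp .true. "$NL_FILE"
add_setting use_fates_sp .true. "$NL_FILE"   # second call adds nothing

# Show the file and count the lines that mention the setting.
cat "$NL_FILE"
COUNT=$(grep -c 'use_fates_sp' "$NL_FILE")
echo "lines setting use_fates_sp: $COUNT"
```

To use the helper on a real case, you would pass the path to user_nl_clm in your case directory instead of the temporary file.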


1.4.3 Submit the case!#

./case.submit

When you submit a job, you will see confirmation that it was successfully submitted.

Congratulations! You’ve created and submitted a global CLM-FATES-SP case.#

Next, you will probably want to check on the status of your jobs.

TIP: This is dependent on the scheduler that you’re using.

  • Cheyenne uses PBS where status is checked with qstat -u $USER
  • This is also enabled in the cloud for you, try it in the code block below

If you want to stop the simulation, you can do so with qdel here (or on Cheyenne).

  • Find your Job ID after typing qstat
  • Type qdel {Job ID}

qstat -u $USER

Once your jobs are complete (or show the ‘C’ state in the ‘S’ column, which means complete), we can check the CaseStatus file to ensure that there were no errors and that the run completed successfully. To do this, we’ll ‘tail’ the end of the CaseStatus file:

tail ~/clm_tutorial_cases/I2000_CTSM_FATESsp/CaseStatus

2. Locate model history files#

Your simulation will likely take some time to complete. The information provided next shows where the model output will be located while the model is running and once the simulation is complete. We also provide files from a simulation that is already complete so that you can do the next exercises before your simulation completes.

While your simulation is running, history files go to your scratch directory:

  • /scratch/{USER}/{CASE}

Within this directory you can find /run and /bld subdirectories.

When the simulation is complete, a short-term archive directory is created, and history files are moved here:

  • /scratch/{USER}/archive/{CASE}/lnd/hist/

Note that files necessary to continue the run are left in the run directory: /scratch/{USER}/{CASE}/run

Let’s see what’s in your run directory.

cd /scratch/$USER/I2000_CTSM_FATESsp/run
ls

What’s in your run directory?

Do you see any log or history files?

  • log files look like lnd.log*

  • history files look like I2000_CTSM_FATESsp.clm2.h0.2000-01.nc

You can keep running the cell above until you see log and history files; once they appear, the model is running.


ls /scratch/$USER/archive/I2000_CTSM_FATESsp/lnd/hist

What’s in your archive directory?

If you don’t see any history files your simulation is likely still running or in the queue (check using squeue or qstat). Check again before you leave today to see if your simulation completed and if the files were transferred to archive. Even if your run isn’t finished, you can move on in this tutorial.

Next, let’s explore data from a similar simulation that already ran.

TIP: On Cheyenne these directories are found here:

  • /glade/scratch/{USER}/{CASE}/run
  • /glade/scratch/{USER}/archive/{CASE}/lnd/hist/
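The two locations described in this section can be checked together with a short loop. This is a sketch that assumes the cloud layout under /scratch shown above (on Cheyenne, SCRATCH would point under /glade/scratch instead); the variable names are illustrative, and directories that do not exist yet are simply reported.

```shell
CASE="I2000_CTSM_FATESsp"
SCRATCH="/scratch/${USER:-unknown}"

# Where history files are written during the run, and where they are archived.
RUN_DIR="$SCRATCH/$CASE/run"
ARCHIVE_DIR="$SCRATCH/archive/$CASE/lnd/hist"

for d in "$RUN_DIR" "$ARCHIVE_DIR"; do
  if [ -d "$d" ]; then
    # Count files named like I2000_CTSM_FATESsp.clm2.h0.2000-01.nc
    n=$(ls "$d" | grep -c "^$CASE\.clm2\.h0\..*\.nc$" || true)
    echo "$d: $n history file(s)"
  else
    echo "$d: not found (yet)"
  fi
done
```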

3. Basic dump of model history files#

There are a few command-line tools you can use to view netCDF data files. One of the most useful is ‘ncdump’:

ncdump

This is a tool that generates a text representation of netCDF data. It is useful for providing information about the variables (names, types, and shapes), dimensions (names and sizes), attributes (names and values), and values of data for all or selected variables.


Navigate to this directory, where data from a completed simulation are stored:

cd /scratch/data/day1/I2000_CTSM_FATESsp/lnd/hist/
ls *2000-*

Let’s look at the information included in the file in a text format. This is better to execute on the command line, but you can also do it here.

ncdump -h I2000_CTSM_FATESsp.clm2.h0.2000-01.nc | more

If you must run commands in a notebook, run the code block below - we’re only showing the first 50 lines here, to reduce the output:

ncdump -h I2000_CTSM_FATESsp.clm2.h0.2000-01.nc | head -50

NOTES:

  1. Use the “-h” option to look through the variable names, attributes and dimensions. If you do not use an option, ncdump will list this information and all the data values of all the variables, which is a lot of information!!
  2. In a terminal, use tools like the “| more” command so that you can scroll through the information from the start of the file.
    • hitting return will advance line by line
    • hitting the space bar will advance more quickly
    • hitting q will exit
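Since the header can be long, it is often handy to filter it. The sketch below lists only the lines that declare single-precision variables, which appear in ncdump output as lines starting with the type name float. It assumes you are in the directory that holds the file, and it skips quietly if ncdump or the file is not available; FILE and STATUS are illustrative names.

```shell
FILE="I2000_CTSM_FATESsp.clm2.h0.2000-01.nc"

# Only run ncdump if the tool is installed and the file is present.
if command -v ncdump >/dev/null 2>&1 && [ -f "$FILE" ]; then
  # -h prints the header only; keep the first 20 float variable declarations.
  ncdump -h "$FILE" | grep 'float ' | head -20
  STATUS="listed"
else
  echo "ncdump or $FILE not available here; skipping"
  STATUS="skipped"
fi
```

Once you know a variable name, ncdump -v followed by that name prints the header plus that variable’s data values.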

Next, go ahead to the Day1b_GlobalVisualization notebook.