This lesson has passed peer-review! See the publication in JOSE.

Command line programs


  • How can I write my own command line programs?

  • Use the defopt library to manage command-line arguments in a program.

  • Structure Python scripts according to a simple template.

We’ve arrived at the point where we have successfully defined the functions required to plot the precipitation data.

We could continue to execute these functions from the Jupyter notebook, but in most cases notebooks are simply used to try things out and/or take notes on a new data analysis task. Once you’ve scoped out the task (as we have for plotting the precipitation climatology), that code can be transferred to a Python script so that it can be executed at the command line. It’s likely that your data processing workflows will include command line utilities from the CDO and NCO projects in addition to Python code, so the command line is the natural place to manage your workflows (e.g. using shell scripts or make files).

In general, the first thing that gets added to any Python script is the following:

if __name__ == '__main__':
    main()

The reason we need these two lines of code is that running a Python script in bash is very similar to importing that file in Python. The biggest difference is that we don’t expect anything to happen when we import a file, whereas when running a script we expect to see some output (e.g. an output file, figure and/or some text printed to the screen).

The __name__ variable exists to handle these two situations. When you import a Python file, __name__ is set to the name of that file without the .py extension (e.g. when importing a file called script.py, __name__ is 'script'), but when you run a script in bash, __name__ is always set to '__main__'. The if statement therefore calls the function that produces the output only when the file is run as a script. The convention is to call that function main(), but you can call it whatever you like.
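The behaviour of __name__ can be checked directly. The following is a minimal sketch, assuming a throwaway one-line file called script.py that is written to a temporary directory purely for the demonstration; it imports the file and then executes it the way python would, recording the value of __name__ each time.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Hypothetical one-line module that records the value of __name__ it sees.
SOURCE = "seen_name = __name__\n"


def observed_names():
    """Return the __name__ seen when a file is imported and when it is run."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "script.py"
        path.write_text(SOURCE)

        # Import the file as a module called "script".
        spec = importlib.util.spec_from_file_location("script", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Execute the same file the way `python script.py` would.
        namespace = runpy.run_path(str(path), run_name="__main__")

    return module.seen_name, namespace["seen_name"]


print(observed_names())  # ('script', '__main__')
```

Because the two values differ, a single if statement is enough to separate "being imported" from "being run".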

The next thing you’ll need is a library to parse the command line for input arguments. Essentially, we want to turn our script into a command, something like

$ python [options] [arg1 arg2 ...]

where arg1, arg2 … might be input and output files and options might be --colour=red or -d. Arguments arg1, arg2 are called positional arguments; their role is determined by their position in the argument sequence. The other options are keyword arguments and thus can be supplied by the user in any order. Using keyword arguments is recommended when your script takes many arguments, as it would be tedious to get the order right otherwise. Because scripts can take a large number of options, we also want to get a help message when we type

$ python -h
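To make the distinction between positional arguments and options concrete, here is a minimal sketch using argparse from the standard library. The parser, the file names in.nc and out.png, and the --colour and -d options are illustrative assumptions, not part of any script in this lesson.

```python
import argparse

# Hypothetical parser: the file arguments and the --colour / -d options
# are illustrative names only.
parser = argparse.ArgumentParser(description="Toy command.")
parser.add_argument("infile", help="Input file name")
parser.add_argument("outfile", help="Output file name")
parser.add_argument("--colour", default="blue", help="Line colour")
parser.add_argument("-d", action="store_true", help="Switch on a flag")

# Options can appear before, after or between the positional arguments...
first = parser.parse_args(["--colour=red", "-d", "in.nc", "out.png"])
second = parser.parse_args(["in.nc", "-d", "out.png", "--colour=red"])
print(first == second)  # True

# ...but swapping the positional arguments changes their meaning.
swapped = parser.parse_args(["out.png", "in.nc"])
print(swapped.infile, swapped.outfile)  # out.png in.nc
```

The parser accepts the swapped call without complaint, which is why positional arguments become error-prone once there are more than a few of them.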

One common way to add argument parsing to your script is the argparse library, which is part of every standard Python installation. Here, we’ll introduce another way, the defopt library, which offers a clean and simple way to obtain the same result. Regardless of whether you rely on argparse or defopt, the three steps involved are: (1) decide which function(s) to run when calling the script, (2) define the types of the function(s)’ arguments and (3) describe what the functions do and what the arguments are.

To be specific, here’s a template for what most Python command line programs look like:

$ cat
import defopt

# All your functions (that will be called by main()) go here.

def main(infile: str, outfile: str):
    """
    Print the input arguments to the screen.

    :param infile: Input file name
    :param outfile: Output file name
    """
    print('Input file: ', infile)
    print('Output file: ', outfile)

if __name__ == '__main__':
    defopt.run(main)

Running the script at the command line shows that defopt handles all the (positional) input arguments:

$ python
Input file:
Output file:

It also generates help information for the user:

$ python -h
usage: [-h] infile outfile

Print the input arguments to the screen.

positional arguments:
  infile      Input file name
  outfile     Output file name

optional arguments:
  -h, --help  show this help message and exit

and issues errors when users give the program invalid arguments:

$ python
usage: [-h] infile outfile
error: the following arguments are required: outfile
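The same kind of error can be reproduced and inspected from within Python. This sketch uses argparse directly rather than defopt, on the assumption that defopt hands the actual parsing to argparse; the program name template, the helper try_parse and the file name in.nc are hypothetical.

```python
import argparse
import contextlib
import io

# Hypothetical stand-in for the template's command line interface.
parser = argparse.ArgumentParser(prog="template")
parser.add_argument("infile", help="Input file name")
parser.add_argument("outfile", help="Output file name")


def try_parse(argv):
    """Return (exit_code, error_text); exit_code is None on success."""
    stderr = io.StringIO()
    try:
        with contextlib.redirect_stderr(stderr):
            parser.parse_args(argv)
    except SystemExit as exc:
        return exc.code, stderr.getvalue()
    return None, stderr.getvalue()


code, message = try_parse(["in.nc"])  # outfile is missing
print(code)
print(message)
```

The non-zero exit code is what lets a shell script or Makefile notice that the command failed and stop the workflow.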

Does this look like magic? It works because we declared the type of each argument in the signature of the main function and documented each argument in its docstring. For instance, infile: str indicates that we expect a string, and the corresponding docstring entry is :param infile: Input file name. Finally, we had to tell defopt which function should be run when we execute the script (defopt.run(main)).
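There is no magic in the sense that everything a parser needs is already attached to the function object. The sketch below is not how defopt is implemented; it is a simplified illustration, using only the standard library and a hypothetical helper called describe, of how the argument names, types and help strings can be recovered from a function's signature and docstring.

```python
import inspect
import re
from typing import get_type_hints


def main(infile: str, outfile: str):
    """
    Print the input arguments to the screen.

    :param infile: Input file name
    :param outfile: Output file name
    """
    print('Input file: ', infile)
    print('Output file: ', outfile)


def describe(func):
    """Collect (name, type, help text) for each parameter of func."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    # Pick up ":param name: text" lines from the docstring.
    helps = dict(re.findall(r":param (\w+): (.*)", doc))
    return [(name, hints.get(name), helps.get(name))
            for name in inspect.signature(func).parameters]


for name, kind, text in describe(main):
    print(name, kind.__name__, text)
```

Each printed row holds the same three pieces of information that appear in the generated help message.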

Using this template as a starting point, we can add the functions we developed previously to a new script:

$ cat
import xarray as xr
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import cmocean
import defopt

def convert_pr_units(darray):
    """Convert kg m-2 s-1 to mm day-1.

    Args:
      darray (xarray.DataArray): Precipitation data

    """
    darray.data = darray.data * 86400
    darray.attrs['units'] = 'mm/day'
    return darray

def create_plot(clim, model, season, gridlines=False):
    """Create plot.

    Args:
      clim (xarray.DataArray): Precipitation climatology data
      model (str): Name of the climate model
      season (str): Season
      gridlines (bool): Select whether to plot gridlines

    """
    fig = plt.figure(figsize=[12,5])
    ax = fig.add_subplot(111, projection=ccrs.PlateCarree(central_longitude=180))
    clim.sel(season=season).plot.contourf(ax=ax,
                                          levels=np.arange(0, 13.5, 1.5),
                                          extend='max',
                                          transform=ccrs.PlateCarree(),
                                          cbar_kwargs={'label': clim.units},
                                          cmap=cmocean.cm.haline_r)
    ax.coastlines()
    if gridlines:
        ax.gridlines()
    title = f'{model} precipitation climatology ({season})'
    plt.title(title)

def main(pr_file: str, season: str, output_file: str):
    """
    Plot the precipitation climatology.

    :param pr_file: Precipitation data file
    :param season: Season to plot
    :param output_file: Output file name
    """
    dset = xr.open_dataset(pr_file)
    clim = dset['pr'].groupby('time.season').mean('time', keep_attrs=True)
    clim = convert_pr_units(clim)

    create_plot(clim, dset.attrs['source_id'], season)
    plt.savefig(output_file, dpi=200)

if __name__ == '__main__':
    defopt.run(main)

… and then run it at the command line:

$ python data/ MAM pr_Amon_ACCESS-CM2_historical_r1i1p1f1_gn_201001-201412-MAM-clim.png
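The factor of 86400 in convert_pr_units is simply the number of seconds in a day. The worked example below is a sketch on plain numbers rather than an xarray.DataArray; it assumes liquid water with a density of 1000 kg m-3, so that 1 kg of water spread over 1 m2 forms a layer 1 mm deep. The function name and the sample flux are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def kg_m2_s_to_mm_day(flux):
    """Convert a precipitation flux in kg m-2 s-1 to mm day-1.

    Assumes liquid water with a density of 1000 kg m-3, so that
    1 kg m-2 corresponds to a 1 mm deep layer.
    """
    return flux * SECONDS_PER_DAY


# A hypothetical flux of 5e-5 kg m-2 s-1 is a few millimetres per day.
print(round(kg_m2_s_to_mm_day(0.00005), 2))
```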


For this series of challenges, you are required to make improvements to the script that you downloaded earlier from the setup tab at the top of the page.

For the first improvement, edit the line of code that defines the season argument (..., season: str,...) so that it only allows the user to input a valid three letter abbreviation (i.e. ['DJF', 'MAM', 'JJA', 'SON']).

(Hint: Read about the choices keyword argument in the defopt documentation.)


from typing import Literal
def main(pr_file: str, season: Literal['DJF', 'MAM', 'JJA', 'SON'], output_file: str):
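To see why a Literal annotation is enough to restrict the input, the following sketch recovers the allowed values with typing.get_args and passes them to an argparse parser as choices. It illustrates the idea only and is not a description of defopt's internals; the alias Season and the variable valid_seasons are hypothetical names.

```python
import argparse
from typing import Literal, get_args

Season = Literal['DJF', 'MAM', 'JJA', 'SON']

# get_args recovers the allowed values from the annotation.
valid_seasons = get_args(Season)
print(valid_seasons)  # ('DJF', 'MAM', 'JJA', 'SON')

# A parser can then reject anything outside that set.
parser = argparse.ArgumentParser()
parser.add_argument("season", choices=valid_seasons)
print(parser.parse_args(["MAM"]).season)  # MAM
```

Passing any other value, such as XYZ, makes the parser print an error listing the valid choices and exit.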


Add an optional command line argument that allows the user to add gridlines to the plot.

(Hint: Define the keyword argument gridlines to be of type bool and give it a default value.)


Make the following additions to the script (code omitted from this abbreviated version of the script is denoted ...):


def main(pr_file: str, season: Literal['DJF', 'MAM', 'JJA', 'SON'], output_file: str, *, gridlines: bool=False):


    create_plot(clim, dset.attrs['source_id'], season, gridlines=gridlines)


Note the * in the argument list of main, which indicates that all subsequent arguments are keyword-only arguments. defopt turns these into command line options and generates their names automatically. The long version of the option is the variable name, here --gridlines, and the short version takes its first letter, here -g.
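The effect of the bare * is a feature of Python itself, independent of defopt. In this small sketch, describe_plot is a hypothetical stand-in that returns a string instead of drawing anything.

```python
def describe_plot(title, *, gridlines=False):
    """Toy stand-in: report the settings instead of drawing anything."""
    return f"{title} (gridlines={gridlines})"


# Arguments after the * must be passed by name.
print(describe_plot("MAM", gridlines=True))  # MAM (gridlines=True)

# Passing the value positionally is rejected by Python itself.
try:
    describe_plot("MAM", True)
except TypeError as err:
    print("TypeError:", err)
```

Forcing the name to be spelled out keeps calls readable and means new options can be added later without disturbing the order of the positional arguments.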

Colourbar levels

Add an optional keyword argument that allows the user to specify the tick levels used in the colourbar.


Make the following additions to the script (code omitted from this abbreviated version of the script is denoted ...):


def create_plot(clim, model, season, gridlines=False, levels=None):
    """Plot the precipitation climatology.
      ...
      gridlines (bool): Select whether to plot gridlines
      levels (list): Tick marks on the colourbar

    """
    if not levels:
        levels = np.arange(0, 13.5, 1.5)
    ...




def main(pr_file: str, season: Literal['DJF', 'MAM', 'JJA', 'SON'], output_file: str, *,
         gridlines: bool=False, cbar_levels: list[float]=None):
    """
    ...
    :param cbar_levels: List of levels / tick marks to appear on the colourbar
    """
    ...
    create_plot(clim, dset.attrs['source_id'], season,
                gridlines=gridlines, levels=cbar_levels)
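The pattern of defaulting to None and filling in the levels later can be tried out without any plotting libraries. The sketch below makes several assumptions: it uses a plain Python list in place of np.arange(0, 13.5, 1.5), an argparse option in place of the list[float] annotation handled by defopt, and a hypothetical helper called resolve_levels.

```python
import argparse

# Same values as np.arange(0, 13.5, 1.5): 0.0, 1.5, ..., 12.0
DEFAULT_LEVELS = [i * 1.5 for i in range(9)]


def resolve_levels(levels=None):
    """Return the user's levels, or the defaults if none were given."""
    if not levels:
        return DEFAULT_LEVELS
    return list(levels)


# Hypothetical argparse equivalent of a list[float] option.
parser = argparse.ArgumentParser()
parser.add_argument("--cbar-levels", type=float, nargs="*", default=None)

args = parser.parse_args(["--cbar-levels", "0", "2.5", "5"])
print(resolve_levels(args.cbar_levels))  # [0.0, 2.5, 5.0]
print(resolve_levels(parser.parse_args([]).cbar_levels))
```

Testing with if not levels treats an empty list the same as None, which suits a list coming from the command line. If levels could be a NumPy array, the test if levels is None would be the safer choice, because the truth value of an array with more than one element is ambiguous.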


Free time

Add any other options you’d like for customising the plot (e.g. title, axis labels, figure size).

At the conclusion of this lesson your script should look something like the following:

from typing import Literal

import xarray as xr
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import cmocean
import defopt

def convert_pr_units(darray):
    """Convert kg m-2 s-1 to mm day-1.

    Args:
      darray (xarray.DataArray): Precipitation data

    """
    darray.data = darray.data * 86400
    darray.attrs['units'] = 'mm/day'
    return darray

def create_plot(clim, model, season, gridlines=False, levels=None):
    """Plot the precipitation climatology.

    Args:
      clim (xarray.DataArray): Precipitation climatology data
      model (str): Name of the climate model
      season (str): Season
      gridlines (bool): Select whether to plot gridlines
      levels (list): Tick marks on the colourbar

    """
    if not levels:
        levels = np.arange(0, 13.5, 1.5)
    fig = plt.figure(figsize=[12,5])
    ax = fig.add_subplot(111, projection=ccrs.PlateCarree(central_longitude=180))
    clim.sel(season=season).plot.contourf(ax=ax,
                                          levels=levels,
                                          extend='max',
                                          transform=ccrs.PlateCarree(),
                                          cbar_kwargs={'label': clim.units},
                                          cmap=cmocean.cm.haline_r)
    ax.coastlines()
    if gridlines:
        ax.gridlines()
    title = f'{model} precipitation climatology ({season})'
    plt.title(title)

def main(pr_file: str, season: Literal['DJF', 'MAM', 'JJA', 'SON'], output_file: str, *,
         gridlines: bool=False, cbar_levels: list[float]=None):
    """
    Plot the precipitation climatology.

    :param pr_file: Precipitation data file
    :param season: Season to plot
    :param output_file: Output file name
    :param gridlines: Select whether to plot gridlines
    :param cbar_levels: List of levels / tick marks to appear on the colourbar
    """
    dset = xr.open_dataset(pr_file)
    clim = dset['pr'].groupby('time.season').mean('time', keep_attrs=True)
    clim = convert_pr_units(clim)

    create_plot(clim, dset.attrs['source_id'], season,
                gridlines=gridlines, levels=cbar_levels)
    plt.savefig(output_file, dpi=200)

if __name__ == '__main__':
    defopt.run(main)

Key Points

  • Libraries such as defopt can be used to efficiently handle command line arguments.

  • Most Python scripts have a similar structure that can be used as a template.