Wednesday, August 5, 2015

Function Arguments: *args and **kwargs

In Python, functions can have positional arguments and named arguments. In this post, I will describe both types and explain how to use special syntax to simplify repetitive function calls with nearly the same arguments. This extends the discussion in section 5.1.3 of A Student’s Guide to Python for Physical Modeling.

First, let’s look at np.savetxt, which has a straightforward declaration:

$ import numpy as np
$ import matplotlib.pyplot as plt
$ from mpl_toolkits.mplot3d import Axes3D

$ np.savetxt?
Signature: np.savetxt(  fname, X, fmt='%.18e', delimiter=' ', newline='\n',
                        header='', footer='', comments='# ')
Docstring: Save an array to a text file.

We see the function has two required arguments, followed by several optional arguments with default values. Next, let’s look at something more exotic:

$ Axes3D.plot_surface?
Signature: Axes3D.plot_surface(X, Y, Z, *args, **kwargs)
Docstring: Create a surface plot.

The first three arguments seem obvious enough: These are the arrays that specify the points on the surface. The last two — *args and **kwargs — look strange. Let’s examine one more function:

$ plt.plot?
Signature: plt.plot(*args, **kwargs)
Docstring: Plot lines and/or markers to the :class:`~matplotlib.axes.Axes`.

*args and **kwargs are the only arguments for the familiar plotting function! They are the focus of this post.
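The information shown by the ? queries above can also be read programmatically with the standard library's inspect module. The sketch below uses a made-up function, demo, which is not part of NumPy or Matplotlib, to show how Python labels each kind of parameter in a signature.

```python
import inspect

def demo(x, y, scale=1.0, *args, **kwargs):
    """Hypothetical function used only to illustrate parameter kinds."""
    return x, y, scale, args, kwargs

# Map each parameter name to the kind Python assigns it.
kinds = {name: param.kind.name
         for name, param in inspect.signature(demo).parameters.items()}

for name, kind in kinds.items():
    print(name, kind)
```

The parameters x, y, and scale are reported as ordinary parameters, while args and kwargs are reported as VAR_POSITIONAL and VAR_KEYWORD, the two "catch-all" kinds discussed in the rest of this post.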

Positional Arguments

In a Python function, positional arguments are Python expressions assigned to function variables based on their position in the function call.

Suppose I create a surface plot of topographic data with the command

ax = Axes3D(plt.figure())
ax.plot_surface(latitude, longitude, elevation)

Python evaluates ax.plot_surface with the substitutions

X = latitude
Y = longitude
Z = elevation

I.e., the substitutions are based on the positions of the arguments. I would get a different (meaningless) surface if I shuffled the order around:

ax.plot_surface(elevation, longitude, latitude)

*args

The *args argument in a function definition allows the function to process an unspecified number of positional arguments. Let’s look at a simple example:

def get_stats(*args):
    from numpy import mean, std
    return mean(args), std(args)

This function will compute the descriptive statistics (mean and standard deviation) of any sequence of values passed to it. Try the following commands:

get_stats(1, 2, 3, 4, 5)
get_stats(range(30))
get_stats(np.random.random(100))

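You can type any number of arguments when calling the function, or you can pass the function a single sequence of values: an array, a tuple, or a list. In the second case, args is a one-element tuple that holds the sequence, and the function still works because NumPy's mean and std reduce over every element of the nested data.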
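The difference between the two calling styles shows up in what args contains inside the function. The sketch below uses a hypothetical helper, show_args, that simply returns whatever positional arguments it receives.

```python
def show_args(*args):
    """Hypothetical helper: return the tuple of positional arguments."""
    return args

several = show_args(1, 2, 3)    # three separate arguments
single = show_args([1, 2, 3])   # one argument that happens to be a list

print(several)  # (1, 2, 3)
print(single)   # ([1, 2, 3],)
```

In both cases args is a tuple. Only its contents differ: three numbers in the first call, a single list in the second.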

This ability to process any number of arguments is what makes it possible to call plt.plot in a variety of ways. All of these commands are valid:

t = np.linspace(-1, 1, 101)
plt.plot(t)
plt.plot(t, t**2 - 1)
plt.plot(t, t**3 - t, 'k--')
plt.plot(t, t**4 - t**2, t, t**5 - t**3 + t)

How can one function process so many different kinds of input, including mixtures of variable names, expressions, and strings? The plot function has several subroutines that determine exactly what is in the series of arguments you supply and what to do with those objects. This can make a function very flexible, but it is also likely to be complex — both to write and to interpret.
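To give a feel for what such a subroutine might do, here is a toy parser that sorts a series of positional arguments into groups of data plus an optional format string. The function group_plot_args and its grouping rule are my own simplification for illustration. This is not how Matplotlib actually parses its arguments.

```python
def group_plot_args(*args):
    """Toy parser: split positional arguments into (data, fmt) groups.

    Each group holds one or two non-string items (the data) and an
    optional string that follows them (the format specifier).
    """
    groups = []
    remaining = list(args)
    while remaining:
        # Take up to two leading non-string items as the data for one curve.
        data = []
        while remaining and len(data) < 2 and not isinstance(remaining[0], str):
            data.append(remaining.pop(0))
        # An optional format string may follow the data.
        fmt = None
        if remaining and isinstance(remaining[0], str):
            fmt = remaining.pop(0)
        groups.append((tuple(data), fmt))
    return groups

t = [0, 1, 2]
y = [0, 1, 4]
print(group_plot_args(t))              # one group, no format string
print(group_plot_args(t, y, 'k--'))    # one group with a format string
print(group_plot_args(t, y, t, y))     # two groups
```

Even this stripped-down version needs nested loops and type checks, which hints at why flexible functions are harder to write and to read.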

You can use the *args notation to “unpack” a sequence into a series of positional arguments for any function. For example, suppose the three topographic data arrays mentioned earlier had been packaged as a single tuple:

data = (latitude, longitude, elevation)

The surface plot function does not know what to do with this tuple, but I can use the *args notation to assign the three arrays to X, Y, and Z.

ax.plot_surface(data)       # Raises an exception.
ax.plot_surface(*data)      # Creates surface plot.

The *data expression instructs Python to assign the items in data to the positional arguments of ax.plot_surface.

This method of passing positional arguments to functions can be convenient when you wish to automate calculations using various combinations of input parameters or to ensure that several functions use the same data:

data = (x, y, z)
f(*data)
g(*data)
h(*data)

If I want to perform the same analysis on a different set of data later, I only need to change the data variable.
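Here is a small runnable version of this pattern. The three functions total, largest, and product are placeholders I made up to stand in for f, g, and h.

```python
def total(x, y, z):
    return x + y + z

def largest(x, y, z):
    return max(x, y, z)

def product(x, y, z):
    return x * y * z

# Define the inputs once, then reuse them in every call.
data = (2, 3, 4)
results = [func(*data) for func in (total, largest, product)]

print(results)  # [9, 4, 24]
```

Changing the single line that defines data changes the input to all three calls at once.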

Named Arguments

In a Python function, named arguments are Python expressions whose value in the function is specified by a keyword-value pair.

For example, this function call from the Illuminating Surface Plots post uses named arguments to specify several options:

ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                facecolors=green_surface)

Functions like plt.plot and Axes3D.plot_surface whose definitions include **kwargs can accept any number of keyword arguments.

**kwargs

Similar to the *args notation, you can use the **kwargs notation to pass a collection of named arguments to a function. To do this, you must package the keyword-value pairs in a dictionary.

A dictionary is a Python data structure that associates an immutable object called a key with a value, which can be mutable or immutable. A future post will discuss dictionaries in more detail. For this post, only the syntax for creating a dictionary is important. Enclose the contents of a dictionary between curly braces: { ... }. Each entry of the dictionary is a key, followed by a colon, followed by a value:

definitions = { 'cat':"n. A feline.", 'dog':"n. A canine."}

definitions['cat']

The first command creates a dictionary. The second accesses one of its members.
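A dictionary is also what a function receives when its definition includes **kwargs: Python collects any extra named arguments into one. The sketch below uses a hypothetical function, describe, to show this.

```python
def describe(title, **kwargs):
    """Hypothetical function: collect any extra named arguments."""
    return title, kwargs

title, options = describe("surface", linewidth=0, antialiased=False)

print(title)    # surface
print(options)  # {'linewidth': 0, 'antialiased': False}
```

Each keyword in the call becomes a string key in the dictionary, and each value is stored under that key.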

In Illuminating Surface Plots, I used the same set of plotting options many times. This led to a lot of typing and, for the commands in the script, a lot of retyping every time I decided to change one of these options. A more efficient approach is to define once and reuse often. I can put all of the data arrays into a tuple and most of the plotting options into a dictionary:

data = (X, Y, Z)
plot_options = {'rstride':1,
                'cstride':1,
                'linewidth':0,
                'antialiased':False}

Most surface plot commands were identical except for the value of the facecolors option. I could create two surface plots with different values of this argument as follows:

ax1.plot_surface(*data, facecolors=green_surface, **plot_options)
ax2.plot_surface(*data, facecolors=illuminated_surface, **plot_options)

This is easier to type and ensures that both plots use the same data set and plotting options.
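One thing to watch for with this pattern: a keyword may not be supplied twice, once by name and once through the dictionary. The sketch below uses a stand-in function, fake_plot_surface, which I made up so the example runs without Matplotlib. It shows one way to change a single shared option for one plot and what happens when a keyword is duplicated.

```python
def fake_plot_surface(X, Y, Z, **kwargs):
    """Stand-in for a plotting call: return the options it received."""
    return kwargs

data = (1, 2, 3)
plot_options = {'rstride': 1, 'cstride': 1,
                'linewidth': 0, 'antialiased': False}

# Copy the shared options and change one entry for a single plot.
thick_line_options = dict(plot_options, linewidth=1)
received = fake_plot_surface(*data, **thick_line_options)

# Supplying the same keyword twice raises a TypeError.
try:
    fake_plot_surface(*data, linewidth=1, **plot_options)
    duplicate_rejected = False
except TypeError:
    duplicate_rejected = True

print(received['linewidth'], plot_options['linewidth'], duplicate_rejected)
```

Copying the dictionary leaves the shared plot_options unchanged, so other plots keep the original settings.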

Summary

Python functions accept positional and named arguments. Functions whose definitions include the arguments *args and **kwargs will accept an unspecified number of either type. You can use this notation to unpack a sequence (list, tuple, or array) into a series of positional arguments or a dictionary into a series of named arguments. This provides a convenient method for calling the same function with slightly different inputs and options, or calling different functions with the same inputs and options.

The rules for supplying arguments to functions are as follows:

  1. Positional arguments (if any) must come first.
  2. Named arguments (if any) and/or positional arguments in the form of *args (if any) must come next.
  3. Named arguments in the form of **kwargs (if any) must come last.
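A call that follows all three rules might look like the sketch below. The function report and the variables extra and options are hypothetical names chosen for this example.

```python
def report(a, b, *args, **kwargs):
    """Hypothetical function: return everything it receives."""
    return a, b, args, kwargs

extra = (3, 4)
options = {'units': 'm'}

# Positional first, then *args and named arguments, then **kwargs.
result = report(1, 2, *extra, label='demo', **options)
print(result)  # (1, 2, (3, 4), {'label': 'demo', 'units': 'm'})

# A plain positional argument after a named one is rejected
# before the code even runs.
try:
    compile("report(a=1, 2)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)  # True
```

The compile call is used here only so that the syntax error can be caught and displayed. Typing report(a=1, 2) directly produces the same error.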


-- Jesse & Phil