Thursday, October 1, 2015

Paths in Python

How do you get your Python interpreter to find modules that are not located in your current working directory? The answer is … you tell it where to look.

When you type something like

from my_module import my_function

Python searches a collection of directories (i.e., folders) on your computer. If the directory containing <my_module.py> is not in this collection, you will receive an ImportError. This can be frustrating if you know the file exists, and even more so if you know where it exists.

In this post, we will take a brief look at how to add paths to the collection of directories searched by Python.

Paths

A path is essentially a set of directions to a file:

/Users/username/modules/my_module.py        [Mac OS X, Linux]
C:\modules\my_module.py                     [Windows]

It tells your operating system how to navigate from a fixed starting point — the “root directory” / in Unix-based systems, or C:\ in Windows — through a collection of folders to the desired file.

In this case, there is a folder called <modules/> that contains the desired file, <my_module.py>. Let’s suppose that you are writing a script called <my_script.py> in your working directory:

/Users/username/scratch/my_script.py        [Mac OS X, Linux]
C:\scratch\my_script.py                     [Windows]

You want Python to import <my_module.py> when you run the script, but it cannot find the file.

Below are four options for adding the path for your module to Python’s collection of paths. None is especially complicated, but the last two require working with the command line and editing startup files on your operating system. If you are unfamiliar or uncomfortable with this, you might prefer the first two methods.

You will need to replace directory names like /Users/username/ or C:\scratch\ with the correct directories on your own system for these examples to work.

Using The sys Module

There is a module called sys that lets you modify the collection of paths from within Python. According to its documentation,

This module provides access to some objects used or maintained by the
interpreter and to functions that interact strongly with the interpreter.

One of these objects is a list called sys.path. Again referring to the documentation,

path -- module search path

All we need to do is add the path to <my_module.py> to sys.path. On Unix-like systems, the commands are

import sys
sys.path.append('/Users/username/modules')

On Windows, it is easier to use a raw string to specify the path:

import sys
sys.path.append(r'C:\modules')

This will only affect the path in the current Python session. If you quit and restart, you will need to add the path again.
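Putting it all together, a script that needs <my_module.py> might begin like this (using the hypothetical directory from above):

import sys
sys.path.append('/Users/username/modules')

from my_module import my_function

Python will now search /Users/username/modules along with its usual directories when it reaches the import statement.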

Using Spyder

If you are using the Spyder IDE, you can permanently add a directory to its collection of paths. Choose the “PYTHONPATH manager” from the menu (python > PYTHONPATH manager) as shown below:

Managing paths within Spyder.

This will open a window that shows a list of directories added to Spyder’s collection of paths. If you have not modified this list already, it will probably contain your working directory and nothing else. At the bottom of this window, there is an icon for “Add Path”. Click on this. Now you can find the directory you want to add, as if you were opening a file.

In this case, we navigate to

/Users/username/modules

and select “Choose”.

Once you have selected a directory, it will become a permanent addition to Python’s paths within Spyder. You will need to quit Spyder and restart or “Restart Kernel” for the changes to take effect.

This is a convenient way to access modules that are not part of the Anaconda distribution from within Spyder.

Using the Command Line

If you are working from a command line, you can add your module directory to Python’s collection of paths by setting the PYTHONPATH variable before you start a Python session or run a script:

$ export PYTHONPATH=${PYTHONPATH}:/Users/username/modules
$ python my_script.py

No matter what directory you are working in, Python now knows where to find <my_module.py>.

If you want to permanently add your module directory to PYTHONPATH, you can add the export command above to your <.profile> or <.bashrc> file.
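To verify the change, start a new Python session and inspect the search path:

import os
import sys

print(os.environ.get('PYTHONPATH'))   # The directories you added.
print(sys.path)                       # Python's full search path.

The directories listed in PYTHONPATH are folded into sys.path when the interpreter starts.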

Startup Files

Another option is to create a Python startup file — a Python script that will be run at the beginning of every interactive Python session (but not when you run scripts from the command line). First, create a file such as <python_startup.py> in your home directory. Then, enter Python commands exactly as you would type them at the command prompt:

# python_startup.py
# Startup script for interactive Python sessions.

import sys
sys.path.append('/Users/username/modules')

Next, edit your <.profile> or <.bashrc> file and add the line

export PYTHONSTARTUP=/Users/username/python_startup.py

And that’s it. Now, every time you start a new Python session, you can access your own modules.

IPython does not use the PYTHONSTARTUP variable. If you want to add a startup script for IPython, place a file like <python_startup.py> in IPython’s directory of startup scripts:

/Users/username/.ipython/profile_default/startup/

Now your modules will be available in every IPython session, including Spyder.

Summary

There are several ways to show Python where to find modules. For a single session, appending a path to the sys.path variable is an easy solution. For a permanent change, you can use Spyder’s PYTHONPATH manager or edit startup files from the command line. These methods will allow you to access modules without copying or moving files into your working directory, and you can easily access modules you write yourself or modules you download from the Web.

Monday, September 28, 2015

Speeding Up Python — Part 2: Optimization

The goal of this post and its predecessor is to provide some tools and tips for improving the performance of Python programs. In the previous post, we examined profiling tools — sophisticated stopwatches for timing programs as they execute. In this post, we will use these tools to demonstrate some general principles that make Python programs run faster.

Remember: If your program already runs fast enough, you do not need to worry about profiling and optimization. Faster is not always better, especially if you end up with code that is difficult to read, modify, or maintain.

Overview

We can summarize our principles for optimizing performance as follows:

  1. Debug first. Never attempt to optimize a program that does not work.
  2. Focus on bottlenecks. Find out what takes the most time, and work on that.
  3. Look for algorithmic improvements. A different theoretical approach might speed up your program by orders of magnitude.
  4. Use library functions. The routines in NumPy, SciPy, and PyPlot have already been optimized.
  5. Eliminate Python overhead. Iterating over an index takes more time than you think.

If fast code is important to you, you can start writing new code with these guidelines in mind and apply them to existing code to improve performance.

Debug first.

Your primary objective in programming is to produce a working program. Unless the program already does what it is supposed to do, there is no point in trying to speed it up. Profiling and optimization are not part of debugging. They come afterward.

Focus on bottlenecks.

The greatest gains in performance come from improving the most time-consuming portions of your program. The profiling tools described in Part 1 of this two-part post can help you identify bottlenecks — portions of a program that limit its overall performance. They can also help you identify faster alternatives. Once you have identified the bottlenecks in your program, apply the other principles of optimization to mitigate or eliminate them.

Look for algorithmic improvements.

The greatest gains in speed often come from using a different algorithm to solve your problem.

When comparing algorithms, one analyzes how the computation time scales with some measure of the size of the problem. Some number \(N\) usually characterizes the size of a problem — e.g., the number of elements in a vector or the number of items in a list. The time required for a program to run is expressed as a function of \(N\). If an algorithm scales as \(N^3\), it means that when \(N\) is large enough, doubling \(N\) will cause the algorithm to take roughly 8 times as long to run. If you can find an algorithm with better scaling, you can often improve the performance of your program by orders of magnitude.
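You can observe this kind of scaling on your own computer. The sketch below times a matrix product, which takes on the order of \(N^3\) operations for \(N \times N\) matrices; doubling \(N\) should increase the run time by roughly a factor of 8 (exact timings will vary from machine to machine):

import time
import numpy as np

for N in (500, 1000):
    A = np.random.random((N, N))
    B = np.random.random((N, N))
    start = time.perf_counter()
    A.dot(B)                                # Roughly N**3 operations.
    print(N, time.perf_counter() - start)   # Elapsed time in seconds.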

As an example, consider the classic problem of sorting a jumbled list of numbers. You might consider using the following approach:

  1. Create an empty list to store the sorted numbers.
  2. Find the smallest item in the jumbled list.
  3. Remove the smallest item from the jumbled list and place it at the end of the sorted list.
  4. Repeat steps 2 and 3 until there are no more items in the jumbled list.

This method is called selection sort. It works, and it is fast enough for sorting small lists. You can prove that sorting a jumbled list of \(N\) elements using a selection sort will require on the order of \(N^2\) operations. That means that sorting a list with 1 million entries will take 1 million times longer than sorting a list of 1000 entries. That may take too long, no matter how much you optimize!
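In Python, the four steps above might look like the following sketch. Each pass calls min and remove, which both scan the remaining items, so the total work grows as \(N^2\):

def selection_sort(jumbled):
    """
    Sort a list by repeatedly moving its smallest item to a new list.
    """
    jumbled = list(jumbled)          # Copy, so the original list survives.
    sorted_list = []
    while jumbled:
        smallest = min(jumbled)      # Step 2: find the smallest item.
        jumbled.remove(smallest)     # Step 3: move it to the sorted list.
        sorted_list.append(smallest)
    return sorted_list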

Delving into a computer science textbook, you might discover merge sort, a different algorithm that requires on the order of \(N \log N\) operations to sort a jumbled list of \(N\) elements. That means that sorting a list of 1 million entries will take roughly 2000 times longer than sorting a list of 1000 entries — not 1 million. For big lists, this algorithm is orders of magnitude faster, no matter what programming language you are using.

As another example, you might need to determine a vector \(\mathbf{x}\) in the linear algebra problem \(A \cdot \mathbf{x} = \mathbf{b}\). If \(A\) is an \(N \times N\) matrix, then inverting \(A\) and multiplying the inverse with \(\mathbf{b}\) takes on the order of \(N^3\) operations. You quickly reach the limits of what your computer can handle around \(N = 10,000\). However, if your matrix has some special structure, there may be an algorithm that takes advantage of that structure and solves the problem much faster. For example, if \(A\) is sparse (most of its elements are zero), you can reduce the scaling to \(N^2\) — or even \(N\) — instead of \(N^3\). That is a huge speedup if \(N = 10,000\)! These kinds of algorithmic improvements make an “impossible” calculation “trivial”.
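For instance, SciPy provides sparse solvers that exploit this structure. Here is a minimal sketch for a tridiagonal system; the matrix here is only an illustration, and spsolve solves \(A \cdot \mathbf{x} = \mathbf{b}\) without ever forming the inverse of \(A\):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

N = 10000
A = diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(N, N), format='csc')
b = np.ones(N)
x = spsolve(A, b)   # Fast, because the solver never touches the zero entries.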

Unfortunately, there is no algorithm for discovering better algorithms; the search for them is an active branch of computer science. You need to understand your problem, delve into texts and journal articles, talk to people who do research in the field, and think really hard. The payoff is worth the effort.

Use library functions.

Once you have a working program and you have identified the bottlenecks, you are ready to start optimizing your code. If you are already using the best available algorithms, the simplest way to improve the performance of Python code in most scientific applications is to replace homemade Python functions with library functions.

It’s not that you are a bad coder! It’s just that someone else took the time to rewrite a time-consuming operation in C or FORTRAN, compile and optimize it, then provide you with a simple Python function to access it. You are simply taking advantage of work that someone else has already done.

Let’s use the profiling tool %timeit to look at some examples of the speedup possible with library functions.

Building an array of random numbers

Suppose you need an array of random numbers for a simulation. Here is a perfectly acceptable function for generating the array:

import numpy as np

def Random(N):
    """
    Return an array of N random numbers in the range [0,1).
    """
    result = np.zeros(N)
    for i in range(N):
        result[i] = np.random.random()
    return result

This code works and it is easy to understand, but suppose my program does not run fast enough. I need to increase the size of the simulation by a factor of 10, but it already takes a while. After debugging and profiling, I see that the line of my program that calls Random(N) takes 99 percent of the execution time. That is a bottleneck worth optimizing!

I can start by profiling this function:

$ %timeit Random(1000)
1000 loops, best of 3: 465 us per loop

When I look at the documentation for np.random.random, I discover it is capable of doing more than generate a single random number. Maybe I should just use it to generate the entire array …

$ %timeit np.random.random(1000)
10000 loops, best of 3: 28.9 us per loop

I can generate an equivalent array in just 6 percent of the time!

In this hypothetical example, my entire program will run almost 20 times faster. I can increase the size of my calculation by a factor of 10 and reduce the overall calculation time simply by replacing Random(N) by np.random.random(N).

Array operations

Let’s look at an even more dramatic example. In the previous post, I introduced a library to add, multiply, and take the square root of arrays. (The library, calculator.py, is included at the bottom of this post.) Suppose, once again, I have identified the bottleneck in a working program, and it involves the functions in the calculator.py module.

NumPy has equivalent operations with the same names. (When you write x+y for two arrays, it is shorthand for np.add(x,y).) Let’s see how much we can speed the operations up by switching to the NumPy equivalents. First, import the modules and create two random two-dimensional arrays to act on:

import calculator as calc
import numpy as np
x = np.random.random((100,100))
y = np.random.random((100,100))

Now, time the operations in calculator.py and NumPy:

$ %timeit calc.add(x,y)
100 loops, best of 3: 9.76 ms per loop

$ %timeit np.add(x,y)
100000 loops, best of 3: 18.8 us per loop

Here we see an even more significant difference. The addition function written in pure Python takes 500 times longer to add the two arrays. Use %timeit to compare calc.multiply with np.multiply, and calc.sqrt with np.sqrt, and you will see similar results.
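For example:

$ %timeit calc.multiply(x,y)
$ %timeit np.multiply(x,y)

$ %timeit calc.sqrt(x)
$ %timeit np.sqrt(x)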

When to write your own functions

The implications of these examples are clear: NumPy array operations are much, much faster than the equivalent Python code. The same is true of special functions in the SciPy and PyPlot packages and many other Python libraries. To speed up your code, use functions from existing libraries when possible. This can save time in writing code, optimizing code, and running code.

So should you ever write your own functions?

I was once told, “Never write your own linear algebra routines. Somebody already wrote a faster one in the 1970s.” That may be generally true, but it is bad advice nonetheless. If you never write your own routine to invert a matrix, it is difficult to fully understand how these routines work and when they can fail, and you will certainly never discover a better algorithm.

If speed of execution is not important or if your goal is to understand how an algorithm works, you should write your own functions. If you need to speed up a working Python program, look to library functions.

Eliminate Python overhead.

Why are library functions so much faster than their Python equivalents? The answer is a point we discussed in Chapter 2 of A Student’s Guide to Python for Physical Modeling: In Python, everything is an object. When you type “x = 1”, Python does not just store the value 1 in a memory cell. It creates an object endowed with many attributes and methods in addition to the value 1. Type dir(1) to see all of the attributes and methods of an integer.

What’s more, Python has no way of knowing what type of objects are involved in a simple statement like z = x+y. First, it has to determine what kind of object x is and what kind of object y is (type-checking). Then it has to figure out how to interpret “+” for these two objects. If the operation makes sense, Python then has to create a new object to store the result of x+y. Finally, it has to assign this new object to the name z. This gives Python a lot of flexibility: x and y can be integers, floats, arrays, lists, strings, or just about anything else. This flexibility makes it easy to write programs, but it also adds to the total computation time.
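You can glimpse this machinery with the standard library’s dis module, which displays the bytecode instructions the interpreter executes for a statement (instruction names vary between Python versions):

import dis
dis.dis("z = x + y")    # Shows LOAD_NAME, BINARY_ADD, STORE_NAME, ...

Each of those instructions involves run-time lookups and checks that compiled languages resolve ahead of time.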

To speed up programs, eliminate this overhead. In other words, make Python do as little as possible.

Using library functions from NumPy, SciPy, and PyPlot eliminates overhead, and this is the main reason they run so much faster. In the example above, np.add(x,y) is not doing anything fundamentally different from calc.add(x,y); it simply does the addition and iteration in the background, without creating Python objects at every step. Recall from the previous post that calc.add(x,y) spent almost 30 percent of its time iterating over the index j in the inner for loop.

Other ways to eliminate overhead are

  1. Use in-place operations. Operations like +=, -=, *=, and /= operate on an existing object instead of creating a new one.
  2. Use built-in methods. These methods are often optimized.
  3. Use list comprehensions and generators instead of for loops. Initializing a list and accessing its elements take time.
  4. Vectorize your code. (Section 2.5.1 of A Student’s Guide to Python for Physical Modeling)

Use %timeit to compare the performance of the functions below. The later functions use the principles above to eliminate some of the Python overhead in square_list0.

def square_list0(N):
    """
    Return a list of squares from 0 to N-1.
    """
    squares = []
    for n in range(N):
        squares = squares + [n**2]
    return squares


def square_list1(N):
    """
    Return a list of squares from 0 to N-1.
    """
    squares = []
    for n in range(N):
        # In-place operations: Replace "x = x + ..." with "x += ..."
        squares += [n**2]
    return squares
    

def square_list2(N):
    """
    Return a list of squares from 0 to N-1.
    """
    squares = []
    for n in range(N):
        # Built-in methods: Replace "x = x + ..." with "x.append(...)"
        squares.append(n**2)
    return squares
    

def square_list3(N):
    """
    Return a list of squares from 0 to N-1.
    """
    # Use list comprehension instead of for loop.
    return [n**2 for n in range(N)]
    

def square_array(N):
    """
    Return an array of squares from 0 to N-1.
    """
    # Vectorize the entire operation.
    from numpy import arange
    return arange(N)**2

In my tests, square_list3(1000) ran about 18 times faster than square_list0(1000), and square_array(1000) was about 350 times faster. The last function virtually eliminates Python overhead by using NumPy arrays in vectorized code.

More Options

If performance is still not satisfactory after attempting the optimizations described here, you can try compiling your Python code. Compiling is beyond the scope of this post, but you can find out more about Numba (which is included in the Anaconda distribution) and Cython in their online documentation. Numba allows you to compile pure Python code. Cython allows you to write fast C extensions for Python without learning C.

For users of the Anaconda distribution of Python, there is an optional add-on called Accelerate. This add-on will replace the standard NumPy, SciPy, and other scientific libraries with equivalent libraries that use Intel’s MKL routines for linear algebra. On many machines, this will improve performance without any effort on your part beyond installing the package. Accelerate also includes NumbaPro, a proprietary version of the Numba package. Accelerate is free to academic users.

Summary

To summarize, there are a few simple ways to speed up a Python program. Once you have a working program and you have identified its bottlenecks, you can look for library functions to replace the slowest functions in your program, you can rewrite your code to eliminate Python’s overhead, and you can search for faster algorithms to solve your problem. As you develop your programming skills, you will start to incorporate these principles automatically. Happily, this means less time profiling and optimizing!






Code Samples

The calculator.py Module

# -----------------------------------------------------------------------------
# calculator.py
# ----------------------------------------------------------------------------- 
"""
This module uses NumPy arrays for storage, but executes array math using Python
loops.
"""

import numpy as np

def add(x,y):
    """
    Add two arrays using a Python loop.
    x and y must be two-dimensional arrays of the same shape.
    """
    m,n = x.shape
    z = np.zeros((m,n))
    for i in range(m):
        for j in range(n):
            z[i,j] = x[i,j] + y[i,j]
    return z


def multiply(x,y):
    """
    Multiply two arrays using a Python loop.
    x and y must be two-dimensional arrays of the same shape.
    """
    m,n = x.shape
    z = np.zeros((m,n))
    for i in range(m):
        for j in range(n):
            z[i,j] = x[i,j] * y[i,j]
    return z


def sqrt(x):
    """
Take the square root of the elements of an array using a Python loop.
    """
    from math import sqrt
    m,n = x.shape
    z = np.zeros((m,n))
    for i in range(m):
        for j in range(n):
            z[i,j] = sqrt(x[i,j])
    return z


def hypotenuse(x,y):
    """
Return sqrt(x**2 + y**2) for two arrays, x and y.
    x and y must be two-dimensional arrays of the same shape.
    """
    xx = multiply(x,x)
    yy = multiply(y,y)
    zz = add(xx, yy)
    return sqrt(zz)

Thursday, September 24, 2015

Speeding Up Python — Part 1: Profiling

When people argue about programming languages, a common critique of Python is, “It’s slow.” This is occasionally followed by, “A program written in C will run a thousand times faster.” Such generalizations carry little weight. Python is often fast enough, and a well-written Python program can run significantly faster than a poorly-written C program. Plus, Moore’s Law implies that computers today are over a thousand times faster than those of 15 years ago: You can do with Python today what was only possible with a highly optimized, compiled program in 2000.

It is also important to consider development time. Suppose a C program takes a week to write and debug and 1 minute to run, and an equivalent Python program takes a day to write and debug and 1000 minutes (about a day) to run. The “slow” Python program will finish running five days earlier than the “fast” C program! If you already know Python and don’t know any C, then the time difference will be even greater.

In short, you need not avoid Python or learn some other programming language just because someone tells you Python is slow. Of course, sometimes there is a need for speed. If you want to eke out the best performance from the available hardware, you may need to learn a compiled language. However, you might want to see how much you can improve a Python program first.

The goal of this post and its sequel is to provide some tools and tips for improving the performance of Python programs. In this post, we will look at some profiling tools — sophisticated stopwatches for timing programs as they execute. In the next post, we will use these tools to demonstrate some general principles that will help you speed up your Python programs.

Before proceeding, I offer this advice: If your program already runs fast enough, do not bother with profiling and optimization. There is an endless supply of interesting problems waiting to be solved, and the question of how to improve the performance of a particular program by 20 percent is probably not one of them.

I have included a sample module <calculator.py> and a sample script <test.py> at the end of this post, which I will use to illustrate some of the profiling tools. You can copy and paste these into your own working directory to replicate the examples, or you can try the profiling tools on some of your own modules and scripts.

How Long Does It Really Take?

To improve the performance of a program, it is useful to gather quantitative data on how long it takes to run. This is called profiling. If the program takes a long time to run, you may be able to use your wristwatch to time it. For more accurate measurements, IPython provides some useful “magic” commands. These are commands preceded by a percent sign that must be entered at the IPython command prompt. (See Section 1.2.1 of A Student’s Guide to Python for Physical Modeling.)

All the commands that follow should be entered at the IPython command prompt.

The %time Command

%time is a basic stopwatch. It will tell you how much time elapses on your computer’s internal clock while a command is executed.

Try the following commands at an IPython command prompt and investigate the information provided by %time:

%time 2**100
%time pow(2,100)

You should see something like this:

CPU times: user 6 us, sys: 2 us, total: 8 us
Wall time: 14.1 us

The output includes several times: user, sys, total, and Wall time. Wall time is the time you would have measured with your stopwatch, were your reflexes fast enough. It is not a very good metric of how efficient a program is because it includes the time your job spent waiting in line to run as well as interruptions by other processes that your operating system thought were more important. user measures how much time your CPU spent running your code. sys is the amount of time devoted to such processes as memory access, reading and writing data, gathering input, and displaying output. total is the sum of user and sys. It is the best measure of performance, and it may be significantly less than Wall time.

Run the commands above several times. You may notice minor differences in the elapsed times, as well as a few significant variations. To get an accurate measure of performance, it is best to average over many repetitions of the same command. This is what the %timeit magic command does.

The %timeit Command

Try the same operations as before, but use the %timeit command instead of %time:

%timeit 2**100
%timeit pow(2,100)

The output should be something like this:

$ %timeit 2**100
10000000 loops, best of 3: 45.4 ns per loop

This means that Python inserted the command 2**100 inside a loop and carried out the operation ten million times. It evaluated 3 such loops. It recorded the total time for each loop, and then divided by 10 million. The best result from the 3 loops was an average execution time of 45.4 ns. (This is less than the result of %time, which includes the time required to transform the string "2**100" into instructions your CPU understands.)

You can already see the potential benefits of profiling. While 2**100 takes a mere 45 ns, pow(2,100) takes 1,230 ns — 27 times as long. If I am repeatedly computing large powers of integers, I can save time by using x**y instead of pow(x,y).

You may notice that %timeit does not execute different commands the same number of times. It uses an adaptive method to get as many iterations of a command as possible without taking too long. Its default is to do three loops of a large number of iterations, but you can modify this. For example, to force %timeit to use 10 loops of 1 million iterations, you would type

%timeit -r 10 -n 1000000 pow(2,100)

This method of specifying options will look strange if you have not worked at a UNIX command line. The hyphens and letters like “-r” are called option flags. -r tells Python to set the number of repetitions to whatever number comes next. Likewise, the -n tells Python to set the number of iterations in each loop to whatever number comes next. The command to time comes last. It may look jumbled and confusing, but don’t worry — Python knows what to do!

You can find out more about %time and %timeit at the IPython command prompt:

%time?
%timeit?

The %run -t Command

You can time the evaluation of an entire script by supplying an option flag to the %run magic command:

%run -t test.py

This will run the script and report the time it took to execute. You can repeat a script several times by supplying an additional option flag:

%run -t -N 10 test.py

This will run the script 10 times and report the total and average time of execution. Note that you must use a capital N here. Lower case n means something different to the %run command.

Which Part Takes the Longest?

You can accomplish a lot with the profiling tools mentioned so far. With %timeit, you can profile individual functions and commands. With %run -t, you can assess the effects of changes to a script. However, neither of these tools provides information on how time is divided among functions within a script, or subroutines within a function. You could try stepping through the program and using %timeit on each line to see which ones take the longest, but there are better ways. Spyder and the profile module allow you to see how much time Python spends executing individual functions, and the line_profiler.py module can measure the time spent on each line of a program!

This allows you to identify the elements of a program that take the most time — the bottlenecks of your program. You can then focus your efforts on optimizing those portions of your code that will have the greatest impact on overall performance.

To run the examples below, first execute the following commands (or run the test.py script) to import NumPy and the calculator module (included at the end of this post) and create two random arrays A and B.

import numpy as np
import calculator as calc

M = 10**3
N = 10**3

A = np.random.random((M,N))
B = np.random.random((M,N))

The profile Module

The profile module can be accessed from the command line or within scripts. Its output is not always easy to understand, but it is useful for identifying which functions are consuming the most time.

To use the module, import it and use its run method.

import profile
profile.run('calc.hypotenuse(A,B)', sort='tottime')

This command will execute the command calc.hypotenuse(A,B) and display profiling statistics on the screen. I have used the optional keyword argument sort to instruct the method to display the most time-consuming functions at the top of the output. The default is to sort by function or method name. The output is plain text:

In [10]: profile.run('calc.hypotenuse(A,B)', sort='tottime')
          1000014 function calls in 3.943 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    2.010    1.005    2.018    1.009 calculator.py:16(multiply)
        1    0.999    0.999    1.003    1.003 calculator.py:3(add)
        1    0.743    0.743    0.917    0.917 calculator.py:29(sqrt)
  1000000    0.170    0.000    0.170    0.000 {built-in method sqrt}
        4    0.016    0.004    0.016    0.004 {built-in method zeros}
        1    0.004    0.004    3.943    3.943 <string>:1(<module>)
        1    0.000    0.000    3.943    3.943 {built-in method exec}
        1    0.000    0.000    3.938    3.938 calculator.py:42(hypotenuse)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:2264(_handle_fromlist)
        1    0.000    0.000    0.000    0.000 {built-in method hasattr}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The results show us which functions and modules were used during the execution of a script, how much time was spent on each, how many times each was called, and the file and line number where each function is defined.

This tool is useful for identifying which functions consume the most time. It does not provide a detailed analysis of individual functions, so you may still need to use %lprun.

The cProfile module contains the same methods and produces the same output as the profile module, but it takes less time to run. (Many of the Python routines in profile are rewritten as C routines in cProfile.) If you are doing a lot of profiling, you can speed up the process by replacing import profile with

import cProfile as profile
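You can also save the statistics to a file and explore them afterward with the standard library’s pstats module. Here is a sketch; the file name hypot.prof is arbitrary:

import cProfile
import pstats

cProfile.run('calc.hypotenuse(A,B)', 'hypot.prof')  # Save stats to a file.
stats = pstats.Stats('hypot.prof')
stats.sort_stats('tottime').print_stats(5)          # Show the 5 costliest functions.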

Profiling with Spyder

All of the tools mentioned so far are command-line tools that can be used with IPython, whether or not you are using Spyder. Spyder offers an additional option. You can use the menu command Run > Profile or the shortcut key <F10> to run a script using Spyder’s profiler. It will produce output like this:

Screen shot illustrating output from Spyder’s profiler.

Here we see information similar to the output of profile.run, but in a format that is easier to interpret.

The %lprun Command

If you have successfully identified the function that consumes the most time, you may wish to dissect it further to find out why. The line_profiler package, written by Robert Kern, does exactly this. It allows you to see how much time Python spends on each individual line of code.

The module is part of the Anaconda distribution. If you used the Miniconda installer, you may need to manually install this package from the command line:

$ conda install line_profiler

The package can do quite a lot, but we are only going to look at one of its tools — an IPython magic command for profiling. To make this command available in IPython, we need to load it using another magic command:

%load_ext line_profiler

This gives you access to a magic command called %lprun if the line_profiler module is installed correctly.

There are two modes available with %lprun. The first is “function mode”. This allows you to designate a specific function to be analyzed when you execute a command or series of commands. The second is “module mode”, which will analyze all of the functions in a module you designate.

To profile the add function in the calculator module with %lprun type the following:

%lprun -f calc.add calc.add(A,B)

The -f option indicates function mode. The next item is the name of the function to analyze. (Be sure you provide the name, and not a function call. Do not include parentheses or arguments.) The last item is the Python statement to execute. I have instructed Python to gather information on the add function in the calculator module (imported as calc) while it evaluates the statement calc.add(A,B). Here is the output:

Timer unit: 1e-06 s

Total time: 2.94468 s
File: calculator.py
Function: add at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           def add(x,y):
     4                                              """
     5                                              Add two arrays using a Python loop.
     6                                              x, y: 2D arrays with the same shape.
     7                                              """
     8         1            7      7.0      0.0     m,n = x.shape
     9         1         5704   5704.0      0.2     z = np.zeros((m,n))
    10      1001         1044      1.0      0.0     for i in range(m):
    11   1001000       872878      0.9     29.6         for j in range(n):
    12   1000000      2065045      2.1     70.1             z[i,j] = x[i,j] + y[i,j]
    13         1            1      1.0      0.0     return z

The total time is not that useful. It includes some of the overhead of analyzing the code line-by-line. If you are interested in the total execution time, use %timeit. The most useful information here is in the “% Time” column. This is the percentage of the total execution time spent on each line. Here, we see that most of the time (70.1 percent) is spent adding the elements of the arrays. However, it may surprise you to see that almost 30 percent of the time is spent on Line 11, evaluating the statement “for j in range(n)”.

Just seeing how time is spent during the function call can suggest ways to speed up the code. For example, if so much time is spent iterating over the values of the index, maybe a Python loop is a poor method for adding arrays …

It is also possible to use %lprun to analyze all of the functions in a module at once. This will print out a lot of information, but sometimes this is what you want.

%lprun -m calculator calc.hypotenuse(A,B)

The -m option indicates module mode, the next item is the name of the module to analyze, and the last item is the Python statement to execute. I have instructed Python to gather information on all of the functions in the calculator module while it evaluates the statement calc.hypotenuse(A,B).

Tips for Profiling

If Carl von Clausewitz were a computer programmer rather than a military strategist, he might have said, “The enemy of a good program is the dream of a perfect program.” The most important rules of profiling are

  • Avoid unnecessary profiling.

  • Avoid premature profiling.

Profiling is time-consuming. Unless you need a working program to run faster — or you simply want to learn about profiling — skip it. When you use profiling tools, you should only analyze a working program. Remember, the goal is to identify and eliminate bottlenecks. You cannot identify the most time-consuming step of a program until the entire program is working. Profiling and “optimizing” code too early slow down development and often produce unintended consequences.

Profiling tools can provide a glut of information that is difficult to digest. If you are trying to speed up a program (for example, the test.py script at the end of this post), you might try the following procedure:

  1. Use %lprun in function mode, the profile module, or Spyder’s profiler to analyze the primary function (e.g., hypotenuse(A,B)) and identify bottlenecks.

  2. Use %lprun in function mode to dissect the secondary functions that consume the most time (e.g., multiply(x,y)).

  3. Use %timeit to find faster alternatives to the most time-consuming operations.

  4. Repeat steps 1–3 until your program is fast enough.

Analyzing the primary function is important. You might be able to speed up a secondary function by a factor of 1,000; however, if that function only takes 1 percent of the total run time of your program, you haven’t gained much. On the other hand, if another function takes 90 percent of the run time and you speed it up by a factor of 2, you have made a significant improvement.

There are many more profiling tools available in Python. Delve into the timeit, profile, and line_profiler modules if you need to go beyond the techniques described here.

Summary

The first step in improving the performance of your code is quantifying performance. IPython provides several tools that allow you to time statements, functions, code fragments, and scripts. These tools will allow you to identify the portions of your program that consume the most time — the bottlenecks. By focusing on these, you can get the most out of your efforts toward optimization. Once your program is fast enough, you can move on to something more interesting!






Code Samples

The calculator.py Module

This module uses NumPy arrays for storage, but executes array math using Python loops.

# -----------------------------------------------------------------------------
# calculator.py
# ----------------------------------------------------------------------------- 
import numpy as np

def add(x,y):
    """
    Add two arrays using a Python loop.
    x and y must be two-dimensional arrays of the same shape.
    """
    m,n = x.shape
    z = np.zeros((m,n))
    for i in range(m):
        for j in range(n):
            z[i,j] = x[i,j] + y[i,j]
    return z


def multiply(x,y):
    """
    Multiply two arrays using a Python loop.
    x and y must be two-dimensional arrays of the same shape.
    """
    m,n = x.shape
    z = np.zeros((m,n))
    for i in range(m):
        for j in range(n):
            z[i,j] = x[i,j] * y[i,j]
    return z


def sqrt(x):
    """
Take the square root of the elements of an array using a Python loop.
    """
    from math import sqrt
    m,n = x.shape
    z = np.zeros((m,n))
    for i in range(m):
        for j in range(n):
            z[i,j] = sqrt(x[i,j])
    return z


def hypotenuse(x,y):
    """
Return sqrt(x**2 + y**2) for two arrays, x and y.
    x and y must be two-dimensional arrays of the same shape.
    """
    xx = multiply(x,x)
    yy = multiply(y,y)
    zz = add(xx, yy)
    return sqrt(zz)

The test.py Script

This is a short script that creates some arrays and uses the calculator module.

# -----------------------------------------------------------------------------
# test.py
# ----------------------------------------------------------------------------- 
import numpy as np
import calculator as calc

M = 10**3
N = 10**3

A = np.random.random((M,N))
B = np.random.random((M,N))

calc.hypotenuse(A,B)

Friday, August 14, 2015

Lists, Comprehensions, and Generators

In A Student’s Guide to Python for Physical Modeling, we emphasized NumPy arrays and paid less attention to Python lists. The reason is simple: In most scientific computing applications, NumPy arrays store data more efficiently and speed up mathematical calculations, sometimes a thousandfold.

However, there are some applications where a Python list is the better choice. There are also times when the choice between a list and an array has little or no effect on performance. In such cases a list can make your code easier to read and understand, and that is always a good thing.

In this post, I will describe Python lists and explain a special Python construct for creating lists called a list comprehension. I will also describe a similar construct called a generator expression.

Lists

A list is an ordered collection of items. You may have made a “To Do” list this morning or a grocery list for a recent trip to the store. In computer science, a list is a data structure that supports a few basic methods like insert, remove, append, and find. You probably used several of these operations with your own list. Perhaps you penciled in a new task later in the day (append), then crossed tasks off the list as you completed them (remove).

An array is a rigid data structure that stores a fixed number of identical elements (integers, floats, eight-character strings, etc.). If operations like insert, remove, or append are important parts of a computational task, a more flexible data structure like a list may be appropriate. The type of data may also suggest that a Python list is a better choice than a NumPy array. For instance, how would you initialize an array to store a grocery list? Furthermore, if the number of items to be stored is not known at the outset, it may be easier to store the data in a list and convert it to an array later. (Perhaps you are taking data at regular intervals but do not know how long an experiment will run.) Finally, if you are not worried about performance and scaling, a Python list might be a simpler option than a NumPy array. If you just need the first 20 perfect cubes, do you really want to import all of NumPy?
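For example, a data-taking loop might append readings to a list and convert the result to an array once the run is over. In this sketch, read_measurement is a hypothetical stand-in for your data source:

import numpy as np

readings = []
while True:
    value = read_measurement()    # Hypothetical: returns None when done.
    if value is None:
        break
    readings.append(value)

data = np.array(readings)         # Convert to an array for analysis.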

Let’s use the example of a grocery list to create and transform a simple list. A more useful example in scientific computing might be managing a collection of data files to process into stunning figures for your latest report, but the principles are the same.

To create a Python list, just enclose a comma-separated list of elements inside square brackets.

groceries = ['milk', 'eggs', 'orange juice']

Some functions also return a list. I can create a similar list from a string using the split method (note that the separator passed to split includes the space after each comma):

groceries = "milk, eggs, apples, orange juice".split(', ')

To find an item in a list, use the index method. It will return the index of the first occurrence of the item you request if it is in the list and raise a ValueError if it is not.

groceries.index('eggs')
groceries.index('bread')

It looks like I forgot the bread! I can add it to the list using the append method:

groceries.append('bread')

Later, I see some orange juice in the back of the refrigerator (not yet expired …), so I will delete that item from the list using the remove method:

groceries.remove('orange juice')

Since I am headed to ABC Grocery, where everything is organized alphabetically, I will sort the list and review it:

groceries.sort()
print(groceries)
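With the list above, this prints

['apples', 'bread', 'eggs', 'milk']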

One more useful operation is joining lists, or concatenation. In Python, the addition operator for lists is defined to join two lists. The extend method of a list will also join two lists. Calling append with a list as its argument will not add each element to the original list. It will make the argument list the final element of the calling list. I.e., you will have a list within a list, not a concatenation of the two lists.

If I had two separate grocery lists, I could join them into a single list in a variety of ways:

old_list = ['bananas', 'coffee']
new_list = groceries + old_list         # Addition creates a new list.
groceries += old_list                   # In-place addition extends original list.
old_list.extend(groceries)              # extend also extends original list.

bad_list = ['bananas', 'coffee']
bad_list.append(groceries)              # append does NOT concatenate lists.

After this set of commands, groceries and new_list contain the same elements. old_list contains 'bananas' and 'coffee' twice, since the commands append old_list to groceries first, and then append the new groceries to the original old_list. As you can see, bad_list did not merge the two lists properly.
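You can confirm what happened by inspecting bad_list:

print(len(bad_list))    # 3: 'bananas', 'coffee', and one nested list.
print(bad_list[-1])     # The entire groceries list as a single element.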

In case you are skeptical of the usefulness of Python lists in scientific endeavors, here is a function that creates a list of the Fibonacci numbers from the 0th through the Nth.

def fibonacci(N):
    if N == 0: return [1]       # Handle unusual request for 0th number.
    fib = [1, 1]                # Create a list for all other values of N.
    for k in range(1,N):
        # Next Fibonacci number is the sum of the previous two.
        fib.append(fib[-1] + fib[-2])
    return fib

If you are still skeptical of the possible utility of a Python list, try writing the same function using NumPy arrays and using it to compute fibonacci(100). I’ve included a solution at the end of the post.

The command inside the loop could also have been written using list addition. Either of the following commands will work:

fib += [fib[-1] + fib[-2]]
fib = fib + [fib[-1] + fib[-2]]

The second approach is less efficient because it makes a copy of the list every time a new element is added, but it is syntactically correct. Be aware that you can only use list addition to join two lists — not a list and some other object you would like to put inside it. The following alternatives would result in errors:

fib += fib[-1] + fib[-2]
fib += (fib[-1] + fib[-2])
fib = fib + (fib[-1] + fib[-2])

List Comprehensions

Sometimes a list is not a collection of random elements; it has a logical structure instead. For instance, suppose you want to find the sum of the first 20 perfect cubes. You could create an array or a list of cubes, then add them up. The familiar procedure using a NumPy array is

import numpy as np
cubes = np.arange(1,21)**3
cubes.sum()

This does not require too much typing, and the purpose of the code is fairly clear. However, compare it with the following code:

cubes = [n**3 for n in range(1,21)]
sum(cubes)

This is an example of a list comprehension: a Python expression inside of the square brackets that denote a list. The statement is similar to the notation used in mathematics to define a set. It also clearly describes what the list contains. Note the similarity of the list comprehension to the following loop, which creates the same list:

cubes = []                  # Initialize empty list.
for n in range(1,21):
    cubes.append(n**3)      # Append perfect cubes.

The list comprehension effectively compresses all of this into a single Python statement.

A list comprehension defines a new list from another collection of objects via an expression like “for n in ...” Rather than using range, you can build one list from another:

poly = [(x+1)*(x-1) for x in cubes]

You can also apply conditional statements in a list comprehension:

even_squares = [n**2 for n in range(1,51) if n%2 == 0]
odd_squares  = [n**2 for n in range(1,51) if n%2 == 1]

You can cram quite a lot of code into a list comprehension, but it is not always advisable:

pythagoras = [(a,b,c)   for a in range(1,31) for b in range(a,31) \
                        for c in range(1,31) if a**2 + b**2 == c**2]

Despite the length and complexity of this single expression, its meaning is still fairly clear.

Returning to our original task, we can do even better than adding up the first 20 perfect cubes. Using a nested list comprehension, we can make a list of sums of cubes!

sums_of_cubes = [sum([n**3 for n in range(1,N+1)]) for N in range(1,21)]

Generators

A list comprehension creates a Python list that stores all of the elements in a single data structure. Sometimes this is exactly what you need. Other times, you simply want to iterate over all of the items in a list. If you never need all of the items in the list at once, you can use a generator expression instead. A generator expression looks like a list comprehension, except that you enclose the expression in round parentheses instead of square brackets — (...) instead of [...]. Despite the round parentheses, a generator expression does not create a tuple, and there is no such thing as a “tuple comprehension”.

cube_list = [n**3 for n in range(1,101)]
cube_generator = (n**3 for n in range(1,101))

A generator is simpler than a list. You cannot insert, remove, or append items, nor can you search or sort a generator. A generator knows how to produce the next item in a sequence, and little else. Once it has reached the end of its sequence, it does even less.

for x in cube_list: print(x)            # Prints numbers stored in list.
for x in cube_list: print(x)            # Prints numbers stored in list again.

for x in cube_generator: print(x)       # Prints numbers provided by generator.
for x in cube_generator: print(x)       # Prints nothing.  Generator is finished.
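A generator works anywhere Python expects an iterable, so you can often skip the named variable entirely. For example, you can sum the first 100 cubes without ever storing them:

total = sum(n**3 for n in range(1, 101))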

The advantages of a generator over a list are size and speed. Compare the output of the __sizeof__() method for the following lists and generators. This method returns the size of the object in bytes.

cube_list = [n**3 for n in range(1,10)]
cube_generator = (n**3 for n in range(1,10))
print(cube_list.__sizeof__())
print(cube_generator.__sizeof__())

cube_list = [n**3 for n in range(1,10**3)]
cube_generator = (n**3 for n in range(1,10**3))
print(cube_list.__sizeof__())
print(cube_generator.__sizeof__())

cube_list = [n**3 for n in range(1,10**6)]
cube_generator = (n**3 for n in range(1,10**6))
print(cube_list.__sizeof__())
print(cube_generator.__sizeof__())

The list grows from 168 bytes to 9 kB to 8.7 MB, while the generator remains a constant 48 bytes. Also, you may have noticed a delay while Python created the large list during the last set of commands.

I generally prefer a generator when I iterate over a large sequence of items once — especially if the program might exit the loop before reaching the end of the sequence.

Summary

NumPy arrays are often the most efficient data structure for numerical work in Python. However, there are some tasks for which a Python list is a better choice — often when organizing data rather than processing data. Python offers a compact syntax for creating lists called a list comprehension. A generator expression is similar, but creates an object that can produce a sequence without storing all of its elements. A generator is often a better choice than a list or an array when iterating over a large sequence of items.




NumPy version of fibonacci(N)

Here is a version of the fibonacci(N) function above that uses NumPy arrays.

import numpy as np

def Fibonacci(N):
    if N == 0: return np.array([1]) # Handle unusual request for 0th number.
    fib = np.zeros(N+1, dtype=int)  # Initialize an array for all other values of N.
    fib[0], fib[1] = 1, 1
    for k in range(2,N+1):
        # Next Fibonacci number is the sum of the previous two.
        fib[k] = fib[k-1] + fib[k-2]
    return fib

Perhaps you came up with a more elegant solution. I find this version more difficult to code and more confusing to read. Plus, using a NumPy array forces a compromise: Either use floating point numbers and lose significant digits for N > 78, or use integers and encounter an overflow error for N > 91. In either case, you cannot generate the 100th Fibonacci number!

Wednesday, August 5, 2015

Function Arguments: *args and **kwargs

In Python, functions can have positional arguments and named arguments. In this post, I will describe both types and explain how to use special syntax to simplify repetitive function calls with nearly the same arguments. This extends the discussion in section 5.1.3 of A Student’s Guide to Python for Physical Modeling.

First, let’s look at np.savetxt, which has a straightforward declaration:

$ import numpy as np
$ import matplotlib.pyplot as plt
$ from mpl_toolkits.mplot3d import Axes3D

$ np.savetxt?
Signature: np.savetxt(  fname, X, fmt='%.18e', delimiter=' ', newline='\n',
                        header='', footer='', comments='# ')
Docstring: Save an array to a text file.

We see the function has two required arguments, followed by several optional arguments with default values. Next, let’s look at something more exotic:

$ Axes3D.plot_surface?
Signature: Axes3D.plot_surface(X, Y, Z, *args, **kwargs)
Docstring: Create a surface plot.

The first three arguments seem obvious enough: These are the arrays that specify the points on the surface. The last two — *args and **kwargs — look strange. Let’s examine one more function:

$ plt.plot?
Signature: plt.plot(*args, **kwargs)
Docstring: Plot lines and/or markers to the :class:`~matplotlib.axes.Axes`.

*args and **kwargs are the only arguments for the familiar plotting function! They are the focus of this post.

Positional Arguments

In a Python function, positional arguments are Python expressions assigned to function variables based on their position in the function call.

Suppose I create a surface plot of topographic data with the command

ax = Axes3D(plt.figure())
ax.plot_surface(latitude, longitude, elevation)

Python evaluates ax.plot_surface with the substitutions

X = latitude
Y = longitude
Z = elevation

I.e., the substitutions are based on the positions of the arguments. I would get a different (meaningless) surface if I shuffled the order around:

ax.plot_surface(elevation, longitude, latitude)

*args

The *args argument in a function definition allows the function to process an unspecified number of positional arguments. Let’s look at a simple example:

def get_stats(*args):
    from numpy import mean, std
    return mean(args), std(args)

This function will compute the descriptive statistics (mean and standard deviation) of any sequence of values passed to it. Try the following commands:

get_stats(1, 2, 3, 4, 5)
get_stats(range(30))
get_stats(np.random.random(100))

You can type any number of arguments when calling the function, or you can pass the function any sequence of values — an array, a tuple, a list.

This ability to process any number of arguments is what makes it possible to call plt.plot in a variety of ways. All of these commands are valid:

t = np.linspace(-1, 1, 101)
plt.plot(t)
plt.plot(t, t**2 - 1)
plt.plot(t, t**3 - t, 'k--')
plt.plot(t, t**4 - t**2, t, t**5 - t**3 + t)

How can one function process so many different kinds of input, including mixtures of variable names, expressions, and strings? The plot function has several subroutines that determine exactly what is in the series of arguments you supply and what to do with those objects. This can make a function very flexible, but it is also likely to be complex — both to write and to interpret.

You can use the *args notation to “unpack” a sequence into a series of positional arguments for any function. For example, suppose the three topographic data arrays mentioned earlier had been packaged as a single tuple:

data = (latitude, longitude, elevation)

The surface plot function does not know what to do with this tuple, but I can use the *args notation to assign the three arrays to X, Y, and Z.

ax.plot_surface(data)       # Raises an exception.
ax.plot_surface(*data)      # Creates surface plot.

The *data command instructs Python to assign the items in data to the positional arguments of ax.plot_surface.

This method of passing positional arguments to functions can be convenient when you wish to automate calculations using various combinations of input parameters or to ensure that several functions use the same data:

data = (x, y, z)
f(*data)
g(*data)
h(*data)

If I want to perform the same analysis on a different set of data later, I only need to change the data variable.
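As a concrete, if trivial, illustration of this pattern with real plotting functions (the arrays here are just examples):

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2*np.pi, 101)
data = (t, np.sin(t))       # Package the inputs once ...
plt.plot(*data)             # ... then every function receives the same data.
plt.scatter(*data)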

Named Arguments

In a Python function, named arguments are Python expressions whose value in the function is specified by a keyword-value pair.

For example, this function call from the Illuminating Surface Plots post uses named arguments to specify several options:

ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                facecolors=green_surface)

Functions like plt.plot and Axes3D.plot_surface whose definitions include **kwargs can accept any number of keyword arguments.
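Inside such a function, the keyword arguments arrive packaged as a dictionary, which the next section describes. A minimal sketch; show_options is a made-up name:

def show_options(**kwargs):
    # kwargs is a dictionary of keyword-value pairs.
    for name, value in kwargs.items():
        print(name, '=', value)

show_options(linewidth=0, antialiased=False)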

**kwargs

Similar to the *args notation, you can use the **kwargs notation to pass a collection of named arguments to a function. To do this, you must package the keyword-value pairs in a dictionary.

A dictionary is a Python data structure that associates an immutable object called a key with a value, which can be mutable or immutable. A future post will discuss dictionaries in more detail. For this post, only the syntax for creating a dictionary is important. Enclose the contents of a dictionary between curly braces: { ... }. Each entry of the dictionary is a key, followed by a colon, followed by a value:

definitions = { 'cat':"n. A feline.", 'dog':"n. A canine."}

definitions['cat']

The first command creates a dictionary. The second accesses one of its members.
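The ** notation also works in the other direction: you can unpack a dictionary into keyword-value pairs when calling a function. A minimal sketch with a made-up function:

def greet(name='world', punctuation='!'):
    print('Hello, ' + name + punctuation)

options = {'name': 'Python', 'punctuation': '?'}
greet(**options)        # Equivalent to greet(name='Python', punctuation='?')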

In Illuminating Surface Plots, I used the same set of plotting options many times. This led to a lot of typing, and for the commands in the script, a lot of retyping every time I decided to change one of these options. A more efficient method is to … Define once, reuse often. I can put all of the data arrays into a tuple and most of the plotting options into a dictionary:

data = (X, Y, Z)
plot_options = {'rstride':1,
                'cstride':1,
                'linewidth':0,
                'antialiased':False}

Most surface plot commands were identical except for the value of the facecolors option. I could create two surface plots with different values of this argument as follows:

ax1.plot_surface(*data, facecolors=green_surface, **plot_options)
ax2.plot_surface(*data, facecolors=illuminated_surface, **plot_options)

This is easier to type and ensures that both plots use the same data set and plotting options.

Summary

Python functions accept positional and named arguments. Functions whose definitions include the arguments *args and **kwargs will accept an unspecified number of either type. You can use this notation to unpack a sequence (list, tuple, or array) into a series of positional arguments or a dictionary into a series of named arguments. This provides a convenient method for calling the same function with slightly different inputs and options, or calling different functions with the same inputs and options.

The rules for supplying arguments to functions are as follows (a short sketch follows the list):

  1. Positional arguments (if any) must come first.
  2. Named arguments (if any) and/or positional arguments in the form of *args (if any) must come next.
  3. Named arguments in the form of **kwargs (if any) must come last.
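Here is a hedged sketch of these rules in action; the function f and its arguments are hypothetical:

def f(a, b, c, d=0, e=0):
    return (a, b, c, d, e)

rest = (3,)
opts = {'e': 5}

f(1, 2, *rest, d=4, **opts)     # Valid: returns (1, 2, 3, 4, 5).
# f(d=4, 1, 2)                  # Invalid: a positional argument cannot
                                # follow a named argument (SyntaxError).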

Saturday, August 1, 2015

Illuminating Surface Plots

Matplotlib provides functions for visualizing three-dimensional data sets. One useful tool is a surface plot. A surface plot is a two-dimensional projection of a three-dimensional object. Much like a sketch artist, Python uses techniques like perspective and shading to give the illusion of a three-dimensional object in space. In this post, I describe how you can control the lighting of a surface plot.

Surface Plots

First, let’s look at some of the options available with the default three-dimensional plotting tools. This script will create a surface plot of a Bessel function. Its ripples will emphasize the effects of lighting later.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D     # Import 3D plotting tools.
from scipy.special import jn                # Import Bessel function.

# Define grid of points.
points = np.linspace(-10, 10, 51)
X, Y = np.meshgrid(points, points)
R = np.sqrt(X**2 + Y**2)
Z = jn(0,R)

# Create 3D surface plot.
ax = Axes3D(plt.figure())
ax.plot_surface(X, Y, Z, rstride=1, cstride=1)

The default surface plot is a single color with grid lines and some shading. To get rid of the grid lines, use the following plot commands instead:

ax = Axes3D(plt.figure())
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                color='green')

We can use a colormap to assign colors to the figure based on the height of the surface:

ax = Axes3D(plt.figure())
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                cmap='coolwarm')

Assigning colors according to height may not be what you want, and there is no shading when using a color map. Furthermore, when you specify a single color, you cannot adjust the lighting angle to produce different shading effects. Sometimes, you may want to control these lighting effects, and Matplotlib provides a way.

Turn on the Lights

Tucked away in Matplotlib is an object called LightSource. It allows you to simulate illuminating a surface using a virtual light source placed at a location of your choosing. (LightSource creates an “illuminated intensity map.” You can find the details of the model in the source code.) It does not provide the same control or as many features as the lighting tools of commercial packages like MATLAB or Mathematica, but it is sufficient to produce some nice plots.

To use the object, import it from matplotlib.colors and then apply its shade method to the data set:

# Get lighting object for shading surface plots.
from matplotlib.colors import LightSource

# Get colormaps to use with lighting object.
from matplotlib import cm

# Create an instance of a LightSource and use it to illuminate the surface.
light = LightSource(90, 45)
illuminated_surface = light.shade(Z, cmap=cm.coolwarm)

The two arguments to LightSource are the azimuth and elevation angles of the light source. (0,0) corresponds to a light placed along the x-axis. As the name implies, the elevation is the angle above the xy-plane in degrees. The virtual light source position is then rotated about the vertical axis by the azimuth angle (also in degrees). (Don’t confuse these parameters with the similarly named parameters specifying the observer’s position!)

The shade method requires a single positional argument: a two-dimensional array — here, Z. LightSource interprets each data point as the height of a surface above a point in the xy-plane. It also assumes these points have the same spacing in the x and y directions. If you are not using Cartesian coordinates and uniform spacing, you may be surprised by the result.

The object returned by light.shade is a NumPy array of RGBA values for each data point. (For each point in the input array, light.shade returns a 4-element array of the Red, Green, Blue, and Alpha value for that point. Alpha controls the transparency of the point.) Other plotting tools can use this data to draw a shaded surface. To use this array instead of the color or cmap options of the surface plotting command, pass the array with a keyword argument:

ax = Axes3D(plt.figure())
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                facecolors=illuminated_surface)
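As a quick sanity check, you can inspect the shape of the array returned by light.shade (assuming the 51-point grid defined above):

print(Z.shape)                      # (51, 51) for the grid defined above.
print(illuminated_surface.shape)    # (51, 51, 4): one RGBA value per point.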

If you prefer to shade a surface of uniform color instead of using a color map, or if you have a colored surface that you wish to shade, a LightSource object offers a second method called shade_rgb. You have to pass the function two arguments: an array of RGB values and a data set giving the height of each point.

As an example, let’s transform a white surface so we can see the shading effects independent of any coloring. The RGB values for white are [1.0, 1.0, 1.0]. (Red, green, and blue values are all at maximum.) To create a uniform white surface, we need to create an array with three elements for every element of the data set Z, with each entry set to 1.0. The following code will create the RGB array, shade it, and plot it:

rgb = np.ones((Z.shape[0], Z.shape[1], 3))
illuminated_surface = light.shade_rgb(rgb, Z)
ax = Axes3D(plt.figure())
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                facecolors=illuminated_surface)

To change the color of the shaded surface, we can use NumPy array math. Just make a three-element array with the RGB values of your target color and multiply rgb by this array before shading.

# Create a shaded green surface.
green = np.array([0,1.0,0])
green_surface = light.shade_rgb(rgb * green, Z)
ax = Axes3D(plt.figure())
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0, antialiased=False,
                facecolors=green_surface)

The figure below illustrates the techniques described here. The same surface is shown with four different color and lighting configurations. The code that produced the figures is also included below.

Shading of 3D surfaces.

# =========================================================================
# shading.py
# Author:   Jesse M. Kinder
# Created:  2015 Jul 27
# Modified: 2015 Jul 31
# -------------------------------------------------------------------------
# Demonstrate shading of surface plots using Matplotlib's LightSource.
# ------------------------------------------------------------------------- 
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Import Bessel function.
from scipy.special import jn

# Import colormaps.
from matplotlib import cm

# Import lighting object for shading surface plots.
from matplotlib.colors import LightSource

# Define grid of points.
points = np.linspace(-10, 10, 101)
X, Y = np.meshgrid(points, points)
R = np.sqrt(X**2 + Y**2)
Z = jn(0,R)

# Create an rgb array for single-color surfaces.
white = np.ones((Z.shape[0], Z.shape[1], 3))
red = white * np.array([1,0,0])
green = white * np.array([0,1,0])
blue = white * np.array([0,0,1])

# Set view parameters for all subplots.
azimuth = 45
altitude = 60

# Create empty figure.
fig = plt.figure(figsize=(18,12))

# -------------------------------------------------------------------------
# Generate first subplot.
# ------------------------------------------------------------------------- 
# Create a light source object for light from
# 0 degrees azimuth, 0 degrees elevation.
light = LightSource(0, 0)

# Generate face colors for a shaded surface using either
# a color map or the uniform rgb color specified above.

illuminated_surface = light.shade_rgb(red, Z)

# Create a subplot with 3d plotting capabilities.
# This command will fail if Axes3D was not imported.
ax = fig.add_subplot(2,2,1, projection='3d')
ax.view_init(altitude, azimuth)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0,
                antialiased=False, facecolors=illuminated_surface)

# -------------------------------------------------------------------------
# Repeat the commands above for the other three subplots, but use different
# illumination angles and colors.
# ------------------------------------------------------------------------- 
light = LightSource(90, 0)
illuminated_surface = light.shade_rgb(green, Z)

ax = fig.add_subplot(2,2,2, projection='3d')
ax.view_init(altitude, azimuth)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0,
                antialiased=False, facecolors=illuminated_surface)

# ------------------------------------------------------------------------- 
light = LightSource(90, 45)
illuminated_surface = light.shade_rgb(blue, Z)

ax = fig.add_subplot(2,2,3, projection='3d')
ax.view_init(altitude, azimuth)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0,
                antialiased=False, facecolors=illuminated_surface)

# ------------------------------------------------------------------------- 
light = LightSource(180, 45)
illuminated_surface = light.shade(Z, cmap=cm.coolwarm)

ax = fig.add_subplot(2,2,4, projection='3d')
ax.view_init(altitude, azimuth)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, linewidth=0,
                antialiased=False, facecolors=illuminated_surface)

# ------------------------------------------------------------------------- 
plt.tight_layout()
plt.savefig('shading.png')

Monday, July 20, 2015

Displaying Plots Inside Loops

In the early stages of some programming projects, I often like to explore the problem visually: to run a series of simulations with different values of the input parameters and create a series of graphs in order to gain some intuition about the problem. A convenient approach is to embed some plotting commands in a loop.

For example, suppose I want to see what the first ten Bessel functions look like. It seems that the following script should do what I want:

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import jn                # Import Bessel function.

r = np.linspace(0,20,101)

for n in range(10):
    plt.plot(r, jn(n,r))                    # Draw nth Bessel function.
    plt.title("Bessel function J[%d](r)." % n)
    input("Press <Enter> to continue.")     # Wait for user input to continue.
    plt.cla()                               # Clear axes for next plot.

However, when I run the script, all I see is an empty plot window, even though I have to hit <Enter> ten times at the command prompt!

This behavior is due to the way Python handles events in the graphical user interface (GUI) used to display plots. Basically, Python creates the figure, but moves on to the next command before giving me a chance to look at it. What I need is some way to force Python to display the plot until I am ready to move on. (Asking for input at the command line does not accomplish this.)

plt.waitforbuttonpress()

PyPlot provides the perfect function: plt.waitforbuttonpress(). Referring to its documentation, we see that this function is a “blocking call to interact with the figure.” In other words, Python is not allowed to execute any other commands until the user interacts with the figure. The documentation also explains that there is an optional timeout argument, and the function returns different values depending on what happens:

  • True — The user pressed a key on the keyboard.
  • False — The user clicked the mouse.
  • None — The (optional) timer elapsed.

Clicking the mouse on buttons like “Zoom” or “Save Figure” does not cause the function to return a value. Only a mouse click within the actual plot causes the function to return False.

The following loop will force Python to display each plot until I press a button on the keyboard or click with the mouse:

for n in range(10):
    plt.plot(r, jn(n,r))                # Draw nth Bessel function.
    plt.title("Bessel function J[%d](r)." % n)
    plt.waitforbuttonpress()            # Display plot and wait for user input.
    plt.cla()                           # Clear axes for next plot.

When I run the script, I can see each Bessel function, but if I try to zoom in on the figure, Python advances to the next plot. I can get around this by placing the command inside an “infinite” loop and taking advantage of the different return values of the function:

for n in range(10):
    plt.plot(r, jn(n,r))                    # Draw nth Bessel function.
    plt.title("Bessel function J[%d](r)." % n)
    while True:
        if plt.waitforbuttonpress(): break  # Exit loop if user presses a key.
    plt.cla()                               # Clear axes for next plot.

At each iteration of the while loop, Python waits for the result of plt.waitforbuttonpress(). If I click the mouse to zoom or move the axes, the return value is False, so the loop executes again. However, if I press a key on the keyboard, the return value is True and Python breaks out of the loop. I can use the mouse to interact with each plot as much as I like, then press a key to move on to the next one when I am ready.

You can accomplish the same task in fewer lines with a different while loop:

while not plt.waitforbuttonpress(): pass

This while loop does nothing at each iteration (pass) and continues to cycle so long as the return value of plt.waitforbuttonpress() evaluates to False.
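For completeness, here is the full loop rewritten with this compact form:

for n in range(10):
    plt.plot(r, jn(n,r))                    # Draw nth Bessel function.
    plt.title("Bessel function J[%d](r)." % n)
    while not plt.waitforbuttonpress(): pass
    plt.cla()                               # Clear axes for next plot.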

Exercises

Rewrite the script so that Python will display each plot for 5 seconds unless the user presses a key or clicks the mouse (inside the active portion of the plot window), at which point the countdown timer resets.

Rewrite the script so that Python will display each plot until the user presses a key on the keyboard or there are 10 seconds of inactivity — whichever comes first.

Thursday, July 16, 2015

Raising a Figure Window to the Foreground

This post describes a utility function that will raise a plot window to the foreground of your screen. The function will only work with the Qt graphics backend, so I will start with a brief overview of graphics backends. If you just want to use the function as a black box, you can do the following:

  1. Set the Graphics Backend to “Qt” in the Spyder preferences menu.
  2. Copy this function into your working directory.

Graphics Backends

You may have written a script to produce precisely the data you need, but a lot of processing is required to transform these numbers into a figure. You need to create a plot window, draw a figure inside of it, and manage all of the attributes that control a figure’s appearance: title, axis labels, line widths, colors, tick marks, etc. All of this happens in the background when you type a command like plt.plot(x,y). A graphics backend is the software that Python uses to physically draw a figure on your computer screen.

If you have already imported PyPlot, you can see which backend you are currently using:

plt.get_backend()

There are many backends available. You can get a list of those available for your Python environment as follows:

import matplotlib
matplotlib.rcsetup.all_backends

My installation lists 23 available backends. Many of these have cryptic names like “TkAgg” or “Qt5Agg”. These names describe the software packages used to create Python libraries for displaying windows. (See Qt, Tk, or Agg, for instance.) If you are working from the command line (outside of Spyder), you can select an available backend by passing its name as a string:

matplotlib.use('TkAgg')

before you import PyPlot. If you have already imported PyPlot, this command will have no effect.

In Spyder, you must select a backend from the Preferences menu: Preferences > IPython > Graphics. There is a button with a drop-down menu that allows you to select a graphics backend. You must reset the kernel or restart Spyder for the change to take effect.

In my installation of Spyder, the Preferences menu allows me to select among 5 options: Inline, Automatic, Mac OSX, Tkinter, and Qt.

Each of these has its own strengths and weaknesses. “Inline” is fast, but it does not allow you to interact with the plots you create. “Mac OSX” allows me to use OS X shortcut keys like <Cmd-~> to cycle through open plot windows and <Cmd-W> to close the current plot window. “Tkinter” is an extremely clean and simple graphical user interface (GUI). (You can use Tkinter to develop your own widgets — GUIs for scripts you write — but that is a topic for another post …) However, the “Qt” backend offers two useful features not present in the other backends:

  1. Interactive plot menu: When you create a plot, you can access an interactive menu to adjust line properties, axis properties, range, and labels.
  2. A window manager that allows you to bring the plot window to the foreground.

The interactive plot menu is nice when the figure you have created is almost perfect. You can do a few minor modifications without having to run a plotting script again. (It can be frustrating to wait for Python to redraw a complex figure when you only want to change the size of the title font!)

The window manager is useful if your installation of Spyder — like mine — tends to create new figure windows in the background where you cannot see them. Instead of minimizing Spyder and whatever other applications you have open, you can type a command at the IPython prompt to bring the figure to the foreground.

Sending a Window to the Foreground

To see how this works, start a new IPython session using the Qt backend. (Remember: If you are using Spyder, you must set this in the Preferences menu. If you are working from the command line, you must set the backend before you import PyPlot.)

import matplotlib
matplotlib.use('Qt5Agg')

import numpy as np
import matplotlib.pyplot as plt

Now, generate some data, create a figure with a name, and plot the data.

x = np.linspace(-1,1,101)
y = x**3 - x

plt.figure('cubic')
plt.plot(x, y, 'r', lw=3)

When you create a figure, the Qt backend creates a figure manager, an object that can draw and print the figure to the screen and manipulate the figure window. The function plt.get_current_fig_manager() takes no arguments; it returns the manager of the current figure, which here is the 'cubic' figure we just created:

cfm = plt.get_current_fig_manager()

One of the objects associated with a figure manager created by the Qt backend is called window. It is essentially the object that controls the window used by your operating system to display the figure you have drawn. Using two of the window object’s methods, we can raise the plot window to the foreground from the command line:

cfm.window.activateWindow()
cfm.window.raise_()

Your figure window should have come to the foreground of your screen when you executed these commands.

A Utility Function

If you find this useful, you may wish to save typing in the future and combine all of these commands into a single function:

def raise_window(figname=None):
    """
    Raise the plot window for Figure figname to the foreground.  If no argument
    is given, raise the current figure.

    This function will only work with a Qt graphics backend.  It assumes you
    have already executed the command 'import matplotlib.pyplot as plt'.
    """

    if figname: plt.figure(figname)
    cfm = plt.get_current_fig_manager()
    cfm.window.activateWindow()
    cfm.window.raise_()

Save this function in a script or module in your working directory and import it whenever you wish to use it. After creating a plot, you can raise the figure to the foreground with a single command:

plt.figure('quartic')
plt.plot(x, x**4 - x**2, 'b', lw=3)
raise_window('quartic')

If you don’t give your figures names as you create them, you can refer to them by number as well.
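For example, assuming the next unnamed figure you create happens to be Figure 1:

plt.figure()                    # Suppose this creates Figure 1.
plt.plot(x, x**2, 'g', lw=3)
raise_window(1)                 # Raise the figure by its number.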

Tk Backend

You can accomplish the same task of raising a window using the “Tk” backend. Use the following commands:

plt.plot(x,y)

cfm = plt.get_current_fig_manager()
cfm.window.attributes('-topmost', True)

This will fix the plot window in the foreground until you close it. To allow other application windows to rise to the top of the screen as you access them, type

cfm.window.attributes('-topmost', False)

This will leave the plot window in the foreground until you click on another application.

You can bundle these commands into a utility function as well:

def raise_window(figname=None):
    """
    Raise the plot window for Figure figname to the foreground.  If no argument
    is given, raise the current figure.

    This function will only work with a Tk graphics backend.  It assumes you
    have already executed the command 'import matplotlib.pyplot as plt'.
    """

    if figname: plt.figure(figname)
    cfm = plt.get_current_fig_manager()
    cfm.window.attributes('-topmost', True)
    cfm.window.attributes('-topmost', False)
    return cfm

After creating a plot, you can raise the figure to the foreground (if you are using a Tk backend, such as “TkAgg”) with a single command:

plt.figure('quartic')
plt.plot(x, x**4 - x**2, 'b', lw=3)
raise_window('quartic')

Again, if you don’t give your figures names as you create them, you can refer to them by number as well.

Sunday, June 21, 2015

Interpolation

“I have experimental data for absorption (dependent variable) measured at various wavelengths (independent variable). Now I want estimates of the absorption at other values in the same range.”

This is an interpolation problem. We wish to obtain the value of the dependent variable at points where we did not sample the independent variable. Interpolation is a mathematical procedure for filling in the gaps between available values. SciPy provides a module for interpolation based on the FITPACK library of FORTRAN functions. (Thus, it is fast and reliable.)

A Simple Example

To gain access to the interpolation functions, import the module:

import scipy.interpolate

Or, simply import the function we need for our one-dimensional problem:

from scipy.interpolate import interp1d 

Suppose that each column of an array expData contains a (wavelength, absorption) pair. We can use interp1d to create a function-like object for interpolation from this data set:

F = interp1d(expData[:,0], expData[:,1], kind='cubic')

Now, F(w) will return an interpolated estimate of the absorption at wavelength w every time it is called.[1] We can use the function to get an estimate for each wavelength in an array as well. Define the points at which you want the estimated values:

desiredRange = np.arange(430, 700, 100)

Then, evaluate the function at the desired points:

interpolatedData = F(desiredRange)

This is the basic procedure for interpolation. Now let’s take a look inside our black box!

More on interp1d(x,y)

interp1d requires two arguments — the x and y values that will be used for interpolation. In this example, we have provided an optional argument kind that specifies the type of interpolation procedure. Here, kind='cubic' instructs Python to use a third-order polynomial to interpolate between data points.

The other options are

  • linear: interpolate along a straight line between neighboring data points
  • nearest: project to the nearest data point
  • zero: project to the preceding data point
  • slinear: use a linear spline
  • quadratic: use a quadratic spline
  • cubic: use a cubic spline

The default of interp1d is linear interpolation. You can also provide an integer, in which case the function will use a polynomial of that order to interpolate between points. For example,

f = interp1d(x, y, kind=10)

will use a 10th order polynomial to interpolate between points.

The following script will demonstrate the available options and their output for a sample data set:

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

# Original "data set" --- 21 random numbers between 0 and 1.
x0 = np.linspace(-1,1,21)
y0 = np.random.random(21)

plt.plot(x0, y0, 'o', label='Data')

# Array with points in between those of the data set for interpolation.
x = np.linspace(-1,1,101)

# Available options for interp1d
options = ('linear', 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 10)

for o in options:
    f = interp1d(x0, y0, kind=o)    # interpolation function
    plt.plot(x, f(x), label=o)      # plot of interpolated data

plt.legend()
plt.show()
Interpolated data from interp1d.

More on Interpolation

You can see a few general principles of interpolation from the figure:

  • Interpolating functions are continuous.
  • Interpolating functions always pass through the data points.
  • A quadratic function can give a much worse fit than linear interpolation.
  • Increasing the order of the polynomial does not always lead to a better fit.
  • Interpolating functions can oscillate wildly between data points.
  • The fit gets worse toward the edges of the data set.

So what should you do when interpolating your own data?

The cubic spline is the workhorse of the industry. As you can see from the figure, it provides a smooth curve that appears to fit the data well. Smoothness extends beyond what you see in the figure: a cubic spline has continuous first and second derivatives. This is a useful property in physics, where first and second derivatives are quite common in theoretical analyses (Newton’s laws, Maxwell’s equations, the Schroedinger equation, etc.).

A cubic spline interpolation is a good choice in most cases.

A final word of caution: Interpolation and extrapolation are not the same. A good interpolating function can be a terrible approximation outside the set of data points used to create it. For this reason, the functions generated by interp1d(x,y) will not even return a number when you provide a value of the independent variable outside the range of the data set. They will raise a ValueError instead.
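A quick sketch of this behavior, using the interpolating function from the demonstration script above (its data spans [-1, 1]):

f = interp1d(x0, y0, kind='cubic')
print(f(0.5))       # Fine: 0.5 lies inside the data range [-1, 1].

try:
    f(2.0)          # Outside the data range ...
except ValueError:
    print("No extrapolation allowed.")

If you truly need values outside the range, interp1d accepts bounds_error and fill_value arguments, but treat any values obtained this way with suspicion.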


  1. F(w) will return an array, even if it is called with a single value. If you want a number instead of a 0-dimensional array that contains a single number, just call float(F(w)).  ↩