Modeling, simulation, and engineering

Using Microsoft Word for Technical Documents

August 18th, 2010 Posted in Scientific computing | No Comments »

Microsoft Word is not the best tool for doing technical writing.  However, sometimes we are required to use Word because we need to collaborate with others who want to use Word.  Right now, I am using Word 2007  on Windows XP to write several mathematical papers.  In general, it is a big improvement from previous versions.  The new integrated equation editor is outstanding–except for the major bug I’ll discuss below.  Here is a brief “FAQ” you will want to read if you are using Word for technical writing.

Q: How do I enter multi-line equations in Word 2007?

A: Press Shift-Enter where you want a line break to appear

Q: How do I prevent a page break from splitting a table into two parts?

A: Select the table.  On the Home tab, click the little box in the lower-right corner of the Paragraph box.  On the paragraph dialog, choose the “Lines and Page Breaks” tab.  Check the box for “Keep Lines Together.”

Q: How do I automatically number headings in Word 2007?  For example: Section 1, Subsection 1.1, 1.2, 1.3, Section 2, etc.

A: It’s not obvious.  See this page from dummies.com about number headings.

Q: In Word 2007, why do equations sometimes appear as blank spaces or question marks when I print or save my document as a PDF file?

A. This occurs when Word is installed on Windows XP Pro.  See the following Microsoft tech support item to find out how to install missing scripts:

The characters in an equation are not printed

You may also have an outdated printer driver:

Microsoft Support Item 920228

Q: Why do equations created in Word 2007 disappear when I open the document in Word 2008 for Mac?

A. Word for Mac does not support equations written in the new Word 2007 equation editor. Unfortunately, neither does PowerPoint 2007 on the PC.  You can work around this by inserting equations into Word the old-fashioned way: go to the Insert tab, click on Object (found in the “Text” box towards the right side of the tab), and choose “Microsoft Equation 3.0″ from the list in the dialog box.

Reading an array from a text file with Fortran 90/95

May 21st, 2010 Posted in Uncategorized | No Comments »

If you’re used to coding in more modern languages, Fortran I/O can seem a little bizarre.  Strings in Fortran are much more difficult to work with, since they are fixed-length rather than null-terminated. The following example illustrates a simple way to read an array of numbers from a text file when the array length is unknown at compile time.

program io_test
      real, dimension(:), allocatable :: x
      integer :: n

      open (unit=99, file='array.txt', status='old', action='read')
      read(99, *), n
      allocate(x(n))
      read(99,*) x

      write(*,*) x
end

Here is the text file that the array is read from. The integer on the first line is the number of elements to read from the next line.

10
1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0

Managing a pool of MPI processes with Python and Pypar

April 17th, 2010 Posted in Uncategorized | No Comments »

MPI is a standard for communication between multiple processes in parallel computing. These processes can be running on different cores, CPUs, or entirely different computers in a grid. MPI is a standard, and there are many implementations available (many are open source). The features of MPI can be accessed from Python with the packages pypar and mpi4py.  Here I present a Python script that implements a “process pool” design pattern using pypar.  I have observed a pattern often enough in my own work that I wrote this framework to avoid reinventing the wheel every time I come across it.

This pattern is useful for any embarrassingly parallel problem.  This describes a computing task that can be easily accelerated by running multiple parallel processes that do not not need to interact with one another.  For example, I have large scientific data sets from several runs of an experiment that need to be analyzed.  Since the data from each run can be analyzed independently from the other runs, I can analyze all the data sets at once on a parallel machine.   The code below implements a “master-worker” paradigm that requires at least three processes to accelerate the calculation.  The first process be comes the master, which does no calculation but hands out tasks to the workers.  The rest of the processes are workers, which receive a chunk of work, finish it, return the result to the master process, and then wait for more work.

#!/usr/bin/env python
from numpy import *
import pypar
import time

# Constants
MASTER_PROCESS = 0
WORK_TAG = 1
DIE_TAG = 2

MPI_myID = pypar.rank()

### Master Process ###
if MPI_myID == MASTER_PROCESS:
    num_processors = pypar.size()
    print "Master process found " + str(num_processors) + " worker processors."

    # Create a list of dummy arrays to pass to the worker processes
    work_size = 10
    work_array = range(0,work_size)
    for i in range(len(work_array)):
        work_array[i] = arange(0.0, 10.0)

    # Dispatch jobs to worker processes
    work_index = 0
    num_completed = 0

    # Start all worker processes
    for i in range(1, min(num_processors, work_size)):
        pypar.send(work_index, i, tag=WORK_TAG)
        pypar.send(work_array[work_index], i)
        print "Sent work index " + str(work_index) + " to processor " + str(i)
        work_index += 1

    # Receive results from each worker, and send it new data
    for i in range(num_processors, work_size):
        results, status = pypar.receive(source=pypar.any_source, tag=pypar.any_tag, return_status=True)
        index = status.tag
        proc = status.source
        num_completed += 1
        work_index += 1
        pypar.send(work_index, proc, tag=WORK_TAG)
        pypar.send(work_array[work_index], proc)
        print "Sent work index " + str(work_index) + " to processor " + str(proc)

    # Get results from remaining worker processes
    while num_completed < work_size-1:
        results, status = pypar.receive(source=pypar.any_source, tag=pypar.any_tag, return_status=True)
        num_completed += 1

    # Shut down worker processes
    for proc in range(1, num_processors):
        print "Stopping worker process " + str(proc)
        pypar.send(-1, proc, tag=DIE_TAG)

else:
    ### Worker Processes ###
    continue_working = True
    while continue_working:

        work_index, status =  pypar.receive(source=MASTER_PROCESS, tag=pypar.any_tag, \
                return_status=True)

        if status.tag == DIE_TAG:
            continue_working = False
        else:
            work_array, status = pypar.receive(source=MASTER_PROCESS, tag=pypar.any_tag, \
                return_status=True)
            work_index = status.tag

            # Code below simulates a task running
            time.sleep(random.random_integers(low=0, high=5))
            result_array = work_array.copy()

            pypar.send(result_array, destination=MASTER_PROCESS, tag=work_index)
    #### while
#### if worker

pypar.finalize()

Constrained least-squares fitting with Python

March 5th, 2010 Posted in Python, Scientific computing | No Comments »

Scipy contains a good least-squares fitting routine, leastsq(), which implements a modified Levenberg-Marquardt algorithm.  I just learned that it also has a constrained least-squared routine called fmin_slsqp().   I am using simple upper and lower bound constraints, but it’s also possible to specify more complex functional constraints.

What I did not realize, at first, is that fmin_slsqp requires a different type of objective function than leastsq.   leastsq requires you to write a function that returns a vector of residuals, and leastsq automatically squares and sums the residuals.  fmin_slsqp is actually more flexible, in that it can use any objective function that returns a single scalar value.  To implement least-squares curve fitting, your objective function will need to find the residual at each data point, square the values, and sum them up.  Hopefully this tip will save you some time.

Check out the scipy optimization tutorial for more examples.  Here is the original paper by Dieter Kraft which introduces the algorithm used by fmin_slsqp.

A preliminary review of the Lehigh Rendition console

January 17th, 2010 Posted in Stage Lighting | No Comments »

In my last post, I explained why I don’t think the ETC Element is a good replacement for the ETC Express.  When I was at LDI 2009, I ran across the Rendition series of consoles from Lehigh Lighting.   Based on the literature I picked up, and the extensive demonstration that I received from the Lehigh representative, it looks like the Rendition might be the true heir to the Express.  It is targeted at a similar market: small theaters, churches, and schools.  The physical layout will be familiar to anyone with theater console experience: 24 or 48 submasters on the left,  dual cue playback controls in the center, and hardkeys on the right, with channel faders (48 or 96) along the upper right portion of the desk.  The submasters are traditional theater-style subs.  Each sub records a fixed look rather than an independent cuelist.  Shows can be saved to a flash drive via a USB port.
Lehigh Rendition 24/48
While the console hardware is finalized, several additional features will be added in an upcoming version of the operation system.  The console currently supports one external monitor, but dual monitors will soon be supported.  An Ethernet port is already included on the console.  In a future software release, it will be possible to add wings with additional submasters or channel faders.  The advantage of Ethernet over USB is that a wing can be placed hundreds of feet from the console, and connected using existing Ethernet lines.  This feature would allow a wing to to be used as remote focus unit.  Using a wireless Ethernet bridge opens up even more possibilities.

The main difference between the two models, the 24/48 and 48/96, is the number of hardware faders.  The software capabilities are identical.   The price of the console depends on how many channels of conventional dimming are “unlocked.”  You can choose from 125, 250, or 512 conventional channels.  All models support 1024 channels of moving-light control.  The two hardware DMX ports each support 512 channels, but two additional universes can be accessed via Ethernet (I’m not sure of the details on how this works).  The console seems to support all the basic and advanced cue and effect functions that you’d expect: multiple cue lists, flexible cue timing options, macros, subroutines, and effects.  I can’t really comment further about that, since I would need to spend a lot of time with the console to accurately gauge its reliability and ease of use.  Moving lights are also supported, with fixture profiles for easy patching and a trackball for focusing.  Once again, it looks promising, but the only way to really evaluate these features is to try to use them and see how intuitive the process is.  Download the offline editing software and try it for yourself.

A more sophisticated console called the Rendition Pro is expected to be available in the second quarter of 2010.  The Pro is aimed at more experienced users, and appears to compete with the ETC Eos and Ion.  It will run the same software as the Rendition, so many of the features that are expected to be added to the Rendition are actually being developed for the Pro.  The layout is similar to the Rendition, with playback masters on the left side, traditional cue playback controls in the lower center, and keys to the lower right. Two LCD displays with encoder wheels and softkeys occupy the space above the keypad, with no hard channel faders.  Further, the submasters  have LCD displays and additional buttons–probably “Go” and “Pause”  buttons to support a cue stack on each sub.

From my limited experience, both of these consoles series are worthy of further investigation.  They seem to be solidly built and well-engineered.  The Lehigh personnel that I spoke to were helpful and friendly, and I appreciate their assistance in answering my questions.

Storing large Numpy arrays on disk: Python Pickle vs. HDF5

January 10th, 2010 Posted in Python, Scientific computing, Software development | No Comments »

In a previous post, I described how Python’s Pickle module is fast and convenient for storing all sorts of data on disk. More recently, I showed how to profile the memory usage of Python code.  In recent weeks, I’ve uncovered a serious limitation in the Pickle module when storing large amounts of data: Pickle requires a large amount of memory to save a data structure to disk. Fortunately, there is an open standard called HDF, which defines a binary file format that is designed to efficiently store large scientific data sets. I will demonstrate both approaches, and profile them to see how much memory is required. I am writing the HDF file using the PyTables interface. Here’s the little test program I’ve been using:

#!/usr/bin/env python

from numpy import *

array_len = 10000000

a = zeros(array_len, dtype=float)

#import pickle
#f = open('test.pickle', 'wb')
#pickle.dump(a, f, pickle.HIGHEST_PROTOCOL)
#f.close()

import tables
h5file = tables.openFile('test.h5', mode='w', title="Test Array")
root = h5file.root
h5file.createArray(root, "test", a)
h5file.close()

I first ran the program with both the pickle and the HDF code commented out, and profiled RAM usage with Valgrind and Massif (see my post about profiling memory usage of Python code). Massif shows that the program uses about 80MB of RAM. I then uncommented the Pickle code, and profiled the program again. Look at how the memory usage almost triples!

 MB
232.2^                                                                  #
  |                                                                  #
  |                                                                  #
  |                                                                  #
  |                                                                  #
  |                                                                  #
  |                                                                  #
  |                                                         @        #
  |                                                         @        #
  |                                                         @        #
  |                                                         @        #
  |                                                         @        #
  |                                                         @        #
  |                                                ,        @        #  .
  |                                                @        @        #  :
  |                                                @        @        #  :
  |                                                @        @        #  :
  |                                                @        @        #  :
  |                                                @        @        #  :
  |                                                @        @        #  :
0 +----------------------------------------------------------------------->Mi
  0                                                                   161.9

I then commented out the Pickle code and uncommented the HDF code, and ran the profile again. Notice how efficient the HDF library is:

    MB
81.62^                              ,.., .....,.. .,...,... .,......,..,:  .
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |                              @::@ :::::@:: :@:::@::: :@::::::@::#:  :
   |              .,..,. .....    @::@ :::::@:: :@:::@::: :@::::::@::#:  :::
 0 +----------------------------------------------------------------------->Mi
   0                                                                   246.6

I didn’t benchmark the speed because, for my application, it doesn’t really matter, because the data gets dumped to disk relatively infrequently. It’s interesting to note that the data files on disk are almost identical in size: 76.29MB.

Why does Pickle consume so much more memory? The reason is that HDF is a binary data pipe, while Pickle is an object serialization protocol. Pickle actually consists of a simple virtual machine (VM) that translates an object into a series of opcodes and writes them to disk. To unpickle something, the VM reads and interprets the opcodes and reconstructs an object. The downside of this approach is that the VM has to construct a complete copy of the object in memory before it writes it to disk. I’m not sure why the pickle operation in my example code seems to require three times as much memory as the array itself. 99% of the time, memory usage isn’t a problem, but it becomes a real issue when you have hundreds of megabytes of data in a single array. Fortunately, HDF exists to efficiently handle large binary data sets, and the Pytables package makes it easy to access in a very Pythonic way.

My article published in Technologies for Worship Magazine

December 14th, 2009 Posted in Stage Lighting | No Comments »

Head over to Technologies for Worship Magazine and check out my recently-published article about backlighting.  It’s my first publication in print (outside of technical journals).  I’m looking forward to writing more articles for a general audience!

First impression: ETC Element Console Review

December 8th, 2009 Posted in Stage Lighting | No Comments »

We recently got an ETC Element to replace an ETC Express 72/144 that had become unreliable due to a hardware problem.  I’d like to point out that ETC equipment is generally very durable, but this particular console only lasted about 8 years due to abuse by the high-school students who use the console during the week.  The Express was a great console.  It had a hardware fader for each of the channels that we commonly used, and a bank of 20 submasters.  Recording a submaster was as simple as creating a look onstage with the channel faders, then pressing “Record” and hitting the submaster bump button.  It had one external LCD screen for output, which had a very simple character-based interface.  Just about anyone could learn to use the Express.

The Element is, in many ways, a much more sophisticated console.  While the Express didn’t have any real capability to run moving lights, the Element has many features to facilitate the use of moving lights and programmable LED fixtures.  It supports two external LCD screens with a full graphical interface, a keyboard, and a mouse. Unfortunately, the Element has lost all of the simplicity that made the Express so successful in its target market.  Let’s start with the physical design.

The greatest disadvantage of the Element, especially for untrained users, is that all of the forty faders (60 in some models) are multi-function.  With the bank switch in position 1, the faders represent channels 1-40.  Position 2 changes the fader function to channels 41-80, and Position 3 changes the faders to channels 81-120.  Position 4 changes the fader function to submasters.  While it’s nice to have 40 subs, it’s also extremely confusing to figure out which channel fader or submaster is currently controlling the output.  Here’s an example.

Let’ s say that you use faders 2, 17, and 31 to set a look on Bank 1.  If you switch to Bank 2, faders 2, 17, and 31 are still up, but they are not controlling anything.  Bring those faders down, and continue setting up the look on channels 41-80 using Bank 2.  Now switch back to Bank 1 to tweak the look on channels 1-40…but the faders are no longer up!  The LEDs on the bump buttons will flash to show which channel levels no longer match the fader positions.  To take control of these channels, bring the fader up until it matches the current channel level.  Then you can control the channel again.  Are you confused yet?  Did I mention that the faders have no numbers?  I find this slows me down when creating a look, and makes it harder to edit a look “on the fly.”  To top it off, the bank selector is a little plastic twist switch, and it’s hard to see which bank it’s pointing at.

With many consoles, you can work around the limited number of faders by adding an expansion wing with additional controls.  Unfortunately, the Element does not support expansions–you have to upgrade to the Ion to get that feature!  I’m not thrilled with the layout of the keys on the Element keyboard, either.  It’s difficult to guess where a key is going to be–they don’t seem to be grouped in any particular way.  Compare to the Hog 1000, in which the keys are grouped as “verbs” and “nouns” for command-line programming.  One final gripe about the hardware: it’s difficult to run my finger down the row of bump buttons to  find a particular light–this was easy to do on the old Express.

So, after all my complaining, I want to leave you with a great feature.  I love the “Exclusive” submaster setting that prevents the output from a sub from being recorded in subsequent cues and submasters.  I use this all the time to keep some lights up onstage or in the house without having them recorded into cues.  I’m sure there are more little gems built into the Express.

Here is my conclusion about the Element: it is NOT a replacement for the Express.  Do not expect this console to be a “drop-in” replacement for any traditional theater console.  The learning curve will be very steep for brand-new users, and slightly less steep for Express users.  In the end, the Element will be far more capable than the Express–but that’s not the target market for the Express.  If you are buying a console for a school, small church, or community theater where the users are generally inexperienced, there are probably better choices than the Element.

LDI 2009 Report

November 22nd, 2009 Posted in Stage Lighting | No Comments »

I attended LDI (Live Design International) last Friday in Orlando, FL.  This was my second time at the show (read my report from 2007).  Overall, I’d say this year’s show was scaled back from 2007, which is not surprising considering how the economic recession has affected the entertainment industry.  There were fewer booths this year, and they weren’t as fancy.  A few manufacturers, such as Chauvet and Coemar, brought pretty spectacular booths.  Other companies brought large booths (such as ETC and Martin), but they were not as flashy.  I’m not sure if Friday is normally a slow day for this conference, but there wasn’t much of a crowd.  I didn’t have to wait in line to see anything.

In 2007, I went to LDI to try out control consoles to replace our Hog 1000.  That plan fell through when the economy tanked, so this year I was just looking around.  Apparently a lot of people were, because the sales people were rather aggressive about jumping onto any apparent interest in their product.  One girl tried to scan my badge as I walked by–I don’t even know what she was selling!  Actually, that’s a common occurrence at LDI–it’s hard to tell what the actual product is in the booth.  All the larger booths seemed to consist of truss, truss warmers, banners, drapes, moving lights, and video.  I often had to stop and look for a minute to figure out which one was actually their product!

I will post some photos when I get a chance.

Building and linking to a shared Fortran library

October 26th, 2009 Posted in Fortran, Linux, Scientific computing | No Comments »

I’m using GNU Fortran (gfortran) to build several shared libraries, and then dynamically linking to them from a Fortran program.  The process is a little different than what I’m used to for C libraries, so I thought I’d explain it.  Unlike C, there is no need to #include header files when compiling code that relies on functions defined in an external library.  Likewise, there is no need to use -l or -L linking flags to tell the linker about s hared libraries (at least when they’re in the same directory).  In fact, the whole process requires a lot less command-line options than I had expected.

Building the Fortran shared libraries

I want to test a library called my_library.  However, this library relies upon another library, which is called cfdrc_user_access.  I want to compile a unit-test program that calls my library and makes sure it is working correctly.

gfortran -shared -fPIC -o cfdrc_user_access.so cfdrc_user_access.f90
gfortran -shared -fPIC -o cfdrc_user.so my_library.f90

Linking the shared libraries with a Fortran program

First, compile the program to an object file (.o), using the -c flag as shown on the first line.  Then use gfortran to link the object file with the shared libraries created in the previous step:

gfortran -c blocking.f90
gfortran -o blocking blocking.o cfdrc_user.so cfdrc_user_access.so

Remember, unless the shared libraries are in a directory that’s already on your  dynamic linker path, you  will probably need to modify your LD_LIBRARY_PATH in order to run the program.

export LD_LIBRARY_PATH=/home/yourname/current_working_directory/
./blocking

That worked for me–hopefully it helps you.

X.org configuration for a Gentoo guest on VMWare Fusion

October 21st, 2009 Posted in Linux, Mac | No Comments »

Here is an xorg.conf for a Gentoo guest running in VMWare Fusion on MacOS 10.6 (Snow Leopard).  It’s surprisingly simple to set up X when the hardware is fake ;)  Gentoo runs great on a Macbook Pro 13″.  I was concerned at first about the lack of 3D acceleration, but I tried a few 3D applications and found the performance to be acceptable.  Blender ran really smoothly with a simple model (once I found out that command-click is equivalent to middle click).  Rotating a field of several hundred spheres in Paraview was a little slow, but still very acceptable.

I also bought a Logitech wireless mouse, and found that the middle button works just as you’d expect (paste) in Gentoo running in Fusion.  I didn’t have to install the Logitech drivers in OS X.

Kernel configuration for a Gentoo guest in VMWare Fusion

October 9th, 2009 Posted in Linux, Mac | No Comments »

I recently installed Gentoo Linux (amd64) as a guest on my Mac (OS 10.6 Snow Leopard) using VMWare Fusion.  I thought I’d post the kernel config that I am using, since I didn’t find any out there that I trusted.  If you can use this as a starting point, then it should save you some time and trouble.  It’s a pretty minimal configuration–I think I removed all the extra drivers and stuff.  You could lean it out a little more by removing the audio and a few other extras that I thought I might use.  Let me know if you have any trouble with it.

Gentoo amd64 kernel config for VMWare Fusion

So far I am very pleased with its performance.  My only disappointment is that VMWare doesn’t support Linux graphics hardware acceleration.

f2py: binding Fortran and Python

September 23rd, 2009 Posted in Fortran, Python, Software development | No Comments »

I  have recently started using f2py to call Fortran from Python.  I have found this useful for two reasons: speeding up Python scripts by calling compiled Fortran code, and using Python as a unit testing framework for Fortran modules.   Unfortunately, the documentation for f2py is rather sparse, and may not be completely up to date.   In this note, I will hopefully prevent you from wasting a lot of time figuring out how to pass array arguments, and return array results.

Array arguments

Passing array arguments is a critical when working with numerical algorithms.  F2py handles this well, but it is difficult to figure it out how to do it correctly.  Here is a  simple example with some Fortran 90 routines:

module test

contains

subroutine foo (a)
    implicit none

    integer, intent(in) :: a
    print*, "Hello from Fortran!"
    print*, "a=",a
end subroutine foo

function bar (len_a, a)
    implicit none
    integer, intent(in) :: len_a
    real, dimension(len_a), intent(in) :: a

    real, dimension(len_a) :: bar
    !f2py depend(len_a) a, bar

    integer :: i
    real, dimension(len_a) :: b

    do i=1,len_a
        b(i) = 2.0*a(i)
    end do

    bar = b
end function bar

subroutine sub (a, len_a, a_out)
    implicit none

    real, dimension(len_a), intent(in) :: a
    integer, intent(in) :: len_a
    real, dimension(len_a), intent(out) :: a_out

    integer :: i

    do i=1,len_a
        a_out(i) = 2.0*a(i)
    end do

end subroutine sub

end module test

Function “foo” is quite straightforward. Function “bar” is a little more complex, because it accepts an array argument and returns an array. If you’re new to Fortran, it will seem strange to pass the length of the array along with the array, but you need that information to declare the output array. f2py needs to know that the input array a and the output array bar both depend on the argument len_a.  The special comment line

!f2py depend(len_a) a, bar

is mandatory!  It tells f2py that a depends on len_a.  If you omit this comment or the corresponding one for, the function will not work correctly.  You will get strange errors like

ValueError: failed to create intent(cache|hide)|optional array-- must have defined
dimensions but got (0,)

Fortran subroutines

Now look at the subroutine called sub.  On the Fortran side, it has three arguments: two inputs and an output.  However, Python only has functions,  and all non-array arguments are passed by value.  How do you reconcile this?  When the Fortran subroutine is called from Python, the intent(out) variables are returned as a function result, or a tuple of results if there are more than one.  Compare the Fortran code above with the Python call below to see what I mean.

Building

This code can be compiled using the “fast and smart” method described in the docs:

f2py -m -c hello hello.f90

Here is the Python code that calls the Fortran routines:

#!/usr/bin/env python

import hello
from numpy import *
a = arange(0.0, 10.0, 1.0)
len_a = len(a)

print "foo:"
hello.test.foo(len_a)

print "bar:"
a_out = hello.test.bar(len_a, a)
print a_out

print "sub:"
a_out = hello.test.sub(a, len_a)
print a_out

You might also want to check out this rather complicated f2py example.

Profiling memory usage of Python code

April 17th, 2009 Posted in Linux, Python, Scientific computing, Software development | 2 Comments »

In a previous post, I explained how to use the Python profiler.  The profile is great for finding out which parts of the code run the slowest, or are called most often.  However, the profiler doesn’t give any information about how much RAM is being consumed, or where it’s being consumed.  If your program needs so much memory that it starts swapping to disk, its speed can be reduced by orders of magnitude.  On the positive side, your code may run much faster if it fits entirely in the processor cache.  In this post, I will introduce two tools that can help you understand the RAM usage of your Python code.

Read the rest of this entry »

How to make two mice work with xwindows (x.org)

April 10th, 2009 Posted in Linux | No Comments »

It’s a real pain to surf the Web when the batteries die in your wireless mouse or trackball.  I use some old rechargeables that are no longer fit for digital camera service, so this happens to me fairly often.   My batteries just died again, so I had to figure out how to configure X.org to use a backup mouse.  It’s actually not hard to have an old 3-button mouse plugged into the PS/2 port and tell X.org to use it as a backup.  First, define two input devices:

Section "InputDevice"
	Identifier  "Mouse0"
	Driver      "mouse"
	Option	    "Protocol" "auto"
	Option	    "Device" "/dev/input/mouse0"
	Option      "Buttons" "7"
	Option	    "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
	Identifier  "BackupMouse"
	Driver      "mouse"
	Option	    "Protocol" "auto"
	Option	    "Device" "/dev/input/mouse1"
	Option      "Buttons" "5"
	Option	    "ZAxisMapping" "4 5"
EndSection

The first mouse is my wireless Logitech trackball (which is currently a fancy paperweight while the batteries charge).  The second mouse is an old Microsoft two-button mouse with a scroll wheel that functions as buttons 3,4, and 5.  The trick is to define the ServerLayout as follows:

Section "ServerLayout"
	Identifier     "X.org Configured"
	Screen      0  "Screen0" 0 0
	InputDevice    "Mouse0" "CorePointer"
	InputDevice    "BackupMouse" "SendCoreEvents"
	InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Note that there can only be ONE “CorePointer” so set that to be your primary mouse.  Set the backup mouse to “SendCoreEvents” so that it receives mouse events, but X.org won’t freak out if it’s not present.

Lookup tables and spline fitting in Python

April 8th, 2009 Posted in Python, Scientific computing, Software development | No Comments »

Lookup tables and spline fitting are widely used by scientific programmers.  A particular function may not have an analytic solution–in other words, it can’t be expressed as an equation of elementary functions.  This might happen if the function were empirically determined from experimental data, or if the equation can’t be algebraically solved for one variable.  If the analytical function is available, but takes a long time to evaluate, a lookup table or spline approximation can be considerably faster.  In a previous post, I showed how to use the function interp1d from scipy.interpolate as a lookup table.  In a later post, I showed that interp1d is actually rather slow, and scipy.interpolate.UnivariateSpline is much faster.  Now, I will show some benchmark results, and explain a potential pitfall when using UnivariateSpline.

The two functions I will use for this demonstration are not very complicated.  Each function is a piecewise approximation of a more complex function.  Here is one of the functions.  For small z, the function uses one approximation, and for large z, it uses a far-field approximation.  For intermediate values, a polynomial is used to smoothly match the two solutions.

def M_wall_par(z):
    h = z - 1.0
    if h < 0.15:
        return 1./(0.9588+8./15.*log(1./h) + 64./375.*h*log(1./h))
    elif h < 0.6:
        return 0.33672 + 1.29936*h - 1.75005*h**2 + 0.89186*h**3
    else:
        return 1.-9./(16.*z)+0.125/z**3

I won’t go into the other function, which is similar.  Here is a code fragment showing how to create a lookup table for this function:

M_par_spline = UnivariateSpline(z, M_parallel, s=0)
M_par_fit = empty(len(z), dtype=float)
for i in range(len(z)):
    M_perp_fit[i] = M_perp_spline(z[i])
    M_par_fit[i] = M_par_spline(z[i])

The keyword parameter s=0 is critical, because it forces UnivariateSpline to act as a lookup table, with linear interpolation between points.  If s is set to anything higher than zero, UnivariateSpline fits a spline to the data, which can introduce fitting errors.  If you are going to use a spline fit, look at a plot of the fitted curve to make sure it’s valid across the entire domain of interest.  I benchmarked the speed of the 100,000 function calls and compared it to the same number of calls to the interpolating object. Here are the results:

  • Function calls: 1.8789 seconds average, standard deviation of 0.005
  • UnivariateSpline calls: 0.9662 seconds average, standard deviation of 0.005
  • interp1d calls: 24.8 seconds average, standard deviation of 0.1

The interpolation runs almost twice as fast as the function call, but interp1d is an order of magnitude slower!  This is because UnivariateSpline wraps a function from FITPACK (an optimized, compiled Fortran library), while interp1d is a pure Python function.  My benchmark excluded overhead such as creating the UnivariateSpline object, creating Numpy arrays, etc.  Here is a plot showing that the interpolated curve tracks perfectly with the function data (note there are two different functions plotted on this graph).

1D interpolation results (no smoothing)

1D interpolation results (no smoothing)

In conclusion, the UnivariateSpline class from Scipy is the way to go for creating fast lookup tables, if you set the keyword parameter s=0.  If you set s to a value greater than zero, the same class will fit a spline of the specified order, although you’d better check to make sure it is actually a valid approximation of your data.

3D Plotting Software for Python::Part 1::PyX

March 20th, 2009 Posted in Python, Scientific computing, Software development | 3 Comments »

There are lots of good open-source tools that you can use to make high-resolution, publication-quality 2D plots.  Personally, I like to use Python, numpy, and matplotlib.  Unfortunately, it is much harder to find a good tools to make 3D plots.  Older versions of matplotlib had rudimentary 3D support, but this was removed in version 0.98.  In this post, I will review a Python 3D plotting library called PyX.

Preparing the data

Figuring out how to store the data to be plotted was actually the hardest part of learning to use PyX.  The data format for 3D plots is not well documented.  PyX requires a list of (x,y,z) lists like this:

[
[x0  y0  z[0,0]]
[x0  y1  z[0,1]]
[x0  y2  z[0,2]]
...
[x1  y0  z[1,0]]
[x1  y1  z[1,1]]
[x1  y2  z[1,2]]
...
]

Unfortunately, PyX is designed to work with lists rather than Numpy arrays, which can be a real disadvantage for large datasets.  Here is a code snippet that shows how I transformed some Numpy data into a list of lists that can be plotted in PyX.  The 1D Numpy arrays z_values and theta_average contain x and y values, and the z values are stored in the 2D Numpy array called B_h_theta.


# Reshape data for plotting with pyx
values = []
for i in range(B_h_theta.shape[0]):
    for j in range(B_h_theta.shape[1]):
        values.append([z_values[i], theta_average[j], B_h_theta[i][j]])

Plotting

First create a graph.data.points object.  The one required argument is a list of values to plot, in the form described above.  There are other types of graph.data objects for reading data from text files, plotting functions, etc.  Then, create a graph.graphxyz object which defines the visual appearance of the graph.  Call the plot() method of graph.graphxyz and pass in the data object as an argument.  Call the dodata() to finish the plot, and then call writeEPSfile() or writePDFfile() to write the plot to a file on disk.  There is no way (that I found) to just show the plot in a graphical window.


from pyx import *
v = graph.data.points(values, title="B(h,theta)", addlinenumbers=0, x=0, y=1, z=2)
g = graph.graphxyz(size=4, projector=graph.graphxyz.central(10, 150, 30), \
        x=graph.axis.linear(title="z"), y=graph.axis.linear(title=r'$\theta$'),\
        z=graph.axis.linear(title="B(h,theta)"), x2=None, y2=None, z2=None)
g.plot(v, [graph.style.grid()])
g.dodata()
g.writeEPSfile("phi_plot")

Results

3D Plot from PyX

3D Plot from PyX

Advantages

  • Clean, object-oriented interface
  • Produces nice surface and wireframe plots in .pdf and .eps formats
  • Appears to be flexible and full-featured

Disadvantages

  • Does not seem to be actively developed or maintained (although it is very useful in its present state)
  • Does not integrate easily with Numpy–all data needs to be stored in text files or Python lists
  • No GUI output–writes images straight to .pdf or .eps files
  • TeX labels have poor resolution (at least on my system)

The Python configparser: a way to read simple data files

February 3rd, 2009 Posted in Linux, Software development | 1 Comment »

My simulation library, which is written mostly in Python, needs a lot of data and parameters in order to run. In some cases, I just hard-code the values in the script that calls the library, and in other cases I load a pickle file containing a Python objext. What if I want to read in data or configuration parameters from a human-editable text file? If the information is extensive or complex, XML might be a good choice, but XML is overkill for simple configuration or data files. Fortunately, a standard Python library called ConfigParser has already defined a configuration file format, and provides methods to interact with such files. Here is a sample of the format used in a config file:

[0.01]         ; this is a comment
1.0: 27.221
2.0: 4.214
5.0: 0.6973
10.0: 0.1523
30.0: -0.00120
[0.05]
1.0: 325.586
2.0: 30.858
5.0: 2.376
10.0: 0.458
30.0: -0.0159

Reading such a file is easy:


parser = ConfigParser.SafeConfigParser()
parser.read("filename")
section_names = parser.sections() # returns a list of strings of section names
items_in_section = parser.items(section_names[i]) # get a list of tuples from a section
pair = parser.items(section_names[i])[j]  # get one key/value pair
key = pair[0]
value = pair[1]     # values are always strings--remember to convert if needed

There are a couple of interesting “features” you should know about. First, if you open a file using the parser.read(”filename”) method, no exception will be raised if the file does not exist! If reading the file is not optional, first open the file with the fp=open(”filename”) and then call parser.readfp(fp). The second caveat is that the sections or key/value pairs returned from the parser are not necessarily returned in the order that they exist in the file. If you are trying to read data into arrays, be aware that the arrays may not be in order, so you may need to sort them.

Why reinvent the wheel when trying to do simple things? Unless you are doing it as an exercise, just look through the Python library and you’ll probably find that someone has already done the work for you.

Updated Python class for writing Paraview (VTK) (.vtu) files

January 23rd, 2009 Posted in Python, Scientific computing | No Comments »

I have released a new version of my Python class that generates VTK data files in the .vtu format, which is compatible with Paraview and other VTK applications.  If you have downloaded the old one, please get the latest version, which incorporates some bug fixes and has been more thoroughly tested.

How to put formatted, highlighted code in a Wordpress post

January 17th, 2009 Posted in Software development, Web design | No Comments »

I found two complementary plugins that enable me to put highlighted formatted code in a Wordpress page or post. Here’s an example of what they do:

# Plot flux at continuum boundary
pylab.figure()
pylab.hold(True)

pylab.plot(nd_times, Jl_BD, 'b-', label="Flux from BD simulation")
pylab.plot(nd_times, Jl_BD+Jl_BD_std, 'b.', label="Flux from BD simulation")
pylab.plot(nd_times, Jl_BD-Jl_BD_std, 'b.', label="Flux from BD simulation")

Test successful! Note that Javascript must be enabled to see the highlighting. Get the plugins here: