Category Archives: Software development

“Exporting” a project from a Git repository

What do you do when you want to distribute or release source code that is stored in a Git repository? If your target audience uses Git, you can just compress the directory that contains the repository and distribute copies, or give the users a way to clone your repository (such as GitHub). However, your audience may not be Git users, or the hidden .git directory may be very large and you don’t want to distribute it. The solution is the git archive command, which packs the files from a tree-ish into an archive (ZIP or TAR). “Tree-ish” means that you can specify a branch, a tag, a commit, HEAD, etc. git archive is somewhat analogous to the svn export command. I find the most useful form of this command to be:
cd example
git archive --output ~/example.zip --format=zip --prefix=example/ HEAD

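Do not forget the trailing slash after the directory that you specify with the --prefix flag! git prepends the prefix to every path verbatim, so without the slash you get files named like exampleREADME instead of a top-level example directory.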
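If you drive the export from a script, a small wrapper can guard against the missing slash. This is only a sketch: the function names git_archive_command and export_repo are my own, and it assumes git is on the PATH.

```python
import os
import subprocess


def git_archive_command(output, prefix, treeish="HEAD", fmt="zip"):
    """Build the argument list for the git archive call shown above."""
    # The prefix is prepended to every path as-is, so make sure it ends in "/".
    if not prefix.endswith("/"):
        prefix += "/"
    return [
        "git", "archive",
        "--output", os.path.expanduser(output),  # no shell here, so expand ~ ourselves
        "--format=" + fmt,
        "--prefix=" + prefix,
        treeish,
    ]


def export_repo(repo_dir, output, prefix):
    """Run git archive inside repo_dir; raises CalledProcessError on failure."""
    subprocess.run(git_archive_command(output, prefix), cwd=repo_dir, check=True)
```

Calling export_repo("example", "~/example.zip", "example") would then run the same command as above from inside the example directory.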

REFERENCE: How to do a “git export” (like svn export)

Collaborative Git workflow: Shared Repository on a File Server

GitHub is a great tool for collaborating on projects. However, sometimes it is necessary to mimic the “GitHub workflow” using a shared repository on a local Linux server. The following example shows how I shared an example repository with multiple users.  We are also using the Git flow model for branching, aided by the handy git flow plugin.

On my workstation

I started by creating a repo on my local workstation and setting it up to use the git flow plugin.

git init .
git flow init
git flow feature start first_feature

pickle, hickle and HDF5

Danny Price recently left a comment to let me know about a new Python package he’s developing called hickle. The goal of “hickle” is to create a module that works like Python’s pickle module but stores its data in the HDF5 binary file format. This is a promising approach, because I advocate storing binary data in HDF5 files whenever possible instead of creating yet another one-off binary file format that nobody will be able to read in ten years. The immediate advantage of using HDF5 to store pickled Python objects is that HDF5 files are portable across many platforms, while “pickled” objects may not be readable on a different platform.

Tricks for Writing XML with Python 3

I’ve added a Python 3 XML example to my Shocksolution_Examples repo on GitHub.  The new example shows how to generate an XML file which functions as a template for building a GUI with wxGlade.  However, this example should be helpful for anyone who needs to create XML files with Python.  The full example is on GitHub, so I’m just going to highlight a few interesting snippets.

Python string format examples

The format method for Python strings (introduced in 2.6) is very flexible and powerful.  It’s also easy to use, but the documentation is not very clear.  It all makes sense with a few examples.  I’ll start with one and add more as I have time:

Formatting a floating-point number

"{0:.4f}".format(0.1234567890)
"{0:.4f}".format(10.1234567890)

The results are the following strings:

'0.1235'
'10.1235'

Braces { } are used to enclose the “replacement field”
0 indicates the first argument to method format
: indicates the start of the format specifier
.4 indicates four decimal places
f indicates a floating-point number
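
To see how these pieces combine, here is a small sketch (the variable names are mine, chosen for illustration) that changes only the precision and reuses the same argument twice by its index:

```python
value = 10.1234567890

# Same argument, different precisions after the colon.
four_places = "{0:.4f}".format(value)
two_places = "{0:.2f}".format(value)

# Index 0 may appear more than once; both fields read the first argument.
both = "{0:.2f} is {0:.4f} rounded".format(value)

print(four_places)  # 10.1235
print(two_places)   # 10.12
print(both)         # 10.12 is 10.1235 rounded
```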

Align floating-point numbers at the decimal point

There is no format specifier that aligns numbers at the decimal point. Decimal-point alignment is accomplished by fixing the field width, which is set by placing a number before the decimal point in the format string. Compare the previous example to the output shown below:

"{0:8.4f}".format(0.1234567890)
"{0:8.4f}".format(10.987654321)

Result:

'  0.1235'
' 10.9877'
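
The same idea works for a whole column of numbers. In this sketch (the sample values are made up), every string comes out eight characters wide, so the decimal points line up when the strings are printed one per line. Keep in mind that the width is a minimum: a number that needs more than eight characters will simply overflow the column.

```python
values = [0.1234567890, 10.987654321, 123.456]

# Width 8 with 4 decimals leaves room for up to three digits before the point.
column = ["{0:8.4f}".format(v) for v in values]

for text in column:
    print(text)
```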

Scientific Notation

"{0:.4e}".format(0.1234567890)

Output:

'1.2346e-01'

Multiple Arguments

In Python 2.6 you can include multiple arguments like this (sin comes from the math module):

from math import sin
"sin({0:.4f}) = {1:.4e}".format(0.1234567890, sin(0.123456789))
'sin(0.1235) = 1.2314e-01'

In Python 2.7 and later, you may omit the first integer from each replacement field, and the arguments to format will be taken in order:

"sin({:.4f}) = {:.4e}".format(0.1234567890, sin(0.123456789))
'sin(0.1235) = 1.2314e-01'

Leave space for a minus sign

"{: .4e}".format(0.098765)
"{: .4e}".format(-0.1234567)

Output:

' 9.8765e-02'
'-1.2346e-01'

Note that there is a space between the colon and the dot in the format specifier. The decimal points for the two numbers are aligned in the output, which is handy for printing tabular data.
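
Here is a short sketch of that tabular use. The row labels are invented for the example, and the left-aligned {:<8} field for the label is my own addition:

```python
rows = [("gain", 0.098765), ("offset", -0.1234567)]

# {:<8} left-aligns the label in 8 columns; the space after the colon in
# {: .4e} reserves one column for the sign of the number.
lines = ["{:<8}{: .4e}".format(name, value) for name, value in rows]

for line in lines:
    print(line)
```

Both lines have the same length, so the mantissas and exponents stay in the same columns whether or not the value is negative.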

Parsing INI configuration files with FORTRAN

Fortran would not be my first choice for working with text, in any form! However, sometimes even numerical codes need to read data from configuration files. The easiest way to read a configuration file from a Fortran 90 routine is by using namelist I/O (I really need to add an example of that). If you’re stuck with INI files, I found an INI file parser written in Fortran buried in an index of free Fortran routines.  I don’t remember how I found it, since it’s not well labeled and doesn’t come up in the first few pages of Google results, so I thought I’d better write a post about it in case I ever need such a thing in the future.

A git branching strategy suitable for large projects

Git is an amazing tool…but what is the best way to use it?  Like any tool that gives you great power and flexibility, it’s up to you to use the tool in the best way to suit your purpose.  A friend of mine who manages an enterprise-class software development team recommended the git branching strategy explained in this blog post.  I highly recommend reading it.

Replacing text in place with GNU sed

sed is a stream editor, which means that it accepts a stream of text, processes it, and spits out another stream of text.  sed can process files that are too large to load into memory, and it is a completely command-line tool that can easily be integrated into shell scripts.  This excellent sed tutorial was written for an old version of sed provided by Sun Microsystems, and it doesn’t cover one of the most useful features of GNU sed.  GNU sed accepts a -i command line argument that tells sed to replace the text file in place, rather than writing the output stream to another file.  Another nice feature is that sed -i will create a backup file before processing if you provide a backup suffix:

sed -i.bak 's/CB_0/CAB_max/g' source_code.py

Briefly, the sed command in this example works like this: s tells sed to substitute CB_0 with CAB_max, and the g (which stands for global) tells sed to replace every occurrence in the file (you can also process multiple files with wildcards like *.py).  I used this example to change a variable name in a bunch of similar Python scripts.
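
For comparison, here is a rough Python equivalent built on the standard fileinput module. It is a sketch, not a drop-in replacement: the function name replace_in_place is my own, and str.replace does a literal text substitution, whereas sed’s s command matches a regular expression.

```python
import fileinput


def replace_in_place(paths, old, new, backup_suffix=".bak"):
    """Replace every occurrence of old with new in each file, keeping a backup."""
    if not paths:
        return  # an empty list would make fileinput fall back to stdin
    # With inplace=True, anything printed inside the loop is written back to
    # the file being read; the original is kept with the backup suffix appended.
    with fileinput.input(files=paths, inplace=True, backup=backup_suffix) as stream:
        for line in stream:
            print(line.replace(old, new), end="")
```

Calling replace_in_place(["source_code.py"], "CB_0", "CAB_max") rewrites the file and leaves the original in source_code.py.bak.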

I later learned that you can do the same thing with perl, using almost the same syntax:

perl -i.bak -pe 's/CB_0/CAB_max/g' *.py

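The -e option tells perl to execute the one-line script that is provided, and the -p option wraps that script in a loop that reads each line of every file named on the command line, runs the script on it, and prints the result. As with sed, -i.bak edits each file in place and keeps a backup with the .bak suffix.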

A self-contained Fortran linear equation solver

I’ve just released a self-contained Fortran module that solves a system of linear equations using the LU decomposition.

Download the Fortran linear solver from GitHub

This module is based on code that was implemented and released on the Web by Jean-Pierre Moreau.  His implementation was based on one of the Numerical Recipes books.  I updated his code to a more strict Fortran 90 standard and added the necessary comments so that it can be built as a Python module using f2py.  I replaced Jean-Pierre’s Fortran test program with a simpler, self-contained program.  I also included a Python script that implements the same test case.

I created this module because sometimes a self-contained routine is more appropriate than a full library.  I am compiling a library that implements a custom boundary condition for a proprietary computational fluid dynamics solver (CFD-ACE+).  The library has to be written in Fortran, and it has to be built using a proprietary set of build scripts.  I could either try to reverse-engineer the build process and modify it to link to a shared library like LAPACK, or I could implement a self-contained solver.  Since Jean-Pierre had already implemented the solver, I was able to slightly modify his code and get it working relatively quickly.
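
For readers who have not seen the method, here is a minimal pure-Python sketch of solving a linear system by LU decomposition with partial pivoting. It is not a translation of the Fortran module or of Jean-Pierre’s code; the function names and the test matrix are mine, and it is meant only to illustrate the idea.

```python
def lu_decompose(a):
    """Factor a square matrix (list of rows) as P*A = L*U.

    Returns (lu, perm): lu stores U on and above the diagonal and the
    multipliers of L below it (unit diagonal implied); perm[i] is the
    index of the original row that ended up in position i.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        pivot = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            lu[k], lu[pivot] = lu[pivot], lu[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        # Eliminate below the pivot, storing each multiplier in place.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A*x = b using the factors returned by lu_decompose."""
    n = len(lu)
    # Forward substitution: L*y = P*b.
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    # Back substitution: U*x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lu[i][j] * x[j] for j in range(i + 1, n))) / lu[i][i]
    return x


a = [[2, 1, 1], [4, -6, 0], [-2, 7, 2]]
b = [5, -2, 9]
lu, perm = lu_decompose(a)
x = lu_solve(lu, perm, b)
print(x)
```

The factorization is done once and can be reused for several right-hand sides, which is the main reason to prefer LU over repeating a full elimination for each one.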

Finding a value in an unordered Fortran array

I have been optimizing some Fortran code that involved searching for an integer value in an unordered array (we know the value occurs only once).  Since there is no intrinsic procedure to accomplish this, I thought I’d try a couple of approaches to see which was fastest.  The simple answer is that, in this case, brute force beats elegance, even when the target value is near the end of the array.

Download the full example from GitHub

Method 1: Brute force

Algorithm: use a do loop to iterate through the array until you find the target value.  Exit the loop when you find the value.

do i = 1, num_elements
    if (array1(i) .eq. target_value) then
        loc = i
        exit
    endif
end do

Results: CPU time required on my PC: 0.048 sec