<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>as through a mirror dimly</title>
	<atom:link href="http://www.shocksolution.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.shocksolution.com</link>
	<description>Modeling, simulation, and engineering</description>
	<pubDate>Tue, 30 Dec 2008 20:23:41 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
	<language>en</language>
			<item>
		<title>A lookup table for fast Python math</title>
		<link>http://www.shocksolution.com/2008/12/11/a-lookup-table-for-fast-python-math/</link>
		<comments>http://www.shocksolution.com/2008/12/11/a-lookup-table-for-fast-python-math/#comments</comments>
		<pubDate>Thu, 11 Dec 2008 23:17:06 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/?p=167</guid>
		<description><![CDATA[Numerical programming frequently requires the use of look-up tables.  A look-up table is a collection of pre-computed values.  When given an &#8220;x&#8221; value, the table returns a pre-computed &#8220;y&#8221; value.  Look-up tables can be used to speed up numerical codes, when it is faster to look up a value in the table [...]]]></description>
			<content:encoded><![CDATA[<p>Numerical programming frequently requires the use of look-up tables.  A look-up table is a collection of pre-computed values.  When given an &#8220;x&#8221; value, the table returns a pre-computed &#8220;y&#8221; value.  Look-up tables can be used to speed up numerical codes, when it is faster to look up a value in the table than it is to compute the value.  They are also used when the data in the table cannot be computed&#8211;for example, experimental data or averaged results from an ensemble of Monte Carlo simulations.  Another application is to compute a value when a function cannot be solved algebraically.  Assume that you have a formula for a function q(h).  You need the value of h for a given value of q, but the formula cannot be algebraically solved to get h(q).  Instead, choose a range of h values, compute the function q(h), and store each value in a look-up table.  Now you can get h(q) for any value stored in the table.  The major limitation of a look-up table is that it cannot return valid results for any value of q which is outside the range of those stored in the table.  Depending on its implementation, the table may be able to interpolate to return values between known points.</p>
<p><a title="SciPy home" href="http://www.scipy.org/SciPy">Scipy</a> includes a handy lookup table class, but it&#8217;s sort of hidden.  Here is an example of how to create a lookup table using interp1d:</p>
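<pre>from scipy.interpolate import interp1d
from numpy import *

# L, D, and q(h, L, D, dt) are defined elsewhere (not shown)
deltah = L/1000.
h_sampled = arange(0., L+deltah/10, deltah)
q_sampled = q(h_sampled,L,D,1e-4)
# Swap the roles of q and h to tabulate the inverse function h(q)
h_interp = interp1d(q_sampled, h_sampled, kind='linear')
print(h_interp(0.514234))</pre>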
<p style="text-align: left;">First, create a Numpy array h_sampled to store the h-values for the lookup table.  The function q(h,L,D,dt) is not shown, but it takes an array of h values and returns a floating-point array of q values in the range from 0 to 1.  These two arrays are then used to construct the lookup table.   The table uses linear interpolation to compute values between the known points.  Higher-order interpolations can be used, but I don&#8217;t need them in this case.  Higher-order interpolations can introduce distortions for certain types of data, so they are best avoided unless you really understand the data being fitted.  Below is a plot illustrating the  results of a look-up table:</p>
<p style="text-align: left;"><a href="http://www.shocksolution.com/wordpress/wp-content/uploads/2008/12/interpolated_h_q.png"><img class="size-medium wp-image-168 aligncenter" title="interpolated_h_q" src="http://www.shocksolution.com/wordpress/wp-content/uploads/2008/12/interpolated_h_q.png" alt="" width="288" height="216" /></a></p>
<p style="text-align: left;">The solid line is the function h(q), the black dots are the values stored in the lookup table, and the red triangles are values interpolated from the lookup table.  The function h(q) was approximated with a lookup table that was created by computing q(h) with an evenly spaced array of h values.  Notice how the points stored in the table are spaced more closely together for smaller values of q.  This is a consequence of using a linear &#8220;h&#8221; spacing with a nonlinear function.  You could work around this by using a non-uniform array of &#8220;h&#8221; values to create the lookup table.  This is another reason to understand the function you are approximating, instead of blindly placing values into the table.</p>
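<p>As a rough, self-contained sketch of the inverse lookup described above: the function q(h) below is a made-up monotonic stand-in for my real q(h,L,D,dt), and I have swapped in Numpy's interp() for interp1d so that the example needs nothing but Numpy.  Note that interp() requires the tabulated q values to be increasing, and it clamps out-of-range inputs to the end values, whereas interp1d raises an error by default.</p>

```python
import numpy as np

def q(h):
    # Hypothetical monotonic function, standing in for the post's q(h, L, D, dt)
    return 1.0 - np.exp(-3.0 * h)

L = 1.0
h_sampled = np.linspace(0.0, L, 1001)   # evenly spaced h values
q_sampled = q(h_sampled)                # increasing, as np.interp requires

def h_of_q(q_value):
    # Look up h for a given q by linear interpolation in the table
    return np.interp(q_value, q_sampled, h_sampled)

h_est = h_of_q(0.5)
h_exact = -np.log(1.0 - 0.5) / 3.0      # this stand-in can be inverted by hand
print(h_est, h_exact)
```

<p>Because this stand-in function happens to be invertible by hand, the interpolated value can be checked against the exact one, which is a useful sanity test before trusting a table built from a function that cannot be inverted.</p>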
<p style="text-align: left;">If you want to understand more about how lookup tables work, or don&#8217;t want to install SciPy, check out this <a title="Interpolated Python Lookup Table" href="http://zovirl.com/2008/11/04/interpolated-lookup-tables-in-python/">interpolated lookup table class</a>.  I wouldn&#8217;t use it for serious math, since it doesn&#8217;t use Numpy and is probably not very fast.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/12/11/a-lookup-table-for-fast-python-math/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Update 2: building 64-bit Numpy with Intel compilers and MKL</title>
		<link>http://www.shocksolution.com/2008/12/09/update-2-building-64-bit-numpy-with-intel-compilers-and-mkl/</link>
		<comments>http://www.shocksolution.com/2008/12/09/update-2-building-64-bit-numpy-with-intel-compilers-and-mkl/#comments</comments>
		<pubDate>Tue, 09 Dec 2008 19:33:12 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Linux]]></category>

		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/?p=162</guid>
		<description><![CDATA[In a previous post I described how I built Numpy with Intel compilers and the Math Kernel Library on a 64-bit cluster.  Today I upgraded to Numpy-1.2.1 and I made a few improvements to my install process.  Please read the previous post, since I will not duplicate some important information, and then read [...]]]></description>
			<content:encoded><![CDATA[<p>In a previous post I described <a title="Previous update" href="http://www.shocksolution.com/2008/10/17/updated-building-64-bit-numpy-with-intel-compilers-icc/">how I built Numpy with Intel compilers and the Math Kernel Library on a 64-bit cluster</a>.  Today I upgraded to Numpy-1.2.1 and I made a few improvements to my install process.  Please read the previous post, since I will not duplicate some important information, and then read on.</p>
<p>This time, I made use of a site.cfg file.  Copy the file <strong>site.cfg.example</strong> to <strong>site.cfg</strong> and edit.  At the end of the file, uncomment the [mkl] section and set the path to your library.  Mine looks like:</p>
<pre>[mkl]
library_dirs = /opt/intel/mkl/10.0.1.014/lib/em64t/
lapack_libs = mkl_lapack
mkl_libs = mkl, guide</pre>
<p>Configure the build with the following command. I&#8217;m not 100% sure that you need to specify the compilers at this point, but it never hurts to be consistent:</p>
<pre>python setup.py config --compiler=intel --fcompiler=intel</pre>
<p>After configuration, build the library:</p>
<pre>python setup.py build --compiler=intel --fcompiler=intel --verbose &gt; install_output.txt</pre>
<p>Finally, install in my home directory since I don&#8217;t have admin privileges on this cluster:</p>
<pre>python setup.py install --home=~</pre>
<p>This seems like a better method than the one I used before.  It doesn&#8217;t require typing such a long command line.  It helps that the site.cfg.example file from numpy-1.2.1 is a little easier to follow.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/12/09/update-2-building-64-bit-numpy-with-intel-compilers-and-mkl/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Using Python to generate XML files for visualization in Paraview</title>
		<link>http://www.shocksolution.com/2008/11/13/using-python-to-generate-xml-files-for-visualization-in-paraview/</link>
		<comments>http://www.shocksolution.com/2008/11/13/using-python-to-generate-xml-files-for-visualization-in-paraview/#comments</comments>
		<pubDate>Thu, 13 Nov 2008 19:00:23 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Linux]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[Software development]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=81</guid>
		<description><![CDATA[VTK is an open-source software system for &#8220;3D computer graphics, image processing, and visualization&#8221; developed by Kitware.  VTK is the foundation of Paraview, an industrial-strength CFD visualization tool that I have found to be very useful.  I generate &#8220;second generation&#8221; XML-based files from my Python code and import them into Paraview for [...]]]></description>
			<content:encoded><![CDATA[<p><a title="VTK homepage" href="http://www.vtk.org/">VTK</a> is an open-source software system for<span style="font-size: x-small; font-family: Arial,Helvetica,sans-serif;"> &#8220;3D computer graphics, image processing, and visualization&#8221; developed by Kitware.  VTK is the foundation of <a title="Paraview homepage" href="http://www.paraview.org/">Paraview,</a> an industrial-strength CFD visualization tool that I have found to be very useful.  I generate &#8220;second generation&#8221; XML-based files from my Python code and import them into Paraview for visualization.  I am in the process of creating some Python classes to do this, and I hope to publish them soon.  Until then, I want to share some useful resources. The <a title="VTK File Formats" href="http://www.vtk.org/pdf/file-formats.pdf">VTK file formats are specified in this document</a>.  It&#8217;s a pretty good specification, but it lacks examples.  Soon I will post an example of a valid unstructured, serial .vtu file.  Each VTK file includes data from only one time step, so you have to keep track of time yourself (putting the step number in the filename is an easy solution).  Paraview can read in data from multiple time steps, but you have to specify them in a .pvd file.  This is also an XML file, with the following format: (<a title="Cmake discussion thread" href="http://www.cmake.org/pipermail/paraview/2008-August/009062.html">reference</a>)<br />
</span></p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;VTKFile type="Collection" version="0.1" byte_order="LittleEndian"&gt;
&lt;Collection&gt;
&lt;DataSet timestep="131" group="" part="0" file="particles_step131.vtu"/&gt;
&lt;DataSet timestep="417" group="" part="0" file="particles_step417.vtu"/&gt;
&lt;DataSet timestep="923" group="" part="0" file="particles_step923.vtu"/&gt;
&lt;DataSet timestep="1744" group="" part="0" file="particles_step1744.vtu"/&gt;
&lt;DataSet timestep="5113" group="" part="0" file="particles_step5113.vtu"/&gt;
&lt;DataSet timestep="17613" group="" part="0" file="particles_step17613.vtu"/&gt;
&lt;DataSet timestep="29999" group="" part="0" file="particles_step29999.vtu"/&gt;
&lt;/Collection&gt;
&lt;/VTKFile&gt;</pre>
<p>You can read in a single .pvd file from Paraview, and all the timestep files will automatically be read in.</p>
<p>In the near future, I plan to publish a Python class that handles the details of generating valid XML, which you can use as an example or as a foundation to create more advanced VTK output.</p>
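<p>In the meantime, here is a minimal sketch of how a .pvd collection like the one above could be written from Python using the standard library's ElementTree module.  This is not the class I plan to publish; the function name build_pvd and the filename pattern are just assumptions for illustration.</p>

```python
import xml.etree.ElementTree as ET

def build_pvd(timesteps, pattern="particles_step%d.vtu"):
    # build_pvd and pattern are hypothetical names, chosen for this sketch
    root = ET.Element("VTKFile", type="Collection", version="0.1",
                      byte_order="LittleEndian")
    collection = ET.SubElement(root, "Collection")
    for step in timesteps:
        # One DataSet element per time step, pointing at that step's .vtu file
        ET.SubElement(collection, "DataSet", timestep=str(step),
                      group="", part="0", file=pattern % step)
    # encoding="unicode" returns a str without the XML declaration
    return ET.tostring(root, encoding="unicode")

pvd_text = build_pvd([131, 417, 923])
print(pvd_text)
```

<p>Letting a library build the elements avoids hand-escaping attribute values.  To save the result, write pvd_text to a file with a .pvd extension, preceded by the XML declaration line if you want the output to match the example above.</p>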
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/11/13/using-python-to-generate-xml-files-for-visualization-in-paraview/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Unexpected integer/float math behavior in Python</title>
		<link>http://www.shocksolution.com/2008/11/06/unexpected-integerfloat-math-behavior-in-python/</link>
		<comments>http://www.shocksolution.com/2008/11/06/unexpected-integerfloat-math-behavior-in-python/#comments</comments>
		<pubDate>Fri, 07 Nov 2008 00:21:16 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[Software development]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=78</guid>
		<description><![CDATA[I wasted some time today tracking down a bug in one of my programs.  It turned out to be &#8220;unexpected behavior&#8221; rather than a bug.  I was aware of this aspect of the language, but I made an assumption and got bit.  Read on for a valuable lesson.
Python handles integer math differently than floating point [...]]]></description>
			<content:encoded><![CDATA[<p>I wasted some time today tracking down a bug in one of my programs.  It turned out to be &#8220;unexpected behavior&#8221; rather than a bug.  I was aware of this aspect of the language, but I made an assumption and got bit.  Read on for a valuable lesson.</p>
<p>Python handles integer math differently than floating point math.  If you type a number without a decimal point, Python treats it as an integer.  <strong>All math performed only with integers results in integers.</strong> For example, 1/2 evaluates to 0 while 1./2. evaluates to 0.5.  If you mix integers and floats, Python will produce a floating point result (1/2.=0.5), but you must be very careful.  For example, you might expect the expression 4/3*3.14159 to yield a floating point result.  It does yield a floating point number, but <strong>not</strong> the one you were expecting!  4/3*3.14159 yields 3.14159.  What happened?  Python evaluates division and multiplication from left to right.  4/3 evaluates to the integer &#8220;1&#8221;.  1*3.14159 evaluates to 3.14159.  For comparison, 4./3.*3.14159 evaluates to 4.1887866.  Here&#8217;s the problem with this particular aspect of Python: according to the rules of math, 4/3*3.14159 is exactly the same expression as 4*3.14159/3, but in Python they yield different results if you forget the decimal points!  4*3.14159 evaluates to a floating point number, so (4*3.14159)/3 yields the &#8220;correct&#8221; floating point value.</p>
<p>Lesson Learned: be explicit about specifying <strong>all</strong> floats if you are doing floating-point math!  Sometimes I get lazy and leave a trailing decimal point off of a number when doing a floating point calculation, knowing that the results are &#8220;upcast&#8221; into floats. Not any more!</p>
<p>Note: this unexpected behavior <a title="Python 3.0 Changes" href="http://docs.python.org/dev/3.0/whatsnew/3.0.html#common-stumbling-blocks">goes away in Python 3.0</a>, where the / operator always performs true division.</p>
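<p>Here is a small sketch of the pitfall, written on the assumption that you are running Python 3.  There, the // operator gives the same truncated result that / gave for two integers in Python 2, so it can be used to reproduce the surprise side by side with the true-division result.</p>

```python
# Assumes Python 3: "/" is true division, "//" is floor division.
# In Python 2, 4/3 with two integers behaved like 4//3 does here.
truncated = 4 // 3 * 3.14159    # 4//3 is the integer 1, so this is 1 * 3.14159
exact = 4 / 3 * 3.14159         # true division happens before the multiplication
reordered = 4 * 3.14159 / 3     # the float appears first, so no truncation occurs

print(truncated)
print(exact)
print(reordered)
```

<p>The first value is just 3.14159, while the other two agree with each other to within floating point rounding.</p>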
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/11/06/unexpected-integerfloat-math-behavior-in-python/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Profiling Python code</title>
		<link>http://www.shocksolution.com/2008/10/24/profiling-python-code/</link>
		<comments>http://www.shocksolution.com/2008/10/24/profiling-python-code/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 21:37:24 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=71</guid>
		<description><![CDATA[&#8220;Speed&#8221; is a complicated term when used in the context of software.  Does it mean raw speed of execution, or reducing the amount of time until a correct result is obtained?  Python is not the first language that comes to mind when people think of &#8220;fast software.&#8221;  It is true that pure Python will usually [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Speed&#8221; is a complicated term when used in the context of software.  Does it mean raw speed of execution, or reducing the amount of time until a correct result is obtained?  Python is not the first language that comes to mind when people think of &#8220;fast software.&#8221;  It is true that pure Python will usually not execute as quickly as the same algorithm directly coded in C or Fortran.  However, when you define speed as &#8220;least amount of time until you get the right answer,&#8221; then Python is pretty fast.  It is so easy to develop correct code in Python, when compared to low-level compiled languages, that Python is often the fastest route to a correct answer, even if the execution time is longer.  Having said that, there <strong>are</strong> times when code has to execute quickly, and that&#8217;s why I will introduce you to profiling Python code.</p>
<p>Like all things Python, profiling is easier than you think.  Python 2.4 has the <a title="Python 2.4 profiler" href="http://www.python.org/doc/2.4/lib/profile.html">profile module</a>, and Python 2.5 has both <a title="Python 2.5 profilers" href="http://www.python.org/doc/2.5.2/lib/profile.html">profile and cProfile</a>.  cProfile is written in C for lower overhead, and it&#8217;s the recommended version.  I am stuck with profile, because the cluster that I am working with still uses Python 2.4.  You can read the docs for more details, but I will quickly outline what I find to be the most helpful usage.  For example, say I want to profile the file <strong>run_sim.py</strong>.  I use the following command line:</p>
<pre>python -m profile -o profile_file run_sim.py</pre>
<p>The argument -m tells Python to run the library module <strong>profile</strong> as a script.  The argument -o is an option that tells <strong>profile</strong> to write the profile data to the specified file name.  The last argument, of course, is the name of the script to analyze.  Once the run is finished,  you can analyze the results.  I prefer to do this interactively, and I highly recommend the <a title="iPython" href="http://ipython.scipy.org/moin/">iPython</a> shell.  The following is an example of an interactive session in iPython.</p>
<pre>In [1]: import pstats
In [2]: p = pstats.Stats('profile_file')
In [3]: p.sort_stats('cumulative').print_stats(10)
Fri Oct 24 17:18:37 2008    profile_file

         278914972 function calls (277313577 primitive calls) in 4618.950 CPU seconds

   Ordered by: cumulative time
   List reduced from 550 to 10 due to restriction &lt;10&gt;

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000 4618.950 4618.950 profile:0(execfile('run_sim.py'))
        1   12.740   12.740 4618.950 4618.950 run_sim.py:3(?)
        1    0.000    0.000 4618.950 4618.950 :0(execfile)
    129/1    0.000    0.000 4618.950 4618.950 &lt;string&gt;:1(?)
   100000 1416.240    0.014 4513.100    0.045 ~/BrownianDynamics.py:212(timeStep)
 12130610 1144.580    0.000 1620.700    0.000 ~/Brownian_Dynamics/trunk/BrownianDynamics.py:105(hasXYZCollision)
 36484167  497.980    0.000  779.550    0.000 /usr/lib64/python2.4/random.py:506(gauss)
   200000   22.760    0.000  673.340    0.003 ~/lib64/python/scipy/integrate/quadrature.py:182(simps)
   300000  545.730    0.002  637.640    0.002 ~/lib64/python/scipy/integrate/quadrature.py:152(_basic_simps)</pre>
<p>This is the profile of a Brownian Dynamics simulation.  I have sorted the results by cumulative time, to see what parts of the code take longest to run.  The first four &#8220;calls&#8221; are not that useful&#8211;they show the profiler calling the &#8220;master&#8221; script, which in turn calls a &#8220;slave&#8221; script that actually does the work.  The fifth call is where things get interesting&#8211;it shows the runtime of the timeStep method, which actually does the computations.  The sixth call is to the method &#8220;hasXYZCollision&#8221; within that script.  Clearly, this is the most time-consuming function called from timeStep, which is why I have spent the past few posts optimizing it!  The next call is to a random number generator&#8211;not much I can do about that right now.  The final two calls demonstrate that you have to read the profile carefully.  The function &#8220;simps&#8221; calls the function &#8220;_basic_simps&#8221;, so the total time spent doing numerical integration is actually 673 seconds, not 673+637 seconds.</p>
<p>I have found the Python profiler to be extremely useful.  Since <a title="Quote from Donald Knuth" href="http://en.wikipedia.org/wiki/Optimization_(computer_science)#When_to_optimize">&#8220;premature optimization is the root of all evil,&#8221;</a> the profiler is a necessary tool to find and improve those bits of code that are truly inefficient.</p>
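<p>The same kind of report can also be produced from inside a script.  The following is only a sketch, assuming a Python version that has cProfile; the functions inner() and outer() are toy stand-ins, not part of my simulation.  As with simps and _basic_simps above, the cumulative time reported for outer() already includes the time spent in inner().</p>

```python
import cProfile
import io
import pstats

def inner(n):
    # Toy workload, standing in for a low-level routine
    return sum(i * i for i in range(n))

def outer():
    # Calls inner() repeatedly, so its cumulative time includes inner()'s time
    return [inner(2000) for _ in range(50)]

profiler = cProfile.Profile()
profiler.enable()
outer()
profiler.disable()

# Send the report to a string instead of stdout so it can be inspected or saved
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

<p>Profiling in-process like this is handy when only one section of a long run is of interest, since enable() and disable() can bracket just that section.</p>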
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/10/24/profiling-python-code/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Updated: building 64-bit Numpy with Intel compilers (icc)</title>
		<link>http://www.shocksolution.com/2008/10/17/updated-building-64-bit-numpy-with-intel-compilers-icc/</link>
		<comments>http://www.shocksolution.com/2008/10/17/updated-building-64-bit-numpy-with-intel-compilers-icc/#comments</comments>
		<pubDate>Fri, 17 Oct 2008 19:36:38 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Linux]]></category>

		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=69</guid>
		<description><![CDATA[I had to re-build Numpy because our cluster was upgraded and the Intel compilers and libraries were moved to a different directory.  This turned out to be a half-day affair of trial-and-error.  I learned a few important things, which I will try to list here:

Delete the numpy-1.0.4/build directory after every build attempt.  Doing &#8220;python setup.py [...]]]></description>
			<content:encoded><![CDATA[<p>I had to re-build Numpy because our cluster was upgraded and the Intel compilers and libraries were moved to a different directory.  This turned out to be a half-day affair of trial-and-error.  I learned a few important things, which I will try to list here:</p>
<ul>
<li>Delete the numpy-1.0.4/build directory after every build attempt.  Doing &#8220;python setup.py clean&#8221; is <strong>not</strong> effective.  I kept getting errors about undefined symbols when I tried to &#8220;import numpy&#8221; on the Python command line.  It was looking for symbols in the old locations, even though I had just rebuilt the code using the new library locations.  It turned out that I needed to delete the build directory in order to force a complete bottom-up rebuild.</li>
<li>The use of &#8220;setup.py&#8221; from distutils is not well documented online.  The best thing to do is run &#8220;python setup.py --help-commands&#8221; to get a list of available commands.  Then run &#8220;python setup.py &lt;cmd&gt; --help&#8221; to get help for that specific command.  You can string commands together on the command line, as I will show in the example below.</li>
<li>When you test the new numpy, make sure you are <strong>not</strong> in the numpy-1.0.4 directory!  If you are in the numpy source directory, when you import numpy, you will get the message &#8220;Running from numpy source directory.&#8221; and you will not be able to load any symbols from numpy.</li>
<li>On 64-bit architectures, you need to compile position-independent library code.  For some reason, distutils does not do this automatically, and the compilation will fail with an error similar to the following:
<pre>relocation R_X86_64_PC32 against `_various_library_symbols'
can not be used when making a shared object; recompile with -fPIC</pre>
<p>Edit the file numpy-1.0.4/numpy/distutils/intelccompiler.py.  Change the line <strong>cc_exe="icc"</strong> to <strong>cc_exe="icc -fPIC"</strong> (line 11 in my version of numpy).</p></li>
<li>Make sure your LD_LIBRARY_PATH is pointing at the right location, or the compiled code won&#8217;t be able to find shared libraries.</li>
</ul>
<p>Here is the command line I used to build numpy:</p>
<p>python setup.py config --library-dirs=/apps/intel/mkl/10.0.1.014/lib/em64t/ --library-dirs=/apps/intel/cce/10.1.008/lib/ --compiler=intel --fcompiler=intel build --verbose install --home=~ &gt; install_output.txt</p>
<p>Explanation: &#8220;config&#8221; is a command that accepts arguments --library-dirs to set the path where icc searches for shared libraries, --compiler to set the C compiler, and --fcompiler to set the Fortran compiler.  Technically, I think you can omit the &#8220;build&#8221; command.  The &#8220;install&#8221; command installs numpy, and the --home=~ installs it in the user directory (I don&#8217;t have admin privileges on this machine).  Finally, &gt; install_output.txt redirects the lengthy output to a text file for debugging purposes.</p>
<p>I hope this saves somebody some time!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/10/17/updated-building-64-bit-numpy-with-intel-compilers-icc/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Even faster collision detection in Python using Numpy</title>
		<link>http://www.shocksolution.com/2008/10/16/even-faster-collision-detection-in-python-using-numpy/</link>
		<comments>http://www.shocksolution.com/2008/10/16/even-faster-collision-detection-in-python-using-numpy/#comments</comments>
		<pubDate>Thu, 16 Oct 2008 19:44:15 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[Software development]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=67</guid>
		<description><![CDATA[Last night, in the shower, I realized that my collision detection routine could be even faster. Here is a representative snippet of code from my previous post:
    d2 = (x-self.x[0:i])*(x-self.x[0:i]) + (y-self.y[0:i])*(y-self.y[0:i]) + (z-self.z[0:i])*(z-self.z[0:i])

For some reason, I used the code (x-self.x)*(x-self.x) instead of (x-self.x)**2.  Upon further reflection, I realized that (x-self.x)*(x-self.x) computes [...]]]></description>
			<content:encoded><![CDATA[<p>Last night, in the shower, I realized that my collision detection routine could be even faster. Here is a representative snippet of code from my previous post:</p>
<pre><code>    d2 = (x-self.x[0:i])*(x-self.x[0:i]) + (y-self.y[0:i])*(y-self.y[0:i]) + (z-self.z[0:i])*(z-self.z[0:i])
</code></pre>
<p>For some reason, I used the code (x-self.x)*(x-self.x) instead of (x-self.x)**2.  Upon further reflection, I realized that (x-self.x)*(x-self.x) computes the difference between array elements twice, and then multiplies the results.  Using a &#8220;power function&#8221; should enable the interpreter to compute the difference only once, and then multiply each element times itself.  Here is the updated code, using Python&#8217;s power operator:</p>
<pre>d2 = (x-self.x[0:i])**2 + (y-self.y[0:i])**2 + (z-self.z[0:i])**2</pre>
<p>The previous version of the code ran in 46.3 seconds on a representative problem&#8211;the updated version ran in only 35.8 seconds on the same problem.  That&#8217;s a pretty good improvement.  Out of curiosity, I decided to try Numpy&#8217;s power() function in place of Python&#8217;s power operator:</p>
<pre>two = int_(2)
d2 = power(x-self.x[0:i],two) + power(y-self.y[0:i],two) + power(z-self.z[0:i],two)</pre>
<p>This version took 55.5 seconds to execute.  Using a Numpy integer scalar instead of a Python integer made no difference.  Numpy&#8217;s power() function is quite sophisticated&#8211;it can take two array arguments and compute element-by-element powers.  I suspect that this flexibility results in some overhead that slows it down compared to Python&#8217;s built-in power operator.</p>
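<p>If you want to repeat this comparison on your own machine, a rough sketch follows.  The array xs and the value x are made-up stand-ins for self.x[0:i] and the new coordinate, and the timings will depend on your Numpy version and hardware, so they may not rank the same way mine did.</p>

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
xs = rng.random(100000)   # hypothetical stand-in for self.x[0:i]
x = 0.5                   # hypothetical new coordinate

def by_product():
    # Computes the difference twice, then multiplies
    return (x - xs) * (x - xs)

def by_operator():
    # Python's power operator applied to the Numpy array
    return (x - xs) ** 2

def by_power():
    # Numpy's power() function
    return np.power(x - xs, 2)

for func in (by_product, by_operator, by_power):
    seconds = timeit.timeit(func, number=200)
    print("%-12s %.4f s" % (func.__name__, seconds))
```

<p>All three functions return the same array, so any difference is purely a matter of speed.</p>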
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/10/16/even-faster-collision-detection-in-python-using-numpy/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Haze damages LCD projectors</title>
		<link>http://www.shocksolution.com/2008/10/14/haze-damages-lcd-projectors/</link>
		<comments>http://www.shocksolution.com/2008/10/14/haze-damages-lcd-projectors/#comments</comments>
		<pubDate>Tue, 14 Oct 2008 17:24:28 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Stage Lighting]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=65</guid>
		<description><![CDATA[We recently learned the hard way that the use of haze can lead to reliability problems for some LCD projectors. We have two rather large LCD projectors (don&#8217;t know the specs offhand) permanently mounted to the ceiling, projecting onto the front of screens located on both sides of the stage.  A third LCD projector hits [...]]]></description>
			<content:encoded><![CDATA[<p>We recently learned the hard way that the use of haze can lead to reliability problems for some LCD projectors. We have two rather large LCD projectors (don&#8217;t know the specs offhand) permanently mounted to the ceiling, projecting onto the front of screens located on both sides of the stage.  A third LCD projector hits a large rear-projection screen that basically forms the back wall of the stage.  We also have an oil-based hazer mounted above the stage.  We had some old, tired projectors, so we assumed that they were just old and unreliable.  However, after replacing them with new, more powerful models, the reliability issues continued.  After sending the new projectors out for service for the second time in less than a year, the service center informed us that a film of oil had been deposited on the LCD.  You can buy expensive sealed-optics projectors, which should work reliably in dirty environments, but we saved money and bought standard models.  We don&#8217;t have the budget to replace the &#8220;new&#8221; projectors, so we&#8217;ve had to stop using haze altogether.  The stage doesn&#8217;t look nearly as good without it, and our moving lights are much less useful.</p>
<p>On a side note: you should be aware that haze can also damage moving light fixtures that use a fan to cool the power supply or motors.  The fan sucks in haze and coats the internal components with a film of oil.  Eventually, something overheats and the fixture can actually catch fire.  Fortunately, we have older High End Studiospots and Trackspots that apparently don&#8217;t use forced-air cooling, so the haze doesn&#8217;t really get inside.  Be warned!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/10/14/haze-damages-lcd-projectors/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Speeding up Python math with Numpy: collision detection example</title>
		<link>http://www.shocksolution.com/2008/10/12/speeding-up-python-math-with-numpy-collision-detection-example/</link>
		<comments>http://www.shocksolution.com/2008/10/12/speeding-up-python-math-with-numpy-collision-detection-example/#comments</comments>
		<pubDate>Sun, 12 Oct 2008 18:10:20 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[Software development]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=63</guid>
		<description><![CDATA[Python is a very-high-level language.  That makes it easy to write code quickly, but the program may not be as fast as a program compiled from a lower-level language.  For this reason, many scientific programs are written in Fortran or C++.  However, it has always been my experience that the majority of time on a [...]]]></description>
			<content:encoded><![CDATA[<p>Python is a very-high-level language.  That makes it easy to write code quickly, but the program may not be as fast as a program compiled from a lower-level language.  For this reason, many scientific programs are written in Fortran or C++.  However, it has always been my experience that the majority of time on a project is spent in writing, modifying, and debugging code, rather than executing.  Fortunately, if written correctly, the time-critical parts of Python code can execute almost as fast as compiled software.  Here is an example of a collision-detection algorithm which achieved almost a ten-fold increase in speed when written to use <a title="Numpy homepage" href="http://numpy.scipy.org/" target="_blank">Numpy</a>.</p>
<pre><code># Oldest, slowest method
for j in range(0,i):
    if ((x-self.x[j])*(x-self.x[j]) + (y-self.y[j])*(y-self.y[j]) + (z-self.z[j])*(z-self.z[j])) &lt; R_squared:
        return True
for j in range(i+1,len(self.x)):
    if ((x-self.x[j])*(x-self.x[j]) + (y-self.y[j])*(y-self.y[j]) + (z-self.z[j])*(z-self.z[j])) &lt; R_squared:
        return True
return False</code></pre>
<p>This code starts with lists of the coordinates (self.x, self.y, self.z) of identical spheres in three-dimensional space.  A sphere &#8220;i&#8221; is moved to a new location (x,y,z), and we need to know if it collides with any of the other spheres.  This algorithm uses the Pythagorean Theorem to compute the square of the center-to-center distance between spheres, which avoids taking a square root.  If this squared distance is less than R_squared (the square of twice the sphere radius), then the moved sphere intersects one of the others, so we return &#8220;True&#8221;; otherwise we return &#8220;False.&#8221;  We loop through every sphere in the list (except for the current one) and test for a collision.  Now, we will perform the same operation using Numpy:</p>
<pre><code># Numpy version: test each slice of the arrays in a single expression
if i&gt;0:
    d2 = (x-self.x[:i])*(x-self.x[:i]) + (y-self.y[:i])*(y-self.y[:i]) + (z-self.z[:i])*(z-self.z[:i])
    if min(d2) &lt; R_squared:
        return True
if i+1 &lt; len(self.x):
    d2 = (x-self.x[i+1:])*(x-self.x[i+1:]) + (y-self.y[i+1:])*(y-self.y[i+1:]) + (z-self.z[i+1:])*(z-self.z[i+1:])
    if min(d2) &lt; R_squared:
        return True
return False</code></pre>
<p>The basic collision detection test remains the same, but now the math is implemented using Numpy operators on Numpy arrays.  Notice that the loops are gone&#8211;Numpy operators can operate element-by-element on an array so that an entire &#8220;for&#8221; loop can be written in one line.  Not only is the syntax cleaner, but in this case, it sped up the code by almost a factor of ten!  Numpy is written in C and carefully optimized to perform <strong>fast</strong> operations on arrays.  When you need to operate on a whole array, the operations take place at the speed of compiled C.  Note that I had to add some &#8220;if&#8221; statements to avoid creating zero-element arrays for the cases i=0 and i=len(array)-1.</p>
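<p>The guards are needed because min() raises an error on an empty array.  As a minimal sketch of one way around that (a variation, not the code from my project), the test can be written with np.any, which simply returns False for an empty array.  The function names and the sample data below are made up for illustration, and the arrays are assumed to be floating-point Numpy arrays.</p>

```python
import numpy as np

def collides_loop(xs, ys, zs, i, x, y, z, r_squared):
    """Loop version: test the moved sphere i against every other sphere."""
    for j in range(len(xs)):
        if j == i:
            continue  # a sphere cannot collide with itself
        d2 = (x - xs[j])**2 + (y - ys[j])**2 + (z - zs[j])**2
        if d2 < r_squared:
            return True
    return False

def collides_sliced(xs, ys, zs, i, x, y, z, r_squared):
    """Array version: one squared-distance expression per slice, no guards."""
    # Only two passes: the spheres before i and the spheres after i.
    for part in (slice(0, i), slice(i + 1, None)):
        d2 = (x - xs[part])**2 + (y - ys[part])**2 + (z - zs[part])**2
        # np.any returns False for an empty array, so i = 0 and
        # i = len(xs) - 1 need no special handling.
        if np.any(d2 < r_squared):
            return True
    return False

# Four unit-radius spheres on the x axis, centers 3 apart (made-up data).
xs = np.array([0.0, 3.0, 6.0, 9.0])
ys = np.zeros(4)
zs = np.zeros(4)
r_squared = (2 * 1.0)**2  # collision when centers are closer than two radii

# Move sphere 0 to x = 8, one unit from sphere 3: collision.
hit = collides_sliced(xs, ys, zs, 0, 8.0, 0.0, 0.0, r_squared)
# Move sphere 3 to x = 9.5, still 3.5 units from sphere 2: no collision.
miss = collides_sliced(xs, ys, zs, 3, 9.5, 0.0, 0.0, r_squared)
```

<p>Both end cases (i=0 and i=len(xs)-1) go through the same code path here, and the loop version is kept alongside so the two can be checked against each other.</p>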
<p>I&#8217;m going to speculate on another reason why Numpy might be so much faster: the compiled Numpy code can take advantage of parallel execution in the CPU.  Modern CPUs have some level of instruction-level parallel execution, even within a single core.  A good compiler will find sections of code that don&#8217;t have to run sequentially, and compile them in a way that takes advantage of this.  The collision detection is a great example of code that can be executed in parallel&#8211;whether or not sphere &#8220;i&#8221; collides with sphere &#8220;i-1&#8221; has nothing to do with whether or not it collides with sphere &#8220;i+1.&#8221;</p>
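<p>Whatever the underlying cause, the difference is easy to measure on your own machine.  Here is a rough timing sketch using the standard timeit module.  It is not the benchmark behind the factor of ten quoted above: the sphere count, the random coordinates, and the function names are all assumptions chosen for illustration, and the ratio you see will depend on your hardware and Numpy version.</p>

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so runs are repeatable
n = 2000                                  # number of spheres (arbitrary)
xs, ys, zs = rng.random((3, n)) * 100.0   # random centers in a 100-unit box
r_squared = 1e-6                          # tiny threshold, so a hit is very unlikely
                                          # and every sphere gets tested
i, x, y, z = 0, 50.0, 50.0, 50.0          # sphere 0 moves to the box center

def loop_version():
    """Element-by-element test written as a Python loop."""
    for j in range(n):
        if j != i and (x-xs[j])**2 + (y-ys[j])**2 + (z-zs[j])**2 < r_squared:
            return True
    return False

def array_version():
    """The same test as whole-array Numpy expressions."""
    d2 = (x-xs)**2 + (y-ys)**2 + (z-zs)**2
    d2[i] = np.inf                        # take sphere i out of the test
    return bool(np.any(d2 < r_squared))

t_loop = timeit.timeit(loop_version, number=20)
t_array = timeit.timeit(array_version, number=20)
print(f"loop: {t_loop:.4f} s   array: {t_array:.4f} s   ratio: {t_loop / t_array:.1f}x")
```

<p>Because each pairwise test is independent of the others, the array version is free to mask out sphere &#8220;i&#8221; and test everything else in a single pass instead of splitting the arrays in two.</p>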
<p><a title="Optimizing numerical operations in Python" href="http://www.scipy.org/PerformancePython" target="_blank">Here is an extensive example of how to optimize numerical operations in Python</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/10/12/speeding-up-python-math-with-numpy-collision-detection-example/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The price of cheap Chauvet LED fixtures</title>
		<link>http://www.shocksolution.com/2008/10/08/the-price-of-cheap-chauvet-led-fixtures/</link>
		<comments>http://www.shocksolution.com/2008/10/08/the-price-of-cheap-chauvet-led-fixtures/#comments</comments>
		<pubDate>Thu, 09 Oct 2008 02:34:49 +0000</pubDate>
		<dc:creator>craig</dc:creator>
		
		<category><![CDATA[Stage Lighting]]></category>

		<category><![CDATA[chauvet]]></category>

		<category><![CDATA[failure]]></category>

		<category><![CDATA[intelligent light]]></category>

		<category><![CDATA[LED]]></category>

		<category><![CDATA[q-wash]]></category>

		<category><![CDATA[reliability]]></category>

		<guid isPermaLink="false">http://www.shocksolution.com/blog/?p=58</guid>
		<description><![CDATA[In my previous post about cheap LED fixtures, I didn&#8217;t mention our Chauvet Q-Wash intelligent moving light LED fixtures.  We bought four at a very discounted price to supplement our conventional movers.  They weren&#8217;t bright enough to stand out in the stage wash, but they made nice little beams in the haze, and [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post about cheap LED fixtures, I didn&#8217;t mention our <a title="Chauvet Q-Wash" href="http://www.chauvetlighting.com/fixtures/qwashled_fix.shtml" target="_blank">Chauvet Q-Wash</a> intelligent moving light LED fixtures.  We bought four at a very discounted price to supplement our conventional movers.  They weren&#8217;t bright enough to stand out in the stage wash, but they made nice little beams in the haze, and they could be used to &#8220;fill in&#8221; small dark spots on the stage or set pieces.  As wash lights, they lack gobos or beam control (most &#8220;real&#8221; wash lights do have some kind of beam width control), but they do have smooth RGB color mixing.  So far, you might be thinking that you should pick up a couple, so I should tell you now that they all stopped working within a year or so of installation.  One died within a month or two, and after about six months of operation, we lost one every couple of months until they were all gone.  The problem wasn&#8217;t the LEDs&#8211;it was the motors or control circuitry.  We were using haze, which is known to shorten the lifespan of moving lights, but that&#8217;s the standard operating environment for concert lights!  The fixtures were also permanently mounted above our stage, so they weren&#8217;t damaged during handling.  They just quietly died on their own&#8211;well, except for the one that went insane and just started moving around on its own until we got up in the lift and unplugged it!  Fortunately, this didn&#8217;t happen during a performance, as the motors got very hot and might have started smoking or something.  The moral of the story is that a low initial price doesn&#8217;t always save you money in the long run.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shocksolution.com/2008/10/08/the-price-of-cheap-chauvet-led-fixtures/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
