Monday, June 27, 2011

SPH in movies

Superman returns – but who’s looking after his water?

Aapone-20090513000179000144-topshots-australia-film-auction-superman-original
Is it a plane? No, it’s Smoothed Particle Hydrodynamics.
William West/AFP

Watching films such as Superman Returns or The Day after Tomorrow, you would have seen dramatic sequences of surging water and crumbling buildings.


While doing so, mathematics was probably the last thing you thought about; but without it, scenes of this nature would be virtually impossible.


Take the 2006 film Superman Returns. In one scene, a giant spherical object smashes into a water tank releasing a huge amount of water (see below).


Still image from Superman Returns. Courtesy of Sony Pictures Imageworks


Traditionally, the only possible way to create this kind of sequence would be to use small models – which produce unrealistic results. Or we could create a computer simulation.


Swapping droplets for particles


These days, one of the most popular methods for simulating water is to replace fluid with millions of individual particles within a computer simulation.


And the way these particles move is determined by an algorithm that my colleagues and I invented to simulate the formation of stars in our galaxy’s giant molecular clouds.


The method is known as Smoothed Particle Hydrodynamics (SPH) and the use of SPH in Superman Returns is the work of an American visual effects company called Tweak.


Superman Returns certainly isn’t the only film to feature SPH fluid simulations: think of Gollum falling into the lava of Mount Doom in Lord of the Rings: Return of the King; or the huge alligator splashing through a swamp in Primeval.


These particular scenes are the work of people at a Spanish visual effects company called NextLimit, who received an Oscar for their troubles.


How does SPH work?


Rather than trying to model a body of water as a whole, SPH replaces the fluid with a set of particles. A mathematical technique then uses the position and masses of these particles to determine the density of the fluid being modelled.


Using the density and pressure of the fluid, SPH makes it possible to map the force acting on each particle within the fluid. This technique provides results quite similar to the actual fluid being modelled. And the more particles used in the simulation, the more accurate the model becomes.


This SPH simulation uses 128,000 particles to model a fluid.


Beyond the basics


In Superman Returns, gravity also affects how the body of water behaves (the water spills out of the water tank) and SPH can easily be adapted to accomodate this.


In addition, fluids often need to flow around solid bodies such as rocks and buildings that might be carried, bobbing along, by the flow. The SPH method can be easily extended to handle this combination of solid bodies and fluids by adding sets of particles to the equation, to represent the solid bodies.


These adjustments and extensions to SPH can be made to produce very realistic-looking results.


In industry, SPH is used to describe the motion of offshore rigs in a storm, fluid flow in pumps, and injection moulding of liquid metals. In zoology, it’s being used to investigate the dynamics of fish.


SPH and the stars


As hinted at above, it’s not just water and its inhabitants that can be modelled using this technique.


SPH simulations of star formation by Matthew Bate, from the University of Exeter, and Daniel Price, of Monash, have been able to predict the masses of the stars, and the number of stable two- and three-star systems that form from a typical molecular cloud.


In the case of stable two-star systems (known as binaries) SPH can predict the shape of the orbits in good agreement with astronomical observations.


To get this level of accuracy, millions of particles are used in the SPH calculation, and the motion of these particles is calculated on a number of computer systems that work together in parallel.



SPH is also the method of choice for following the evolution of the universe after the Big Bang. This evolution involves dark matter and gas, and the simulations have one set of SPH particles for the dark matter and one set for the gas.


An advanced SPH code – known as Gadget – used for this purpose was developed by Volker Springel. The code enables astrophysicists to predict the way galaxies form and their distribution in the universe, including the effects of General Relativity.


But for non-astrophysicists, admittedly, the movies may be more of a draw.


So next time you’re watching a film and you see large swathes of water in unusual places or doing incredibly destructive things, think about maths for a moment: without it, such breathtaking scenes would be virtually impossible.

Friday, March 18, 2011

Readibility


Note to myself:
Quoted from http://bethesignal.org/blog/2011/03/18/moving-needle-gnome-leadership/


Contributors are your participants. If there is nothing to contribute, you have no participants. If you have no participants, you barely even have a project, let alone co-development! For FLOSS projects, change has an incentive beyond improving the software.
Readability is the key to creating code that others will use. Because in the end? We can scale silicon, but carbon? People are much harder to scale.

Saturday, March 5, 2011

My new mobile

At long last, i've got a new mobile, after years of using the venerable good old fellow Nokia 1100
My old phone: Nokia N8 (circa 2005)
There was not much wrong with the old phone apart from battery problem. However there was one thing which made me buy new phone earlier.
Earlier last month Nokia announced at MWC its deal with MSFT, and the trojan horse that Elop proved himself. I loved Nokia and was waiting for a long time for the N9 with Meego to come to market. However this announcement quashed all my hopes. Hence i bought a Nokia phone while it was still available w/o Windows OS. And the best Nokia phone in market at the time in India was the Nokia N8.
So this is my new phone, a silver Nokia N8.
My new phone: Nokia N8
Many people have asked my why Nokia instead of Android. Well it has really awesome hardware, even if the software Symbian^3 is not as snappy and does not have a million stupid apps in the Ovi store.
Here are some things which impressed me:
  • 12 MPx camera with carl-zeiss lenses and xenon flash: I can assure you it can take really awesome pics and i know mega-pixels are not everything in photography. It has a really large sensor for a camera phone. Of course it cannot replace a dedicated DSLR camera (i don't have one), but is handy and good as my primary camera.
  • FM Transmitter: Yes you can start your own radio station upto a few meters. Not that i have any idea to use it, but still its a cool technology.
  • USB on the go: I can connect usb storage drives (pen drives and external hard disk drives directly to my mobile). USB keyboard and mouse can also be connected to the mobile.A mouse cursor moving in the mobile looks really cool.
  • Proper wireless and proxy support: My friend's android cannot connect to ad-hoc wireless networks and also does not support proxy on wlan, so the he needs to be connected to gprs.
  • I do not want google to know "ALL" about me.
One con of this mobile is that development on symbian has stopped, and linux support for symbian development is abysmal, and there's no FOSS compiler on linux to compile to symbain, and nokia remote compiler does not support proxy.
If only pyside would work on N8, i'd be really happy. Anyone interested in a GSOC project for this please check here http://developer.qt.nokia.com/wiki/PySide_GSoc_Ideas. Treat assured :)


Saturday, January 22, 2011

Stay Healthy By Taking Breaks

Stay Healthy By Taking Breaks: "
Most of us lead sedentary lifestyles these days -- most of our time is spent in front of computers. This slowly is causing a lot of problems people from previous generations haven't experienced: back aches, knee problems, wrist pains, myopia, among others. And just going to a gym or putting in one hour of physical activity a day isn't enough. It doesn't help balance the inactivity over the entire day.

I recently wrote an article in the BenefIT magazine that talks about two tools: Workrave and RSIBreak. Thanks to the publishers, the article is available in pdf format under a CC license.

I've tried both the software but have been using Workrave for quite a while now and am quite happy with it. To briefly introduce them: both software prompt the user to take a break at regular intervals. They have timers that trigger at configured intervals asking the user to take a break. Workrave also has some stretching exercises suggested that can be performed in the longer breaks. The shorter (and more frequent) breaks can be used to take the eyes off the monitor and to relax them. Read the article for more details.

I've reviewed Workrave version 0.9.1 in the article, though the current version as of now is 0.9.3, which has a few differences from those mentioned in the article. The prime difference is the addition of a 'Natural Rest Break' that gets triggered when the screen-saver gets activated, which is nice since if the user walks away from the computer for a prolonged period of time, the rest break in effect has been taken, and the next one is scheduled after the configured duration once the screen-saver is unlocked.

Both software are available in the Fedora repository: Workrave is based on the GTK toolkit (and integrates nicely with the GNOME desktop), whereas RSIBreak is based on the Qt toolkit (and integrates nicely with the KDE desktop). Give these software a try for a cheap but effective way of staying healthy!

"


I've installed Workrave now from the fedora repos. It seems much better than the gnome-typing-break in the keyboard preferences. RSIBreak tries to install a whole lot of KDE dependencies. If anyone of you sits for long hours on a computer, please be nice to yourself and prevent RSI (repetitive strain injury) and other such problems. If you think that working out a few hours in gym will counter it, you are mistaken.

Tuesday, January 18, 2011

firefox 4b9

Firefox 4 beta 9 is looking good, apart from no hw acceleration on linux.
The minizable menu-bar is a very cool feature. Now all the UI is reduced to a single line on my firefox :)

Here's a screenshot running firefox 4b9 on gnome-shell. See how the menu, back-forward buttons, awesomebar, tabs, tabs-list, panorama all fit into a single line. Now thats some use of the wide laptop screens.


Mathematics, History and worms eating manuscripts…

Mathematics, History and worms eating manuscripts…: "

This is a sad story of forgetten history, indifference towards ancient knowledge and wisdom & callous neglect…Read on.. From A search for India’s mathematical roots, some depressing excerpts (emphasis added):


K. Ramasubramanian is the head of the Indian Institute of Technology, Bombay (IIT-B) research Cell for Indian Science and Technology in Sanskrit (CISTS), the only one of its kind in the country, where doctoral students translate the work of ancient Indian scientists into English, study language technology in Sanskrit that will help computers to analyse a wide range of speech and text, and make the translation and interpretation of Sanskrit texts easy.


…“No country should allow the distortion of its own history,” said Murli Manohar Joshi, former Union minister for human resource development, who had directed all the IIT campuses to set up a CISTS in 2002. Following the directive, IIT-B appointed Kulkarni to spearhead research in Sanskrit language technology in 2003. A year later, the institute brought Ramasubramanian on board.


His students are now at different stages of translating primary Sanskrit texts (dating between the seventh and 15th centuries) of the Kerala School mathematics…All these texts work on the same principles, but work on different timescales. For instance, “Siddhanta texts help predict astronomical positions for a mahayuga (great age), which is about 4,320,000 years. The intermediate Tantra texts work with a yuga, one-tenth of that time—432,000 years. Finally, the Karna texts help quick calculations for as little as one month. My students are working with all three of these texts,” said Ramasubramanian.


…But not every member of the team has scientific training. One of them is a trained astrologer and delighted to read the future. Dinesh Mohan Joshi, (grandson of an astrologer) said: “I saw my grandfather look at kundalis (a graphical representation of planetary positions at birth that charts the life course of the baby) and makes predictions. I saw them come true. I was fascinated. I wanted to be able to do that too. So, I went to (Shri) Lal Bahadur (Shastri) Sanskrit Vidyapeeth, Rashtriya Sanskrit Vidyapeeth and became an acharya (teacher) there. Then a friend told me about this cell and I decided to come.” Unlike Bhatt, his was an uphill struggle to master the mathematics, “because I had no formal training in the subject”.


…And in Joshi’s struggle to learn mathematics, lies the biggest challenge that this venture faces, because “there just aren’t enough people who are skilled in both. If they know Sanskrit, they know little science. And if they are good scientists, they are not interested in Sanskrit or translation of Indian texts”, said Subramanian, explaining why, despite making an enormous effort, IIT has not been able to expand the cell.



Photograph of K. Ramasubramanian, courtesy: IIT Mumbai


Another challenge is of a different nature: Original manuscripts are either rotting or missing. “I had gone to find out some text related to my research at the Kerala University library of manuscripts when I found worms eating four of seven manuscripts. I bought lemongrass oil and gave it to the librarian who said they were too short staffed to look after the documents,” said Ramasubramanian, lamenting that it was the same story across the country. “We simply do not take our historical heritage, intellectual heritage seriously.”


The professors and students say they have to battle for respect in a country where history, especially the history of science has little value. “Only recently, the cell has started getting more visibility, people have begun asking us to come and talk about our work. Slowly, people are becoming interested…” Kulkarni said.



Reminded me of: Does no one remember the Hindu contribution to Mathematics? and this on the Kerala School




"

Sunday, November 14, 2010

whY kernel development is not for the light-hearted

 The linux kernel is a huge behemoth


[pankaj@pankajlaptop pysph-perf]$ su -
Password:
[root@pankajlaptop ~]# yum --enablerepo fedora-debuginfo --enablerepo updates-debuginfo install kernel-debuginfo
Loaded plugins: auto-update-debuginfo, langpacks, presto, refresh-packagekit
Adding en_US to language list
Found 1 installed debuginfo package(s)
Enabling rpmfusion-nonfree-debuginfo: RPM Fusion for Fedora 14 - Nonfree - Debug
Enabling rpmfusion-free-updates-debuginfo: RPM Fusion for Fedora 14 - Free - Updates Debug
Enabling rpmfusion-free-debuginfo: RPM Fusion for Fedora 14 - Free - Debug
Enabling rpmfusion-nonfree-updates-debuginfo: RPM Fusion for Fedora 14 - Nonfree - Updates Debug
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package kernel-debuginfo.x86_64 0:2.6.35.6-48.fc14 set to be installed
--> Processing Dependency: kernel-debuginfo-common-x86_64 = 2.6.35.6-48.fc14 for package: kernel-debuginfo-2.6.35.6-48.fc14.x86_64
--> Running transaction check
---> Package kernel-debuginfo-common-x86_64.x86_64 0:2.6.35.6-48.fc14 set to be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
 Package                                           Arch                      Version                              Repository                            Size
=============================================================================================================================================================
Installing:
 kernel-debuginfo                                  x86_64                    2.6.35.6-48.fc14                     updates-debuginfo                    239 M
Installing for dependencies:
 kernel-debuginfo-common-x86_64                    x86_64                    2.6.35.6-48.fc14                     updates-debuginfo                     37 M

Transaction Summary
=============================================================================================================================================================
Install       2 Package(s)

Total download size: 276 M
Installed size: 1.6 G

Is this ok [y/N]: y
Downloading Packages:
Setting up and reading Presto delta metadata
Processing delta metadata
Package(s) data still to download: 276 M
(1/2): kernel-debuginfo-2.6.35.6-48.fc14.x86_64.rpm                                                                                   | 239 MB     01:11    
(2/2): kernel-debuginfo-common-x86_64-2.6.35.6-48.fc14.x86_64.rpm                                                                     |  37 MB     00:11    
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                        3.3 MB/s | 276 MB     01:23    
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : kernel-debuginfo-common-x86_64-2.6.35.6-48.fc14.x86_64                                                                                1/2
  Installing     : kernel-debuginfo-2.6.35.6-48.fc14.x86_64                                                                                              2/2

Installed:
  kernel-debuginfo.x86_64 0:2.6.35.6-48.fc14                                                                                                                

Dependency Installed:
  kernel-debuginfo-common-x86_64.x86_64 0:2.6.35.6-48.fc14                                                                                                  

Complete!
[root@pankajlaptop ~]#

Thursday, November 11, 2010

python editors in python: spyder and iep

This post is gonna be about python editors (IDEs if you may call, not but quite so)

If you are looking for IDEs, check out pydev and SPE, these are some of the best ones out there with integrated debugging features. There's also a wing editor which many people say is quite good, but i've never used it

Here i'm gonna list opinions about IEP and Spyder. I'm really interested in both of them and run their repository version.
My main issue is i want a good editor for cython, for which i am willing to get my hands dirty (a bit) and add some features myself (see http://powerpan.blogspot.com/2010/10/cython-functions-coverage-revsited.html), so if any of you can help me or have an opinion please do so.

Common features:
Both are python editors written in python (pyqt)
Both provide code completion support and outline
Both have integrated python shells

Differences:
Pros:

Spyder seems a bit more mature project
Spyder has ipython shell which is more useful than a python shell
IEP has outline support for cython

Cons:
Spyder seems to excessively misuse screen real estate - see this issue (editor areas are quite small)
IEP lacks some features such as tree file browser, shell variables, variable occurances marker, pylint support annotation
IEP is officially only for Python3, though you can surely work with python2 files (Also check http://code.google.com/r/pankaj86-iep-python2/source/browse if you really want to run it in python2 only)

Features both lack:
Graphical Debugger: get the code from pydev/winpdb to provide a graphical debugger
Cython support: I could certainly do well with more cython support. Outlining (iep does that), completion support in cython and goto definition (including into cython file from a python file)
Profiling support: Simply run profiling and put the result data into a table (spreadsheet widget)
Documentation: More documentation is needed for both the projects, especially developer documentation to implement new interesting plugins, such as profiling, debugging etc
Also it'd be cool if tooltips could be added to the editor to show documentation and other information on hovering on words in the editor, as in pydev

Few ideas:
Merge features from IEP to Spyder and vice versa. Add new features to both. Having two different projects is good in a way as it keep diversity and help in bringing new ideas. However that shouldn't mean they are independent without any co-operation. Both should surely work together to bring new features.

NOTE: The content may get outdated as you are reading it

Sunday, October 31, 2010

linux proxy problem revisited

Some time back i posted how to setup a local forwarding proxy in linux so that u do not need to set your proxy password in each and every program that asks, and that anyone and everyone cannot see your passwords by simply typing "echo $http_proxy"
But again it was not so automatic, you still needed to set the proxy to localhost "http://127.0.0.1:3128/"

Now this post is to make that also redundant. No program connecting to http (port 80) (not https) needs to have the proxy set. This is done by a transparent proxy (intercepting proxy), which 3proxy is by default without requiring any specific configuration (squid needs some option to be set in its config file)

Firstly set up the local forwarding proxy as mentioned in my previous post http://powerpan.blogspot.com/2010/06/linux-proxy-problem.html
Now you need to set an iptables (default linux firewall) policy which will redirect all outgoing traffic to port 80 to the local proxy at port 3128
You also need some way to exclude  the proxy itself from being redirected. So do the following:

Create a specific user for 3proxy:
Create a user on your computer (say 3proxy) with a specific uid (say 480)
# useradd -u 480 3proxy
Now in the 3proxy config file set it to change its user to 3proxy. Just before the "proxy" line in /etc/3proxy.cfg add the following line:
setuid 480
and restart the 3proxy service:
# service 3proxy restart

Redirect outgoing http traffic to local proxy
Here's an iptables rule that forwards outgoing traffic on port 80 (excluding those from user 3proxy) to the local proxy.
# iptables -t nat -A OUTPUT -p tcp -m owner ! --uid-owner 3proxy --dport 80 -j REDIRECT --to-port 3128

Thats all. Now you are done. To make it persistent across reboots add it to some startup file (/etc/profile.d/).

If you are on Fedora there's a better way:
Create a file (/etc/iptables-transparent-proxy) with the following line:
-A OUTPUT -p tcp -m owner ! --uid-owner 3proxy -m tcp --dport 80 -j REDIRECT --to-ports 3128
Now open system -> config -> firewall and in the Custom Rules (bottommost filter on the left) add a new rule with protocol:ipv4, table:nat and file:/etc/iptables-transparent-proxy and you are done.

To test it, open a terminal, unset http_proxy and wget google.com. The index.html file should be downloaded


NOTE:

  • This automatic forwarding does not work for https sites which are specifically designed to prevent such things (man-in-the-middle attacks)
  • Outgoing http traffic to any port other than 80 is not redirected

Saturday, October 30, 2010

cython functions coverage revsited

A few days back i showed how to get coverage of cython functions using Ned Batchelder's coverage.py
In that post i had posted a patch to coveragepy to enable cython function coverage.

However i realize that many of my friends have never applied a patch before, dont have admin rights on few machines few other problems which may hinder their using this new feature, so here's a good new for you.

I've rewritten the patch into a single file pyx_coverage.py . You can directly use this file instead of 'coverage' command to get cython function coverage, no need to patch anything. You still need to have Ned's coveragepy installed though.

All commands/options/configuration files for coveragepy are applicable here too.

To find coverage of cython files (pyx extension) you need to do following:
1. compile cython code to 'c' with directive profile=True
2. keep source pyx files in same locations as the compiled .so files
    i.e. use 'python setup.py build_ext --inplace' or 'python setup.py develop'
3. run coverage (this file) with the option timid enabled (can also set in in .coveragerc)
    i.e. 'python pyx_coverage.py run --timid my_module.py'

You can use nose test collector as follows:
$ python pyx_coverage.py run /path/to/nosetests /path/to/source

replacing the /paths as appropriate

Download the file from here: https://sites.google.com/site/pankaj86/files/pyx_coverage.py

Reader's Bonus: If you can help me write a python c extension for this treat assured.
Hint: See Ned's coverage.tracer python c extension.

Monday, October 25, 2010

cython functions coverage using coverage.py

For all the coders out there, if you have not been writing unit tests for your code then god bless you, but if you do write tests here's another tool you must use: code coverage.
In python, on of the most popular code coverage tools is Ned Batchelder's coverage.py. It reports statement coverage of all your tests and also can report coverage in beautiful html pages.Its a very nice tool and also integrates well with testing frameworks such as nose to automate your testing and coverage reporting tasks.

But, for all those who use cython, you must surely be aware of the difficulties it brings along while testing, you can never be sure if "all is well". coverage.py doesn't report coverage of cython modules, as those are compiled into native functions.
To mitigate this problem to some extent, i wrote a simple patch to enable coverage.py to report function coverage (not statement coverage) of cython pyx files too, hurray. So now after applying the patch to coverage, all you need to do is:

  1. compile cython code with profiling enabled (cython --directive profile=True)
  2. run your tests under coverage as you would normally do taking care to add the timid option (coverage --timid test.py)
  3. ???
  4. profit
So now that you can profit, i'd be greatly thankful if someone comes up and writes this patch into the tracer.c file in the coverage source (it's very simple, get the source using $ hg clone http://bitbucket.org/ned/coveragepy)

Tuesday, July 13, 2010

Nautilus media tag editor extension by yours truly

Over this weekend, i was searching for something to do (not that i dont have tasks piled up), and i remembered there was no easy way to edit audio files tags. I know there are some excellent programs out there such as easytag and exfalso . However they need you to run a separate program to edit the tags.
Also there is a nautilus extension provided by totem that displays the metadata in the file properties dialog, but cannot edit it. I thought i may be a good idea to make a metadata editor extension for nautilus, so thats what i came up with.


Installation:
The extension requires that exfalso be installed on your computer (since it directly uses the exfalso metadata editor). On fedora, its a simple command:

$ yum install quodlibet

The extension itself is a single python file which you need to put into a directory ~/.nautilus/python-extensions/ (create the directory if it does not exist). The file is available at: https://sites.google.com/site/pankaj86/files/media_tag_editor.py.

Now the mandatory screenshot:


Monday, July 12, 2010

benchmarking pitfalls

In this post i'm going to list some of the pitfalls which can happen when you are trying to optimize your code and test the timings of specific code snippets.

As an example, consider the case of testing the performance of custom array class implemented in pysph at http://code.google.com/p/pysph/source/browse/source/pysph/base/carray.pyx?repo=kunalp-alternate , which is a simple substitute for 1D numpy arrays.

Now in orfer to test the performace i wrote a simple benchmark code:

cpdef dict loopget(ns=Ns):
    cdef double t, t1, t2, num
    cdef int N, i
    cdef dict ret = {}
    empty = numpy.empty
    zeros = numpy.zeros
   
    cdef DoubleArray carr
    cdef numpy.ndarray[ndim=1, dtype=numpy.float64_t] narr
   
    for N in ns:
        carr = DoubleArray(N)
        t = time()
        for i in range(N):
            num = carr[i]
        t = time()-t
        narr = zeros(N)
        t1 = time()
        for i in range(N):
            num = narr[i]
        t1 = time()-t1
        t2 = time()
        for i in range(N):
            num = carr.data[i]
        t2 = time()-t2
        ret['carr loopget %d'%N] = t/N
        ret['carrd loopget %d'%N] = t2/N
        ret['narr loopget %d'%N] = t1/N
    return ret

This snippet simply times the retrieval of a value from a custom DoubleArray class (using its data attribute which is a c array) versus a numpy buffer, both should ideally be at c speed, but the numpy buffer is not unless you disable cython array bounds check.
Now if you run it you would be surprized to see the timings:

carr loopget 100                             9.05990600586e-08
carr loopget 1000                            3.38554382324e-08
carr loopget 10000                           3.42130661011e-08
carr loopget 100000                          3.51309776306e-08
carrd loopget 100                            9.53674316406e-09
carrd loopget 1000                           9.53674316406e-10
carrd loopget 10000                          9.53674316406e-11
carrd loopget 100000                         9.53674316406e-12
narr loopget 100                             9.53674316406e-09
narr loopget 1000                            1.90734863281e-09
narr loopget 10000                           1.09672546387e-09
narr loopget 100000                          1.01089477539e-09

Strangely, getting the value in the c data array is extremely fast, and in fact takes the same time independent of the size of the array.
My first thought was that the access time in C was extremely small as compared to the time it took to call the python's time function. However my readings about gcc and compiler optimizations came to my mind.
Trick: Note that the assignment which is tested in the code snippet does not affect any other part of the code, and the variable num is never even read again. Hence the compiler optimizes it away (this technique is called dead code removal). Thus in the case of C array access the assignments do not occur at all. This does not happen for the other two parts because in calling python functions the compiler can never be sure what is done of the variables, and hence cannot reliably determine whether the assignment has any side-effect or not, so that the assignment is not removed while compilation.

Keeping this fact in mind, let us try to modify the test code so that this specific optimization does not take place. Consider our new test code:

cpdef dict loopget(ns=Ns):
    cdef double t, t1, t2, num
    cdef int N, i
    cdef dict ret = {}
    empty = numpy.empty
    zeros = numpy.zeros
    cdef dict d = {}
   
    cdef DoubleArray carr
    cdef numpy.ndarray[ndim=1, dtype=numpy.float64_t] narr
   
    for N in ns:
        carr = DoubleArray(N)
        t = time()
        for i in range(N):
            num = carr[i]
        t = time()-t
        d[num] = num
        narr = zeros(N)
        t1 = time()
        for i in range(N):
            num = narr[i]
        t1 = time()-t1
        d[num] = num
        t2 = time()
        for i in range(N):
            num = carr.data[i]
        t2 = time()-t2
        d[num] = num
        ret['carr loopget %d'%N] = t/N
        ret['carrd loopget %d'%N] = t2/N
        ret['narr loopget %d'%N] = t1/N
    return ret

The purpose of these added statements is to make sure that the assignment to num is not useless and that the compiler does not optimize it away. Since the new statements occur outside the time() calls it shouldn't affect our tests.
Let us now check the new timings:

carr loopget 100                             1.19209289551e-07
carr loopget 1000                            4.19616699219e-08
carr loopget 10000                           4.52041625977e-08
carr loopget 100000                          4.62603569031e-08
carrd loopget 100                            9.53674316406e-09
carrd loopget 1000                           3.09944152832e-09
carrd loopget 10000                          1.31130218506e-09
carrd loopget 100000                         1.07049942017e-09
narr loopget 100                             2.14576721191e-08
narr loopget 1000                            2.86102294922e-09
narr loopget 10000                           1.69277191162e-09
narr loopget 100000                          2.29835510254e-09

As you can see now the times are more reasonable.
Conclusion: Timing testing code is not a trivial thing to do :)

Friday, July 9, 2010

nearest particle search

Nearest neighbour particle search (NNPS) is  a common requirement of (meshfreee) particle methods, such as SPH. The requirement is to locate all particles within a fixed distance (the kernel support) of a specified particle, and the trick is to avoid doing brute-force distance comparison of every particle with every other particle (O(N^2)). There are many techniques available to implement this. One of the simplest for a fixed kernel support of all particles is to bin the particles and then search for the particles only in the neighbouring bins. Such a technique is implemented here: http://code.google.com/p/pysph/source/browse/source/pysph/base/nnps.pyx?repo=kunalp-alternate.
Here i'm gonna present some timings for the nnps. Note that the timings are old and also include some constant extra times for other operations (calling of rand() numpy function which i've now converted to the c rand() function).

Here are the timings result (Click on image to view the raw data sheet) :



As you can see, it shows that the bin size should be atleast thrice the kernel support size to get good performance.


Wednesday, July 7, 2010

aero nebula cluster

For those who do not know, i'm currently in my last year of the DD program in Aerospace engineering, and my project is implementation of solid mechanics code using SPH (smoothed particle hydrodynamics) integrated into the pysph project.
It pysph is basically a SPH implementation framework written in python/cython. (Now you know my reason for all those optimization posts :) ).
Now for most CFD codes, you need to run them in parallel on clusters so as to reduce the time required. So i just saw the specs of the nebula cluster (on which i have login) in aero department. Its really wonderful. The specs are:
20 nodes (15 working) each node with 12 six-core AMD opteron 2427 processors with 2.2 GHz xloxk speed and 12 GB RAM, in all 180 6-core processors.
This is sure gonna make parallelizing much more fun and interesting.
PS: I just rad and saw quite a few videos from google about their patented map-reduce technique. It would be interesting to implement SPH in map-reduce and let it run in the "cloud", the buzzword of today.

Thursday, June 17, 2010

the linux proxy problem

For those of you who use linux for anything more than web browsing (in university/office) must be aware of the problems a proxy can pose. In many places as in my institute, you need to necessarily use a specified proxy server to access outside world, needing authentication for your credentials.
In my college, a common login registered in a central ldap server provides for all authentication services (used for course registration/fees payments/emails/proxy/...). Hence it is very important to protect it. Here i will show one way to avoid anyone easily getting your password.


Network proxy loophole in GNOME:
If you are using GNOME (default Fedora/Ubuntu) and you set your proxy details in "system->preferences->network proxy" then you open a simple loophole in the settings.
After setting your username/password, open a new terminal and type
    echo $http_proxy
Now you can clearly see your password as
        http://<user>:<pass>@proxy.com:3128/
Now since many people come to your rooms in colleges you can see how simple it is to get your credentials.

Is there a way out:
There may be other ways, but here's the one which i follow. I create a local forwarding proxy server on my own computer and direct all applications to use that proxy. The settings for my proxy server are written in a file only readable by the root.
What follows is a step-by-step guide to set it up. Tested on Fedora

What do i use:
I use a small proxy server 3proxy, you could also use any other proxy server such as squid. In fact i used to use squid before i came to know of 3proxy (when it was packaged in fedora). Squid is a much more feature rich and heavy proxy. When i was using it had a bug whereby it would do at least 100 cpu wakeups per second, using precious power on my laptop. This may have been fixed by now.

Installation:
 On Fedora systems you can do
    yum install 3proxy
A similar command for apt-get may work on Ubuntu (i've never tried)

Configuration:
The configuration you need to do is

  1. Open the file /etc/3proxy.cfg in editor of your choice as root
  2. Locate the line containing 'proxy -n'
  3. Above this line, upto the line 'dnspr', comment out all uncommented lines and instead add the following lines:

    auth iponly

    allow * * 127.0.0.0/24,<local_IPs> * * * *
    allow * * * * * * *
    parent 1000 http <proxy.server.com> <port> <proxy_user> <proxy_pass>
    proxy -n

    The values in angle brackets need to be replaced by you configuration The values for my college are given in parenthesis
    <local IPs> = ips not connected through proxy [10.0.0.0/8]
    <proxy.server.name> = proxy server [netmon.iitb.ac.in]
    <port> = proxy port [80]
    <proxy_user> = proxy authentication username
    <proxy_pass> = proxy authentication password
  4. Comment out all lines with the content:

    socks
    pop3p
    ftppr
    admin
    dnspr
    tcppm
    udppm
  5. Save the file
  6. as root run (this will make the file only readable by root user)
        chmod o-rwx /etc/3proxy.cfg
        chkconfig 3proxy on
  7. ??
  8. profit
The details of the 3proxy.cfg file are documented at http://www.3proxy.ru/doc/html/man3/3proxy.cfg.3.html

Now in whichever application you need to set the proxy server, set it as

http://127.0.0.1:3128/

without any authentication.
Thats it, now only root knows your ldap password, and no one else can snoop

EDIT:
If you automatically want to set the proxy environment variable of the whole system, then you can create a file /etc/profile.d/proxy.sh with the following content

export http_proxy=http://127.0.0.1:3128/
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy

Many (not all) programs on linux use these environment variables to get proxy settings.

EDIT2 :
To set multiple proxies (different hosts go through different proxies) you can do something like below (see 3proxy.cfg manual for much more detail and many other options):
# direct connection allow * 127.0.0.1 127.0.0.0/24,<local_IPs> * * # through proxy1 allow * * <hosts_thru_proxy1> * * parent 1000 http <proxy1.server.com> <port> <proxy_user>  # through proxy2 allow * * <hosts_thru_proxy2> * * parent 1000 http <proxy2.server.com> <port> <proxy_user>  # through proxy3 allow * * <hosts_thru_proxy3> * * parent 1000 http <proxy3.server.com> <port> <proxy_user> allow * * * * * proxy -n

Sunday, May 30, 2010

research made simple with zotero

If you are into study/research of any kind (academic/non-academic) which involves reading up things and keeping track of them then you are in for a great productivity boost.  This will help if you are reading books/news/articles/wikipedia/journals or any such sort of thing. The too I'm talking about is zotero
With zotero you can save proper bibliographic references of lots of material you see on the internet and manage/search/cite them in various forms. Its really difficult to describe all the wonderful things zotero can do for your research, so it'll be really good for you if you watch the screencast http://www.zotero.org/support/screencast_tutorials/zotero_tour

Some features you'll find helpful:
Collect:

  • Single click saving of references. For example single click on any sciencedirect article, if you have subscribed (as in my college), a single click will save all information about the article, including the pdf (with well thought name instead of fulltext.pdf) if its available.
  • To enable saving pdf select in Zotero Preferences->General tab -> automatically attach associated pdfs and other files when saving.
  • In search tab in preferences, you may also want to enable indexing of pdfs if you need.
  • Clicking on sites with references to lots of articles (wikipedia references, cited by in Scopus etc), you can easily select all the references you need to save
Manage:
  • You can search all your saved articles, add notes, tags etc
  • You can group all articles in collections based on topic
  • You can create saved searches based on various criteria
Cite:
  • To cite an article(s) simply select them and right click to 'create bibliography from selectd articles' and choose a format style from the many available (including all popular journals) and you are done
  • If you are using bibtex to manage bibliographies for your article then select the articles and right click to do 'export selected items' and select bibtex format
  • Zotero plugins are available for Openoffice and MS Office too, so you can easily insert the references in your articles, without the pain of collecting anf formatting
Share:
  • If you work in team then this is a really wonderful feature. Create a simple login on the zotero server (you can also use openid)
  • In zotero preferences->sync tab enter your zotero login details and enable sync my library and group library.
  • All synced items (including attached pdfs) are available on the internet anywhere without even installing zotero addon. You just need to login to zotero and see your collection. This is very useful if your college has access to some journals but when you are somewhere else in a conference and you need to check and article. 100 MB space is freely available and you can buy even more.
  • 'My library' is your personal collection. Group libraries are shared collections, which can be shared with other people you are working with.
So what are you waiting for, install it now. If you did not install it yet, then you need to watch the screencast http://www.zotero.org/support/screencast_tutorials/zotero_tour now

    Saturday, May 29, 2010

    numpy array performance / divide and conquer considered harmful

    This is again a post about python code speed, the data and inference are more than a few months old but still valid.
    Here's a spreadsheet showing speed of array math operations (+, -, *, /) between numpy arrays and python lists.
    Check this spreadsheet to see the timings of various operations
    https://spreadsheets.google.com/ccc?key=0AomYDYyBBNkkdHAtMkdHMF9TZ29lMmZQV3UwYkxWNFE&hl=en

    The operations I considered for comparison were:

    • x+0.1
    • x-0.1
    • x*0.1
    • x/0.1
    • x*(1/0.1)
    • x+y
    • x-y
    • x*y
    • x/y
    • [p+yp[j] for j,p in enumerate(xp)]
    • [xp[j]+yp[j] for j in xrange(i)]
    where x and y are numpy arrays, xp and yp are python lists, all of size N which is varied for the comparison.
    The raw timings data is available here:
      https://spreadsheets.google.com/pub?key=0AomYDYyBBNkkdHAtMkdHMF9TZ29lMmZQV3UwYkxWNFE&hl=en&output=html

      See the timings plot yourself
      Conclusion:
      • Use numpy arrays for size > 10
      • Avoid division as much as you can to improve the speed of your numerical codes
      • Instead of x/0.1 do x*(1/0.1) . This itself causes large speedup as N is increased.
      • x/0.1 and x/y take almost the same time
      • +, -, * take almost same time, / takes much more time, and its expense increases as N is increased.
      • Once again, do not divide.
      • The same thing is valid in cython code also. Avoid division even in cython code, and even if you are using double instead of numpy arrays (buffer). Rewrite expressions to minimize the usage of division operator.

      Wednesday, May 26, 2010

      cython timings test

      The TASK : To optimize cython functions

      Detailed: functions which depend on a once initialized attribute value

      This often comes handy in many cases, for example to write a Laplacian function of a scalar field in spherical/axisymmetric coordinate system, you would need three independent cases for 1,2,3 dimensions for performance purposes and if u do not write all functions as general 3D functions.


      The test CODE : test_kernel.pyx


      cdef class Kernel:
          cdef int dim
          cdef double (*func)(Kernel,double)
          def __init__(self, dim=1):
              self.dim = dim
              if dim == 1:
                  self.func = self.func1
              elif dim == 2:
                  self.func = self.func2
        
          cdef double func1(self, double x):
              return 1+x
        
          cdef double func2(self, double x):
              return 2+x
        
          cdef double c_func(self, double x):
              '''this is only to make function signature compatible with func1 and func2'''
              return self.func(self, x)
        
          def p_func(self, double x):
              return self.func(self, x)
        
          cpdef double py_func(self, double x):
              return self.func(self, x)
        
          cpdef double py_c_func(self, double x):
              return self.c_func(x)
        
          def py_func1(self, x):
              return self.func1(x)
        
          def py_func2(self, x):
              return self.func2(x)
        
          cdef double func_common(self, double x):
              cdef int dim = self.dim
              if dim == 1:
                  return 10+x
              elif dim == 2:
                  return 20+x
        
          def py_func_c_common(self, x):
              return self.func_common(x)
        
          cpdef double py_func_common(self, double x):
              cdef int dim = self.dim
              if dim == 1:
                  return 10+x
              elif dim == 2:
                  return 20+x

      Compilation command:
          cython -a test_kernel.pyx;
          gcc <optimization-flag> -shared -fPIC test_kernel.c -lpython2.6 -I /usr/include/python2.6/ -o test_kernel.so
      where optimization flag is either empty or "-O2" or "-O3"

      Cython optimization
      Tip 1:
      Type (cdef) as many variables as you can. You also need to type the locals in each function. Try to try to use C data types wherever possible.
      Tip 2:
      use:
          cython -a file.pyx
      command to generate a html file which shows lines which cause expensive python functions to be called. Clicking on a line shows the corresponding C code generated, highlighting expensive calls in shades of red. Try to eliminate as many such calls as you can.

      The TEST :

      time_kernel.py

      import timeit

      def time(s):
          '''returns time in microseconds'''
          t = 1e6*timeit.timeit(s,'import test_kernel;k1=test_kernel.Kernel(1);k2=test_kernel.Kernel(2);',number=1000000)/1000000.
          print s, t
          return t

      time('k1.p_func(0)')
      time('k1.py_func(0)')
      time('k1.py_func1(0)')
      time('k1.py_c_func(0)')
      time('k1.py_func_c_common(0)')
      time('k1.py_func_common(0)')

      time('k2.p_func(0)')
      time('k2.py_func(0)')
      time('k2.py_func2(0)')
      time('k2.py_c_func(0)')
      time('k2.py_func_c_common(0)')
      time('k2.py_func_common(0)')

      Timings :



      functiontime (μs)(ns)

      Optimization flag ->None-O2-O3sum(k1+k2)/2penalty
      1k1.p_func(0)0.201780.183210.180350.188450.193680.0000
      2k1.py_func(0)0.232240.185990.183930.200720.195411.7345
      3k1.py_func1(0)0.214770.189910.192520.199070.198024.3456
      4k1.py_c_func(0)0.233950.191960.192430.206110.197613.9311
      5k1.py_func_c_common(0)0.195660.184580.190620.190290.197673.9960
      6k1.py_func_common(0)0.219810.187070.189840.198910.195101.4237
      7k2.p_func(0)0.204480.183880.181940.19010

      8k2.py_func(0)0.217980.188590.184370.19698

      9k2.py_func2(0)0.204130.181240.181940.18910

      10k2.py_c_func(0)0.231140.191660.192380.20506

      11k2.py_func_c_common(0)0.198600.187830.187450.19129

      12k2.py_func_common(0)0.216090.187470.186400.19666


      Average0.215600.186810.187030.19648


      Result :

      The best is to write separate C function and a python accessor function.

      task
      functionpenalty cost (ns)
      C function + python accessor : base casep_func
      cpdef instead of defpy_func1.7345
      calling a cdef class method instead of a function pointer attributepy_func1,py_func24.3456
      one extra c function callpy_c_func3.9311
      (def + cdef) instead of (cpdef)py_func_c_common-py_func_common2.5723
      One C comparison vs one C function callpy_func_common1.4237

      Conclusion :

      As can be clearly seen that the results are clearly inconclusive :)
      This was a small test carried on my laptop with no controlled environment. Also thought the results seemed close to repeatable, nevertheless many trials should be conduction and each value should have a standard deviation also to check the repeatability. However one clear conclusion is do not forget to add optimization flags. Setuptools already does that for you.
      Also using a function pointer is not so bad after all. It would become more advantageous in case of more number of comparisons.
      Cython provides great speedups (who didn't know that :) ). The pure python version of py_func_common took 0.408μs for dim=1 and 0.518μs for dim=2
      These results are purely from python point of view. The effect of cdef/cpdef should also be considered in c/cython code which calls these functions.

      CAVEAT:

      I am no optimization expert. I have done this out of out of sheer boredom :)
      If anyone wants to verify, you are welcome
      Any information content is purely coincindental

      Tuesday, April 27, 2010

      Tracing python programs

      Coming with the easier python debugging enabled by the new gdb with python hooks is another awesome python feature coming in the new Fedora "Goddard" 13 release. That is tracing of python processes and their function calls. This feature is developed on top of systemtap, the linux analogue of Sun's awesome Dtrace system tracer.
      So what does it mean? For the uninitiated, it implements hooks (tracepoints) in the python main shared libraries (libpython.so and libpython3.so) so that systemtap can trace whenever a python function is entered/exited in any python process on the system. This means you can anytime check a python process to see which functions are being called and how many times etc. This has really cool uses. More information about this feature is available at https://fedoraproject.org/wiki/Features/SystemtapStaticProbes#Python_2

      Just to illustrate the use try the following examples (from the above link)
      First install python-debuginfo. Now add yourself to stapdev and stapusr groups or run the following command as root:
          $ stap /usr/share/doc/python3-libs-3.1.2/pyfuntop.stp
      This will display a top like output on the terminal showing the python functions which are called by all running processes and the number of times its being called. Its fun to watch, just run a python program and check all the functions being called :)
      Here's a sample output from my laptop

      PID                                                                         FILENAME   LINE                       FUNCTION  CALLS
       15479                                 /usr/lib/python2.6/site-packages/yum/packages.py    261                         verCMP  15768
       15479                                 /usr/lib/python2.6/site-packages/yum/packages.py    270                        __cmp__  15767
       15479                             /usr/lib/python2.6/site-packages/rpmUtils/updates.py    129                   returnNewest   9045
       15479                           /usr/lib/python2.6/site-packages/rpmUtils/miscutils.py     36                     compareEVR   1191
       15479                                 /usr/lib/python2.6/site-packages/yum/packages.py     48                   comparePoEVR    578
       15479                                 /usr/lib/python2.6/site-packages/yum/packages.py    296                          verEQ    556
       15479                                 /usr/lib/python2.6/site-packages/yum/packages.py     55                 comparePoEVREQ    556
       15479                                 /usr/lib/python2.6/site-packages/yum/__init__.py    778                             2
       15479                                 /usr/lib/python2.6/site-packages/yum/__init__.py    206                     _getConfig      2
       15479                                   /usr/lib/python2.6/site-packages/yum/config.py     69                        __get__      2
       15479                                         /usr/lib64/python2.6/logging/__init__.py   1026                          debug      1
       15479                                         /usr/lib64/python2.6/logging/__init__.py   1236                   isEnabledFor      1
       15479                                         /usr/lib64/python2.6/logging/__init__.py   1222              getEffectiveLevel      1
       15479                             /usr/lib/python2.6/site-packages/rpmUtils/updates.py    272                      doUpdates      1
      
      This shows the functions called during a 1 second interval (the script updates the display every second) by checking of available updates by packagekit.
      Another system script displays the python function call hierarchy of any program you run. Try this by running
          $ stap -v /usr/share/doc/python-libs-2.6.4/systemtap-example.stp -c python
      Now you will get a python terminal after a long hierarchy of function calls. Here you can see all python functions called for each line you enter on the python terminal. Its not as much fun, but useful if you want to check where all those extra unneeded function calls are being made.
      Read a short writeup from the developer of these features at http://fedoraproject.org/wiki/Python_in_Fedora_13 and also check http://press.redhat.com/2010/04/27/fedora-13-spotlight-feature-exploring-new-frontiers-of-python-development/