Dr. Rosemary Mardling of Monash University has been researching the mathematics behind the orbit of Pluto and its moon Charon. Observations from the Hubble telescope have shown that the spins of the two bodies are synchronized with the orbit, that the orbit is not perfectly circular, and that the surface of Charon is covered with young (crystalline) water ice. In this figure, the observed orbit of Pluto and Charon around each other is at a 120° angle to the Sun.
The research led to the development of simulation software that plotted the evolution of the two bodies over the last four billion years. Because many parameter sets had to be explored to discover this evolution, distributing the runs across a cluster reduced the time needed to get results. The program written to simulate the orbits is 776 lines of code.
In scientific terms, the simulation revealed a quasi-equilibrium type of solution that allows the structure to be constrained much more accurately. The researchers can now explain the observation of water in crystalline form.
Executing this on VPAC’s cluster to produce the results took weeks. The aim of the multi-site EnFuzion client is to reduce the run time by using multiple clusters. In a reduced trial version of the Pluto-Charon code, execution on grendel took 29 minutes and 34 seconds. Executing the same code on Griffith’s cluster took 35 minutes and 45 seconds. Because of a configuration problem on Monash’s hathor, the run could not be processed there. Using the multi-site EnFuzion client with grendel and r1n01e, the run took only 16 minutes and 41 seconds, as it was able to use both clusters. This result closely matched the expected time for both clusters working together. The expected time for the combined clusters is the inverse of the sum of the inverse times that each cluster takes to complete the whole run on its own, i.e.,

$$T = \frac{1}{\sum_i \frac{1}{t_i}}$$

where $T$ is the expected time and $t_i$ is the time taken by each cluster. In this case, T = 16 minutes and 10 seconds.
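The expected combined time can be checked with a few lines of Python. This is only a sketch of the arithmetic described above; the function names `combined_time` and `fmt` are illustrative and are not part of EnFuzion or of the Pluto-Charon code.

```python
def combined_time(times_s):
    """Expected wall time (seconds) when clusters share one run.

    times_s holds the time each cluster needs to finish the whole
    run on its own; the result is the inverse of the summed inverses.
    """
    return 1.0 / sum(1.0 / t for t in times_s)


def fmt(seconds):
    """Format a duration as whole minutes and seconds (truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} s"


grendel = 29 * 60 + 34    # 29 min 34 s on grendel
griffith = 35 * 60 + 45   # 35 min 45 s on Griffith's cluster
measured = 16 * 60 + 41   # measured multi-site run

expected = combined_time([grendel, griffith])
print("expected:", fmt(expected))               # expected: 16 min 10 s
print("measured:", fmt(measured))               # measured: 16 min 41 s
print(f"overhead: {measured - expected:.0f} s")
```

The measured run is about half a minute slower than the ideal figure, which is consistent with the formula ignoring any cost of coordinating two sites.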
In principle, if the original project were run on two clusters of roughly equal power, the results would be produced in about half the time. Unfortunately, the original simulation had only a few long-running jobs that could not be balanced over multiple clusters, because there were not enough jobs to distribute to all the nodes. To use the clusters effectively, there would have to be at least as many jobs as there are nodes across all clusters. The multi-site EnFuzion client would have used only one cluster, as that cluster alone could have accommodated all the jobs. If the program were broken into further parameter sets (whether this is possible depends on the program), thus creating more jobs, the multi-site EnFuzion client would be effective.
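The effect of job count on multi-cluster runs can be illustrated with a toy scheduling model. The sketch below is not how EnFuzion dispatches work: it assumes identical nodes that each run one job at a time, and the job durations and node counts are hypothetical numbers chosen only for illustration.

```python
import heapq


def wall_time(job_durations, total_nodes):
    """Wall time under greedy scheduling: longest job first, each job
    assigned to the node that becomes free earliest."""
    free_at = [0.0] * total_nodes          # time at which each node is free
    heapq.heapify(free_at)
    for duration in sorted(job_durations, reverse=True):
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)


# Hypothetical workloads with the same total work (40 hours).
few_long_jobs = [10.0] * 4      # 4 jobs of 10 hours each
many_short_jobs = [1.0] * 40    # 40 jobs of 1 hour each

one_cluster, two_clusters = 8, 16   # hypothetical total node counts

for name, jobs in [("few long", few_long_jobs), ("many short", many_short_jobs)]:
    t1 = wall_time(jobs, one_cluster)
    t2 = wall_time(jobs, two_clusters)
    print(f"{name}: {t1} h on one cluster, {t2} h on two")
```

With four long jobs, the second cluster changes nothing, because one cluster already has a node for every job. Once the same work is split into forty jobs, the extra nodes shorten the run.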
The case study was successful in reducing the run time of the trial application by using multiple clusters.