In a run consuming 214 hours on 120 processors of NCSA's new 128-processor Origin 2000 system, researchers from the University of Minnesota's Laboratory for Computational Science and Engineering (LCSE) simulated the convection process in an entire rotating model star. The run involved over 18,000 time steps, advancing 57 million active computational cells out of a grid of 169 million cells overall. This astrophysical gas dynamics simulation was extremely computationally intensive, requiring over 3.5 million billion floating-point operations (3.5 × 10^15 in total). It was data intensive as well, generating over 2 terabytes of archived information. The data were captured on 8 Ampex D-2 tape cartridges for transport to the LCSE for later analysis and visualization.
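To put these figures in perspective, the short calculation below derives rough sustained rates from the numbers quoted above. It is only a back-of-the-envelope sketch: it assumes the 214 hours are wall-clock time spent entirely on computation, and because the release gives the step and operation counts as lower bounds ("over"), the results are approximate. The variable names are illustrative.

```python
# Figures quoted in the release (operation and step counts are lower bounds).
TOTAL_OPERATIONS = 3.5e15   # floating-point operations
RUN_HOURS = 214             # assumed to be wall-clock hours
PROCESSORS = 120
TIME_STEPS = 18_000
ACTIVE_CELLS = 57e6

run_seconds = RUN_HOURS * 3600

# Sustained rate for the whole machine and per processor.
aggregate_rate = TOTAL_OPERATIONS / run_seconds
per_processor_rate = aggregate_rate / PROCESSORS

# Average cost of advancing one active cell by one time step.
cell_updates = TIME_STEPS * ACTIVE_CELLS
operations_per_cell_update = TOTAL_OPERATIONS / cell_updates

print(f"aggregate: {aggregate_rate / 1e9:.2f} billion operations per second")
print(f"per processor: {per_processor_rate / 1e6:.1f} million operations per second")
print(f"per cell update: {operations_per_cell_update:.0f} operations")
```

Under these assumptions the run sustained roughly 4.5 billion operations per second overall, or about 38 million per processor, and spent on the order of 3,400 operations on each cell update.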
"This is the beginning of a whole new class of simulation that we'll be able to do," says Dr. David Porter of the Department of Astronomy and the LCSE at the University of Minnesota. "Until now we've only been able to model small patches near the surface of a star. Here we're modeling the whole star, and a rotating star no less."
The Origin 2000 system at NCSA consists of two 64-processor systems interconnected by 100-megabyte-per-second HiPPI links. NCSA plans to double this system to 256 processors within the year and to upgrade the interconnection network to the 800-megabyte-per-second CrayLink.
For this simulation, David Porter, Steven Anderson, and Joe Habermann developed a new version of the LCSE's PPM gas dynamics code that employs a novel strategy for overcoming latency in parallel applications. Latency refers to the time it takes to retrieve data from memory. In large multiprocessor computing systems such as the Origin 2000, each group of processors sharing a common memory simultaneously advances a different portion of the problem. Attacking the problem in parallel this way is how the computer attains speed and efficiency. Latency becomes an issue when the processor groups (two groups of 64 processors each at NCSA) must retrieve information from other processor groups before their calculations can continue. This communication between the processor groups introduces latency.
In the LCSE team's new PPM code, latency is prevented from slowing down the computation by performing the work in a special order that gives each processor useful calculations to do while necessary data is fetched from other processors. This strategy works increasingly well as the problem size grows. Only information near the surface of each computational grid domain must be sent to other processor groups, while the amount of useful work that can be performed while these messages reach their destinations scales with the much larger interior volume of the grid domain. The LCSE team plans to apply this same latency-tolerance strategy to enable very widely separated computing resources, connected by the new NSF vBNS network, to cooperate on the solution of single, tightly coupled, grand challenge applications like the stellar convection problem.
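The ordering idea described above can be illustrated with a small sketch. This is not the LCSE team's PPM code: it is a toy one-dimensional example in which a 3-point average stands in for the gas dynamics update, a background thread with an artificial delay stands in for the fetch from another processor group, and every function name is hypothetical. The last function shows why the approach is expected to improve with problem size, by computing what fraction of a cubic domain's cells lie on its surface.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch_ghost_cells(left_neighbor, right_neighbor, delay=0.05):
    """Stand-in for a slow remote fetch; `delay` imitates network latency."""
    time.sleep(delay)
    return left_neighbor[-1], right_neighbor[0]


def smooth(left, center, right):
    """Toy 3-point average, used here in place of a real gas dynamics update."""
    return (left + center + right) / 3.0


def update_domain(cells, left_neighbor, right_neighbor):
    """Advance one 1-D domain (at least 2 cells) by one toy step."""
    n = len(cells)
    new = [0.0] * n
    with ThreadPoolExecutor(max_workers=1) as pool:
        # 1. Request the neighbors' boundary values first.
        pending = pool.submit(fetch_ghost_cells, left_neighbor, right_neighbor)
        # 2. While the request is in flight, update the interior cells,
        #    which need no data from other domains.
        for i in range(1, n - 1):
            new[i] = smooth(cells[i - 1], cells[i], cells[i + 1])
        # 3. Wait only if the boundary values have not arrived yet.
        ghost_left, ghost_right = pending.result()
    # 4. Finish the two edge cells, which depend on the fetched values.
    new[0] = smooth(ghost_left, cells[0], cells[1])
    new[-1] = smooth(cells[-2], cells[-1], ghost_right)
    return new


def surface_fraction(n):
    """Fraction of cells in an n*n*n cube (n >= 2) lying on its outer layer."""
    return (n**3 - (n - 2) ** 3) / n**3


if __name__ == "__main__":
    print(update_domain([1.0, 2.0, 3.0, 4.0], [0.0, 0.0], [5.0, 5.0]))
    for n in (10, 100):
        print(n, surface_fraction(n))
```

In this sketch the surface layer makes up about 49 percent of a 10 × 10 × 10 domain but under 6 percent of a 100 × 100 × 100 domain, so a larger domain offers proportionally more interior work to overlap with each exchange of boundary data.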
The energy produced in thermonuclear reactions at the core of a star is transported to the surface in one of two ways: thermal conduction or convection. In the outer layers of stars like the sun, conduction is inefficient, and hence convection transports the heat. Regions of gas that are heated by conduction deep within the star become buoyant and rise toward the surface, where heat can be radiated directly into space. Gas cooled near the surface becomes denser and sinks. This flow becomes highly turbulent, and the turbulence itself can become involved in transporting heat. Along with its heat, the rising and sinking gas carries angular momentum, so the turbulent convection flow can redistribute angular momentum as well as heat. The sun is known to rotate more slowly near its poles than near its equator. The simulation by the LCSE team at NCSA can help us understand how such differential rotation becomes established in a star.
NCSA, a unit of the University of Illinois at Urbana-Champaign, receives major funding from the National Science Foundation, the Defense Advanced Research Projects Agency, NASA, corporate partners, the State of Illinois, and the University of Illinois. The LCSE, in the University of Minnesota's Institute of Technology, is supported by the NSF, the Department of Energy, NASA, ONR, the LCSE industrial partners, and the University of Minnesota.