Grid computing is a powerful form of distributed computing wherein a network of loosely coupled and geographically separated computers, typically of different computational powers, work together to perform data-intensive calculations. The technology uses numerical simulations to help investigate a variety of challenging scientific problems, including the subatomic world revealed by particle accelerators like the Large Hadron Collider.
The current trend in the industry is to build grid-computing systems that can run not just one but multiple large-scale numerical simulations. While most systems can guarantee good performance, this usually comes at significant cost and with large storage requirements. To optimize the cost–performance relationship of large-scale grid-computing systems, scientists must overcome several issues, one of which is the efficient allocation of computational resources, known as scheduling.
Rubing Duan and Xiaorong Li at the A*STAR Institute of High Performance Computing in Singapore and co-workers have now developed a scheme to address the scheduling problem in two large-scale applications: the ASTRO program from the field of cosmology, which simulates the movements and interactions of galaxy clusters, and the WIEN2k program from the field of theoretical chemistry, which calculates the electronic structure of solids [1]. The researchers’ new scheme relies on three game-theory-based scheduling algorithms: one to minimize the execution time; one to reduce the economic cost; and one to limit the storage requirements.
In their calculations, the researchers stopped the competition for resources once the iteration reached the upper limit of optimization. They compared their simulation results with those from related algorithms: Minimum Execution Time, Minimum Completion Time, Opportunistic Load Balancing, Max-min, Min-min and Sufferage. The new approach showed improvements in terms of speed, cost, scheduling results and fairness. Furthermore, the researchers found that the execution time improved as the scale of the experiment increased. In one case, their approach delivered results within 0.3 seconds while other algorithms needed several hours.
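To give a feel for the kind of baseline heuristics named above, the following is a minimal sketch of two of them, Minimum Completion Time and Min-min, in their common textbook form. It is not the researchers' code and not their game-theory-based scheme. The function names, the `etc` matrix (expected time for each task on each machine) and the sample numbers are all assumptions made for illustration.

```python
def mct(etc):
    """Minimum Completion Time: take tasks in arrival order and give each one
    to the machine that would finish it earliest.

    etc[t][m] is the (assumed) expected run time of task t on machine m.
    Returns (assignment, makespan), where assignment[t] is a machine index.
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine becomes free
    assignment = []
    for row in etc:
        m = min(range(n_machines), key=lambda j: ready[j] + row[j])
        ready[m] += row[m]
        assignment.append(m)
    return assignment, max(ready)


def min_min(etc):
    """Min-min: repeatedly pick the unscheduled task whose best achievable
    completion time is smallest, and place it on the machine that achieves it.
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines
    assignment = [None] * len(etc)
    unscheduled = set(range(len(etc)))
    while unscheduled:
        best = None                     # (completion time, task, machine)
        for t in sorted(unscheduled):
            m = min(range(n_machines), key=lambda j: ready[j] + etc[t][j])
            completion = ready[m] + etc[t][m]
            if best is None or completion < best[0]:
                best = (completion, t, m)
        completion, t, m = best
        ready[m] = completion
        assignment[t] = m
        unscheduled.remove(t)
    return assignment, max(ready)


if __name__ == "__main__":
    # Hypothetical workload: three tasks, two machines of different speeds.
    sample_etc = [[4, 6], [3, 5], [8, 2]]
    print("MCT:    ", mct(sample_etc))
    print("Min-min:", min_min(sample_etc))
```

Both heuristics optimize a single metric, the completion time. The scheme described in this article instead addresses execution time, economic cost and storage requirements.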
Nonetheless, the researchers note that their algorithms may not be suited for applications that are highly heterogeneous.
Prior to this study, only a handful of studies had considered the optimization of multiple metrics. However, Duan, Li and colleagues were able to present a model approach to reducing the execution time, economic cost and storage requirements in grid-computing systems, especially when used in large-scale applications.
“Our game-theory-based scheduling algorithms possess great potential for large-scale applications,” says Duan. “We are looking into how the algorithms adapt to other metrics, such as memory, security, resource availability, network bandwidth and multiple virtual organizations.”
The A*STAR-affiliated researchers contributing to this research are from the Institute of High Performance Computing.