Modeling the Power Consumption of Computer Systems with Graphics Processing Units
Project Description
The power consumption of computer systems has become an important concern, especially in large data centers that house thousands of computers. In order to enable energy-efficient scheduling and operation of these machines, a variety of models have been devised that allow reasonably accurate prediction of these systems’ total power consumption at any given time based on metrics that are available in software [1, 2, 3, 4, 5, 6, 7, 10].
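Models of this kind are typically linear combinations of software-visible utilization metrics plus an idle-power constant. The sketch below illustrates the general form; the metric names and coefficient values are purely illustrative, not taken from any of the cited models:

```python
def predict_power(coeffs, metrics):
    """Predict total system power (watts) as an idle-power constant
    plus a weighted sum of software-visible utilization metrics."""
    return coeffs["idle"] + sum(
        coeffs[name] * value for name, value in metrics.items()
    )

# Example: a machine idling at 80 W whose power rises with CPU,
# memory, and disk activity (all coefficients are made up).
coeffs = {"idle": 80.0, "cpu_util": 0.5, "mem_bw": 0.25, "disk_io": 0.125}
metrics = {"cpu_util": 50.0, "mem_bw": 20.0, "disk_io": 16.0}
print(predict_power(coeffs, metrics))  # 80 + 25 + 5 + 2 = 112.0
```

The coefficients are obtained by calibration: running workloads of known intensity while measuring actual power, then fitting the model to the measurements.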
However, all of these power models assume that the major consumers of dynamic power in the system are the central processing unit (CPU), memory, and disk. One major hardware trend threatens to violate this assumption: the increased use of graphics processing units (GPUs) for general-purpose computing. GPUs perform much better than CPUs at certain specialized types of computation, but their power consumption is also high. In addition, while CPUs’ energy efficiency has improved significantly in recent years, GPUs have not received the same aggressive energy-efficiency optimization. Therefore, models that omit the GPU may be missing the major consumer of dynamic power in the system.
We intend to make the Mantis power modeling software [9] GPU-aware. That is, we intend to incorporate model parameters that reflect the activity of the GPU, and we will calibrate and evaluate the model using workloads that stress the GPU. We expect this approach to power modeling to yield more accurate results for GPU-intensive workloads.
Hypotheses to Investigate and Methods of Investigation
Performance of Current Modeling Techniques
Our first hypothesis is that full-system power models that are not GPU-aware will be unable to accurately predict the system’s power consumption when running GPU-intensive workloads; that is, their accuracy will degrade significantly on such workloads.
To test this hypothesis, we will use the Mantis power modeling software to develop traditional CPU-, memory-, and disk-based full-system power models for two computer systems with different graphics processors. We will run a set of workloads on each system, including a GPU-intensive workload, and measure full-system power while these workloads run. We will then compare each model’s predictions with the power measurements to see whether accuracy degrades for the GPU-intensive workload.
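One common way to quantify this kind of accuracy degradation is mean absolute percentage error (MAPE) between predicted and measured power. The sketch below is illustrative; the traces are hypothetical numbers, not measurements:

```python
def mape(predicted, measured):
    """Mean absolute percentage error over paired power samples."""
    return 100.0 * sum(
        abs(p - m) / m for p, m in zip(predicted, measured)
    ) / len(measured)

# Hypothetical traces: a non-GPU-aware model tracks a CPU-bound
# workload well but underpredicts during a GPU-intensive phase.
cpu_bound = mape([100, 110, 120], [100, 110, 120])
gpu_bound = mape([100, 100, 100], [150, 160, 150])
print(cpu_bound, round(gpu_bound, 1))  # 0.0 34.7
```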
Creating a GPU-Aware Power Model
Our second hypothesis is that we will be able to improve the accuracy of these power models by incorporating the GPU’s hardware performance counters as model parameters.
To test this hypothesis, we will modify the model calibration suite to stress the GPU, and we will collect GPU performance counter data while running this calibration suite. We will develop new models based on these data and then repeat the evaluation from the previous section using the new models.
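The calibration step amounts to fitting model coefficients to measured power by least squares. The sketch below shows the single-predictor, closed-form case; the counter name and sample values are hypothetical, and a real calibration would fit many counters jointly:

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y ≈ a + b*x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Hypothetical calibration run: a GPU utilization counter sampled
# while the calibration suite sweeps the GPU from idle to busy,
# paired with measured full-system power (made-up values).
gpu_util = [0, 25, 50, 75, 100]
power_w = [100, 125, 150, 175, 200]
a, b = fit_linear(gpu_util, power_w)
print(a, b)  # 100.0 1.0 for this exactly linear data
```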
Further Questions to Address
We will also attempt to address questions that arise during the research, such as:
- Which GPU performance counters correlate best with each other and with overall power consumption? Do these results hold for all GPU-intensive workloads, or does one set of counters work well for traditional graphics programs and another for general-purpose computation?
- What is the best way to stress the GPU in order to calibrate our models?
- How different are the results across the two computer systems under test? Are they different only in the values of model coefficients, or do they incorporate totally different parameters?
- How do our results contrast with the first published study on high-level models of GPU power [8]? This study modeled the GPU’s DC power rather than the full-system power, and it modeled only one GPU. We will see whether the metrics that best correlate to the GPU power are also best for modeling its contribution to the full-system power.
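The counter-correlation question above could be explored with pairwise Pearson correlations between counter traces and measured power. A minimal sketch, with hypothetical counter names and made-up sample values:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two sample sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical traces: one counter rises with measured power,
# another falls as power rises.
sm_active = [10, 20, 30, 40]
mem_reads = [40, 30, 20, 10]
power_w = [105, 110, 115, 120]
print(pearson(sm_active, power_w), pearson(mem_reads, power_w))  # 1.0 -1.0
```

Counters whose correlation with power stays high across both graphics and general-purpose workloads would be the strongest candidates for model parameters.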
References
[1] W. L. Bircher and L. K. John. Complete system power estimation: A trickle-down approach based on performance events. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Apr. 2007.
[2] G. Contreras and M. Martonosi. Power prediction for Intel XScale processors using performance monitoring unit events. In Proceedings of the International Symposium on Low-Power Electronics and Design (ISLPED), Aug. 2005.
[3] X. Fan, W.-D. Weber, and L. A. Barroso. Power provisioning for a warehouse-sized computer. In Proceedings of the International Symposium on Computer Architecture (ISCA), June 2007.
[4] T. Heath, B. Diniz, et al. Energy conservation in heterogeneous server clusters. In Proceedings of the 10th Symposium on Principles and Practice of Parallel Programming (PPoPP), June 2005.
[5] C. Isci and M. Martonosi. Runtime power monitoring in high-end processors: Methodology and empirical data. In Proceedings of the 36th International Symposium on Microarchitecture (MICRO-36), Dec. 2003.
[6] A. Lewis, S. Ghosh, and N.-F. Tzeng. Run-time energy consumption estimation based on workload in server systems. In Proceedings of the 1st Workshop on Power-Aware Computing and Systems (HotPower), Dec. 2008.
[7] T. Li and L. K. John. Run-time modeling and estimation of operating system power consumption. In Proceedings of SIGMETRICS, June 2003.
[8] X. Ma, M. Dong, et al. Statistical power consumption and modeling for GPU-based computing. In Proceedings of the Workshop on Power-Aware Computing and Systems (HotPower), Oct. 2009.
[9] S. Rivoire. Models and metrics for energy-efficient computer systems. PhD thesis, Stanford University, 2008.
[10] S. Rivoire, P. Ranganathan, and C. Kozyrakis. A comparison of high-level full-system power models. In Proceedings of the Workshop on Power-Aware Computing and Systems (HotPower), Dec. 2008.