I got an NVIDIA RTX 2080 Super a few months ago. It's a great piece of hardware and up for anything I can throw at it, which so far includes Metro Exodus, Half-Life: Alyx, and more. But out of the box it was 15% less performant than it currently is when reporting maximum utilization. With a bit of debugging and a few small changes to the system, I've managed to reclaim that performance.

This post focuses on finding and addressing bottlenecks affecting GPU compute, but graphics processing can be slowed by many components: a slow CPU can prevent a GPU from running at maximum speed by failing to provide it with work quickly enough, and a machine learning task that requires large amounts of data transfer may be limited elsewhere, such as by GPU memory bandwidth, disk, or network activity. A good rule of thumb is to check that GPU utilization is being reported as nearly 100% while other components are not at their maximums.

Identifying the potential for more performance

I started investigating my GPU's performance after two observations. The first was that latency-sensitive VR games would sometimes stutter or jerk before becoming smooth again, with brief large spikes in frame latency (going from sub-6ms times up to 15-18ms for brief fractions of a second); the second was that when running at maximum utilization, my GPU temperature was pinned at 86 °C with the GPU fans running at full speed.

Now, a bit of frame drop in a demanding game could maybe be expected, new GPU or not. And it's hard to find good information about what qualifies as "high" temperatures for a GPU, and what the effects of running at high temperatures are. Still, 86 °C is warm, and since my case is a Fractal Node 202, an extremely compact mini-ITX case that clocks in at 10.2L, cooling was at the top of my mind.

I started to learn about what happens to a GPU as it reaches its thermal maximums. It turns out that what an NVIDIA GPU will do in order to stay cool is reduce the clock frequency of its streaming multiprocessor (SM) units, which contain the CUDA cores, resulting in a decrease in performance that is proportional to the decrease in frequency for tasks running on those cores.

The sign of a throttled GPU is an SM frequency that is uneven - full-power GPUs maintain a stable clock frequency. On Windows, third-party programs like GPU-Z can help you detect this by showing a graph of GPU frequency over time. On Linux, the job is somewhat more difficult: you can run nvidia-smi -q -d CLOCK to ask for the GPU frequency, but you must run it repeatedly to see whether the clock frequency is changing.

The SM Clock plot showed clear signs of throttling - spiking constantly between 1770MHz and 1690MHz, and even dropping to 1650MHz for a sustained window. The reference RTX 2080 Super has a base clock of 1650MHz and a boost clock of 1815MHz, so these would seem to be good speeds, but the instability in the frequency meant something was wrong. Throttling confirmed!

For those of us on Linux and without datacenter-style monitoring, though, there's an easier way: nvidia-smi -q -d PERFORMANCE. This is the best list of active throttles I've seen, and when I was investigating, it clearly and consistently showed SW Thermal Slowdown - my GPU was too hot. Not hot enough to trigger the emergency brake that is a hardware slowdown, but hot enough to affect performance.

It was at this point that I learned something lucky: I had made a dumb mistake in my build and forgotten that the Fractal Node 202 has space for two case fans beneath the GPU.

Improving Fan Control

I could add two Corsair ML120 Pro Blue 120mm fans as case fans easily enough. Fans at the bottom of a vertically-oriented case pull in cool air near the ground into the GPU compartment. These are meant to be static pressure fans, pulling cool air in from outside, with the resulting hot air vented out by the CPU fan. Watch for dust build-up on these intakes!
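The repeated nvidia-smi polling described in the post can be scripted instead of eyeballed. Here's a minimal sketch, assuming `nvidia-smi` is on the PATH and supports the `--query-gpu=clocks.sm` form; the function names and the 30MHz spread threshold are my own illustrative choices, not from the original post:

```python
import subprocess
import time


def clocks_look_throttled(samples_mhz, max_spread_mhz=30):
    """Heuristic: a GPU at full power holds a stable SM clock, so a wide
    spread between the observed min and max suggests throttling."""
    return max(samples_mhz) - min(samples_mhz) > max_spread_mhz


def sm_clock_samples(count=30, interval_s=1.0):
    """Poll the current SM clock (in MHz) via nvidia-smi, once per interval."""
    samples = []
    for _ in range(count):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=clocks.sm",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        samples.append(int(out.strip().splitlines()[0]))
        time.sleep(interval_s)
    return samples


if __name__ == "__main__":
    samples = sm_clock_samples()
    print("SM clock spread:", max(samples) - min(samples), "MHz")
    print("throttling suspected:", clocks_look_throttled(samples))
```

On the clocks described above, a window spanning 1770MHz down to 1650MHz would be flagged, while a steady boost clock would not.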
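The throttle-reason report can also be checked from a script rather than read by hand. A hedged sketch that scans `nvidia-smi -q -d PERFORMANCE` output for reasons marked Active; note that the exact label text (e.g. "SW Thermal Slowdown") and report layout can vary between driver versions, and `active_throttles` is an illustrative helper name of my own:

```python
import subprocess


def active_throttles(report):
    """Return throttle-reason labels marked Active in an
    `nvidia-smi -q -d PERFORMANCE` text report."""
    active = []
    for line in report.splitlines():
        if ":" not in line:
            continue  # section headers like "Clocks Throttle Reasons"
        label, _, state = line.partition(":")
        # "Not Active" entries don't match, since we compare the whole value
        if state.strip() == "Active":
            active.append(label.strip())
    return active


def query_performance_report():
    """Ask nvidia-smi for the performance/throttle state of the GPUs."""
    return subprocess.run(
        ["nvidia-smi", "-q", "-d", "PERFORMANCE"],
        capture_output=True, text=True, check=True,
    ).stdout


if __name__ == "__main__":
    print("active throttles:", active_throttles(query_performance_report()))
```

On a GPU in the state the post describes, this would report SW Thermal Slowdown as the active throttle.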