A good while back I put up a post about new hardware I had acquired. I have been happy with said hardware, with one small glitch: from time to rare time the system would hang. Nothing indicated a failure and there were no errors in the system logs, just a deadlocked system that required a physical reboot. After a long time of searching I have finally found the cause: heat from my three Asus ENGTX465 video cards.
I monitored the card temps, so I was under the impression they were OK. I even had a bash script tied in to conky to output the GPU and card temps. I saw some decently high temps from time to time, but nothing close to the listed thermal specifications. I assumed the cards' cooling hardware was doing what it should be doing… shame on me.
While working on some tweaks and cable cleanup inside the case, I noticed the cards' frames were extremely hot. I mean so hot I was unable to touch them. Heat like that is bad for most hardware, to the point of maybe having to replace parts. I got to digging around and found that one can check fan speeds using nvidia-settings or nvidia-smi. I tied this in to the information I was already pulling on temps and found the cards were running the fans at 40% regardless of the GPU or card temperatures. Of course I set out to correct this problem.
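For anyone who wants to run the same check, here is a rough sketch of pulling temperature and fan speed together. It is written in Python rather than the bash of my gpu-temp.sh, and it is not that script. It assumes a driver whose nvidia-smi supports the `--query-gpu` fields `temperature.gpu` and `fan.speed`; older drivers or some cards may report a value such as `[N/A]` instead, which the parser turns into `None`. The function names are made up for the example.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV line per GPU: index, temperature (C), fan speed (%).
QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,fan.speed",
    "--format=csv,noheader,nounits",
]


def parse_gpu_status(csv_text):
    """Turn 'index, temp, fan' lines into a list of dicts.

    Fields that are not plain numbers (for example '[N/A]') become None.
    """
    readings = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # skip anything that does not look like a data row
        index, temp, fan = fields
        readings.append({
            "index": int(index),
            "temp_c": int(temp) if temp.isdigit() else None,
            "fan_pct": int(fan) if fan.isdigit() else None,
        })
    return readings


def read_gpu_status():
    """Run nvidia-smi and parse its output (needs the NVIDIA driver installed)."""
    result = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    return parse_gpu_status(result.stdout)


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on this machine")
    else:
        for gpu in read_gpu_status():
            print(f"GPU {gpu['index']}: {gpu['temp_c']} C, fan {gpu['fan_pct']} %")
```

A fan column that stays at 40 while the temperature column climbs is exactly the pattern described above.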
I have been working to modify my gpu-temp.sh script to also manage the fan speeds based on temperature. There are already some variants on the web, but every one of those I have come across uses one or two fixed fan speeds all the time. I figure this can be done much more elegantly than low, medium and high, so I set out to have a dynamically set fan speed based on temps that can climb up or down gradually as needed. I have this working and will post it here once I have completed some final cleanup and tweaks.
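Until the real script is posted, the general idea of a gradual fan curve can be sketched like this. This is an illustration in Python, not my gpu-temp.sh, and the numbers are assumptions: the 40% floor matches what the cards do on their own, while the 50 C and 85 C endpoints and the 5% step limit are placeholders to tune. The `nvidia-settings` attributes shown (`GPUFanControlState`, `GPUTargetFanSpeed`) are what recent drivers use and need the Coolbits option enabled in xorg.conf; older drivers used `GPUCurrentFanSpeed` instead, and the sketch also assumes fan N belongs to GPU N.

```python
def fan_speed_for_temp(temp_c, min_temp=50, max_temp=85, min_fan=40, max_fan=100):
    """Map a temperature to a fan percentage by linear interpolation.

    Below min_temp the fan stays at min_fan, above max_temp it runs at
    max_fan, and in between it scales proportionally.
    """
    if temp_c <= min_temp:
        return min_fan
    if temp_c >= max_temp:
        return max_fan
    fraction = (temp_c - min_temp) / (max_temp - min_temp)
    return round(min_fan + fraction * (max_fan - min_fan))


def next_fan_speed(current_pct, target_pct, max_step=5):
    """Move from the current speed toward the target by at most max_step."""
    delta = target_pct - current_pct
    delta = max(-max_step, min(max_step, delta))
    return current_pct + delta


def fan_command(gpu_index, fan_pct):
    """Build (but do not run) an nvidia-settings command for one card.

    Assumption: attribute names as used by recent drivers, fan N on GPU N.
    """
    return [
        "nvidia-settings",
        "-a", f"[gpu:{gpu_index}]/GPUFanControlState=1",
        "-a", f"[fan:{gpu_index}]/GPUTargetFanSpeed={fan_pct}",
    ]
```

Calling `fan_speed_for_temp` on each reading and feeding the result through `next_fan_speed` on every polling pass makes the fan ramp in small steps rather than jumping between a few presets.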
The whole reason for this post is to note that the cards require help to stay cool. Even under Windows they happily tick away at 40% fan speed and get hot enough to cook lunch on, so this is not a Linux thing but, I believe, an Asus thing. Under Windows I have installed their Smart Doctor utility, which controls the fan speeds. It has an auto setting that does pretty much nothing and leaves the fans at 40%. It does, however, have a setting that lets you assign a temperature to each fan speed level, starting with High and working your way down. Sadly this is less dynamic than my script and yields higher temps and more dramatic fan speed changes.