In this article we'll explore what a watchdog is, how it works, how software watchdogs differ from hardware watchdogs, and how to set one up in various mining clients.
A watchdog is a utility that monitors your hardware and software operation. Let's see how watchdogs are utilized in mining.
With software watchdogs, the monitoring is done by a process running in the background on the system, which periodically checks whether your mining client is performing well, whether the GPUs are responsive, and so on. When a failure is detected, an action is taken, most often a restart or forced exit of the client. The drawbacks of this approach become apparent when the mining client crashes so hard that the watchdog process crashes with it, or when the system itself experiences a hardware crash or hangs. In those cases the software watchdog cannot perform any action.
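To make the idea concrete, here is a minimal Python sketch of the timing logic a software watchdog might use. It is not taken from any particular mining client: the class name `SoftwareWatchdog`, its fields, and the injected `clock` are all hypothetical, chosen only to illustrate "periodically check, act on timeout".

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SoftwareWatchdog:
    """Hypothetical software watchdog; names and behavior are illustrative."""
    timeout_s: float                       # how long the miner may go without a passing check
    on_failure: Callable[[], None]         # e.g. restart or force-exit the client
    clock: Callable[[], float] = time.monotonic
    last_ok: Optional[float] = None

    def report_healthy(self) -> None:
        """Call whenever a health check (hashrate, GPU response) passes."""
        self.last_ok = self.clock()

    def poll(self) -> bool:
        """Run one periodic check; return True if the failure action fired."""
        if self.last_ok is None:
            self.last_ok = self.clock()    # start timing on the first poll
            return False
        if self.clock() - self.last_ok > self.timeout_s:
            self.on_failure()
            self.last_ok = self.clock()    # give the restarted client a fresh window
            return True
        return False
```

Because `poll()` runs inside an ordinary process, it stops being called if that process or the whole system dies, which is exactly the limitation described above.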
With hardware watchdogs, the monitoring is done by an external device that periodically sends requests to the machine. Most often it is a USB device that sends a signal to the system's kernel. When it gets a reply, the watchdog "knows" that the system is working and resets its internal timer. If the request times out, the watchdog uses a simple electrical connection to the Reset Switch pins on the motherboard to force a reset of the system. The drawback of this approach becomes apparent when the mining client crashes "lightly" or gets stuck, reporting that it is mining when it actually is not. The system still responds to the hardware watchdog, so no action is taken, but effectively no mining is done.
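The two approaches cover different failure modes, which can be summarized in a small Python model. This is a simplified sketch under stated assumptions: `RigState` and both functions are hypothetical names, and the model ignores the case where the software watchdog process itself crashes together with the miner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RigState:
    os_responsive: bool   # the kernel still answers the external device
    miner_hashing: bool   # the mining client is actually doing work


def hardware_watchdog_resets(state: RigState) -> bool:
    """A hardware watchdog only sees whether the system replies in time."""
    return not state.os_responsive


def software_watchdog_can_act(state: RigState) -> bool:
    """A software watchdog needs a running OS to notice a stalled miner."""
    return state.os_responsive and not state.miner_hashing


scenarios = {
    "healthy rig": RigState(os_responsive=True, miner_hashing=True),
    "miner stuck, OS fine": RigState(os_responsive=True, miner_hashing=False),
    "system frozen": RigState(os_responsive=False, miner_hashing=False),
}

for name, state in scenarios.items():
    print(name, hardware_watchdog_resets(state), software_watchdog_can_act(state))
```

In this model only the software watchdog reacts to a stuck miner on a healthy OS, and only the hardware watchdog reacts to a frozen system, which is why the two are often used together.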
Configuring software watchdogs takes us to the advanced configuration editor instead of the simple one. However, using it isn't as hard as it may seem; just make sure to enter the options as listed.
By default, PhoenixMiner has watchdog enabled. To disable it, add option
When the watchdog is enabled, you can control its timer with the option
-wdtimeout 30, where 30 is the number of seconds before the watchdog times out. The acceptable range is 30 to 300, and the default is 45.
You can also control the action that is performed when the watchdog timeout is triggered. The option for this is
-rmode 0: No restart, the miner shuts down. Notice that minerstat will detect the miner crash and restart it.
-rmode 1: The default option, the miner gets restarted with the same command line options.
-rmode 2: The miner shuts down and reboots the system.
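As a rough illustration of how these two options combine, the following Python sketch assembles the watchdog part of a PhoenixMiner command line and rejects values outside the ranges described above. The helper name `phoenix_watchdog_args` is hypothetical; only `-wdtimeout`, `-rmode`, and their documented values come from this article.

```python
from typing import List

VALID_RMODES = (0, 1, 2)  # 0 = shut down miner, 1 = restart miner, 2 = reboot system


def phoenix_watchdog_args(timeout_s: int = 45, rmode: int = 1) -> List[str]:
    """Build the -wdtimeout and -rmode arguments (hypothetical helper)."""
    if not 30 <= timeout_s <= 300:
        raise ValueError("-wdtimeout must be between 30 and 300 seconds")
    if rmode not in VALID_RMODES:
        raise ValueError("-rmode must be 0, 1, or 2")
    return ["-wdtimeout", str(timeout_s), "-rmode", str(rmode)]


# Reboot the system after a 90-second watchdog timeout.
print(" ".join(phoenix_watchdog_args(timeout_s=90, rmode=2)))
```

The resulting string would be appended to the rest of the miner's options in the advanced configuration editor.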
Below is a PhoenixMiner configuration example with the watchdog explicitly enabled and an automatic reboot after a 90-second timeout:
By default, T-Rex has watchdog disabled. To enable it, change option
"no-watchdog":true to be
e to exit the miner,
r to reboot the system,
s to shut down the system completely.
Here are a few examples:
Here's the config of the miner with the last example used.
Make sure not to remove any comma or bracket by accident when using the advanced config.
By default, TeamRedMiner has watchdog enabled. To disable it, add option
There are several watchdog options available in TeamRedMiner:
--no_gpu_monitor: Disables the miner's internal monitoring of GPU temperature and fan speed.
--temp_limit=TEMP: Sets the temperature at which the GPUs are considered too hot and stop mining. Default is 85C (Celsius). Make sure to always set the resume temp (listed below) to configure this correctly.
--temp_resume=TEMP: Sets the temperature at which the GPUs are considered cold enough to resume mining. The default is 99C (Celsius), which effectively disables the start-stop behavior.
--watchdog_script=X: Configures the GPU watchdog to shut down the miner, run the specified script, and exit immediately. The default script is watchdog.bat/watchdog.sh in the current directory, but a different script can be provided as an optional argument, potentially with an absolute or relative path as well.
--watchdog_test: Tests the configured watchdog script by triggering the same action as a dead GPU after ~20 secs of mining.
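The interplay between the temperature limit and the resume temperature listed above is a form of hysteresis. The Python sketch below shows one way such a start-stop rule could work. It is an assumption for illustration, not TeamRedMiner's actual implementation: the function name `next_gpu_state` and the exact comparisons (at or above the limit stops, at or below the resume value restarts) are hypothetical.

```python
def next_gpu_state(mining: bool, temp_c: float,
                   temp_limit: float, temp_resume: float) -> bool:
    """Return True if the GPU should be mining after this temperature reading."""
    if mining and temp_c >= temp_limit:
        return False          # too hot: stop mining on this GPU
    if not mining and temp_c <= temp_resume:
        return True           # cooled down enough: resume mining
    return mining             # otherwise keep the current state


# With a resume temperature well below the limit, a stopped GPU waits to cool down.
state = True
for reading in (80, 86, 70, 59):
    state = next_gpu_state(state, reading, temp_limit=85, temp_resume=60)
    print(reading, state)
```

Under this rule, a resume temperature higher than the limit lets a stopped GPU resume on the very next reading, which is why setting both values together matters.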
An option specific to mining Ethash-based coins, e.g. Ethereum (ETH), is
--eth_hashwatch=N,M, where N and M are hashrate values in MH/s that define a range. When the hashrate of any GPU falls outside the specified range, the watchdog is triggered. You can enter -1 for either value to leave that side unlimited. For example, use
--eth_hashwatch=-1,1000 so that the watchdog triggers only when the hashrate is above 1000MH/s, but not when it is as low as 0.01MH/s.
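The range rule can be expressed in a few lines of Python. This is only a sketch of the behavior described above: the function names `parse_hashwatch` and `hashwatch_triggered` are hypothetical, and whether the miner treats the boundaries as inclusive is an assumption.

```python
from typing import Tuple

UNLIMITED = -1  # value described above for "no bound on this side"


def parse_hashwatch(value: str) -> Tuple[float, float]:
    """Split an 'N,M' option value into (low, high) in MH/s."""
    low_text, high_text = value.split(",")
    return float(low_text), float(high_text)


def hashwatch_triggered(hashrate_mhs: float, low: float, high: float) -> bool:
    """Return True when the hashrate falls outside the allowed range."""
    below = low != UNLIMITED and hashrate_mhs < low
    above = high != UNLIMITED and hashrate_mhs > high
    return below or above


low, high = parse_hashwatch("-1,1000")
print(hashwatch_triggered(0.01, low, high))    # lower bound is unlimited
print(hashwatch_triggered(1000.5, low, high))  # above the upper bound
```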
Here's a config example with hashrate watcher set for 20-35MH/s and no temperature or fan speed monitoring enabled:
By default, lolMiner comes with watchdog in "script" mode which exits the miner and runs the file inside the miner's default directory
emergency.bat depending on the platform. However, as the file is empty, the watchdog effectively puts the miner in "exit" mode. You can also set the mode explicitly:
"WATCHDOG": "off": This will do nothing except print a message. If only a single card crashed and not the whole driver, the other cards will continue mining.
"WATCHDOG": "exit": This will close the miner with an exit code of 42. The miner will automatically restart after a pause of a few seconds, as minerstat detects its crash. This is the recommended and default option.
Example of config with watchdog set to exit miner when a GPU is detected as "lost":
By default, GMiner comes with watchdog on. Note that in GMiner the watchdog also works in cases where the network connection to the pool is lost.
--watchdog 1 or, in short form,
-w 1 - enables or disables the watchdog; the default value is 1 (enabled). You can set this option to
--watchdog 0 to disable the watchdog.
--watchdog_restart_delay 10 - miner restart delay for the watchdog in seconds; the default value is 10 seconds.
Example of config with watchdog set to restart the miner after 5 seconds of watchdog timeout:
By default, NBMiner comes with watchdog process enabled. You can disable it by adding option
Example of NBMiner configured with watchdog disabled: