# Intro

I was recently asked to provide an automated solution for ethminer, the Ethereum GPU mining worker software, which has been crashing sporadically. It looks like the worker process of this cryptocurrency miner, ethminer.exe, has some kind of latent bug that makes it unstable over time. And even though the software is open source, analyzing its source code seemed to be a much more difficult task than creating a watch process. This blog post describes how such a process was created and how it works.

# The Issues

There are actually several issues with the current version of the ethminer:

1. Unlike paid cryptocurrency mining software, ethminer is free and open source, but it lacks many of the features of paid software, such as a watch utility that can monitor the miner and react to an abnormal hash-rate.
2. ethminer doesn't seem to have the ability to automatically reboot the mining rig if it fails to recover the hash-rate.
3. Finally, ethminer doesn't seem to have a large developer team maintaining it, and thus some bugs remain unpatched. So if the ethminer.exe process crashes, the mining rig currently has no means to recover or to restart the ethminer.exe process.

I will address all of these issues with a small and totally free command line utility that I will call EtherMineSafeRunner.

# The Solution

For this small project (and for simplicity) I decided to write the watch utility in C#. I used the latest version of Visual Studio 2019, Community edition, and made it a classic .NET console application.

The EtherMineSafeRunner accepts the following command line parameters:

Command Line Parameters
EtherMineSafeRunner MinHash MaxHash RebootAfter "path-to\ethminer.exe" miner_command_line

Where:

• MinHash = the minimum allowed hash rate (in Mh). EtherMineSafeRunner will watch ethminer's output, and if ethminer doesn't report a hash rate higher than this value within 2 minutes, it will restart the ethminer.exe process.
• MaxHash = the maximum allowed hash rate (in Mh). Or, you can specify 0 not to use the maximum hash rate cap. This parameter is used the same way as MinHash, described above, but as the upper cap on the hash-rate.
• RebootAfter = the number of restarts of the ethminer.exe process after which EtherMineSafeRunner reboots the mining rig. Or, specify 0 not to reboot it.
Note that disabling reboots may leave ethminer stuck in an endless restart loop until you reboot the rig manually!
Note that if you are relying on automatic rebooting of the mining rig, it is advisable to put your EtherMineSafeRunner command into a batch file (with the .bat extension) and then place that batch file into the autorun folder for the Windows user.

To open the Windows autorun folder for the user, go to the following location using Windows Explorer: %AppData%\Microsoft\Windows\Start Menu\Programs\Startup.

I would recommend adding a delay before starting the crypto miner, so that you can cancel it if anything goes wrong after a reboot. You can do it as such:

Batch File (.bat)
@echo off

echo "Will auto-start miner in 20 seconds..."
timeout /t 20

start /d "C:\path\to" start-ETH-miner.bat

You would then place the command line for the EtherMineSafeRunner into the start-ETH-miner.bat file.

• path-to\ethminer.exe = the path where the ethminer.exe image file is located. Make sure to use double-quotes if it contains spaces. Or, you can use just the file name if ethminer.exe is located in the same folder as the EtherMineSafeRunner.exe image.
• miner_command_line = the command line specific to ethminer, as described in its documentation.

So here's an example of how you would run it:

EtherMineSafeRunner 80 100 32 "path-to\ethminer.exe" -P stratum://0xETH_PUB_KEY:x@us2.ethermine.org:14444

This will start ethminer with the following parameters:

-P stratum://0xETH_PUB_KEY:x@us2.ethermine.org:14444

Where you would obviously put your own Ethereum key instead of 0xETH_PUB_KEY.

The first command line will also instruct EtherMineSafeRunner to check that ethminer reports a hash-rate in the range from 80 Mh to 100 Mh, and to restart the ethminer.exe process if the hash-rate stays outside of that range for the last 2 minutes. It will also reboot the rig after 32 attempts to restart the ethminer.exe process. A reboot may clear any issues with your overclocked GPUs or their driver.
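The parameter layout described above could be parsed roughly as follows. This is only a minimal sketch and not the tool's actual parser: the CmdLineParser class and the fMinHash, fMaxHash and nRebootAfter field names are hypothetical, and only strMinerExePath and arrMinerCmdParams mirror names that appear in the code later in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical container for the parsed parameters
class CmdLineParams
{
    public double fMinHash;          // minimum allowed hash rate, in Mh
    public double fMaxHash;          // maximum allowed hash rate, in Mh (0 = no cap)
    public int nRebootAfter;         // miner restarts before a reboot (0 = never reboot)
    public string strMinerExePath;   // path to ethminer.exe
    public List<string> arrMinerCmdParams = new List<string>();   // passed on to ethminer
}

static class CmdLineParser
{
    // Expects: MinHash MaxHash RebootAfter "path-to\ethminer.exe" miner_command_line
    // Returns null if the arguments are malformed.
    public static CmdLineParams Parse(string[] args)
    {
        if (args == null || args.Length < 4)
            return null;

        double fMin, fMax;
        int nReboot;

        if (!TryParseRate(args[0], out fMin) || !TryParseRate(args[1], out fMax))
            return null;

        // A non-zero cap below the minimum can never be satisfied
        if (fMax != 0 && fMax < fMin)
            return null;

        if (!int.TryParse(args[2], NumberStyles.Integer, CultureInfo.InvariantCulture, out nReboot) || nReboot < 0)
            return null;

        CmdLineParams p = new CmdLineParams();
        p.fMinHash = fMin;
        p.fMaxHash = fMax;
        p.nRebootAfter = nReboot;
        p.strMinerExePath = args[3];

        // Everything after the miner path belongs to ethminer itself
        for (int i = 4; i < args.Length; i++)
            p.arrMinerCmdParams.Add(args[i]);

        return p;
    }

    // Accepts only finite, non-negative numbers written with a '.' decimal separator
    static bool TryParseRate(string str, out double value)
    {
        return double.TryParse(str, NumberStyles.Float, CultureInfo.InvariantCulture, out value)
            && !double.IsNaN(value) && !double.IsInfinity(value) && value >= 0;
    }
}
```

Parsing with the invariant culture keeps "80.5" working on systems whose regional settings use a comma as the decimal separator.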

EtherMineSafeRunner will also do the following. For simplicity, I decided not to expose these features via the command line interface:

• Check ethminer Crashes - every time the ethminer.exe process crashes, EtherMineSafeRunner will restart it.
• Check Accepted Hashes - it will monitor how long ago ethminer generated an accepted hash. If it doesn't notice one for 30 minutes, it will restart the ethminer.exe process.
• Check ethminer Running - it will monitor if the ethminer.exe process is running and will restart it if it is not. (This differs from a crash: it covers the cases when the ethminer.exe process hangs or fails to start.)
• Diagnostic Log File - it will maintain a text event log file with any diagnostic messages produced during the EtherMineSafeRunner operation. This will be especially helpful if you need to diagnose wayward reboots and other issues with ethminer. The log is maintained in the EthMineSafeRunner_Log.txt file in the same folder that EtherMineSafeRunner is running from.
• Ability To Email Notifications - it will be able to interface with a custom web server to email critical notifications about ethminer. (Note that some additional setup of a PHP script is required on the server. Read below for details.)

All in all, I coded EtherMineSafeRunner to automate the cryptocurrency mining process with ethminer as much as possible.

# Implementation

This section is for C# developers who want to know how I implemented certain aspects of the EtherMineSafeRunner watch tool. Let's dissect its most interesting functions:

## Starting ethminer

The code that starts an ethminer.exe process and redirects its output is implemented in the RunMiner function:

C#
static MinerRunResult RunMiner(CmdLineParams info)
{
    //Run the miner process and begin watching it

    //NOTE: The declaration of 'res' is missing from this listing; the initial value here is assumed
    //      (it should be a result code that signifies a failure to start the miner)
    MinerRunResult res = default(MinerRunResult);

    try
    {
        Process proc = new Process();
        proc.StartInfo.FileName = info.strMinerExePath;

        if(info.arrMinerCmdParams.Count > 0)
        {
            //Make command line
            string strCmdLn = "";

            foreach(string strCmd in info.arrMinerCmdParams)
            {
                if(!string.IsNullOrEmpty(strCmdLn))
                    strCmdLn += " ";

                if(strCmd.IndexOf(' ') == -1)
                {
                    strCmdLn += strCmd;
                }
                else
                {
                    strCmdLn += "\"" + strCmd + "\"";
                }
            }

            proc.StartInfo.Arguments = strCmdLn;
            proc.StartInfo.UseShellExecute = false;
            proc.StartInfo.CreateNoWindow = true;

            proc.StartInfo.RedirectStandardOutput = true;
            proc.StartInfo.RedirectStandardError = true;

            //Intercept STDOUT of the miner
            proc.OutputDataReceived += new DataReceivedEventHandler((object sender, DataReceivedEventArgs e) =>
            {
                try
                {
                    //The argument list of this call is assumed
                    DataReceivedFromMiner(e.Data, info);
                }
                catch(Exception ex)
                {
                    //Failed
                    gEventLog.logMessage(EventLogMsgType.ELM_TYP_Error, "EXCEPTION_2: " + ex.ToString());
                    OutputConsoleError("EXCEPTION_2: " + ex.ToString());
                }
            });

            //Intercept STDERR of the miner
            proc.ErrorDataReceived += new DataReceivedEventHandler((object sender, DataReceivedEventArgs e) =>
            {
                try
                {
                    //The argument list of this call is assumed
                    DataReceivedFromMiner(e.Data, info);
                }
                catch(Exception ex)
                {
                    //Failed
                    gEventLog.logMessage(EventLogMsgType.ELM_TYP_Error, "EXCEPTION_3: " + ex.ToString());
                    OutputConsoleError("EXCEPTION_3: " + ex.ToString());
                }
            });

            //Start the process
            gWS.setMinerProcessClass(proc, true);
            proc.Start();

            //Begin reading from the redirected streams
            proc.BeginErrorReadLine();
            proc.BeginOutputReadLine();

            //Make the miner process exit with ours
            AttachChildProcessToThisProcess(proc);

            int nPID = proc.Id;
            gEventLog.logMessage(EventLogMsgType.ELM_TYP_Information, "Miner started (PID=" + nPID + ") ... with CMD: " + strCmdLn);

            proc.WaitForExit();

            //Get exit code & remember it
            uint nExitCd = (uint)proc.ExitCode;
            gWS.setLastMinerExitTimeUTC(nExitCd);

            gEventLog.logMessage(EventLogMsgType.ELM_TYP_Error, "Miner process (PID=" + nPID + ") has exited with error code 0x" + nExitCd.ToString("X"));

            OutputConsoleError("WARNING: Miner has exited with error code 0x" + nExitCd.ToString("X") + " ....");

            res = MinerRunResult.RES_MR_MINER_EXITED;
        }
        else
        {
            //Error
            OutputConsoleError("ERROR: Not enough parameters to start a miner");
        }
    }
    catch(Exception ex)
    {
        //Failed
        gEventLog.logMessage(EventLogMsgType.ELM_TYP_Error, "EXCEPTION_1: " + ex.ToString());
        OutputConsoleError("EXCEPTION_1: " + ex.ToString());
        res = MinerRunResult.RES_MR_EXCEPTION;
    }

    return res;
}

There are several points I need to highlight here:

• Note that I'm rebuilding the command line for ethminer in the strCmdLn local variable at the beginning of the function. I also make sure to account for any spaces in the command line parameters, and enclose such parameters in double-quotes. (I'm also going with the quick-and-dirty method of concatenating strings instead of using a more robust and efficient StringBuilder. But in this case it won't make much of a difference.)
• Then also note that we instruct the .NET Framework not to use ShellExecute by setting proc.StartInfo.UseShellExecute = false; and also instruct it not to create a window by setting proc.StartInfo.CreateNoWindow = true;. The former is needed to ensure that we can redirect the STDOUT and STDERR streams to our own process.
Note that we need to redirect both STDOUT and STDERR since ethminer (which is written in C++) uses both std::cout and std::clog to output its results.

Then both the STDOUT and STDERR streams are intercepted in our process by specifying our handlers in proc.OutputDataReceived += new DataReceivedEventHandler and in proc.ErrorDataReceived += new DataReceivedEventHandler. Each one passes the received strings into the DataReceivedFromMiner function for interpretation. The actual redirection begins with the calls to proc.BeginErrorReadLine(); and proc.BeginOutputReadLine(); after the ethminer process is started.

• We remember the Process class reference in a global variable via a call to gWS.setMinerProcessClass(proc, true). This will allow us to monitor it later from our threadWatchMiner worker thread.
• One other important thing to do is to ensure that the ethminer process that we start from our program is also terminated if our program exits or is terminated itself. This is important because we are starting the ethminer process without its own console window. This is implemented in the AttachChildProcessToThisProcess function.
• Then we need to put our thread into a waiting state with a call to proc.WaitForExit();. This will ideally stall our thread indefinitely, or until the ethminer process crashes. (Note that, by design, ethminer should not exit if everything is working right.) But if ethminer crashes, the call to proc.WaitForExit(); will return and we can collect some information about the crash, such as the exit code (which will most likely be the SEH exception code) and the time of the crash. In this case we can also return a special result code res = MinerRunResult.RES_MR_MINER_EXITED; signifying that the miner crashed and that we need to restart it.
• Otherwise, all further monitoring will be done in our threadWatchMiner worker thread.
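For comparison, the StringBuilder alternative mentioned in the first point could look something like this. It is just a sketch with a hypothetical CmdLineBuilder helper, and it applies the same quoting rule as RunMiner: parameters with spaces are wrapped in double-quotes.

```csharp
using System.Collections.Generic;
using System.Text;

static class CmdLineBuilder
{
    // Joins parameters with spaces, quoting the ones that contain a space
    public static string Build(IEnumerable<string> parameters)
    {
        StringBuilder sb = new StringBuilder();

        foreach (string strParam in parameters)
        {
            if (sb.Length > 0)
                sb.Append(' ');

            if (strParam.IndexOf(' ') == -1)
            {
                sb.Append(strParam);
            }
            else
            {
                sb.Append('"').Append(strParam).Append('"');
            }
        }

        return sb.ToString();
    }
}
```

Like the original loop, this does not escape double-quotes embedded inside a parameter, so it is only suitable for simple command lines such as the ethminer one shown earlier.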

## Terminating ethminer With Our Process

Terminating the ethminer process along with our process is important because it is not running with its own console window.

The most reliable way to do it is by using Job Objects, which were introduced in Windows XP, and the JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE flag. That way the system will terminate the ethminer process for us when our process is terminated, which is convenient.

Unfortunately, I didn't find any support for this feature in .NET, so I rolled my own using P/Invoke in the AttachChildProcessToThisProcess function:

C#
static private IntPtr ghJob;

static void AttachChildProcessToThisProcess(Process proc)
{
    //Attach 'proc' process to this process, so that it's closed along with this process
    try
    {
        if (ghJob == IntPtr.Zero)
        {
            ghJob = CreateJobObject(IntPtr.Zero, "");    //It will be closed automatically when this process exits or is terminated

            if (ghJob == IntPtr.Zero)
            {
                throw new Win32Exception();
            }
        }

        JOBOBJECT_BASIC_LIMIT_INFORMATION info = new JOBOBJECT_BASIC_LIMIT_INFORMATION();
        info.LimitFlags = JOBOBJECTLIMIT.JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE;

        JOBOBJECT_EXTENDED_LIMIT_INFORMATION exInfo = new JOBOBJECT_EXTENDED_LIMIT_INFORMATION();
        exInfo.BasicLimitInformation = info;

        int nLength = Marshal.SizeOf(typeof(JOBOBJECT_EXTENDED_LIMIT_INFORMATION));
        IntPtr exInfoPtr = Marshal.AllocHGlobal(nLength);

        try
        {
            Marshal.StructureToPtr(exInfo, exInfoPtr, false);

            if (!SetInformationJobObject(ghJob, JobObjectInfoType.ExtendedLimitInformation, exInfoPtr, (uint)nLength))
            {
                throw new Win32Exception();
            }

            //And attach the process
            if (!AssignProcessToJobObject(ghJob, proc.Handle))
            {
                throw new Win32Exception();
            }
        }
        finally
        {
            Marshal.FreeHGlobal(exInfoPtr);
        }
    }
    catch(Exception ex)
    {
        //Error
        gEventLog.logMessage(EventLogMsgType.ELM_TYP_Error, "Failed to assign miner job: " + ex.ToString());
        OutputConsoleError("ERROR: Failed to assign miner job: " + ex.ToString());
    }
}

Note that I also declared the following native functions and data structures:

C#
public enum JobObjectInfoType
{
    AssociateCompletionPortInformation = 7,
    BasicLimitInformation = 2,
    BasicUIRestrictions = 4,
    EndOfJobTimeInformation = 6,
    ExtendedLimitInformation = 9,
    SecurityLimitInformation = 5,
    GroupInformation = 11
}

[StructLayout(LayoutKind.Sequential)]
public struct JOBOBJECT_BASIC_LIMIT_INFORMATION
{
    public Int64 PerProcessUserTimeLimit;
    public Int64 PerJobUserTimeLimit;
    public JOBOBJECTLIMIT LimitFlags;
    public UIntPtr MinimumWorkingSetSize;
    public UIntPtr MaximumWorkingSetSize;
    public UInt32 ActiveProcessLimit;
    public UIntPtr Affinity;            //ULONG_PTR in the native structure (pointer-sized)
    public UInt32 PriorityClass;
    public UInt32 SchedulingClass;
}

[Flags]
public enum JOBOBJECTLIMIT : uint
{
    JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE = 0x2000
}

//The native IO_COUNTERS structure has six 64-bit fields, in this order
[StructLayout(LayoutKind.Sequential)]
public struct IO_COUNTERS
{
    public UInt64 ReadOperationCount;
    public UInt64 WriteOperationCount;
    public UInt64 OtherOperationCount;
    public UInt64 ReadTransferCount;
    public UInt64 WriteTransferCount;
    public UInt64 OtherTransferCount;
}

[StructLayout(LayoutKind.Sequential)]
public struct JOBOBJECT_EXTENDED_LIMIT_INFORMATION
{
    public JOBOBJECT_BASIC_LIMIT_INFORMATION BasicLimitInformation;
    public IO_COUNTERS IoInfo;
    public UIntPtr ProcessMemoryLimit;
    public UIntPtr JobMemoryLimit;
    public UIntPtr PeakProcessMemoryUsed;
    public UIntPtr PeakJobMemoryUsed;
}

[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
static extern IntPtr CreateJobObject(IntPtr lpJobAttributes, string name);

[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
static extern bool SetInformationJobObject(IntPtr job, JobObjectInfoType infoType,
    IntPtr lpJobObjectInfo, uint cbJobObjectInfoLength);

[DllImport("kernel32.dll", SetLastError = true)]
static extern bool AssignProcessToJobObject(IntPtr job, IntPtr process);

A few things to note about the code above:

• Note that I'm using the job object as a singleton, stored in the global static ghJob variable. That is because we don't need to create multiple job objects with the call to ghJob = CreateJobObject(IntPtr.Zero, "");. It has to be done only once, even if our AttachChildProcessToThisProcess function is called repeatedly.
• Then we just do some marshalling of the Win32 structures to be used in .NET and call SetInformationJobObject to set up our job with the JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE flag, and then call AssignProcessToJobObject to associate our child process with that job. And that is it.
• Lastly, note that we never close the job object that was created with the call to CreateJobObject and stored in the ghJob global variable. That job object will be closed automatically by the operating system when our process exits or is terminated. This is what will enforce the JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE flag and also terminate the ethminer process.

## Output Tracker

All the tracking of the text output that we receive from ethminer is done via the DataReceivedFromMiner function. That function first colorizes some important text that it receives from ethminer (since its original coloring is lost in redirection) and then passes the text to the AnalyzeDataReceivedFromMiner function, which analyzes it.

The AnalyzeDataReceivedFromMiner function works on a line-by-line basis. It splits each line into words by spaces and looks for specific keywords. I didn't go too fancy with it, and simply assumed that an output line from ethminer that is important for us would start with either m or i. Then I assumed that the hash-rate would be given as a fractional number followed by Mh.

In this case I did not take into account the possibility of lower Kh or higher Gh hash-rates. If you need those, I will leave it up to you to adjust the source code.

Additionally, if a hash is accepted, the presence of the **Accepted keyword would indicate that.

Lastly, the function remembers the time when a good hash-rate was recorded (that is, one whose value falls within the range specified via the command line) and also when the last accepted hash was received. All of this data will be used later in the threadWatchMiner worker thread.
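The word-splitting approach described above could be sketched as follows. This is not the actual AnalyzeDataReceivedFromMiner code: the MinerOutputParser class is hypothetical, and the sample lines in the comments are assumed illustrations of the ethminer output format rather than captured logs.

```csharp
using System;
using System.Globalization;

static class MinerOutputParser
{
    // Looks for an "Mh" word preceded by a number, in a line that starts with "m" or "i".
    // Assumed sample line: " m 10:15:32 ethminer 0:05 A3 91.52 Mh - cu0 45.76, cu1 45.76"
    public static bool TryGetHashRateMh(string strLine, out double fHashRate)
    {
        fHashRate = 0;

        if (string.IsNullOrEmpty(strLine))
            return false;

        string[] words = strLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        if (words.Length == 0 || (words[0] != "m" && words[0] != "i"))
            return false;

        for (int i = 1; i < words.Length; i++)
        {
            if (words[i] == "Mh" &&
                double.TryParse(words[i - 1], NumberStyles.Float, CultureInfo.InvariantCulture, out fHashRate))
            {
                return true;
            }
        }

        fHashRate = 0;
        return false;
    }

    // Assumed sample line: " i 10:15:40 ethminer **Accepted 31 ms. us2.ethermine.org"
    public static bool IsAcceptedHash(string strLine)
    {
        return strLine != null && strLine.IndexOf("**Accepted", StringComparison.Ordinal) >= 0;
    }

    // A hash-rate is "good" if it is within [fMin, fMax]; fMax of 0 means "no cap"
    public static bool IsGoodHashRate(double fHashRate, double fMin, double fMax)
    {
        return fHashRate >= fMin && (fMax == 0 || fHashRate <= fMax);
    }
}
```

Supporting Kh or Gh rates would mean checking for those unit words as well and scaling the parsed number accordingly.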

The worker thread that watches the functioning of ethminer is implemented in the threadWatchMiner function. Its actual code is not that interesting. I'll just note its most essential functions:

• It monitors if the ethminer child process is running. If it is not, it waits for at least 5 minutes before rebooting the mining rig. This part is important in case the ethminer process fails to start, or if there's some kind of a different bug in our own EtherMineSafeRunner process.
• It checks when the ethminer child process was started and doesn't perform any other checks during the first 40 seconds. This is needed to let ethminer do its own initialization.
• Otherwise, it checks how long ago we received a good hash-rate from the ethminer child process. If that didn't happen within the first 4 minutes, or later within 2 minutes, EtherMineSafeRunner will kill the ethminer process. This, in turn, will allow the main thread to restart it.
• Additionally, it checks how long ago we received an accepted hash from the mining pool. If that didn't happen within the first 35 minutes, or later within 30 minutes, EtherMineSafeRunner will kill the ethminer process. This, in turn, will allow the main thread to restart it.
• Lastly, the worker thread will output the current tracking stats into the console window. It does so every 15 seconds. This helps the user see the current state of the monitoring process.

The output is presented in the following form:

WATCH: Miner=On Runtime=01:18:49:19 Restarts=0 LastAccepted=00:00:48 LastGoodHash=00:00:03

Where:

• Miner = On indicates that the ethminer process is running. OFF if not.
• Runtime = how long the ethminer process has been running. The format is: days:hours:minutes:seconds
• Restarts = how many times the ethminer process has been restarted because of a crash, or because of one of the other conditions described above.
• LastAccepted = how long ago ethminer received an accepted hash from the mining pool. The format is: hours:minutes:seconds
• LastGoodHash = how long ago ethminer reported a good hash-rate. The format is: hours:minutes:seconds
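To make the timing rules and the status line more concrete, here is a rough sketch of both. It is not the actual threadWatchMiner code: the WatchSketch class, the WatchAction enum and all member names are hypothetical. Only the thresholds (40 seconds, 4 and 2 minutes, 35 and 30 minutes) and the output format come from the description above, and the 5-minute "miner not running" check is left out.

```csharp
using System;

enum WatchAction { None, KillMiner }

static class WatchSketch
{
    static readonly TimeSpan StartupGrace = TimeSpan.FromSeconds(40);
    static readonly TimeSpan FirstGoodHashLimit = TimeSpan.FromMinutes(4);
    static readonly TimeSpan GoodHashLimit = TimeSpan.FromMinutes(2);
    static readonly TimeSpan FirstAcceptedLimit = TimeSpan.FromMinutes(35);
    static readonly TimeSpan AcceptedLimit = TimeSpan.FromMinutes(30);

    // 'runtime' = time since the miner was started
    // 'sinceGoodHash' / 'sinceAccepted' = time since the last event, or null if none was seen yet
    public static WatchAction Evaluate(TimeSpan runtime, TimeSpan? sinceGoodHash, TimeSpan? sinceAccepted)
    {
        // Let the miner initialize before doing any checks
        if (runtime < StartupGrace)
            return WatchAction.None;

        if (IsOverdue(runtime, sinceGoodHash, FirstGoodHashLimit, GoodHashLimit) ||
            IsOverdue(runtime, sinceAccepted, FirstAcceptedLimit, AcceptedLimit))
        {
            return WatchAction.KillMiner;
        }

        return WatchAction.None;
    }

    static bool IsOverdue(TimeSpan runtime, TimeSpan? sinceLast, TimeSpan firstLimit, TimeSpan laterLimit)
    {
        // Before the first event, a longer limit is counted from the miner's start
        return sinceLast.HasValue ? sinceLast.Value > laterLimit : runtime > firstLimit;
    }

    public static string FormatStatus(bool bMinerOn, TimeSpan runtime, int nRestarts,
        TimeSpan lastAccepted, TimeSpan lastGoodHash)
    {
        string strRuntime = string.Format("{0:00}:{1:00}:{2:00}:{3:00}",
            runtime.Days, runtime.Hours, runtime.Minutes, runtime.Seconds);

        return string.Format("WATCH: Miner={0} Runtime={1} Restarts={2} LastAccepted={3} LastGoodHash={4}",
            bMinerOn ? "On" : "OFF", strRuntime, nRestarts, FormatHms(lastAccepted), FormatHms(lastGoodHash));
    }

    static string FormatHms(TimeSpan t)
    {
        return string.Format("{0:00}:{1:00}:{2:00}", (int)t.TotalHours, t.Minutes, t.Seconds);
    }
}
```

Keeping the decision in a pure function like Evaluate makes the thresholds easy to check without starting a real miner process.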

## Rebooting Mining Rig

The rebooting of the mining rig is done in the rebootRig function. For that I'm simply using the shutdown.exe /r /f /t 0 command, where /r requests a reboot, /f forces running applications to close, and /t 0 sets a zero-second delay. It is important to force it to ensure that the system is rebooted unconditionally.
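Launching that command from C# could look roughly like this. It is a sketch only: the RigReboot class and its method names are hypothetical, and the actual rebootRig function may differ.

```csharp
using System.Diagnostics;

static class RigReboot
{
    // Builds the start-up info for the forced reboot command
    public static ProcessStartInfo CreateRebootCommand()
    {
        ProcessStartInfo psi = new ProcessStartInfo("shutdown.exe", "/r /f /t 0");
        psi.UseShellExecute = false;
        psi.CreateNoWindow = true;
        return psi;
    }

    // WARNING: Calling this reboots the machine immediately!
    public static void RebootNow()
    {
        using (Process proc = Process.Start(CreateRebootCommand()))
        {
            // Nothing to wait for: the system is going down
        }
    }
}
```

Splitting the command construction from the call that runs it lets you inspect the command line without actually rebooting the rig.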

The rebooting function is called at critical moments and indicates a severe condition. Thus, in that case I also call my sendEmailNotification function, which is supposed to dispatch an email to the user to notify them of the critical condition of the rig.

The email notifications are an absolute last resort for the watch tool. They should be used only when the automated script gets stuck trying to recover the miner. This happens during a forced reboot of the mining rig, and also if the rebooting fails.

Unfortunately, sending an email directly is not a sure thing these days, so I chose a more reliable method: a custom PHP script that runs on the customer's web server. (I understand that this is not the easiest approach for everyone. It just worked in the case that I used this script for.)

The process of dispatching an email is actually two-stage. First, the sendEmailNotification function composes and sends a simple POST request to the PHP script on the web server. I did it as such:

C#
public static void sendEmailNotification(string strSubject, string strMsg)
{
    //Send an email to self with a critical rig state notification
    //'strSubject' = short subject line for the email
    //'strMsg' = short message to include (will have current time added to it)

    try
    {
        strMsg = "[" + DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss.fff") + "] " + strMsg;

        //Post it to the web server
        //(This call is missing from the original listing; blocking on the task with Wait() is an assumption)
        task_NotifyWebServer(strSubject, strMsg).Wait();
    }
    catch(Exception ex)
    {
        //Exception
        gEventLog.logMessage(EventLogMsgType.ELM_TYP_Critical, "EXCEPTION in sendEmailNotification: " + ex.ToString());
    }
}

static async Task task_NotifyWebServer(string strSubj, string strMsg)
{
    //Only if we have the notification key
    if(!string.IsNullOrEmpty(gkstrNotificationKey))
    {
        try
        {
            using (HttpClient httpClient = new HttpClient())
            {
                //10-second timeout
                httpClient.Timeout = TimeSpan.FromSeconds(10);

                //Host name of the web server (the page below is relative to it)
                httpClient.BaseAddress = new Uri(gkstrNotificationHost);

                //Parameters for the PHP script
                var content = new FormUrlEncodedContent(new[] {
                    new KeyValuePair<string, string>("subj", strSubj),
                    new KeyValuePair<string, string>("msg", strMsg),
                    new KeyValuePair<string, string>("key", gkstrNotificationKey),    //Checked by the PHP script
                });

                var result = await httpClient.PostAsync(gkstrNotificationPage, content);

                //Get result
                int nStatusCode = (int)result.StatusCode;

                if (nStatusCode != 200)
                {
                    //Something went wrong
                    gEventLog.logMessage(EventLogMsgType.ELM_TYP_Critical, "Failed to send web server notification. Code: " + nStatusCode + ", SUBJ: " + strSubj + ", MSG: " + strMsg);
                }
            }
        }
        catch (Exception ex)
        {
            gEventLog.logMessage(EventLogMsgType.ELM_TYP_Critical, "EXCEPTION in task_NotifyWebServer: " + ex.ToString());
        }
    }
}

Note that parts of the internal task_NotifyWebServer function are executed asynchronously, and thus it has to be declared as an async Task so that we can wait for them with the await keyword. I'm not doing any fancy synchronization here, since all that matters in this case is to see that the web server, or our PHP script, replied with the 200 status code. That indicates success.

### Server-Side PHP Script

The second stage of the critical notification code is the PHP script that runs on the web server and actually dispatches the email:

PHP
<?php
$key = $_REQUEST['key'];
if($key == '-insert-secret-key-')     //Special private key from EtherMineSafeRunner C# solution. Match it with the gkstrNotificationKey variable there!
{
    $email_address = "notify@example.com";    //Email address to send notification to (use address on the local server)

    $subject = $_REQUEST['subj'];
    $msg = $_REQUEST['msg'];

    if($subject != '' || $msg != '')
    {
        if(!mail($email_address, $subject != '' ? $subject : "Ethminer notification", $msg != '' ? $msg : "-empty-"))
        {
            //Mail function failed
            http_response_code(406);
        }

        exit;
    }
}

//Failed to authenticate
http_response_code(404);

The PHP code above does the following:

• Authentication is important since you don't want to expose your email server to spammers. Thus, make sure to provide a unique key in the '-insert-secret-key-' string above and match it to the gkstrNotificationKey variable in the EtherMineSafeRunner code.
• The email address to send the notification to is specified in the $email_address variable.
If you are using a shared hosting provider to run your PHP script, I would strongly suggest specifying an email address on that same server! This is because many web hosting providers may block your outbound email, and even if they don't, such email has little chance of passing the SPF and DKIM antispam checks on other mail servers. As a result, your notification email may never arrive.

On the other hand, if you send it within the same shared server to your own account, such email will almost certainly be delivered.

• Then the actual email is dispatched via the mail function. If that function succeeds, our script will exit, which will return the 200 status code to our C# code.
• In any other situation, our PHP script will return either the 404 status code, if authentication didn't pass, or 406, if the mail function failed. This will indicate a failure to our C# code, which will record it in the event log file.

To enable email notifications in your own build of the EtherMineSafeRunner, make sure to provide the following and then recompile the source code:

• Upload the PHP script, that dispatches emails, to your web server. (You can also use a shared web hosting server for this.)
• Set the gkstrNotificationHost variable in the EtherMineSafeRunner project to the host name of your server. Example: "https://example.com"
• Set the gkstrNotificationPage variable in the EtherMineSafeRunner project to the relative location of the PHP script on the web server. Example: "/php/send_rig_notification_email.php"
• Set the gkstrNotificationKey variable in the EtherMineSafeRunner project to some unique password-like secret string. And then also adjust the PHP script with the same secret value. Example: "B43F8657-FDD0-43C0-8AF7-54EB99199F45" (but don't use this one!)
• Recompile the EtherMineSafeRunner project and use the resulting EtherMineSafeRunner.exe file.

## Persistent Event Log

Maintaining a persistent event log is very important, especially for an automated tool such as EtherMineSafeRunner. I chose a simple approach: I write the most critical notifications from EtherMineSafeRunner and from ethminer into a simple text file. This is implemented in the EventLog class.

There's nothing super-interesting there, so check the source code for a complete example. I'll just note a few things here:

• The name and location of the event log file are defined by the _strLogFilePath variable. You can specify just a file name (like I did, with the EthMineSafeRunner_Log.txt file), in which case the event log file will be placed into the same folder as the EtherMineSafeRunner process; or you can specify an actual file path. Just make sure that the EtherMineSafeRunner process can write into it!
• It is important to maintain a certain maximum size of the event log file (to prevent it from growing uncontrollably) and to truncate it from time to time. In my case, I chose to provide the maximum allowed size of the event log file in bytes in the _ncbMaxAllowedFileSz variable. Then, when EtherMineSafeRunner starts, the constructor of the EventLog class will check the event log file size and remove the oldest entries from the top of the file to satisfy the size restriction.

By default the event log file is set to grow no larger than 2 MB.

• Lastly, one important thing to note here is that we need to synchronize writing into our event log from within our process. This is done with the lock(_lock){ } block of code. This is needed because we have multiple threads that can be writing into the event log.
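The three points above can be illustrated with a stripped-down sketch. This is not the actual EventLog class: SimpleEventLog and its LogMessage method are hypothetical, and only the _strLogFilePath, _ncbMaxAllowedFileSz and _lock names are borrowed from the description above.

```csharp
using System;
using System.IO;
using System.Text;

class SimpleEventLog
{
    private static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

    private readonly object _lock = new object();
    private readonly string _strLogFilePath;
    private readonly long _ncbMaxAllowedFileSz;

    public SimpleEventLog(string strLogFilePath, long ncbMaxAllowedFileSz)
    {
        _strLogFilePath = strLogFilePath;
        _ncbMaxAllowedFileSz = ncbMaxAllowedFileSz;

        // Trim the log once, when the tool starts
        TruncateIfNeeded();
    }

    public void LogMessage(string strType, string strMsg)
    {
        string strLine = "[" + DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss.fff") + "] " +
            strType + ": " + strMsg + Environment.NewLine;

        // Several threads may log at the same time
        lock (_lock)
        {
            File.AppendAllText(_strLogFilePath, strLine, Utf8NoBom);
        }
    }

    private void TruncateIfNeeded()
    {
        lock (_lock)
        {
            FileInfo fi = new FileInfo(_strLogFilePath);
            if (!fi.Exists || fi.Length <= _ncbMaxAllowedFileSz)
                return;

            string[] lines = File.ReadAllLines(_strLogFilePath, Utf8NoBom);

            // Walk back from the newest entry and keep as many lines as fit
            long ncbTotal = 0;
            int nFirst = lines.Length;

            while (nFirst > 0)
            {
                long ncbLine = Utf8NoBom.GetByteCount(lines[nFirst - 1]) + Environment.NewLine.Length;
                if (ncbTotal + ncbLine > _ncbMaxAllowedFileSz)
                    break;

                ncbTotal += ncbLine;
                nFirst--;
            }

            string[] kept = new string[lines.Length - nFirst];
            Array.Copy(lines, nFirst, kept, 0, kept.Length);

            File.WriteAllLines(_strLogFilePath, kept, Utf8NoBom);
        }
    }
}
```

Reading the whole file into memory is acceptable here only because the log is capped at a couple of megabytes.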

If you are interested in the EtherMineSafeRunner watch tool: