Oracle GoldenGate on Microsoft Windows Server Process Start/Restart – Part 1

I have the honor of sharing today’s blog with Mack Bell.  Mack is the Product Manager for GoldenGate for SQL Server at Oracle and my go-to person for any issues and problems that I have with that product.  Mack has been on my review list for all of my other blogs and papers that I have written on Oracle GoldenGate 12.2 as well as earlier releases. When discussing GoldenGate on Microsoft SQL Server, Mack and I have seen it all!

If you have GoldenGate processes set to start automatically on Windows Server startup, you may have experienced a problem where all of the GoldenGate processes start but immediately stop if the source or target database is not yet available.  You may also be using the AUTORESTART feature and expect it to restart the processes and attempt connection to the database again, yet you find that this is not happening.

In this two part blog entry, Mack is going to explain exactly what is going on, so let’s dive into the details and options that you have for auto starting and restarting GoldenGate processes, and see if we can’t explain why things don’t sometimes work as you might expect, and what to do to get the results that you do expect.

The Manager

First, let’s look into the Manager process and see what is required to set it to start automatically on Windows.  The Manager process is the parent to all of the other GoldenGate processes, and there are two modes that the Manager can run in on a Windows Server environment.  The first mode, which is fine for testing, is to run the Manager as a user process, started from GGSCI.

GGSCI>start mgr

You’ll notice in this mode, that the Manager is running as a user process, with that user’s credentials, and will shut down when the user logs out of the server.  Again, this is fine for testing purposes but not what you would use in a production environment.

For production environments, you want to install the Manager as a Windows service.  This is done via the following command in GGSCI:

GGSCI>shell install addservice

1.jpg

The ‘shell’ command directs GGSCI to run an OS level command, in this case the install.exe executable located in the GoldenGate home directory, but you could also run install.exe directly from a command prompt and pass it the ‘addservice’ option.  There are other available options for the install executable, and you can view those options by checking its help file:

GGSCI>shell install help

For now though, we’ll accept the default option when adding the Manager as a service, and that is to automatically start the Manager on Windows startup.

AUTOSTART

Now that Manager will be running as a Windows service which is set to start automatically on server startup, what can be done to automatically start other GoldenGate processes, such as Extracts and Replicats?  For this, we use the AUTOSTART parameter in the Manager’s parameter file:

GGSCI> edit params mgr

2.jpg

For this example, I’ll simply wildcard the Extract and Replicat processes to start, but I could have also listed out individual processes, like AUTOSTART EXTRACT extname.  After saving the parameter file and starting (or restarting) the Manager, you’ll notice that any GoldenGate processes that you had registered will now start when the Manager starts.

AUTORESTART

Next, what about conditions where a GoldenGate process ‘Abends’ and you want it to attempt a restart automatically?  We can use the AUTORESTART feature of the Manager to accomplish this.  With this setting, the Manager will attempt to automatically restart a previously RUNNING processes for a default number of two attempts with a two minute gap and retry attempts.  You can adjust these settings and I encourage you to check the GoldenGate Reference Guide for the available options and settings.

3.jpg

With the Manager’s parameter file saved with this new parameter, restart the Manager for the new settings to take effect.

Process Status and Checkpoint File

Ok, everything is squared away, or is it?  Since we’re focusing this article on why processes don’t seem to restart automatically when you might expect them to, I must first explain how the Manager would even know to restart a process, and this is due to a status recorded in that process’s checkpoint file.  Every GoldenGate Extract or Replicat has a checkpoint file, and among other bits of information stored there, the last known status is recorded.   The Manager will only attempt a restart of a process if the status was RUNNING.   Processes that are Stopped (STOP command) or Killed (KILL command) do not apply to AUTORESTART.

Let’s look then at a couple of different scenarios; one for when the Manager would restart a process and another when it would not.

Scenario #1 – Process is RUNNING, then Abends from SQL Server instance shutdown

1.jpg

I started my Manager and it automatically started my other processes.  I’m now going to shut down the SQL Server instance that houses my source and target databases and check the status of the processes again.  The Extract (EXT2008) and the Replicat (REP2008) have active connections to the database and have already connected and updated their checkpoints to the status of ‘RUNNING’.  When I stop the database service, those two processes are going to change their statuses to ‘ABENDED’, and the Manager will attempt a restart of each.

5.jpg

I’ll wait a few minutes and check the Manager’s report file to see that it has tried the restart, but with the SQL Server instance still offline, the processes will not be able to enter a Running state at this time.

This is the information from the Manager’s report file, but the processes are still in an ABENDED state since the database instance is still offline.

6.jpg

I’ll now start the SQL Server instance again before the next restart attempt, then wait for the Manager to restart the processes and check their statuses.

7.jpg

This time, all the processes are running again and the restart logic by the Manager has succeeded.  But before moving on to the next scenario, let me point out one key observation here, and that is that the Manager’s auto restart logic functioned because the status of the Extract and Replicat went from Running, to Abended, and the checkpoint information for those two processes was updated to Abended, which informed the Manager that the AUTORESTART rules would take effect.

Scenario #2 – Process is manually stopped and re-Started against offline SQL Server Instance or Database

For this scenario, I’m going to highlight a situation where the Manager does not attempt a restart of the GoldenGate processes and then explain why that happens.

Let’s start with a running group of GoldenGate processes, and then manually we will stop them.

8.jpg

Next, I’ll shutdown the SQL Server instance and then manually restart the two GoldenGate processes.