OPC UA Catch-Up

Overview

This document describes the Catch-Up functionality implemented in the Apis Hive OPC UA client module. By default, the module provides a standard OPC UA client with node subscription and method call functionality.

When communication is lost between an OPC UA server and an OPC UA client, most OPC UA servers provide short-term buffering of subscription values. When the client re-connects after a relatively short communication loss, the server starts sending the buffered values. This standard OPC UA functionality only works for short-term communication losses and depends on the client/server buffer settings and the server capabilities.

To avoid losing data after a long-term communication failure, Prediktor has implemented the Catch-Up functionality in the OPC UA client. Unlike classic OPC, OPC UA provides both Data Access and History Access through the same interface. The Apis Hive OPC UA client uses the History Access capability of OPC UA to read the values that the server has logged for the subscribed nodes during the communication failure. When all available history values for the subscribed nodes have been read up to the time of the re-connect, the client continues to read the subscribed node values through OPC UA Data Access as normal.

For Catch-Up in the Apis Hive UA client bee to work as intended, the server needs to provide history data for all the items that the client subscribes to through Data Access.

The resolution of the logged data is not necessarily the same as the negotiated sample interval of the subscription nodes on the source system. Differences between the logging interval and the sample interval result in a different data resolution for catch-up periods than for normal Data Access. If the data consuming system needs the same resolution from both catch-up (History Access) and Data Access, both the source server and the client must be configured carefully.
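
As a rough illustration of the resolution mismatch, the sketch below counts how many samples a consumer would receive for the same period through each path. The interval values and the function name are made-up assumptions for illustration, not defaults of Apis Hive or of any OPC UA server.

```python
def expected_samples(period_s: float, interval_s: float) -> int:
    """Number of evenly spaced samples that fit in a period (assumes steady sampling)."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return int(period_s // interval_s)

# Hypothetical settings: the server logs every 10 s, the subscription samples every 1 s.
outage_s = 3600
history_samples = expected_samples(outage_s, 10.0)   # data seen through catch-up
realtime_samples = expected_samples(outage_s, 1.0)   # data seen through Data Access
print(history_samples, realtime_samples)
```

With these assumed intervals, one hour of catch-up data is ten times coarser than one hour of real-time data.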

Dependencies

The Apis Hive OPC UA bee has no Apis Hive dependencies.

Status Items

The Apis Hive OPC UA bee has a set of status items:

Item | Visibility | Description
#Connected# | Always | Item telling if the module is connected to the OPC server; true: connected, false: disconnected.
#Session-State# | Always | Item telling the current status of the OpcUA session.
#Subscription-State# | Always | Item telling the current status of the OpcUA subscription.
#Endpoint# | Always | The URL to the OpcUA server.
#Catchup-State# | Catchup | Item telling the current state of the data catch-up process. Realtime means no catch-up is running; any other value is a catch-up related state.
#Catchup-PointInTime# | SerializedCatchup | Item telling the current point in time for the catch-up process, when the catch-up process is active.
#Catchup-RealTimeBufferCount# | SerializedCatchup | Item telling the current number of buffered real-time data callbacks during the catch-up process, when the catch-up process is active.
#Catchup-ReadChunksCount# | SerializedCatchup | Item telling the total number of data chunks read from the UA server during the catch-up process, when the catch-up process is active.
#Catchup-ReadChunksErrors# | Catchup | Item telling the total number of failed data chunk reads from the UA server during the catch-up process, when the catch-up process is active.
#Catchup-ReadSamplesCount# | SerializedCatchup | Item telling the total number of data samples read from the UA server during the catch-up process, when the catch-up process is active.
#Catchup-Speed# | SerializedCatchup | Item telling how many times faster than real-time the catch-up process is running (note that a low figure may indicate either bad performance or high data density).
#Catchup-Progress# | DirectCatchup | Item telling the progress of direct catch-up, as the number of items that have finished [#finished-read/#finished-writes/#total-items].
#Catchup-ReadSpeedAvg# |  | Item telling the average read speed of the UA server, in samples per millisecond.
#Catchup-StepNextTimeAvg# | SerializedCatchup | Item telling the average duration of a StepNext iteration, in milliseconds.
#Catchup-WrittenSamplesCount# | DirectCatchup | Item telling the total number of data samples written during the direct catch-up process, when catch-up is active.
#Catchup-WrittenChunksCount# | DirectCatchup | Item telling the total number of data chunks written during the direct catch-up process, when catch-up is active.
#Catchup-WriteSpeedAvg# | DirectCatchup | Item telling the average history write speed, in samples per second.
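
The #Catchup-Speed# figure can be read as a ratio between the historical time covered and the wall-clock time spent. The sketch below shows that reading only; the function name is hypothetical, and the simple ratio is an assumption, since the module's exact calculation is not described here.

```python
from datetime import timedelta

def catchup_speed(history_covered: timedelta, wall_clock_elapsed: timedelta) -> float:
    """Ratio of replayed historical time to elapsed wall-clock time (assumed definition)."""
    elapsed_s = wall_clock_elapsed.total_seconds()
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return history_covered.total_seconds() / elapsed_s

# Two hours of history replayed in one minute of wall-clock time.
speed = catchup_speed(timedelta(hours=2), timedelta(minutes=1))
print(speed)
```

Under this reading, a value close to 1 means the catch-up barely outpaces real time, which matches the note that a low figure can come from either slow reads or dense data.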

Command Item

The Apis Hive OPC UA bee has a catch-up relevant command item:

Item | Description
$Cmd_Catchup-Continue$ | Triggers continuation of the catch-up process when the catch-up type is SerializedFull_PauseAfterInitial or SerializedPartial_PauseAfterInitial.

The command item is added through the add item method. The command type ID is 10140.

Catch-Up relevant module properties

  • CatchUpMode: The data catch-up functionality of the communication. Set this to enable catch-up. Property ID: 1250

    • NoCatchup: No data catch-up, just pure real-time communication.

    • SerializedFull: Historical data is streamed through Hive sample-by-sample, until we have caught up with real time. Note that this requires the UA server to implement the OPC UA HA profile!

    • SerializedFull_PauseAfterInitial: Same as 'SerializedFull', but with one stop in the playback sequence after the first StepNext. Use the command item $Cmd_Catchup-Continue$ to continue the playback sequence.

    • SerializedPartial: Only data missing locally, per item, is read from the remote server and streamed through Hive, sample-by-sample, until we have caught up with real time.

    • SerializedPartial_PauseAfterInitial: Same as 'SerializedPartial', but with one stop in the playback sequence after the first StepNext. Use the command item $Cmd_Catchup-Continue$ to continue the playback sequence.

    • Direct: Historical data is written directly into an APIS HoneyStore timeseries database (when the item is logged), and the real-time data starts to update immediately on the item, as when no catch-up is in use.
      Note! When using Direct catch-up, it is important that the related HoneyStore database(s) have their Capabilities property set to accept out-of-sequence data.

  • CatchUpPeriod: The maximum period to look back for historical data when initiating a catch-up operation. Select a pre-set value, or enter a value in seconds. Property ID: 1262

  • CatchUpChunkSize: How many values to read per node during catch-up read operations. Maps to 'numValuesPerNode' parameter of 'ReadRawModifiedDetails'. Default is 5000. Property ID: 1261

  • CatchupChunkCount: How many chunks to cache during catch-up read operations. Must be between 0 and 255. Default is 5. Property ID: 1260
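
To show what reading "per chunk" means in practice, the sketch below pages through an in-memory history with a fixed chunk size. It is a toy model, not the OPC UA History Access API: `read_history_chunk`, `read_all` and the offset-based continuation are hypothetical stand-ins for a history read limited by a value count such as CatchUpChunkSize.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)

def read_history_chunk(history: List[Sample], start: float, end: float,
                       max_values: int, offset: int = 0) -> Tuple[List[Sample], bool]:
    """Hypothetical single history read: at most max_values samples with
    start <= timestamp < end, beginning at offset, plus a 'more data' flag."""
    matching = [s for s in history if start <= s[0] < end]
    chunk = matching[offset:offset + max_values]
    return chunk, offset + len(chunk) < len(matching)

def read_all(history: List[Sample], start: float, end: float,
             chunk_size: int) -> Tuple[List[Sample], int]:
    """Read the whole range chunk by chunk; returns the samples and the chunk count."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    samples: List[Sample] = []
    chunks = 0
    more = True
    while more:
        chunk, more = read_history_chunk(history, start, end, chunk_size,
                                         offset=len(samples))
        samples.extend(chunk)
        chunks += 1
    return samples, chunks

# Twelve logged samples read with a chunk size of five need three reads.
demo_history = [(float(t), t * 2.0) for t in range(12)]
demo_samples, demo_chunks = read_all(demo_history, 0.0, 12.0, chunk_size=5)
print(len(demo_samples), demo_chunks)
```

A larger chunk size means fewer round trips but more data held in memory per read, which is the trade-off the two chunk properties control.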

CONFIGURATION OF SERIALIZED CATCH-UP

Catch-Up operation, step by step

When the OPC UA Client module re-connects after a period of communication loss and is configured for automatic catch-up, it starts to read the history data for the subscription items, starting from the time when the connection was lost, possibly limited by the CatchUpPeriod module property. While history data is being read, the module buffers all Data Access subscription values until all history data has been read from the source system. After all available item data has been read up to the time of the re-connect, the module continues with normal OPC UA Data Access subscription operation. During the Catch-Up, the OPC UA Client Bee updates the configured data items in steps, ensuring that all history data values for the items are written to the tags. Each step updates only a single value per item.
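
The start of the history read window can be sketched as follows. This is one plausible reading of the description above, not the module's actual code: the function name is hypothetical, and the assumption is that CatchUpPeriod limits how far back from the re-connect time the read may begin.

```python
from datetime import datetime, timedelta

def catchup_start(lost_at: datetime, reconnected_at: datetime,
                  catchup_period: timedelta) -> datetime:
    """Start of the history read: the time of the loss, but never further back
    than catchup_period before the re-connect (assumed interpretation)."""
    return max(lost_at, reconnected_at - catchup_period)

reconnect = datetime(2024, 1, 3, 0, 0)
# Long outage: the look-back is capped by a one-day CatchUpPeriod.
long_outage = catchup_start(datetime(2024, 1, 1, 0, 0), reconnect, timedelta(days=1))
# Short outage: the read starts at the time of the loss.
short_outage = catchup_start(datetime(2024, 1, 2, 23, 0), reconnect, timedelta(days=1))
print(long_outage, short_outage)
```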

Implications on memory usage

The OPC UA Client module reads the data in chunks. The number and size of the chunks for each item are configurable through the module properties CatchUpChunkSize and CatchupChunkCount. The values of these properties, together with the number of items and the item types, affect memory usage. The buffering of the OPC UA Data Access subscription items also adds to the memory consumption, so the catch-up period must be chosen carefully to avoid out-of-memory situations. The 64-bit version of Apis Hive is recommended to ensure enough memory in solutions with a high number of items.
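
A back-of-the-envelope estimate can help when choosing the chunk settings. The sketch below multiplies the item count by the two chunk properties and an assumed sample size. The 24 bytes per sample and the assumption that every item holds its full set of cached chunks at once are illustrative guesses, not documented figures, and the real-time buffer is not included.

```python
def estimate_chunk_memory_bytes(items: int, chunk_size: int, chunk_count: int,
                                bytes_per_sample: int = 24) -> int:
    """Rough upper bound for cached history chunks.
    bytes_per_sample is an assumed size for one value/quality/timestamp sample."""
    return items * chunk_size * chunk_count * bytes_per_sample

# 10 000 items with the default CatchUpChunkSize (5000) and CatchupChunkCount (5).
estimate = estimate_chunk_memory_bytes(items=10_000, chunk_size=5000, chunk_count=5)
print(f"{estimate / 1024**3:.1f} GiB")
```

Even as a coarse bound, the estimate shows why lowering the chunk size or count matters once the item count reaches the thousands.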

Timestamps

Item values read through OPC UA History Access (and OPC UA Data Access) keep the server data timestamp through all the External Item transfers in Apis Hive, unless the TimeReferenceItems module property is explicitly set. We recommend not using the TimeReferenceItems property in a catch-up solution.

EventBroker

Each Apis module normally has a set of events and a set of commands. An event in one module can cause a command to be fired in another module. The mapping of events to commands is done in the Event Broker. The OPC UA client module has a specific event, CatchUp-StepNextDone, which fires when a catch-up step is done, and a specific command, CatchUp-StepNext, which triggers each step in the catch-up process. The module also fires a ServerDataChanged event after a new set of item values is received, both during catch-up and normal operation. Only a single value for each item is updated for each triggering of the ServerDataChanged event.

Trigger the CatchUp-StepNext command from the CatchUp-StepNextDone event on the OPC UA Client Module to ensure proper operation of events during normal operation.

This is covered in more detail below.

Figure 1, Event broker, CatchUp-StepNext on CatchUp-StepNextDone.
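
The StepNext/StepNextDone loop can be pictured with a toy event broker. This is a minimal sketch and not the Apis Event Broker API: the `EventBroker` and `CatchUpSource` classes, their methods and the sample values are hypothetical, and only the event and command names are taken from the text above.

```python
from collections import defaultdict, deque

class EventBroker:
    """Toy broker: maps event names to commands and dispatches them in order."""
    def __init__(self):
        self._routes = defaultdict(list)
        self._queue = deque()

    def connect(self, event, command):
        self._routes[event].append(command)

    def fire(self, event):
        self._queue.append(event)

    def run(self):
        while self._queue:
            event = self._queue.popleft()
            for command in self._routes[event]:
                command()

class CatchUpSource:
    """Toy catch-up source: applies one history value per StepNext command."""
    def __init__(self, broker, history):
        self.broker = broker
        self.pending = list(history)
        self.applied = []

    def step_next(self):
        if not self.pending:
            return  # caught up, nothing more to replay
        self.applied.append(self.pending.pop(0))
        self.broker.fire("ServerDataChanged")
        self.broker.fire("CatchUp-StepNextDone")

broker = EventBroker()
source = CatchUpSource(broker, [1.0, 2.0, 3.0])
# The mapping described above: StepNextDone triggers the next StepNext.
broker.connect("CatchUp-StepNextDone", source.step_next)
source.step_next()
broker.run()
print(source.applied)
```

Without the StepNextDone to StepNext connection, the toy source would apply only its first value and then stop, which is why the mapping is required.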

Apis Hive Inter-module data exchange

When using the Catch-Up functionality in the Apis Hive OPC UA Client Bee, it is important to configure the Apis Hive Data Exchange accordingly.

Use the synchronized event/command functionality in the Apis Hive Event Broker instead of asynchronous timer-based Data Exchange.

ExternalItem transfer in Apis

Apis Hive is a module-based system environment that lets you collect, process and store real-time data. Different modules in Apis Hive provide different functionality, but common to all is the inter-module data exchange provided by the ExternalItem functionality.

Example of data exchange between modules

In this example system, we use a simple setup consisting of three modules in an Apis Hive default instance:

  • A connection module, the OpcUa Client bee, named OpcUaBee.

  • A process module, the Calculate Bee, named Calculate.

  • A store module, the Logger Bee, named ClientLogger.

A simple view of the modules:

Figure 2, Apis Hive with Calculate, Logger and OpcUa client module.

The OpcUaBee module has one subscription item that reads a sine signal from a source system named CustomerServer. The item is named CustomerServer.Worker.Sine. The Calculate module has one item, named SineTimes2, which has CustomerServer.Worker.Sine as input through the external item connection.

Figure 3, Item connection.

Normally, data exchange (ExternalItem) is configured on the destination side by adding one or more external items to an item through the item attributes. The destination module must then be set to use a specific exchange rate, which defines the period between each read of the connected items, normally in the millisecond range. In this way, the items are inter-connected and read at a fixed interval.
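
The reason timer-based exchange fits poorly with catch-up can be shown with a small simulation: a reader that polls at a fixed period only sees the latest value at each tick, so values replayed faster than the exchange rate are skipped. The function name, the arrival times and the period below are hypothetical illustration values.

```python
from typing import List, Tuple

def timer_based_reads(updates: List[Tuple[int, int]], period_ms: int) -> List[int]:
    """Values picked up by a reader polling every period_ms.
    updates: (arrival time in ms, value), sorted by arrival time."""
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    seen: List[int] = []
    last_value = None
    idx = 0
    tick = period_ms
    while idx < len(updates):
        # Every update arriving before the tick overwrites the previous one.
        while idx < len(updates) and updates[idx][0] <= tick:
            last_value = updates[idx][1]
            idx += 1
        if last_value is not None:
            seen.append(last_value)
        tick += period_ms
    return seen

# Ten values replayed 10 ms apart, read with a 50 ms exchange period.
replayed = [(10 * i, i) for i in range(1, 11)]
polled = timer_based_reads(replayed, period_ms=50)
print(polled)
```

In this toy run only two of the ten replayed values reach the destination, whereas an event-driven exchange would hand over each value as it is applied.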

Data exchange, by event broker:

In the figure below, we see that when the ServerDataChanged event fires, the OpcUaBee module triggers the ClientLogger module to log, then the Calculate module to handle its external items, and finally the ClientLogger module to log again. The reason for logging twice is to ensure that history values that could be needed by the calculations are logged before the calculations are triggered, and that the results of the calculations are logged afterwards. Values of items that have not changed will not be logged.

We also see that CatchUp-StepNext is triggered as before by CatchUp-StepNextDone.

Figure 4, Event broker.
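
The log, calculate, log sequence can be sketched with plain Python objects. The item names come from the example above, but the `Logger` class, the `calculate` function and the assumption that SineTimes2 is the sine value multiplied by two are illustrative guesses, not the behaviour of the actual Logger and Calculate bees.

```python
class Logger:
    """Toy logger: records an item only when its value has changed."""
    def __init__(self, items):
        self.items = items
        self.last_logged = {}
        self.records = []

    def log(self):
        for name, value in self.items.items():
            if self.last_logged.get(name) != value:
                self.records.append((name, value))
                self.last_logged[name] = value

def calculate(items):
    # Assumed calculation: SineTimes2 is twice the source sine value.
    items["SineTimes2"] = 2 * items["CustomerServer.Worker.Sine"]

items = {}
logger = Logger(items)
# Commands run in this order each time ServerDataChanged fires.
on_server_data_changed = [logger.log, lambda: calculate(items), logger.log]

for sample in (0.0, 0.5, 0.5):
    items["CustomerServer.Worker.Sine"] = sample
    for command in on_server_data_changed:
        command()

print(logger.records)
```

The first log call stores the new input before the calculation runs, the second stores the result, and the repeated 0.5 sample produces no new records because nothing changed.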

Catch-Up steps summary

The steps of the Catch-Up summed up:

  • The module re-connects after a period of lost communication

  • The buffering of the OPC UA Data Access subscription data on the items starts immediately

  • The module decides from what timestamp to start reading item history data, based on the time of communication loss or the reconnect time and the CatchUpPeriod setting.

  • The module starts to read history data in chunks and triggers the event in the Event Broker for each change in item value.

  • If configured, the external item values are synchronously updated through the Event Broker event ServerDataChanged for each value update.

  • After all history data is read, the OPC UA Client Module continues with normal OPC UA Data Access subscription operation.

  • If configured, the external item values are synchronously updated through the Event Broker for each value update.
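
The ordering implied by the steps above, history first and buffered real-time values afterwards, can be sketched as below. The function name and sample data are hypothetical, and dropping buffered samples that are not newer than the last history sample is an assumption made for this illustration, not documented module behaviour.

```python
from typing import List, Tuple

Sample = Tuple[float, str]  # (timestamp, value)

def replay_order(history: List[Sample], buffered_realtime: List[Sample]) -> List[Sample]:
    """Order in which samples reach an item: all history in time order,
    then the real-time samples that were buffered during the history read."""
    applied = sorted(history)
    last_ts = applied[-1][0] if applied else float("-inf")
    for sample in buffered_realtime:
        # Assumption: skip buffered samples already covered by history.
        if sample[0] > last_ts:
            applied.append(sample)
            last_ts = sample[0]
    return applied

order = replay_order(
    history=[(2.0, "b"), (1.0, "a"), (3.0, "c")],
    buffered_realtime=[(3.0, "c"), (4.0, "d")],
)
print(order)
```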

Recommendations

  • Use the Event Broker to handle Apis Hive Data Exchange. Triggering the CatchUp-StepNext command from the CatchUp-StepNextDone event is a must.

  • Ensure proper configuration of both the client and the source system server to achieve the same data sampling rate for both History Access item data and item subscription (OPC UA Data Access) data.

CONFIGURATION OF DIRECT CATCH-UP

Direct catch-up is more straightforward to use, and is the correct choice if:

  • you want the real-time values for items to be applied on the UA client items right away.
  • you just need to fetch historical data for periods where the UA client was not running.
  • you don't need to perform calculations and/or alarming on the items when catching up.

When using direct catch-up, the missing item data is retrieved item-by-item. This means that at any point during the catch-up process, some items will have all their missing data fetched, while other items will not.

ADVANCED CATCH-UP CONFIGURATION

When running catchup towards misbehaving UA servers, i.e. servers not returning an initial VQT for all subscribed items after creating the subscription, the catchup process will be stuck waiting forever to get started. This is because the catchup process needs to have an initial VQT for all its items, to be able to determine the End times to use for the history read operations.

To allow the catchup process to run against such UA servers, two strategies have been implemented as a workaround.
If forced, the latest of the current local computer time and the server time (as reported by the UA server) is used as end time for the stale items.

These strategies are:

  1. MaxEqualMissingEndtimesCatchupContinue: After 3 (default) publish responses during startup, with the UA server failing to update any new VQTs, the catchup process will be forced to start.
    The default value of 3 can be changed through registry key value MaxEqualMissingEndtimesCatchupContinue of the module.
  2. MaxWaitSecondsMissingEndtimesCatchupContinue: After 600 seconds (default) since startup, with the UA server failing to update all VQTs, the catchup will be forced to start.
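
The two strategies amount to a simple "either limit reached" check. The sketch below is a guess at that logic for illustration only: the function names are hypothetical, the defaults 3 and 600 are taken from the list above, and whether the module compares with "greater than or equal" is an assumption.

```python
def should_force_start(publish_responses_without_new_vqts: int,
                       seconds_since_startup: float,
                       max_equal: int = 3,
                       max_wait_seconds: float = 600) -> bool:
    """True when either strategy would force the catch-up to start (assumed logic)."""
    return (publish_responses_without_new_vqts >= max_equal
            or seconds_since_startup >= max_wait_seconds)

def forced_end_time(local_time: float, server_time: float) -> float:
    """End time for stale items when forced: the later of the two clocks."""
    return max(local_time, server_time)

print(should_force_start(3, 10))    # publish-response limit reached
print(should_force_start(1, 30))    # neither limit reached yet
print(forced_end_time(1000.0, 1005.0))
```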

To modify either of these two settings, create DWORD values named MaxEqualMissingEndtimesCatchupContinue and/or MaxWaitSecondsMissingEndtimesCatchupContinue under the registry key of the actual OPC UA client module, and enter other values.
E.g., for a Hive instance named AI_Catchup and an OPC UA client module named OpcUa, with the new values 60 (=0x3c) and 120 (=0x78):

[HKEY_LOCAL_MACHINE\SOFTWARE\Prediktor\Apis\AI_Catchup\Modules\OpcUa]  
"MaxEqualMissingEndtimesCatchupContinue"=dword:0000003c  
"MaxWaitSecondsMissingEndtimesCatchupContinue"=dword:00000078