Integrating RDBMS Systems

Using s-Server, you can read from and write to RDBMS systems.

This topic contains the following subtopics:

Reading from RDBMS Sources

You can read data from RDBMS sources using the SQL/MED plugin for s-Server. This plugin is pre-installed with s-Server.

In order to read data from an RDBMS system, you need a JDBC driver. Currently, these are preinstalled for Microsoft SQL Server, Oracle, and Teradata, and can be installed for MySQL.

Note: In order to use other JDBC drivers with SQLstream s-Server, you need to 1) copy all required jar files to the $SQLSTREAM_HOME/lib directory, 2) adjust their permissions so that they are readable by the user who runs s-Server (typically the sqlstream user), and 3) restart s-Server.

To read data, you first define a server object with connection information, including the URL, user name, and password. Once you define this server object, you can read from the database’s tables by either:

  • Defining a foreign table to map onto a table in the foreign database (the preferred method)

  • Using the SQL-standard three-level qualifier scheme for the names of tables: catalog.schema.table. In this case, the catalog name will be the name given to the foreign server in your CREATE SERVER command. See the topic CREATE SERVER in the s-Server Streaming SQL Reference Guide for more details.

The CREATE SERVER topic in the Streaming SQL Reference Guide has a complete list of options for creating SQL/MED server objects.

Foreign Server Definition

You define a foreign server in SQL using the CREATE OR REPLACE SERVER command. (You can also define servers through s-Studio.) This foreign server always uses a foreign data wrapper called SYS_JDBC. This wrapper is preinstalled with s-Server.

For example, the following code block defines a connection for a MySQL database.

CREATE OR REPLACE SERVER mysql_reader
 FOREIGN DATA WRAPPER SYS_JDBC
 OPTIONS (
   DRIVER_CLASS 'com.mysql.jdbc.Driver',
   URL 'jdbc:mysql://localhost:3306/sample',
   USER_NAME 'sqlstream',
   PASSWORD 'sqlstream'
DIALECT 'MYSQL',
   );

Once defined, you can reference this table as mysql_reader.schema.table, where “schema” is a schema defined in the MySQL database and “table” is a table defined in this schema.

The following code block defines a connection for a Teradata database:

CREATE OR REPLACE SERVER "Teradata_DB"
 FOREIGN DATA WRAPPER "SYS_JDBC"
 OPTIONS (
   URL 'jdbc:teradata://localhost/',
   USER_NAME 'sqlstream',
   PASSWORD '',
   DIALECT 'Teradata',
   TERADATA_QUERY_BAND 'org=Finance;report=EndOfYear;universe=west;',
   DRIVER_CLASS 'com.teradata.jdbc.TeraDriver'
 );

Note the presence of a TERADATA_QUERY_BAND option. This option is specific to Teradata and defined in the CREATE SERVER topic of the Streaming SQL Reference Guide.

You reference tables in the Teradata database defined above using the following qualifier: Teradata_DB.schema.table, where “schema” is a schema defined in the Teradata database and “table” is a table defined in this schema.

Defining a Foreign Table

When you define a foreign table, you

  1. Indicate a table and schema in the foreign database.
  2. Declare columns that match those in the target database table.

You can set these options when you declare a server, and override these options in a table definition.

Name Description
TABLE_NAME Name of the remote table.
SCHEMA_NAME Name of the remote schema.

Sample SQL

Like all tables, foreign tables need to be created within schemas. The following code first creates and sets a schema called “MOCHI,” then creates a foreign table called “regional_activity” which uses the server defined above and indicates a schema in the foreign database called “public” and a table in the foreign database called “regional_activity.”

CREATE OR REPLACE SCHEMA "MOCHI"
SET SCHEMA 'MOCHI';
CREATE OR REPLACE FOREIGN TABLE "regional_activity"
 SERVER "Postgres_DB"
 OPTIONS (
   SCHEMA_NAME 'public'
--schema name in the foreign database
   TABLE_NAME 'regional_activity'
--table name in the foreign database
 ) DESCRIPTION 'per-city summary of suspicious activity';

Querying RDBMS Tables

You can query JDBC foreign tables in one of two ways:

  • By defining foreign tables within s-Server that map onto tables in the foreign database, such as “my_s-ServerSchema”.“foreign_customers”. This has the advantage of hiding connection details. You can also restrict privileges to this table. This is the recommended technique for accessing foreign tables.

  • By using the three name qualifier scheme, such as “Oracle_DB”.””.“customers”. Merge currently does not function for PostgreSQL and MySQL.

Using a Foreign Table

See the topic CREATE FOREIGN TABLE in the Streaming SQL Reference Guide for more details. Once you create a foreign table, you can select from it like any other table in s-Server, with the contents of the foreign table automatically fetched.

select_rdbms_source_for_table
Using the Three Name Qualifier Schema

To read from a foreign table use the schema/table as you would for any other table and preface the server name as the catalog.

! select_rdbms_source_direct

JDBC Foreign Data Wrapper Definition

Because this plugin is included as part of the standard s-Server distribution, a corresponding foreign data wrapper instance named SYS_JDBC is predefined by SQLstream s-Server’s initialization scripts. Normally, there is no need to define additional instances.

However, it may be useful to define additional wrapper instances corresponding to specific DBMS types. For the JDBC plugi,n all SQL/MED options declared on the wrapper are propagated to the server, allowing common option settings to be factored out of individual server definitions. (Note that this is not necessarily true for other plugins.) Options specifically set on a server definition always take precedence over settings inherited from a wrapper.

Failure scenarios

SQLstream s-Server fails or is unreachable

If the SQLstream s-Server fails, the driver reports an error. The JDBC driver reports an error. If the client application is waiting in a driver method at that time, it will be notified of the error immediately. Otherwise the client application may not learn of the server crash until it next calls into the driver.

When first asked to create a JDBC connection, the driver attempts to establish an SDP connection. If the SQLstream s-Server is unreachable and either of these connections cannot be made, the driver reports an error.

(The system administrator might prevent this situation from happening by implementing a hot- or cold-swap for the SQLstream s-Server, and a load balancer so that clients can address it using the same name.)

Client application crashes

If the SQLstream s-Server detects that the client application crashed, the SQLstream s-Server then cleans up any server-side orphaned statements. Server-side distributed garbage collection (DGC) eventually harvests the remote objects abandoned by the crashed application and finalizers in these objects perform any remaining cleanup.

Tailing Foreign Tables with SQL/MED

In using the SQL/MED adapter to query remote database tables, you will often want to “tail” an external database table. “Tailing” refers to setting up an open-ended query that returns any newly-added rows from the external database table. When new rows are added, s-Server automatically adds these to a streaming query. In order to tail a database, you will need to make use of the STREAM keyword. Before discussing this process, this document will briefly cover a more limited use case of querying a remote database table–a static query on the current contents of the table.

Foreign tables are created automatically once you create a server object using the SQL/MED plugin. This process is described in the topic Reading Data from RDBMS Sources, and involves code along the following lines:

CREATE OR REPLACE SCHEMA "ForeignTableData";
SET SCHEMA '"ForeignTableData"';

CREATE OR REPLACE SERVER mysql_reader
 FOREIGN DATA WRAPPER SYS_JDBC
 OPTIONS
    (
   DRIVER_CLASS 'com.mysql.jdbc.Driver',
   URL 'jdbc:mysql://localhost:3306/sample',
   USER_NAME 'sqlstream',
   PASSWORD 'sqlstream',
   txInterval '50'
    );

Note: You can also create foreign tables manually. See the topic Reading Data from RDBMS Sources for more details.*

Option for Performing a Static Query on a Remote Database Table

*SELECT * FROM * is a relational table read. Returns all the current rows and then an end of statement. You do not need to create a special foreign table object in order to execute this query. You can do so using the the SQL-standard three-level qualifier scheme for the names of tables.

Options for Performing a Tailing Query on a Remote Database Table

SELECT STREAM * FROM <FOREIGN TABLE> (note the use of the STREAM keyword) is a private polling table read. This query returns all current rows and rows in the future. There is no need for a highwater mark when using this option.

SELECT STREAM * FROM <FOREIGN STREAM> (note the use of the STREAM keyword and the FOREIGN STREAM object) is a shared/public polling table read. This option requires you to create a foreign stream object for the remote database table, as described below. This query returns all current rows and rows in the future based upon the current state of the highwater mark. A highwater mark is a data point that defines the current row being read by an s-Server query (this is important only when multiple readers are querying the same SQL/MED foreign stream).

Initially this query will work the same way as SELECT STREAM * FROM <FOREIGN TABLE> (it returns all rows). However, if another query starts a read on the same foreign stream, then this query will pick up at the same point (the highwater mark) as the other query. If no query is currently being run against the foreign stream, the foreign stream does not discard rows. As a result, querying the foreign stream a second time will initially return only new tuples as the stream “catches up” to the current state of the table. Any new query will update the highwater mark to the current highest rowtime or key.

Writing a Tailing Query Using the STREAM Keyword/Table Object Option

In order to tail a remote database table using this option, you prepare a SELECT query with a lower and upper bound on a monotonically increasing column of the foreign stream. For example, given a remote table called TEST_TABLE and a monotonically increasing column called ID, you might use a query along the following lines:

SELECT STREAM * FROM TEST_TABLE WHERE ID > -1 AND ID <= (SELECT MAX(ID) FROM TEST_TABLE) - 2;
  • The id > ? is the lower bound and id <= (select max(id) from TEST_TABLE) - 2 is the upper bound. The lower bound excludes rows that we have already seen. The upper bound excludes rows that may not have been committed yet.
  • The 2 in the upper bound is called the txInterval and gives a fudge factor for the inserters to this table. If there will only be 1 inserter to TEST_TABLE, txInterval can safely be set to 0 (it must always be positive). txInterval defaults to 2, but you should set it equal to the maximum total number of concurrent inserters into the foreign table (that is, the maximum number of insert queries that could simultaneously be executing at once).
  • Another table property, pollingInterval, determines how long to wait between executing the prepared statement against the remote DB table. The MED/JDBC reader only waits if no rows were found in the last execution pass.
  • Multiple monotonically increasing columns where clause expressions in the select query are OR’ed together (ie (ID > ? AND ID <= (select max(id) from TEST_TABLE)) OR (ID2 > ? AND ID2 <= (select max(id) from TEST_TABLE))).
  • You can use this process with both foreign streams and foreign tables in a select stream query. In foreign stream use cases, a highwater mark will be stored.

Again, this query only works for private polling. If you want to create a tailing query that lets multiple users read from a remote database table, you will need to create a foreign stream object.

Tailing a Remote Database Table with a Foreign Stream Object

When you tail a remote database table with a foreign stream object, you allow multiple readers to query the remote database table. In order to do so, you need to create a foreign stream and make use of a highwater mark.

You pass highwater marks using an option called “SQLSTREAM_POSITION_KEY” in the OPTIONS clause of FOREIGN STREAM definition.

Highwater marks can be passed dynamically through an OPTIONS_QUERY option on the foreign stream. This lets you change the highwater mark if the foreign stream is being queried by different users. See Using the OPTIONS_QUERY Option below.

SQL for Creating Foreign Stream to Tail Foreign Table

The following SQL first creates a schema in which the foreign stream will reside, called ForeignTableData. The next block of SQL creates a server object.

CREATE OR REPLACE SCHEMA "ForeignStreamData";
SET SCHEMA '"ForeignStreamData"';

CREATE OR REPLACE SERVER mysql_reader
FOREIGN DATA WRAPPER SYS_JDBC
OPTIONS
   (
  DRIVER_CLASS 'com.mysql.jdbc.Driver',
  URL 'jdbc:mysql://localhost:3306/sample',
  USER_NAME 'sqlstream',
  PASSWORD 'sqlstream',
  txInterval '1000'
   );

CREATE OR REPLACE FOREIGN STREAM "ForeignTableDataStream"
OPTIONS
(
 "id" BIGINT NOT NULL,
 "col1" VARCHAR(10),
 "col2" DOUBLE,
 "col3" INTEGER
)
SERVER "mysql_reader"
   (
    TABLE_NAME 'sample_data',
    queryCol 'id',
    --highwater mark
    "SQLSTREAM_POSITION_KEY" '1000',
    SCHEMA_NAME 'public',
    DESCRIPTION 'Foreign stream to deliver rows appended to
       public.sample_data. Foreign streams are a shared or public polling table
       read.'
   )
;

Using the OPTIONS_QUERY Option for Highwater Marks

When you use the OPTIONS_QUERY, you can pass options to the foreign stream dynamically without the need to change the FOREIGN STREAM statement itself.

In the following example, each user can pass her own watermark to the foreign stream while running the query. When the SELECT query is run on ForeignTableDataStream, s-Server also runs OPTIONS_QUERY to fetch new values of for options.

CREATE OR REPLACE SCHEMA "ForeignStreamData";
SET SCHEMA '"ForeignStreamData"';

CREATE OR REPLACE SERVER mysql_reader
FOREIGN DATA WRAPPER SYS_JDBC
OPTIONS
   (
  DRIVER_CLASS 'com.mysql.jdbc.Driver',
  URL 'jdbc:mysql://localhost:3306/sample',
  USER_NAME 'sqlstream',
  PASSWORD 'sqlstream',
  txInterval '1000'
   );

CREATE OR REPLACE FOREIGN STREAM "ForeignTableDataStream"
OPTIONS (
 "id" BIGINT NOT NULL,
 "col1" VARCHAR(10),
 "col2" DOUBLE,
 "col3" INTEGER
)
SERVER "mysql_reader"
   (
    TABLE_NAME 'sample_data',
    queryCol 'id',
    "OPTIONS_QUERY" 'SELECT SQLSTREAM_POSITION_KEY, USER_NAME, PASSWORD FROM my_config_table WHERE STREAM_NAME = ''ForeignTableDataStream''',
    SCHEMA_NAME 'public',
    DESCRIPTION 'Foreign stream to deliver rows appended to
       public.sample_data.'
   )
;

Monotonically increasing columns are indicated either using the queryCol table property or by MONOTONIC_=true table property. For example the following are equivalent:

`queryCol ‘ID’

MONOTONIC_ID 'true'

queryCol and the Foreign Stream’s High Water Mark

Since table entries from multiple insertion sources can arrive without being in strict order, a newly arriving row can have a queryCol value higher than a prior row. If earlier rows with lower queryCol values were already streamed, a new arriving row with a higher queryCol value is seen as “out of order” and discarded.

You can minimize this data loss by specifying a querytSort parameter. This parameter establishes an interval in which rows with out-of-order queryCol can arrive. The querytSort value you choose enables the streaming of rows whose queryCol values are lower than “high-water mark minus querytSort.” As incoming rows establish a new high-water mark, earlier rows outside that new interval can then be streamed.

Querying the Foreign Stream as a Tail

In order to tail a remote database table using this option, you prepare a SELECT query with a lower and upper bound on a monotonically increasing column of the foreign stream. For example, given a foreign stream called TEST_STREAM and a monotonically increasing column called ID, you might use a query along the following lines:

SELECT STREAM * FROM TEST_STREAM WHERE ID > <HIGHWATER MARK> AND ID <= (SELECT MAX(ID) FROM TEST_TABLE) - 2;
  • The id > is the lower bound and id <= (select max(id) from TEST_STREAM) - 2 is the upper bound. The lower bound excludes rows that we have already seen. The upper bound excludes rows that may not have been committed yet.

  • The 2 in the upper bound is called the txInterval and allows flexibility (a “fudge factor”) for the inserters to this table. txInterval defaults to 2, but should be set to the maximum total number of concurrent inserters into the foreign table (ie the maximum number of insert queries that could simultaneously be executing at once). If you know that there will only be one inserter for TEST_TABLE, you can safely set txInterval to 0 (it cannot be negative).

  • Another table property, pollingInterval, determines how long to wait between executing the prepared statement against the remote DB table. The MED/JDBC reader only waits if no rows were found in the last execution pass.

  • Multiple monotonically increasing columns where clause expressions in the select query are OR’ed together. For example: (ID > ? AND ID <= (select max(id) from TEST_TABLE)) OR (ID2 > ? AND ID2 <= (select max(id) from TEST_TABLE)).

  • You can use this process with both foreign streams and foreign tables in a select stream query. In foreign stream use cases, a highwater mark will be stored.

Writing to RBDMS Sinks

To write data to an RDBMS destination, you first define a server object with connection information, including the URL for the RDBMS, user name, and password.

This topic contains the following subtopics: - Overview - JDBC Foreign Server Definition - Inserting-into-Foreign-Tables - Using-a-Foreign-Table - Managing-Commits - JDBC-Foreign-Data-Wrapper-Definition - Errors-for-Writing-to-RDBMS-Tables - Using-PRESERVE_PARTIAL_ROWS-and-BULK_LOAD - Examples

In order to write data to an RDBMS system, you need a JDBC driver. Currently, these are preinstalled for Microsoft SQL Server, Oracle, and Teradata, and can be installed for MySQL and PostgreSQL.

Once you define this server object, you can insert into the database’s tables by either:

  • Defining a foreign table to map onto a table in the foreign database (the preferred method)
  • Using the SQL-standard three-level qualifier scheme for the names of tables: catalog.schema.table. In this case, the catalog name will be the name given to the foreign server in your CREATE SERVER command. See the topic CREATE SERVER in the s-Server Streaming SQL Reference Guide for more details.

The CREATE SERVER topic in the SQLstream Streaming SQL Reference Guide has a complete list of options for creating SQL/MED server objects.

You can use two options, TRANSACTION_ROWTIME_LIMIT and TRANSACTION_ROW_LIMIT, to manage when rows commit. See Managing Commits below. You can also determine how s-Server handles write errors using options on the INSERT and MERGE statements, including managing retries. See Errors for Writing to RDBMS Tables below.

{{./includes/int_note_on_writing_pumps.mmark}}

JDBC Foreign Server Definition

Like all plugins, in order to use the SQL/MED plugin for JDBC, you need to set up a foreign server SQL object to handle connection parameters for the foreign DBMS. This foreign server always uses a foreign data wrapper called SYS_JDBC. This wrapper is preinstalled with s-Server.

For example, the following code block defines a connection for a MySQL database.

CREATE OR REPLACE SERVER mysql_reader
FOREIGN DATA WRAPPER SYS_JDBC
OPTIONS (
​      DRIVER_CLASS 'com.mysql.jdbc.Driver',
​      URL 'jdbc:mysql://localhost:3306/sample',
​      USER_NAME 'sqlstream',
​      PASSWORD 'sqlstream'
      DIALECT 'MYSQL',
​      );

The following code block defines a connection for a Teradata database:

CREATE OR REPLACE SERVER "Teradata_DB"
    FOREIGN DATA WRAPPER "SYS_JDBC"
    OPTIONS (
        URL 'jdbc:teradata://localhost/',
        USER_NAME 'sqlstream',
        PASSWORD '',
        DIALECT 'Teradata',
        TERADATA_QUERY_BAND 'org=Finance;report=EndOfYear;universe=west;',
        DRIVER_CLASS 'com.teradata.jdbc.TeraDriver'
    );

Note the presence of a TERADATA_QUERY_BAND option. This option is specific to Teradata and is explained below.

You reference tables in the Teradata database defined above using the following qualifier: Teradata_DB.schema.table, where “schema” is a schema defined in the Teradata database and “table” is a table defined in this schema.

Inserting into Foreign Tables

You can insert into JDBC foreign tables in one of two ways:

  • By defining foreign tables within s-Server that map onto tables in the foreign database, such as “my_s-ServerSchema”.“foreign_customers”. This has the advantage of hiding connection details. You can also restrict privileges to this table. This is the recommended technique for accessing foreign tables.
  • By using the three name qualifier scheme, such as “Oracle_DB”.””.“customers”. Merge currently does not function for PostgreSQL and MySQL.

Using a Foreign Table

SSee the topic CREATE FOREIGN TABLE in the SQLstream Streaming SQL Reference Guide for more details.

Using the Three Name Qualifier Schema

To write to a foreign table use the schema/table as you would for any other table and preface the server name as the catalog.

CREATE OR REPLACE SCHEMA "RDBMS_schema";
SET SCHEMA "RDBMS_schema";
CREATE OR REPLACE PUMP "writerPump" STOPPED AS
INSERT INTO "PostgresDB".my_schema.my_table
SELECT STREAM * FROM "MyStream";
--where "MyStream" is a currently existing stream

To start writing data, use the following code:

ALTER PUMP 'writerPump' START;

Managing Commits

In defining the foreign server or table, you can set parameters that manage how frequently the foreign database table is updated.

Option Defnition
TRANSACTION_ROWTIME_LIMIT The number of milliseconds which may elapse between commits. Defaults to 1000 ms (1 second).
TRANSACTION_ROW_LIMIT The number of rows to batch up before committing. Defaults to 0.

Say that

TRANSACTION_ROWTIME_LIMIT = M

and

TRANSACTION_ROW_LIMIT = N

Whenever the rowtime advances due to a new row or rowtime bound, we commit if either of the following conditions holds:

a) M > 0 and M milliseconds have elapsed since our last commit.

or

b) N > 0 and N rows have flowed by since our last commit..

JDBC Foreign Data Wrapper Definition

Because this plugin is included as part of the standard s-Server distribution, a corresponding foreign data wrapper instance named SYS_JDBC is predefined by SQLstream s-Server’s initialization scripts. Normally, there is no need to define additional instances.

However, it may be useful to define additional wrapper instances corresponding to specific DBMS types. For the JDBC plugin, all SQL/MED options declared on the wrapper are propagated to the server, allowing common option settings to be factored out of individual server definitions. (Note that this is not necessarily true for other plugins.) Options specifically set on a server definition always take precedence over settings inherited from a wrapper.

Errors for Writing to RDBMS Tables

Writes into an RDBMS sink can raise two kinds of errors:

Error type Description
Transient errors These are errors which might go away if we tried to write the row again. These problems include connection resets and java.sql.SQLTransientExceptions raised by the foreign database.
Fatal errors All other errors are considered fatal. The result won’t change if we try to write the row again.

You can manage how s-Server behaves when it encounters these errors by setting options in the INSERT and MERGE statements.

Version 6.0 Feature Version 6.0 lets you use options in an INSERT or MERGE statement. These override table/stream and server options if these are set.

Using PRESERVE_PARTIAL_ROWS and BULK_LOAD

s-Server writes data into RDBMS tables in batches. The size of each batch is determined by TRANSACTION_ROWTIME_LIMIT and TRANSACTION_ROW_LIMIT. If one of the rows in the batch fails to write, s-Server conforms to the following error policy.

  • ABORT - The statement terminates and all writes are rolled back. This is the default behavior for finite DML (an INSERT or MERGE that writes from a table).
  • STUMBLE - The current batch of rows is rolled back but the statement continues to run. This is the default behavior for infinite DML (an INSERT or MERGE that writes from a stream) and for finite bulk-load.
  • SKIP - The failed row is discarded, but the statement continues to run. Other rows in the batch may be committed, depending on how the database’s JDBC driver handles java.sql.Statement.executeBatch(). You can enable this behavior by setting PRESERVE_PARTIAL_WRITES to true.

The error policy is determined by two options in the INSERT and MERGE statements: PRESERVE_PARTIAL_ROWS and BULK_LOAD.

When BULK_LOAD is set to true, s-Server will commit rows whenever a batch of buffered rows is flushed to the external database.

When PRESERVE_PARTIAL_WRITES, s-Server will commit rows whenever a batch of buffered rows is flushed to the external database.

See DML options for the INSERT statement for more details (these are the same for both INSERT and MERGE).

INSERT and MERGE into med jdbc tables with streams as a source (finite inserts and merges) behave as follows for the indicated values of (PRESERVE_PARTIAL_ROWS, BULK_LOAD):

PRESERVE_PARTIAL_ROWS BULK_LOAD Result
false false STUMBLE
false true Illegal
true false SKIP
true true Illegal

INSERT and MERGE with tables as a source (finite inserts and merges) into med jdbc tables behave as follows for the indicated values of (PRESERVE_PARTIAL_ROWS, BULK_LOAD):

PRESERVE_PARTIAL_ROWS BULK_LOAD Result
false false ABORT
false true STUMBLE
true false illegal
true true skip

Examples

\--
-- ABORT (default for finite DML)
\--
insert into CAMPAIGNS
 values ('aaa', 1), ('bbb', 2), ('ccc', 3), ('aaa', -1), ('ddd', 4), ('eee', 5);
\--
-- STUMBLE (default for infinite DML and for finite bulk-load)
\--
insert into CAMPAIGNS with options (bulk_load 'true')
 values ('aaa', 1), ('bbb', 2), ('ccc', 3), ('aaa', -1), ('ddd', 4), ('eee', 5);
merge into CAMPAIGNS t
 using (select stream CAMPAIGN_ID, TOTAL_CLICKS from CONSOLIDATED_STREAM) as s
 on t.TOTAL_CLICKS = s.TOTAL_CLICKS
 when matched then update set CAMPAIGN_ID = s.CAMPAIGN_ID
 when not matched then insert (CAMPAIGN_ID, TOTAL_CLICKS) values
 (s.CAMPAIGN_ID, s.TOTAL_CLICKS)
;
\--
-- SKIP
\--
insert into CAMPAIGNS
 with options (preserve_partial_writes 'true', bulk_load 'true')
 values ('aaa', 1), ('bbb', 2), ('ccc', 3), ('aaa', -1), ('ddd', 4), ('eee', 5);
merge into CAMPAIGNS with options (preserve_partial_writes 'true') t
 using (select stream CAMPAIGN_ID, TOTAL_CLICKS from CONSOLIDATED_STREAM) as s
 on t.TOTAL_CLICKS = s.TOTAL_CLICKS
 when matched then update set CAMPAIGN_ID = s.CAMPAIGN_ID
 when not matched then insert (CAMPAIGN_ID, TOTAL_CLICKS) values
 (s.CAMPAIGN_ID, s.TOTAL_CLICKS)
;

Managing Retries

When s-Server encounters a transient error–an error that might be resolved if s-Server tries rewriting the row–it may try rewriting the row, depending on the following:

  • Whether s-Server is writing from a stream or table. Writing from a stream is known as infinite DML, while writing from a table is known as finite DML. By default, when s-Server is writing from a stream, it retries writing rows that raise transient errors. When s-Server is writing from a table, it does not retry writing rows that raise transient errors.
  • How you have configured rewrites using the WITH OPTIONS clause of an INSERT or MERGE statement.

Using the WITH OPTIONS Clause to Manage Rewrites

You can adjust how and when s-Server rewrites rows in the case of transient errors. Note: Fatal errors ABORT, STUMBLE, or SKIP immediately, depending on how you have configured options above.

You do so by setting options in the WITH OPTIONS clause of an INSERT or MERGE statement. These options are as follows:

Option Definition
RETRY_COUNT The number of times that s-Server tries rewriting a batch of rows that fails because of a transient error. -1 means “no maximum number of retries”. Other negative values (and non-integer values) are illegal. For finite DML, RETRY_COUNT defaults to 0. For infinite DML, it defaults to -1.Each retry attempt will pause longer than the previous attempt.
RETRY_MAX_TIME The maximum number of milliseconds that s-Server will spend retrying a failed batch. It defaults to -1, which means “no maximum time”. RETRY_MAX_TIME = 0 means the same thing as RETRY_COUNT = 0: don’t retry at all.
RETRY_PAUSE_MIN The minimum number of milliseconds to pause between retries. It defaults to 10,000 (10 seconds). The value must be a positive integer–anything else raises an error.
RETRY_PAUSE_MAX The maximum number of milliseconds to pause between retries. It defaults to 300,000 (5 minutes). The value must be a positive integer–anything else raises an error.

s-Server continues trying to rewrite batches of rows until it reaches either _RETRYCOUNT or _RETRY_MAXTIME. Once it encounters either, then the statement will go on to enforce its core error policy: ABORT, STUMBLE, or SKIP.

We also raise an error if _RETRYCOUNT is set to a positive value on a finite DML statement but _BULKLOAD is not set to true. This is because the first step of retrying a batch is to roll back the previous attempt of the batch. For finite statements without _BULKLOAD, that would mean we would have to cache a potentially huge number of rows–all the way back to the beginning of the statement.

Here’s an example. The following INSERT statement retries batches of rows up to 5 times, pausing for 100 milliseconds before the first retry and pausing up to a maximum of 500 milliseconds before subsequent retries:

INSERT INTO foreignTable
with options
(
retry_count '5',
retry_pause_min '100',
retry_pause_max '500'
)
SELECT STREAM * from sourceStream;