The CREATE PUMP statement creates a pump, either in the specified schema (if the qualified-pump-name includes a schema name) or in the current schema.
A pump is a SQLstream schema object (an extension of the SQL standard) that provides a continuously running INSERT INTO stream SELECT… FROM query functionality, thereby enabling the results of a query to be continuously added into a named stream or foreign stream. You can also use a MERGE statement for the INSERT INTO clause, but only if you are inserting into a (foreign) table.
The topic MERGE in this guide provides an example of using a MERGE statement with a pump. See the topics SELECT statement and INSERT statement in this guide for more details.
All pumps are created as STOPPED by default. You can also specify that they be created as STARTED. We recommend creating pumps as STOPPED and then using ALTER PUMP to start them.
You cannot create a network of pumps with cyclic references - that is, pumps that feed streams from which their own data has originated. s-Server will return an error if you try to create such a pump. See Pumps and Cyclic References below.
Subtopics:
CREATE [ OR REPLACE ] PUMP [<schema-name>].<pump-name>
[ STARTED | STOPPED ]
[ OPTIONS ( <variable> 'value' [, <variable> 'value'] ) ]
[ DESCRIPTION '<string-literal>' ]
AS <streaming-insert>
INSERT INTO <sink-stream-name> SELECT STREAM <expression-list> FROM <source-stream>
CREATE PUMP simple_pump STOPPED
AS
INSERT INTO sink_stream
SELECT STREAM * from source-stream
The following code first creates and sets a schema, then creates two streams in this schema:
CREATE SCHEMA "Test";
SET SCHEMA '"Test"';
CREATE OR REPLACE STREAM "OrderDataWithCreateTime" (
"key_order" VARCHAR(20),
"key_user" VARCHAR(20),
"key_billing_country" VARCHAR(20),
"key_product" VARCHAR(20),
"quantity" INTEGER,
"eur" DOUBLE,
"usd" DOUBLE)
DESCRIPTION 'Creates origin stream for pump';
CREATE OR REPLACE STREAM "OrderData" (
"key_order" VARCHAR(20),
"key_user" VARCHAR(20),
"country" VARCHAR(20),
"key_product" VARCHAR(20),
"quantity" INTEGER,
"eur" DOUBLE,
"usd" DOUBLE)
DESCRIPTION 'Creates destination stream for pump';
The following code uses these two streams to create a pump. Data is selected from “OrderDataWithCreateTime” and inserted into “OrderData”.
CREATE SCHEMA "Pumps";
SET SCHEMA '"Pumps"';
CREATE OR REPLACE PUMP "200-ConditionedOrdersPump" STOPPED AS
--We recommend creating pumps as stopped
--then using ALTER PUMP [..] START to start it
INSERT INTO "Test"."OrderData" (
"key_order", "key_user", "country",
"key_product", "quantity", "eur", "usd")
--note that this list matches that of the query
SELECT STREAM
"key_order", "key_user", "key_billing_country",
"key_product", "quantity", "eur", "usd"
--note that this list matches that of the insert statement
FROM "Test"."OrderDataWithCreateTime";
To start writing data, use the following code. The code below ALTERS the pump using a qualified pump name (schema and pump).
ALTER PUMP "Pumps"."200-ConditionedOrdersPump" START;
For more detail, see the topic Pumps, Streams, and Processing Pipelines in the s-Server Concepts Guide.
The nature of streaming data means that streaming applications work in pipelines, with multiple streams connected by pumps. For example, you might have stream 1 that ingests data from a log file; stream 2 that enriches this data by joining it with data from an RDBMS system; stream 3 that performs analysis on this data; and stream 4 that provides data for StreamLab or another visualization application.
s-Server disallows cyclic references, that is, pumps that feed earlier streams from later streams. In the example above, you could not add an additional pump that feeds stream 1 from stream 4. When you execute a CREATE PUMP statement, s-Server checks for cyclic references and returns an error if it finds any.
To determine cyclic reference, we do a topological sort of all existing pumps to determine if they are cyclic. The nature of streaming data means that streaming applications work in pipelines, with multiple streams connected by pumps. For example, you might have stream 1 that ingests data from a log file; stream 2 that enriches this data by joining it with data from an RDBMS system; stream 3 that performs analysis on this data; and stream 4 that provides data for StreamLab or another visualization application. Here, the following pumps operate:
These streams represent a topological order, and in order for the application to function correctly, pumps need to start in order. s-Server automatically starts pumps in the correct topological order when you use a single ALTER PUMP statement referencing multiple pumps.