Query Patterns

This section provides some examples of common use cases streaming query patterns, including service level agreements and periodic reports.

Service Level Agreement

A common requirement is to report on the success or failure of business processes to meet service level agreements (SLAs). This section presents a pair of examples.

Finding cases that meet a time-based SLA

Assuming two streams orders (ROWTIME, id, city, …) and shipments (ROWTIME, order_id, …) the following query will show all those orders intended for New York which were successfully shipped within an hour:

CREATE VIEW compliant_orders AS
  SELECT STREAM *
FROM orders OVER sla
JOIN shipments
ON orders.id = shipments.orderid
WHERE city = 'New York'
WINDOW sla AS (RANGE INTERVAL '1' HOUR PRECEDING)

The “triggering” event which forces the emission of the result is the shipment (which is not windowed).

Identifying SLA failures

A simple approach is to report on orders that have been shipped but which have missed the deadline. In this example we find all orders shipped between 1 and 24 hours after order:

CREATE VIEW late_shipments AS
  SELECT STREAM *
FROM orders OVER sla
JOIN shipments
ON orders.id = shipments.orderid
WHERE city = 'New York'
ANDshipments.ROWTIME - orders.ROWTIME > INTERVAL '1' HOUR
WINDOW sla AS (RANGE INTERVAL '1' DAY PRECEDING)

This formulation does not capture orders which were shipped in more than one day, or have not been shipped yet. To do that we can create a stream view of “SLA expiry” events:

CREATE VIEW orders_sla AS
  SELECT STREAM o.ROWTIME + INTERVAL '1' HOUR AS ROWTIME
, o.*
FROM orders o

Now we can find those orders for which no shipment has been received in the hour between the order being logged and the expiry of the SLA period. Note that the “triggering event” is now the orders_sla, and the filter on s.orderid IS NULL restricts the output to only those orders which fail the SLA:

CREATE VIEW non_compliant_orders AS
  SELECT STREAM *
FROM orders_sla AS os
OUTER JOIN shipments s over sla
ON os.id = s.orderid
WHERE city = 'New York'
ANDs.orderid IS NULL
WINDOW sla AS (RANGE INTERVAL '1' HOUR PRECEDING)
 CREATE VIEW orders_and_shipments AS
  SELECT STREAM os.id, os.cust_id, 1 as net
FROM orders_sla AS os
  UNION
  SELECT STREAM s.order_id, s.cust_id, -1 as net
FROM shipments s
CREATE VIEW non_compliant_orders AS
  SELECT STREAM id, cust_id, SUM(net)
  FROM orders_and_shipments as B
 GROUP BY FLOOR(B.ROWTIME TO HOUR),id, cust_id
 HAVING SUM(net) > 0

The condition SUM(net)>0 means the order hasn’t shipped.

Periodic Reports

A periodic report is a summary of activity over a time period, such as a daily sales report or an hourly traffic report. It’s easy to create such reports in SQL with GROUP BY. SQL standard intervals are in units of years, months, days, hours, minutes, and seconds. But what if you want a summary report for each 8-hour shift in the day, or an interim report at 3pm and a final report at 6pm each day? These are requirements for partitioned streaming aggregations over unusual intervals. Here’s an example of creating GROUP BY reports at 8 A.M. and 4 P.M. with partitioning by ticker.

CREATE OR REPLACE VIEW "ShiftReports"
DESCRIPTION
  '24x7 round-the-world trading, summarized in two 8-hour shift reports each day'
AS
SELECT STREAM
  "ticker",
  COUNT(*) AS "bids",
  SUM("shares") AS "shares",
  CAST(SUM("shares" * "price") AS DECIMAL(10,2)) AS "amounts"
FROM SALES.BIDS AS B
GROUP BY
  FLOOR(B.ROWTIME - INTERVAL '8' HOUR TO DAY),
  FLOOR(B.ROWTIME - INTERVAL '16' HOUR TO DAY),
  "ticker";

Rows are output anytime some column in the GROUP BY list changes, which for the above example will be your chosen intervals, every day at 8am and 4pm.

This technique of converting intervals to DAY is very useful for handling or reporting any general uneven intervals groupings.

For even intervals, the following code is of interest:

CREATE OR REPLACE VIEW "ShiftReports-10sec"
 DESCRIPTION
'24x7 round-the-world trading, summarized in 10-second reports each day'
 AS
 SELECT STREAM
"ticker",
COUNT(*) AS "bids",
SUM("shares") AS "shares",
CAST(SUM("shares" * "price") AS DECIMAL(10,2)) AS "amounts"
 FROM SALES.BIDS AS B
 GROUP BY
FLOOR((B.ROWTIME - TIMESTAMP '1970-01-01 00:00:00') SECOND / 10 to second),
"ticker";

TIMESTAMP ‘1970-01-01 00:00:00’ is an arbitrary timestamp (in this case the epoch). The subtraction gives an interval in seconds. This example provides six summary reports each minute, for each 10-second interval, and is equivalent to the next example.

In \$SQLSTREAM_HOME/s-Server/examples/querypatterns/periodic you can find a catalog.sql script with the code for the ShiftReports view above, plus a DataGen configuration file to generate a stream of SALES.BIDS sample data. The \$SQLSTREAM_HOME/examples/querypatterns/agg/sales.bids.random.set.xml file generates 5000 rows at a rate of 10 per second. This is a bit more than 8 minutes of data, not enough to produce any aggregated output from ShiftReports. But you probably weren’t willing to wait 8 hours for your first aggregated rows, anyway. So consider this shorter-attention-span simulation:

CREATE OR REPLACE VIEW "ShiftReports-10sec"
DESCRIPTION
  '24x7 round-the-world trading, summarized in 10-second reports each day'
AS
SELECT STREAM
  "ticker",
  COUNT(*) AS "bids",
  SUM("shares") AS "shares",
  CAST(SUM("shares" * "price") AS DECIMAL(10,2)) AS "amounts"
FROM SALES.BIDS AS B
GROUP BY
  FLOOR(B.ROWTIME TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '10' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '20' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '30' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '40' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '50' SECOND TO MINUTE),
  "ticker";

This view is also provided in \$SQLSTREAM_HOME/s-Server/examples/querypatterns/periodic/catalog.sql. ShiftReports-10sec produces aggregated output for each ticker every 10 seconds.

Periodic Reports In Depth

The Periodic Reports examples include ShiftReports-10sec which produces aggregated output for each ticker every 10 seconds.

CREATE OR REPLACE VIEW "ShiftReports-10sec"
DESCRIPTION '24x7 round-the-world trading, summarized in 10-second reports each day' AS
SELECT STREAM
  "ticker",
  COUNT(*) AS "bids",
  SUM("shares") AS "shares",
  CAST(SUM("shares" * "price") AS DECIMAL(10,2)) AS "amounts"
FROM SALES.BIDS AS B
GROUP BY
  FLOOR(B.ROWTIME TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '10' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '20' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '30' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '40' SECOND TO MINUTE),
  FLOOR(B.ROWTIME - INTERVAL '50' SECOND TO MINUTE),
  "ticker";

This analysis involved adding three new columns, amount, total shares, and total amt. The Calc file is included at \$SQLSTREAM_HOME/examples/querypatterns/periodic/ShiftReports-10sec.ods. The total shares and total amt values are calculated to aggregate the per-ticker values within each 10-second grouping.

SQLstream’s streaming GROUP BY uses the time-based grouping clauses to identify time-period cut-off points for incoming ROWTIMEs. Aggregations are performed for each ticker during a given time period. At each cut-off point, the current per-ticker aggregations are output and reset for the next time period.

The incoming data starts with ROWTIME 12:12:52.00. (SQLstream ROWTIMEs are in milliseconds, but in this example Calc shows them rounded to hundredths of a second.) The time-based grouping clauses specify 10-second time periods starting at the top of each minute, e.g., hh:mm:00, hh:mm:10, hh:mm:20, etc. So, this input data’s first time period will end just before 12:13:00.

This means that the arrival of a row with a ROWTIME >= 12:13:00 triggers the output of the first time period’s aggregations.