Telemetry provides information on the structure, status, and progress of data flowing in s-Server, such as the net input and output bytes for a particular stream, when a stream started processing data, and when a stream stopped processing data. Telemetry allows you to see the overall structure of your data pipelines, how your data is flowing and where it is actually transformed, and the progress and throughput of data moved through pipelines, so that, for example, you can identify performance-slowing bottlenecks. The management views SESSIONS_VIEW and STATEMENTS_VIEW may both be useful in identifying graphs and nodes related to your streaming application.
This topic contains the following information:
To understand telemetry and related concepts, it helps to have a clear picture of how s-Server computes the results of a query.
s-Server consist of two layers:
The specification of a data-flow is called a physical query plan. The actual data flow is called a stream graph, which is a chain of stream nodes, each of which is a small software machine that applies some relational operation to its input data stream(s) to produce is output data stream(s). Stream graphs are also referred to as stream operators. Nexus nodes are defined by a CREATE STREAM statement. They are freestanding nodes to which other stream graphs can connect and disconnect, in order to receive or deliver data.
In the most common case, a node has 1 input and 1 output, and so the graph is a linear chain of operations. But data flows can merge and split, and the graph of a new query can connect to the graphs of existing queries as they run.
Telemetry data is available through system functions that you can query using SQL. Functions are available as both tables and streams. Tables provide a single snapshot of the server, while streams return continually-updating information.
Category | Table Function | Stream Function |
---|---|---|
Stream Graph Virtual Table/Stream. Contains a row to describe each stream-graph. Stream-graphs represent one SQL statement running in s-Server, usually either a SELECT, INSERT, or a running PUMP. | getStreamGraphInfo(session integer, recent float) | getStreamGraphInfoForever(session integer, period float) |
Stream Operator (Node) Virtual Table/Stream. Contains a row to describe each stream-node. Stream-nodes represent one step in the running implementation of a SQL statement. | getStreamOperatorInfo(session integer, recent float) | getStreamOperatorInfoForever(session integer, period float) |
Server Info Virtual Table/Stream. Contains a single row, which describes s-Server as a whole. Each column is some attribute (status or statistic) of the overall server. | getServerInfo(recent float) | getServerInfoForever(period float) |
These functions take the following arguments:
Argument | Definition |
---|---|
SESSION | getStreamGraphInfo and getStreamOperatorInfo only. Sessions refer to a single connection to s-Server. You can designate a session ID, or use 0 to mean all sessions. To determine session, you can query sys_boot.mgmt.sessions_view: select session_name, id from sys_boot.mgmt.sessions_view (or use the SQL script at $SQLSTREAM_HOME/support/sql/showSessions.sql). This query will return results along the following lines: Session_Name: LOCALDB.StreamLab_Output_Buses.pipeline_2_step_3-to-pipeline_2_out_sink_1-Pump ID: 7338* You should be able to identify your desired session by session_name. Use the corresponding ID to get information on this session. See the topic s-Server Features for Managing and Monitoring for more details on sys_boot.mgmt.sessions_view. |
RECENT | In seconds, how recent a snapshot is required. 0 will force a new snapshot, 10 means use a extant snapshot no more than 10 secs old. You can use RECENT to get a pair of results that match the same snapshot. For more detail, see How RECENT Works below.Note: It's a good idea to add 0.5 sec to RECENT for the second query. This allows room for the 2nd query to be processed slightly later than the 1st. |
PERIOD | In seconds, how often to fetch and emit new data. |
You query these using a statement along the following lines:
SELECT NAME_IN_QUERY_PLAN, NODE_ID, NET_INPUT_ROWS, NET_INPUT_BYTES, NET_OUTPUT_ROWS, NET_OUTPUT_BYTES FROM TABLE (sys_boot.mgmt.getStreamOperatorInfo(0, 10));
Note: Columns do not require quotation marks.
Note that the above query uses '0' for SESSION. This tells s-Server "give me information on all sessions". This query returns a result along the following lines:
NAME_IN_QUERY_PLAN | NODE_ID | NET_INPUT_ROWS | NET_INPUT_BYTES | NET_OUTPUT_ROWS | NET_OUTPUT_BYTES |
---|---|---|---|---|---|
[LOCALDB, StreamLab_Output_Meetup, dashboard_map_step_2] | 325006 | 36684 | 733688 | 25216 | 504312 |
[LOCALDB, StreamLab_Output_Meetup, dashboard_map_step_2] | 325006 | 40779 | 815584 | 29310 | 586208 |
[LOCALDB, StreamLab_Output_Meetup, us_rsvps] | 325008 | 41870 | 837408 | 113269 | 2265384 |
[LOCALDB, StreamLab_Output_Meetup, dashboard_bar_CHARt_step_2] | 325010 | 42614 | 852288 | 0 | 0 |
[LOCALDB, StreamLab_Output_Meetup, us_rsvps] | 325008 | 46490 | 929792 | 127127 | 2542536 |
[LOCALDB, StreamLab_Output_Meetup, dashboard_bar_CHARt_step_2] | 325010 | 47316 | 946328 | 0 | 0 |
To see all the info about all the statements that belong to session 7338, run the query:
SELECT * FROM TABLE(SYS_BOOT.MGMT.getStreamGraphInfo(7338, 0));
As of version 6.0, you no longer need to use the preface SYS_BOOT.MGMT.
You can restrict your results using a SELECT list or WHERE clauses, and you can compute expressions based on the telemetry data. That is, you can use SQL to limit or filter telemetry data. The management views SESSIONS_VIEW and STATEMENTS_VIEW may both be useful in identifying graphs and nodes related to your streaming application.
All streams created by users have a NAME_IN_QUERY_PLAN that corresponds to the fully qualified sql name of the stream, which has the format [LOCALDB,
Therefore, you could filter for only named streams by running the following query:
SELECT STREAM * FROM STREAM(sys_boot.mgmt.getStreamOperatorInfoForever(0, 10)) WHERE NAME_IN_QUERY_PLAN LIKE '%LOCALDB%';
To filter only streams from a particular schema, you could run a query along the following lines:
SELECT SELECT STREAM * FROM STREAM(sys_boot.mgmt.getStreamOperatorInfoForever(0, 10)) WHERE NAME_IN_QUERY_PLAN LIKE '%LOCALDB, my-schema%';
You can also query individual columns, as in
SELECT STREAM NAME_IN_QUERY_PLAN, NODE_ID, NET_INPUT_ROWS, NET_INPUT_BYTES, NET_OUTPUT_ROWS, NET_OUTPUT_BYTES FROM STREAM(sys_boot.mgmt.getStreamOperatorInfoForever(0, 10));
A full list of columns for each virtual table/stream appears at the end of this page.
As of version 6.0, you no longer need to use the preface SYS_BOOT.MGMT.
You can use a pump to move telemetry data into a stream that you've created. You first need to set up a stream that will accept columns, in order, from the telemetry table/stream that you want to use. The example below uses columns from getStreamOperatorInfoForever().
CREATE OR REPLACE SCHEMA "telemetry-results";
SET schema '"telemetry-results"';
CREATE OR REPLACE STREAM "telemetry-results-stream"
(NODE_ID VARCHAR(32),
LAST_EXEC_RESULT VARCHAR(32),
NET_INPUT_ROWS BIGINT,
INPUT_ROWTIME_CLOCK TIMESTAMP,
NET_INPUT_BYTES BIGINT,
NAME_IN_QUERY_PLAN VARCHAR(32),
NET_OUTPUT_ROWS BIGINT,
OUTPUT_ROWTIME_CLOCK TIMESTAMP,
NET_OUTPUT_BYTES BIGINT
)
;
CREATE OR REPLACE PUMP "telemetry-results"."my-Pump" STOPPED AS
INSERT INTO "telemetry-results"."telemetry-results-stream"
(NODE_ID, LAST_EXEC_RESULT, NET_INPUT_ROWS, INPUT_ROWTIME_CLOCK, NET_INPUT_BYTES, NAME_IN_QUERY_PLAN, NET_OUTPUT_ROWS, OUTPUT_ROWTIME_CLOCK, NET_OUTPUT_BYTES)
SELECT STREAM NODE_ID, LAST_EXEC_RESULT, NET_INPUT_ROWS, INPUT_ROWTIME_CLOCK, NET_INPUT_BYTES, NAME_IN_QUERY_PLAN, NET_OUTPUT_ROWS, OUTPUT_ROWTIME_CLOCK, NET_OUTPUT_BYTES FROM STREAM (sys_boot.mgmt.getStreamOperatorInfoForever(0, 1));
--selects columns from telemetry Stream Operator stream
By default, pumps are created as stopped. That means when you initially create them, they don't move any data.
To start the pump, enter the following line:
ALTER PUMP "telemetry-results"."my-Pump" START;
To query the new stream, enter the following line:
SELECT STREAM * from "telemetry-results"."telemetry-results-stream";
Here the 2 requests session 2, and the 0 require a snapshot no older than 0 secs. That is, it forces a new snapshot.
If you asked instead:
SELECT * FROM TABLE(SYS_BOOT.MGMT.getStreamGraphInfo(2, 5));
the system would use a snapshot it already had if this snapshot was no older than 5 seconds. (If that data is good enough, it saves a little work to use the existing snapshot.)
Or consider:
SELECT * FROM TABLE(SYS_BOOT.MGMT.getStreamGraphInfo(0, 0));
Session = 0 means report on all sessions (since the lowest session ID is 1).
You can use RECENT to match queries:
SELECT * FROM TABLE(SYS_BOOT.MGMT.getServerInfo(0, 0));
SELECT * FROM TABLE(SYS_BOOT.MGMT.getStreamGraphInfo(0, 2));
SELECT * FROM TABLE(SYS_BOOT.MGMT.getStreamOperatorInfo(0, 2));
We want all 3 parts to come from the same snapshot, so that all the data is consistent. Note that the 1st query forces a new snapshot, and the following 2 queries accept stale data (up to 2 seconds state). They will thus re-use the snapshot from the 1st query.
As of version 6.0, you no longer need to use the preface SYS_BOOT.MGMT.
(getStreamGraphInfo/getStreamGraphInfoForever)
This table/stream returns rows for each stream graph. A stream graph generally corresponds to one SQL statement (DML or query). Stream graph data offers you an overview of how a particular node in a stream graph is processing data, including information on how much data it has processed since inception (in bytes and rows), how quickly it is processing data, and how much data it has output since inception.
We recommend looking at StreamGraphInfo first. Once you have identified a stream graph with issues, you can then use StreamOperatorInfo to see how the operation is doing in more detail, likely by filtering on graph_id, which serves as a foreign key between the two tables/streams.
The set of columns fall into four groups: identification, description, status, and statistics. Identification and description are stable identifiers and CHARacteristics for the node, including numerical identifiers for the node and stream_graph, and so on. Status and statistics offer information on what the stream node is doing now, how long it has been doing it, how much data it has processed, how quickly it is doing so, and so on. The former are useful for determining information about a node, and the latter are useful for understanding how the stream node is performing.
Note: The s-Server query makes a unique identifier for each stream node.
Note: All errors that are logged to the trace log are also available in a system management stream called sys_boot.mgmt.all_trace. The latest (up to 1000) error messages are also available in a table called sys_boot.mgmt.saved_trace. You can use these to create visualizations of errors; they can be accessed from SQLline or anywhere that has a JDBC connection to s-Server. See the Global Error Stream for more details.
These fields are helpful in answering questions like "why is my pipeline stuck?" or "why won't the scheduler do what I want it to do next?" Important columns include the following:
Input and output are measured in bytes and bytes/sec. Input and output are also measured in rows and rows/sec when possible. Most stream operators process row by row, and so can count rows. But a few process a whole buffer at a time, and since row-length can vary, we don't pay the cost of converting, and sometimes report null for the stats in terms of rows.
These are the columns returned by the functions getStreamGraphInfo and getStreamGraphInfoForever:
Column Name | Type | Definition |
---|---|---|
measured_at | TIMESTAMP NOT NULL | Time of query snapshot. |
graph_id | INT NOT NULL | Numerical identifier for stream graph. Numbered from 1 as graphs are added to scheduler. The same column appears in the Stream Node Virtual Table, serving as a foreign key. You can use graph_id in conjunction with the Stream Node table to learn more about individual nodes in a graph. |
statement_id | INT | 0 if not from a SQL statement. These are listed in the monitoring view called STATEMENTS_VIEW. See s-Server Features for Managing and Monitoring for more details. |
session_id | INT | Numerical identifier for the session, which is equates to JDBC Connection to s-Server. These are listed in the monitoring view called SESSIONS_VIEW. See s-Server Features for Managing and Monitoring for more details. |
source_sql | VARCHAR(2048) | SQL for statement. |
sched_state | CHAR(2) | State of the stream graph in the scheduler. See sched_state above for values. |
close_mode | CHAR(4) | When the stream graph was closed, and how it was closed. |
is_global_nexus | BOOLEAN | Indicates a named stream. These are defined by a CREATE STREAM statement. They are freestanding nodes to which other stream graphs can connect and disconnect, in order to receive or deliver data. |
is_auto_close | BOOLEAN | Whether or not the stream will close automatically at state E |
num_nodes | INT NOT NULL | Number of nodes in the stream graph. These may be shared with other stream graphs, as in the case of named streams. |
num_live_nodes | INT NOT NULL | Number of open nodes not at state E. |
num_data_buffers | INT NOT NULL | Number of data buffers between nodes.Note: Two adjacent nodes share a data buffer; the upstream node writes rows to the buffer, and the downstream node reads rows from the buffer. |
total_execution_time | DOUBLE | Time spent executing this stream graph in seconds. |
total_opening_time | DOUBLE | Time spent setting up this stream graph in seconds.Note: When a user sends a SQL query to the server, the query engine sends a query plan to the data engine layer. The data engine then "sets up" the query, by performing a number of steps, including allocating and initializing software objects that do the data operations, allocating data buffers to stand between the operators, adding to internal control & bookkeeping data structures, and connecting the graph to one or more global nexuses. |
total_closing_time | DOUBLE | Time spent tearing down this graph in seconds.Note: When the user ends the query, the data engine stops executing the stream graph and does the set-up in reverse. |
net_input_bytes | BIGINT | Overall input read by the stream graph since it started, in bytes. |
net_input_rows | BIGINT | Overall input read by the stream graph since it started, in rows. |
net_input_rate | DOUBLE | Overall input rate, averaged since the stream graph started, as bytes/sec. |
net_input_row_rate | DOUBLE | Overall input rate, averaged since the stream graph started, as rows/sec. |
net_output_bytes | BIGINT | Overall output written by the stream graph since it started, in bytes. |
net_output_rows | BIGINT | Overall output written by the stream graph since it started, in rows |
net_output_rate | DOUBLE | Overall output rate, averaged since the stream graph started, as bytes/sec. |
net_output_row_rate | DOUBLE | Overall output rate, averaged since the stream graph started, as rows/sec. |
net_memory_bytes | BIGINT | Bytes of working memory now for whole stream graph. |
max_memory_bytes | BIGINT | Maximum bytes of working memory since stream graph opened. |
when_opened | timestamp | System time when stream graph was opened. |
when_started | timestamp | System time of first execution of stream graph. |
when_finished | timestamp | System time of latest execution of stream graph. |
when_closed | timestamp | System time when stream graph was closed. |
(getStreamOperatorInfo/getStreamOperatorInfoForever)
This table/stream offers you an overview of how a node (also known as a stream operator) is processing data.
Columns for these rows fall into four groups: identification, description, status, and statistics. Identification and description are stable identifiers and CHARacteristics for the stream, including numerical identifiers for the stream graph, its session number, and so on. Status and statistics offer information on what the stream graph is doing now, how long it has been doing it, how much data it has processed, how quickly it is doing so, and so on. The former are useful for determining information about a stream graph, and the latter are useful for understanding how the stream graph is performing.
Identification columns consist of graph_id and statement_id. These are both unique numerical identifiers.
Description columns include session_id, source_sql, is_global_nexus, is_auto_close, num_nodes, and num_data_buffers. These tell you what the stream looks like before any processing occurs.
Status columns include sched_state, num_live_nodes, and num_data_buffers. These columns tell you how the scheduler is handling the stream, as well as how many active nodes the stream is engaging.
Statistics columns include net_execution_time, net_schedule_time, net_input_bytes, net_input_rows, net_input_rate, net_input_row_rate, net_output_bytes, net_output_rows, net_output_rate, and net_output_row_rate.
These are the columns returned by getStreamOperatorInfo / getStreamOperatorInfoForever:
Column name | Type | Definition |
---|---|---|
measured_at | TIMESTAMP NOT NULL | Time of query snapshot. |
node_id | VARCHAR(8) NOT NULL | Node identifier, such as 4.6 for the 6th node in the 4th stream graph |
graph_id | INT NOT NULL | Graph identifier. This column also appears in the Stream Graphs table/stream. |
source_sql | VARCHAR(2048) | SQL for stream graph. |
query_plan | VARCHAR | The query plan executed by this node. See EXPLAIN PLAN examples. |
name_in_query_plan | VARCHAR | A generated name that matches the logical and physical query plans as reported in the s-Server trace log. All streams created by users have a NAME_IN_QUERY_PLAN that corresponds to the fully qualified name of the stream, which has the format [LOCALDB, |
num_inputs | INT NOT NULL | Count of input neighbors. |
input_nodes | VARCHAR(64) NOT NULL | node_id numerical identifier of input neighbors. |
num_outputs | INT NOT NULL | Count of output neighbors. |
output_nodes | VARCHAR(64) NOT NULL | node_id numerical identifier of output neighbors. |
sched_state | CHAR(2) | State of the stream graph in the scheduler. See sched_state above for values. |
last_exec_result | CHAR(3) | Latest code returned by ExecStream::execute() -- e.g. UND (underflow), OVF (overflow). See last_exec_result above. |
num_busy_neighbors | INT | When positive, indicates that this node is blocked because some neighboring nodes are busy. |
input_rowtime_clock | timestamp | ROWTIME of the latest input row read by this node. See input_rowtime_clock above. |
output_rowtime_clock | timestamp | ROWTIME of the latest output row written by this node. See input_rowtime_clock above. |
execution_count | BIGINT | Number of times this node has been executed by the scheduler. |
started_at | timestamp | System time of first execution. |
latest_at | timestamp | System time of latest execution. |
net_execution_time | DOUBLE | Total time node has executed in seconds. |
net_schedule_time | DOUBLE | Total overhead scheduling for operator in seconds. |
net_input_bytes | BIGINT | Overall input read by the stream node since it started, in bytes. |
net_input_rows | BIGINT | Overall input read by the stream node since it started, in rows. |
net_input_rate | DOUBLE | Overall input rate, averaged since the stream node started, as bytes/sec. |
net_input_row_rate | DOUBLE | Overall input rate, averaged since the stream node started, as rows/sec. |
net_output_bytes | BIGINT | Overall output written by the stream node since it started, in bytes. |
net_output_rows | BIGINT | Overall output written by the stream node since it started, in rows |
net_output_rate | DOUBLE | Overall output rate, averaged since the stream node started, as bytes/sec. |
net_output_row_rate | DOUBLE | Overall output rate, averaged since the stream node started, as rows/sec. |
net_memory_bytes | BIGINT | Bytes of working memory now for this node. |
max_memory_bytes | BIGINT | Maximum bytes of working memory since node opened. |
(getServerInfo/getServerInfoForever)
The Server Info Virtual Table provides a summary of the entire server. This includes information on the number of currently active sessions, the number of currently active SQL statements, the number of currently executing threads and the number of stream graphs currently open. Because it summarizes the entire server, this table (getServerInfo) has only one row; the stream (getServerInfoForever) returns one row per snapshot.
These are the columns returned by getServerInfo / getServerInfoForever:
Column | Type | Meaning |
---|---|---|
measured_at | TIMESTAMP NOT NULL | Time of snapshot. |
is_running | BOOLEAN NOT NULL | Whether or not server is running. |
is_licensed | BOOLEAN NOT NULL | Whether or not server has some kind of license. |
license_kind | VARCHAR(32) | Type of license. |
license_version | VARCHAR(32) | Version number for license. |
is_throttled | BOOLEAN NOT NULL | Whether or not the server is currently throttled. |
num_sessions | INT | Number of currently active sessions. |
num_statements | INT | Number of currently active SQL statements. |
started_at | timestamp | Time server started. |
throttled_at | timestamp | Time throttling started. If server is not throttled, value is null. |
throttle_level | DOUBLE | Throughput limit enforced by throttling. If server is not throttled, value is null. |
num_exec_threads | INT | Number of currently executing threads. |
num_stream_graphs_open | INT | Number of stream graphs currently open. |
num_stream_graphs_closed | INT | Number of stream graphs currently closed. |
num_stream_operators | INT | Number of stream operated. |
num_stream_graphs_open_ever | INT | Number of stream graphs opened since server started. |
num_stream_graphs_closed_ever | INT | Number of stream graphs closed since server started. |
net_memory_bytes | BIGINT | Total bytes of working memory. |
max_memory_bytes | BIGINT | Max bytes of working memory since server started. |
usage_at | timestamp | Timestamp of last usage snapshot. |
usage_since | timestamp | Usage data is from usage_since to usage_at. |
usage_reported_at | timestamp | Timestamp of last successful upload of usage data. |
net_input_bytes_today | BIGINT | Overall input read by the server since midnight, in bytes. |
net_input_bytes | BIGINT | Overall input read by the server since it started, in bytes. |
net_output_bytes | BIGINT | Overall output written by the server since it started, in bytes. |
The example queries below mostly show examples of single queries to the finite table functions. If you want to monitor over time, just switch to using the equivalent streaming functions.
Telemetry tracks two kinds of memory:
Some stream operators/nodes require a significant amount of auxiliary memory. For example, windowed aggregation requires remembering data from past rows, and this is kept in a window structure that can be large. The amount of auxiliary memory can vary over time. The Stream Node Table/Stream (getStreamOperatorInfo) keeps track of the current memory used by a stream operator/node in the net_memory_bytes column. It also tracks the maximum value since the operator started in the max_memory_bytes column. Each node, represented by row in getStreamOperatorInfo has its own pair of values. You can query these columns with a statement along the following lines:
!outputformat vertical
select graph_id,net_memory_bytes,net_input_bytes,net_input_rows,
from table(stream(getStreamOperatorInfoForever(0, 2))
where sched_state <> 'Z'
order by net_memory_bytes desc
fetch first 10 rows only;
The other kind of memory that can be significant is that used by the data buffers that s-Server uses to pass rows from node to node. You can track this memory using the Stream Graph Table/Stream. In the Stream Graph Table, keeps track of the total of all the buffers in the graph plus all the auxiliary memory required by all the nodes in the graph in the net_memory_bytes column. It also tracks the maximum over the lifetime of the stream graph in the max_memory_bytes column. You can query these columns with a statement along the following lines:
!outputformat vertical
select graph_id,net_memory_bytes,net_input_bytes,net_input_rows,source_sql
from table(getStreamGraphInfo(0,2))
where sched_state <> 'Z'
order by net_memory_bytes desc
fetch first 10 rows only;
Rather than tracking net_memory_bytes - the current memory use - you may want to track max_memory_bytes - the peak amount of memory ever used by this operator or graph.
A memory leak would show as the net_memory_bytes growing over time for one or more graphs. You cannot see it in a single poll of the telemetry data. Instead, you can either poll multiple times using the query above, or you can use the streaming api with a slow polling frequency, and build a pipeline which detects rising memory allocation over time.
CREATE OR REPLACE SCHEMA "telemetry";
SET SCHEMA '"telemetry"';
CREATE OR REPLACE VIEW "graph_5mins" AS
SELECT STREAM * FROM STREAM(getStreamGraphInfoForever(0,300))
WHERE sched_state <> 'Z';
-- pick up memory from 5 and 60 minutes ago
-- If you have hourly aggregations you may want to check every 1 and 3 or 4 hours
CREATE OR REPLACE VIEW "graph_memory_history" AS
SELECT STREAM graph_id, source_sql, net_memory_bytes
, LAG(net_memory_bytes, 1, 0) OVER "60min" as memory_5mins_ago
, LAG(net_memory_bytes, 11, 0) OVER "60min" as memory_60mins_ago
FROM "graph_5mins"
WINDOW "60min" AS (PARTITION BY graph_id RANGE INTERVAL '1' HOUR PRECEDING);
CREATE OR REPLACE VIEW "graph_memory_delta" AS
SELECT STREAM *
, net_memory_bytes - memory_5mins_ago AS memory_growth_5mins
, net_memory_bytes - memory_60mins_ago AS memory_growth_60mins
FROM "graph_memory_history";
-- now focus on the graphs which have rising memory
-- we could insist on short term rising as well
-- we could add more periods
CREATE OR REPLACE VIEW "graph_memory_rising" AS
SELECT STREAM *
FROM "graph_memory_delta"
WHERE memory_growth_60mins > 0;
Now monitor the final view; of course ignoring the first cycle of data:
select stream * from "telemetry"."graph_memory_rising";
We can track execution time (which approximates to the CPU used):
You can get information on bytes or rows read by a stream graph or operator by calling getStreamGraphInfo() or getStreamOperatorInfo() for a session in your application.
To identify your application's session, we recommend looking at sys_boot.mgmt.sessions_view. You may also be able to identify your application's stream graph by looking at the column NAME_IN_QUERY_PLAN. For named streams, these correspond to the fully qualified sql name of the stream, which has the format [LOCALDB,
To get information on bytes or rows read into a stream graph or operator, query the following columns:
Column | Type | Description |
---|---|---|
net_input_bytes | BIGINT | Overall input read by the stream graph since it started, in bytes. |
net_input_rows | BIGINT | Overall input read by the stream graph since it started, in rows. |
net_input_rate | DOUBLE | Overall input rate, averaged since the stream graph started, as bytes/sec. |
net_input_row_rate | DOUBLE | Overall input rate, averaged since the stream graph started, as rows/sec. |
net_output_bytes | BIGINT | Overall output written by the stream graph since it started, in bytes. |
net_output_rows | BIGINT | Overall output written by the stream graph since it started, in rows |
net_output_rate | DOUBLE | Overall output rate, averaged since the stream graph started, as bytes/sec. |
net_output_row_rate | DOUBLE | Overall output rate, averaged since the stream graph started, as rows/sec. |
Overall input means the total input that enters the stream graph or stream operatorfrom the outside.
At server level, s-Server does not record rows or row rates; instead it records
Column | Type | Description |
---|---|---|
net_input_bytes | BIGINT | Overall input read by the server since it started, in bytes. |
net_output_bytes | BIGINT | Overall output written by the server since it started, in bytes. |
net_input_bytes_today | BIGINT | Overall input read by the server since midnight, in bytes. |
Source foreign streams are managed by an XO of type AnonymousJavaUdxRel. This XO handles any Java UDX; we can recognize the input foreign streams nodes because they have no input nodes.
The telemetry stats show zero data in (because there are no input nodes); whatever data is read by the plugin is recorded as net_output_rows, net_output_bytes etc as the data is delivered downstream.
Example: Using the StreamLab Gallery Apps - Sydney Buses and CDN, we see:
!outputformat vertical
select xo.node_id,xo.name_in_query_plan,xo.query_plan,xo.net_output_rows, xo.net_output_row_rate
, g.source_sql
from table(getStreamOperatorInfo(0,2)) xo
join table(getStreamGraphInfo(0,2)) g using (graph_id)
where xo.name_in_query_plan like 'AnonymousJavaUdxRel%'
and xo.num_inputs = 0
order by xo.net_output_rows desc
fetch first 5 rows only;
NODE_ID 83.0
NAME_IN_QUERY_PLAN AnonymousJavaUdxRel.#44650:21684
QUERY_PLAN AnonymousJavaUdxRel(param0=['KAFKA10-source;137;#
NET_OUTPUT_ROWS 8700
NET_OUTPUT_ROW_RATE 765.7860186133535
SOURCE_SQL INSERT INTO "LOCALDB"."StreamLab_Output_cdn"."source_1_ns"
(SELECT STREAM "source_1_fs"."ReportTime", "source_1_fs"."ip", "source_1_fs"."clientOS", "source_1_fs"."clientVersion", "source_1_fs"."sessionId", "source_1_fs"."contentId", "source_1_fs"."bufferingDelay", "source_1_fs"."bitRate", "source_1_fs"."cacheMisses"
FROM "LOCALDB"."StreamLab_Output_cdn"."source_1_fs" AS "source_1_fs")
NODE_ID 39.1
NAME_IN_QUERY_PLAN AnonymousJavaUdxRel.#25825:11569
QUERY_PLAN AnonymousJavaUdxRel(param0=['org.postgresql.Driver'], param1=['jdbc:postgresql://localhost/demo'], param2=['demo'], param3=['demodemo'], param4=['SELECT *
NET_OUTPUT_ROWS 2422
NET_OUTPUT_ROW_RATE 19708.201443532176
SOURCE_SQL INSERT INTO "LOCALDB"."StreamLab_Output_buses"."sink_1"
(SELECT STREAM "input"."pixel", "input"."avg", "input"."avgB1_0u", "input"."avgB1_0l", "input"."avgB2_0u", "input"."avgB2_0l", "input"."speed_cat", "input"."id", "input"."reported_at", "input"."speed", "input"."driver_no", "input"."prescribed", "input"."gps_bearing", "input"."gps_lon", "input"."gps_lat", "input"."highway", "input"."name"
FROM "LOCALDB"."StreamLab_Output_buses"."pipeline_1_step_9" AS "input")
NODE_ID 41.1
NAME_IN_QUERY_PLAN AnonymousJavaUdxRel.#27194:12159
QUERY_PLAN AnonymousJavaUdxRel(param0=['FILE-source;46;#
NET_OUTPUT_ROWS 2340
NET_OUTPUT_ROW_RATE 10.289060740533513
SOURCE_SQL INSERT INTO "LOCALDB"."StreamLab_Output_buses"."source_1_ns"
(SELECT STREAM "EXPR$0"."id", "EXPR$0"."reported_at", "EXPR$0"."speed", "EXPR$0"."driver_no", "EXPR$0"."prescribed", "EXPR$0"."gps", "EXPR$0"."highway"
FROM STREAM(SPECIFIC "LOCALDB"."StreamLab_Output_buses"."source_1_throttlefunc"(CURSOR ((SELECT STREAM "source_1_fs"."id", "source_1_fs"."reported_at", "source_1_fs"."speed", "source_1_fs"."driver_no", "source_1_fs"."prescribed", "source_1_fs"."gps", "source_1_fs"."highway"
FROM "LOCALDB"."StreamLab_Output_buses"."source_1_fs" AS "source_1_fs")), 100)) AS "EXPR$0")
3 rows selected (0.336 seconds)
Sink foreign streams are managed by an XO of type FarragoJavaUdxRel. We can recognize the output foreign streams nodes because they have no input nodes.
!outputformat vertical
select xo.node_id,xo.name_in_query_plan,xo.query_plan,xo.net_input_rows, xo.net_input_row_rate
, g.source_sql
from table(getStreamOperatorInfo(0,2)) xo
join table(getStreamGraphInfo(0,2)) g using (graph_id)
where xo.name_in_query_plan like 'FarragoJavaUdxRel%'
and xo.num_outputs = 0
order by xo.net_input_rows desc
fetch first 5 rows only;
NODE_ID 35.1
NAME_IN_QUERY_PLAN FarragoJavaUdxRel.#25324:11208
QUERY_PLAN NetworkRel
NET_INPUT_ROWS 0
NET_INPUT_ROW_RATE 0.0
SOURCE_SQL MERGE INTO "LOCALDB"."StreamLab_Output_buses"."external_table_out_1" "sink"
USING (SELECT STREAM ROWTIME AS "ROWTIME", CAST("pipeline_2_step_2"."id" AS VARCHAR(128)) AS "id", CAST("pipeline_2_step_2"."driver_no" AS INTEGER) AS "driver_no", CAST("pipeline_2_step_2"."speed" AS DOUBLE) AS "speed"
FROM "LOCALDB"."StreamLab_Output_buses"."pipeline_2_step_2" AS "pipeline_2_step_2") AS "input"
ON "sink"."position_time" = "input"."ROWTIME" AND "sink"."bus" = "input"."id"
WHEN MATCHED THEN UPDATE SET "driver_no" = "input"."driver_no"
, "speed" = "input"."speed"
WHEN NOT MATCHED THEN INSERT ("position_time", "bus", "driver_no", "speed") VALUES ("input"."ROWTIME", "input"."id", "input"."driver_no", "input"."speed")
NODE_ID 274.1
NAME_IN_QUERY_PLAN FarragoJavaUdxRel.#99060:53343
QUERY_PLAN NetworkRel
NET_INPUT_ROWS 0
NET_INPUT_ROW_RATE 0.0
SOURCE_SQL INSERT INTO "LOCALDB"."StreamLab_Output_IoT_Weather"."sink_2_fs"
(SELECT STREAM "sink_2"."Color", "sink_2"."Wind_Size", "sink_2"."Feels_Like", "sink_2"."Wind_Chill", "sink_2"."Dew_Point", "sink_2"."Barometric_Pressure", "sink_2"."Rainrate", "sink_2"."Rainfall", "sink_2"."Humidity", "sink_2"."Wind_Direction", "sink_2"."Wind_Speed_KPH", "sink_2"."Wind_Speed_MPH", "sink_2"."Temperature_F", "sink_2"."Temperature_C", "sink_2"."readings_uuid", "sink_2"."device_key", "sink_2"."model_code", "sink_2"."latitude", "sink_2"."longitude", "sink_2"."recorded_at"
FROM "LOCALDB"."StreamLab_Output_IoT_Weather"."sink_2" AS "sink_2")
2 rows selected (0.359 seconds)
User streams are identified by having a name_in_query_plan like '[LOCALDB,
select name_in_query_plan,net_input_rows,net_input_row_rate, net_output_rows, net_output_row_rate
from table(getStreamOperatorInfo(0,2))
where name_in_query_plan like '[LOCALDB%'
order by net_input_rows desc
fetch first 10 rows only
The telemetry functions tell you the throughput of each stream graph, and of each stream operator within each graph.
If you want to know how each of the SQL views in a pipeline contribute to the overall pipeline's performance, you can use the SYS_BOOT.MGT.DATA_RATE_RPS stream function to read rows and rowtime bounds from each view in a pipeline, one by one in turn.
For more information see Looking for Performance Bottlenecks using DATA_RATE_RPS in the General Management Functions section of the SQL Reference Guide.
To determine total throughput, you can use getServerInfo()..
!outputformat vertical
select *
from table(getServerInfo(2))
;
MEASURED_AT 2021-05-07 11:30:16.691
IS_RUNNING true
IS_LICENSED false
LICENSE_KIND
LICENSE_VERSION
IS_THROTTLED false
NUM_SESSIONS 12
NUM_STATEMENTS 14
STARTED_AT 2021-05-07 10:13:38.919
THROTTLED_AT
THROTTLE_LEVEL 0.0
NUM_EXEC_THREADS 2
NUM_STREAM_GRAPHS_OPEN 24
NUM_STREAM_GRAPHS_CLOSED 50
NUM_STREAM_OPERATORS 84
NUM_STREAM_GRAPHS_OPEN_EVER 389
NUM_STREAM_GRAPHS_CLOSED_EVER 365
NET_MEMORY_BYTES 72332088
MAX_MEMORY_BYTES 373293056
USAGE_AT 2021-05-07 10:13:38.917
USAGE_SINCE
USAGE_REPORTED_AT
NET_INPUT_BYTES 0
NET_OUTPUT_BYTES 0
NET_INPUT_BYTES_TODAY 0
1 row selected (0.278 seconds)
To identify a blockage, we are looking for an operator in OVF (overflow) state immediately followed by an operatpr in UND (underflow).
SELECT ovf.node_id as blocked_node_id, ovf.sched_state as blocked_state
, und.node_id as blocking_node_id, und.sched_state as blocking_state
, gund.source_sql blocking_sql
FROM (
SELECT * from TABLE(getStreamOperatorInfo(0,1))
WHERE last_exec_result in ('UND', 'YLD', 'RUN')
) und
JOIN (
SELECT * FROM TABLE(getStreamOperatorInfo(0,1))
WHERE last_exec_result in ('OVF', 'OVR')
) ovf
ON ','||ovf.output_nodes||',' LIKE '%,'||cast(und.node_id AS VARCHAR(5))||',%'
JOIN (
SELECT * FROM TABLE(getStreamGraphInfo(0,1))
) gund
ON gund.graph_id = und.graph_id
;
You can use a WebAgent telemetry API to present the telemetry information as a directed graph. This allows developers to view the execution plan for their applications and to drill down to see what stream operators are running, which are processing the most data or using the most resources and in particular to identify blockages. By default the API shows the graph as an SVG image – it can also be presented in other formats (including png, jpg or pdf). The telemetry data may also be provided in json format so that a set of historical snapshots can be collected and analysed off-line.
The developer can see all graphs and operators, and may also drill down on specific sessions (either listing one or more SESSION.IDs or using a pattern match on SESSION.NAME). Each running pump corresponds to an open session whose name is based on the pump name.