Basic commands let you change names and types for columns, split, extract, merge, and drop columns, and add a running count column.
Basic Commands cover the following topics:
The Add Running Count command simply adds a column that updates every time a new row comes in.
CAST lets you convert one data type into another. To do so, enter the name of (or select) one or more columns. Then select a new data type for the column(s) and add size or precision/scale specifications.
Precision specifies the maximum number of decimal digits that can be stored in the column, both to the right and to the left of the decimal point. You can specify precisions ranging from 1 digit to 18 digits or use the default precision of 18 digits. Scale specifies the maximum number of digits that can be stored to the right of the decimal point. Scale must be less than or equal to the precision. You can specify a scale ranging from 0 digits to 18 digits, or use the default scale of 0 digits.
You can also use CAST LIST to convert multiple columns data types at once. To do so, enter the name of columns separated by commas and in the same order, give the data types for these columns.
Example: Add columns under column list and corresponding data types in type list
The top suggestion lets you click the + to add this wrangler function into your pipeline:
The SQL for Bulk Cast Operation looks like this in the sql script:
Using CAST or CAST LIST with source operands of the types listed in the first column below can create cast target types as listed in the second column, without restriction. Other target types are not supported.
Source Operand Types | Target Operand Types |
---|---|
Any numeric type | VARCHAR, CHAR, or any numeric type (See Note A.) |
DATE | DATE, VARCHAR, CHAR, TIMESTAMP |
TIMESTAMP | TIME, VARCHAR, CHAR, TIMESTAMP, DATE |
BOOLEAN | VARCHAR, CHAR, BOOLEAN |
The DROP command deletes a column. To use DROP, enter or select one or more columns.
You can duplicate a column and its contents using the Duplicate command. For example, the Buses demonstration app uses the Duplicate command to make a copy of a column with buses' speeds, toward creating a column that categorizes columns by fast, medium, and slow.
You need not view or understand the SQL generated by a command in order to use them. However, some users may find it useful or interesting to view the SQL generated.
Like other commands, the Duplicate command generates a new view on the pipeline guide’s stream:
----------
-- pipeline_1 1: Duplicate the columns speed, adding the suffix _cat to the new column names
----------
-- Duplicate Columns Operation
CREATE OR REPLACE VIEW "StreamLab_Output_Buses"."pipeline_1_step_1" AS
SELECT STREAM "speed" AS "speed_cat", *
FROM "StreamLab_Output_Buses"."buses" AS "input";
The Extract command lets you pull content from a column, such as the first four characters, to create a new column. As new data streams in, this new column will be updated. (Data from the first column is not deleted.) To do so:
You can also merge two columns together. When you do so, you can connect them with a character. For example, you might merge columns with city and state into a city state pair.
The Basic: Rename command lets you rename a column. To do so, select a column from the Column popup menu and enter the column’s new name in the To field:
Once you do so, a suggestion appears in the Suggestion list. Click the + button to add the suggestion to the Script list.
The renamed columns appears in the Output view:
The Basic: Rename List command lets you rename a list of columns. To do so, enter the columns' old names in the Column List field and enter new names in the To List field:
Once you do so, a suggestion appears in the Suggestion list. Click the + button to add the suggestion to the Script list. The renamed columns appear in the Output view:
At times, you may want to replace null values with either a zero or blank, so that analytics work properly. You can do so with the Replace Nulls command. To replace nulls, select the column for which you want to replace nulls and enter the new character for the null.
To enter replace nulls with an empty value, select the replacement field and hit the space bar once. Once you do so, a suggestion appears reading “Replace NULLS in [column] with Empty”
The Split command lets you divide any column up at designated points. Usually, you will split columns on characters such as commas, pipes, spaces, equals signs, or other delimiters. Split columns take the same column type as those they split from. For example, if you split a column with type VARCHAR, all split columns will be of type VARCHAR. Split columns are named after the original column, with _1, _2, and so on appended. After you split columns, you can use the Rename Column command to change these names.
To use the Split command:
Select a column from the Column dropdown menu. The only columns available will be those of type VARCHAR.
Enter the number of new columns to be generated.
Select a character (delimiter) from the Split On dropdown menu. Here, you can either:
A suggestion appears that reads something like “Split column X using…”. Click the + icon to add the command to the Guide script.
The results of the script appear in the Output View window.
Sometimes, you may want to slow a data feed for testing purposes. In most cases, you can simply throttle your source directly. The default throttled rate is one row per second, but you can adjust this default rate in project settings.
However in rare cases you may particularly wish to insert an explicit throttle step into a pipeline to slow it to a specified number of rows per second; for example you might want to put this step in the middle of the pipeline, after filtering out unwanted records.
For this you can use the Throttle command. This command lets you do the following:
You can promote a timestamp to Rowtime. Doing so makes the selected column the new rowtime for the stream.
Note: In some cases, you may need to enter a timestamp format, such as yyy-MM-dd hh:mm:ss to parse a column manually.
You can turn empty cells to NULL cells with the Use Null command. To do so, select the column, select the replacement field and hit the space bar once. Once you do so, a suggestion appears reading “Replace Empty with NULL in column [column]”