Basic Commands

Basic commands let you change names and types for columns, split, extract, merge, and drop columns, and add a running count column.

Basic Commands cover the following topics:

Add Running Count

The Add Running Count command simply adds a column that updates every time a new row comes in.

Cast

CAST lets you convert one data type into another. To do so, enter the name of (or select) one or more columns. Then select a new data type for the column(s) and add size or precision/scale specifications.

Precision specifies the maximum number of decimal digits that can be stored in the column, both to the right and to the left of the decimal point. You can specify precisions ranging from 1 digit to 18 digits or use the default precision of 18 digits. Scale specifies the maximum number of digits that can be stored to the right of the decimal point. Scale must be less than or equal to the precision. You can specify a scale ranging from 0 digits to 18 digits, or use the default scale of 0 digits.

Cast List

You can also use CAST LIST to convert multiple columns data types at once. To do so, enter the name of columns separated by commas and in the same order, give the data types for these columns.

Example: Add columns under column list and corresponding data types in type list

The top suggestion lets you click the + to add this wrangler function into your pipeline:

The SQL for Bulk Cast Operation looks like this in the sql script:

Valid conversions

Using CAST or CAST LIST with source operands of the types listed in the first column below can create cast target types as listed in the second column, without restriction. Other target types are not supported.

Source Operand Types Target Operand Types
Any numeric type VARCHAR, CHAR, or any numeric type (See Note A.)
DATE DATE, VARCHAR, CHAR, TIMESTAMP
TIMESTAMP TIME, VARCHAR, CHAR, TIMESTAMP, DATE
BOOLEAN VARCHAR, CHAR, BOOLEAN

Drop

The DROP command deletes a column. To use DROP, enter or select one or more columns.

Duplicate

You can duplicate a column and its contents using the Duplicate command. For example, the Buses demonstration app uses the Duplicate command to make a copy of a column with buses' speeds, toward creating a column that categorizes columns by fast, medium, and slow.

sl_duplicate

Using the Duplicate command

  1. Select one or more columns in the Output view and click the List of Columns button at the bottom of the Column List field. You can also enter the column names in the Column List field.
  2. Enter a suffix for the new column(s), such as “_cat”. A Suggestion appears that reads something along the lines of “Duplicate the column speed, adding the suffix _cat to the new column name”.
  3. Click the + icon to add the command to the Guide script.
  4. The duplicated column(s) appear(s) in the Output view.

Resulting SQL

You need not view or understand the SQL generated by a command in order to use them. However, some users may find it useful or interesting to view the SQL generated.

Like other commands, the Duplicate command generates a new view on the pipeline guide’s stream:

----------

-- pipeline_1 1: Duplicate the columns speed, adding the suffix _cat to the new column names

----------

--  Duplicate Columns Operation

CREATE OR REPLACE VIEW "StreamLab_Output_Buses"."pipeline_1_step_1" AS
SELECT STREAM "speed" AS "speed_cat", *
FROM "StreamLab_Output_Buses"."buses" AS "input";

Extract

The Extract command lets you pull content from a column, such as the first four characters, to create a new column. As new data streams in, this new column will be updated. (Data from the first column is not deleted.) To do so:

  1. Enter a name for the new column.
  2. Select a column from the drop-down menu.
  3. Either
    • Select the part of the cell’s string that you’d like to extract.
    • Enter a starting point in the Start field and a string length in the Length field.
  4. From the drop-down menu, select an option for handling strings that are too small. You can choose to truncate, use an empty string, or use NULL.
  5. Click the + icon to add the command to the Guide script.
  6. The results of the script appear in the Output View window.

Merge Columns

You can also merge two columns together. When you do so, you can connect them with a character. For example, you might merge columns with city and state into a city state pair.

Rename

The Basic: Rename command lets you rename a column. To do so, select a column from the Column popup menu and enter the column’s new name in the To field:

Once you do so, a suggestion appears in the Suggestion list. Click the + button to add the suggestion to the Script list.

The renamed columns appears in the Output view:

Rename List

The Basic: Rename List command lets you rename a list of columns. To do so, enter the columns' old names in the Column List field and enter new names in the To List field:

Once you do so, a suggestion appears in the Suggestion list. Click the + button to add the suggestion to the Script list. The renamed columns appear in the Output view:

Replace Nulls

At times, you may want to replace null values with either a zero or blank, so that analytics work properly. You can do so with the Replace Nulls command. To replace nulls, select the column for which you want to replace nulls and enter the new character for the null.

Replacing Nulls with Empty

To enter replace nulls with an empty value, select the replacement field and hit the space bar once. Once you do so, a suggestion appears reading “Replace NULLS in [column] with Empty”

Split

The Split command lets you divide any column up at designated points. Usually, you will split columns on characters such as commas, pipes, spaces, equals signs, or other delimiters. Split columns take the same column type as those they split from. For example, if you split a column with type VARCHAR, all split columns will be of type VARCHAR. Split columns are named after the original column, with _1, _2, and so on appended. After you split columns, you can use the Rename Column command to change these names.

To use the Split command:

  1. Select a column from the Column dropdown menu. The only columns available will be those of type VARCHAR.

  2. Enter the number of new columns to be generated.

  3. Select a character (delimiter) from the Split On dropdown menu. Here, you can either:

    • Select from one of the pre-specified delimiters, such as comma, tab, space, pipe, period, or colon.
    • Select Auto and select a pre-specified automatic splitter, such as Column-Separated Values or Tab-Separated Values.
    • Select Custom and enter a delimiter of your choosing.
  4. A suggestion appears that reads something like “Split column X using…”. Click the + icon to add the command to the Guide script.

  5. The results of the script appear in the Output View window.

Throttle

Sometimes, you may want to slow a data feed for testing purposes. In most cases, you can simply throttle your source directly. The default throttled rate is one row per second, but you can adjust this default rate in project settings.

However in rare cases you may particularly wish to insert an explicit throttle step into a pipeline to slow it to a specified number of rows per second; for example you might want to put this step in the middle of the pipeline, after filtering out unwanted records.

For this you can use the Throttle command. This command lets you do the following:

  • Include a throttle step in the pipeline guide.
  • Turn that throttle on or off by clicking on the parameter in the script set
  • Specify a rate for the Throttler, either the project standard (as set in project settings) or a custom rate.
  • (If you choose Custom), specify a custom number of rows per second. To do so, select Custom from Rows Per Second and enter a number (whole or decimal) in the Custom RPS field.
  • Note that throttle steps can be (optionally) removed along with source throttling when you Export SQL

Timestamp

You can promote a timestamp to Rowtime. Doing so makes the selected column the new rowtime for the stream.

Note: In some cases, you may need to enter a timestamp format, such as yyy-MM-dd hh:mm:ss to parse a column manually.

Use Null

You can turn empty cells to NULL cells with the Use Null command. To do so, select the column, select the replacement field and hit the space bar once. Once you do so, a suggestion appears reading “Replace Empty with NULL in column [column]”