Writing a C++ UDA

Introduction

SQLstream supports C++ UDAs (user-defined aggregate functions) from version 7.2.1 onwards. Two categories of C++ UDA are supported:

  • “Flat” UDAs - UDAs whose accumulators are of fixed size and can be mapped directly to byte arrays.
  • “Complex” UDAs - UDAs whose accumulators involve arbitrary non-static data structures.
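
The difference between the two categories can be illustrated without the SDK. The sketch below is standalone C++; FlatAcc, ComplexAcc and roundTrip are hypothetical names that are not part of the SQLstream SDK, and it assumes that "can be mapped directly to byte arrays" corresponds roughly to the accumulator type being trivially copyable.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical accumulators for illustration only; not SDK classes.
struct FlatAcc {
    std::int64_t sum;    // fixed-size state
    std::int64_t count;
};

struct ComplexAcc {
    std::string text;    // owns heap memory, so its real size varies
};

// A flat accumulator can be copied byte for byte; a complex one cannot.
static_assert(std::is_trivially_copyable<FlatAcc>::value,
              "FlatAcc can be mapped to a byte array");
static_assert(!std::is_trivially_copyable<ComplexAcc>::value,
              "ComplexAcc needs real construction and destruction");

// Round-trips a flat accumulator through a raw byte buffer.
inline FlatAcc roundTrip(const FlatAcc &in) {
    unsigned char bytes[sizeof(FlatAcc)];
    std::memcpy(bytes, &in, sizeof(FlatAcc));
    FlatAcc out;
    std::memcpy(&out, bytes, sizeof(FlatAcc));
    return out;
}
```

Copying ComplexAcc the same way would duplicate only the string's internal pointers, not its contents, which is why such accumulators need separate handling.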

Pre-requisites

Ensure that cmake and gcc-8 are installed on your system. For details on installing gcc and cmake, refer to Installing the C++ SDK.

Since version 7.3.2, the C++ SDK is installed as part of a full install of s-Server, at $SQLSTREAM_HOME/examples/c++sdk/.

Building C++ UDA Examples

Perform the following steps to build the C++ UDA examples:

  1. Navigate to the directory $SQLSTREAM_HOME/examples/c++sdk/examples.
  2. Build the shared library by running ./build.sh. This creates the C++ library c++sdk/examples/build/plugin/libsampleUdfs.so.

Using C++ Library

Perform the following steps to use the example C++ library created in the Building C++ UDA Examples section:

  1. Install the shared library by copying the newly created libsampleUdfs.so to the plugin directory under the s-Server directory. The install.sql script must also be available where sqlline runs; this example copies it to a docker container:
    docker cp install.sql <container-name>:/home/sqlstream

  2. Finally, invoke sqlline --run=install.sql (in this example, on the docker container).

Creating a C++ UDA

The following examples show how to create the C++ component of a UDA.

include / using

At the top of each C++ UDF module, include at least these lines:

#include "sqlstream/Udf.h"
#include "fennel/calculator/SqlState.h"

using namespace sqlstream;

Examples

LongAdder - simple SUM-like UDA

This simple UDA implements an add (SUM) aggregate function with no overflow checking.

class LongAdder : public Count0IsNullUda
{
    int64_t acc;
public:
    void initAdd(CalculatorContext ctx, int64_t value) {
        acc = value;
    }
    inline void add(CalculatorContext ctx, int64_t value) {
        acc += value;
    }
};
INSTALL_UDA(longSum, LongAdder, int64_t)

LongAdderChecked - simple SUM-like UDA with overflow checking

This simple UDA implements an add (SUM) aggregate function with overflow checking.

class LongAdderChecked : public Count0IsNullUda
{
    int64_t acc;
public:
    // you can either provide an init method or an initAdd method
    void init() {
        acc = 0;
    }
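    inline void add(CalculatorContext ctx, int64_t value) {
        // __builtin_add_overflow (a GCC/Clang builtin) returns true if the sum overflows
        if (__builtin_add_overflow(acc, value, &acc)) {
            // SQLSTATE 22003: numeric value out of range
            ctx.throwException(fennel::SqlState::instance().code22003());
        }
    }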
};
INSTALL_UDA(longSumChecked, LongAdderChecked, int64_t)
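
The overflow check can be tried outside s-Server. The following is a minimal standalone sketch, not SDK code: checkedAdd is a hypothetical helper that reports overflow through its return value instead of throwing through a CalculatorContext. It relies only on the GCC/Clang builtin __builtin_add_overflow, which returns true when the mathematically correct sum does not fit in the result type.

```cpp
#include <cstdint>

// Hypothetical stand-in for the checked add() above.
// Returns true on success; returns false and leaves acc unchanged on overflow.
inline bool checkedAdd(std::int64_t &acc, std::int64_t value) {
    std::int64_t result;
    if (__builtin_add_overflow(acc, value, &result)) {
        return false;  // the sum does not fit in int64_t
    }
    acc = result;
    return true;
}
```

Unlike the UDA above, this sketch writes to a temporary first, so the accumulator keeps its previous value when overflow is detected. In the UDA this does not matter because the exception aborts the aggregation.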

StringConcatter - Complex LISTAGG-like UDA

A complex UDA implementing an aggregate function with a variable-size accumulator data structure.

class StringConcatter : public ComplexUda {
public:
    std::string acc;

    inline void init() {
        acc.clear();
    }

    inline void add(CalculatorContext ctx, varchar_t value) {
        acc.append(value.data, value.size);
    }

    inline void getResult(CalculatorContext ctx, const ResultRegister<varchar_t> &result) const {
        // Assigning (result = acc;) would also work, but it would
        // allocate extra memory and copy the string.
        // Instead, return a reference to the accumulator's buffer.
        result.reference(acc.data(), acc.size());
    }
};
INSTALL_BASE_UDA(concatterAgg, StringConcatter)
INSTALL_UDA_RESULT_FUNCTION(strlistAgg, concatterAgg, StringConcatter, getResult)
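
The choice between copying the result and returning a reference can be sketched in standalone C++. VarcharView and ConcatSketch below are hypothetical stand-ins, not SDK types; VarcharView only assumes the shape described for varchar_t (a size plus a character pointer).

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the SDK's varchar_t: a length and a pointer.
struct VarcharView {
    std::size_t size;
    const char *data;
};

// Hypothetical accumulator mirroring StringConcatter without the SDK.
class ConcatSketch {
    std::string acc;
public:
    void init() { acc.clear(); }

    void add(VarcharView value) { acc.append(value.data, value.size); }

    // Returns a non-owning view of the accumulator: no allocation, no copy.
    // The view is only valid until the next init() or add() call.
    VarcharView result() const { return VarcharView{acc.size(), acc.data()}; }
};
```

Returning a view avoids one allocation and one copy per result, at the cost that the caller must consume the value before the accumulator changes again.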

Compiling the C++ UDA shared object library

To build the example sampleUdfs.cpp, use the script supplied in the tarball:

./build.sh

This will create build/plugin/libsampleUdfs.so.

Installing a C++ UDA

After building the library plugin, create the corresponding SQL function as follows.

These examples are from the install.sql script in the SDK tarball. They assume that your libsampleUdfs.so object library has been copied (deployed) to the plugin folder under $SQLSTREAM_HOME. You may choose any convenient location, given either relative to s-Server’s working directory (usually $SQLSTREAM_HOME) or as an absolute path. Then invoke sqlline --run=install.sql (or whatever your install script is called), as described in the section Using C++ Library.

CREATE OR REPLACE AGGREGATE FUNCTION "mySum"(I BIGINT)
RETURNS BIGINT
LANGUAGE C 
PARAMETER STYLE GENERAL 
NO SQL 
EXTERNAL NAME 'plugin/libsampleUdfs.so:longSum';

Calling a C++ UDA

Call the function as a sliding WINDOW aggregate:

SELECT STREAM A, B, "mySum"(B) OVER (PARTITION BY A ROWS UNBOUNDED PRECEDING) as "CumulativeTotal" 
FROM "myStream" s
;
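
To make the expected output of the windowed call concrete, the following standalone C++ sketch emulates a per-partition running total over an in-memory batch of rows. It is not SDK code and does not model streaming; cumulativeTotals is a hypothetical helper, and each input pair stands for the (A, B) columns of one row.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Emulates SUM(B) OVER (PARTITION BY A ROWS UNBOUNDED PRECEDING):
// one output value per input row, holding the running total for that row's A.
inline std::vector<std::int64_t> cumulativeTotals(
        const std::vector<std::pair<std::string, std::int64_t>> &rows) {
    std::map<std::string, std::int64_t> running;  // one accumulator per partition
    std::vector<std::int64_t> out;
    out.reserve(rows.size());
    for (const auto &row : rows) {
        running[row.first] += row.second;
        out.push_back(running[row.first]);
    }
    return out;
}
```

For the rows (x, 1), (y, 10), (x, 2), (y, 20), this yields 1, 10, 3, 30: every input row produces an output row, and each partition keeps its own accumulator.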

Or as a tumbling GROUP BY:

SELECT STREAM FLOOR(s.ROWTIME TO HOUR), A, "mySum"(B) AS "HourlyTotal"
FROM "myStream" s 
GROUP BY FLOOR(s.ROWTIME TO HOUR), A
;
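
The tumbling form can be emulated in the same way. In this standalone sketch (not SDK code), Row and hourlyTotals are hypothetical names, timestamps are assumed to be non-negative seconds since the epoch, and the batch computation only shows which totals result, not when a streaming query would emit them.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row layout: a timestamp in epoch seconds plus columns A and B.
struct Row {
    std::int64_t epochSeconds;
    std::string a;
    std::int64_t b;
};

// Emulates GROUP BY FLOOR(ROWTIME TO HOUR), A with a SUM over B:
// one total per (hour start, A) pair.
inline std::map<std::pair<std::int64_t, std::string>, std::int64_t>
hourlyTotals(const std::vector<Row> &rows) {
    std::map<std::pair<std::int64_t, std::string>, std::int64_t> totals;
    for (const Row &r : rows) {
        // Truncate to the start of the hour (assumes epochSeconds >= 0).
        const std::int64_t hourStart = r.epochSeconds - (r.epochSeconds % 3600);
        totals[std::make_pair(hourStart, r.a)] += r.b;
    }
    return totals;
}
```

In contrast to the windowed form, this produces one result per group rather than one per input row.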

Parameter Type Mapping

For parameters, the following mapping will be used:

SQL parameter type            C++ type
TINYINT                       int8_t
SMALLINT                      int16_t
INT                           int32_t
BIGINT, TIMESTAMP, DECIMAL    int64_t
DOUBLE, FLOAT                 double
BOOLEAN                       bool
CHAR                          char_t (contains size_t and char *)
BINARY                        binary_t (contains size_t and uint8_t *)
VARCHAR                       varchar_t (contains size_t and char *)
VARBINARY                     varbinary_t (contains size_t and uint8_t *)