SQLstream provides a number of built-in functions, and also allows users to extend s-Server’s capabilities by means of user-defined functions (UDFs) and user-defined transformations (UDXs). You define both using CREATE FUNCTION. Functions must be declared within schemas, but can be added to a SQL path.
Values passed to (or returned from) a user-defined function or transformation must be exactly the same data types as the corresponding parameter definitions. In other words, implicit casting is not allowed in passing parameters to (or returning values from) a user-defined function or a user-defined transform.
User-defined functions and transformations may be invoked using either the fully-qualified name <schema>.<function_name> or by the function name alone if it exists in the current SQL path (see SET PATH and CURRENT_PATH).
This topic contains information on the following subtopics:
SQLstream functions can be implemented as pure SQL, or can reference external functions defined in Java, Python or C+ (native) modules. The table below shows which types of function are supported:
|Type of Function||SQL||Java||Python||C++|
|User Defined Function (UDF)||Yes||Yes||No||Yes|
|User Defined Transform (UDX)||No||Yes||No||No|
A user-defined function can implement complex calculations or interact with an external system, taking zero or more scalar parameters and returning a scalar result. UDFs operate like built-in functions such as FLOOR() or LOWER(). For each occurrence of a user-defined function within a SQL statement, that UDF is called once per row with scalar parameters (constants or column values in that row).
Note: any Java class referenced by a UDF must be explicitly defined as part of a package (using a package clause in the source file for the Java class).
The syntax for a SQL user-defined function is:
CREATE FUNCTION <function_name> ( <parameter_list> ) RETURNS <data type> LANGUAGE SQL [ SPECIFIC <specific_function_name> ] [ [NOT] DETERMINISTIC ] CONTAINS SQL | READS SQL DATA | MODIFIES SQL DATA [ RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ] RETURN <SQL-defined function body>
A SQL-defined function body consists only of a single RETURN statement. SQLstream only supports SQL functions that return a single scalar result; there is no support for UDXs written using SQL.
First, we create a new schema to handle our extensions and set the schema and path to point to it:
CREATE SCHEMA geolocate_sql; SET SCHEMA 'geolocate_sql'; SET PATH 'geolocate_sql';
Next, we create a function that allows us to extract the fractional portion of a degree measurement. Here we can see that the body of a SQL UDF is always RETURN <expression>:
CREATE FUNCTION get_fraction( degrees DOUBLE ) RETURNS DOUBLE CONTAINS SQL RETURN degrees - FLOOR(degrees) ;
Next, we use get_fraction to build a UDF that extracts minutes from a degree measurement. This shows how one UDF can call another:
CREATE FUNCTION get_minutes( degrees DOUBLE ) RETURNS INTEGER CONTAINS SQL RETURN CAST( (get_fraction(degrees) * 60.0) as INTEGER ) ;
With both of these functions in hand, we can build yet another UDF to extract seconds:
CREATE FUNCTION get_seconds( degrees DOUBLE ) RETURNS INTEGER CONTAINS SQL RETURN CAST( ((get_fraction(degrees) * 3600.0) - ((get_minutes(degrees) * 60)) ) as INTEGER ) ;
Now that we have our extraction functions in place, let’s create one more function that brings them all together to convert from degrees in decimal format to a fully specified string indicating degrees, minutes, and seconds:
CREATE FUNCTION to_dms( degrees DOUBLE ) RETURNS VARCHAR(20) CONTAINS SQL RETURN ( CAST( CAST(degrees AS INTEGER) AS VARCHAR(20) ) || 'd ' || CAST( get_minutes(degrees) AS VARCHAR(20) ) || 'm ' || CAST( get_seconds(degrees) AS VARCHAR(20) ) || 's' ) ;
s-Server supportd the SQL/OLB standard for User-Defined Functions (UDFs), also known as “external routines”. Farrago supports this standard; please refer to the External Routines section at http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html.
The syntax for Java UDF is:
CREATE FUNCTION <function_name> ( <parameter_list> ) RETURNS <built-in scalar type> LANGUAGE JAVA PARAMETER STYLE [SYSTEM DEFINED] JAVA [ SPECIFIC <specific_function_name> ] [ [NOT] DETERMINISTIC ] NO SQL [ RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ] EXTERNAL NAME ['class <fully qualified static method name>' | '<qualified jar name>:<fully qualified static method name>']
Any parameters to a Java-defined external function must be specified using built-in data types. While the Java function can return a value, changes to any values passed to the function as parameters only exist internal to the Java function. They are not passed back to the calling SQL.
The Java class containing the method must live inside a package.
The following EXTERNAL NAME clauses will work:
EXTERNAL NAME 'class org.apache.commons.lang.StringUtils.reverse' -- as long as the class is in the classpath EXTERNAL NAME '"MyJarName":foo.bar.VarbinaryFunctions.toUTF8String' -- as long as the JAR is in the current schema EXTERNAL NAME '"FOO_SCHEMA"."MyJarName:foo.bar.VarbinaryFunctions.toUTF8String'
but the following declaration will raise an error when you try to invoke the function:
EXTERNAL NAME '"FOO_SCHEMA":VarbinaryFunctions.toUTF8String' -- invalid because there is no package name included
CREATE JAR "getFraction" LIBRARY 'file:/home/aspen/getFraction.jar' OPTIONS(0); -- This code loads the installed Java JAR file CREATE FUNCTION get_fraction(degrees DOUBLE ) RETURNS DOUBLE LANGUAGE JAVA NO SQL EXTERNAL NAME 'getFraction:com.sqlstream.examples.geolocate.Degree.getFraction' ;
Note that the above examples only cover the SQL statement CREATE FUNCTION for these functions. To fully define a Java UDF, you need to create and define a Java class (such as com.sqlstream.examples.geolocate.Degree) using CREATE JAR. See the topic GeoLocation: An Example of User-Defined Functions in the s-Server Integration Guide for full examples covering the entire process.
The syntax for a creating a C UDF is:
CREATE FUNCTION <function_name> ( <parameter_list> ) RETURNS <data type> LANGUAGE C PARAMETER STYLE GENERAL NOSQL EXTERNAL NAME '<sharedObjectLibrary>:<external name of method>';
create or replace function xor(i boolean, j boolean) returns boolean language c parameter style general no sql external name 'plugin/libsampleUdfs.so:xorOp';
The <sharedObjectLibrary> can be relative to the s-Server working directory (usually $SQLSTREAM_HOME) or absolute. The <external name of method> is defined in C++ code. To read more about how to create and reference shared library methods in C++ please see Writing a C++ UDF.
User Defined Transforms (UDXs) are simply User Defined Functions (UDFs) that either return a table / finite relation or they return a stream / streaming relation instead of simply returning a scalar value.
A user-defined transform (UDX) is an advanced type of user-defined function that reads a set or stream of rows and returns a set or a stream of rows. Its input arguments can be scalars or cursors. A cursor is an object representing a query, from which the UDX can read the result of upstream processing. The query can be relational (finite) or streaming (infinite).
Although UDXs are registered in SQLstream using the CREATE FUNCTION SQL statement, SQLstream does not support the body (the procedural logic) of a UDX being specified in SQL. It has to be defined in an external language. SQLstream supports UDXs written in Java, Python or C++; development of custom UDXs requires knowledge of the UDX framework as it applies to each of these languages. So this topic contains information on the following:
A UDX is declared by a sql CREATE FUNCTION statement, much like a user-defined function.
The salient differences are:
A UDX may be implemented by a Java class (which must be located in a java package), by a Python module, or by a C shared library object.
A stream always has a ROWTIME column. Therefore, a cursor based on a streaming query will have a ROWTIME, but a cursor based on a relational query will not. As a result, the SQL that calls the UDX differs:
CURSOR(SELECT STREAM....) has a rowtime CURSOR(SELECT....) has no rowtime.
The declaration of a UDX states its output rowtype, although with the “c.*” construct the output can depend dynamically on the input type. The rowtype of an input cursor is always dynamic.
A UDX is required to set the value of every output column, matching its declared data type. ROWTIME is special:
The syntax for creating a Java UDX is:
CREATE FUNCTION <function_name> ( <parameter_list> ) RETURNS ( <result columns> ) LANGUAGE JAVA PARAMETER STYLE [SYSTEM DEFINED] JAVA [ SPECIFIC <specific_function_name> ] NO SQL EXTERNAL NAME ['class <fully qualified static method name>' | '<qualified jar name>:<fully qualified static method name>']
This example shows the declaration and use of a UDX that has three inputs: one scalar integer value and two query expressions. Also note that in this example the output is fixed at 3 columns.
CREATE OR REPLACE FUNCTION crossRef(bias INTEGER, c1 CURSOR, c2 CURSOR) RETURNS TABLE(leftVal INTEGER, rightVal INTEGER, position DECIMAL(9,2)) LANGUAGE JAVA PARAMETER STYLE SYSTEM DEFINED JAVA NO SQL EXTERNAL NAME 'class com.sqlstream.udxsample.UdxSample.crossRef';
And the UDX is referenced like this:
SELECT STREAM res.position FROM TABLE(crossRef( 33, CURSOR(SELECT STREAM * FROM transmitStream), CURSOR(SELECT STREAM * FROM recieveStream))) AS res;
For more examples and information about how to build the Java methods that implement UDXs please see Writing a Java UDX.