This user-defined transfrom (UDX) applies a RANK() function to logical groups of rows and optionally delivers the group in sorted order Applications of Group_Rank include the following:
Group Rank UDX can do the following actions:
The functional attributes and DDL are described in the topics that follow.
This Group_Rank transform acts as follows:
CREATE FUNCTION group_rank(c cursor, rankByColumnName VARCHAR(128), rankOutColumnName VARCHAR(128), sortOrder VARCHAR(10), outputOrder VARCHAR(10), maxIdle INTEGER, outputMax INTEGER) returns table(c.*, "groupRank" INTEGER) language java parameter style system defined java no sql external name 'class com.sqlstream.plugin.grouprank.GroupRank.group_rank';
The parameters to the UDX are:
|c||CURSOR to streaming result set|
rankByColumnName String naming the column to use for ranking the group | | rankOutColumnName String naming the column to use for returning the rank. This string must match the name of the groupRank column in RETURNS clause of the CREATE FUNCTION statement. | | sortOrder Controls ordering of rows for rank assignment. Valid values are: - ‘asc’ - Ascending based on rank. - ‘desc’ - Descending based on the rank.
outputOrder Controls ordering of output. Valid values are: - ‘asc’ - Ascending based on rank. - ‘desc’ - Descending based on the rank.
maxIdle Time limit in milliseconds for holding a group for ranking. When maxIdle expires the current group is released to the stream. A value of zero indicates no idle time out. | | outputMax Maximum number of rows the UDX will output in a given group. A value of 0 indicates no limit.
```sql CREATE VIEW pageCounts1Min AS SELECT STREAM pageId, COUNT(*) AS hitCount FROM AccessStream AS S GROUP BY FLOOR(S.ROWTIME TO MINUTE), pageId; SELECT STREAM “pageId”, “hitCount”, “groupRank” FROM (TABLE(group_rank(CURSOR(SELECT STREAM “pageId”, “hitCount” FROM pageCounts1Min), ‘hitCount’, ‘groupRank’, ‘desc’, ‘asc’, 10, 5)));
### Sample Input: ```sql pageId hitCount condo 51 auto 25 books 200 CDs 202 DVDs 1000 Games 500
pageId hitCount groupRank DVDs 1000 1 Games 500 2 CDs 202 3 books 200 4 Condo 51 5
Rows are buffered from the input cursor for each group, i.e., rows with the same rowtimes. Ranking of the rows is done either after the arrival of a row with a different rowtime (or when the idle timeout occurs). Rows continue to be read while ranking is performed on the group of rows with the same rowtime.
The outputMax parameter specifies the maximum number of rows to be returned for each group after ranks are assigned. By default, _grouprank supports column pass through, as the example illustrates by using c.* as the standard shortcut directing pass through of all input columns in the order presented. You can, instead, name a subset using the notation “c.columName”, allowing you to reorder the columns. However, using specific column names ties the UDX to a very specific input set while using the c.* notation allows the UDX to handle any input set
The rankOutColumnName parameter specifies the output column used to return ranks. This column name must match the column name specified in the RETURNS clause of the CREATE FUNCTION statement.