Parsing Key Pair Values

The KeyValue parser reads files with data in the format of Key=Value, where "Key" is the exact name of a column in the foreign stream with columns whose names match the keys in the file. (The parser will ignore any keypairs not defined as columns.) To parse Key Values with the Extensible Common Data Adapter (ECDA), you need to specify a location for the file to be read and indicate KV for parser. Column types also need to match. If, for example, the file a key value CUSTOMER_ID="Alan" and CUSTOMER_ID is defined as INTEGER, parsing will fail and the error will be logged to the tracer. If the column is quoted then the key must match exactly. If the column is unquoted then the key must be in upper case (e.g. CUSTOMER_ID).

Note: You can also input data in larger chunks and parse it later using the Parser UDX ). This UDX calls the parsers listed above in a function. For more information on using functions, see the topic Transforming Data in s-Server.

The s-Server trace log includes information on readers' and parsers' progress. See Periodic Parser Statistics Logging in the Administering Guavus SQLstream guide.

Example of a Workload File

CUSTOMER_ID=66,CUSTOMER_NAME="Jane Liu",CONTACT_NAME="David",ADDRESS="Linda-a-velha",CITY="Lisbon",POSTAL_CODE="2795",COUNTRY="Portugal"
CUSTOMER_ID=72,CUSTOMER_NAME="Ricardo Gomes Pereira",CONTACT_NAME="Ricardo Pereira",ADDRESS="Oeiras",CITY="Lisbon",POSTAL_CODE="2500",COUNTRY="Portugal"
CUSTOMER_ID=99,CUSTOMER_NAME="Juan Villa",CONTACT_NAME="Juan",ADDRESS="Trujillo",CITY="Cáceres",POSTAL_CODE="11125",COUNTRY="Spain"
CUSTOMER_ID=25,CUSTOMER_NAME="John Moore",CONTACT_NAME="John",ADDRESS="158 - 7th Ave.",CITY="Boston",POSTAL_CODE="90110",COUNTRY="USA"
CUSTOMER_ID=51,CUSTOMER_NAME="Rachel Morgan",CONTACT_NAME="Rachel",ADDRESS="352 9th Ave.",CITY="New York",POSTAL_CODE="65123",COUNTRY="USA"

Sample Foreign Stream to Parse Key Value Pairs

You indicate column names when you set up the stream, as in the following example, which creates a stream with the column names "customer_id," "customer_name," "contact_name," "address," "city," "postal_code," and "country." These columns will be assigned data from the first five columns found in the Key Value pairs file.

The following example sets up a server called "KVReaderServer", then creates and sets a schema called "SCHEMA_KEY_VALUE". It then creates a foreign stream to parse columns called ts, accountNumber, sourceIP, destIP, and customerID from a file in /opt/sqlstream/5.1.0.14245/s-Server/ParserFiles/. To parse JSON over other input/output systems, such as Kafka, Kinesis, a network socket, a WebSocket, HTTP or AMQP, you would need to specify options for these formats.

CREATE OR REPLACE SCHEMA "SCHEMA_KEY_VALUE";
SET SCHEMA 'SCHEMA_KEY_VALUE';
CREATE OR REPLACE SERVER "KVReaderServer" TYPE 'FILE'
FOREIGN DATA WRAPPER ECDA;
CREATE OR REPLACE FOREIGN STREAM KeyValueStreamExample
(CUSTOMER_ID INTEGER,
CUSTOMER_NAME VARCHAR(32),
CONTACT_NAME VARCHAR(32),
ADDRESS VARCHAR(32),
CITY VARCHAR(32),
POSTAL_CODE VARCHAR(32),
COUNTRY VARCHAR(32))
--Columns for the new stream
--These map to the key values in the file
SERVER "KVReaderServer"
OPTIONS
(directory '/tmp/',
filename_pattern 'workload_kv',
parser 'KV',
character_encoding 'UTF-8',
STATIC_FILES 'true');

To actually begin reading from the file, use a statement along the lines of

SELECT STREAM * from KeyValueStreamExample as t;

Foreign Stream Options for Parsing Key Values

Option Definition
PARSER This needs to be KV.
QUOTE_CHARACTER Lets you specify a different quote character, such as a single quote ('). Default is double quote (").
KEY_VALUE_SEPARATOR_CHARACTER Lets you specify the character that connects keys and values. Default is equals symbol (=)
SEPARATOR Lets you specify a character that separates key-value pairs. Default is comma (,).
ROW_SEPARATOR Lets you specify a character that splits lines in the key-value source. Default is \n.