Parsing Key Value Pairs

The Key Value parser reads files with data in the format of Key=Value, where "Key" is the exact name of a column in the foreign stream with columns whose names match the keys in the file. (The parser will ignore any key/value pairs not defined as columns.) To parse Key Value pairs with the Extensible Common Data Adapter (ECDA), you need to specify a location for the file to be read and set the parser option as KV. Column types also need to match. If, for example, the file a key value CUSTOMER_ID="Alan" and CUSTOMER_ID is defined as INTEGER, parsing will fail and the error will be logged to the tracer. If the column is quoted then the key must match exactly. If the column is unquoted then the key must be in upper case (e.g. CUSTOMER_ID).

Note: You can also input data in larger chunks and parse it later using the Parser UDX ). This UDX calls the parsers listed above in a function. For more information on using functions, see the topic Transforming Data in s-Server.

The s-Server trace log includes information on readers' and parsers' progress. See Periodic Parser Statistics Logging in the Administering Guavus SQLstream guide. These errors are also logged in the Global Error Stream.

Example of a Workload File

CUSTOMER_ID=66,CUSTOMER_NAME="Jane Liu",CONTACT_NAME="David",ADDRESS="Linda-a-velha",CITY="Lisbon",POSTAL_CODE="2795",COUNTRY="Portugal"
CUSTOMER_ID=72,CUSTOMER_NAME="Ricardo Gomes Pereira",CONTACT_NAME="Ricardo Pereira",ADDRESS="Oeiras",CITY="Lisbon",POSTAL_CODE="2500",COUNTRY="Portugal"
CUSTOMER_ID=99,CUSTOMER_NAME="Juan Villa",CONTACT_NAME="Juan",ADDRESS="Trujillo",CITY="Cáceres",POSTAL_CODE="11125",COUNTRY="Spain"
CUSTOMER_ID=25,CUSTOMER_NAME="John Moore",CONTACT_NAME="John",ADDRESS="158 - 7th Ave.",CITY="Boston",POSTAL_CODE="90110",COUNTRY="USA"
CUSTOMER_ID=51,CUSTOMER_NAME="Rachel Morgan",CONTACT_NAME="Rachel",ADDRESS="352 9th Ave.",CITY="New York",POSTAL_CODE="65123",COUNTRY="USA"

Sample Foreign Stream to Parse Key Value Pairs

You indicate column names when you set up the stream, as in the following example, which creates a stream with the column names "customer_id", "customer_name", "contact_name", "address", "city", "postal_code", and "country". These columns will be assigned data from the first five columns found in the Key Value pairs file.

The following example creates and sets a schema called "SCHEMA_KEY_VALUE". It then creates a foreign stream to parse columns called customer_id, customer_name and so on from a file in the /tmp directory.

CREATE OR REPLACE SCHEMA SCHEMA_KEY_VALUE;

SET SCHEMA 'SCHEMA_KEY_VALUE';

CREATE OR REPLACE FOREIGN STREAM KeyValueStreamExample
( CUSTOMER_ID INTEGER,
  CUSTOMER_NAME VARCHAR(32),
  CONTACT_NAME VARCHAR(32),
  ADDRESS VARCHAR(32),
  CITY VARCHAR(32),
  POSTAL_CODE VARCHAR(32),
  COUNTRY VARCHAR(32)
)
SERVER "FILE_SERVER"
OPTIONS
( directory '/tmp/',
  filename_pattern 'workload_kv',
  parser 'KV',
  character_encoding 'UTF-8',
  STATIC_FILES 'true'
);

To actually begin reading from the file, use a statement along the lines of

SELECT STREAM * from KeyValueStreamExample as t;

To parse key-value pairs from a different source type, simply replace FILE_SERVER in the example with the appropriate pre-built server, or create your own SERVER for the source.

Foreign Stream Options for Parsing Key Values

Option Definition
PARSER This needs to be KV.
QUOTE_CHARACTER Lets you specify a different quote character, such as a single quote ('). Default is double quote (").
KEY_VALUE_SEPARATOR_CHARACTER Lets you specify the character that connects keys and values. Default is equals symbol (=)
SEPARATOR Lets you specify a character that separates key-value pairs. Default is comma (,).
ROW_SEPARATOR Lets you specify a character that splits lines in the key-value source. Default is \n.

Provenance Columns for Parsers

When parsing data, you can declare provenance columns. These return metadata for the parsed data.

For Key Values, these are as follows:

Data Type Name Description
BIGINT SQLSTREAM_PROV_PARSE_POSITION Parser position within message of last parse error.
VARCHAR(65535) SQLSTREAM_PROV_PARSE_ERROR Description of parser error.
BIGINT SQLSTREAM_PROV_TEXT_PARSE_LINE_NUMBER How many lines have been parsed so far. This value is not reset per message or file.