Projects Overview

A StreamLab project is a set of StreamLab sources, external connections, sinks, and pipeline guides. You can create projects for different use cases, such as monitoring login activity on a website, tracking vehicle speeds across a bus system, or measuring HTTP requests on a server. StreamLab contains several built-in StreamApps that you can use as the basis for projects. See StreamLab Galleries below.

StreamLab projects have settings that let you change the project’s name and schema, and manage how the project handles streaming data. See Using Project Settings below.

Sources are log files, Kafka topics, JSON files, web feeds, external database tables, s-Server streams, tables, and so on. They capture data from web feeds, sensors, message buses, network feeds, applications, databases, and other sources. You can parse local log files–files reachable from s-Server–through StreamLab.

External Connections are databases or other data sources external to SQLstream s-Server. Once you set up an external connection, you can read and write to such data sources from StreamLab using a sink.

Sinks are destinations for rows of data, usually an external file system, message bus, or database. In s-Server, a sink consists of a stream and a pump to fill it (a pump moves data from one location to another). Internally, StreamLab uses sinks to connect pipeline guides with each other.

Pipeline Guides are collections of commands, suggestions, and scripts that you use to generate SQL views on your data sources. You can view and export the SQL generated by pipeline guides. See the topic StreamLab Pipeline Guides Overview in this guide for more details.

Projects can be named, saved, and reopened. They have unique URLs, which you can share with others. Project names and user names are appended to the StreamLab URL, as in the following:

http://myserver.com:5590/?proj=MyProject&user=user
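If you share project links often, you can build them programmatically. The following is a minimal sketch that assumes only the URL shape shown above (a proj and a user query parameter). The function names project_url and project_from_url are hypothetical helpers, not part of StreamLab, and whether StreamLab accepts percent-encoded project names is an assumption you should verify against your installation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def project_url(base_url, project, user):
    """Build a shareable project URL in the shape shown above (hypothetical helper)."""
    # urlencode escapes characters that are unsafe in a query string,
    # such as spaces or ampersands in a project name.
    query = urlencode({"proj": project, "user": user})
    return f"{base_url.rstrip('/')}/?{query}"


def project_from_url(url):
    """Return the (project, user) pair carried in a project URL."""
    params = parse_qs(urlsplit(url).query)
    return params["proj"][0], params["user"][0]


print(project_url("http://myserver.com:5590", "MyProject", "user"))
# prints http://myserver.com:5590/?proj=MyProject&user=user
```

Keeping the host and port in one base_url argument makes it easy to adjust the link if StreamLab runs on a different port.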

Generally, you will want to have multiple StreamLab projects to manage different aspects of your data. You may want to start with a single project and save a copy of it when you are satisfied with it, building up a set of StreamLab applications to examine different configurations of data.

StreamLab projects are listed on the StreamLab projects home page.

Using Project Settings

You can change the project’s name and schema name and adjust other settings through the Project Settings dialog. You can access this dialog box by clicking the Settings icon in the top left corner of the StreamLab page:

StreamLab Menu

The Project Settings dialog lets you change the project’s name and schema name, and manage how StreamLab handles queries on streams that are in use, throttling, and so on.

A schema is where project elements are “stored” in s-Server. By default, all the objects you create–pipeline guides, sinks, external connections, sources–are stored in the Project schema. The schema name is particularly important for developers who are accessing content that you create in StreamLab through s-Server.

Managing How StreamLab Handles Currently-Queried Streams

When a stream is being queried, it’s not possible to change the stream with a SQL script (that is, StreamLab cannot submit a CREATE OR REPLACE STREAM script). By default, StreamLab asks you if it’s okay to terminate these queries, but you can also choose to terminate these queries without asking or never terminate these queries.

  • Stop Queries Without Asking. StreamLab automatically terminates queries when you submit SQL for a currently-queried stream.
  • Ignore, Allowing SQL Scripts to Fail. StreamLab can automatically continue running queries and allow the submitted SQL to fail.
  • Ask for Permission to Stop Queries. This is the default behavior. With this setting enabled, when you submit SQL for a currently-queried stream, StreamLab lists the currently-queried streams and asks your permission to terminate the query or queries.

When StreamLab terminates these queries, users viewing a dashboard that uses the query will see incoming data stop flowing. These users should just save their changes to the dashboard and refresh the page.

Manage the SQL Run by StreamLab

By default, StreamLab runs the entire SQL script when you open it. You can deselect the Run the Complete SQL Script When Opened option to avoid running the entire script.

In order to use sources, StreamLab renders them in SQL. If you are not using sources, you can choose to leave them unrendered, which may improve performance in some cases. The Unattached Sources option lets you choose to leave these sources unrendered.

Managing Throttling

Sometimes, you may want to slow a data feed for testing purposes. In these cases, you can throttle your source–slow it to a specified number of rows per second. The default throttled rate is one row per second, but you can adjust this default rate by entering a different number in Project Standard Throttle Rate. You can also disable throttling for the project. You would most likely want to do so once you are ready to deploy a stream app. See throttling sources for more details.
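To make the idea of a throttled rate concrete, the following sketch paces an arbitrary sequence of rows so that no more than a given number is released per second. It illustrates the concept only and is not how StreamLab or s-Server implements throttling. The function name throttle and its parameters are assumptions made for this example, and the default of one row per second mirrors the default rate described above.

```python
import time


def throttle(rows, rows_per_second=1.0, sleep=time.sleep, clock=time.monotonic):
    """Yield rows no faster than rows_per_second (illustrative sketch only)."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    interval = 1.0 / rows_per_second
    next_emit = clock()  # the first row is released immediately
    for row in rows:
        delay = next_emit - clock()
        if delay > 0:
            sleep(delay)  # wait until the next slot opens
        yield row
        # Schedule the next slot one interval after this one, without
        # trying to "catch up" if the consumer was slow.
        next_emit = max(next_emit, clock()) + interval


if __name__ == "__main__":
    # Three sample rows at ten rows per second; the rate is an arbitrary demo value.
    for row in throttle(["row 1", "row 2", "row 3"], rows_per_second=10):
        print(row)
```

The sleep and clock arguments exist only so the pacing can be tested without real waiting. In normal use you would leave them at their defaults.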

Managing How the Scrutinizer Identifies Partition Keys

The Project Settings dialog box also lets you adjust settings related to the Scrutinizer. See the topic Managing the Scrutinizer for more details.

You can adjust what columns are identified as partition keys by changing Partition Key Unique Limit and Partition Key Length Limit.

  • Partition Key Unique Limit determines how few distinct values a column must have in order to be identified as a partition key. If the Scrutinizer sees more than Partition Key Unique Limit distinct values in a column, it assumes the column probably isn’t a partition key.
  • Partition Key Length Limit determines how long a value in a partition key can be. If the Scrutinizer sees a string value longer than Partition Key Length Limit, it assumes the column probably isn’t a partition key.

You can adjust these two values to shape which columns end up being marked.
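The two limits can be read as a simple test applied to a sample of column values. The sketch below is one possible reading of that rule, not the Scrutinizer's actual code. The function name looks_like_partition_key is hypothetical, the limit values in the example are arbitrary, and treating an empty column as "not a partition key" is an assumption.

```python
def looks_like_partition_key(values, unique_limit, length_limit):
    """Apply the two limits described above to a sample of column values.

    Illustrative only: the real Scrutinizer may sample or weigh values differently.
    """
    seen = set()
    for value in values:
        text = str(value)
        if len(text) > length_limit:
            return False  # a value is too long for a partition key
        seen.add(text)
        if len(seen) > unique_limit:
            return False  # too many distinct values
    # Assumption: a column with no values is not treated as a partition key.
    return bool(seen)


# Hypothetical limits chosen only for this example.
print(looks_like_partition_key(["bus-1", "bus-2", "bus-1"], unique_limit=10, length_limit=20))
print(looks_like_partition_key(range(1000), unique_limit=10, length_limit=20))
```

Under this reading, raising either limit lets more columns qualify and lowering it lets fewer qualify.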

Using the StreamLab Projects Home Page

Each instance of StreamLab has a StreamLab projects home page. This page features projects that you have created, as well as prebuilt projects in the StreamApp gallery. See StreamLab Galleries below.

Note: The examples in this topic assume that StreamLab is running on port 5590. If you are running StreamLab on a different port, substitute that port for 5590 in StreamLab URLs.

The home page lists all StreamLab projects created in this instance of StreamLab, lets you create new StreamLab projects, and delete existing StreamLab projects.

The home page also lists existing StreamLab StreamApps. You can import and export projects, create new projects, create copies of existing projects, and pause or resume streaming data. (This stops or starts pumps for the schema.) You can start and stop StreamApps from the command line.

Pausing and Starting Data Flowing in Schemas

All sources, pipeline guides, sinks, and external connections exist in schemas. You can start or pause data flowing in these schemas from the Projects home page.

Saving Unsaved Projects from the Projects Home Page

Projects that have not been saved will feature a “current project, not saved!” alert message, and an icon that lets you save the project.

If you try to open another project without saving the current project, StreamLab will alert you and ask if you want to save the current project before proceeding.

StreamLab Galleries

The StreamApp Gallery is a collection of StreamApps based on real-world data. These StreamApps demonstrate StreamLab’s functionality and can also be used as starting templates for your projects.

Before working with any of them, you will need to start the associated data stream. If you are running Guavus SQLstream on Amazon Marketplace, on Microsoft Azure, as a Virtual Machine Appliance, as a Docker container, or as a Virtual Hard Disk, you can start these from the Guavus SQLstream cover page. If you have installed on Linux, you will need to start these manually.

This version of StreamLab features four StreamApps, each with its own source.

IoT Weather

Process real-time environmental sensor data from around the world to create a visualization of the weather. This application uses Kafka messages as a source. (Note: In order to use this demo on Linux, you need to have both Kafka and Kafkacat installed.)

Meetup RSVP

This app uses real-time Meetup RSVP messages from Meetup’s public API, including locations and event ratings, to compute analytics about event popularity and visualize the results. This application uses a WebSocket as a source.

CDN QoS

This app uses Kafka as a source, and processes telemetry from video players around the world. It visualizes this data on a map. (Note: In order to use this demo on Linux, you need to have both Kafka and Kafkacat installed.)

Sydney Buses

This app processes real-time file system based telemetry data, such as latitude and longitude, driver id, bearing, speed, and so on from buses in the Sydney metropolitan area to create dashboards that show traffic patterns in terms of bus locations, speeds, and so on. This application uses the file system as a source.

Starting StreamApp Sources from the Command Line

The CDN QoS, IoT Weather, and Sydney Buses apps all use demonstration data generated by s-Server. You can start these sources manually using the following commands:

CDN

Starts streaming data into a Kafka topic called “cdn”.

Note: In order to use this demo on Linux, you need to have both Kafka and Kafkacat installed. You can test whether it is running by running something like the following (depending on your installation of Kafka):

/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic cdn --from-beginning

Start: $SQLSTREAM_HOME/demo/cdn/start.sh

Stop: $SQLSTREAM_HOME/demo/cdn/stop.sh

Check status: $SQLSTREAM_HOME/demo/cdn/status.sh

IoT

Starts streaming data into a Kafka topic called “IoT”. (Note: In order to use this demo on Linux, you need to have both Kafka and Kafkacat installed.) You can test whether it is running by running something like the following (depending on your installation of Kafka):

/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic IoT --from-beginning

Start: $SQLSTREAM_HOME/demo/IoT/start.sh

Stop: $SQLSTREAM_HOME/demo/IoT/stop.sh

Check status: $SQLSTREAM_HOME/demo/IoT/status.sh

Sydney Buses

Starts streaming data in JSON format into a file called /tmp/buses.log.

Start: $SQLSTREAM_HOME/demo/data/buses/StreamJsonBusData.sh

Stop: $SQLSTREAM_HOME/demo/data/buses/StopJsonBusData.sh
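If you start and stop these demo sources often, you may find it convenient to collect the script paths in one place. The sketch below simply maps the commands listed above to short names. The names DEMO_SCRIPTS, demo_command, and run_demo are hypothetical, the short keys "cdn", "iot", and "buses" are labels chosen for this example, and the sketch assumes the scripts are directly executable and that SQLSTREAM_HOME is set in your environment.

```python
import os
import posixpath
import subprocess

# Script paths relative to $SQLSTREAM_HOME, copied from the commands above.
# No status script is listed for the Sydney Buses demo.
DEMO_SCRIPTS = {
    "cdn": {
        "start": "demo/cdn/start.sh",
        "stop": "demo/cdn/stop.sh",
        "status": "demo/cdn/status.sh",
    },
    "iot": {
        "start": "demo/IoT/start.sh",
        "stop": "demo/IoT/stop.sh",
        "status": "demo/IoT/status.sh",
    },
    "buses": {
        "start": "demo/data/buses/StreamJsonBusData.sh",
        "stop": "demo/data/buses/StopJsonBusData.sh",
    },
}


def demo_command(demo, action, sqlstream_home=None):
    """Return the full path of the script for a demo and action (hypothetical helper)."""
    home = sqlstream_home or os.environ["SQLSTREAM_HOME"]
    try:
        relative = DEMO_SCRIPTS[demo][action]
    except KeyError:
        raise ValueError(f"no '{action}' script listed for demo '{demo}'") from None
    # The scripts are Linux shell scripts, so join with POSIX separators.
    return posixpath.join(home, relative)


def run_demo(demo, action, sqlstream_home=None):
    """Run the script and return its exit code (assumes it is executable)."""
    return subprocess.run([demo_command(demo, action, sqlstream_home)], check=False).returncode


# "/path/to/sqlstream" is a placeholder, not a real installation path.
print(demo_command("cdn", "start", "/path/to/sqlstream"))
```

For example, run_demo("iot", "status") would run the IoT status script under whatever SQLSTREAM_HOME points to.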

Creating a Project

To create a new project, click Projects in the top left corner of the StreamLab page. In the page that opens, click the Create New Project button.

Each installation of StreamLab also features StreamApps that you can use as the basis for new projects, such as the StreamLab_Bus_Demo pictured above. You can use the Copy Project button for a StreamApp or for a project that you previously created. The Copy Project button opens the Copy Project dialog box. Here, you enter a name for the new project. The name must not already be in use in StreamLab.

In either case, a dialog box opens that lets you name the new project:

Saving Projects

To save a project, you click its title at the top middle of the StreamLab application. Once you make any changes to the project, such as adding an item to the Script, this name turns blue. Click the title to save the project. The title returns to white after you click it.

Project Schemas

Schemas are groups of objects in s-Server, including streams, tables, and so on. All StreamLab objects (pipeline guides, external connections, sinks, sources) represent one or more of these s-Server objects. A sink, for example, consists of a pump and a stream.

When you create a new source, external connection, sink, or pipeline guide, StreamLab uses the project schema by default. Using the project schema makes it easier for other s-Server developers to access the objects you create in StreamLab.

You can pause all streams in a schema by using the Start Streaming button on the Projects page. See Using the StreamLab Projects Home Page for more details.