Data sources may be divided into subsets called “partitions” to enable SQLstream processing to scale out. For example:
- Apache Kafka topics are implemented as one or more partitions
- Amazon Kinesis streams are implemented as one or more partitions
- The File-VFS plugin allows you to divide sets of incoming files into logical partitions (by hashing the filename)
SQLstream pipelines can be scaled out by implementing a number of shards; the partitions are assigned as evenly as possible across the shards.
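As a minimal sketch of the idea above, the snippet below shows one common way partitions can be spread evenly across shards (a round-robin modulo scheme) and how a filename might be hashed to a logical partition. This is an illustration only, not SQLstream's actual implementation; the function names are hypothetical.

```python
import zlib

def assign_partitions(num_partitions: int, num_shards: int) -> dict[int, list[int]]:
    """Hypothetical sketch: map each shard id to the partitions it handles,
    assigning partition i to shard i % num_shards (round-robin)."""
    shards: dict[int, list[int]] = {shard: [] for shard in range(num_shards)}
    for partition in range(num_partitions):
        shards[partition % num_shards].append(partition)
    return shards

def partition_for_file(filename: str, num_partitions: int) -> int:
    """Hypothetical sketch of filename hashing (the File-VFS idea):
    a deterministic hash of the filename, reduced modulo the partition count."""
    return zlib.crc32(filename.encode()) % num_partitions

# 8 partitions spread across 3 shards: shard sizes differ by at most one.
print(assign_partitions(8, 3))
# → {0: [0, 3, 6], 1: [1, 4, 7], 2: [2, 5]}
```

With this scheme, every partition is owned by exactly one shard, and the same filename always hashes to the same partition, so related files stay together in one pipeline shard.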
For more information about how SQLstream s-Server assists in sharding SQLstream pipelines, see the documentation for the individual plugins.
For more about the mapping of partitions to shards, see shards.