Today we are excited to release the TileDB connector for PrestoDB. PrestoDB is a popular open source distributed SQL engine for analytic queries against large amounts of data. With the TileDB connector, users can run SQL queries directly on TileDB arrays and join data stored in TileDB with external data sources. TileDB is an optimized open source storage engine for multi-dimensional structured data that can be queried directly through a growing set of interfaces and APIs, from C/C++ to Python to R, and now SQL.
TileDB models traditional relational database data as multi-dimensional arrays, taking advantage of data that exhibits a natural sorted order along one or more dimensions. As an example, the figure below shows sample New York Stock Exchange (NYSE) quote data as both a logical relational table in PrestoDB and as a two-dimensional sparse array in TileDB. Stock quotes have a natural sorted order on datetime. A common operation is to aggregate quotes by stock symbol across a limited time window. By modeling quote data as a 2D array, this type of query reduces to an aggregation over a small two-dimensional slice of the original data.
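To make the slicing idea concrete, here is a minimal pure-Python sketch. It does not use TileDB or PrestoDB and is not how either is implemented. It only illustrates why keeping quotes ordered on (symbol, datetime) lets an aggregation touch a small contiguous range instead of scanning everything. The sample quotes, the `quotes` and `keys` names, and the `average_price` helper are all hypothetical and exist only for this illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical quote records: (symbol, ISO-8601 datetime, price).
# Sorting on (symbol, datetime) stands in for cells ordered along
# two dimensions; ISO-8601 strings sort chronologically.
quotes = sorted([
    ("AAPL", "2019-03-01T09:30:00", 174.3),
    ("AAPL", "2019-03-01T09:31:00", 174.5),
    ("AAPL", "2019-03-01T16:00:00", 175.0),
    ("GOOG", "2019-03-01T09:30:00", 1124.9),
    ("GOOG", "2019-03-01T09:31:00", 1125.3),
    ("MSFT", "2019-03-01T09:30:00", 112.0),
])

# Separate list of sort keys so we can binary-search without touching prices.
keys = [(symbol, ts) for symbol, ts, _ in quotes]


def average_price(symbol, start, end):
    """Average price for one symbol over the inclusive window [start, end].

    Two binary searches locate the contiguous slice, so only the rows
    inside the window are read. Returns None if the window is empty.
    """
    lo = bisect_left(keys, (symbol, start))
    hi = bisect_right(keys, (symbol, end))
    window = [price for _, _, price in quotes[lo:hi]]
    return sum(window) / len(window) if window else None


# Aggregate AAPL quotes in the first 15 minutes of trading.
result = average_price("AAPL", "2019-03-01T09:30:00", "2019-03-01T09:45:00")
print(result)
```

In this sketch only the two AAPL rows inside the window are read. The cost of the query depends on the size of the slice rather than the size of the full dataset, which is the property the 2D array model is meant to exploit.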