Novel Format
TileDB introduces a novel on-disk format for storing multi-dimensional arrays. Unlike other popular systems (e.g., HDF5) that are optimized mostly for dense arrays, TileDB is optimized for both dense and sparse arrays and exposes a single, unified array API for both. In addition, TileDB writes updates as immutable, append-only fragments, so an update never has to rewrite existing data in place.
dense and sparse arrays, coupled with rapid updates
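To make the fragment idea concrete, here is a minimal conceptual sketch in Python. It does not use TileDB and does not reflect its on-disk layout; the `Fragment` and `FragmentedArray` names are hypothetical. It only assumes what the text states: each write becomes a new immutable fragment, and nothing already written is modified. Letting the most recent fragment win on reads is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


@dataclass(frozen=True)
class Fragment:
    """One immutable batch of cell writes (hypothetical, for illustration)."""
    timestamp: int
    cells: Tuple[Tuple[Coord, float], ...]


class FragmentedArray:
    """Toy array in which every write appends a new fragment."""

    def __init__(self) -> None:
        self._fragments: List[Fragment] = []

    def write(self, cells: Dict[Coord, float]) -> None:
        # An update never touches earlier fragments; it appends a new one.
        timestamp = len(self._fragments)
        self._fragments.append(Fragment(timestamp, tuple(cells.items())))

    def read(self, coord: Coord, default: float = 0.0) -> float:
        # Scan the newest fragment first so that later writes win.
        for fragment in reversed(self._fragments):
            for cell_coord, value in fragment.cells:
                if cell_coord == coord:
                    return value
        return default

    @property
    def fragment_count(self) -> int:
        return len(self._fragments)


arr = FragmentedArray()
arr.write({(0, 0): 1.0, (0, 1): 2.0})  # initial write
arr.write({(0, 1): 5.0})               # update lands in a second fragment
```

In this sketch the update costs only as much as the cells it writes, which is the intuition behind cheap appends; a real system would also need to index fragments and consolidate them over time.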
Compression
Experience fast slicing and dicing of your arrays while achieving high compression ratios with TileDB's tile-based approach. TileDB can compress array data with a growing number of compressors, such as GZIP, BZIP2, LZ4, Zstandard, Blosc, double-delta and run-length encoding.
tile-based compression with multiple compressors
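Why tile-based compression keeps slicing fast can be sketched in a few lines of Python. This is a conceptual illustration only, not TileDB's format or API: it uses the standard-library `zlib` module (the DEFLATE algorithm behind GZIP) on a one-dimensional array of integers, and `TILE_EXTENT`, `compress_tiles` and `read_slice` are hypothetical names. The point is that each tile is compressed independently, so a slice only decompresses the tiles it overlaps.

```python
import struct
import zlib
from typing import List

TILE_EXTENT = 4  # cells per tile; an arbitrary value chosen for illustration


def compress_tiles(values: List[int], tile_extent: int = TILE_EXTENT) -> List[bytes]:
    """Split values into fixed-size tiles and compress each tile on its own."""
    tiles = []
    for start in range(0, len(values), tile_extent):
        chunk = values[start:start + tile_extent]
        raw = struct.pack(f"<{len(chunk)}q", *chunk)  # 64-bit little-endian ints
        tiles.append(zlib.compress(raw))
    return tiles


def read_slice(tiles: List[bytes], start: int, stop: int,
               tile_extent: int = TILE_EXTENT) -> List[int]:
    """Return values[start:stop], decompressing only the overlapping tiles."""
    out: List[int] = []
    if start >= stop or not tiles:
        return out
    first_tile = start // tile_extent
    last_tile = min((stop - 1) // tile_extent, len(tiles) - 1)
    for t in range(first_tile, last_tile + 1):
        raw = zlib.decompress(tiles[t])
        chunk = struct.unpack(f"<{len(raw) // 8}q", raw)
        base = t * tile_extent  # global index of the tile's first cell
        lo = max(start, base) - base
        hi = min(stop, base + len(chunk)) - base
        out.extend(chunk[lo:hi])
    return out


values = list(range(10))
tiles = compress_tiles(values)       # tiles of 4, 4 and 2 cells
window = read_slice(tiles, 3, 9)     # touches all three tiles
single = read_slice(tiles, 4, 6)     # touches only the middle tile
```

The tile size is the trade-off to tune in a scheme like this: larger tiles tend to compress better, while smaller tiles mean less unneeded data is decompressed per slice.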
Parallelism
Build powerful parallel analytics on top of the TileDB array storage manager, either by relying on TileDB's internal parallelism or by combining its thread- and process-safe API and asynchronous reads and writes with your own parallel code.
internal parallelism or parallel programming
Portability
TileDB works on Linux, macOS and Windows, and is available as easy-to-install packages, binaries and Docker containers. Integrate TileDB with the tools of your favorite platform to manage massive multi-dimensional array data.
cross-platform use
Language Bindings
Enable your data science applications to work with immense amounts of data, beyond what can be stored in main memory. TileDB is built in C and C++ for performance, providing Python, R, Go and Java APIs for interoperability and ease of use.
built in C and C++ for performance, integrated with high-level languages for ease of use
Multiple Backends
Transparently store your arrays across multiple backends such as HDFS or S3-compliant object stores (like AWS S3, MinIO, or Ceph). TileDB's API is the same regardless of where the array is stored.
HDFS and AWS S3 support
Key-Value Store
Store any persistent metadata with TileDB's key-value storage functionality. A TileDB key-value store is implemented as a TileDB sparse array and inherits all its benefits (such as compression, parallelism, and multiple backend support).
persistent maps/dictionaries via sparse arrays
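One way to picture a key-value store layered on a sparse array is sketched below in Python. This is not TileDB's implementation or API. The `SparseKV` class is hypothetical, and the choice to hash each key with SHA-256 into a pair of integer coordinates is an assumption made for illustration. What it shares with the description above is the core idea: keys map to cells of a very large, mostly empty array, and only non-empty cells are stored.

```python
import hashlib
from typing import Any, Dict, Optional, Tuple

Coord = Tuple[int, int]


class SparseKV:
    """Toy key-value store backed by a sparse 2-D 'array' (a dict of cells)."""

    def __init__(self) -> None:
        # Only non-empty cells are stored, as in a sparse array.
        self._cells: Dict[Coord, Tuple[str, Any]] = {}

    @staticmethod
    def _coords(key: str) -> Coord:
        # Assumption of this sketch: derive two 64-bit coordinates from a hash.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return (int.from_bytes(digest[:8], "big"),
                int.from_bytes(digest[8:16], "big"))

    def put(self, key: str, value: Any) -> None:
        # Keep the key in the cell so a lookup can verify it found the right one.
        self._cells[self._coords(key)] = (key, value)

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        cell = self._cells.get(self._coords(key))
        if cell is None or cell[0] != key:
            return default
        return cell[1]


kv = SparseKV()
kv.put("units", "meters")
kv.put("created_by", "sensor-7")
```

Because the store is just a sparse array underneath, anything the array layer provides (compression, parallel access, remote backends) would apply to the key-value data as well, which is the benefit the text describes.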
Virtual Filesystem
Add general file management and IO to your applications for any supported storage backend using TileDB's unified "virtual filesystem" (VFS) API.
generic file IO on multiple backends
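The idea of one file API over many backends can be sketched as follows in Python. This is a conceptual illustration, not TileDB's VFS API: the `Backend` protocol, `MemoryBackend` and `VFS` classes are hypothetical, and routing by URI scheme is an assumption of the sketch. An in-memory backend stands in for real storage so that the example runs anywhere.

```python
from typing import Dict, Protocol
from urllib.parse import urlparse


class Backend(Protocol):
    """Minimal interface every storage backend must provide (hypothetical)."""

    def write(self, uri: str, data: bytes) -> None: ...
    def read(self, uri: str) -> bytes: ...


class MemoryBackend:
    """Stand-in backend that keeps files in a dict instead of real storage."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def write(self, uri: str, data: bytes) -> None:
        self._files[uri] = bytes(data)

    def read(self, uri: str) -> bytes:
        return self._files[uri]


class VFS:
    """Routes each call to a backend chosen by the URI scheme."""

    def __init__(self, backends: Dict[str, Backend]) -> None:
        self._backends = backends

    def _resolve(self, uri: str) -> Backend:
        scheme = urlparse(uri).scheme or "file"  # bare paths count as local
        if scheme not in self._backends:
            raise ValueError(f"no backend registered for scheme {scheme!r}")
        return self._backends[scheme]

    def write(self, uri: str, data: bytes) -> None:
        self._resolve(uri).write(uri, data)

    def read(self, uri: str) -> bytes:
        return self._resolve(uri).read(uri)


vfs = VFS({"file": MemoryBackend(), "s3": MemoryBackend(), "hdfs": MemoryBackend()})
vfs.write("s3://bucket/data.bin", b"abc")
vfs.write("/tmp/local.bin", b"xyz")
```

Application code in this sketch calls the same `read` and `write` whether the URI points at local disk, HDFS or an object store; only the registered backends differ.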
© 2018 TileDB, Inc. All rights reserved.