For many IoT projects, relying on the cloud for data storage and analysis is inefficient has many limitations, including:
- Dependence on an Internet Connection: Cloud-based solutions only work when an active Internet connection is available. However, many IoT applications need to function offline, e.g. autonomous driving.
- Lack of Speed: The time delay between an action and its response is significant in a cloud application due to the round-trip the data needs to take. A near real-time response is, however, critical for many IoT use cases.
- Data Security / Privacy / Data ownership: There are added risks (data breaches and/or tampering) when transferring data through the network as opposed to keeping/using data directly at the source of its creation.
- Broadband Limitations: The growth of data volumes and IoT devices exceeds the speed by which broadband infrastructure can be extended. This puts a hard limit on the growth of applications depending on the cloud.
- High Cloud Costs: The more data is sent to the cloud and stored in the cloud, the more the cloud costs. Many companies find costs for cloud applications are higher than expected.
For IoT projects that cannot work soley cloud-based due to e.g. hardware or network/bandwidth limitations or a need for realtime response rates, Edge Computing is an scalable and sustaiable solution. In order to bring computing closer to the source of the data, you need an IoT database optimized for the edge.
What to look for in an IoT Database for the Edge
There are several factors to consider when choosing an IoT database for the edge. The five most important criteria to take into account are the edge-capability, performance, ACID-support, language support and data type support.
Of course, in order to qualify as edge-capable, the database needs to run directly on a broad spectrum of edge devices – either embedded or in-memory. Many IoT devices are physically small and have limited resources, so a database for the edge needs to have a small footprint. For that reason, the list below does not include databases with a core library larger than 10MB.
Many edge cases have a need for speed; for example: In additive manufacturing making necessary adjustments to the next layer added to an asset needs to happen in near real-time. Because this decision is based on a multitude of environmental factors from the factory floor, tons of data from sensors need to be processed extremely quickly.
Depending on whether you can afford to lose some of your data sometimes, you need to check if the database is fully ACID compliant – and under which conditions any benchmarks have been run. What does ACID mean? In the database world, this popular acronym refers to how data in transactions is handled by a database and stands for: Atomicity, Consistency, Isolation, Durability. In short, a fully ACID-compliant database is transactionally safe and ensures that, despite errors, power failures etc., no data is lost and the transactions are always executed in a valid way.
Another important criteria obviously is the language used to implement the database: Does it match your project’s language and developer skills? Generally speaking it is more efficient to keep to one language; this is also why many developers love to avoid dealing with SQL.
Data Type support
Finally, you need to decide on the overall structure data shall be stored in. In this article, we will only focus on full databases that enable complex computing on small devices – thus it only includes traditional databases, i.e. those that are relational, object-oriented or graph-based. Databases that are limited to time-series data only (e.g. InfluxDB, TimescaleDB) or any ORMs will not be discussed here.
IoT Databases for the Edge
In order to help you in choosing the best IoT database for your next project swiftly, we had a look around and compared available databases. Here is a list of IoT databases for use on the edge:
Badger calls itself a distributed, fast graph database. It is an ACID-compliant, NoSQL, LSM tree-based key-value store written fully in and available only for Go. As Badger does not focus on being run on IoT devices, it supports easy horizontal scaling, synchronous replication to prevent data loss, load balancing and using the full capacity of SSDs instead of the RAM. Central or P2P synchronization are not available.
Berkeley DB is an ACID compliant embedded key-value store. Due to a static library size of less than 1 MB, and runtime dynamic memory requirements of only a few KB, it is suitable for a variety of edge devices. The database is usable in many different languages, such as C++, C#, Java, Perl, PHP, Python, Ruby, Smalltalk and Tcl. Berkeley DB does not offer any synchronization support.
LevelDB is a key-value storage library that provides an ordered mapping from string keys to string values. It is written in C++ and has bindings for languages such as C, Go, NodeJS and Java. LevelDB runs on-disk and is queried without SQL. Applications need to use it as a library, as the database does not provide any server or command line interface. Indexes and synchronization are not supported.
ObjectBox is a fast, object-oriented, ACID-compliant database with strong relation support. It was designed specifically for Edge IoT and embedded and mobile applications. ObjectBox has a memory footprint of less than 1 MB. Language support includes C, Go, Java, Kotlin, Swift, Python (Beta), and Dart. Centralized synchronization support is now available for early access and distributed / P2P synchronization is a work in progress; ObjectBox also offers a time series feature, optimized for time series data.
Realm, which was acquired by MongoDB in summer 2019, is an ACID-compliant NoSQL database. It has been strongly focused on mobile platforms from its start and is only beginning to move into IoT. It offers central as well as P2P synchronization. Supported languages include Java, Kotlin, Swift, C# and NodeJS.
Redis is a key-value database, which is per default not ACID-compliant. However, it offers an optional durability transaction concept, which when turned on reduces the database performance significantly. In contrast to other database systems, it works in memory with user commands not being data queries, but specific operations to be performed on abstract data types. Redis has many different client bindings, e.g. C, C++, Dart, Go, Java, NodeJS, Python and Rust.
RocksDB is a persistent key-value store for SSD and RAM storage. It is not ACID-compliant, but works using concurrent transactions with conflict resolution. This embedded NoSQL database supports Java, Python, NodeJS, Go, PHP and Rust, to name only a few languages. Synchronization in any form is not natively possible with RocksDB.
SQLite is the only fully SQL-based relational database library in this list. SQLite comes with a small footprint and is fully ACID-compliant. It offers encryption as a paid service. There is no support for synchronization.
Let us know your thoughts!
Different use cases call for different databases, and we hope that this list gives you a good starting point for your edge computing project. Let us know your thoughts in the comments below – what is your favorite database work with with and why?