Embedded databases – what is an embedded database? and how to choose one

What is an Embedded Database?

What is a database?

While – strictly speaking – “database” refers to a (systematic) collection of data, “Database Management System” (or DBMS) refers to the piece of software for storing and managing that data. However, often the term “database” is also used loosely to refer to a DBMS, and you will find most DBMS only use the term database in their name and communication.

What does embedded mean in the realm of databases?

The term “embedded” can be used with two different meanings in the database context. A lot of confusion arises from these terms being used interchangeably. So, let’s first bring clarity into the terminology.

 💡 The term “embedded” in databases

 “Embedded database” means a database that is deeply integrated and built into the software instead of coming as a standalone app. The embedded database sits on the application layer and needs no extra server. It is also referred to as an “embeddable database”, “embedded database management system” or “embedded DBMS (Database Management System)”.

“Database for embedded systems” is a database specifically designed to be used in embedded systems. Embedded systems consist of a hardware / software stack that is deeply integrated, e.g. microcontrollers or mobile devices. A database for such systems must be small and optimized to run on highly restricted hardware (small footprint, efficiency). This can also be called an “embedded system database”. For clarity, we will only use the first term in this article.
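
To make the first meaning concrete, here is a minimal sketch in Python. It uses SQLite, one of the embedded databases compared later in this article, through Python's standard sqlite3 module. The table and column names are invented for illustration. The point is that the database engine runs inside the application process: there is no server to install, start or connect to.

```python
import sqlite3

# Opening a connection loads the database engine into this process.
# ":memory:" keeps the data in RAM; a file path would persist it to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table, for illustration only.
conn.execute(
    "CREATE TABLE reading (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
)
conn.execute("INSERT INTO reading (sensor, value) VALUES (?, ?)", ("temp", 21.5))
conn.commit()

# Queries are plain function calls into the library, not network requests.
rows = conn.execute("SELECT sensor, value FROM reading").fetchall()
print(rows)
```

Other embedded databases expose different APIs (key-value or object-oriented, for example), but the in-process deployment model is the same.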

Embedded Database vs Embedded System

What is an embedded system / embedded device?

Embedded systems / embedded devices are everywhere nowadays. They are used in most industries, ranging from manufacturing and automotive, to healthcare and consumer electronics. Essentially, an embedded system is a small piece of hardware that has software integrated in it. These are typically highly restricted (CPU, power, memory, …) and connected (Wi-Fi, Bluetooth, ZigBee, …) devices. Embedded Systems very often form a part of a larger system. Each individual embedded system serves a small number of specific functions within the larger system. As a result, embedded systems often form a complex decentralized system.

 

Examples of embedded systems: smartphones, controlling units, micro-controllers, cameras, smart watches, home appliances, ATMs, robots, sensors, medical devices, and many more.

Embedded Database vs Database for Embedded Systems

When and why is there a need for a database for embedded devices?

A large number of embedded systems have limited computational power, so the efficiency and footprint of the DBMS are vital. This fact gave rise to a new market of databases made specifically for embedded systems. Because they are lightweight and fast, embedded databases can work well in embedded systems. However, not all embedded databases are suitable for embedded devices. Features like fast and efficient local data storage and efficient synchronisation with the backend play a huge role in determining which databases work best in embedded systems.

A database that is both embedded in the application and works well in embedded systems is called an Edge database. To clarify, an Edge database is an embedded database optimised for resource efficiency on restricted, decentralised devices (typically embedded devices). Mobile databases, for example, are a type of Edge database that supports mobile operating systems like Android and iOS.

New Edge databases address the challenge of a rapidly growing number of embedded devices, in both the professional / industrial world and the consumer world. Edge databases hence create value for decentralised devices and their data by making the devices more useful.

    A database for embedded systems / embedded devices can simultaneously be an embedded database. However, its on-device resource use matters more, since it has to serve restricted devices. A database that is embedded and optimized for restricted devices is called an “Edge database”.

    Why use an embedded database in an embedded system?

    First of all, local data storage enabled by embedded databases is a big advantage for embedded systems. Because these systems often have limited connectivity or realtime requirements, one often cannot rely on the network for retrieving data from the cloud. Instead, a smart solution is to store data locally on the device and sync it with other parts of the system only when needed.

    Aside: a word about data sync. Embedded systems often deal with large amounts of data, while also having an unreliable or non-permanent connection. This can be imposed by the limitations of the system or done deliberately to save battery life. Thus, a suitable synchronisation solution should not only sync data every time there is a connection, but also do it efficiently. For example, differential sync works well: by only sending the changes to the server, it will help to avoid unnecessary energy use and also save network costs.
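
The idea of differential sync can be sketched in a few lines of Python. This is a generic illustration under simplifying assumptions (records are plain dictionaries keyed by ID, and the device keeps a snapshot of what was last synced); it is not the protocol of any particular sync product, and the function names are hypothetical.

```python
def compute_delta(last_synced, current):
    """Return only what changed since the last successful sync."""
    upserts = {
        key: value
        for key, value in current.items()
        if key not in last_synced or last_synced[key] != value
    }
    deletes = [key for key in last_synced if key not in current]
    return {"upserts": upserts, "deletes": deletes}


def apply_delta(replica, delta):
    """Apply a delta to the server-side copy and return the new state."""
    updated = dict(replica)
    updated.update(delta["upserts"])
    for key in delta["deletes"]:
        updated.pop(key, None)
    return updated


# State the server already has, and the device's current local state.
last_synced = {1: {"temp": 20.0}, 2: {"temp": 21.0}, 3: {"temp": 19.5}}
current = {1: {"temp": 20.0}, 2: {"temp": 22.5}, 4: {"temp": 18.0}}

delta = compute_delta(last_synced, current)   # only records 2, 3 and 4 are involved
server = apply_delta(last_synced, delta)      # server now matches the device
print(delta)
```

Record 1 is unchanged, so it is never sent. On a device with thousands of mostly unchanged records, this is where the savings in energy and network cost come from.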

    The two most important features of databases in embedded systems are performance and reliability. A database used in embedded systems should perform well on devices with limited CPU and memory. This is why embedded databases might work well in embedded systems – they are largely designed to work in exactly such environments. Some of them are truly tiny, which means they thrive in small applications. While better performance helps to eliminate some of the risks, it does not help with sudden power failures. Therefore, a good data recovery procedure is also important. ACID compliance is the most concise indicator of this.
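
As a rough illustration of what atomicity (the "A" in ACID) buys you, the following Python sketch uses SQLite via the standard sqlite3 module. A raised exception stands in for a failure in the middle of a write; this is an assumption made for the sake of a runnable example and does not reproduce a real power loss, which additionally depends on the storage medium and the database's journaling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (id INTEGER PRIMARY KEY, value REAL NOT NULL)")
conn.execute("INSERT INTO reading (value) VALUES (1.0)")
conn.commit()  # this row is safely committed

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("INSERT INTO reading (value) VALUES (2.0)")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

# The half-finished transaction left no trace; only the committed row remains.
count = conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
print(count)
```

The database ends up in the last committed state rather than somewhere in between, which is the behaviour you want after an interrupted write.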

    Let’s have a look at the features of embedded databases that make them a great choice for embedded systems. 

    Advantages of embedded databases

    1. High performance. Truly embedded databases benefit from simpler architecture, as they do not require a separate server module. While the client/server architecture might benefit from the ability to install the server on a more powerful computer, this also means more risk. Getting rid of the client/server communication level reduces complexity and therefore boosts performance. 
    2. Reliability. Many embedded devices use battery power, so sudden power failures might happen. Therefore, the data management solution should be built to ensure that data is fully recovered in case of a power failure. This is a popular feature of embedded databases that are built with embedded systems in mind.  
    3. Ease of use and low maintenance. Other important benefits of using an embedded database include easy implementation and low maintenance. Designing embedded devices often means working on tight schedules, so choosing an out-of-the-box data persistence solution is the best choice for many projects. Since embedded databases are embedded directly in the application, they do not need administration and effectively manage themselves.
    4. Small footprint. Embedded databases do not always have a small footprint, but some of them are smaller than 1 MB, which makes them particularly suitable for mobile and IoT devices with limited memory.
    5. Scalability. As the number of embedded devices grows every year, so does the data volume. An efficient solution should not only perform well with large sets of data, but also adapt to new device features and easily change to fit the needs of a new device. This is where rigid database schemas come as a disadvantage.

     

    How to choose an embedded database

    When choosing an embedded database, look out for such factors as ACID (atomicity, consistency, isolation, durability) compliance, CRUD performance, footprint, and (depending on the device needs) data sync.
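
If you want a first impression of CRUD performance on your own hardware, a small timing harness is easy to write. The sketch below uses SQLite via Python's sqlite3 module purely as an example; the table layout and the function name are hypothetical, and an in-memory run like this says little about behaviour on real device storage. Treat it as a starting point, not as a benchmark result.

```python
import sqlite3
import time


def time_crud(n=1000):
    """Time create, read, update and delete of n rows; returns seconds per step."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, payload TEXT)")
    timings = {}

    start = time.perf_counter()
    with conn:  # one transaction for all inserts
        conn.executemany(
            "INSERT INTO item (id, payload) VALUES (?, ?)",
            ((i, f"value-{i}") for i in range(n)),
        )
    timings["create"] = time.perf_counter() - start

    start = time.perf_counter()
    rows = conn.execute("SELECT id, payload FROM item").fetchall()
    timings["read"] = time.perf_counter() - start

    start = time.perf_counter()
    with conn:
        conn.execute("UPDATE item SET payload = payload || '-updated'")
    timings["update"] = time.perf_counter() - start

    start = time.perf_counter()
    with conn:
        conn.execute("DELETE FROM item")
    timings["delete"] = time.perf_counter() - start

    remaining = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
    conn.close()
    return timings, len(rows), remaining


timings, rows_read, rows_left = time_crud()
for step, seconds in timings.items():
    print(f"{step}: {seconds * 1000:.2f} ms")
```

For a meaningful comparison, run the same workload against each candidate database on the target device, with realistic data sizes and file-based storage.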

    SQLite and SQLite alternatives – a detailed look at the market of embedded databases

    Database solution | Primary model | Minimum footprint | Sync | Languages
    SQLite | relational | <1MB | no | C/C++, Tcl, Python, Java, Go, Matlab, PHP, and more
    Mongo Realm | object-oriented NoSQL database | 5 MB+ | sync only via Mongo Cloud | Swift, Objective-C, Java, Kotlin, C#, JavaScript
    Berkeley DB | NoSQL database; key-value store | <2MB | no | C++, C#, Java, Perl, PHP, Python, Ruby, Smalltalk and Tcl
    LMDB | key-value store | <1MB | no | C++, Java, Python, Lua, Go, Ruby, Objective-C, JavaScript, C#, Perl, PHP, etc
    RocksDB | key-value store | | no | C++, C, Java, Python, NodeJS, Go, PHP, Rust, and others
    ObjectBox | object-oriented NoSQL database | <1MB | offline, on-premise and cloud Sync, p2p Sync is planned | Java, Kotlin, C, C++, Swift, Go, Flutter / Dart, Python
    Couchbase Lite | NoSQL DB; document store | 1-5 MB | sync needs a Couchbase Server | Swift, Objective-C, C#, C, Java, Kotlin, JavaScript
    UnQLite | NoSQL; document & key-value store | ~1.5 MB | no | C, C++, Python
    extremeDB | in-memory relational DB, hybrid persistence | <1 MB | no | C, C#, C++, Java, Lua, Python, Rust

    When to use an Embedded Database and how to choose one

    Firstly, when choosing a database for an embedded system, one has to consider several factors. The most important ones are performance, reliability, maintenance and footprint. On highly restricted devices, even a small difference in one of those parameters might make an impact. While building your own solution with a particular device in mind would certainly work well, tight schedules and additional effort don’t always justify this decision. This is why we recommend choosing one of the ready-made solutions that were built with the specifics of embedded systems in mind. 

    Secondly, to avoid unnecessary network and battery use, you might want to choose an embedded database. On top of that, an efficient differential data sync solution will help reduce overhead and the environmental footprint.

    Finally, there are several embedded databases that perform well on embedded devices. Each has its own benefits and drawbacks, so it’s up to you to choose the right one for your use case. That being said, we’d like to point out that, according to our open source performance benchmarks, ObjectBox outperforms all competitors across each CRUD operation. See for yourself by checking out the benchmarks.

    Beginner C++ tutorial: ObjectBox installation

    This ObjectBox beginner tutorial is for people who have limited knowledge of C++ development (no prior experience with external libraries is required). It will walk you through the installation process of all the development tools needed to get started with ObjectBox on Windows. By the way, ObjectBox is a database with intuitive native APIs, so it won’t take you long to start using it.

    Firstly, we will need to set up a Linux subsystem (WSL2) and install such tools as:

    • CMake, which will generate build files from the ObjectBox source code to work on Linux;
    • Git, which will download the source code from the ObjectBox repository.

    Then, we will install ObjectBox and run a simple example in Visual Studio Code.

    Windows Subsystem for Linux (WSL2)

    In this section, you will set up a simple Linux subsystem that you can use to build ObjectBox in C++.

    1. Install WSL (Note: this requires a reboot; it also configures a limited Hyper-V that may cause issues with e.g. VirtualBox).
      Warning: to paste e.g. a password to the Ubuntu setup console window, right-click the title bar and select Edit → Paste. CTRL + V may not work.
    2. (optional, but recommended) install Windows Terminal from Microsoft Store and use Ubuntu from there (does not have the copy/paste issue, also supports terminal apps better).
    Windows Terminal in the Microsoft Store

    3. Within Windows Terminal, open Ubuntu by choosing it from the dropdown menu.

    Drop-down menu in Windows Terminal, through which a new tab for Ubuntu can be opened

    4. Get the latest packages and upgrade:

    5. Install build tools

    Install ObjectBox using CMake

    Now that you have WSL2 and all the packages, we can switch to VS Code and install ObjectBox with the help of CMake.

    1. In Ubuntu, create a new directory and then open it in Visual Studio Code:

    2. Install the following extensions:

    Extensions tab in Visual Studio Code, showing what needs to be installed in this tutorial: C/C++, CMake Tools and Remote - WSL

    3. Create a text file called CMakeLists.txt with the following code. It will tell CMake to get the ObjectBox source code from its Git repository and link the library to your project.

    4. Create a simple main.cpp file that will help us verify the setup:

    5. Follow this official guide for VS Code and CMake to select Clang as the compiler, then configure and build ObjectBox. As a result, .vscode and build folders will be generated, so your directory should now look like this:

    Explorer tab in Visual Studio Code, showing the two new folders that were generated after a successful build

    Running the tasks-list app example

    Finally, we can check that everything works and run a simple example.

    1. Click the “Select target to launch” button on the status bar and select “myapp” from the dropdown menu. Then launch it. You should see it output the correct version as in the screenshot.

    "Select launch target" menu in Visual Studio Code
    Output of main.cpp, verifying the version of ObjectBox used and demonstrating that the C++ build files were generated correctly.

    2. Before proceeding with the example, you need to download the most recent ObjectBox generator for Linux from releases. Then come back to the Windows Terminal and type

    to open the current directory in Windows Explorer. Copy the objectbox-generator file in there.

    3. Back in VS Code, you should now run the generator for the example code:

    If you get a “permission denied” error, try this to make the generator file executable for your user:

    4. Now choose objectbox-c-examples-tasks-cpp-gen as the target and run it. You should see the menu of a simple to-do list app as shown on the screenshot. It stores your tasks, together with their creation time and status. Try playing around with it and exploring the code of this example app to get a feel of how ObjectBox can be used.

    Output of the ObjectBox C++ tasks-list app example showing its menu with available commands

    Note: if you see a sync error (e.g. Can not modify object of sync-enabled type “Task” because sync has not been activated for this store), please delete the first line from the tasklist.fbs file and run the objectbox generator once again. Or, if you want to try sync, apply for our Early Access Data Sync. There is a separate example (called objectbox-c-examples-tasks-cpp-gen-sync) that you can run after installing the Sync Server.

    Why Edge Computing is More Relevant in 2021 Than Ever

    The world has been forced to digitize more quickly and to a greater extent in 2020 and 2021. COVID has created the need to remodel how work, socializing, production, entertainment, and supply chains function. Despite decades of digitization efforts, the pandemic has made digitization challenges apparent. Many companies and countries now realize that they have fallen behind. And those that have not yet digitized were hit hardest by the pandemic. [1] With people leaning heavily on online digital solutions, internet infrastructure is at its capacity limit. [2] Accordingly, users are seeing broadband speeds drop by as much as half. [3] In Europe, governments even requested that Netflix, Amazon Prime, YouTube and other streaming services reduce their quality to improve network speed. [4]

    These challenges demonstrate the growing need for an alternative to cloud computing. Cloud computing is an inherently centralized computing paradigm. Edge Computing is a decentralized topology that is based on keeping data local, at the ‘edge’ of the network, as close to the source as possible. Edge Computing is ideal for applications that are data-intensive, have strict latency requirements, or need to work offline, independent of a cloud connection. Using data on the edge, directly on or near the source of the data, not only increases the efficiency and speed of data use, but also reduces unnecessary network burden and data traffic waste.

    Coronavirus accelerates the need to digitize

    It was clear even before the outbreak that internet infrastructure was struggling to keep up with growing data volumes. However, the pandemic has made broadband limitations more apparent to everyday users.

    Projections estimate that by 2025 there will be 20 million IoT devices [5] and 1.7MB of data created per second per person. It is slow, expensive, and wasteful to send all of this data to the cloud for storage and processing. This practice overburdens bandwidth and data center infrastructure. It makes projects expensive and unsustainable. Working with the data, locally, on the edge, where it was produced and is used, is more efficient than sending everything to the cloud and back. It brings reduced latency, reduced cloud usage and costs, independence from a network connection, more secure data and heightened data privacy – and even reduces CO2. Indeed, prior to the pandemic, edge computing was on the strategic roadmap for over 50% of mobility decision makers. [6]

    As the world begins to recover from the coronavirus pandemic, digitization efforts will no doubt increase. We will see intelligent systems implemented across industries and value chains, accelerating innovation and, alongside it, growing data volumes and the resulting strain on network bandwidth. Edge computing is a key technology to ensure that this digitalization is both scalable and sustainable.

    Edge Computing takes the ‘edge’ off bandwidth strain

    What is Edge Computing?

    With edge computing, data is stored and used on devices at the “edge” of the network – away from centralized cloud servers. Computing on the edge means that data is stored and used locally, on the device, e.g. a smart phone or IoT device. Edge computing delivers faster decision making, local and offline data processing, as well as reduced data transfer to the cloud (e.g. filtered, computed, extra- or interpolated data), which saves both bandwidth and cloud storage costs. 
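
As a rough sketch of "reduced data transfer to the cloud", the following Python example summarizes raw sensor readings on the device and compares the size of the summary with the size of the raw data. The readings are synthetic and the record layout and function name are hypothetical; real savings depend entirely on the use case.

```python
import json
from statistics import mean


def summarize(readings):
    """Reduce raw readings to one summary record suitable for upload."""
    values = [r["value"] for r in readings]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Synthetic data: 600 readings, e.g. one per second for ten minutes.
raw = [{"ts": i, "value": 20.0 + (i % 5)} for i in range(600)]
summary = summarize(raw)

# Compare what would go over the network in each case.
raw_bytes = len(json.dumps(raw).encode("utf-8"))
summary_bytes = len(json.dumps(summary).encode("utf-8"))
print(f"raw upload: {raw_bytes} bytes, summary upload: {summary_bytes} bytes")
```

The raw readings stay available locally for on-device decisions, while the cloud only receives the computed summary.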

    The Edge complements the Cloud

    Although some might set cloud and edge in competition, the reality is that edge computing and cloud computing are both useful and relevant technologies. Both have different strengths and ideal use cases. Together they can provide the best of both worlds: decentralized local storage and processing, making efficient use of hardware on the edge and central storing and processing of some data, enabling additional centralized insights, data backups (redundancy), and remote access. To combine the best of both worlds, relevant and useful data must be synchronized between the edge and cloud in a smart and efficient way.  

    Edge computing is an ideal technology to reduce the strain on data centers, so those functions that need cloud connection have adequate bandwidth; while those use cases that benefit from reduced latency and offline functionality are optimized on the edge.

    The Edge: interface between the Physical and the Digital World

    Edge devices handle the interface between the physical world and the cloud, enabling a whole set of new use cases. “Data-driven experiences are rich, immersive and immediate. But they’re also delay-intolerant data hogs”. [8] They therefore need to happen locally, on the edge. We may see edge computing enabling new forms of remote engagement [9], particularly in a post-corona environment.

    Edge devices can be anything from a thermostat or small sensor to a fridge or mobile phone or car – and they are part of our direct physical world and use data from their local environment to enable new use cases. Think self-stocking fridges, self-driving cars, drone-delivered pizzas. In the same way, Edge Computing is the key to the first real world search engine. I am waiting for it every day: “Hey Google, where are my keys?” Within a location like a house, the concepts and technologies to enable such a real-world search engine are all clear and available – it is just a matter of time and ongoing digitization. The basis will need to be a fast and sustainable edge infrastructure. 

    Sustainability on the Edge

    Centralized data centers consume a lot of energy, produce a lot of carbon emissions and cause significant electronic waste. [10] While there is a positive trend towards green data centers, an even more sustainable approach is to cut unnecessary cloud traffic, central computation and storage as much as possible by shifting computation to the edge. Edge Computing strategies that harness the power of already deployed hardware (e.g. smartphones, machines, desktops, gateways) make the solution even more sustainable.

    Intelligent Edge: AI and Edge advance hand in hand

    The growth of Artificial Intelligence (AI) and the Edge will go hand in hand. As more and more data is generated at the edge of the network, there will be a greater demand for intelligent data processing and structured optimization to reduce raw data loads going to the cloud. [11] Edge AI will have the power to work with data on local devices, keeping data streams more useful and usable. In the near future, Machine Learning applications will have the ability to learn and create unique, localized, decentralized insights on the edge – based on local inputs.

    “With Edge AI, personalization features that we want from the app can be achieved on device. Transferring data over networks and into cloud-based servers allows for latency. At each endpoint, there are security risks involved in the data transfer”. [12] This is part of the reason why the Edge AI Software market is forecasted to reach a volume of 1.12 trillion dollars by 2023. The development of AI accelerators that improve model inferencing on the edge, notably from NVIDIA, Intel and Google, is helping to make AI on the edge more viable. [13] A fast edge database is a necessary base technology to enable more AI on the edge.

    Edge Computing – an answer to Data Privacy concerns and a need for Resilience

    As data collection grows in both breadth and depth, there is a stronger need for data privacy and security. Edge computing is one way to tackle this challenge: keeping data where it is produced, locally, makes data ownership clear and data less likely to be attacked and compromised. If a device is compromised, the affected data is clearly defined, making notification and subsequent actions manageable. ObjectBox, in its core and as an edge technology, is designed to keep data private, on those devices it was created on, and only share select data as needed.

    The more our private and working lives as well as the larger economy depend on digitalization, the more important it is that systems, underlying computing paradigms as well as networks have strong resilience and security. In computer networking, resilience is the ability to “provide and maintain an acceptable level of service in the face of faults and challenges to normal operation.” [14]

    Edge Computing shifts computer workloads – the collection, processing, and storage of data – from central locations (like the cloud) to the edge of the network, to many individual devices such as cell phones. Accordingly, any strain is distributed across many devices. Therefore, the risk of a total breakdown is reduced: if one device stops working, the rest still work. Depending on the setup, the individual devices could even compensate for devices that have a problem.

    The same applies to security risks: even if data from one device is compromised, all other data sets are still safe; the loss is thus very limited and clear. Overall, as a complement to the cloud, edge computing provides improved strength and security in local networks around the world. These local infrastructures can relieve the pressure on the existing complex dependencies, and in turn make the wider system more resilient and flexible. With Edge Computing, crisis response can therefore in all likelihood be faster, better informed, and more effective. [15]

    Why Corona-Tracking-Apps need to work on the edge

    There was initially quite some debate about taking a centralized versus decentralized approach to Corona-Tracking-Apps. [16] Many people were worried about their data. Edge Computing – storing most parts of the data locally, on the user’s device – is a great way to avoid unnecessary data sharing and keep data ownership clear. At the same time, data is by and large much more secure and less likely to be attacked, as the amount of data to be gained is much smaller. An intelligent syncing mechanism like ObjectBox Sync ensures that the data which needs to be shared is shared in a selective, transparent and secure way.

    The next few years will see big cultural changes in both our personal and professional lives – a portion of those changes will be driven by increased digitalization. Edge computing is an important paradigm to ensure these changes are sustainable, scalable, and secure. Ultimately, we have the chance to rise from this crisis with new insights, new innovation, and a more sustainable future.

    1. https://www.netzoekonom.de/2020/04/11/die-oekonomie-nach-corona-digitalisierung-und-automatisierung-in-hoechstgeschwindigkeit/
    2. https://www.cnet.com/news/coronavirus-has-made-peak-internet-usage-into-the-new-normal/
    3. https://www.nytimes.com/2020/03/26/business/coronavirus-internet-traffic-speed.html
    4. https://www.theverge.com/2020/3/27/21195358/streaming-netflix-disney-hbo-now-youtube-twitch-amazon-prime-video-coronavirus-broadband-network
    5. https://www.gartner.com/imagesrv/books/iot/iotEbook_digital.pdf
    6. https://www.forbes.com/sites/forrester/2019/12/02/predictions-2020-edge-computing-makes-the-leap/#1aba50104201
    7. https://www.gartner.com/smarterwithgartner/what-edge-computing-means-for-infrastructure-and-operations-leaders/
    8. https://www.iotworldtoday.com/2020/03/19/ai-at-the-edge-still-mostly-consumer-not-enterprise-market/
    9. https://www.accenture.com/us-en/insights/high-tech/edge-processing-remote-viewership
    10. https://link.springer.com/article/10.1007/s12053-019-09833-8
    11. https://www.forbes.com/sites/cognitiveworld/2020/04/16/edge-ai-is-the-future-intel-and-udacity-are-teaming-up-to-train-developers/#232c8fab68f2
    12. https://www.forbes.com/sites/cognitiveworld/2020/04/16/edge-ai-is-the-future-intel-and-udacity-are-teaming-up-to-train-developers/#232c8fab68f2
    13. https://www.forbes.com/sites/janakirammsv/2019/07/15/how-ai-accelerators-are-changing-the-face-of-edge-computing/#2c1304ce674f
    14. https://en.wikipedia.org/wiki/Resilience_(network)
    15. https://www.coindesk.com/how-edge-computing-can-make-us-more-resilient-in-a-crisis
    16. https://venturebeat.com/2020/04/13/what-privacy-preserving-coronavirus-tracing-apps-need-to-succeed/

    How Building Green IoT Solutions on the Edge Can Help Save Energy and CO2

    The internet of things (IoT) has a huge potential to reduce carbon emissions, as it enables new ways of operating, living, and working [1] that are more efficient and sustainable. However, IoT’s huge and growing electricity demands are a challenge. This demand is due primarily to the transmission and storage of data in cloud data centers. [2] While data center efficiency and the use of green energy will reduce the CO2 emissions caused by this practice, they do not address the problem directly. [3]

    With ObjectBox, we address this unseen and fast-growing CO2 source at the root: ObjectBox empowers edge computing, reducing the volume of data transmitted to central data storage, while at the same time heightening data transmission and storage efficiency. [4] We’ve talked before about how edge computing is necessary for a sustainable future; below we dive into the numbers a bit deeper. TL;DR: ObjectBox enables companies to cut the power consumption of their IoT applications, and thus their emissions, by 50 – 90%. For 2025, the potential impact of ObjectBox is a carbon emission reduction of 594 million metric tons (see calculations below).

    How ObjectBox’ Technology Reduces Overall Data Transmission

    ObjectBox reduces data transmission in two ways: 1. ObjectBox reduces the need for data transmission, 2. ObjectBox makes data transmission more efficient. ObjectBox’ database solution allows companies to build products that store and process data on edge devices and work with that data offline (as well as online). This not only improves performance and customer experience, it also reduces the overall volume of data that is being sent to the cloud, and thus the energy needed to transfer the data as well as store it in the cloud. ObjectBox’ Synchronization solution makes it easy for companies to transmit only the data that needs to be transmitted through 1) selective two-way syncing and 2) differential delta syncing. Synchronizing select data reduces the energy required for unnecessarily transmitting all data to the cloud.

    We have demonstrated in exemplary case studies that ObjectBox can reduce total data transmissions by 70-90%, depending on the case. There will, however, typically be value in transmitting some parts of data to a central data center (cloud); ObjectBox Sync combines efficient compression based on standard and proprietary edge compression methods to keep this data small. ObjectBox also has very little overhead. Comparing the transmission of the same data sets, ObjectBox saves 40-60% on transmission data volume through the delta syncing and compression, and thus saves equivalent CO2 emissions for data transmissions. Additional studies support these results and have shown that moving from a centralized to a distributed data structure saves between 32 and 93% of transmission data. [5]

    Calculations: How Does ObjectBox Save CO2?

    Physically using a device consumes little energy directly; it is the wireless cloud infrastructure in the backend (data center storage and data transmission) that is responsible for the high carbon footprint of mobile phones [6] and IoT devices. Estimates say that IoT devices will produce around 2.8 ZB of data in 2020 (or 2,823,000,000,000 GB), globally. [7] Only a small portion of that data actually gets stored and used; we chose to use a conservative estimate of 5% [8] (141,150,000,000 GB), and of that portion, 90% is transferred to the cloud [9] (127,035,000,000 GB). Transferring 1 GB of data to the cloud and storing it there costs between 3 and 7 kWh. [10] Assuming an average of 5 kWh, this means 127,035,000,000 GB multiplied by 5 kWh, resulting in a total energy expenditure of 635,175,000,000 kWh. CO2 emissions vary depending on how the energy is generated. We are using a global average of 0.475 kg CO2 per kWh. [11] In total this means that there will be 301,708,125,000 kg of CO2, or roughly 301 million metric tons of CO2, produced to transfer data to the cloud and store it there in 2020.
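
The chain of multiplications above is easy to retrace. The short Python sketch below uses only the figures quoted in the paragraph; the variable names are ours.

```python
# Inputs as stated in the text.
iot_data_gb = 2_823_000_000_000   # IoT data produced in 2020, in GB
stored_share = 0.05               # portion that is actually stored and used
cloud_share = 0.90                # portion of stored data sent to the cloud
kwh_per_gb = 5                    # assumed average of the 3 to 7 kWh range
kg_co2_per_kwh = 0.475            # global average emission factor

stored_gb = iot_data_gb * stored_share
cloud_gb = stored_gb * cloud_share
energy_kwh = cloud_gb * kwh_per_gb
co2_kg = energy_kwh * kg_co2_per_kwh
co2_million_tonnes = co2_kg / 1_000 / 1_000_000  # kg -> metric tons -> millions

print(f"stored and used: {stored_gb:,.0f} GB")
print(f"sent to cloud:   {cloud_gb:,.0f} GB")
print(f"energy:          {energy_kwh:,.0f} kWh")
print(f"CO2:             {co2_kg:,.0f} kg ({co2_million_tonnes:.1f} million metric tons)")
```

Changing any single assumption (for example the stored share or the energy per GB) scales the result linearly, which makes it simple to test how sensitive the estimate is.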

    Projections for 2025 have data volumes as high as 79.4 ZB. [12] Following the same calculations as above, IoT devices would be responsible for roughly 8.5 billion metric tons of CO2 in 2025.* We estimate that using ObjectBox can cut CO2 caused by data transmission and data centers by 50-90%, by keeping the majority of data on the device, and transmitting data efficiently. It will take time for ObjectBox to enter the market, so assuming a 10% market saturation by 2025 and an average energy reduction of 70%, using ObjectBox could cut projected CO2 emissions by 594 million metric tons in 2025.
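
The 2025 projection can be retraced the same way. This sketch reuses the assumptions from the 2020 calculation (5% stored, 90% sent to the cloud, 5 kWh per GB, 0.475 kg CO2 per kWh) and converts ZB to GB with 1 ZB = 10^12 GB; the function name is ours.

```python
def co2_tonnes(data_zb, stored_share=0.05, cloud_share=0.90,
               kwh_per_gb=5, kg_co2_per_kwh=0.475):
    """Metric tons of CO2 for moving and storing IoT data in the cloud."""
    data_gb = data_zb * 10**12                      # 1 ZB = 10^12 GB
    cloud_gb = data_gb * stored_share * cloud_share
    energy_kwh = cloud_gb * kwh_per_gb
    return energy_kwh * kg_co2_per_kwh / 1_000      # kg -> metric tons


baseline_2025 = co2_tonnes(79.4)

market_share = 0.10   # assumed market saturation by 2025
reduction = 0.70      # assumed average energy reduction
saved_2025 = baseline_2025 * market_share * reduction

print(f"baseline 2025: {baseline_2025 / 1e9:.2f} billion metric tons")
print(f"saved in 2025: {saved_2025 / 1e6:.0f} million metric tons")
```

With these inputs the baseline comes to about 8.49 billion metric tons and the saving to about 594 million metric tons, which matches the figure quoted above.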

    ObjectBox is on a mission to reduce digital waste which unnecessarily burdens bandwidth infrastructure and fills cloud servers, forcing the expansion of cloud farms and in turn, contributing to the pollution of the environment. As our digital world grows, we all need to give some thought to how we should structure our digital environments to optimize and support useful, beneficial solutions, while also keeping them efficient and sustainable. 

    *Of course, in that time, the technologies will all be more efficient and thus use less electricity while at the same time CO2 emissions / kWh will have dropped too. Thus, we are aware that this projection is an oversimplification of a highly complex and constantly changing system.

    [1] https://www.theclimategroup.org/sites/default/files/archive/files/Smart2020Report.pdf
    [2] https://www.iea.org/reports/tracking-buildings/data-centres-and-data-transmission-networks
    [3] “Data centres… have eaten into any progress we made to achieving Ireland’s 40% carbon emissions reduction target.” from https://www.climatechangenews.com/2017/12/11/tsunami-data-consume-one-fifth-global-electricity-2025/
    [4] https://medium.com/stanford-magazine/carbon-and-the-cloud-d6f481b79dfe
    [5] https://www.researchgate.net/publication/323867714_The_carbon_footprint_of_distributed_cloud_storage
    [6] https://www.resilience.org/stories/2020-01-07/the-invisible-and-growing-ecological-footprint-of-digital-technology/
    [7] https://www.idc.com/getdoc.jsp?containerId=prUS45213219, https://priceonomics.com/the-iot-data-explosion-how-big-is-the-iot-data/, https://www.gartner.com/en/newsroom/press-releases/2018-11-07-gartner-identifies-top-10-strategic-iot-technologies-and-trends, https://www.iotjournaal.nl/wp-content/uploads/2017/02/white-paper-c11-738085.pdf, ObjectBox research
    [8] Forrester (https://internetofthingsagenda.techtarget.com/blog/IoT-Agenda/Preventing-IoT-data-waste-with-the-intelligent-edge), Harvard BR (https://hbr.org/2017/05/whats-your-data-strategy), IBM (http://www.redbooks.ibm.com/redbooks/pdfs/sg248435.pdf), McKinsey (https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/the-internet-of-things-the-value-of-digitizing-the-physical-world)
    [9] https://www.gartner.com/smarterwithgartner/what-edge-computing-means-for-infrastructure-and-operations-leaders/
    [10] According to the American Council for an Energy-Efficient Economy: 5.12 kWh of electricity / GB of transferred data. According to a Carnegie Mellon University study: 7 kWh / GB. The American Council for an Energy-Efficient Economy concluded: 3.1 kWh / GB.
    [11] https://www.iea.org/reports/global-energy-co2-status-report-2019/emissions
    [12] https://www.idc.com/getdoc.jsp?containerId=prUS45213219

    Time Series & Objects: Using Data on the Edge

    Many IoT projects collect both time series data and other types of data. Typically, this means they run two databases: a time-series database and a traditional database or key/value store. This creates fragmentation and overhead, which is why ObjectBox TS brings together the best of both worlds in one database (DB). ObjectBox TS is a hybrid database: an extremely fast object-oriented DB plus a time-series extension, specially optimized for time series data. In combination with its tiny footprint, ObjectBox is a perfect match for IoT applications running on the edge. The out-of-the-box synchronization takes care of synchronizing selected data sets efficiently, and it works offline and online, on-premise and in the cloud.

    What is time series data?

    There are a lot of different types of data that are used in IoT applications. Time-series is one of the most common data types in analytics, high-frequency inspections, and maintenance applications for IIoT / Industry 4.0 and smart mobility. Time series tracks data points over time, most often taken at equally spaced intervals. Typical data sources are sensor data, events, clicks, temperature – anything that changes over time.

    Why use time series data on the edge?

    Time-series data sets are usually collected from many sensors that sample at a high rate, so a lot of data accumulates quickly.

    For example, if a Raspberry Pi gateway collects 20 data points/second, typically that would mean 1200 entries a minute measuring e.g. 32 degrees. As temperatures rarely change significantly in short time frames, does all of this data need to go to the cloud? Unless you need to know the exact temperature in a central location every millisecond, the answer is no. Sending all data to the cloud is a waste of resources, causing high cloud costs without providing immediate, real-time insights.
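
    One common way to avoid sending every sample is a deadband (report-by-exception) filter, which forwards a reading only when it differs noticeably from the last forwarded value. The Python sketch below illustrates the idea on one minute of simulated 20 Hz temperature data. The function name, the 0.5 degree threshold, and the sample data are assumptions made for illustration and are not part of the ObjectBox API.

```python
def deadband_filter(readings, threshold=0.5):
    """Yield only readings that differ from the last forwarded value
    by at least `threshold`; the first reading is always forwarded."""
    last_sent = None
    for timestamp, value in readings:
        if last_sent is None or abs(value - last_sent) >= threshold:
            last_sent = value
            yield timestamp, value


# Simulated minute of data: 20 readings per second (1200 in total).
# The temperature stays at 32.0 degrees and steps to 33.0 after 45 seconds.
readings = [(i / 20, 32.0 if i < 900 else 33.0) for i in range(1200)]

to_cloud = list(deadband_filter(readings))

print(len(readings), "readings collected")   # 1200 readings collected
print(len(to_cloud), "readings forwarded")   # 2 readings forwarded
print(to_cloud)                              # [(0.0, 32.0), (45.0, 33.0)]
```

    All 1200 readings remain available on the device for local analysis, while only the two changes would need to leave it. Real sensor data is noisier than this simulation, so the threshold has to be chosen per use case.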

    The Best of Both Worlds: time series + object oriented data persistence

    With ObjectBox you aren’t limited to only using time series data. ObjectBox TS is optimized for time series data, but ObjectBox is a robust object oriented database solution that can store any data type. With ObjectBox, model your world in objects and combine this with the power of time-series data to identify patterns in your data, on the device, in real time. By combining time series data with more complex data types, ObjectBox empowers new use cases on the edge based on a fast and easy all-in-one data persistence solution. 

    Bring together different data streams for a fusion of data; mix and match sensor data with the ObjectBox time series dashboard and find patterns in your data. On top of that, ObjectBox takes care of efficiently synchronizing selected data between devices (cloud / on-premise) for you.

    Get a complete picture of your data in one place

    Use Case: Automotive (Process Optimization)

    Most manufacturers, whether they produce cars, food, or utilities, have been optimizing production for a long time. However, there are still many cases where costly manual processes prevail. One such example is automotive varnish. In some cases, while the inspection is automatic and intelligent, a lot of cars need to be touched up by hand because the factors leading to the paint defects have not yet been discovered. While a lot of internal expert know-how is available from the factory workers, their gut feeling is typically not enough to adapt production processes.

    How can this be improved using time series and object data? 

    The cars (objects) are typically already persisted, including all the mass customization and model information. If all data from the manufacturing site, including sensor data such as temperature, humidity, and spray speed (all time-series data), is now persisted and added to each car object, correlations between production site variables, individual car properties, and varnish quality can be detected. Over time, patterns will emerge. The gut feeling of the factory workers provides a great starting point for analyzing the data to discover quick wins before long-term patterns can be detected. Over time, AI and automated learning can kick in to optimize the factory setup and reduce the need for paint touch-ups as much as possible.
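
    As a rough illustration of this idea, the following Python sketch attaches humidity readings to car objects by paint-booth time window and compares the average humidity for cars that needed a touch-up with those that did not. The class names, fields, and sample values are hypothetical, and plain in-memory Python objects stand in for a database; this is not ObjectBox code.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SensorReading:          # one time-series data point (hypothetical schema)
    timestamp: float          # seconds since start of shift
    humidity: float           # relative humidity in percent


@dataclass
class Car:                    # the "object" side of the data (hypothetical schema)
    vin: str
    paint_start: float
    paint_end: float
    needs_touch_up: bool
    readings: list = field(default_factory=list)


def attach_readings(cars, readings):
    """Link each car to the sensor readings taken while it was being painted."""
    for car in cars:
        car.readings = [r for r in readings
                        if car.paint_start <= r.timestamp <= car.paint_end]


def mean_humidity_by_outcome(cars):
    """Average per-car humidity, grouped by whether a touch-up was needed."""
    groups = {True: [], False: []}
    for car in cars:
        if car.readings:
            groups[car.needs_touch_up].append(mean(r.humidity for r in car.readings))
    return {outcome: mean(values) for outcome, values in groups.items() if values}


# Made-up sample data: humidity rises from 40% to 60% halfway through.
readings = [SensorReading(float(t), 40.0 if t < 3 else 60.0) for t in range(6)]
cars = [
    Car("CAR-A", paint_start=0.0, paint_end=2.0, needs_touch_up=False),
    Car("CAR-B", paint_start=3.0, paint_end=5.0, needs_touch_up=True),
]

attach_readings(cars, readings)
result = mean_humidity_by_outcome(cars)
print(result)   # {True: 60.0, False: 40.0}
```

    With only two cars this proves nothing, of course; the point is that once sensor readings are linked to the objects they describe, such comparisons become simple queries over local data.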

    Use Case: Smart Grids

    Utility grid loads shift continually throughout the day, affecting grid efficiency, pricing, and energy delivery. Using Smart Grids, utility companies can increase efficiency and reliability in real time. In order to get insights from Smart Grids, companies need to collect a large volume of data from existing systems. A huge portion of this data is time series, e.g. usage and load statistics. On top of that, they incorporate other forms of data, e.g. asset relationship data, weather conditions, and customer profiles. Using visualization and analytical tools, these data types can be brought together to generate business insights and actionable operative goals.

    ObjectBox TS: time series with objects

    By storing and processing both time series data and objects on the edge, developers can gather complex data sets and get real-time insights, even when offline. Combining these data types gives a fuller understanding and context for data: not only what happens over time, but also what other factors could be influencing results. Using a fast hybrid edge database allows developers to save resources while maintaining speed and efficiency. By synchronizing useful data to the cloud, real-time data can be used for both immediate action and post-event analysis.

    Get in touch with our team to get a virtual demo of ObjectBox TS, or check out the sample GitHub repo to see more about the code.