Vector Databases for Edge AI

Vector Databases for Edge AI

The intersection of AI and Edge Computing is where Edge AI happens – and it needs databases that support AI and can run on the edge (for lack of a better term “Edge Vector Databases”). Vector Databases are the databases for AI and are an important piece of the AI tech stack. Edge Databases are databases that can run on edge devices.

Edge Vector Databases are the basis for Edge AI

 Edge Vector Databases – the intersection of Edge Computing and AI needs a database

In 2023, the Edge Computing Market is estimated to be at $53B,[1] while the AI market is expected to reach a whopping $87B.[2] Both markets are expected to grow dramatically in the coming years. The two technologies enhance each other. For many use cases it is advantageous, and oftentimes necessary, to use both Edge Computing and AI in conjunction. This is what is called Edge AI and Gartner prognosed its plateau within 2023-2024.[3] In fact, recently, Gartner named Edge AI as one of the breakthrough technologies of 2023 due to the growing demand for real-time AI solutions and the need for decentralized data processing.[4] The global Edge AI market size was valued at roughly $14.5B in 2022 with expected CAGRs of 20-30% from 2023 to 2030.[5] In this article we want to take a closer look at the use of vector databases in Edge AI.

The AI market: AI model trainings vs. using trained AI models

To understand the AI market, it is important to distinguish between the AI model generation, initial training, and the use of these models.

AI model training

AI model training is the most resource-intensive and costly part of AI – and is considered hard.[6] The larger the model, the higher the costs and the more time it needs. Also with models getting larger, the training fails more often and must be restarted, adding to duration and costs. Initially, AI models were specifically trained for one task only. Now, however, a new type of AI models, namely, foundation models, have evolved. They are general-purpose models. This development was a main driver of the current AI boom we are seeing.

Foundation models

Foundation models (also base models) are trained on a vast quantity of data at scale in such a way that they can be used for a wide variety of tasks. Costs for training such large foundation AI models from scratch (like Chat-GPT4) are typically in the millions of USD and the expectation is that large model training costs will go up to 500 million USD by 2030.[7] They can be “fine-tuned” (trained) with specific data sets (on top of the already trained model) to adapt them to specific environments,[8] e.g. GitHub Copilot is based on GPT-3 and was tweaked as well as additionally trained on the code from GitHub repos.[9] Fine-tuning is typically way less costly than training a specific one-task model from scratch (let alone the initial foundation model training). Examples of foundation models include the GPT-models from Open AI, LLaMa, and Google’s BERT. A foundation model is basically a specific subset of LLMs (Large Language Models).

Using trained AI models

Using trained foundation models (like Chat-GPT3 or LLaMA), additionally tuning them with specific data sets, is what is empowering most current AI tools / apps and all the innovations we are seeing around that. Basically, at this moment, a handful of popular foundation models are empowering a whole and currently thriving ecosystem. As AI models use vector embeddings, using trained models can be supported and enhanced with a vector database.[10] We’ll dive into this a bit more below.

📈 Side note on the market

From a market perspective, this means there is a relevant market entry barrier to training foundation models, and therefore it is reasonable to expect only few companies doing it (as opposed to many startups ;)). Yet, training new models is likely where the greatest innovation potential and most significant advancements lie. Given that the current AI market largely depends on large trained models (mainly foundation models) being deployed as free (often open source) models, there is a real risk of few tech giants owning the market later on, whereas the market that was built depending on those models starts stalling.[11] For example, openAI didn’t open source Chat-GPT4 anymore, because they felt it could impede their business interests; Sam Altman even went as far as saying it was wrong to ever open it – likely they should change their name.[12]

At the same time, Sam Altman went lobbying around the world to regulate AI (model trainings) and enhance entry barriers. The most reasonable explanation I read is that they are trying to ensure corporate dominance, and likewise, the reason behind the movement to pause AI model training from other actors in the space is a move to avoid a monopoly by OpenAI and get meat in the game quickly.[13]

Vector Databases and their important role in AI

Many machine learning and deep learning algorithms, as well as the AI models described above, depend on vector embeddings / embeddings. The increasing demand for creating, storing, and managing these embeddings has led to the emergence of vector databases

What are vector embeddings?

Vector embeddings represent (multimodal) data as n-dimensional vectors, meaning n-dimensional matrices, which comes down to a set of numbers. This is what makes them easy and efficient to compute with. The power of vector embeddings lies in their ability to capture the essence of the data they represent.

Vector embeddings basically are “the output of the process of learning embeddings”, which is done by feeding raw input data, like texts, images, words, into a trained AI model.[14]

In this process, the input data is translated into a lower-dimensional space. The result is a set of n-dimensional vectors in an embedding space.

The process of embedding

 The process of embedding [15]

The embedding space is specific to the data on which the embeddings were trained, but it can be transferred to other tasks and domains via transfer learning. Two embeddings that are close together in the embedding space indicate that the data they represent is similar.[16]

Once generated, the vector embeddings can be stored for efficient retrieval and use by AI / ML apps.[17] We visualized the process below.  

Vector Database initial preparation

Vector Database initial preparation [26]

How are embeddings generated?

Creating vector embeddings used to be a time-consuming process requiring domain experts and manual work. Today, however, there are many specialized models available for generating embeddings:

  • For text data, for example: Word2Vec, GloVe, and BERT. These models translate the semantic meaning of words and phrases into numerical form.
  • For image data, for example: VGG and Inception. These models capture visual characteristics of images and translate them into numerical form.

–> The Huggingface Model Hub offers many models that can create embeddings for different types of data. Anyone can do it with very little to no coding know-how.

But… how are embeddings generated?

There is no easy answer to this, at least I didn’t find one. If you really want to know, you will need to spend the time and effort to dig very deep. Today’s large language models were built over decades by many brilliant minds. They entail many fundamental concepts of converting different types of data (like words or images)  into numerical representations. The following three fundamental concepts seem to be part of most LLMs and are worth having heard of:[18]

  • Encoding – Non-numerical, multimodal data needs to be transferred into numbers, so models can be created out of them. 
  • Vectors – In order to store the encoded data and efficiently perform mathematical functions on them, encodings are stored as vectors (typically floating-point arrays).
  • Lookup matrices – Also known as lookup or hash tables; this table maps data  to quickly jump from numerical to word representations (and back) across large chunks of text.

From the database perspective, the primary interest is in working efficiently with the generated embeddings [19]. Once a piece of information (a sentence, a document, an image) is a vector embedding and stored in a database, it’s time to get creative.

Where are vector embeddings used? 

One of the most common vector embedding applications is in recommendation systems and search engines, e.g., Google Search uses embeddings to match text to text and text to images; Snapchat uses them to “serve individual ads depending on the user and time; and Meta (Facebook) uses them for social search.[20] But use cases are endless, e.g. they can also be used in chatbots, fraud detection, predictive maintenance, and autonomous driving. The ability to convert complex data into numerical form opens up endless applications. Generative AI applications typically work with vector embeddings. All of these applications benefit from using a vector database to enhance speed and efficiency. 

How are vector databases used? 

While a vector database can sometimes return responses directly, it typically returns results in conjunction with an LLM as we have depicted below. The vector database improves the accuracy of the LLM responses by using domain specific data from the database. More specifically: The vector search will give you the most relevant data for a specific query to provide to the LLM.[21] In both cases, the vector database helps reduce the number of queries to the LLM, which are costly, and also speeds up response times.

The use of vector databases: In use / production []

Vector databases: In use / production [26]

Accordingly, apart from efficiently handling vectors (as a data type), nearest neighbour search (primarily Approximate Nearest Neighbour (ANN) Search) is the most important feature of a vector database. The ANN algorithm finds the most similar vectors quickly. Additional filtering optimizes the result further, making queries even more efficient. Using the retrieved vectors (for context) and the initial query, an LLM generates the response. Typically, the response will be stored as a response embedding in the database, so over time, the database can serve more questions directly and / or improve the accuracy of the answer even more. This is depicted in the image above.

Edge vector databases – giving AI an edge

Edge Computing in a gist

Since data is produced and used everywhere (decentralized), using cloud computing for storage and processing is inefficient, wasteful, and often impossible. To unlock the value of decentralized data and drive digitization, you need to compute on the edge of the network (i.e. locally, closer to where data is generated). Gartner emphasizes Edge Computing’s importance for digital transformation.[22] To fully utilize Edge Computing, we need edge-specific infrastructure technologies, or “to make the edge as easy as the cloud for developers.” Edge databases are one such core infrastructure software. They enable rapid implementation of edge solutions by providing fast local data persistence (on the edge) and the capability to control and direct decentralized data flows (on the edge as well as in conjunction with a cloud). 

Edge computing and edge databases can unlock decentralized data’s full potential, drive digital transformation, and create a sustainable and efficient data management landscape.

What is Edge AI?

Edge AI is the implementation of AI applications on the edge of the network without using a cloud, meaning, the necessary AI computations are performed on the edge directly, where the data is produced, e.g. in the car (onboard AI) or on a mobile device or simply within a specific location like a shop floor. A local Edge AI enables making decisions reliably in milliseconds, also when offline, and way cheaper. At this moment, this is particularly interesting for mission-critical use cases, offline-scenarios, and applications with high data security / privacy requirements. To run AI models directly on the edge, they need to be optimized for edge devices. The good news is, there are several AI models optimized for small devices available, e.g. Google’s Gecko, which is open source.[23] “Gecko is so lightweight that it can work on mobile devices and is fast enough for great interactive applications on-device, even when offline.”[24] Edge AI applications benefit tremendously from using a local vector database; only few use cases could do without one.

Edge AI basically offers the advantages of Edge Computing, whereas the disadvantages are more specific to AI.

Advantages of Edge AI (vs. cloud) [25]
Disadvantages of Edge AI (vs. cloud) [25]
Edge AI is faster, can guarantee QoS requirements, and works offline Decentralized data access can be challenging and needs specific skills
Edge AI saves Internet bandwidth, cloud and networking costs (e.g. MNO costs) The initial setup for the ongoing training of decentralized Edge AIs is more complex and therefore costly (while the cloud setup is quick and easy)
Edge AI helps ensure data privacy, data security, data ownership Edge AI needs specific skills (entailing “oldschool programming skills”) – dev talent is hard to attract and expensive to keep 
Edge AI is more sustainable to run (less wasteful data traversal meaning less energy use, less costs for the energy and less CO2 emissions) The heterogeneity of the edge makes it difficult to develop solutions for a wide range of devices
Edge AI is a young market and therefore holds opportunities to capture market share and competitive advantages Edge AI is a young market and still lacks infrastructure software

Edge AI setup / architecture

There are generally two setups for Edge AI applications: Full edge setup, running the AI model on the edge devices directly, or a hybrid approach, using the cloud or a central server for the AI model.

Edge AI - setup / architecture options

Edge AI: general setup / architecture options

The edge / cloud approach has the advantage that the AI model (re-)training, enhancement will happen centrally, on the cloud without additional efforts. On the other hand, in the full edge setup, you have all the advantages of the edge (offline, cost-effective, private, …) and you can use the power of all edge devices, making it even more affordable and fast. However, the caveat lies in the challenge of organizing the learning. The individual local models diverge and you need to get these learnings distributed to all devices. This will be done in a “global AI model” (centrally).

Depending on the details, this can be done locally on a central server or in the cloud, or even on an edge device. Also, depending on the number of edge devices, connectivity, and need for speedy updates, the distribution can be organized in a more decentralized way, fully using the power of the edge. Once all devices are harmonized, any edge device could, if it has the capabilities from a hardware perspective, be the global model combining all updates and sending them out. This offers great advantages with regards to availability and resilience. You can find more about decentralized Edge AI setups under the term federated learning.

Edge AI - decentralized setup, using the full power of the edge

Edge AI – decentralized setup, using the full power of the edge

Summary: Vector Databases for the edge

According to our research and experience, no “Edge Vector Database” exists yet. The Edge Database market has always been limited to fewer players – and is certainly not as crowded as the central server / cloud database space. However, the cloud / server databases cannot be used on the edge (big to small just doesn’t work ;)), whereas Edge Databases can run anywhere and can sometimes be a good choice for a server / cloud setup.

Opinions differ on whether there is a need for specific, dedicated vector databases, or whether general databases will evolve to include vector support and become the go-to-solution. In any case, the vector database space is hot and has recently been added as a category on db-engines. Adding a new database category is a rare occurrence for db-engines, the established platform for databases. In any case, we can see that both types of databases are converging towards each other and we firmly believe that in the future, all databases will support vectors.

So, when it comes to Edge Databases, there are the same two options: Either someone will implement a dedicated edge vector database or Edge Databases will evolve to support vectors. Because so far we have seen neither, we have extended the ObjectBox Edge Database with vector support. And the huge advantage this brings is that ObjectBox already offers an excellent, highly efficient, and battle-tested out-of-the-box Sync that takes care of the “hard stuff” of decentralized data management for developers. As a developer tool that is self-hosted and can be used on all kinds of edge devices, and certainly on premise, it offers companies the flexibility to implement a myriad of applications, reduce costs (especially cloud and networking costs), while not jeopardizing data ownership in any way.

References and Notes

  10. While this is great for the current landscape of AI-driven tools and apps, and thus the consumers, these are incremental innovations and will not take the foundation of AI forwards. So, there is a rightful fear that big corporations will discontinue open sourcing advancements, once they feel protecting their business interests outweighs the benefits they have from open sourcing.
  14. An AI training model is the initial version that goes through the training process, while the AI model refers to the trained and optimized version that is ready for deployment and inference in real-world applications.
  15. Adapted from
  16. If you are interested in understanding distances in vector databases more, read this post; it explains the most important commonly used distance algorithms in a straightforward and highly understandable way.
  17. Note: Traditional databases can offer vector support too. Our guess is that there will be a consolidation of databases, with traditional databases moving towards vector support and vector databases moving towards supporting other (more traditional) database functionalities. In fact, we cannot imagine a future where any database will not support AI applications (as in: Support vectors, nearest neighbour search etc.).
  18. Based on:
  19. If you are interested in diving deeper, read this article, which is written in a highly understandable and comprehensive way, explaining everything you would need to know to develop a basic understanding.
  23. Though the merit of it being open source is unclear an ongoing discussion in the more legal oriented open source community.
  26. Adapted from

SQLite and SQLite alternatives – a comprehensive overview

SQLite and SQLite alternatives – a comprehensive overview

SQLite and SQLite alternatives - databases for the Mobile and IoT edge

Overview of SQLite and SQLite alternatives as part of the mobile / edge database market with a comprehensive comparison matrix (updated summer 2023)

Digitalization is still on the rise, as is the number of connected devices (from 13 billion connected IoT devices + 15 billion mobile devices operating in 2021 already). Data volumes are growing accordingly ( 3.5 quintillion bytes of data is produced daily in 2023), and centralised (typically cloud-based) computing can no longer support the current needs. This has led to new challenges in computing and subsequently to a shift from the cloud to the edge.

Therefore, there is a new need for on-device (local) databases like SQLite and SQLite alternatives to persist data on the edge. On top, due to the growing number of edge / connected devices, there is a need to manage data flows to / from and between edge devices. This can be done with Edge Databases (SQLite and SQLite alternatives) that provide a Data Sync functionality. In the following we’ll have a look at these.

Databases for the Edge

 While being quite an established market with many players, the database market is still growing consistently and significantly. The reason is that databases are at the core of almost any digital solution, and directly impact business value and therefore never going out of fashion. With the rapid evolvements in the tech industry, however, databases need to evolve too. This, in turn, yields new database types and categories. We have seen the rise of NoSQL databases in the last 20 years, and more recently some novel database technologies, like graph databases and time-series databases, and only this year: vector databases. Both, the shift back from a centralised towards a decentralised paradigm, and the growing number of restricted devices call for an “old / new” type of database. SQLite has been around for more than 20 years and for good reason, but the current market shift back to decentralized computing happens in a new environment with new requirements. Hence, the need for a “new” database type, based on a well-established database type: “Edge databases”. Accordingly, a need for SQLite alternatives (depending on the use case of course; after all SQLite is a great database).

What is an Edge Database?

Edge databases are a type of databases that are optimised for local data storage on restricted devices, like embedded devices, Mobile, and IoT. Because they run on-device, they need to be especially resource-efficient (e.g. with regards to battery use, CPU consumption, memory, and footprint). The term “edge database” is becoming more widely-used every year, especially in the IoT industry. In IoT, the difference between cloud-based databases and ones that run locally (and therefore support Edge Computing) is crucial.

What is a Mobile Database?

We look at mobile databases as a subset of edge databases that run on mobile devices. The difference between the two terms lies mainly in the supported operating systems / types of devices. Unless Android and iOS are supported, an edge database is not really suited for the mobile device / smartphone market. In this article, we will use the term “mobile database” only as “database that runs locally on a mobile (edge) device and stores data on the device”. Therefore, we also refer to it as an “on-device” database.

What are the advantages and disadvantages of working with SQLite?

SQLite is a relational database that is clearly the most established database suitable to run on edge devices. Moreover, it is probably the only “established” mobile database. It was designed in 2000 by Richard Hipp and has been embedded with iOS and Android since the beginning. Now let’s have a quick look at its main advantages and disadvantages:

Advantages  Disadvantages
  • 20+ years old (should be stable ;))
  • Toolchain, e.g. DB browser
  • No dependencies, is included with Android and iOS
  • Developers can define exactly the data schema they want
  • Full control, e.g. handwritten SQL queries
  • SQL is a powerful and established query language, and SQLite supports most of it
  • Debuggable data: developers can grab the database file and analyse it
  • 20+ years old ( less state-of-the-art tech)
  • Using SQLite means a lot of boilerplate code and thus inefficiencies ( maintaining long running apps can be quite painful)
  • No compile time checks (e.g. SQL queries)
  • SQL is another language to master, and can impact your app’s efficiency / performance significantly…
  • The performance of SQLite is unreliable
  • SQL queries can get long and complicated
  • Testability (how to mock a database?)
  • Especially when database views are involved, maintainability may suffer with SQLite


What are the SQLite alternatives?

There are a bunch of options for making your life easier, if you want to use SQLite. You can use an object abstraction on top of it, an object-Relational-Mapper (ORM), for instance greenDAO, to avoid writing lots of SQL. However, you will typically still need to learn SQL and SQLite at some point. So what you really want is a full blown database alternative, like any of these: Couchbase Lite, Interbase, LevelDB, ObjectBox, Oracle Berkeley DB, Mongo Realm, SnappyDB, SQL Anywhere, or UnQLite.

While SQLite really is designed for small devices, people do run it on the server / cloud too. However, for server / cloud databases, there are a lot of alternatives you can use as a replacement like e.g. MySQL, MongoDB, or Cloud Firestore.

Bear in mind that, if you are looking to host your database centrally with apps running on small distributed devices (e.g. mobile apps, IoT apps, any apps on embedded devices etc.), there are some difficulties. Firstly, this will result in higher latency, i.e. slow response-rates. Secondly, the offline capabilities will be highly limited or absent. As a result, you might have to deal with increased networking costs, which is not only reflected in dollars, but also CO2 emissions. On top, it means all the data from all the different app users is stored in one central place. This means that any kind of data breach will affect all your and your users’ data. Most importantly, you will likely be giving your cloud / database provider rights to that data. (Consider reading the general terms diligently). If you care about privacy and data ownership, you might therefore want to consider a local database option, as in an Edge Database. This way you can decide, possibly limit, what data you sync to a central instance (like the cloud).

SQLite alternatives Comparison Matrix

To give you an overview, we have compiled a comparison table including SQLite and SQLite alternatives. In this matrix we look at databases that we believe are apt to run on edge devices. Our rule of thumb is the databases’ ability to run on Raspberry Pi type size devices. If you’re on mobile, click here to view the full matrix.

Edge Database Short description License / business model Android / iOS* Type of data stored Central Data Sync P2P Data Sync Offline Sync (Edge) Data level encryption Flutter support Vector (AI) support  Minimum Footprint size Company
SQLite C programming library; probably still 90% market share in the small devices space (personal assumption) Public domain embedded on iOS and Android Relational No No No No, but option to use SQLCipher to encrypt SQLite Flutter plugins (ORMs) for SQLite, but nothing from Hwaci No, but various unofficial extensions are available < 1 MB Hwaci
Couchbase Mobile Embedded / portable database with P2P and central synchronization (sync) support; pricing upon request; some restrictions apply for the free version. Secure SSL. Partly proprietary, partly open-source, Couchbase Lite is BSL 1.1 Android / iOS JSON Documents / NoSQL db Yes Yes No Database encryption with SQLCipher (256-bit AES) Unofficial Flutter plugin for Couchbase Lite Community Edition No < 3,5 MB Couchbase
InterBase ToGo / IBLite Embeddable SQL database. Proprietary Android / iOS Relational No No No 256 bit AES strength encryption No No < 1 MB Embarcadero
LevelDB Portable lightweight key-value store, NoSQL, no index support; benchmarks from 2011 have been removed unfortunately New BSD Android / iOS Key-value pairs / NoSQL db No No No No Unofficial client that is very badly rated No < 1 MB LevelDB Team
LiteDB A .Net embedded NoSQL database MIT license Android / iOS (with Xamarin only) NoSQL document store, fully wirtten in .Net No No No Salted AES No No < 1 MB LiteDB team
Mongo Realm / Realm DB (acquired by Mongo in 2019) Embedded object database Mongo Realm is Apache 2.0, Mongo DB is SSPL, Mongo Atlas (?) Android / iOS Object Database Yes (Mongo Atlas), tied to using Mongo DB No No Yes Officially released a Flutter binding in spring 2023 (See Flutter databases) No 5 MB+ MongoDB Inc.
ObjectBox High-performance NoSQL Edge Database with out-of-the-box Data Sync for Mobile and IoT; fully ACID compliant; benchmarks available. Apache 2.0 and Proprietary Android / iOS / Linux / Windows / any POSIX Object-oriented NoSQL edge database for high-performance on edge devices in Mobile and IoT Yes WIP Yes transport encryption; additional encryption upon request Yes Yes, vector support already available, more AI support coming < 1 MB ObjectBox
Oracle Database Lite Portable with P2P and central sync support as well as support for sync with SQLite Proprietary Android / iOS Relational Yes Yes No 128-bit AES Standard encrytion No No < 1 MB Oracle Corporation
SQL Anywhere Embedded / portable database with central snyc support with a stationary database, pricing now available here Proprietary Android / iOS Relational Yes, tied to using other SAP tech though (we believe) No No AES-FIPS cipher encryption for full database or selected tables No No   SAP (originally Sybase)
UnQLite Portable lightweight embedded db; self-contained C library without dependency. 2-Clause BSD Android / iOS Key-value pairs / JSON store / NoSQL db No No No 128-bit or 256-bit AES standard encryption not yet; might be coming though; there was a 0.0.1 released some time ago No ~ 1.5 MB Symisc systems
extremeDB Embedded relational database Proprietary iOS In-memory relational DB, hybrid persistence No No No AES encryption No Yes, vector support available, nothing more yet < 1 MB McObject LLC
redis DB High-performance in-memory Key Value store with optional durability Three clause BSD license, RSAL and Proprietary No K/V in-memory store, typically used as cache No No No TLS/SSL-based encryption can be enabled for data in motion. Unofficial redis Dart client available Yes An empty instance uses ~ 3MB of memory redislabs (the original author of redis left in 2020)
Azure SQL Edge  Designed as a SQL database for the IoT edge; however, due to the footprint it is no Edge Database Proprietary No Relational DB for IoT No No No will provide encryption No   500 MB+ Microsoft

Star this matrix on GitHub. If you are interested in an indication of the diffusion rate of databases, check out the following database popularity ranking: If you are interested to learn more about SQLite, there is a great Podcast interview with Richard Hipp that is worthwhile listening to.

Is there anything we’ve missed? What do you agree and disagree with? Please share your thoughts with us via Twitter or email us on contact[at] 

Make sure to check out the ObjectBox Database & try out ObjectBox Sync. You can get started in minutes and it’s perfect if you are using an object-oriented programming language, as it empowers you to work with your objects within the database. More than 1,000,000 developers already use this Edge Database designed specifically for high performance on small, connected, embedded devices.

Vector types (aka arrays) added with ObjectBox Java 3.6 release

Vector types (aka arrays) added with ObjectBox Java 3.6 release

Vector embeddings (multi-dimensional vectors) are a central building block for AI applications. And accordingly, the ability to store vectors to add long-term memory to your AI applications (e.g. via vector databases) is gaining importance. Sounds fancy, but for the basic use cases, this simply boils down to “arrays of floats” for developers. And this is exactly what ObjectBox database now supports natively. If you want to use vectors on the edge, e.g. in a mobile app or on an embedded device, when offline, independent from an Internet connection, removing the unknown latency, try it…

See the release notes for all new features this release brings.

Code Examples

Let’s start with a simple example: let’s assume some shapes that use a palette of RGB colors. An entity for this might look like this:

We can now create a query to find all shapes that use a certain color:

Another typical use case is the embedding of certain types of data, like text, audio or images, as vector coordinates. To store such a vector embedding, in the following example we store the floating point coordinates that were computed by a machine learning model for an image together with a reference to the actual image:

Ready to go?

To update to this release, change the version of objectbox-gradle-plugin to 3.6.0.

To add ObjectBox database to your JVM or Android project read our Getting Started guide.
As always, we look forward to your feedback on GitHub or via our anonymous feedback form and hope you have a great time building apps with ObjectBox! ❤️

Green Coding: Developing Sustainable Software for a Greener Future

Green Coding: Developing Sustainable Software for a Greener Future

Digitization helps to save CO₂ – many experts agree on that. But things are not that simple, because the creation of software and its use contribute to greenhouse gas emissions too. All code creates a carbon footprint. Software development and use affect the environment from the energy consumed while running to the associated electronic device waste. Choosing a sustainable software architecture matters, but every developer also can make a difference by applying green coding principles. 

This article will explore the importance of green software development and its main principles.

Green Software Development: Balancing Digitization and Environmental Sustainability

In this section, we’ll first define some important terms in the topic of environmentally conscious software development. Then, we’ll discuss why it is relevant and discussing the broader benefits of adopting green coding practices.

What does sustainability in software development mean?

In our view, sustainability in software development (also “green software development”) entails developing and maintaining software in a way that is not only environmentally, but also socially and economically responsible. So, what really counts is the long-term bottom-line value from a general societal perspective, not an “individual balance sheet”.

There are many trade-offs in such an ambition, and therefore sustainable software development is rather a set of guiding principles than hands-on measures that are truly the same for everyone. Let’s dive a bit into how sustainable software development can contribute to all three aspects:

Environmental aspects

Since software is a significant source of direct greenhouse gas emissions, it is becoming more important to create software that reduces resource use as much as possible. As the world becomes more reliant on technology, energy consumption and carbon footprint of software will continue to grow. By adopting green software development practices, software developers can help to mitigate these environmental impacts.


Broader Economic contribution

If a software uses less energy and resources to accomplish the same tasks as another software, the users of that software can reduce their operating costs and improve their bottom line. Increasing the longevity of hardware (less wear, but also less hw requirements extending the usability of existing hw) also yields direct economic savings for the software users (companies as well as individuals). On a broader level, this compounds significantly over the number of users and with time and thus contributes to economic welfare. What sounds like a small contribution does add up tremendously in the end…

Social impact

Sustainable software development includes responsibility for the social impact of the software created. As a result, sustainable software aims to be transparent, inclusive, and offer data sovereignty. By giving individuals and organizations greater control over their own data, software empowers them and protects their privacy. At the same time, it promotes greater accountability and transparency in data-driven decision-making.

Overall, sustainability in software development involves taking a holistic approach. On top, sustainable software companies take steps to minimize negative impacts and promote positive ones over the long term.

This is why it has been one of our core values since we started ObjectBox:

Be Sustainable in every respect – we apply sutainability to our technology, as well as the people and small every-day decisions. ObjectBox aims to be the most resourceful data management solution for connected devices. We strive to save resources (energy, CO₂, bandwidth, time, etc.), but also always choose the sustainable path (recycled paper, saving energy, etc.), and support our employees to lead balanced and sustainable lives.

What is green coding / green software development?

Recently, the term “green coding” has emerged to describe the practice of creating and writing code (aka software) in a way that minimizes its environmental impact. This can involve using efficient code that consumes less energy, optimizing data usage, and reducing electronic waste.

What is the difference between Green IT and Green Coding?

Green IT is primarily about the hardware and the optimization of data centers. Today, it often actually is about optimizing cloud usage. The code decides whether this hardware is used efficiently. By contrast, green coding is about making the code more efficient, so that running the code (e.g. using an app on the smartphone, or using an email program) uses less resources and less electricity, thus producing less CO₂. 

Why is it time for developers to prioritize environmental sustainability?

Various studies estimate the Carbon footprint of the digital economy to be between 2.3 – 3.7% percent of global CO₂ emissions 😱 [1]. Although the impact of software on the environment may not yet be as dramatic as that of manufacturing, it keeps growing rapidly each year. By taking sustainable decisions in software development, we can make it part of the carbon solution of the future. 

Every line of code – scaled up to hundreds, thousands, or even millions of devices (desktops, smartphones, tablets…) worldwide – has the potential to significantly reduce energy consumption and CO₂ emissions.

How to put sustainable software development into practice?

We believe two key aspect to develop sustainable software, that creates bottom-line value, are:

  • minimize the resource consumption of software especially during operation, where most resources are consumed – be dilligent about that; it compounds
  • keep data as much as possible where it is produced, used and belongs (e.g. with the end users) and avoid unnecessary data transferals, superfluous cloud use, and unnecessarily storing data in the cloud

Both measures have significant environmental, social, and economic impact, short- and long-term.

It’s time we as developers start thinking about our impact on the planet and make sustainability a part of our everyday coding mindset. We can make a difference by incorporating sustainability into every action and decision we take when developing software. Careful measuring and optimizing the resource along the way is also important. The welcome side effect: fast software that is cheap to run and fun to use 🙂

For example, at ObjectBox, we’re all about maximizing the use of computing resources and minimizing resource waste of every line of code (LOC). This makes ObjectBox not only environmentally sustainable, but at the same time superfast, usable on low end devices w. little hw requirements, and cheap in operational costs 🤯

💚 Responsible development practices pay off in several respects and we really cannot see a huge tradeoff. All it costs is spending more time and brain on optimizations, benchmarking, and dilligently applying this approach to every line of code.

💚 As a developer tool, our impact is broader than a developer’s impact on end-users. So, we’re committed to using resources efficiently and reducing waste at every stage of the game.

Guidelines to start making your code more sustainable

Some more tipps how to put sustainable software development into practice:

  • Energy efficiency: Developing software that is energy-efficient can help to reduce its environmental impact by minimizing the amount of energy required to run software. 
  • Responsible sourcing: Using responsibly sourced hardware, software, and other materials can help to reduce the environmental impact of software development.
  • Longevity: Developing software that is designed to last can help to reduce waste and promote sustainability by reducing the need for frequent updates and replacements.
  • Accessibility: Making software accessible to a wide range of users can help to promote social sustainability by ensuring that everyone has access to the benefits of technology.
  • Data sovereignty, privacy and security: Protecting user data and maintaining strong cybersecurity measures can help to promote sustainability by preventing data breaches and other security incidents that can have negative social and economic impacts.

Examples of sustainable coding: More impactful than you would expect

1. How can a millisecond be worth 2 days?

Real world example: By reducing the resolution of images in a banking app with 500.000 users, whose users on average opened it daily, developers saved more than 2 days of total operational time (up time) [2].

 2. How can 2 grams of CO₂ savings / hour be worth 330.000 t CO2?

Theoretical consideration: Netflix states that streaming its content produces 55 grams of CO₂ per hour [3]. This gives us 40 kilograms of CO₂ per year for daily streaming of two hours per person [4]. With Netflix users being 230M, a reduction would have an enormous scaling factor [5]. Assuming a Netflix developer reduces the 55 grams to 53 grams, you get 330 kt of CO₂ in potential savings. Note: This is a highly theoretical example, just to demonstrate the thinking.
Anyways: Individuals can’t save that much as easily. That’s the impact you as a programmer have!

3. How much CO₂ can local storage save in 1 million cars?

Sending and storing 1 GB of data in the cloud needs about 5 kWh of electricity, while local storage only needs about 0.000005 kWh, which is a million times lower. Making the switch to local storage in 1 Million cars would lead to saving 905 kg of CO₂ every second. If you want to know what that actually means, you can translate that into equivalents: CO2 equivalencies or the CO2 calculator

👉 These examples clearly illustrate the potential impact of shifting towards an environmentally conscious mindset when developing software. Now that we know the why, it’s time to discuss the how.

Sustainable Edge Data Managment w. ObjectBox – a ready-made developer tool

ObjectBox is a free Edge Database that can help reduce the environmental impact of apps. It is optimized for computing resource efficiency and empowers developers to store and use data locally and create offline-first apps. Unless the data is really needed in the cloud, this is way more energy-efficient and sustainable compared to a cloud setup. On top, it works independant from an Internet connection being available and is superfast while saving battery, making it an ideal choice for apps that prioritize sustainability.

What is an Edge Database?

An Edge Database is a type of database that is used on the “edge” of a network, closer to the data sources and devices generating data. Traditional databases, on the other hand, are usually set up in centralized data centers or in the cloud.

Edge databases are essential when devices need to work offline, guarantee response times, speed is of the essence, you have limited Internet connectivity, mission-critical scenarios, or when handling high-frequency data. By processing data locally on the edge, Edge Databases can reduce latency and improve performance while also reducing the amount of data transferred over the network.

Edge databases have a small footprint and are designed to run on restricted devices such as routers, IoT gateways, mobile phones, and other embedded systems. They typically incorporate features needed in distributed systems, such as data synchronization, caching, and offline support to ensure that data remains available even in the event of network outages or other disruptions.

ObjectBox Sync is a highly efficient and sustainable data synchronization solution. It reduces the amount of energy used by having as little overhead as possible when sending data combined with solid compression, avoiding data transformations, and only syncing data changes instead of sending all data to the cloud all the time. Developers have control over what data is synced when.

Overall, ObjectBox DB + Sync is a powerful tool for building fast apps that prioritize consuming less energy and saving device resources. By storing data locally and only syncing when and where needed, developers can ensure that their apps are as sustainable as possible, and save on cloud costs along the way. 

How to start using ObjectBox Database in Flutter

How to start using ObjectBox Database in Flutter

This tutorial will help you get started with the ObjectBox Flutter Database. We will create a simple task-list app using all ObjectBox CRUD operations (Create, Read, Update, Delete). Additionally, we will support adding a tag to each task by setting up a to-one relation between tasks and tags. The pure Dart ObjectBox API is very easy to use, as you will see by going through the steps outlined below. 

A couple of useful links:

About the app

Users can enter a new task, choose which tag to apply and add the task. All added tasks are displayed as a checklist with the associated tag. Users can mark a task as finished by ticking its checkbox and delete it by swiping it away. 

Each task entry also shows the date when it was created or finished, depending on the state of the task.

Tasklist example app built with ObjectBox database in Flutter

How to start using the ObjectBox Database in your Flutter app

Add the library

Please refer to the Getting Started page for up-to-date information about adding the ObjectBox dependencies to your project.

Create a model file

ObjectBox is an object-oriented non-relational (NoSQL) database. First, we need to tell the database which entities to store. We can do this by defining a model and then using the build_runner to generate the binding code. The model is defined by writing Dart classes and annotating them.    

Create a model file (e.g. model.dart), where we want to define two entities: one for task tags and one for tasks. These are just Dart classes with the @Entity annotation. Each entity must have an ID property of type int, which serves as the unique identifier. When this is called “id”, the property is recognized automatically but if you want to use a different name, annotate it with “@Id()”. In the Tag class, we also create a String property for the tag name. Here is how the model will look. Don’t worry about the objectbox.g.dart import for now – this file will be generated later.

Additional properties of our Task entity include a String for the task’s text and two DateTime properties for the date when a task was created and finished. Then we also define a to-one relation between Tasks and Tags. This is needed so that one can assign a tag to each task created within the app. We’ll come back to how relations work at the end of this tutorial.

Generate binding code

Once our model is done, we generate the ObjectBox binding code by running flutter pub run build_runner build. This will create the objectbox.g.dart file from the import above. You will need to do this every time you update the model (e.g. by adding or removing an entity or a property), and ObjectBox will take care of the change. However, in cases like entity renaming, you will need to provide ObjectBox with more information. Read more about data model updates in the ObjectBox docs.

Create a Store

Store represents an ObjectBox database and works together with Boxes to allow getting and putting. A Box instance gives you access to objects of a particular type. One would typically create one Store for their app and a number of Boxes that depends on the number of entities they want to store in the database.

In a separate file, e.g. objectbox.dart, we define the ObjectBox class that will help us create the store. We only need a single database store where we get two Boxes – one for each object type. Let’s call these taskBox and tagBox.

So that our app can always display the current list of tasks without the need to actively refresh it, we create a data stream. Within ObjectBox, we can stream data using queries. We query all tasks by using taskBox.query() and order them by date created in a descending order: order(Task_.dateCreated, flags: Order.descending). Then we use the watch() method to create a stream.

Finally, we define the create() method that will create an instance of ObjectBox to use in our app.

Open the store

Now we can initialise the store in our app’s main() function. Do this by calling the create() method of the ObjectBox class.

CRUD operations

CREATE – Put a new Task or Tag into the Store

In the homepage state subclass of main.dart, we define the methods for adding new tasks and tags. Start by creating a task using the input controller. Then, set a tag we want to relate to this task by calling Now we only need to put the new Task object in its Box.

READ – Get all tag names

To get all tag names, we call getAll() on our tagBox. This will return a list with all tags. If you want to read just a single object, call the get(id) method to get only the desired single object back. For a range of objects, use getMany(), passing a list of ids to it.

Another way of reading data is by using queries. In the “Create a Store” section above, we created a task stream with the help of a query builder. There we just needed all tasks, so no criteria was specified. But generally, one can specify custom criteria to obtain a list of objects matching the needs. Learn more about how to use queries using the ObjectBox Query Docs.


DELETE – remove tasks by swiping

To remove objects from the database, we add a dismissible Flutter widget. Inside the setState method of the onDismissed property, we simply use the ObjectBox remove() operation with the corresponding task id.

UPDATE – Updating the date when Task was finished 

Our app prints all tasks as a list with checkboxes. Each task’s finished date is initially set to null. We want the finished date of a task to update when the user marks the task as done. The non-null finished date therefore will act as an indicator that the task is finished. Let’s do this inside the setState() method of the onChanged property of our Checkbox class. Set the dateFinished to if it was null when the checkbox value was changed, and set back to null otherwise. 


Remember we initialised a ToOne relation in the very first section and planned to come back to this at the end? Now that we have covered all CRUD operations, we can come back to discussing those.

Relations allow us to build references between objects. They have a direction: a source object references a target object. There are three types of relations:

  • to-one relations have one target object, as used above;
  • one-to-many relations can have multiple target objects, but each target only has one source, e.g. we could add a backlink (one-to-many) to Tag in our example to find out all tasks with a specific tag;
  • many-to-many relations involve targets that can have multiple sources, e.g. to improve our example app by supporting multiple tags per task, we could replace the to-one with a many-to-many relation.

Read about relations in more detail and learn how to use them with the help of the ObjectBox Relations Docs or our video tutorial.

How to use relations

When listing all tasks, we might want to include each task’s tag next to the task name. The relations are already initialised (see the “Create a model file” section). Now we need to read the tag of each task we want to list. So, inside the Text widget that displays the task name, we use tasks[index] to print the name of the corresponding tag.

Build your own Flutter app with the ObjectBox Database

Now you have all the tools needed to build your own version of the task-list app with tags. The setup we described is rather minimal, so that anyone can get started easily. However, this gives you lots of room for improvement. For example, you could replace the existing to-one relation with a to-many, allowing users to add more than one tag per task. Or you can add task-list filtering and/or sorting functionality using different queries. The possibilities are truly endless.

We’d like to learn more about the creative ways to use the ObjectBox Database in Flutter you came up with! Let us know via Twitter or email.

Beginner C++ Database Tutorial: How to use ObjectBox

Beginner C++ Database Tutorial: How to use ObjectBox


As a direct follow up from the ObjectBox database installation tutorial, today we’ll code a simple C++ example app to show how the database can be used. Before starting to program, let’s briefly overview what we want to achieve with this tutorial and what is the best way to work through it.

Overview of the app we want to build

In short, we will make a console calculator app with an option to save results into memory. These will be stored as objects of the Number class. Every Number will also have an ID for easy reference in future calculations. Apart from the function to make calculations, we will create a function to enter memory. It will list all the database entries and have an option to clear memory. By coding all of this, we will make use of such standard ObjectBox operations as put, get, getAll and removeAll.

Our program will consist of seven files: 

  • the FlatBuffers schema file, that defines the model of a class we want to store in the database
  • the header file, for class function definitions
  • the source file, for function implementation
  • the four files with objectbox binding code that will be created by objectbox-generator

How to use this tutorial

While looking at coding examples is useful in many cases, the best way to learn such a practical skill like programming is to solve problems independently. This is why we included an exercise for each step. You are encouraged to make the effort and do each of them, even if you don’t know the answer straight away. Only move to the next step after you test each part of your program and make sure that everything works as intended. Ideally, you should only use the code snippets presented here to check yourself or look for hints when you feel stuck. Bear in mind that sometimes there might be several different ways to achieve the same results. So if something that we ask you to do in this tutorial doesn’t work for you, try to come up with your own solution.

How to create the FlatBuffers file?

First, we’ll create the FlatBuffers schema (.fbs) for our app. This is required for the objectbox-generator to generate binding code that will allow us to use the ObjectBox library in our project. 

The FlatBuffers schema consists of a table, which defines the object we want to store in the database, and the properties of this object. Each property consists of a name and a type. We want to keep our example very simple, so just two properties is enough.

  1. To replicate a calculator’s memory, we want ObjectBox to store some numbers. We can define the Number object by giving the table a corresponding name.
  2. Inside the table, we want to have two properties: id and contents. The contents of each Number object is the number itself (double), while id is an ulong that our program will assign to each of them for easy identification.

Exercise: create a file called numbers.fbs and define the table in the format

Reveal code

Generating binding code

Now that the FlatBuffers file is ready, we can generate the binding code. To do this, run the objectbox-generator for our FlatBuffers file:

The following files will be generated:

  • objectbox-model.h
  • objectbox-model.json
  • numbers.obx.hpp
  • numbers.obx.cpp

The header file

This is where the main chunk of our code will be. It will contain the Calculator class and all the function definitions.

  1. Start by including the three ObjectBox header files: objectbox.hpp, objectbox-model.h and numbers.obx.hpp. Our whole program will be based on one class, called Calculator. It should only have two private members: Store and Box. Store is a reference to the database and will manage Boxes. Each Box stores objects of a particular class. In this example, we only need one Box. Let’s call it numberBox, as it will store Numbers that we want to save in the memory of our calculator.

Exercise: create a file called calculator.hpp and define the Calculator class with two private members: reference to the obx library member Store and a Box of Numbers.

Reveal code

2. After the constructor, we define the run function. It will be responsible for the menu of our program. There should be two main options: to perform calculations and enter memory. As discussed above, we want this app to do two things: perform calculations and show memory. We’ll define these as separate functions, called Calculate and Memory. The first one is quite standard, so we won’t go into a detailed explanation here. The only thing you should keep in mind is that we need to account for the case when the user wants to  operate on a memory item. To deal with this, we’ll process input in a function called processInput.

Exercise: define the parametrised constructor which takes a reference to Store as a parameter. Then define the run and Calculate functions.

Reveal code

3. The final part of this function is for saving results into memory. We start by asking the user if they want to do that. If the answer is positive, we create a new instance of Number and set the most recent result as a value of its contents. To save our object in the database, we can operate with put(object) on our Box. put is one of the standard ObjectBox operations, which is used for creating new objects and overwriting existing ones. 

Exercise: create an option to store the result in memory, making use of the ObjectBox put operation.

Reveal code

4. Next, we should define processInput, which will read input as a string and check whether it has the right format. Now, to make it recognise the memory items, we have to come up with a standard format for these. Remember, we defined an ID property for our Numbers. Every number in our database has an ID, so we can refer to them as, e.g. m1, m2, m3 etc. To read the numbers from memory, we can make use of the get(obx_id) operation. It returns a unique pointer to the corresponding Number, whose contents we need to access and use as our operand.

Exercise: define the processInput function, which detects when something like m1 was used as an operand and updates x, y, and op according to the input.

Reveal code

5. The last function in our header file will be Memory. It should list all the numbers contained in the database and have an option to clear data. We can read all the database entries by calling the getAll ObjectBox operator. It returns a vector of unique pointers. To clear memory, you can simply operate with removeAll on our Box.

Exercise: define the Memory function, which lists all the memory items, and can delete all of them by request.

Reveal code

The source file

To tie everything together, we create a source (.cpp) file. It should contain only the main function that initialises the objectbox model, creates an instance of the Calculator app, and runs it. To create the ObjectBox model, use

then passing options as a parameter when you initialise the Store.

Exercise: create the source file

Reveal code

Final notes

Now you can finally compile and run your application. At this point, a good exercise would be to try and add some more functionality to this project. Check out the ObjectBox C++ documentation to learn more about the available operations.

After you’ve mastered ObjectBox DB, why not try ObjectBox Sync? Here is another tutorial from us, showing how easily you can sync between different instances of your cross platform app.

Other than that, if you spot any errors in this tutorial or if anything is unclear, please come back to us. We are happy to hear your thoughts.