On-device Vector Database for Dart/Flutter

ObjectBox 4.0 introduces the first on-device vector database for the Dart/Flutter platform, allowing Dart developers to enhance their apps with AI in ways previously not possible. A vector database facilitates advanced data processing and analysis, such as measuring semantic similarities across different document types like images, audio files, and texts. If you want to go all-in with on-device AI, combine the vector search with a large language model (LLM) and make the two interact with individual documents. You may have heard of this as “retrieval-augmented generation” (RAG). This is your chance to be among the first Dart developers to explore it.

Vector Search for Dart/Flutter

Now, let’s look into the Dart specifics! With this release, it is possible to create a scalable vector index on floating point vector properties. It is a special index that uses the Hierarchical Navigable Small World (HNSW) algorithm, which is highly scalable and can find relevant data within millions of entries in a matter of milliseconds.

Let’s take a deeper look at the example used in our vector search documentation. In this example, we use cities with a location vector to perform a proximity search. Here is the City entity and how to define an HNSW index on the location (it would also need additional properties like an ID and a name, of course):
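A minimal sketch of such an entity, using the ObjectBox Dart annotations (the exact annotation parameters can differ between releases, so double-check against the vector search documentation):

```dart
import 'package:objectbox/objectbox.dart';

@Entity()
class City {
  @Id()
  int id = 0;

  String? name;

  // Stored as a 32-bit float vector; the HNSW index enables fast
  // nearest neighbor search. `dimensions` must match the vector length.
  @HnswIndex(dimensions: 2)
  @Property(type: PropertyType.floatVector)
  List<double>? location;

  City({this.name, this.location});
}
```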

Vector objects are inserted as usual (the indexing is done automatically behind the scenes):
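For example (assuming an open Store; the cities and coordinates are just sample data):

```dart
final box = store.box<City>();
box.putMany([
  City(name: 'Barcelona', location: [41.385063, 2.173404]),
  City(name: 'Nairobi', location: [-1.292066, 36.821945]),
  City(name: 'Salzburg', location: [47.809490, 13.055010]),
]);
```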

To perform a nearest neighbor search, use the new nearestNeighborsF32(queryVector, maxResultCount) query condition and the new “find with scores” query methods (the score is the distance to the query vector). For example, to find the 2 closest cities:
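A minimal sketch, using Madrid’s coordinates as the query vector:

```dart
final query = box
    .query(City_.location.nearestNeighborsF32([40.416775, -3.703790], 2))
    .build();
// Each result pairs the matched object with its distance score.
final results = query.findWithScores();
for (final result in results) {
  print('${result.object.name}: ${result.score}');
}
query.close();
```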

Vector Embeddings

In the cities example above, the vectors were straightforward: they represent latitude and longitude. Maybe you already have vector data as part of your data, but often you don’t. So where do you get the vectors from?

For most AI applications, vectors are created by a so-called embedding model. There are plenty of embedding models to choose from, but first you have to decide whether it should run in the cloud or locally. Online embeddings are the easier way to get started: just set up an account at your favorite AI provider and create embeddings online (a sketch of what such a call can look like follows below). Alternatively, you can run your embedding model locally on-device. This might require some research. A good starting point for that may be TensorFlow Lite, which also has a Flutter package. If you want to use really good embedding models (starting at around 90 MB), you can also check these on-device embedding models. These might require a more capable inference runtime, though. E.g., if you are targeting desktops, you could use Ollama (e.g. using this package).
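To make the online route a bit more concrete, here is a hypothetical Dart sketch of fetching an embedding over HTTP. The endpoint URL, request body, and response field are placeholders, not any specific provider’s API – adapt them to the provider you choose:

```dart
import 'dart:convert';

import 'package:http/http.dart' as http;

/// Requests a vector embedding for [text] from a hosted embedding API.
Future<List<double>> embed(String text) async {
  final response = await http.post(
    Uri.parse('https://api.example.com/v1/embeddings'), // placeholder URL
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY', // placeholder key
      'Content-Type': 'application/json',
    },
    body: jsonEncode({'input': text}),
  );
  final json = jsonDecode(response.body) as Map<String, dynamic>;
  // The response shape is provider-specific; this assumes a flat list field.
  return (json['embedding'] as List).cast<double>();
}
```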

CRUD benchmarks 2024

A new release is also a good occasion to refresh our open source benchmarks. Have a look:

CRUD is short for the basic operations a database does: Create, Read, Update and Delete. It’s an important metric for the general efficiency of a database.

What’s next?

We are excited to see what you will build with the new vector search. Let us know! And please give us feedback. It’s the very first release of an on-device vector database ever – and the more feedback we get on it, the better the next version will be.

Edge AI: The era of on-device AI

AI anywhere and anytime - free from Internet dependencies & 100% private

Edge AI is an often-overlooked aspect of AI’s natural evolution. It is basically the move of AI functionalities away from the cloud (or powerful server infrastructure) towards decentralized (typically less powerful) devices at the network’s edge, including mobile phones, smartwatches, IoT devices, microcontrollers, ECUs, or simply your local computer. Or, more broadly speaking: “Edge AI” means AI that works directly on-device – “local AI”.

Therefore, Edge AI apps work independently of an internet connection, offline as well as online. So, they are ideal for low-, intermittent-, or no-connectivity scenarios. They are reliably available, more sustainable, and – of course – way faster on-device than anything hosted in the cloud. On-device AI apps can empower realtime AI anytime and anyplace.

Edge AI is where Edge Computing meets AI

The importance of vector databases for AI applications

To enable powerful on-device AI applications, the on-device (edge) technology stack needs local vector databases. So, before diving deeper into Edge AI, we’ll dive into vector databases first. Skip this section if you are already familiar with them.

What is a vector database?

Just as SQL databases handle data in rows and columns, graph databases manage graphs, and object databases store objects, vector databases store and manage large data sets of vectors – or, more precisely, vector embeddings. Because AI models work with vector embeddings, vector databases are basically the databases for AI applications. Vector databases offer a feature set of vector operations, most notably vector similarity search, that makes it easy and fast to work with vector embeddings in conjunction with AI models.

When and why do you need a vector database? 

Given the significance of vector embeddings (vectors) for AI models, particularly Large Language Models (LLMs) and AI applications, vector databases are now integral to the AI technology stack. They can be used to:

Train AI models (e.g. ML model training, LLM training)
Vector databases manage the datasets large models are trained on. Training AI models typically entails finding patterns in large data sets, and vector databases significantly speed up identifying patterns and finding relationships by enabling efficient retrieval of similar data points.

Speed up AI model / LLM responses
Vector databases use various techniques to speed up vector retrieval and similarity search, e.g. compression and filtering. They accelerate both model training and inference, thus enhancing the performance of generative AI applications. By optimizing vector retrieval and similarity search, vector databases can enhance the efficiency and scalability of AI applications that rely on high-dimensional data representations.

Add long-term memory to AI models and LLMs
Vector databases add long-term memory to AI applications in two ways: they persist the history to (1) continue tasks or conversations later as needed, and (2) personalize and enhance the model for better-fitting results.

Enable Multimodal Search
Vector databases serve as the backbone to jointly analyze vectors from multimodal data (text, image, audio, and video) for unified multimodal search and analytics. The use of a combination of vectors from different modalities enables a deeper understanding of the information, leading to more accurate and relevant search results.

Enhance LLM responses, primarily via “RAG” (Retrieval-Augmented Generation)
With a vector database, you have additional knowledge to enhance the quality of a model’s responses and to decrease hallucinations; real-time updates, as well as personalized responses, become possible.

Perform Similarity Search / Semantic Retrieval
Vector databases are the heart and soul of semantic retrieval and similarity search. Vector search often works better than “full-text search” (FTS), as it finds related objects that share the same semantics/meaning instead of matching the exact keyword. Thus, it is possible to handle synonyms, ambiguous language, as well as broad and fuzzy queries.

Cache: Reduce LLM calls
Vector databases can cache similar queries and their responses, serving as a lookup before calling the LLM. This saves resources, time, and costs.

The shift to on-device computation (aka Edge Computing)

Edge Computing is, in its essence, a decentralized computing paradigm; it is what makes AI on decentralized devices (aka Edge AI) possible. Note: in computing, we have regularly seen shifts from centralized to decentralized computing and back again.

What is Edge Computing?

Our world is decentralized. Data is produced and needed everywhere, on a myriad of distributed devices like smartphones, TVs, robots, machines, and cars – on the so-called “edge” of the network. It would not only be unsustainable, expensive, and super slow to send all this data to the cloud – it is often literally unfeasible. So, much of this data simply stays on the device it was created on. To harness the value of this data, the distributed “Edge Computing” paradigm is employed.

When and why do you need Edge Computing? 

Edge Computing stores and processes data locally on the device it was created on, e.g. on IoT, Mobile, and other edge devices. In practice, Edge Computing often complements a cloud setup. The benefits of extending the cloud with on-device computing are:

    • Offline-capability
      Storing and computing data directly on-device allows devices to operate independently from an Internet connection, which is crucial for remote locations (e.g. oil rigs in the ocean) or applications that need to always work (e.g., while the car is in underground garages, or in remote areas).
    • Data ownership/privacy
      Cloud apps are fundamentally non-private and limit the user’s control over their own data. Edge Computing allows data to stay where it is produced, used, and where it belongs (with the user / on the edge devices). It therefore reduces data security risks as well as data privacy and ownership concerns.
    • Bandwidth constraints and the cost of data transmission
      Ever growing data volumes strain bandwidth and associated network/cloud costs, even with advanced technologies like 5G/6G networks. Storing data locally in a structured way at the edge, such as in an on-device database, is necessary to unlock the power of this data. At the same time, some of this data can still be made available centrally (in the cloud or on an on-premise server), combining the best of both worlds.
    • Fast response rates and real-time data processing
      Doing the processing directly on the device is much faster than sending data to the cloud and waiting for a response (latency). With on-device data storage and processing, real-time decision making is possible.
    • Sustainability
      By reducing data overhead and unnecessary data transfers, you can cut down 60-90% of data traffic, thereby significantly reducing the CO2 footprint of an application. A welcome side effect is that this also lowers costs tremendously.

Edge AI needs on-device vector databases

Every megashift in computing is empowered by specific infrastructure software, like e.g. databases. Shifting from AI to Edge AI, we still see a notable gap: on-device support for vector data management (the typical AI data) and data synchronization capabilities (to update AI models across devices). To efficiently support Edge AI, vector databases that run locally, on edge devices, are as crucial as they are on servers today. So far, all vector databases have been cloud / server databases that cannot run on restricted devices like mobile phones and microcontrollers. Moreover, they often don’t run on more capable devices like standard PCs either, or only with really bad performance. To empower everyday-life AI that works anytime all around us, we therefore need a database that can run performantly on a wide variety of devices on the edge of the network.

In fact, vector databases may be even more important on the edge than they are in cloud / server environments. On the edge, the tradeoff between accuracy and performance is a much more delicate line to walk, and vector databases are a way to balance the scales.

On-device AI: Use Cases and why they need an Edge Vector Database

Seamless AI support where it is needed most – on everyday devices and all the things around us – needs an optimized local AI tech stack that runs efficiently on those devices. From private home appliances to on-premise devices in business settings, medical equipment in healthcare, digital infrastructure in urban environments, or just mobile phones – you name it: to empower these devices with advanced AI applications, you need local vector databases. Given the broad scope of AI’s impact across fields, let’s focus on some specific examples to make it more tangible: the integration of AI within vehicle onboard systems and the use of Edge AI in healthcare.

Vehicle onboard AI and edge vector databases – examples

Imagine a car crashing because the car software was waiting on the cloud to respond – unthinkable. The car is therefore one of the most obvious use cases for on-device AI.

Any AI application is only as good as its data. A car today is a complex distributed system on wheels, traversing a complex decentralized world. Its complexity is permanently growing due to increasing data (7x more data per car generation), devices, and the number of functions. Making use of the available data inside the car and managing the distributed data flows is therefore a challenge in itself. Useful onboard AI applications depend on an on-device vector database (Edge AI). Some in-car AI application examples:

  • Advanced driver assistance systems (ADAS)
    ADAS benefit from in-vehicle AI in a lot of areas. Let’s look, for example, at driver behaviour: by monitoring eye movements and head position, ADAS can determine when the driver shows signs of unconcentrated driving, e.g. drowsiness. Using an on-device database, the ADAS can use the historic data, the realtime data, and other car data, like the driving situation, to deduce its action and issue alerts, avoid collisions, or suggest other corrective measures.
  • Personalized, next-gen driver experience
    With an on-device database and Edge AI, an onboard AI can analyze driver behavior and preferences over a longer period of time and combine them with other available data to optimize comfort and convenience for a personalised driving experience that goes way beyond a saved profile. For example, an onboard AI can continually adjust the onboard entertainment system to the driver’s detected state, the driving environment, and personal preferences.

Applications of Edge AI in Healthcare – examples

Edge Computing has seen massive growth in healthcare applications in recent years, as it helps maintain patient privacy and provides the reliability and speed needed. Artificial intelligence is also already in wide use, making healthcare smarter and more accurate than ever before. With the means for Edge AI at hand, this transformation of the healthcare industry will become even more radical. With Edge AI and on-device vector databases, healthcare can rely on smart devices to react in realtime to users’ health metrics, provide personalized health recommendations, and offer assistance during emergencies – anytime and anyplace, with or without an Internet connection, all while ensuring data security, privacy, and ownership. Some examples:

  • Personalized health recommendations
    By monitoring the user’s health data and lifestyle factors (e.g. sleep hours, daily sports activity) combined with their historic medical data, if available, AI apps can help detect early signs of health issues or potential health risks for early diagnosis and intervention. The AI app can provide personalized recommendations for exercise, diet, or medication adherence. While this case does not rely on real-time analysis and fast feedback as much as the previous example, it benefits from an edge vector database with regard to data privacy and security.
  • Point-of-care realtime decision support
    By deploying AI algorithms on medical devices, healthcare providers can receive immediate recommendations, treatment guidelines, and alerts based on patient-specific data. One example where this is used with great success is in surgeries. An operating room today is a complex environment with many decentralized medical devices, requiring teams to process, coordinate, and act upon several information sources at once. Ultra-low-latency streaming of surgical video into AI-powered data processing workflows on-site enables surgeons to make better-informed decisions, helps them detect abnormalities earlier, and lets them focus on the core of their task.

Edge AI: Clearing the Path for AI anywhere, anytime

For an AI-empowered world when and where needed, we still have to overcome some technical challenges. With AI moving so fast, this seems quite close, however. The move into this new era of ubiquitous AI needs Edge AI infrastructure. Only when Edge AI is as easy to implement and deploy as cloud AI will we see the ecosystem thriving and bringing AI functionalities that work anytime and anyplace to everyone. An important cornerstone will be on-device vector databases, as well as new AI frameworks and models that are specifically designed to address Edge Computing constraints. Some of the corresponding recent advances in the AI area include “LLM in a Flash” (a novel technique from Apple for effective inference of LLMs at the edge) and Liquid Neural Networks (designed for continuous learning and adaptation on edge devices). There’s more to come – follow us to keep your edge on Edge AI news.

Vector Databases for Edge AI

The intersection of AI and Edge Computing is where Edge AI happens – and it needs databases that support AI and can run on the edge (for lack of a better term, “Edge Vector Databases”, also referred to as on-device vector databases or local vector databases). Vector databases are the databases for AI and an important piece of the AI tech stack. Edge Databases are databases that can run on edge devices.

Edge Vector Databases – the intersection of Edge Computing and AI needs a database

In 2023, the Edge Computing market is estimated at $53B,[1] while the AI market is expected to reach a whopping $87B.[2] Both markets are expected to grow dramatically in the coming years, with the two technologies enhancing each other. For many use cases it is advantageous, and oftentimes necessary, to use Edge Computing and AI in conjunction. This is what is called Edge AI, and Gartner predicted it would reach its plateau within 2023-2024.[3] In fact, Gartner recently named Edge AI one of the breakthrough technologies of 2023 due to the growing demand for real-time AI solutions and the need for decentralized data processing.[4] The global Edge AI market size was valued at roughly $14.5B in 2022, with expected CAGRs of 20-30% from 2023 to 2030.[5] In this article we will take a closer look at the use of vector databases in Edge AI.

The AI market: AI model trainings vs. using trained AI models

To understand the AI market, it is important to distinguish between AI model generation and initial training on the one hand, and the use of these (trained) models on the other.

AI model training

AI model training is the most resource-intensive and costly part of AI – and is considered hard.[6] The larger the model, the higher the costs and the more time it needs. Also, with models getting larger, the training fails more often and must be restarted, adding to duration and costs. Initially, AI models were specifically trained for one task only. Now, however, a new type of AI model has evolved: the foundation model. Foundation models are general-purpose models, and this development was a main driver of the current AI boom we are seeing.

Foundation models

Foundation models (also: base models) are trained on a vast quantity of data at scale in such a way that they can be used for a wide variety of tasks. Costs for training such large foundation models from scratch (like GPT-4) are typically in the millions of USD, and the expectation is that large model training costs will go up to 500 million USD by 2030.[7] They can be “fine-tuned” (trained) with specific data sets (on top of the already trained model) to adapt them to specific environments;[8] e.g. GitHub Copilot is based on GPT-3 and was tweaked as well as additionally trained on the code from GitHub repos.[9] Fine-tuning is typically way less costly than training a specific one-task model from scratch (let alone the initial foundation model training). Examples of foundation models include the GPT models from OpenAI, LLaMA, and Google’s BERT. LLMs (Large Language Models) are basically a specific subset of foundation models.

Using trained AI models

Using trained foundation models (like GPT-3 or LLaMA) and additionally tuning them with specific data sets is what empowers most current AI tools / apps and all the innovations we are seeing around them. Basically, at this moment, a handful of popular foundation models are empowering a whole, currently thriving, ecosystem. As AI models use vector embeddings, using trained models can be supported and enhanced with a vector database.[10] We’ll dive into this a bit more below.

📈 Side note on the market

From a market perspective, this means there is a relevant market entry barrier to training foundation models, and therefore it is reasonable to expect only a few companies doing it (as opposed to many startups ;)). Yet, training new models is likely where the greatest innovation potential and most significant advancements lie. Given that the current AI market largely depends on large trained models (mainly foundation models) being deployed as free (often open source) models, there is a real risk of a few tech giants owning the market later on, while the market built on top of those models starts stalling.[11] For example, OpenAI didn’t open-source GPT-4 anymore because they felt it could impede their business interests; Sam Altman even went as far as saying it was wrong to ever open it – likely they should change their name.[12]

At the same time, Sam Altman went lobbying around the world to regulate AI (model trainings) and raise entry barriers. The most reasonable explanation I read is that they are trying to ensure corporate dominance; likewise, the movement to pause AI model training from other actors in the space is a move to avoid a monopoly by OpenAI and get skin in the game quickly.[13]

Vector Databases and their important role in AI

Many machine learning and deep learning algorithms, as well as the AI models described above, depend on vector embeddings. The increasing demand for creating, storing, and managing these embeddings has led to the emergence of vector databases.

What are vector embeddings?

Vector embeddings represent (multimodal) data as n-dimensional vectors – essentially arrays of n numbers. This is what makes them easy and efficient to compute with. The power of vector embeddings lies in their ability to capture the essence of the data they represent.

Vector embeddings basically are “the output of the process of learning embeddings”, which is done by feeding raw input data – like texts, images, or words – into a trained AI model.[14]

In this process, the input data is translated into a lower-dimensional space. The result is a set of n-dimensional vectors in an embedding space.

The process of embedding [15]

The embedding space is specific to the data on which the embeddings were trained, but it can be transferred to other tasks and domains via transfer learning. Two embeddings that are close together in the embedding space indicate that the data they represent is similar.[16]

Once generated, the vector embeddings can be stored for efficient retrieval and use by AI / ML apps.[17] We visualized the process below.  

Vector Database initial preparation [26]

How are embeddings generated?

Creating vector embeddings used to be a time-consuming process requiring domain experts and manual work. Today, however, there are many specialized models available for generating embeddings:

  • For text data, for example: Word2Vec, GloVe, and BERT. These models translate the semantic meaning of words and phrases into numerical form.
  • For image data, for example: VGG and Inception. These models capture visual characteristics of images and translate them into numerical form.

→ The Hugging Face Model Hub offers many models that can create embeddings for different types of data. Anyone can use them with very little to no coding know-how.

But… how are embeddings generated?

There is no easy answer to this – at least I didn’t find one. If you really want to know, you will need to spend the time and effort to dig very deep. Today’s large language models were built over decades by many brilliant minds. They entail many fundamental concepts for converting different types of data (like words or images) into numerical representations. The following three fundamental concepts seem to be part of most LLMs and are worth having heard of (see the toy sketch after this list):[18]

  • Encoding – Non-numerical, multimodal data needs to be translated into numbers so that models can be created out of it.
  • Vectors – In order to store the encoded data and efficiently perform mathematical functions on it, encodings are stored as vectors (typically floating-point arrays).
  • Lookup matrices – Also known as lookup or hash tables; such a table maps data so that one can quickly jump from numerical to word representations (and back) across large chunks of text.
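To make these three concepts tangible, here is a toy Dart sketch – illustration only, with hard-coded values; real models learn the embedding values during training:

```dart
void main() {
  // Lookup table: maps each word to a row index (the encoding).
  final vocabulary = {'cat': 0, 'dog': 1, 'car': 2};

  // Embedding matrix: one n-dimensional vector (here n = 4) per word,
  // stored as floating-point arrays.
  final embeddings = [
    [0.9, 0.1, 0.3, 0.5], // cat
    [0.8, 0.2, 0.3, 0.6], // dog (close to cat: both are animals)
    [0.1, 0.9, 0.7, 0.2], // car (far from both)
  ];

  // Encoding + lookup: word -> index -> vector.
  List<double> embed(String word) => embeddings[vocabulary[word]!];

  print(embed('cat')); // [0.9, 0.1, 0.3, 0.5]
}
```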

From the database perspective, the primary interest is in working efficiently with the generated embeddings.[19] Once a piece of information (a sentence, a document, an image) is a vector embedding and stored in a database, it’s time to get creative.

Where are vector embeddings used? 

One of the most common vector embedding applications is in recommendation systems and search engines: e.g., Google Search uses embeddings to match text to text and text to images; Snapchat uses them to serve individual ads depending on the user and time; and Meta (Facebook) uses them for social search.[20] But use cases are endless; e.g., they can also be used in chatbots, fraud detection, predictive maintenance, and autonomous driving. The ability to convert complex data into numerical form opens up endless applications. Generative AI applications typically work with vector embeddings. All of these applications benefit from using a vector database to enhance speed and efficiency.

How are vector databases used? 

While a vector database can sometimes return responses directly, it typically returns results in conjunction with an LLM, as we have depicted below. The vector database improves the accuracy of the LLM responses by using domain-specific data from the database. More specifically: the vector search will give you the most relevant data for a specific query to provide to the LLM.[21] In both cases, the vector database helps reduce the number of calls to the LLM, which are costly, and also speeds up response times.

Vector databases: In use / production [26]

Accordingly, apart from efficiently handling vectors (as a data type), nearest neighbour search – primarily Approximate Nearest Neighbour (ANN) search – is the most important feature of a vector database. The ANN algorithm finds the most similar vectors quickly. Additional filtering optimizes the result further, making queries even more efficient. Using the retrieved vectors (for context) and the initial query, an LLM generates the response. Typically, the response will be stored as a response embedding in the database, so that over time the database can serve more questions directly and / or improve the accuracy of the answers even more. This is depicted in the image above and sketched in code below.
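In code, this basic flow can be sketched as follows (hypothetical Dart; the stub functions stand in for your embedding model, your vector database query, and your LLM client – none of them is a concrete API):

```dart
// Stub: embedding model (see the embedding sketch earlier in this article).
Future<List<double>> embed(String text) async => [0.1, 0.2, 0.3];

// Stub: (approximate) nearest neighbor lookup in the vector database.
Future<List<String>> vectorSearch(List<double> queryVector, {int k = 3}) async =>
    ['relevant document 1', 'relevant document 2'];

// Stub: LLM call.
Future<String> callLlm(String prompt) async => 'generated answer';

Future<String> answer(String question) async {
  // 1. Embed the question with the same model used for the stored documents.
  final queryVector = await embed(question);

  // 2. Retrieve the most relevant documents via nearest neighbor search.
  final context = await vectorSearch(queryVector, k: 3);

  // 3. Hand the question plus the retrieved context to the LLM.
  final prompt = 'Context:\n${context.join('\n')}\n\nQuestion: $question';
  return callLlm(prompt);
}
```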

Why to use a vector database?

Vector databases enhance the efficiency and accuracy of AI applications, especially those that are heavy on similarity searches, e.g. recommendation systems, natural language processing, or computer vision. Vector databases are also essential for scaling AI applications, as efficiency and speed start to matter more. Through this, vector databases also bring costs down and improve the sustainability of AI applications. On top of that, developers benefit from the additional functionalities a database offers for managing data, especially its querying capabilities, making any AI application more adaptable and therefore future-proof.

Edge vector databases – giving AI an edge

Edge Computing in a gist

Since data is produced and used everywhere (decentralized), using cloud computing for storage and processing is inefficient, wasteful, and often impossible. To unlock the value of decentralized data and drive digitization, you need to compute on the edge of the network (i.e. locally, closer to where the data is generated). Gartner emphasizes Edge Computing’s importance for digital transformation.[22] To fully utilize Edge Computing, we need edge-specific infrastructure technologies – or, “to make the edge as easy as the cloud for developers.” Edge databases are one such piece of core infrastructure software. They enable rapid implementation of edge solutions by providing fast local data persistence (on the edge) and the capability to control and direct decentralized data flows (on the edge as well as in conjunction with a cloud).

Edge computing and edge databases can unlock decentralized data’s full potential, drive digital transformation, and create a sustainable and efficient data management landscape.

What is Edge AI?

Edge AI is the implementation of AI applications on the edge of the network, without using a cloud: the necessary AI computations are performed directly on the edge, where the data is produced – e.g. in the car (onboard AI), on a mobile device, or simply within a specific location like a shop floor. Local Edge AI enables reliable decision-making in milliseconds, also when offline, and at much lower cost. At this moment, this is particularly interesting for mission-critical use cases, offline scenarios, and applications with high data security / privacy requirements. To run AI models directly on the edge, they need to be optimized for edge devices. The good news is that several AI models optimized for small devices are available, e.g. Google’s Gecko, which is open source.[23] “Gecko is so lightweight that it can work on mobile devices and is fast enough for great interactive applications on-device, even when offline.”[24] Edge AI applications benefit tremendously from using a local vector database; only few use cases could do without one.

Edge AI basically offers the advantages of Edge Computing, whereas the disadvantages are more specific to AI.

Advantages of Edge AI (vs. cloud) [25]

  • Edge AI is faster, can guarantee QoS requirements, and works offline
  • Edge AI saves Internet bandwidth, cloud, and networking costs (e.g. MNO costs)
  • Edge AI helps ensure data privacy, data security, and data ownership
  • Edge AI is more sustainable to run (less wasteful data traversal meaning less energy use, less costs for the energy, and less CO2 emissions)
  • Edge AI is a young market and therefore holds opportunities to capture market share and competitive advantages

Disadvantages of Edge AI (vs. cloud) [25]

  • Decentralized data access can be challenging and needs specific skills
  • The initial setup for the ongoing training of decentralized Edge AIs is more complex and therefore costly (while the cloud setup is quick and easy)
  • Edge AI needs specific skills (entailing “oldschool programming skills”) – dev talent is hard to attract and expensive to keep
  • The heterogeneity of the edge makes it difficult to develop solutions for a wide range of devices
  • Edge AI is a young market and still lacks infrastructure software

Edge AI setup / architecture

There are generally two setups for Edge AI applications: Full edge setup, running the AI model on the edge devices directly, or a hybrid approach, using the cloud or a central server for the AI model.

Edge AI: general setup / architecture options

The edge / cloud approach has the advantage that AI model (re-)training and enhancement happen centrally, in the cloud, without additional effort. In the full edge setup, on the other hand, you have all the advantages of the edge (offline, cost-effective, private, …) and you can use the power of all edge devices, making it even more affordable and fast. However, the caveat lies in the challenge of organizing the learning: the individual local models diverge, and their learnings need to be merged into a “global AI model” (centrally) and distributed back to all devices.

Depending on the details, this can be done locally on a central server or in the cloud, or even on an edge device. Also, depending on the number of edge devices, connectivity, and the need for speedy updates, the distribution can be organized in a more decentralized way, fully using the power of the edge. Once all devices are harmonized, any edge device that has the necessary hardware capabilities could act as the global model, combining all updates and sending them out. This offers great advantages with regard to availability and resilience. You can find more about decentralized Edge AI setups under the term federated learning.

Edge AI – decentralized setup, using the full power of the edge

Summary: Vector Databases for the edge

According to our research and experience, no “Edge Vector Database” exists yet. The Edge Database market has always been limited to a few players – and is certainly not as crowded as the central server / cloud database space. However, cloud / server databases cannot be used on the edge (big to small just doesn’t work ;)), whereas Edge Databases can run anywhere and can sometimes be a good choice for a server / cloud setup.

Opinions differ on whether there is a need for specific, dedicated vector databases, or whether general databases will evolve to include vector support and become the go-to solution. In any case, the vector database space is hot and has recently been added as a category on db-engines – a rare occurrence for db-engines, the established platform for databases. We can see that both types of databases are converging towards each other, and we firmly believe that in the future, all databases will support vectors.

So, when it comes to Edge Databases, there are the same two options: either someone will implement a dedicated edge vector database, or Edge Databases will evolve to support vectors. Because so far we have seen neither, we have extended the ObjectBox Edge Database with vector support. The huge advantage this brings is that ObjectBox already offers an excellent, highly efficient, and battle-tested out-of-the-box Sync that takes care of the “hard stuff” of decentralized data management for developers. As a developer tool that is self-hosted and can be used on all kinds of edge devices – and certainly on-premise – it offers companies the flexibility to implement a myriad of applications and reduce costs (especially cloud and networking costs), while not jeopardizing data ownership in any way.

References and Notes

  1. https://www.marketsandmarkets.com/Market-Reports/edge-computing-market-133384090.html
  2. https://www.forbes.com/advisor/business/ai-statistics/
  3. https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2022-gartner-hype-cycle
  4. https://www.gartner.com/en/articles/4-emerging-technologies-you-need-to-know-about
  5. https://www.grandviewresearch.com/industry-analysis/edge-ai-market-report
  6. https://www.technologyreview.com/2023/05/12/1072950/open-source-ai-google-openai-eleuther-meta/
  7. https://mpost.io/ai-model-training-costs-are-expected-to-rise-from-100-million-to-500-million-by-2030/
  8. https://en.wikipedia.org/wiki/Foundation_models
  9. https://en.wikipedia.org/wiki/GitHub_Copilot, https://github.com/features/preview/copilot-x
  10. While this is great for the current landscape of AI-driven tools and apps, and thus the consumers, these are incremental innovations and will not take the foundation of AI forwards. So, there is a rightful fear that big corporations will discontinue open sourcing advancements, once they feel protecting their business interests outweighs the benefits they have from open sourcing.
  11. https://www.technologyreview.com/2023/05/12/1072950/open-source-ai-google-openai-eleuther-meta/
  12. https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview
  13. https://analyticsindiamag.com/openai-has-stopped-caring-about-open-ai-altogether/
  14. An AI training model is the initial version that goes through the training process, while the AI model refers to the trained and optimized version that is ready for deployment and inference in real-world applications.
  15. Adapted from https://vickiboykis.com/what_are_embeddings/
  16. If you are interested in understanding distances in vector databases more, read this post; it explains the most important commonly used distance algorithms in a straightforward and highly understandable way.
  17. Note: Traditional databases can offer vector support too. Our guess is that there will be a consolidation of databases, with traditional databases moving towards vector support and vector databases moving towards supporting other (more traditional) database functionalities. In fact, we cannot imagine a future where any database will not support AI applications (as in: Support vectors, nearest neighbour search etc.).
  18. Based on: https://vickiboykis.com/what_are_embeddings/
  19. If you are interested in diving deeper, read this article, which is written in a highly understandable and comprehensive way, explaining everything you would need to know to develop a basic understanding.
  20. https://huggingface.co/blog/getting-started-with-embeddings
  21. https://hackernoon.com/how-llms-and-vector-search-have-revolutionized-building-ai-applications
  22. https://www.gartner.com/en/documents/4263499
  23. Though the merit of it being open source is unclear and an ongoing discussion in the more legally oriented open source community.
  24. https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
  25. https://builtin.com/artificial-intelligence/edge-ai, https://objectbox.io/what-is-edge-computing/, https://objectbox.io/what-is-an-edge-database-and-why-do-you-need-one/, https://www.marketsandmarkets.com/Market-Reports/edge-ai-software-market-70030817.html, https://www.run.ai/guides/machine-learning-operations/edge-ai#What-Are-the-Benefits-Of-Edge-AI, https://cambrian-ai.com/wp-content/uploads/edd/2023/07/Large-Language-Models-On-Edge-Publication-FINAL.pdf
  26. Adapted from https://www.swirlai.com/

Vector Database Release for Flutter / Dart + Python

The Flutter / Dart and Python bindings of our database now support “vector types”. In both languages these are more commonly referred to as “lists”, and you are now able to efficiently store lists of numeric types, i.e. integers and floating point numbers (aka “vectors / vector embeddings”). Native support for these is crucial for data-intensive applications, especially in the field of AI.

What are vector embeddings? Multi-dimensional vectors are a central building block for AI applications. Accordingly, the ability to store vectors to add long-term memory to your AI applications (e.g. via vector databases) is gaining importance. This is what the ObjectBox database now supports natively.

Dart example code

Let’s assume some shapes that use a palette of RGB colors, which allows each shape to reference colors by their index. An entity for this might look like this:
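A minimal sketch using the ObjectBox Dart annotations (names are illustrative):

```dart
import 'package:objectbox/objectbox.dart';

@Entity()
class Shape {
  @Id()
  int id = 0;

  // The palette of RGB color values used by this shape,
  // stored natively as a vector of integers.
  List<int>? palette;

  Shape({this.palette});
}
```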

Python example code

Python is the number one programming language for AI. So let’s assume an Image class that contains a URL pointing to the content (e.g. JPEG/PNG images) and, additionally, a vector embedding. The latter is supplied by an ML model and contains a list of 32-bit floating point numbers.
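A sketch of what this can look like – the property declaration details differ between binding versions, so treat the names below (e.g. PropertyType.floatVector) as assumptions and check the ObjectBox Python docs:

```python
import numpy as np
from objectbox.model import Entity, Id, Property, PropertyType


@Entity(id=1, uid=1)
class Image:
    id = Id(id=1, uid=1001)
    url = Property(str, id=2, uid=1002)
    # 32-bit float vector embedding computed by an ML model
    embedding = Property(np.ndarray, type=PropertyType.floatVector, id=3, uid=1003)
```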

There is more…

The support for vector types is not the only new feature: the ObjectBox Flutter database comes with several fixes, and our Python database binding now also supports date types. For details, please check the changelog of the Dart DB vector release or the Python DB vector release.

Vector types (aka arrays) added with ObjectBox Java 3.6 release

Vector embeddings (multi-dimensional vectors) are a central building block for AI applications. Accordingly, the ability to store vectors to add long-term memory to your AI applications (e.g. via vector databases) is gaining importance. Sounds fancy, but for the basic use cases, this simply boils down to “arrays of floats” for developers. And this is exactly what the ObjectBox database now supports natively. If you want to use vectors on the edge – e.g. in a mobile app or on an embedded device, when offline, independent of an Internet connection, without the unknown latency – try it…

See the release notes for all new features this release brings.

Code Examples

Let’s start with a simple example: assume some shapes that use a palette of RGB colors. An entity for this might look like this:
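A minimal sketch (field names are illustrative):

```java
import io.objectbox.annotation.Entity;
import io.objectbox.annotation.Id;

@Entity
public class Shape {
    @Id public long id;

    // The RGB color values used by this shape, stored natively as an array.
    public int[] palette;
}
```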

We can now create a query to find all shapes that use a certain color:
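For example, to find all shapes using red – this sketch assumes the array query condition is called containsElement, so check the release notes for the exact condition name:

```java
// Assumes a Box<Shape> named box and the generated Shape_ meta class.
int red = 0xFF0000;
try (Query<Shape> query = box.query(Shape_.palette.containsElement(red)).build()) {
    List<Shape> shapesWithRed = query.find();
}
```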

Another typical use case is the embedding of certain types of data, like text, audio or images, as vector coordinates. To store such a vector embedding, in the following example we store the floating point coordinates that were computed by a machine learning model for an image together with a reference to the actual image:
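A sketch of such an entity (names are illustrative):

```java
import io.objectbox.annotation.Entity;
import io.objectbox.annotation.Id;

@Entity
public class ImageEmbedding {
    @Id public long id;

    // Reference to the actual image, e.g. a file path or URL.
    public String url;

    // The vector coordinates computed by an ML model for the image.
    public float[] coordinates;
}
```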

Ready to go?

To update to this release, change the version of objectbox-gradle-plugin to 3.6.0.

To add ObjectBox database to your JVM or Android project read our Getting Started guide.
As always, we look forward to your feedback on GitHub or via our anonymous feedback form and hope you have a great time building apps with ObjectBox! ❤️

Vector databases – a look at the AI database market with a comprehensive comparison matrix

⭐ What are vector databases? ⭐ What do you need them for? ⭐ Who is in the market?

Includes a comparison matrix of vector database options like Pinecone, Milvus, Vespa, Vald, Chroma, Marqo AI, Weaviate, and Qdrant

With 350M+ USD invested in AI / vector databases in recent months, one thing is clear: the vector database market is hot 🔥 Everyone, not just investors, is interested in the booming AI market. While AI applications have dominated the news for quite some time, the infrastructure software that supports these applications, such as vector databases, is finally gaining attention too. In the following, we’ll have a look at why vector databases are gaining attention and compare current vector database alternatives.

What is a vector database? 

A vector database stores vectors, or more precisely, vector embeddings. A vector database therefore is a specialised type of database designed to store and manage large sets of vectors efficiently. However, the challenge and value do not come from simply being able to store vectors: the value is created by the types of computations that can be run over the stored vector data and the speed with which these computations can be run, e.g. similarity searches.

Vector databases are essentially an important piece of the AI tech stack. They can be used e.g. to give LLMs (Large Language Models) – or more broadly speaking, AI applications – a long-term memory and faster search and querying capabilities. Another important use case is RAG (Retrieval-Augmented Generation).

To give some context: the most traditional databases, SQL databases, store data in rows and columns; graph databases store graphs; and object databases store objects.

Because Large Language Models and AI applications rely on vector embeddings, vector databases are especially apt at supporting AI applications. 

Accordingly, vector databases are becoming a critical layer in the AI tech stack; they are sometimes also called “AI databases”. However, databases tend to converge over time, meaning that many databases support several different database models.

What is a vector embedding?

A vector embedding is a list of numbers that represent objects and relationships, allowing unstructured data (such as images) to be searched and used. Typically, Large Language Models (more precisely the underlying Machine Learning (ML) algorithms) are used to create these vectors. The ML algorithms analyse large amounts of data to learn how to represent complex / unstructured data in a lower dimensional space (as vectors).

What does it have to do with nearest neighbour search?

Searchability (making unstructured data usable) is at the heart of this concept. The distance between vector embeddings expresses the similarity of the vectors (and thus of the objects they represent). Therefore, as you are searching for the most similar data, the so-called “nearest neighbour search” is a key concept in vector databases, and the time required to find the nearest neighbours is essential.
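As a tiny illustration of “distance expresses similarity”, here is one commonly used similarity measure, cosine similarity, in Dart:

```dart
import 'dart:math';

// Cosine similarity between two equal-length vectors:
// close to 1.0 = very similar, 0.0 = unrelated, -1.0 = opposite.
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (sqrt(normA) * sqrt(normB));
}
```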

Do we need special vector databases?

There is an ongoing discussion about whether dedicated vector databases are needed, or whether they do not warrant a new category in the database landscape and vector extensions of traditional databases could support the AI market instead. Both are reasonable expectations, and time will tell. Notable databases that have already added a vector extension include e.g. Redis and Elasticsearch. Additionally, more and more databases now allow storing vector types.

What does the vector database landscape look like?

To get a picture of the current market situation, we compare the choices with the most traction, excluding established players that have added vector capabilities to their existing database offering. Generally speaking, we see a lot of very young companies, some companies that pivoted from their original specialization, and massive fundings. Please note: the comparison is not optimized to be readable on mobile or small screens (there is just a trade-off between providing the information and making it readable on every device).

If you’re on mobile, use this link to view a version that is readable on mobile.

Compared attributes: open source status and license, GitHub stars, implementation language, summary, business model, embedded / used technology, founding or first-release date, in-memory support, sharding, index types, consistency model, benchmark approach, queries per second (QPS) and 95th-percentile latency in ms (both on nytimes-256-angular), ANN support, funding, who’s behind it, and HQ location.

Marqo AI (HQ: 🇦🇺; behind it: S2Search Australia Pty Ltd)
  • Open source: Y (Apache-2.0); 2.8k ⭐; developed in Python
  • Summary: a tensor-based, cloud-native, commercial open source search and analytics engine
  • Business model: Open SaaS; tensor-based
  • Sharding: Y; index type: HNSW
  • Funding: undisclosed pre-seed in May 2022

Weaviate (HQ: 🇪🇺; behind it: SeMI Technologies)
  • Open source: Y (BSD); 5.6k ⭐; developed in Assembly, C++, GoLang
  • Summary: a commercial open source cloud-native vector database that stores both objects and vectors
  • Business model: Open SaaS; started in 2018 as a traditional graph database, first released in 2019
  • In-memory: N; sharding: Y (static); index type: a custom HNSW PQ algorithm that supports CRUD; consistency: eventual
  • Benchmarks: not comparative, just evaluating their own performance; 791 QPS; 2 ms latency
  • ANN: Y (multiple ANN algorithms, as long as they support full CRUD)
  • Funding: 67.7M USD, series B

Chroma (HQ: 🇺🇸; behind it: Chroma Inc.)
  • Open source: Y (Apache-2.0); 4.4k ⭐; developed in Python & TypeScript
  • Summary: a commercial open source vector database
  • Business model: preparing a (partly open) SaaS model
  • Embeds / uses: HNSW lib, DuckDB; based on ClickHouse; started around 2022
  • In-memory: N; sharding: dynamic segment placement
  • ANN: Y
  • Funding: 20.3M USD, seed

Qdrant (HQ: 🇪🇺; behind it: Qdrant Solutions GmbH)
  • Open source: Y (Apache-2.0); 6.6k ⭐; developed in Rust
  • Summary: a commercial open source vector similarity search engine and vector database
  • Business model: Open SaaS; embeds RocksDB; first released in 2021
  • In-memory: Y; sharding: Y (static); index type: HNSW (SQ & PQ); consistency: eventual, tunable
  • Benchmarks: compares to Weaviate, Milvus, Elastic (note: Redis took too long to complete); 326 QPS; 4 ms latency
  • ANN: Y
  • Funding: 9.8M €

Milvus (HQ: 🇺🇸; behind it: Zilliz)
  • Open source: Y (Apache-2.0); 18k ⭐; developed in GoLang & Python
  • Summary: a cloud-native commercial open source vector database
  • Business model: (partly open) SaaS
  • Embeds / uses: their initial blog post said SQLite, but they have since said RocksDB – exchanged? They also have a ChatGPT cache that is built on SQLite and say “Milvus uses SQLite or MySQL to manage metadata”
  • Founded in 2017, first released in 2019
  • In-memory: N; sharding: dynamic segment placement; index types: ANNOY, HNSW, IVF_PQ, IVF_SQ, IVF_FLAT, FLAT, IVF_SQ8_H, RNSG
  • Consistency: strong, bounded staleness, session, and eventual (the default consistency level in Milvus is bounded staleness)
  • Benchmarks: not comparative; 2406 QPS; 1 ms latency
  • ANN: Y
  • Funding: 113M USD, series B

Vespa (HQ: 🇺🇸; behind it: Yahoo!)
  • Open source: Y (Apache-2.0); 4.4k ⭐; developed in Java & C++
  • Summary: a commercial open source vector database by Yahoo!; a search engine that supports vector search, lexical search, and search in structured data
  • Business model: Open SaaS; originally a web search engine (alltheweb), acquired by Yahoo! in 2003 and later open sourced as Vespa in 2017; spun off in Oct 2023, raised a series A in Nov 2023
  • In-memory: maintains disk and memory structures for documents; sharding: Y; index type: custom HNSW (multi-vector hybrid HNSW-IF); consistency: eventual
  • Benchmarks: not comparative
  • ANN: Y
  • Funding: spinoff from Yahoo! in Oct 2023, then raised a 31M USD series A

Vald (HQ: 🇯🇵; behind it: Yusuke Kato and Kiichiro Yukawa, Yahoo Japan Corporation)
  • Open source: Y (Apache-2.0); 1.2k ⭐; developed in GoLang
  • Summary: a cloud-native open source distributed approximate nearest neighbor (ANN) dense vector search engine
  • Business model: community project; currently it looks like no commercial interests are pursued
  • Embeds / uses: the vector search engine NGT; technology incubation at Yahoo Japan Corporation, development started in 2019
  • In-memory / sharding / index types: N/A
  • Benchmarks: not comparative (Vald performance only)
  • ANN: Y (NGT)
  • Funding: –

Pinecone (HQ: 🇺🇸; behind it: Pinecone Systems Inc)
  • Open source: N (proprietary); GitHub stars: n/a
  • Summary: a fully managed vector database that specializes in enabling semantic search capabilities
  • Business model: SaaS; built on top of Faiss; first released in 2019
  • In-memory: N; sharding: Y; index type: proprietary; consistency: eventual
  • Benchmarks: more a programming-language comparison for vector databases; 150 QPS (for p2, but more pods can be added); 1 ms latency (batched search, 0.99 recall, 200k SBERT)
  • ANN: Y (proprietary), plus KNN (with Faiss)
  • Funding: 138M USD, series B

Want to know more about the vector database market?

Here are some more questions answered for anyone interested

What is an "Open SaaS" business model?

Software as a Service (SaaS) refers to software that is managed / hosted for the client and is essentially “rented”. The “open” in Open SaaS refers to the open source software that is being offered as such a service.

This frequently implies that not all code is open source, particularly the parts that belong to the managed service / hosting and associated value-adding features. Note: the open source software offered in this manner may or may not be provided by the company offering it as a service. This has caused some friction in the open source community, as original creators often struggle to make a living and/or maintainers struggle to keep maintaining the software – while other companies profit. Most famously, huge cloud providers have taken advantage of this option, leading to new licenses that keep the source open but restrict others from hosting it as a service without donating the whole source code back to the community.

Why should I care about index types?

Indexes are essentially a way to speed up searching a database. There are several established index types for vector databases, and they affect the performance of the database, e.g. the time it takes a query to complete.

What about benchmarks?

You will see, if you review the benchmarks given above, that results typically vary. Benchmarks are difficult to do, and neutral benchmarks even more so. Certain use cases may favor certain solutions. Ideally, you therefore benchmark based on your specific use case… but as a first evaluation, try to understand the basic influencing factors and look at a handful of benchmarks and explanations. Having said all this: there is a benchmarking tool available for approximate nearest neighbor (ANN) search algorithms. If you use it, you can compare the performance of different databases (with regard to ANN search) for the same setup, based on the same approach. Also, the underlying libs often used by databases (like NGT and HNSW, see above) have already been benchmarked with it, and you can compare to these directly.

Why is the market so hot, how can companies raise so much money?

AI is hot, everyone agrees that data and its management will be key to future success, and the database market is interesting: it is a long-established market with many players, yet it still demonstrates continually good growth (e.g. 17% in 2020). And database market history shows that from time to time a new type of database comes up and, with it, a new market category is created. In such a market, the market creator typically “takes all” (not quite literally, but such a significant share – definitely the vast majority – that all other players are not attractive from a VC perspective). Such a market could easily be worth 100M+ USD in ARR. Examples from the last 20 years: MongoDB (NoSQL databases), Cockroach (NewSQL databases), Neo4j (Graph databases), Influx (Time-Series databases). So, VCs are looking to find the next new type of database that can create a market… maybe it will be vector databases? However, the database market has also been shown to take 10+ years for players to become profitable, so expect a long-term game. The race is still on for Edge Databases, we think 🙂

Want to know more about the database market?

We recommend checking out db-engines. The website compares all relevant systems and has tons of data from the last 20 years. Note: they only add databases once they have some traction and notability, not just any hobby project. Accordingly, not all databases from the above comparison have been added to the website yet.