Vector Databases for Edge AI

Q: What are vector embeddings?

Vector embeddings represent (multimodal) data as n-dimensional vectors, meaning n-dimensional matrices, which comes down to a set of numbers. This is what makes them easy and efficient to compute with. The power of vector embeddings lies in their ability to capture the essence of the data they represent. Vector embeddings basically are “the output of the process of learning embeddings”, which is done by feeding raw input data, like texts, images, words, into a trained AI model.[14] In this process, the input data is translated into a lower-dimensional space. The result is a set of n-dimensional vectors in an embedding space. The process of embedding [15] The embedding space is specific to the data on which the embeddings were trained, but it can be transferred to other tasks and domains via transfer learning. Two embeddings that are close together in the embedding space indicate that the data they represent is similar.[16] Once generated, the vector embeddings can be stored for efficient retrieval and use by AI / ML apps.[17] We visualized the process below. Vector Database initial preparation [26]

The intersection of AI and Edge Computing is where Edge AI happens – and it needs databases that support AI and can run on the edge (for lack of a better term “Edge Vector Databases”, also refered to as On-device vector databases or local vector databases). Vector Databases are the databases for AI and are an important piece of the AI tech stack. Edge Databases are databases that can run on edge devices.

Edge Vector Databases are the basis for Edge AI

Edge Vector Databases – the intersection of Edge Computing and AI needs a database

In 2023, the Edge Computing Market is estimated to be at $53B,[1] while the AI market is expected to reach a whopping $87B.[2] Both markets are expected to grow dramatically in the coming years with the two technologies enhancing each other. For many use cases it is advantageous, and oftentimes necessary, to use both Edge Computing and AI in conjunction. This is what is called Edge AI and Gartner prognosed its plateau within 2023-2024.[3] In fact, recently, Gartner named Edge AI as one of the breakthrough technologies of 2023 due to the growing demand for real-time AI solutions and the need for decentralized data processing.[4] The global Edge AI market size was valued at roughly $14.5B in 2022 with expected CAGRs of 20-30% from 2023 to 2030.[5] In this article we will take a closer look at the use of vector databases in Edge AI.

The AI market: AI model trainings vs. using trained AI models

To understand the AI market, it is important to distinguish between the AI model generation, initial training, and the use of these models.

AI model training

AI model training is the most resource-intensive and costly part of AI – and is considered hard.[6] The larger the model, the higher the costs and the more time it needs. Also with models getting larger, the training fails more often and must be restarted, adding to duration and costs. Initially, AI models were specifically trained for one task only. Now, however, a new type of AI models, namely, foundation models, have evolved. They are general-purpose models. This development was a main driver of the current AI boom we are seeing.

Foundation models

Foundation models (also base models) are large-scale AI systems trained on a vast quantity of data at scale in such a way that they can be used for a wide variety of tasks. Costs for training such large foundation AI models from scratch (GPT-4) are typically in the millions of USD and the expectation is that large model training costs will go up to 500 million USD by 2030.[7] They can be “fine-tuned” (trained) with specific data sets (on top of the already trained model) to adapt them to specific environments,[8] e.g. GitHub Copilot is based on GPT-3.5 Turbo and was tweaked as well as additionally trained on the code from GitHub repos.[9] Fine-tuning is typically way less costly than training a specific one-task model from scratch (let alone the initial foundation model training). Examples of foundation models include the GPT-models from OpenAI, LLaMa, and Gemini. LLMs (Large Language Models) are basically a specific subset of foundation models.

Using trained AI models

Using trained foundation models (like GPT-4 or LLaMA), additionally tuning them with specific data sets, is what is empowering most current AI tools / apps and all the innovations we are seeing around that. Basically, at this moment, a handful of popular foundation models are empowering a whole and currently thriving ecosystem. As AI models use vector embeddings, using trained models can be supported and enhanced with a vector database.[10] We’ll dive into this a bit more below.

📈 Side note on the market

From a market perspective, this means there is a relevant market entry barrier to training foundation models, and therefore it is reasonable to expect only few companies doing it (as opposed to many startups ;)). Yet, training new models is likely where the greatest innovation potential and most significant advancements lie. Given that the current AI market largely depends on large trained models (mainly foundation models) being deployed as free (often open source) models, there is a real risk of few tech giants owning the market later on, whereas the market that was built depending on those models starts stalling.[11] For example, OpenAI didn’t open source GPT-4 anymore, because they felt it could impede their business interests; Sam Altman even went as far as saying it was wrong to ever open it – likely they should change their name.[12]

At the same time, Sam Altman went lobbying around the world to regulate AI (model trainings) and enhance entry barriers. The most reasonable explanation I read is that they are trying to ensure corporate dominance, and likewise, the reason behind the movement to pause AI model training from other actors in the space is a move to avoid a monopoly by OpenAI and get meat in the game quickly.[13]

Vector Databases and their important role in AI

Many machine learning and deep learning algorithms, as well as the AI models described above, depend on vector embeddings / embeddings. The increasing demand for creating, storing, and managing these embeddings has led to the emergence of vector databases.

What are vector embeddings?

Vector embeddings represent (multimodal) data as n-dimensional vectors, meaning n-dimensional matrices, which comes down to a set of numbers. This is what makes them easy and efficient to compute with. The power of vector embeddings lies in their ability to capture the essence of the data they represent.

Vector embeddings basically are “the output of the process of learning embeddings”, which is done by feeding raw input data, like texts, images, words, into a trained AI model.[14]

In this process, the input data is translated into a lower-dimensional space. The result is a set of n-dimensional vectors in an embedding space.

The process of embedding [15]

The embedding space is specific to the data on which the embeddings were trained, but it can be transferred to other tasks and domains via transfer learning. Two embeddings that are close together in the embedding space indicate that the data they represent is similar.[16]

Once generated, the vector embeddings can be stored for efficient retrieval and use by AI / ML apps.[17] We visualized the process below.

Vector Database initial preparation [26]

How are embeddings generated?

Creating vector embeddings used to be a time-consuming process requiring domain experts and manual work. Today, however, there are many specialized models available for generating embeddings:

For text data, for example: Word2Vec, GloVe, and BERT. These models translate the semantic meaning of words and phrases into numerical form.
For image data, for example: VGG and Inception. These models capture visual characteristics of images and translate them into numerical form.

–> The Huggingface Model Hub offers many models that can create embeddings for different types of data. Anyone can do it with very little to no coding know-how.

But… how are embeddings generated?

There is no easy answer to this, at least I didn’t find one. If you really want to know, you will need to spend the time and effort to dig very deep. Today’s large language models were built over decades by many brilliant minds. They entail many fundamental concepts of converting different types of data (like words or images) into numerical representations. The following three fundamental concepts seem to be part of most LLMs and are worth having heard of:[18]

Encoding – Non-numerical, multimodal data needs to be transferred into numbers, so models can be created out of them.
Vectors – In order to store the encoded data and efficiently perform mathematical functions on them, encodings are stored as vectors (typically floating-point arrays).
Lookup matrices – Also known as lookup or hash tables; this table maps data to quickly jump from numerical to word representations (and back) across large chunks of text.

From the database perspective, the primary interest is in working efficiently with the generated embeddings [19]. Once a piece of information (a sentence, a document, an image) is a vector embedding and stored in a database, it’s time to get creative.

Where are vector embeddings used?

One of the most common vector embedding applications is in recommendation systems and search engines, e.g., Google Search uses embeddings to match text to text and text to images; Snapchat uses them to “serve individual ads depending on the user and time; and Meta (Facebook) uses them for social search.[20] But use cases are endless, e.g. they can also be used in chatbots, fraud detection, predictive maintenance, and autonomous driving. The ability to convert complex data into numerical form opens up endless applications. Generative AI applications typically work with vector embeddings. All of these applications benefit from using a vector database to enhance speed and efficiency.

How are vector databases used?

While a vector database can sometimes return responses directly, it typically returns results in conjunction with an LLM as we have depicted below. The vector database improves the accuracy of the LLM responses by using domain specific data from the database. More specifically: The vector search will give you the most relevant data for a specific query to provide to the LLM.[21] In both cases, the vector database helps reduce the number of queries to the LLM, which are costly, and also speeds up response times.

The use of vector databases: In use / production []

Vector databases: In use / production [26]

Accordingly, apart from efficiently handling vectors (as a data type), nearest neighbour search (primarily Approximate Nearest Neighbour (ANN) Search) is the most important feature of a vector database. The ANN algorithm finds the most similar vectors quickly. Additional filtering optimizes the result further, making queries even more efficient. Using the retrieved vectors (for context) and the initial query, an LLM generates the response. Typically, the response will be stored as a response embedding in the database, so over time, the database can serve more questions directly and / or improve the accuracy of the answer even more. This is depicted in the image above.

Why to use a vector database?

Vector databases enhance the efficiency and accuracy of AI applications, especially of those applications that are heavy on similarity searches, e.g. recommendation systems, natural language processing, or computer vision. Vector databases are also essential for scaling AI applications as the efficiency and speed starts to matter more. Through this, vector databases also bring costs down and heighten the sustainability of the AI application. On top, developers benefit from the additional functionalities a database offers for managing data, especially its querying capabilities, making any AI application more adaptable and therefore future-proof.

Edge vector databases – giving AI an edge

Edge Computing in a gist

Since data is produced and used everywhere (decentralized), using cloud computing for storage and processing is inefficient, wasteful, and often impossible. To unlock the value of decentralized data and drive digitization, you need to compute on the edge of the network (i.e. locally, closer to where data is generated). Gartner emphasizes Edge Computing’s importance for digital transformation.[22] To fully utilize Edge Computing, we need edge-specific infrastructure technologies, or “to make the edge as easy as the cloud for developers.” Edge databases are one such core infrastructure software. They enable rapid implementation of edge solutions by providing fast local data persistence (on the edge) and the capability to control and direct decentralized data flows (on the edge as well as in conjunction with a cloud).

Edge computing and edge databases can unlock decentralized data’s full potential, drive digital transformation, and create a sustainable and efficient data management landscape.

What is Edge AI?

Edge AI is the implementation of AI applications on the edge of the network without using a cloud, meaning, the necessary AI computations are performed on the edge directly, where the data is produced, e.g. in the car (onboard AI) or on a mobile device or simply within a specific location like a shop floor. A local Edge AI enables making decisions reliably in milliseconds, also when offline, and way cheaper. At this moment, this is particularly interesting for mission-critical use cases, offline-scenarios, and applications with high data security / privacy requirements. To run AI models directly on the edge, they need to be optimized for edge devices. The good news is, there are several AI models optimized for small devices available, e.g. Google’s Gecko, which is open source.[23] “Gecko is so lightweight that it can work on mobile devices and is fast enough for great interactive applications on-device, even when offline.”[24] Edge AI applications benefit tremendously from using a local vector database; only few use cases could do without one.

Edge AI basically offers the advantages of Edge Computing, whereas the disadvantages are more specific to AI.

Advantages of Edge AI (vs. cloud) [25]	Disadvantages of Edge AI (vs. cloud) [25]
Edge AI is faster, can guarantee QoS requirements, and works offline	Decentralized data access can be challenging and needs specific skills
Edge AI saves Internet bandwidth, cloud and networking costs (e.g. MNO costs)	The initial setup for the ongoing training of decentralized Edge AIs is more complex and therefore costly (while the cloud setup is quick and easy)
Edge AI helps ensure data privacy, data security, data ownership	Edge AI needs specific skills (entailing “oldschool programming skills”) – dev talent is hard to attract and expensive to keep
Edge AI is more sustainable to run (less wasteful data traversal meaning less energy use, less costs for the energy and less CO2 emissions)	The heterogeneity of the edge makes it difficult to develop solutions for a wide range of devices
Edge AI is a young market and therefore holds opportunities to capture market share and competitive advantages	Edge AI is a young market and still lacks infrastructure software

Edge AI setup / architecture

There are generally two setups for Edge AI applications: Full edge setup, running the AI model on the edge devices directly, or a hybrid approach, using the cloud or a central server for the AI model.

Edge AI: general setup / architecture options

The edge / cloud approach has the advantage that the AI model (re-)training, enhancement will happen centrally, on the cloud without additional efforts. On the other hand, in the full edge setup, you have all the advantages of the edge (offline, cost-effective, private, …) and you can use the power of all edge devices, making it even more affordable and fast. However, the caveat lies in the challenge of organizing the learning. The individual local models diverge and you need to get these learnings distributed to all devices. This will be done in a “global AI model” (centrally).

Depending on the details, this can be done locally on a central server or in the cloud, or even on an edge device. Also, depending on the number of edge devices, connectivity, and need for speedy updates, the distribution can be organized in a more decentralized way, fully using the power of the edge. Once all devices are harmonized, any edge device could, if it has the capabilities from a hardware perspective, be the global model combining all updates and sending them out. This offers great advantages with regards to availability and resilience. You can find more about decentralized Edge AI setups under the term federated learning.

Edge AI – decentralized setup, using the full power of the edge

Summary: Vector Databases for the edge

According to our research and experience, no “Edge Vector Database” exists yet. The Edge Database market has always been limited to fewer players – and is certainly not as crowded as the central server / cloud database space. However, the cloud / server databases cannot be used on the edge (big to small just doesn’t work ;)), whereas Edge Databases can run anywhere and can sometimes be a good choice for a server / cloud setup.

Opinions differ on whether there is a need for specific, dedicated vector databases, or whether general databases will evolve to include vector support and become the go-to-solution. In any case, the vector database space is hot and has recently been added as a category on db-engines. Adding a new database category is a rare occurrence for db-engines, the established platform for databases. In any case, we can see that both types of databases are converging towards each other and we firmly believe that in the future, all databases will support vectors.

So, when it comes to Edge Databases, there are the same two options: Either someone will implement a dedicated edge vector database or Edge Databases will evolve to support vectors. Because so far we have seen neither, we have extended the ObjectBox Edge Database with vector support. And the huge advantage this brings is that ObjectBox already offers an excellent, highly efficient, and battle-tested out-of-the-box Sync that takes care of the “hard stuff” of decentralized data management for developers. As a developer tool that is self-hosted and can be used on all kinds of edge devices, and certainly on premise, it offers companies the flexibility to implement a myriad of applications, reduce costs (especially cloud and networking costs), while not jeopardizing data ownership in any way.

References and Notes

https://www.marketsandmarkets.com/Market-Reports/edge-computing-market-133384090.html
https://www.forbes.com/advisor/business/ai-statistics/
https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2022-gartner-hype-cycle
https://www.gartner.com/en/articles/4-emerging-technologies-you-need-to-know-about
https://www.grandviewresearch.com/industry-analysis/edge-ai-market-report
https://www.technologyreview.com/2023/05/12/1072950/open-source-ai-google-openai-eleuther-meta/
https://mpost.io/ai-model-training-costs-are-expected-to-rise-from-100-million-to-500-million-by-2030/
https://en.wikipedia.org/wiki/Foundation_models
https://en.wikipedia.org/wiki/GitHub_Copilot, https://github.com/features/preview/copilot-x
While this is great for the current landscape of AI-driven tools and apps, and thus the consumers, these are incremental innovations and will not take the foundation of AI forwards. So, there is a rightful fear that big corporations will discontinue open sourcing advancements, once they feel protecting their business interests outweighs the benefits they have from open sourcing.
https://www.technologyreview.com/2023/05/12/1072950/open-source-ai-google-openai-eleuther-meta/
https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview
https://analyticsindiamag.com/openai-has-stopped-caring-about-open-ai-altogether/
An AI training model is the initial version that goes through the training process, while the AI model refers to the trained and optimized version that is ready for deployment and inference in real-world applications.
Adapted from https://vickiboykis.com/what_are_embeddings/
If you are interested in understanding distances in vector databases more, read this post; it explains the most important commonly used distance algorithms in a straightforward and highly understandable way.
Note: Traditional databases can offer vector support too. Our guess is that there will be a consolidation of databases, with traditional databases moving towards vector support and vector databases moving towards supporting other (more traditional) database functionalities. In fact, we cannot imagine a future where any database will not support AI applications (as in: Support vectors, nearest neighbour search etc.).
Based on: https://vickiboykis.com/what_are_embeddings/
If you are interested in diving deeper, read this article, which is written in a highly understandable and comprehensive way, explaining everything you would need to know to develop a basic understanding.
https://huggingface.co/blog/getting-started-with-embeddings
https://hackernoon.com/how-llms-and-vector-search-have-revolutionized-building-ai-applications
https://www.gartner.com/en/documents/4263499
Though the merit of it being open source is unclear an ongoing discussion in the more legal oriented open source community.
https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
https://builtin.com/artificial-intelligence/edge-ai, https://objectbox.io/what-is-edge-computing/, https://objectbox.io/what-is-an-edge-database-and-why-do-you-need-one/, https://www.marketsandmarkets.com/Market-Reports/edge-ai-software-market-70030817.html, https://www.run.ai/guides/machine-learning-operations/edge-ai#What-Are-the-Benefits-Of-Edge-AI, https://cambrian-ai.com/wp-content/uploads/edd/2023/07/Large-Language-Models-On-Edge-Publication-FINAL.pdf
Adapted from https://www.swirlai.com/

Vector Databases for Edge AI

The AI market: AI model trainings vs. using trained AI models

AI model training

Foundation models

Using trained AI models

Vector Databases and their important role in AI

What are vector embeddings?

How are embeddings generated?

But… how are embeddings generated?

Where are vector embeddings used?

How are vector databases used?

Why to use a vector database?

Edge vector databases – giving AI an edge

Edge Computing in a gist

What is Edge AI?

Advantages of Edge AI (vs. cloud) [25]

Disadvantages of Edge AI (vs. cloud) [25]

Edge AI setup / architecture

Summary: Vector Databases for the edge

References and Notes