Edge AI: The era of on-device AI

by Vivien | Apr 23, 2024 | AI, Edge AI, Edge Computing, Mobile Database, vector database

AI anywhere and anytime - free from Internet dependencies & 100% private

Edge AI is an often overlooked aspect of AI’s natural evolution. It is basically the move of AI functionalities away from the cloud (or powerful server infrastructure) towards decentralized (typically less powerful) devices at the network’s edges, including mobile phones, smartwatches, IoT devices, microcontrollers, ECUs, or simply your local computer. Or, more broadly speaking: “Edge AI” means AI that works directly on-device, i.e. “local AI”.

Therefore, Edge AI apps work independently from an internet connection, offline as well as online. So, they are ideal for low, intermittent, or no connectivity scenarios. They are reliably available, more sustainable, and, because they avoid the network round trip, typically faster on-device than cloud-hosted alternatives. On-device AI apps can empower realtime AI anytime and anyplace.

Edge AI is where Edge Computing meets AI

That’s why Gartner believes that “more than 55% of all data analysis by deep neural networks will occur at the point of capture in an edge system by 2025.”

The importance of vector databases for AI applications

To enable powerful on-device AI applications, the on-device (edge) technology stack needs local vector databases. So, before diving deeper into Edge AI, we’ll dive into vector databases first. Skip this section if you are already familiar with them.

What is a vector database?

Just as SQL databases handle data in rows and columns, graph databases manage graphs, and object databases store objects, vector databases store and manage large data sets of vectors, or more precisely, vector embeddings. Because AI models work with vector embeddings, vector databases are basically the databases for AI applications. Vector databases offer a feature set of vector operations, most notably vector similarity search, that makes it easy and fast to work with vector embeddings in conjunction with AI models.
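
To make vector similarity search more tangible, here is a minimal sketch in plain Python. The class `TinyVectorStore`, its methods, and the three-dimensional toy vectors are assumptions made up for illustration; real embeddings have hundreds of dimensions and real vector databases use indexes instead of scanning every entry.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store with exact (brute-force) search."""

    def __init__(self):
        self._items = []  # list of (key, vector) pairs

    def put(self, key, vector):
        self._items.append((key, list(vector)))

    def search(self, query, k=2):
        """Return the k keys whose vectors are most similar to the query."""
        scored = [(cosine_similarity(query, vec), key) for key, vec in self._items]
        scored.sort(reverse=True)  # highest similarity first
        return [key for _, key in scored[:k]]


store = TinyVectorStore()
store.put("cat", [0.9, 0.1, 0.0])      # hand-picked toy vectors,
store.put("kitten", [0.8, 0.2, 0.1])   # not output of a real model
store.put("car", [0.0, 0.1, 0.9])

nearest = store.search([1.0, 0.0, 0.0], k=2)
print(nearest)
```

The query vector points in almost the same direction as "cat" and "kitten", so those two rank ahead of "car".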

When and why do you need a vector database?

Given the significance of vector embeddings (vectors) for AI models, particularly Large Language Models (LLMs) and AI applications, vector databases are now integral to the AI technology stack. They can be used to:

Train AI models (e.g. ML model training, LLM training)
Vector databases manage the datasets large models are trained on. Training AI models typically entails finding patterns in large datasets. Vector databases significantly speed up identifying patterns and finding relationships by enabling efficient retrieval of similar data points.

Speed up AI model / LLM responses
Vector databases use various techniques to speed up vector retrieval and similarity search, e.g. compression and filtering. They accelerate both model training and inference, thus enhancing the performance of generative AI applications. By optimizing vector retrieval and similarity search, vector databases can enhance the efficiency and scalability of AI applications that rely on high-dimensional data representations.

Add long-term memory to AI models and LLMs
Vector databases add long-term memory to AI applications in two ways: they persist the history to (1) continue tasks or conversations later as needed and (2) personalize and enhance the model for better-fitting results.

Enable Multimodal Search
Vector databases serve as the backbone to jointly analyze vectors from multimodal data (text, image, audio, and video) for unified multimodal search and analytics. The use of a combination of vectors from different modalities enables a deeper understanding of the information, leading to more accurate and relevant search results.

Enhance LLM responses, primarily via Retrieval-Augmented Generation (“RAG”)
With a vector database, you have additional knowledge to enhance the quality of a model’s responses and to decrease hallucinations; real-time updates, as well as personalized responses, become possible.

Perform Similarity Search / Semantic Retrieval
Vector databases are the heart and soul of semantic retrieval and similarity search. Vector search often works better than “full-text search” (FTS) as it finds related objects that share the same semantics/meaning instead of matching the exact keyword. Thus, it is possible to handle synonyms, ambiguous language, as well as broad and fuzzy queries.

Cache: Reduce LLM calls
Vector databases can cache queries together with their responses; a similar incoming query is then looked up in the cache before calling the LLM. This saves resources, time, and costs.
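
The caching idea from the last item can be sketched as follows. Everything here is an assumption for illustration: `SemanticCache`, `FakeLLM`, the similarity threshold of 0.95, and the hand-written query embeddings (a real app would obtain embeddings from an embedding model and call a real LLM).

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class SemanticCache:
    """Hypothetical cache: reuse a response if a similar query was seen before."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self._entries = []  # list of (query_embedding, response)

    def lookup(self, embedding):
        best_score, best_response = 0.0, None
        for cached_embedding, response in self._entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, embedding, response):
        self._entries.append((list(embedding), response))


class FakeLLM:
    """Stand-in for an expensive model call; counts how often it is used."""

    def __init__(self):
        self.calls = 0

    def generate(self, prompt):
        self.calls += 1
        return f"answer to: {prompt}"


def answer(prompt, embedding, cache, llm):
    cached = cache.lookup(embedding)
    if cached is not None:
        return cached              # cache hit: no LLM call needed
    response = llm.generate(prompt)
    cache.store(embedding, response)
    return response


cache, llm = SemanticCache(threshold=0.95), FakeLLM()
first = answer("What is Edge AI?", [0.70, 0.70, 0.10], cache, llm)
second = answer("Explain Edge AI", [0.69, 0.71, 0.10], cache, llm)   # near-duplicate
third = answer("How do I bake bread?", [0.05, 0.10, 0.99], cache, llm)
print(llm.calls)
```

The second query is close enough to the first to be served from the cache, while the unrelated third query still reaches the model. The threshold is the knob to tune: too low and users get answers to questions they did not ask, too high and the cache rarely hits.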

The shift to on-device computation (aka Edge Computing)

Edge Computing is, in essence, a decentralized computing paradigm. Building on it, AI on decentralized devices (aka Edge AI) becomes possible. Note: In computing, we have regularly seen shifts from centralized to decentralized computing and back again.

What is Edge Computing?

Our world is decentralized. Data is produced and needed everywhere, on a myriad of distributed devices like smartphones, TVs, robots, machines, and cars – on the so-called “edge” of the network. It would not only be unsustainable, expensive, and super slow to send all this data to the cloud, but it is also literally unfeasible. So, much of this data simply stays on the device it was created on. To harness the value of this data, the distributed “Edge Computing” paradigm is employed.

When and why do you need Edge Computing?

Edge Computing stores and processes data locally on the device it was created on, e.g. on IoT, Mobile, and other edge devices. In practice, Edge Computing often complements a cloud setup. The benefits of extending the cloud with on-device computing are:

- Offline-capability
  Storing and computing data directly on-device allows devices to operate independently from an Internet connection, which is crucial for remote locations (e.g. oil rigs in the ocean) or applications that need to always work (e.g., while the car is in underground garages, or in remote areas).
- Data ownership/privacy
  Cloud apps are fundamentally non-private and limit the user’s control over their own data. Edge Computing allows data to stay where it is produced, used, and where it belongs (with the user/on the edge devices). It therefore reduces data security risks, and data privacy and ownership concerns.
- Bandwidth constraints and the cost of data transmission
  Ever growing data volumes strain bandwidth and associated network/cloud costs, even with advanced technologies like 5G/6G networks. Storing data locally in a structured way at the edge, such as in an on-device database, is necessary to unlock the power of this data. At the same time, some of this data can still be made available centrally (in the cloud or on an on-premise server), combining the best of both worlds.
- Fast response rates and real-time data processing
  Doing the processing directly on the device is much faster than sending data to the cloud and waiting for a response (latency). With on-device data storage and processing, real-time decision making is possible.
- Sustainability
  By reducing data overhead and unnecessary data transfers, you can cut down 60-90% of data traffic, thereby significantly reducing the CO2 footprint of an application. A welcome side effect is that this also lowers costs tremendously.

Edge AI needs on-device vector databases

Every megashift in computing is empowered by specific infrastructure software, such as databases. Shifting from AI to Edge AI, we still see a notable gap: on-device support for vector data management (the typical AI data) and data synchronization capabilities (to update AI models across devices). To efficiently support Edge AI, vector databases that run locally, on edge devices, are as crucial as they are on servers today. So far, all vector databases are cloud / server databases and cannot run on restricted devices like mobile phones and microcontrollers. Moreover, they often don’t run on more capable devices like standard PCs either, or only with poor performance. To empower everyday AI that works anytime and all around us, we therefore need a database that runs performantly on a wide variety of devices on the edge of the network.

In fact, vector databases may be even more important on the edge than they are in cloud / server environments. On the edge, the tradeoff between accuracy and performance is a much more delicate line to walk, and vector databases are a way to balance the scales.

Edge AI Vector Databases for on-device use

On-device AI: Use Cases and why they need an Edge Vector Database

Seamless AI support where it is needed most, on everyday devices and all the things around us, needs an optimized local AI tech stack that runs efficiently on the devices. From private home appliances to on-premise devices in business settings, medical equipment in healthcare, digital infrastructure in urban environments, or just mobile phones, you name it: To empower these devices with advanced AI applications, you need local vector databases. From the broad scope of AI’s impact in various fields, let’s focus on some specific examples to make it more tangible: the integration of AI within vehicle onboard systems and the use of Edge AI in healthcare.

Vehicle onboard AI and edge vector databases – examples

Imagine a car crashing because the car software was waiting on the cloud to respond – unthinkable. The car is therefore one of the most obvious use cases for on-device AI.

Any AI application is only as good as its data. A car today is a complex distributed system on wheels, traversing a complex decentralized world. Its complexity is permanently growing due to increased data (7x more data per car generation), devices, and the number of functions. Making use of the available data inside the car and managing the distributed data flows is therefore a challenge in itself. Useful onboard AI applications depend on an on-device vector database (Edge AI). Some in-car AI application examples:

Advanced driver assistance systems (ADAS)
ADAS benefit from in-vehicle AI in many areas. Let’s look, for example, at driver behavior: By monitoring the driver’s eye and head movements, ADAS can determine when the driver shows signs of inattentive driving, e.g., drowsiness. Using an on-device database, the ADAS can use the historic data, the realtime data, and other car data, such as the driving situation, to deduce its action and issue alerts, avoid collisions, or suggest other corrective measures.
Personalized, next-gen driver experience
With an on-device database and Edge AI, an onboard AI can analyze driver behavior and preferences over a longer period of time and combine it with other available data to optimize comfort and convenience for a personalized driving experience that goes way beyond a saved profile. For example, an onboard AI can continually adjust the onboard entertainment system to the driver’s detected state, the driving environment, and the driver’s personal preferences.

Applications of Edge AI in Healthcare – examples

Edge Computing has seen massive growth in healthcare applications in recent years as it helps to maintain the privacy of patients and provides the reliability and speed needed. Artificial intelligence is also already in wide use, making healthcare smarter and more accurate than ever before. With the means for Edge AI at hand, this transformation of the healthcare industry will become even more radical. With Edge AI and on-device vector databases, healthcare can rely on smart devices to react in realtime to users’ health metrics, provide personalized health recommendations, and offer assistance during emergencies – anytime and anyplace, with or without an Internet connection, and all while ensuring data security, privacy, and ownership. Some examples:

Personalized health recommendations
By monitoring the user’s health data and lifestyle factors (e.g. sleep hours, daily sports activity) combined with their historic medical data, if available, AI apps can help detect early signs of health issues or potential health risks for early diagnosis and intervention. The AI app can provide personalized recommendations for exercise, diet, or medication adherence. While this case does not rely on real-time analysis and fast feedback as much as the previous example, it benefits from an edge vector database with regard to data privacy and security.
Point of care realtime decision support
By deploying AI algorithms on medical devices, healthcare providers can receive immediate recommendations, treatment guidelines, and alerts based on patient-specific data. One example where this is used with great success is surgery. An operating room today is a complex environment with many decentralized medical devices that requires teams to process, coordinate, and act upon several information sources at one time. Ultra-low latency streaming of surgical video into AI-powered data processing workflows on-site enables surgeons to make better-informed decisions, helps them detect abnormalities earlier, and lets them focus on the core of their task.

Edge AI: Clearing the Path for AI anywhere, anytime

For an AI-empowered world when and where needed, we still have to overcome some technical challenges. With AI moving so fast, however, this seems quite close. The move into this new era of ubiquitous AI needs Edge AI infrastructure. Only when Edge AI is as easy to implement and deploy as cloud AI will we see the ecosystem thriving and bringing AI functionalities that work anytime and anyplace to everyone. An important cornerstone will be on-device vector databases as well as new AI frameworks and models, which are specifically designed to address Edge Computing constraints. Some of the corresponding recent advances in the AI area include “LLM in a Flash” (a novel technique from Apple for effective inference of LLMs at the edge) and Liquid Neural Networks (designed for continuous learning and adaptation on edge devices). There’s more to come, follow us to keep your edge on Edge AI News.

In-Memory Database Use Cases

by Vivien | Feb 15, 2024 | Android, Edge Database, Mobile Database

ObjectBox was a purely disk-based database until now. Today, we added in-memory storage as a non-persistent alternative. This enables additional use cases requiring temporary in-process data. It’s also great for testing.

Disk + In-memory: simply use the best of both worlds

When opening a new database, you can now choose whether the database is stored on disk or in-memory. Because this is a per-database option, it is possible to use both types in your application. It’s very simple to use: when opening the store, instead of providing an actual directory, provide a pseudo-directory as a string with the prefix “memory:”. After the prefix, you pick a name for the database to address it, e.g. “memory:myApp”.

Note: in-memory databases are kept after closing a store; they have to be explicitly deleted or are automatically deleted when the creating process exits.
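
To illustrate the “memory:” naming convention, here is a small sketch of how such a directory string could be interpreted. This is not ObjectBox code: `StoreLocation` and `parse_store_location` are hypothetical names, and rejecting an empty name is an assumption. Please check the documentation of your ObjectBox language binding for the actual option to pass the directory.

```python
from dataclasses import dataclass

MEMORY_PREFIX = "memory:"


@dataclass(frozen=True)
class StoreLocation:
    """Hypothetical result type: where a database lives."""
    in_memory: bool
    name: str  # database name (in-memory) or directory path (on disk)


def parse_store_location(directory: str) -> StoreLocation:
    """Interpret a directory string using the 'memory:' prefix convention."""
    if directory.startswith(MEMORY_PREFIX):
        name = directory[len(MEMORY_PREFIX):]
        if not name:
            # Assumption: a name is required to address the database later.
            raise ValueError("expected a database name after 'memory:'")
        return StoreLocation(in_memory=True, name=name)
    return StoreLocation(in_memory=False, name=directory)


print(parse_store_location("memory:myApp"))
print(parse_store_location("/data/myApp-db"))
```

Because the choice is encoded in a single string, an application can switch between disk and memory (or use both side by side) through configuration alone.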

So, what are typical in-memory database use cases?

Caching and temporary data

If data is short-lived, it may not make sense to involve the disk with persistent storage. Unlike programming language containers like maps and hash tables, caches built on in-memory databases have advanced querying capabilities and support complex object graphs. For example, databases allow lookups by more than one key (e.g. ID, name, and URL) or deleting certain entries using a query. As ObjectBox is closely integrated with programming languages, putting and getting an object are typically just “one liners” similar to map and hash table containers.
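
The difference to a plain map can be sketched with any in-memory database. The example below uses SQLite’s in-memory mode from Python’s standard library purely as a stand-in; it is not the ObjectBox API, and the table and column names are made up for illustration.

```python
import sqlite3

# ":memory:" creates a database that lives only in RAM (SQLite feature).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE page_cache (
           id INTEGER PRIMARY KEY,
           name TEXT UNIQUE,
           url TEXT UNIQUE,
           body TEXT,
           fetched_at INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO page_cache VALUES (?, ?, ?, ?, ?)",
    [
        (1, "home", "https://example.com/", "<h1>Home</h1>", 100),
        (2, "docs", "https://example.com/docs", "<h1>Docs</h1>", 200),
        (3, "blog", "https://example.com/blog", "<h1>Blog</h1>", 300),
    ],
)

# Lookups by more than one key: ID, name, or URL.
by_id = conn.execute("SELECT name FROM page_cache WHERE id = ?", (2,)).fetchone()[0]
by_name = conn.execute("SELECT url FROM page_cache WHERE name = ?", ("blog",)).fetchone()[0]
by_url = conn.execute(
    "SELECT id FROM page_cache WHERE url = ?", ("https://example.com/",)
).fetchone()[0]

# Evict stale entries with a single query instead of iterating over a map.
conn.execute("DELETE FROM page_cache WHERE fetched_at < ?", (250,))
remaining = [row[0] for row in conn.execute("SELECT name FROM page_cache ORDER BY id")]
print(by_id, by_name, by_url, remaining)
```

With a hash map you would need one map per key and a manual loop for eviction; the database keeps the indexes consistent for you.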

Bringing “online-only” and “offline-first” apps closer together

Let’s say you want to start simple by creating an application that always fetches the data from the cloud. You can put that data in an in-memory database (similar to the caching approach above). The data is available (“cached”) for all app components via a common Box-based API, which is already great. But let’s say later on, you want to go “offline-first” with your app to respond quicker to user requests and save cloud and/or mobile networking operator (MNO) costs. Since you are already using the Box-based API, you simply “turn on persistence” by using a disk-based database instead.

Performance and app speed

Shouldn’t this be the first point in the list? Well, ObjectBox did already operate at “in-memory speed” for mostly-read scenarios even though it used a disk-based approach. So, do not expect huge improvements for reads. Writes (Create, Update, Delete) are different though: to fully support ACID, a disk-based database must wait on the disk to fully complete the operation. Contrary to this, an in-memory database can immediately start the next transaction.

Diskless devices

Some small devices, e.g. sensors, may not have a disk or an accessible file system. This update makes it possible to run ObjectBox here too. This can be an interesting combination with ObjectBox Sync and automatically getting data from another device.

Testing

For example in unit tests, you can now spin up ObjectBox databases even faster than before, e.g. opening and closing a store in less than a millisecond.

“Transactional memory”

In concurrent (multi-threaded) scenarios, you may want to provide transactionally consistent views (or “checkpoints”) of your data. Let’s say bringing the data from one consistent view to another is a rather complex operation involving the modification of several objects. In such cases, locking may be a concern (complex or blocking), so having an in-memory database may be a nice alternative. It “naturally” offers transactions and thus a transactionally safe view on the data. Thus, you can always read consistent data without worrying about data being modified at the same time. Also, you never have to wait for a modifying thread to finish.
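
One way to picture such consistent views is a copy-on-write store: a writer works on a private copy and publishes it in one step, so readers only ever see a complete “before” or “after” state. The sketch below is a simplified illustration, not ObjectBox’s implementation; `SnapshotStore` is a hypothetical class, and the shallow copy assumes the stored values are immutable.

```python
import threading


class SnapshotStore:
    """Hypothetical copy-on-write store with transactional writes."""

    def __init__(self, initial=None):
        self._data = dict(initial or {})
        self._write_lock = threading.Lock()  # serializes writers only

    def snapshot(self):
        """Readers get the current state; it is never mutated afterwards."""
        return self._data

    def write(self, update_fn):
        """Apply update_fn to a private copy, then publish it in one step."""
        with self._write_lock:
            working_copy = dict(self._data)   # copy all data (simple, not fast)
            update_fn(working_copy)           # if this raises, nothing is published
            self._data = working_copy         # swap the reference


store = SnapshotStore({"account_a": 100, "account_b": 0})
before = store.snapshot()


def transfer(data):
    # Two modifications that must become visible together.
    data["account_a"] -= 30
    data["account_b"] += 30


store.write(transfer)
after = store.snapshot()
print(before, after)
```

A reader holding `before` still sees the old, consistent state after the write, and no reader ever observes the money missing from both accounts. Copying everything per write is the price of this simplicity.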

What’s next?

This is only the first version of our in-memory store. Consider it a starting point for more to come:

Performance: to ship early, we made rather big performance tradeoffs. At this point, starting a new write transaction will copy all data internally, which of course is not great for performance. A future version will be a lot smarter than that.
Persistence: While this version is purely in-memory without persistence, we want to add persistence gradually. This will include a write-ahead-log (WAL) and snapshots. This combination may even become preferable to the default disk-based store for some scenarios.

We are currently rolling out the in-memory feature to all languages supported by ObjectBox.

Let us know your thoughts

Vector Databases for Edge AI

by Vivien | Aug 9, 2023 | AI, Edge Database, vector database

The intersection of AI and Edge Computing is where Edge AI happens – and it needs databases that support AI and can run on the edge (for lack of a better term “Edge Vector Databases”, also referred to as on-device vector databases or local vector databases). Vector Databases are the databases for AI and are an important piece of the AI tech stack. Edge Databases are databases that can run on edge devices.

Edge Vector Databases are the basis for Edge AI

Edge Vector Databases – the intersection of Edge Computing and AI needs a database

In 2023, the Edge Computing Market is estimated to be at $53B,[1] while the AI market is expected to reach a whopping $87B.[2] Both markets are expected to grow dramatically in the coming years with the two technologies enhancing each other. For many use cases it is advantageous, and oftentimes necessary, to use both Edge Computing and AI in conjunction. This is what is called Edge AI, and Gartner predicted it would reach its plateau within 2023-2024.[3] In fact, recently, Gartner named Edge AI as one of the breakthrough technologies of 2023 due to the growing demand for real-time AI solutions and the need for decentralized data processing.[4] The global Edge AI market size was valued at roughly $14.5B in 2022 with expected CAGRs of 20-30% from 2023 to 2030.[5] In this article we will take a closer look at the use of vector databases in Edge AI.

The AI market: AI model trainings vs. using trained AI models

To understand the AI market, it is important to distinguish between the AI model generation, initial training, and the use of these models.

AI model training

AI model training is the most resource-intensive and costly part of AI – and is considered hard.[6] The larger the model, the higher the costs and the more time it needs. Also, as models get larger, training fails more often and must be restarted, adding to duration and costs. Initially, AI models were specifically trained for one task only. Now, however, a new type of AI model, namely the foundation model, has evolved. Foundation models are general-purpose models. This development was a main driver of the current AI boom we are seeing.

Foundation models

Foundation models (also base models) are large-scale AI systems trained on a vast quantity of data at scale in such a way that they can be used for a wide variety of tasks. Costs for training such large foundation AI models from scratch (e.g. GPT-4) are typically in the millions of USD, and the expectation is that large model training costs will go up to 500 million USD by 2030.[7] They can be “fine-tuned” (trained) with specific data sets (on top of the already trained model) to adapt them to specific environments,[8] e.g. GitHub Copilot is based on GPT-3.5 Turbo and was tweaked as well as additionally trained on the code from GitHub repos.[9] Fine-tuning is typically way less costly than training a specific one-task model from scratch (let alone the initial foundation model training). Examples of foundation models include the GPT models from OpenAI, LLaMA, and Gemini. LLMs (Large Language Models) are basically a specific subset of foundation models.

Using trained AI models

Using trained foundation models (like GPT-4 or LLaMA), additionally tuning them with specific data sets, is what is empowering most current AI tools / apps and all the innovations we are seeing around that. Basically, at this moment, a handful of popular foundation models are empowering a whole and currently thriving ecosystem. As AI models use vector embeddings, using trained models can be supported and enhanced with a vector database.[10] We’ll dive into this a bit more below.

📈 Side note on the market

From a market perspective, this means there is a relevant market entry barrier to training foundation models, and therefore it is reasonable to expect only a few companies doing it (as opposed to many startups ;)). Yet, training new models is likely where the greatest innovation potential and most significant advancements lie. Given that the current AI market largely depends on large trained models (mainly foundation models) being deployed as free (often open source) models, there is a real risk of a few tech giants owning the market later on, whereas the market that was built depending on those models starts stalling.[11] For example, OpenAI no longer open-sourced GPT-4, because they felt it could impede their business interests; Sam Altman even went as far as saying it was wrong to ever open it – likely they should change their name.[12]

At the same time, Sam Altman went lobbying around the world to regulate AI (model training) and raise entry barriers. The most reasonable explanation I read is that they are trying to ensure corporate dominance, and likewise, the reason behind the movement to pause AI model training from other actors in the space is a move to avoid a monopoly by OpenAI and get skin in the game quickly.[13]

Vector Databases and their important role in AI

Many machine learning and deep learning algorithms, as well as the AI models described above, depend on vector embeddings / embeddings. The increasing demand for creating, storing, and managing these embeddings has led to the emergence of vector databases.

What are vector embeddings?

Vector embeddings represent (multimodal) data as n-dimensional vectors, i.e. ordered lists of n numbers. This is what makes them easy and efficient to compute with. The power of vector embeddings lies in their ability to capture the essence of the data they represent.

Vector embeddings basically are “the output of the process of learning embeddings”, which is done by feeding raw input data, like texts, images, words, into a trained AI model.[14]

In this process, the input data is translated into a lower-dimensional space. The result is a set of n-dimensional vectors in an embedding space.

The process of embedding [15]

The embedding space is specific to the data on which the embeddings were trained, but it can be transferred to other tasks and domains via transfer learning. Two embeddings that are close together in the embedding space indicate that the data they represent is similar.[16]
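
What “close together in the embedding space” means can be shown with a few lines of Python. The three-dimensional vectors below are hand-picked toy values (an assumption for illustration, not real model output), and Euclidean distance is just one of several common distance measures.

```python
import math
from itertools import combinations

# Toy embeddings; real ones come from a trained model and are much longer.
embeddings = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.1, 0.0, 0.9],
}

# Euclidean distance for every pair of words.
distances = {
    (a, b): math.dist(embeddings[a], embeddings[b])
    for a, b in combinations(embeddings, 2)
}

# The pair with the smallest distance is the most similar one.
closest_pair = min(distances, key=distances.get)
print(closest_pair, round(distances[closest_pair], 3))
```

Here “king” and “queen” end up far closer to each other than either is to “banana”, which mirrors their semantic relatedness.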

Once generated, the vector embeddings can be stored for efficient retrieval and use by AI / ML apps.[17] We visualized the process below.

Vector Database initial preparation [26]

How are embeddings generated?

Creating vector embeddings used to be a time-consuming process requiring domain experts and manual work. Today, however, there are many specialized models available for generating embeddings:

For text data, for example: Word2Vec, GloVe, and BERT. These models translate the semantic meaning of words and phrases into numerical form.
For image data, for example: VGG and Inception. These models capture visual characteristics of images and translate them into numerical form.

–> The Hugging Face Model Hub offers many models that can create embeddings for different types of data. Anyone can do it with very little to no coding know-how.

But… how are embeddings generated?

There is no easy answer to this, at least I didn’t find one. If you really want to know, you will need to spend the time and effort to dig very deep. Today’s large language models were built over decades by many brilliant minds. They entail many fundamental concepts of converting different types of data (like words or images) into numerical representations. The following three fundamental concepts seem to be part of most LLMs and are worth having heard of:[18]

Encoding – Non-numerical, multimodal data needs to be transferred into numbers, so models can be created out of them.
Vectors – In order to store the encoded data and efficiently perform mathematical functions on them, encodings are stored as vectors (typically floating-point arrays).
Lookup matrices – Also known as lookup or hash tables; this table maps data to quickly jump from numerical to word representations (and back) across large chunks of text.
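
The three concepts above can be sketched in a few lines. This is a deliberately naive toy, not how any particular LLM is implemented: the whitespace tokenizer, the names `vocabulary` and `embedding_matrix`, and the random vector values are assumptions for illustration (a trained model learns these values).

```python
import random

sentence = "edge ai runs on the edge"
tokens = sentence.split()  # naive whitespace tokenization

# Encoding: map each distinct token to an integer ID.
vocabulary = {}
for token in tokens:
    vocabulary.setdefault(token, len(vocabulary))
token_ids = [vocabulary[token] for token in tokens]

# Lookup matrix: one row of floats (a vector) per token ID.
EMBEDDING_DIM = 4
rng = random.Random(42)  # fixed seed so the toy values are reproducible
embedding_matrix = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in vocabulary
]

# Vectors: look up the row for each token ID.
vectors = [embedding_matrix[token_id] for token_id in token_ids]

# Reverse lookup: jump from numbers back to words.
id_to_token = {token_id: token for token, token_id in vocabulary.items()}
decoded = [id_to_token[token_id] for token_id in token_ids]

print(token_ids)
print(decoded)
```

Both occurrences of “edge” receive the same ID and therefore the same vector, which is exactly what makes the table a lookup structure.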

From the database perspective, the primary interest is in working efficiently with the generated embeddings [19]. Once a piece of information (a sentence, a document, an image) is a vector embedding and stored in a database, it’s time to get creative.

Where are vector embeddings used?

One of the most common vector embedding applications is in recommendation systems and search engines, e.g., Google Search uses embeddings to match text to text and text to images; Snapchat uses them to serve individual ads depending on the user and time; and Meta (Facebook) uses them for social search.[20] But use cases are endless, e.g. they can also be used in chatbots, fraud detection, predictive maintenance, and autonomous driving. The ability to convert complex data into numerical form opens up endless applications. Generative AI applications typically work with vector embeddings. All of these applications benefit from using a vector database to enhance speed and efficiency.

How are vector databases used?

While a vector database can sometimes return responses directly, it typically returns results in conjunction with an LLM as we have depicted below. The vector database improves the accuracy of the LLM responses by using domain specific data from the database. More specifically: The vector search will give you the most relevant data for a specific query to provide to the LLM.[21] In both cases, the vector database helps reduce the number of queries to the LLM, which are costly, and also speeds up response times.

Vector databases: In use / production [26]

Accordingly, apart from efficiently handling vectors (as a data type), nearest neighbour search (primarily Approximate Nearest Neighbour (ANN) Search) is the most important feature of a vector database. The ANN algorithm finds the most similar vectors quickly. Additional filtering optimizes the result further, making queries even more efficient. Using the retrieved vectors (for context) and the initial query, an LLM generates the response. Typically, the response will be stored as a response embedding in the database, so over time, the database can serve more questions directly and / or improve the accuracy of the answer even more. This is depicted in the image above.
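
The retrieve-then-generate flow described above can be sketched as follows. All names (`retrieve`, `build_prompt`, `stub_llm`), the documents, and their toy vectors are assumptions for illustration. Note that the search here is exact and brute-force; a real vector database would use an ANN index, and the stub would be replaced by an actual LLM call.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Toy "database": text, metadata for filtering, and a hand-picked embedding.
documents = [
    {"text": "Edge AI runs models directly on the device.",
     "topic": "edge", "vector": [0.9, 0.1, 0.1]},
    {"text": "On-device processing works without an Internet connection.",
     "topic": "edge", "vector": [0.8, 0.3, 0.1]},
    {"text": "Foundation models are expensive to train.",
     "topic": "training", "vector": [0.1, 0.9, 0.2]},
]


def retrieve(query_vector, topic=None, k=2):
    """Filter by metadata first, then rank the rest by similarity."""
    candidates = [d for d in documents if topic is None or d["topic"] == topic]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_vector, d["vector"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_docs):
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}"


def stub_llm(prompt):
    """Placeholder for a real model call."""
    return f"(stub) answered using {prompt.count('- ')} context passages"


question = "How does Edge AI work offline?"
query_vector = [1.0, 0.2, 0.1]  # would come from an embedding model
context_docs = retrieve(query_vector, topic="edge", k=2)
prompt = build_prompt(question, context_docs)
response = stub_llm(prompt)
print(prompt)
print(response)
```

The filter removes the unrelated “training” document before ranking, so the prompt only carries context that matches both the metadata and the meaning of the query.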

Why use a vector database?

Vector databases enhance the efficiency and accuracy of AI applications, especially of those applications that are heavy on similarity searches, e.g. recommendation systems, natural language processing, or computer vision. Vector databases are also essential for scaling AI applications, as efficiency and speed start to matter more at scale. Through this, vector databases also bring costs down and improve the sustainability of the AI application. On top, developers benefit from the additional functionalities a database offers for managing data, especially its querying capabilities, making any AI application more adaptable and therefore future-proof.

Edge vector databases – giving AI an edge

Edge Computing in a nutshell

Since data is produced and used everywhere (decentralized), using cloud computing for storage and processing is inefficient, wasteful, and often impossible. To unlock the value of decentralized data and drive digitization, you need to compute on the edge of the network (i.e. locally, closer to where data is generated). Gartner emphasizes Edge Computing’s importance for digital transformation.[22] To fully utilize Edge Computing, we need edge-specific infrastructure technologies, or “to make the edge as easy as the cloud for developers.” Edge databases are one such core infrastructure software. They enable rapid implementation of edge solutions by providing fast local data persistence (on the edge) and the capability to control and direct decentralized data flows (on the edge as well as in conjunction with a cloud).

Edge computing and edge databases can unlock decentralized data’s full potential, drive digital transformation, and create a sustainable and efficient data management landscape.

What is Edge AI?

Edge AI is the implementation of AI applications on the edge of the network without using a cloud, meaning the necessary AI computations are performed on the edge directly, where the data is produced, e.g. in the car (onboard AI), on a mobile device, or simply within a specific location like a shop floor. A local Edge AI enables making decisions reliably in milliseconds, even when offline, and at much lower cost. At this moment, this is particularly interesting for mission-critical use cases, offline scenarios, and applications with high data security / privacy requirements. To run AI models directly on the edge, they need to be optimized for edge devices. The good news is, there are several AI models optimized for small devices available, e.g. Google’s Gecko, which is open source.[23] “Gecko is so lightweight that it can work on mobile devices and is fast enough for great interactive applications on-device, even when offline.”[24] Edge AI applications benefit tremendously from using a local vector database; only a few use cases could do without one.

Edge AI basically offers the advantages of Edge Computing, whereas the disadvantages are more specific to AI.

Advantages of Edge AI (vs. cloud) [25]	Disadvantages of Edge AI (vs. cloud) [25]
Edge AI is faster, can guarantee QoS requirements, and works offline	Decentralized data access can be challenging and needs specific skills
Edge AI saves Internet bandwidth, cloud and networking costs (e.g. MNO costs)	The initial setup for the ongoing training of decentralized Edge AIs is more complex and therefore costly (while the cloud setup is quick and easy)
Edge AI helps ensure data privacy, data security, data ownership	Edge AI needs specific skills (entailing “oldschool programming skills”) – dev talent is hard to attract and expensive to keep
Edge AI is more sustainable to run (less wasteful data traversal meaning less energy use, less costs for the energy and less CO2 emissions)	The heterogeneity of the edge makes it difficult to develop solutions for a wide range of devices
Edge AI is a young market and therefore holds opportunities to capture market share and competitive advantages	Edge AI is a young market and still lacks infrastructure software

Edge AI setup / architecture

There are generally two setups for Edge AI applications: Full edge setup, running the AI model on the edge devices directly, or a hybrid approach, using the cloud or a central server for the AI model.

Edge AI: general setup / architecture options

The edge / cloud approach has the advantage that (re-)training and enhancing the AI model happens centrally in the cloud without additional effort. In the full edge setup, on the other hand, you have all the advantages of the edge (offline, cost-effective, private, …) and you can use the power of all edge devices, making it even more affordable and fast. However, the caveat lies in the challenge of organizing the learning: the individual local models diverge, and you need to get these learnings distributed to all devices. This is done via a “global AI model” (centrally).

Depending on the details, this can be done locally on a central server, in the cloud, or even on an edge device. Also, depending on the number of edge devices, connectivity, and the need for speedy updates, the distribution can be organized in a more decentralized way, fully using the power of the edge. Once all devices are harmonized, any edge device with sufficient hardware capabilities could host the global model, combining all updates and sending them out. This offers great advantages with regard to availability and resilience. You can find more about decentralized Edge AI setups under the term federated learning.

Edge AI – decentralized setup, using the full power of the edge
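To make the idea of a “global AI model” a bit more tangible, here is a minimal sketch of how local model updates could be combined by weighted averaging, which is the basic idea commonly associated with federated learning. This is an illustration under assumptions, not how any specific product does it: the class `FederatedAveragingSketch`, the `LocalUpdate` type, and the sample counts are hypothetical, and a real setup would also need transport, security, and versioning.

```java
import java.util.Arrays;
import java.util.List;

class FederatedAveragingSketch {

    /** Hypothetical container: weights trained on one edge device plus its number of training samples. */
    static final class LocalUpdate {
        final double[] weights;
        final int sampleCount;

        LocalUpdate(double[] weights, int sampleCount) {
            this.weights = weights;
            this.sampleCount = sampleCount;
        }
    }

    /** Combines local updates into global weights; devices with more samples count more. */
    static double[] aggregate(List<LocalUpdate> updates) {
        if (updates.isEmpty()) {
            throw new IllegalArgumentException("at least one update is required");
        }
        int dimension = updates.get(0).weights.length;
        double[] global = new double[dimension];
        long totalSamples = 0;
        for (LocalUpdate update : updates) {
            if (update.weights.length != dimension) {
                throw new IllegalArgumentException("all updates must have the same dimension");
            }
            totalSamples += update.sampleCount;
            for (int i = 0; i < dimension; i++) {
                global[i] += update.weights[i] * update.sampleCount;
            }
        }
        if (totalSamples == 0) {
            throw new IllegalArgumentException("total sample count must be positive");
        }
        for (int i = 0; i < dimension; i++) {
            global[i] /= totalSamples;
        }
        return global;
    }

    public static void main(String[] args) {
        // Two devices with made-up weights; the second one has seen three times as much data.
        List<LocalUpdate> updates = List.of(
                new LocalUpdate(new double[] {1.0, 2.0}, 100),
                new LocalUpdate(new double[] {3.0, 4.0}, 300));
        System.out.println(Arrays.toString(aggregate(updates)));
    }
}
```

In such a sketch, whichever node runs `aggregate` plays the role of the global model; as described above, this could be a central server, the cloud, or a sufficiently capable edge device.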

Summary: Vector Databases for the edge

According to our research and experience, no “Edge Vector Database” exists yet. The Edge Database market has always been limited to fewer players – and is certainly not as crowded as the central server / cloud database space. However, the cloud / server databases cannot be used on the edge (big to small just doesn’t work ;)), whereas Edge Databases can run anywhere and can sometimes be a good choice for a server / cloud setup.

Opinions differ on whether there is a need for specific, dedicated vector databases, or whether general databases will evolve to include vector support and become the go-to solution. In any case, the vector database space is hot and has recently been added as a category on db-engines, the established platform for databases; adding a new database category is a rare occurrence there. We can also see that both types of databases are converging, and we firmly believe that in the future, all databases will support vectors.

So, when it comes to Edge Databases, there are the same two options: Either someone will implement a dedicated edge vector database or Edge Databases will evolve to support vectors. Because so far we have seen neither, we have extended the ObjectBox Edge Database with vector support. And the huge advantage this brings is that ObjectBox already offers an excellent, highly efficient, and battle-tested out-of-the-box Sync that takes care of the “hard stuff” of decentralized data management for developers. As a developer tool that is self-hosted and can be used on all kinds of edge devices, and certainly on premise, it offers companies the flexibility to implement a myriad of applications, reduce costs (especially cloud and networking costs), while not jeopardizing data ownership in any way.

References and Notes

https://www.marketsandmarkets.com/Market-Reports/edge-computing-market-133384090.html
https://www.forbes.com/advisor/business/ai-statistics/
https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2022-gartner-hype-cycle
https://www.gartner.com/en/articles/4-emerging-technologies-you-need-to-know-about
https://www.grandviewresearch.com/industry-analysis/edge-ai-market-report
https://www.technologyreview.com/2023/05/12/1072950/open-source-ai-google-openai-eleuther-meta/
https://mpost.io/ai-model-training-costs-are-expected-to-rise-from-100-million-to-500-million-by-2030/
https://en.wikipedia.org/wiki/Foundation_models
https://en.wikipedia.org/wiki/GitHub_Copilot, https://github.com/features/preview/copilot-x
While this is great for the current landscape of AI-driven tools and apps, and thus the consumers, these are incremental innovations and will not take the foundation of AI forwards. So, there is a rightful fear that big corporations will discontinue open sourcing advancements, once they feel protecting their business interests outweighs the benefits they have from open sourcing.
https://www.technologyreview.com/2023/05/12/1072950/open-source-ai-google-openai-eleuther-meta/
https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview
https://analyticsindiamag.com/openai-has-stopped-caring-about-open-ai-altogether/
An AI training model is the initial version that goes through the training process, while the AI model refers to the trained and optimized version that is ready for deployment and inference in real-world applications.
Adapted from https://vickiboykis.com/what_are_embeddings/
If you are interested in understanding distances in vector databases more, read this post; it explains the most important commonly used distance algorithms in a straightforward and highly understandable way.
Note: Traditional databases can offer vector support too. Our guess is that there will be a consolidation of databases, with traditional databases moving towards vector support and vector databases moving towards supporting other (more traditional) database functionalities. In fact, we cannot imagine a future where any database will not support AI applications (as in: Support vectors, nearest neighbour search etc.).
Based on: https://vickiboykis.com/what_are_embeddings/
If you are interested in diving deeper, read this article, which is written in a highly understandable and comprehensive way, explaining everything you would need to know to develop a basic understanding.
https://huggingface.co/blog/getting-started-with-embeddings
https://hackernoon.com/how-llms-and-vector-search-have-revolutionized-building-ai-applications
https://www.gartner.com/en/documents/4263499
Though whether it really qualifies as open source is unclear and the subject of an ongoing discussion in the more legally oriented open source community.
https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
https://builtin.com/artificial-intelligence/edge-ai, https://objectbox.io/what-is-edge-computing/, https://objectbox.io/what-is-an-edge-database-and-why-do-you-need-one/, https://www.marketsandmarkets.com/Market-Reports/edge-ai-software-market-70030817.html, https://www.run.ai/guides/machine-learning-operations/edge-ai#What-Are-the-Benefits-Of-Edge-AI, https://cambrian-ai.com/wp-content/uploads/edd/2023/07/Large-Language-Models-On-Edge-Publication-FINAL.pdf
Adapted from https://www.swirlai.com/

Vector types (aka arrays) added with ObjectBox Java 3.6 release

by Vivien | Jun 1, 2023 | AI, Edge Database, Release, vector database

Vector embeddings (multi-dimensional vectors) are a central building block for AI applications. And accordingly, the ability to store vectors to add long-term memory to your AI applications (e.g. via vector databases) is gaining importance. Sounds fancy, but for the basic use cases, this simply boils down to “arrays of floats” for developers. And this is exactly what ObjectBox database now supports natively. If you want to use vectors on the edge, e.g. in a mobile app or on an embedded device, when offline, independent from an Internet connection, removing the unknown latency, try it…

See the release notes for all new features this release brings.

Code Examples

Let’s start with a simple example: let’s assume some shapes that use a palette of RGB colors. An entity for this might look like this:

@Entity
public class Shape {
    @Id
    public long id;

    // An array of RGB color values that are used by this shape.
    public int[] palette;
}

We can now create a query to find all shapes that use a certain color:

// Find all shapes that use red in their palette
try (Query<Shape> query = store.boxFor(Shape.class)
        .query(Shape_.palette.equal(0xFF0000 /* red */))
        .build()) {
    query.findIds();
}

Another typical use case is the embedding of certain types of data, like text, audio or images, as vector coordinates. To store such a vector embedding, in the following example we store the floating point coordinates that were computed by a machine learning model for an image together with a reference to the actual image:

@Entity
public class ImageEmbedding {
    @Id
    public long id;

    // Link to the actual image, e.g. on Cloud storage
    public String url;

    // The coordinates computed for this image (vector embedding)
    public float[] coordinates;
}

Ready to go?

To update to this release, change the version of objectbox-gradle-plugin to 3.6.0.

To add ObjectBox database to your JVM or Android project read our Getting Started guide.
As always, we look forward to your feedback on GitHub or via our anonymous feedback form and hope you have a great time building apps with ObjectBox! ❤️

Vector databases – a look at the AI database market with a comprehensive comparison matrix

by Vivien | May 30, 2023 | AI, Open Source, vector database

Vector databases - a look at the AI database market

⭐ What are vector databases? ⭐ What do you need them for? ⭐ Who is in the market? (Updated Oct 2024)

Includes a comparison matrix of vector database options like Pinecone, Milvus, Vespa, Vald, Chroma, Marqo AI, Weaviate, and Qdrant

In 2023, we saw record funding rounds for vector database players. Since then, almost every general-purpose database (like MongoDB, Elastic, Oracle MySQL etc.) has added vector search and related features, basically making them all vector databases too. There is an ongoing discussion about whether pure players are superior, but as always, the right answer is: “it depends”. Anyway, the vector database market is still very hot in Q4 of 2024 🔥

Of course, everyone, not just investors, is interested in the booming AI market. While AI applications have dominated the news for quite some time, the infrastructure software that supports these applications, such as vector databases, has finally gained more spotlight. In the following, we’ll have a look at why vector databases are gaining attention and compare current vector database alternatives.

What is a vector database?

A vector database stores vectors, or more precisely vector embeddings. A vector database therefore is a specialised type of database designed to store and manage large sets of vectors efficiently. However, the challenge and value are not derived from simply being able to store vectors. The value is created by the type of computations that can be run over the stored vector data and the speed with which these computations can be run, e.g. similarity searches.

Vector databases are essentially an important piece of the AI tech stack. They can be used e.g. to give LLMs (Large Language Models) – or more broadly speaking, AI applications – a long-term memory and faster search and querying capabilities. Another important use case is RAG (Retrieval-Augmented Generation).

To give some context: The most traditional databases, SQL databases, store data in rows and columns; graph databases store graphs and object databases store objects.

Because Large Language Models and AI applications rely on vector embeddings, vector databases are especially apt at supporting AI applications.

Accordingly, vector databases are becoming a critical layer in the AI tech stack; they are sometimes also called “AI databases”. However, databases tend to converge over time, meaning that many databases support several different database models.

What is a vector embedding?

A vector embedding is a list of numbers that represents objects and their relationships, allowing unstructured data (such as images) to be searched and used. Typically, Large Language Models (more precisely, the underlying Machine Learning (ML) algorithms) are used to create these vectors. The ML algorithms analyse large amounts of data to learn how to represent complex / unstructured data in a lower dimensional space (as vectors).

What do vector databases have to do with nearest neighbour search?

Searchability (making unstructured data usable) is at the heart of this concept. The distance between vector embeddings expresses the similarity of the vectors (and thus of the represented objects). Therefore, when you are searching for the most similar data, the so-called “nearest neighbour search” is a key concept in vector databases, and the time required to find the nearest neighbours is essential.
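As a rough illustration of what a nearest neighbour search does, the following sketch compares a query vector against every stored vector and returns the closest ones. It is a deliberately simple brute-force version that assumes Euclidean distance; the class and method names are made up for this example, and the vectors are toy values rather than real embeddings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NearestNeighbourSketch {

    /** Euclidean distance between two vectors of equal length (smaller means more similar). */
    static double euclideanDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Returns the indices of the k stored vectors closest to the query, nearest first. */
    static List<Integer> nearest(List<float[]> stored, float[] query, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            indices.add(i);
        }
        // Brute force: compute the distance to every stored vector and sort by it.
        indices.sort(Comparator.comparingDouble(
                (Integer i) -> euclideanDistance(stored.get(i), query)));
        return new ArrayList<>(indices.subList(0, Math.min(k, indices.size())));
    }

    public static void main(String[] args) {
        List<float[]> stored = List.of(
                new float[] {0f, 0f},
                new float[] {1f, 1f},
                new float[] {5f, 5f});
        float[] query = {0.9f, 1.2f};
        System.out.println(nearest(stored, query, 2)); // prints [1, 0]
    }
}
```

The cost of this approach grows with every stored vector, which is why vector databases rely on index structures such as HNSW to avoid scanning everything for each query.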

Do we need special vector databases?

There is already a discussion going on about whether special vector databases are needed, or whether they do not warrant a new category in the database landscape. Instead, vector extensions of traditional databases could be supporting the AI market. Both are reasonable expectations, and time will tell. Notable databases that have already added a vector extension include e.g. Redis and Elasticsearch. Additionally, more and more databases now allow storing vector types.

What does the vector database landscape look like?

To give an overview of the current market situation, we compare the options with the most traction, excluding established players that have added vector capabilities to their existing database offering. Generally speaking, we see a lot of very young companies, some companies that pivoted from their original specialization, and massive funding rounds.

Name

Open Source

License

GitHub stars

Developed in (language)

Summary

Business Model

Embeds / Uses

founding date / first released date

In-memory support

Sharding

Index Types

Consistency Model

Benchmarks (Performance?)

Approximate Nearest Neighbor (ANN) Vector Databases

Funding

Who's behind it

HQ in

ObjectBox

Apache-2.0

C++, supports native language APIs in Java, Flutter / Dart, Swift, Python, GoLang, and C++

ObjectBox is an on-device vector database for Edge AI on Mobile, IoT, Embedded and other commodity devices

Free to use; paid Data Sync

HNSW built and optimized from scratch for efficiency / speed on devices with limited resources

development of the initial on-device database started in 2015; released the vector search to become the first on-device vector database for productive use early in 2024

HNSW

Transactionally safe, ACID

Seed in 2018

ObjectBox

🇪🇺

Marqo AI

Apache-2.0

2.8k ⭐

Python

A tensor-based cloud-native commercial Open Source search and analytics engine.

Open SaaS

Tensor-based

❔

HNSW

undisclosed preseed in May 2022

S2Search Australia Pty Ltd

🇦🇺

Weaviate

BSD

5.6k ⭐

Assembly, C++, GoLang

Weaviate is a commercial Open Source cloud-native vector database that stores both objects and vectors.

Open SaaS

❔

started in 2018 as a traditional graph database, first released in 2019

Y, static sharding

a custom HNSW PQ algorithm that supports CRUD

Eventual Consistency

not comparative, just evaluating their own performance

Y (multiple ANN algorithms as long as they support full CRUD)

67.7M USD, series B

SeMI Technologies

🇪🇺

Chroma

Apache-2.0

4.4k ⭐

Python & Typescript

Chroma is a Commercial Open Source vector database

Preparing a (Partly Open) SaaS model* [Commercial Open Source]

HNSW lib, DuckDB; based on ClickHouse

looks like 2022

Dynamic segment placement

20.3M USD, seed

Chroma Inc.

🇺🇸

Qdrant

Apache-2.0

6.6k ⭐

Rust

Qdrant is a Commercial Open Source vector similarity search engine and vector database

Open SaaS

RocksDB

first released: 2021

Y, static sharding

HNSW (SQ & PQ)

Eventual Consistency, tunable consistency

compares to weaviate, milvus, elastic (note: redis took too long to complete)

9.8M €

Qdrant Solutions GmbH

🇪🇺

Milvus

Apache-2.0

18k ⭐

GoLang & Python

Milvus is a cloud-native Commercial Open Source vector database

(Partly Open) SaaS* [Commercial Open Source]

Their initial blog post mentioned SQLite, while later statements mention RocksDB (possibly replaced?); they also have a ChatGPT cache that is built on SQLite and say "Milvus uses SQLite or MySQL to manage metadata"

founded 2017, first released: 2019

Dynamic segment placement

ANNOY; HNSW; IVF_PQ; IVF_SQ8; IVF_FLAT; FLAT; IVF_SQ8_H; RNSG

Strong, bounded staleness, session, and eventual. The default consistency level in Milvus is bounded staleness.

not comparative

113M USD, series B

Zilliz

🇺🇸

Vespa

Apache-2.0

4.4k ⭐

Java & C++

Vespa is a Commercial Open Source vector database by Yahoo! It is a search engine which supports vector search, lexical search, and search in structured data

Open SaaS

❔

Originally a web search engine (alltheweb), acquired by Yahoo! in 2003 and later open sourced as Vespa in 2017; spun off in Oct 2023, raised a series A in Nov 2023

maintains disk and memory structures for documents

Custom HNSW (Multi-vector hybrid HNSW-IF)

Eventual Consistency

not comparative

Spinoff from Yahoo! in Oct 2023, then raised a 31M USD series A

Yahoo!

🇺🇸

Vald

Apache-2.0

1.2k ⭐

GoLang

Vald is a cloud-native Open Source distributed approximate nearest neighbor (ANN) dense vector search engine

Community project, currently looks like no commercial interests are pursued

uses the vector search engine NGT

Technology incubation at Yahoo! Japan Corporation, development was started in 2019

❔

N/A

not comparative, Vald performance only

Y (NGT)

Yusuke Kato (Yahoo Japan Corporation), Kiichiro Yukawa (Yahoo Japan Corporation)

🇯🇵

Pinecone

Proprietary

Pinecone is a fully managed vector database that specializes in enabling semantic search capabilities

SaaS

built on top of Faiss

first released in 2019

proprietary

Eventual Consistency

more programming language comparison for vector databases

Y (proprietary), plus KNN (with Faiss)

138M, series B

Pinecone Systems Inc

🇺🇸

Want to know more about the vector database market?

Here are some more questions answered for anyone interested

What is an "Open SaaS" business model?

Software as a service (SaaS) refers to software that is managed / hosted for the client and is essentially “rented.” The open in Open SaaS refers to the open source software that is being offered as such a service.

This frequently implies that not all code is open source, particularly that which is part of the managed service / hosting and associated value-adding features. Note: The open source software offered in this manner may or may not be provided by the company providing the software as a service. This has caused some friction in the open source community, as original creators often struggle to make a living, and/or maintainers struggle to keep maintaining the software – while other companies profit. Most famously, huge cloud providers have taken advantage of this option, leading to new licenses that keep the source open but restrict others from hosting as a service without donating the whole source code back to the community.

Why should I care about index types?

Indexes are essentially a way to speed up searching a database. There are several established index types for vector databases and they affect the performance of the database, e.g. the time it takes a query to complete.

What about benchmarks?

If you review the benchmarks listed in the table above, you will see that results typically vary. Benchmarks are difficult to do, and neutral benchmarks even more so. Certain use cases may favor certain solutions. Therefore, ideally you benchmark based on your specific use case. As a first evaluation, though, try to understand the basic influencing factors and have a look at a handful of benchmarks and explanations. Having said all this: There is a benchmarking tool available for approximate nearest neighbor (ANN) search algorithms. If you use this, you can compare the performance of different databases (with regard to the ANN search) for the same setup, based on the same approach. Also: The underlying libs often used by databases (like NGT and HNSW, see above) have already been benchmarked with it and you can compare to these directly.

Why is the market so hot, how can companies raise so much money?

AI is hot, everyone agrees that data and its management will be key to future success, and the database market is interesting: It is a long-established market with many players, yet it still demonstrates continually good growth (e.g. 17% in 2020). And the history of the database market shows that from time to time a new type of database comes up, and with it, a new market category. In such a market, typically the market creator “takes all” (not quite literally, but such a significant share, definitely the vast majority, that all other players are not attractive from a VC perspective). Such a market could easily be worth 100M+ in ARR. Examples from the last 20 years: MongoDB (NoSQL databases), Cockroach (NewSQL databases), Neo4j (Graph databases), Influx (Time-Series databases). So, VCs are looking to find the next new type of database that can create a market… Maybe it will be vector databases? However, the database market has also shown that it can take 10+ years for players to become profitable, so expect a long-term game. The race is still on for Edge Databases, we think 🙂

Want to know more about the database market?

We recommend checking out db-engines. The website compares all relevant systems and has tons of data from the last 20 years. Note: They only add databases once they have some traction and notability, not every hobby project. Accordingly, not all databases in the above comparison have been added to the website yet.
