Vector search: making sense of search queries

Today, finding the information most relevant to your query can be harder than finding a needle in a haystack. Traditional search engines match keywords and favor SEO-optimized content, but what if search engines could truly understand the meaning behind our queries? Enter vector search – a powerful technology that is transforming how we navigate information, not just for users, but also for applications performing background searches. In this article, we will discuss what vector search is and how it works.

What is vector search and why should you care?

Example results from a traditional search for “simple fruit cake”.

Vector search, which is also known as semantic search, is a technology that improves search accuracy by understanding the meaning (semantics) of the data and relations between its parts. Unlike traditional search, vector search efficiently handles synonyms, typos, ambiguous language, and broad or fuzzy queries. This is because it focuses on meaning, not just keywords.

Imagine that you are searching for a dessert to cook during the weekend. In a traditional search engine, the “simple fruit cake” query will reveal only websites that include these keywords. However, a vector search engine is able to provide results like “apple pie in 20 minutes” or “easy summer desserts”, which capture the essence of the query and align with your desire for a straightforward dessert option, providing more valuable results to you. 

At its core, vector search uses large language models (LLMs), like GPT, to transform data into mathematical vectors, also known as vector embeddings.

What is a vector embedding?

2D Vector Space Representation. “Easy apple pie” is close to “simple fruit cake” as they are both simple and have fruit as an ingredient. “Easy chocolate mousse” shares simplicity but does not contain fruit. “Fancy plum cake” has fruit but is not simple to make. And “extravagant chocolate mousse” does not share either simplicity or fruit as an ingredient. Thus, it is the farthest from “simple fruit cake”.

A vector or vector embedding is a numerical representation of any kind of unstructured data (e.g. texts, images, videos, audio). It captures the data’s meaning while being easy and efficient to compute with. Think of it like this: imagine you have a collection of cake recipes. You can convert each recipe into a vector embedding, which is like a unique numerical code that represents the recipe’s characteristics (ingredients, cooking methods, flavors, etc.).

Once all the recipes are encoded into embeddings, we can perform a similarity search. This means we can compare the vectors to see how similar the recipes are. For example, the vector for an easy apple pie recipe would be close to the vector for a simple fruit cake recipe because they share similar characteristics (e.g. simplicity, fruitiness). On the other hand, the vector for an extravagant chocolate mousse cake would be farther away because it involves different ingredients and methods.

How do we compare vectors?

Vector similarity is a measure of how similar two vectors are (see ep. 4 of ObjectBox Bites). Three common ways to compare vectors are Jaccard Similarity, Cosine Similarity, and L2 Distance (also known as Euclidean distance). Jaccard Similarity is the ratio of elements common to both vectors to the total number of elements in both vectors. Cosine Similarity is the cosine of the angle between the two vectors. L2 Distance is the straight-line distance between the two points in space that the vectors represent; it is the most frequently used method in AI applications. It is important to note that the choice of comparison method does not change the mechanics of similarity search, only how “closeness” is scored.
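
To make these metrics concrete, here is a minimal sketch in Python using NumPy. The toy 2D “recipe” embeddings are made-up values for illustration; real embeddings have hundreds of dimensions and come from a model:

```python
import numpy as np

# Toy 2D "recipe" embeddings (illustrative values, not model output)
simple_fruit_cake = np.array([0.9, 0.8])
easy_apple_pie = np.array([0.8, 0.9])
extravagant_mousse = np.array([0.1, 0.1])

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors: 1.0 means "same direction"
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def l2_distance(a, b):
    # Straight-line (Euclidean) distance: 0.0 means identical points
    return np.linalg.norm(a - b)

print(cosine_similarity(simple_fruit_cake, easy_apple_pie))  # ~0.99, very similar
print(l2_distance(simple_fruit_cake, extravagant_mousse))    # ~1.06, far apart
```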

What is a vector database and how is it related to vector search?

A vector database is a specialized database designed to store, manage, and search vectors efficiently. This efficiency is crucial for handling large datasets and performing fast vector similarity searches. A vector database also makes it possible to extend, adapt, and update the knowledge available to AI models, which is why most AI apps today use one.

Imagine having an AI that knows your habits, your preferences, your health data, maybe even what’s in your fridge, and can use this knowledge to suggest recipes that fit your lifestyle and individual preferences. A standard AI model doesn’t have that data and wouldn’t learn that way, but with a vector database it can. Now, when you search for a “fruit cake recipe”, using this data, it can suggest a “simple fruit cake” without sugar if you usually prefer quick, easy, and healthy recipes, or a “fancy plum cake” if you enjoy more challenging baking projects and don’t like apples. Or, a vegan option, if you have neither milk nor eggs left in the fridge.

This technique is called Retrieval-Augmented Generation (RAG). It enhances the capabilities of LLMs with additional data (e.g. personal data, company data, fresh data) stored in a vector database.

When you query a vector database, it uses the query’s vector representation to find the nearest neighbors in the database.
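
Put together, a minimal RAG query flow might look like the sketch below. Every name here (embed, vector_db, llm) is a placeholder for whichever embedding model, vector database, and LLM you use, not a specific library API:

```python
def answer_with_rag(question, embed, vector_db, llm, k=3):
    query_vector = embed(question)                 # 1. turn the query into a vector
    neighbors = vector_db.search(query_vector, k)  # 2. nearest-neighbor lookup
    context = "\n".join(doc.text for doc in neighbors)
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
    return llm.generate(prompt)                    # 3. LLM answers, grounded in your data
```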

Nearest Neighbor Search

How do we find the nearest neighbor to our query vector? The most straightforward approach is a brute-force search: it calculates the distance between our query vector and every other vector in the database, one by one. Any of the metrics discussed in “How do we compare vectors?” can be used. However, this brute-force approach has a time complexity of O(N*d), where N is the number of vectors and d is their dimensionality. This becomes computationally expensive for large datasets.
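
For illustration, a brute-force nearest-neighbor search takes only a few lines of NumPy (with toy random data standing in for real embeddings):

```python
import numpy as np

def brute_force_nearest(query, vectors):
    # Exact nearest neighbor: one L2 distance per stored vector, i.e. O(N*d)
    distances = np.linalg.norm(vectors - query, axis=1)
    return int(np.argmin(distances))

database = np.random.rand(10_000, 384).astype(np.float32)  # N = 10,000, d = 384
query = np.random.rand(384).astype(np.float32)
print(brute_force_nearest(query, database))  # index of the closest vector
```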

Since exact nearest neighbor search can be slow for massive datasets, we often turn to approximate nearest neighbor (ANN) algorithms. These algorithms prioritize efficiency by finding neighbors that are very close (but not necessarily the absolute closest) to the query vector, significantly reducing search time. 

Continuing with the cooking assistant app example, imagine you’re searching for a “fruit cake recipe”. Assume that in our database, the real closest recipe is “simple apple pie”. With a massive database, an exact nearest neighbor search might take a long time to find the perfect match. However, an ANN algorithm can quickly find a recipe that is very similar to what you’re looking for, such as a “simple fruit cake” or a “basic apple pie”, even if it might not be the exact closest match. This efficiency ensures you get relevant and useful recipe suggestions promptly, enhancing your overall experience without a noticeable compromise in quality.

Approximate Nearest Neighbor Search

Now, let’s delve into the world of Approximate Nearest Neighbor (ANN) algorithms. The way you search for nearest neighbors depends on how the data is stored in the vector database. One of the earliest data structures used for nearest neighbor search, introduced in 1975, is the k-d tree. K-d trees work by recursively splitting the data space using hyperplanes, making the search process more efficient (see ep. 5 of ObjectBox Bites). However, k-d trees, like many exact nearest neighbor algorithms, suffer from the curse of dimensionality: as the number of dimensions (features) in your data increases, the distance between points becomes less meaningful, making searching very slow in high-dimensional spaces like those used in vector databases.

For instance, consider simple fruit recipes. With a few features, such as cooking time and number of ingredients, finding similar recipes would be relatively straightforward. However, if we also include many other features like sweetness level, calorie count, fruit type, all specific ingredients, preparation complexity, and user ratings, the number of dimensions increases significantly. In such high-dimensional spaces, the traditional k-d tree method becomes inefficient because the distances between points (recipes) become less distinct and meaningful.

To overcome this challenge, ANN algorithms leverage two main approaches: indexing methods and sketching methods. Indexing methods create a hierarchical data structure that allows for faster exploration of the search space; imagine a well-organized library with categorized sections instead of randomly placed books. Sketching methods, on the other hand, don’t search the entire dataset directly. Instead, they create compressed versions (sketches) of the data that are faster to compare with the query vector, which reduces search time significantly. Often, the two approaches are combined for optimal performance.
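
As a toy illustration of the sketching idea, here is one classic technique, random-hyperplane hashing (a form of locality-sensitive hashing): each vector is compressed into a short binary sketch, and sketches are compared with the cheap Hamming distance instead of the full vectors. The dimension and bit count below are arbitrary choices for the example:

```python
import numpy as np

rng = np.random.default_rng(42)
dim, bits = 384, 64
planes = rng.normal(size=(bits, dim))  # one random hyperplane per sketch bit

def sketch(vector):
    # Each bit records which side of a hyperplane the vector falls on;
    # similar vectors mostly fall on the same sides, so their sketches match
    return planes @ vector > 0

def hamming(sketch_a, sketch_b):
    # Number of differing bits: a cheap proxy for the full distance
    return int(np.count_nonzero(sketch_a != sketch_b))

v = rng.normal(size=dim)
similar = v + 0.1 * rng.normal(size=dim)  # a slightly perturbed copy of v
unrelated = rng.normal(size=dim)
print(hamming(sketch(v), sketch(similar)))    # few differing bits
print(hamming(sketch(v), sketch(unrelated)))  # roughly half the bits differ
```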


A popular example of an ANN search implementation for high-dimensional data is the Hierarchical Navigable Small World (HNSW) algorithm (e.g. implemented in Azure AI). HNSW relies on graph-based indexing to efficiently navigate the data space and find nearest neighbors. For more details, watch episodes 6, 7, and 8 of the ObjectBox Bites miniseries, where we describe the fundamentals of HNSW.
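
To see graph-based ANN in action, here is a minimal sketch using the open-source hnswlib package (one HNSW implementation among several, shown purely for illustration; it is not the ObjectBox implementation):

```python
import hnswlib
import numpy as np

dim, num_elements = 128, 10_000
data = np.float32(np.random.random((num_elements, dim)))  # toy embeddings

index = hnswlib.Index(space="l2", dim=dim)  # L2 distance, as discussed above
index.init_index(max_elements=num_elements, ef_construction=200, M=16)
index.add_items(data)
index.set_ef(50)  # accuracy/speed trade-off: higher ef = more accurate, slower

query = np.float32(np.random.random((1, dim)))
labels, distances = index.knn_query(query, k=3)  # approximate 3 nearest neighbors
print(labels, distances)
```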

Take-away notes

To sum up, vector search offers a significant leap forward in how we search for information. By understanding the meaning and relationships behind data, it delivers more relevant and accurate results, even for unstructured data and complex queries. This technology has the potential to revolutionize various fields, from enhancing search engines to empowering AI applications. As vector search continues to evolve, we can expect even more exciting possibilities for navigating the ever-growing ocean of information and unlocking its full potential. This includes operating on data directly on the devices it was created on: reducing cloud costs, removing the reliance on an internet connection, and letting you use your private data without it ever being shared (100% private). If you’re interested in other AI and vector database-related topics, check out the ObjectBox mini-series. Stay tuned for more articles in the future.

Evolution of search: traditional vs vector search

Introduction

In today’s digital landscape, searching for information is integral to our daily lives, whether for education, research, work, or shopping. However, as the volume and complexity of data keep growing, traditional search methods face more and more challenges in providing accurate and relevant results. That’s where vector search comes in. We’re already seeing Google evolving its search engine to incorporate vector search (RankBrain, BERT, Neural Matching), and we expect even greater incorporation of AI tools to improve the search experience. Let’s explore the differences between traditional (keyword) search and vector search to understand how these technologies are shaping our search experiences, and how this impacts the discoverability of any content you might produce.

Traditional (keyword) search

Traditional search performs exact keyword matching from user queries to the data to retrieve relevant results. For example, searching for “programming languages” with traditional search will list every source containing those words. A more advanced version can also incorporate additional rules to enhance search results, such as:

  • keyword frequency (how often the term “programming languages” is used within the result text),
  • the presence of related terms (e.g. “Java”, “Python”, “C++” versus “cooking”, “gardening”),
  • or location (results closer to your location are favored).

While this approach has served us well, it struggles with ambiguous language and synonyms, and it is vulnerable to SEO strategies, often resulting in less accurate or less valuable search results. This can be especially frustrating for businesses that are trying to get their content seen by the right people. For example, a business that publishes a blog post about “sustainable fashion tips” might miss out on potential customers who are searching for “eco-friendly clothing recommendations” or “green clothing ideas,” simply because their keywords don’t exactly match.

Vector (semantic) search

2D Vector Space Representation. In this space, “Python” and “Java” are close to each other as well as to the “Programming language” query we are searching for because they are similar (they share high values in their features).


On the other hand, vector search takes a different approach by seeking out related objects that share similar characteristics or semantics. You can think of it as finding results based on meaning or understanding rather than just exact wording. For example, searching for “programming languages” with vector search will not only find sources mentioning those exact words but also identify specific languages like “Python” or “Java” as well as related concepts such as “coding tutorials” or “development frameworks”.

To perform a vector search, the content, such as texts, images, audio files, or videos, first needs to be represented as vectors (vector embeddings) by AI models. These embeddings represent data in a multidimensional vector space and capture the essence or semantics of the data while remaining computationally efficient.

Once these vector representations are generated, they are basically sets of numbers, and therefore easy to compute with. For instance, instead of searching for a specific word in text, we aim to find the closest vector (from the text embeddings) to the query vector (representing the word we’re searching for). This process relies on well-established vector computing methods, such as calculating the distance between vectors or minimizing the angle between them (Cosine similarity).

Comparison

Let’s now compare different aspects of searching to understand the main differences between traditional search and vector search (also summarized in the table below).

  • Search Approach
    In traditional search, the approach relies on matching keywords directly from the user query to the content. Vector search uses vector embeddings to capture the semantics of the data and perform a meaning-based search.
  • Ambiguity handling
    Vector search shines when it comes to handling ambiguity: it is superior at handling synonyms, ambiguous language, and broad or fuzzy queries compared to traditional search. This directly improves the relevance of the search results.
  • Search relevance calculation
    The metrics used for relevance calculations differ. Traditional search uses term frequency-inverse document frequency (TF-IDF) and BM25, while vector search uses Jaccard Similarity, Cosine Similarity, and L2 (Euclidean) Distance (see the TF-IDF sketch after the table below).
  • Speed and Implementation
    Traditional search is easy to implement, straightforward to use, and fast for simple queries. Vector search can be slower for simple queries and more complicated to implement, especially for huge datasets. However, approximate nearest neighbor (ANN) techniques speed up the process significantly.
  • Scalability
    The continuous expansion of content challenges the scalability of traditional search, while scalability is one of vector search’s advantages.
  • Cost
    While traditional search may have lower computational requirements, the superior performance and accuracy of vector search often justify the investment in additional computing power. Furthermore, the computational costs for vector search can be significantly reduced with the use of ANN.
Comparison table between keyword search and vector search
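
To make the difference in relevance calculation tangible, here is a small sketch using scikit-learn’s TF-IDF as a stand-in for the traditional metrics named above (toy documents; a vector search engine would score with embedding similarity instead):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Python and Java are popular programming languages.",
    "Tips for gardening and growing tomatoes.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(docs)

# Keyword-based relevance: scores depend only on shared terms
print(cosine_similarity(vectorizer.transform(["programming languages"]), doc_matrix))

# "coding tutorials" shares no terms with either document, so TF-IDF scores
# both 0.0; a semantic embedding would still rank the programming text higher
print(cosine_similarity(vectorizer.transform(["coding tutorials"]), doc_matrix))
```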

Conclusions

In summary, both traditional search and vector search offer distinct advantages and drawbacks. Vector search excels in handling ambiguity, correcting typos, enhancing relevance, and managing extensive datasets. Traditional search remains advantageous for straightforward queries, exact matches, or smaller datasets. Historically, limited computational resources, particularly for on-device computation (i.e. Edge Computing), favored traditional search. However, the landscape is evolving rapidly with the introduction of the first edge vector database solution by ObjectBox. This innovation promises to revolutionize the scenario by optimizing vector search for devices with constrained resources, extending the benefits of semantic search to the Edge.

The first On-Device Vector Database: ObjectBox 4.0

The new on-device vector database enables advanced AI applications on small restricted devices like mobile phones, Raspberry Pis, medical equipment, IoT gadgets and all the smart things around you. It is the missing piece to a fully local AI stack and the key technology to enable AI language models to interact with user specific data like text and images without an Internet connection and cloud services.

An AI Technology Enabler

Recent AI language models (LLMs) have demonstrated impressive capabilities while being small enough to run on, e.g., mobile phones. Recent examples include Gemma, Phi3 and OpenELM. The next logical step is to use these LLMs for advanced AI applications that go beyond a mere chat. A new generation of apps is currently evolving: these apps create “flows” with user-specific data and multiple queries to the LLM to perform complex tasks. This is also known as RAG (retrieval augmented generation), which, in its simplest form, lets you chat with your documents. And now, for the very first time, this is possible locally on restricted devices using a fully fledged embedded database.

What is special about ObjectBox Vector Search?

We know restricted devices. Where others see limitations, we see potential, and we have repeatedly demonstrated it by creating super-efficient software for these devices, maximizing speed, minimizing resource use, and saving battery life and CO2. With this knowledge, we approached vector search in a unique way.

Efficient memory management is the key. The challenge with vector data is that on the one hand, it consumes a lot of memory – while on the other hand, relevant vectors must be present in memory to compute distances between vectors efficiently. For this, we introduced a special multi-layered caching that gives the best performance for the full range of devices; from memory-constrained small devices to large machines that can keep millions of vectors in memory. This worked out so well that we saw ObjectBox outperform several vector databases built for servers (open source benchmarks coming soon). This is no small feat given that ObjectBox still holds up full ACID properties, e.g. caching must be transaction-aware.

Also, keep in mind that ObjectBox is a fully capable database that allows you to store complex data objects along with vectors. From an ObjectBox data model point of view, a vector is “just” another property type. This allows you to store all your data (vectors along with objects) in a single database. This “one database” approach also includes queries. You can already combine vector search with other conditions. Note that some limitations still apply with this initial release. Full hybrid search is close to being finished and will be part of one of the next releases.

In short, the following features make ObjectBox a unique vector database:

  • Embedded Database that runs inside your application without latency
  • Vector search is based on the state-of-the-art HNSW algorithm, which scales very well with growing data volume
  • HNSW is tightly integrated within our internal database. Vector Search doesn’t just run “on top of database persistence”.
  • With this deep integration we do not need to keep all vectors in memory.
  • Multi-layered caching: if a vector is not in-memory, ObjectBox fetches it from disk.
  • Not just a vector database: you can store any data in ObjectBox, not just vectors. You won’t need a second database.
  • Low minimum hardware requirements: e.g. an old Raspberry Pi runs ObjectBox smoothly.
  • Low memory footprint: ObjectBox itself just takes a few MB of memory. The entire binary is only about 3 MB (compressed around 1 MB).
  • Scales with hardware: efficient resource usage is also an advantage when running on more capable devices like the latest phones, desktops and servers.
  • ObjectBox additionally offers commercial editions, e.g. a Server Cluster mode, GraphQL, and of course, ObjectBox Sync, our data synchronization solution.

Why is this relevant? AI anywhere & anyplace

With history repeating itself, we think AI is in a “mainframe era” today. Just like clunky computers from decades before, AI is restricted to big and very expensive machines running far away from the user. In the future, AI will become decentralized, shifting to the user and their local devices. To support this shift, we created the ObjectBox vector database. Our vision is a future where AI can assist everyone, anytime, and anywhere, with efficiency, privacy, and sustainability at its core.

What do we launch today?

Today, we are releasing ObjectBox 4.0 with Vector Search for a variety of languages:

*) We acknowledge Python’s popularity within the AI community and thus have invested significantly in our Python binding over the last months to make it part of this initial release. Since we still want to smooth out some rough edges with Python, we decided to label Python an alpha release. Expect Python to quickly catch up and match the comfort of our more established language bindings soon (e.g. automatic ID and model handling).

Want to get started right away? Check our Vector Search documentation to see how to use it!

One more thing: ObjectBox Open Source Database (OSS)

We are also very happy to announce that we will fully open source the core of ObjectBox. As a company we follow the open core model. Since we still have some cleaning up to do, this will happen in one of the next releases, likely 4.1.

“Release week”

With today’s initial releases, we are far from done yet. Starting next Tuesday, you can expect additional announcements from us. Follow us to get the news as soon as it is released.

What’s next?

This is our very first version of a “vector database”. And while we are very happy with this release, there are still so many things to do! For example, we will optimize vector search by adding vector quantization and integrate it more tightly with our data synchronization. We are also focusing on expanding our solution’s reach through strategic partnerships. If you think you are a good fit, let us know. And as always, we are very eager to get some feedback from you! Take care.

Edge AI: The era of on-device AI

AI anywhere and anytime - free from Internet dependencies & 100% private

Edge AI is an often overlooked aspect of AI’s natural evolution. It is basically the move of AI functionalities away from the cloud (or powerful server infrastructure) towards decentralized (typically less powerful) devices at the network’s edge, including mobile phones, smartwatches, IoT devices, microcontrollers, ECUs, or simply your local computer. Or, more broadly speaking: “Edge AI” means AI that works directly on-device, “local AI”.

Therefore, Edge AI apps work independently of an internet connection, offline as well as online. This makes them ideal for low, intermittent, or no connectivity scenarios. They are reliably available, more sustainable, and – of course – much faster on-device than anything hosted in the cloud. On-device AI apps can empower real-time AI anytime and anyplace.

Edge AI is where Edge Computing meets AI

The importance of vector databases for AI applications

To enable powerful on-device AI applications, the on-device (edge) technology stack needs local vector databases. So, before diving deeper into Edge AI, we’ll look at vector databases first. Skip this section if you are already familiar with them.

What is a vector database?

Just as SQL databases handle data in rows and columns, graph databases manage graphs, and object databases store objects, vector databases store and manage large datasets of vectors, or more precisely, vector embeddings. Because AI models work with vector embeddings, vector databases are basically the databases for AI applications. They offer a feature set of vector operations, most notably vector similarity search, that makes it easy and fast to work with vector embeddings in conjunction with AI models.

When and why do you need a vector database? 

Given the significance of vector embeddings (vectors) for AI models, particularly Large Language Models (LLMs) and AI applications, vector databases are now integral to the AI technology stack. They can be used to:

Train AI models (e.g. ML model training, LLM training)
Vector databases manage the datasets large models are trained on. Training AI models typically entails finding patterns in large datasets, and vector databases significantly speed up identifying patterns and finding relationships by enabling efficient retrieval of similar data points.

Speed up AI model / LLM responses
Vector databases use various techniques, e.g. compression and filtering, to speed up vector retrieval and similarity search. They accelerate both model training and inference, thus enhancing the performance of generative AI applications. By optimizing vector retrieval and similarity search, vector databases can enhance the efficiency and scalability of AI applications that rely on high-dimensional data representations.

Add long-term memory to AI models and LLMs
Vector databases add long-term memory to AI applications in two ways: they persist history so that (1) tasks or conversations can be continued later as needed, and (2) the model can be personalized and enhanced for better-fitting results.

Enable Multimodal Search
Vector databases serve as the backbone to jointly analyze vectors from multimodal data (text, image, audio, and video) for unified multimodal search and analytics. The use of a combination of vectors from different modalities enables a deeper understanding of the information, leading to more accurate and relevant search results.

Enhance LLM responses, primarily via RAG
With a vector database, you have additional knowledge to enhance the quality of a model’s responses and to decrease hallucinations; real-time updates, as well as personalized responses, become possible.

Perform Similarity Search / Semantic Retrieval
Vector databases are the heart and soul of semantic retrieval and similarity search. Vector search often works better than “full-text search” (FTS) because it finds related objects that share the same semantics/meaning instead of matching the exact keyword. This makes it possible to handle synonyms, ambiguous language, and broad or fuzzy queries.

Cache: Reduce LLM calls
Vector databases can cache queries and responses, so similar incoming queries are served from the cache as a lookup prior to calling the LLM. This saves resources, time, and costs.
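
A minimal sketch of that caching pattern might look as follows (embed and llm are placeholders for your embedding model and LLM, a plain list stands in for the vector database, and the 0.9 threshold is an arbitrary assumption):

```python
import numpy as np

cache = []  # (query_vector, response) pairs; a vector database in a real setup

def cached_llm_call(question, embed, llm, threshold=0.9):
    query_vec = embed(question)
    for cached_vec, cached_response in cache:
        cos = np.dot(query_vec, cached_vec) / (
            np.linalg.norm(query_vec) * np.linalg.norm(cached_vec))
        if cos >= threshold:           # a semantically similar query was seen before
            return cached_response     # reuse the answer: no LLM call, no cost
    response = llm.generate(question)  # cache miss: call the LLM once...
    cache.append((query_vec, response))  # ...and remember the result
    return response
```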

The shift to on-device computation (aka Edge Computing)

Edge Computing is, in essence, a decentralized computing paradigm, and based on Edge Computing, AI on decentralized devices (aka Edge AI) becomes possible. Note: in computing, we have regularly seen shifts from centralized to decentralized computing and back again.

What is Edge Computing?

Our world is decentralized. Data is produced and needed everywhere, on a myriad of distributed devices like smartphones, TVs, robots, machines, and cars – on the so-called “edge” of the network. Sending all this data to the cloud would not only be unsustainable, expensive, and super slow; it is often simply infeasible. So, much of this data simply stays on the device it was created on. To harness the value of this data, the distributed “Edge Computing” paradigm is employed.

When and why do you need Edge Computing? 

Edge Computing stores and processes data locally on the device it was created on, e.g. on IoT, Mobile, and other edge devices. In practice, Edge Computing often complements a cloud setup. The benefits of extending the cloud with on-device computing are:

    • Offline-capability
      Storing and computing data directly on-device allows devices to operate independently from an Internet connection, which is crucial for remote locations (e.g. oil rigs in the ocean) or applications that need to always work (e.g., while the car is in underground garages, or in remote areas).
    • Data ownership/privacy
      Cloud apps are fundamentally non-private and limit the user’s control over their own data. Edge Computing allows data to stay where it is produced, used, and where it belongs (with the user/on the edge devices). It therefore reduces data security risks, and data privacy and ownership concerns.
    • Bandwidth constraints and the cost of data transmission
      Ever growing data volumes strain bandwidth and associated network/cloud costs, even with advanced technologies like 5G/6G networks. Storing data locally in a structured way at the edge, such as in an on-device database, is necessary to unlock the power of this data. At the same time, some of this data can still be made available centrally (in the cloud or on an on-premise server), combining the best of both worlds.
    • Fast response rates and real-time data processing
      Doing the processing directly on the device is much faster than sending data to the cloud and waiting for a response (latency). With on-device data storage and processing, real-time decision making is possible.
    • Sustainability
      By reducing data overhead and unnecessary data transfers, you can cut down 60-90% of data traffic, thereby significantly reducing the CO2 footprint of an application. A welcome side effect is that this also lowers costs tremendously.

Edge AI needs on-device vector databases

Every megashift in computing is empowered by specific infrastructure software, such as databases. Shifting from AI to Edge AI, we still see a notable gap: on-device support for vector data management (the typical AI data) and data synchronization capabilities (to update AI models across devices). To efficiently support Edge AI, vector databases that run locally, on edge devices, are as crucial as they are on servers today. So far, vector databases have been cloud / server databases that cannot run on restricted devices like mobile phones and microcontrollers. Moreover, they often don’t run on more capable devices like standard PCs either, or only with poor performance. To empower everyday-life AI that works anytime all around us, we therefore need a database that can run performantly on a wide variety of devices on the edge of the network.

In fact, vector databases may be even more important on the edge than they are in cloud / server environments. On the edge, the tradeoff between accuracy and performance is a much more delicate line to walk, and vector databases are a way to balance the scales.

Edge AI Vector Databases for on-device use

On-device AI: Use Cases and why they need an Edge Vector Database

Seamless AI support where it is needed most, on everyday devices and all the things around us, requires an optimized local AI tech stack that runs efficiently on those devices. From private home appliances to on-premise devices in business settings, medical equipment in healthcare, digital infrastructure in urban environments, or just mobile phones, you name it: to empower these devices with advanced AI applications, you need local vector databases. From the broad scope of AI’s impact in various fields, let’s focus on two specific examples to make it more tangible: the integration of AI within vehicle onboard systems and the use of Edge AI in healthcare.

Vehicle onboard AI and edge vector databases – examples

Imagine a car crashing because the car software was waiting on the cloud to respond – unthinkable. The car is therefore one of the most obvious use cases for on-device AI.

Any AI application is only as good as its data. A car today is a complex distributed system on wheels, traversing a complex decentralized world. Its complexity is permanently growing due to increased data (7x more data per car generation), devices, and the number of functions. Making use of the available data inside the car and managing the distributed data flows is therefore a challenge in itself. Useful onboard AI applications depend on an on-device vector database (Edge AI). Some in-car AI application examples:

  • Advanced driver assistance systems (ADAS)
    ADAS benefit from in-vehicle AI in a lot of areas. Let’s look, for example, at driver behavior: by monitoring eye movements and head position, ADAS can determine when the driver shows signs of inattentive driving, e.g. drowsiness. Using an on-device database, the ADAS can combine historic data, realtime data, and other car data, like the driving situation, to deduce its action and issue alerts, avoid collisions, or suggest other corrective measures.
  • Personalized, next-gen driver experience
    With an on-device database and Edge AI, an onboard AI can analyze driver behavior and preferences over a longer period of time and combine it with other available data to optimize comfort and convenience for a personalised driving experience that goes way beyond a saved profile. For example, an onboard AI can adjust the onboard entertainment system continually to the driver’s detected state, the driving environment, and the personal preferences. 

Applications of Edge AI in Healthcare – examples

Edge Computing has seen massive growth in healthcare applications in recent years, as it helps maintain the privacy of patients and provides the reliability and speed needed. Artificial intelligence is also already in wide use, making healthcare smarter and more accurate than ever before. With the means for Edge AI at hand, this transformation of the healthcare industry will become even more radical. With Edge AI and on-device vector databases, healthcare can rely on smart devices to react in realtime to users’ health metrics, provide personalized health recommendations, and offer assistance during emergencies – anytime and anyplace, with or without an Internet connection, all while ensuring data security, privacy, and ownership. Some examples:

  • Personalized health recommendations
    By monitoring the user’s health data and lifestyle factors (e.g. sleep hours, daily sports activity) combined with their historic medical data, if available, AI apps can help detect early signs of health issues or potential health risks for early diagnosis and intervention. The AI app can provide personalized recommendations for exercise, diet, or medication adherence. While this case does not rely as heavily on real-time analysis and fast feedback, it benefits from an edge vector database with regard to data privacy and security.
  • Point of care realtime decision support
    By deploying AI algorithms on medical devices, healthcare providers can receive immediate recommendations, treatment guidelines, and alerts based on patient-specific data. One example where this is used with great success is surgery. An operating room today is a complex environment with many decentralized medical devices, requiring teams to process, coordinate, and act upon several information sources at once. Ultra-low-latency streaming of surgical video into AI-powered data processing workflows on-site enables surgeons to make better-informed decisions, helps them detect abnormalities earlier, and lets them focus on the core of their task.

Edge AI: Clearing the Path for AI anywhere, anytime

For an AI-empowered world when and where needed, we still have to overcome some technical challenges. With AI moving so fast, however, this seems quite close. The move into this new era of ubiquitous AI needs Edge AI infrastructure. Only when Edge AI is as easy to implement and deploy as cloud AI will we see the ecosystem thriving and bringing AI functionalities that work anytime and anyplace to everyone. An important cornerstone will be on-device vector databases, as well as new AI frameworks and models specifically designed to address Edge Computing constraints. Some of the corresponding recent advances in the AI area include “LLM in a Flash” (a novel technique from Apple for effective inference of LLMs at the edge) and Liquid Neural Networks (designed for continuous learning and adaptation on edge devices). There’s more to come, follow us to keep your edge on Edge AI news.

What is an Edge Database, and why do you need one?

Edge Databases – from trends to use cases

Data is decentralized. Cloud computing is centralized.

Forcing the decentralized world into the centralized cloud topology is not only inefficient, but also economically, ecologically and socially wasteful – and sometimes simply impossible.

To drive digitization and extract value from decentralized data, we need to give the cloud an edge, or more precisely add Edge Computing. Edge computing is a decentralized topology for storing and processing data as close as possible to the data source, i.e., the place where the data is produced, at the edge of the network.

Valuable data is increasingly generated in a decentralized manner – outside traditional and centralized data centers and cloud environments. The dominance of centralized cloud computing approaches slows down digitization and the use of this existing decentralized data. Therefore, according to Gartner (2023) “Edge computing is integral to digital transformation”, and we need infrastructure technologies for the edge that enable developers to quickly and reliably work with decentralized edge data.

An Edge Database (the foundation for edge data management) is a new type of database that addresses these requirements. Developers need fast local data persistence and decentralized data flows (data sync) to implement edge solutions. Edge Databases provide these core edge functionalities out of the box, allowing application developers to quickly implement edge solutions.

Megatrend to decentralized Edge Computing

By 2025, 30+ billion IoT devices will be creating ~4.6 trillion GB of data per day. The growing number of devices and the data volume, variety, and velocity, as well as bandwidth infrastructure limitations, make it infeasible to store and process all data in a centralized cloud. On top, new use cases come with new requirements that a centralized cloud infrastructure cannot meet, for example soft and hard response rate requirements, offline functionality, and security and data protection regulations.

These trends accelerate the shift away from centralized cloud computing to a decentralized edge computing topology. Edge computing refers to decentralized data processing at the “edge” of the network. For example, in a car, on a machine, on a smartphone, or in a building. Hardware specifications do not capture the definition of an “edge device”. The crucial point is rather the decentralized use of data at, or as close as possible to, the data source.

Edge computing itself is not a technology but a topology, and according to McKinsey, one of the top growing trends in tech in 2021. The technologies needed to implement the edge computing topology are still inadequate. More specifically, there is a gap in basic “core” edge technologies, so-called “software infrastructure”. This gap is one of the main reasons for the failure of edge projects.

Needed: Infrastructure Software for Edge Computing

With computing shifting to the edge of the network, the needs of this decentralized topology become clear:

Need for fast local data storage

→ e.g. a machine on the factory floor collects data on stiffness, friction, and pressure points. There is limited space on the device, and typically no connection to the Internet. Even with an Internet connection, high data rates quickly push the available bandwidth, as well as associated networking / cloud costs, to the limit. To be able to use this data, it must be persisted in a structured manner at the edge, e.g. stored locally in a database.

Need for reliable on-device data flows

→ e.g. the car is an edge device consisting of many control units, so data must be stored on multiple control units. In order to access and use the data within several of the car’s control units, the data must be selectively synchronized between them. A centralized structure, and thus a single point of failure, is unthinkable.

Need for edge-to-edge-to-cloud data flows

→ e.g. in a manufacturing hall you will typically find any number of diverse devices, from sensors to brownfield and greenfield devices, and no internet connectivity. At the same time, there are diverse employee devices such as tablets or smartphones, as well as central PCs and a cloud. To extract value from the data, it must be available in raw, aggregated, or summary form in different places. This means it needs to be synchronized efficiently and selectively, with possible conflicts resolved.

Need for flexible edge data management

→ e.g. with the rise of IoT, time-series data have become common. However, time series data alone is usually not sufficient, and needs to be combined with other data structures (like objects) to add value. At the same time, a push to standardize data formats in industries (e.g. VSS in automotive or Umati in Industrial IoT) requires that the database supports flexible data structures.

Developing solutions without software infrastructure on an individual level is possible, but has many drawbacks:

Custom in-house implementations are cumbersome, slow, costly, and typically scale poorly. Oftentimes, applications or certain feature sets become unfeasible to deliver because of the lack of core software infrastructure. Legacy code and individual workarounds create problems over the lifetime of a product. Instead of a thriving ecosystem, only a few big players are able to implement edge solutions. Innovation and creativity are limited. An edge database is part of the solution and enables the entire edge ecosystem to build edge applications faster, cheaper and more efficiently.

What is an edge database?

An edge database is a new type of database specifically tailored to the unique requirements of the Edge Computing topology. An edge database has specific features that make it easy for application developers to use decentralized data from edge devices when and where needed and focus on value creation. It removes the burden of implementing underlying functionalities for secure storage and the decentralized synchronization of data.

First, an edge database is optimized for resource efficiency (CPU, memory, …) and performance on resource-constrained devices (embedded devices, IoT, mobile). It has a small footprint of a few megabytes. Traditional databases such as MySQL or MongoDB are too large and cumbersome for typical edge devices, and unsuitable for computing at the edge.  

An edge device without data flows to/from other devices is just a data island with very limited utility. Accordingly, an edge database must support the management of decentralized data flows. There is no more efficient way than at the database level. This ideally includes a range of conflict resolution strategies due to the decentralized and multi-directional structure of the Edge.

Data security and protection is an increasingly important issue and can quickly become a showstopper for edge projects.

When do you need an Edge Database?

Most IoT applications need to store and synchronize data. An edge database is always useful when functions / applications are planned that:

  • should work offline and independent of an internet connection
  • need to guarantee fast response times
  • work with a lot of, possibly high-frequency data
  • need to serve many devices at the same time
  • need historical data

In addition, developers also often decide to use an edge database to save time and nerves, or to be able to react quickly and flexibly to future requirements.

Edge Database Use Case Example in Manufacturing

Today, you can find everything from low-frequency brownfield devices to high-frequency greenfield devices on a factory floor. As a rule, the machine controllers in use are not designed to store or transmit data. They usually lack not only the functionality, but also the resources to support this. Therefore, additional edge devices are often needed to collect, analyze and interpret the huge amounts of data that each machine produces on site. For such an edge device, rapid data persistence and ingestion, and efficient data flow from edge-to-edge and edge-to-cloud are at the heart of value creation. The clear separation of machine control and edge data processing unit ensures that there is no risk of unintentional interference with the machine controller. An edge device with a powerful edge database can support multiple use cases on the shop floor today:

1. Operational efficiency

Process optimization along the line increases quality and reduces damage. When the first machine in a production line uses a new batch of material, e.g. in sheet metal processing, one of the first steps is to cut a sheet to the required size. At this stage, the machine can already detect differences in the metal compared to a previous batch (deviations are allowed within the DIN standard). With an edge device, this data can be evaluated and the relevant information passed on to the next machine. With this data, machines further down the line can avoid damage / break points in the material.

2. Condition monitoring

Continuous machine condition monitoring reduces downtime and increases maintenance efficiency. A constant stream of high-frequency machine data is compared against the fingerprint of the machine. Any slight deviation is immediately detected and reported. Catching deviations early reduces downtime and costly repairs.

3. Historical Data

Historical data is stored for learning and training to optimize the production line. With an edge database, the data is persisted and thus available in the event of faulty behavior. In case of an error, the data preceding the incident can be analyzed and used to find the causes and predict, or even avoid, such an error in the future. Chances are that “fuzzy expert knowledge” already available at the production site can be translated into deterministic rules when tested with these data sets.

The future of Edge Databases 

Edge computing provides numerous benefits and enables many applications and functionalities that are only possible with edge computing. However, so far only a few (usually large) players have been able to create value in edge computing projects, gaining competitive advantages. One reason is a lack of basic edge software. A thriving edge ecosystem necessitates edge software infrastructure that addresses the fundamental recurring needs of edge projects. Edge databases are a critical component in the development of such an ecosystem. Looking further into the future, the biggest challenge for AI applications at this moment is resource intensity. Databases can give AI apps a long-term memory and boost AI app performance while lowering resource use. Bringing this to edge devices will be a game changer for the everyday use of AI apps.

Why do we need Edge Computing for a sustainable future?

Centralized data centers use a lot of energy and water, emit a lot of CO2, and generate a lot of electronic waste. In fact, cloud data centers are already responsible for around 300 Mt CO2-eq of greenhouse gas emissions [1]. And the energy consumption of data centers is increasing at an exponential rate [2].

While more data centers are switching to green energy [3], this approach is not nearly enough to solve the problem. A more sustainable approach is to reduce unnecessary cloud traffic, central computation, and storage as much as possible by shifting computation to the edge. In our experience, just reducing data overhead and unnecessary data traversals can easily cut 60-90% of data traffic and thus significantly impact the CO2 footprint of an application, as well as costs.

Edge Computing stores and uses data on or near the device on which it was created. This reduces the amount of traffic sent to the cloud and, on a large scale, has a significant impact on energy consumption and carbon emissions.

Why do Digitization projects need to think about sustainability now?

Given the gravity of the climate crisis, every industry needs to assess its potential environmental impact and find ways to reduce its carbon footprint. The digital world, and its most valuable commodity, data, should not be any different. The digital transformation is ongoing and with it electronic devices and IT usage numbers are exploding. Thus, new apps must consider their carbon footprint throughout their lifecycle, especially resource use in operation and at scale [4]. 

Also, think about this: the share of global electricity used by data centers is already estimated at around 1-3% [1], and data centers generate 2% of worldwide CO2 emissions (on par with the aviation industry) [5]. 54% of all emissions due to cloud data centers are caused by the big hyperscalers (Google, Amazon, Microsoft, Alibaba Cloud) [6]. On top of this, providing and maintaining cloud infrastructure (manufacturing, shipping of hardware, buildings and lines) also accounts for a huge amount of greenhouse gas emissions [7] and produces a lot of abnormal waste (e.g. toxic coolants) at the end of life [8].

Bearing that in mind, the growth rate of data center demand is concerning. With the steady increase in data processing, storage, and traffic, data center electricity consumption is forecast to grow by 10% a year [9]. In fact, estimates expect the communications industry to use 20% of all the world’s electricity by 2025 [10].

Shifting to green energy is a good step. However, a more effective and ultimately longer-term solution requires looking at the current model of data storage, filtering, processing, and transferal. By implementing Edge Computing, we can reduce the amount of useless and wasteful data traversing to and from the cloud as much as possible, thus reducing overall energy requirements in the long term. Of course, everyone can make a difference with their daily behavior, and for developers that is especially true: applying green coding principles helps produce applications that cause lower CO2 emissions over the whole app lifetime.

What is Edge Computing?

Until recently 90% of enterprise data was sent to the cloud, but this is changing rapidly. In fact, this number is dropping to only 25% by 2025, according to Gartner. By then, most of the data will be stored and used locally, on the device it was created on, e.g. on smartphones, cars, trains, machines, watches. This is Edge Computing, and it is an inherently decentralized computing paradigm (as opposed to the centralized cloud computing approach). Accordingly, every edge device needs the same technology stack (just in a much smaller format) as a cloud server. This means: An operating system, a data storage / persistence layer (database), a networking layer, security functionalities etc. that run efficiently on restricted hardware.

As you can only use the devices’ resources, which can be pretty limited, inefficient applications can push a device to its limits, leading to slow response rates, crashes, and battery drain.

Edge device architecture

Edge Computing is much more than some simple data pre-processing, which takes advantage of only a small portion of the computing that is possible on the edge. An Edge Database is a prerequisite for meaningful Edge Computing. With an Edge Database, data can be stored and processed on the devices directly (the so-called edge). Only useful data is sent to the server and saved there, reducing the networking traffic and computing power used in data centers tremendously, while also making use of the computing resources of devices which are already in use. This greatly reduces bandwidth and energy required by data centers. On top, Edge Computing also provides the flexibility to operate independently from an Internet connection, enables fast real time response rates, and cuts cloud costs.

Why is Edge Computing sustainable?

Edge Computing reduces network traffic and data center usage

With Edge Computing, the amount of data traversing the network can be reduced greatly, freeing up bandwidth. Bandwidth is a measure of the quantity of data a network can transfer in a given time frame, and it is shared among users. Accordingly, the more data is sent via the network at a given moment, the slower the network speed. Data on the edge is also much more likely to be useful, and indeed used, on the edge, in the context of its environment. Instead of constantly sending data streams to the cloud, it therefore makes sense to work with the data on the edge and only send the data to the cloud that really is of use there (e.g. results, aggregated data, etc.).

Edge computing is optimized for efficiency

Edge “data centers” are typically more efficient than cloud data centers. As described above, resources on edge devices are restricted. Therefore, and as opposed to cloud infrastructure, edge devices do not scale horizontally. That is one reason why every piece of the edge tech stack is – typically and ideally – highly optimized for resource efficiency. Any computing done more efficiently helps reduce energy consumption. Taking into account the huge number of devices already deployed, the worldwide impact of reducing resource use for the same operations is significant.

Edge Computing uses available hardware

There is a realm of edge devices already deployed that is currently underused. Many existing devices are capable of data persistence, and some even of fairly complex computing. When these devices instead send all of their data to the cloud, an opportunity is lost. Edge Computing enables companies to use existing hardware and infrastructure (retrofitting), taking advantage of the available computing power. If these devices continue to be underused, we will need to build bigger and bigger central data centers, simultaneously burdening existing network infrastructure and reducing bandwidth by senselessly sending everything to the cloud.

Cloud versus Edge: an Example

Today, many projects are built based on cloud computing. Especially in first prototypes or pilots, cloud computing offers an easy and fast start. However, with scale, cloud computing often becomes too slow, expensive, and unreliable. In a typical cloud setup, data is gathered on edge devices and forwarded to the cloud for computation and storage. Often a computed result is sent back. In this design, the edge devices are dumb devices that are dependent upon a working internet connection and a working cloud server; they do not have any intelligence or logic of their own. In a smart home cloud example, data would be sent from devices in the home, e.g. a thermostat, the door, the TV etc. to the cloud, where it is saved and used.

Cloud vs Edge

If the user wants to make changes via a cloud-based mobile app while in the house, the changes are sent to the cloud, applied there, and then sent from there to the devices. When the Internet connection is down or the server is not working, the application does not work.

With Edge Computing, data stays where it is produced, used and where it belongs – without traversing the network unnecessarily. This way, cloud infrastructure needs are reduced in three ways: Firstly, less network traffic, secondly, less central storage and thirdly less computational power. Rather, edge computing makes use of all the capable hardware already deployed in the world. E.g. in a smart home, all the data could stay within the house and be used on site. Only the small part of the data truly needed accessible from anywhere would be synchronized to the cloud.

Cloud vs Edge

Take for example a thermostat in such a home setting: it might produce 1000s of temperature data points per minute. However, minimal changes typically do not matter and data updates aren’t necessary every millisecond. On top, you really do not need all this data in the cloud and accessible from anywhere.

With Edge Computing, this data can stay on the edge and be used within the smart home as needed. Edge Computing enables the smart home to work fast, efficiently, and independently of a working internet connection. In addition, the smart home owners can keep their private data to themselves and are less vulnerable to hacker attacks.

How does ObjectBox make Edge Computing even more sustainable?

ObjectBox improves the sustainability of Edge Computing with high performance and efficiency: our 10x speed advantage translates into less use of CPU and battery / electricity. With ObjectBox, devices compute 10 times as much data with equivalent power. Due to its small size and efficiency, ObjectBox runs on restricted devices, allowing application developers to utilize existing hardware longer and/or do more with existing infrastructure / hardware.

Alongside the performance and size advantages, ObjectBox’ Sync solution takes care of making data available where needed when needed. It allows synchronization in an offline setting and / or to the cloud. Based on efficient syncing principles, ObjectBox Sync aims to reduce unnecessary data traffic as much as possible and is therefore perfectly suited for efficient, useful, and sustainable Edge Computing. Even when syncing the same amount of data, ObjectBox Sync reduces the bandwidth needed and thus cloud networking usage, which incidentally reduces cloud costs.

ObjectBox’ Time Series feature provides users with an intuitive dashboard to see the patterns behind the data, further helping users track thousands of data points per second in real time.

How Edge Computing enables new use cases that help make the world more sustainable

As mentioned above, there are a variety of IoT applications that help reduce waste of all kinds. These applications can have a huge impact on creating a more sustainable world, assuming the applications themselves are sustainable. Three powerful examples to demonstrate the huge impact IoT applications can have on the world:

Reducing Food Waste

From farm to kitchen, IoT applications can help to reduce food waste across the food chain. Sensors used to monitor the cold chain, from field to supermarket, can ensure that food maintains a certain temperature, thus guaranteeing that products remain food safe and fresh longer, reducing food waste. In addition, local storage can be used to power apps that fight household waste (you can learn how to build a food sharing app yourself in Flutter with this tutorial).

Smart City Lighting

Chicago has implemented a smart public lighting system which allows the city to save approx. 10 million USD per year, and London estimates it can save up to 70% of current electricity use and costs, as well as maintenance costs, through smart public lighting systems [10].

Reducing Water Waste

Many homes and commercial building landscapes are still watered manually or on a set schedule. This is an inexact method of watering, which does not take into account weather, soil moistness, or the water levels needed by the plant. Using smart IoT water management solutions, landscape irrigation can be reduced, saving water and improving landscape health.

These positive effects are all the more powerful when the applications themselves are sustainable.

Sustainable digitization needs an edge

The benefits of cloud computing are broad and powerful; however, there are costs to this technology. A combination of green data centers and Edge Computing helps to resolve these often unseen costs. With Edge Computing we can reduce the unnecessary use of bandwidth and server capacity (which comes down to infrastructure, electricity, and physical space) while simultaneously taking advantage of underused device resources. With AI growing in popularity, Edge Computing will also become very relevant for sustainable AI applications. AI applications are very resource-intensive, and Edge AI will help distribute workloads in a resourceful manner, lowering resource use. One example of this is an efficient local vector database. ObjectBox amplifies these benefits, with high performance on small devices and efficient data synchronization – making edge computing an even more sustainable solution.