by Vivien | Jun 12, 2025 | Data Sync, Mobile Database, ObjectBox | 0 comments

ObjectBox Data Sync Setup Steps for Java (5 Minute tutorial)

Note: ObjectBox Data Sync always includes MongoDB Connector

1Register for Trial

Go to https://objectbox.io/user-portal/
Register for ObjectBox Sync trial

2Pull Docker Image

docker pull objectboxio/sync-server-trial

3Update Build Configuration

Go to your gradle build file and make this change:

// Change from: apply plugin: 'io.objectbox' // To: apply plugin: 'io.objectbox.sync'

4Add @Sync to Entities

Add the import and annotation to each entity you want to sync:

import io.objectbox.annotation.Sync; // Add this import @Sync // Add this annotation @Entity public class YourEntity { // Your entity fields (no relationships to non-synced entities) }

5Generate Data Model

Build project → find gradle-support/objectbox-models/default.json
Copy to project root as objectbox-model.json

6Start Sync Server

Windows:

docker run --rm -it --volume "%cd%:/data" --publish 127.0.0.1:9999:9999 --publish 127.0.0.1:9980:9980 objectboxio/sync-server-trial --model /data/objectbox-model.json --unsecured-no-authentication --admin-bind 0.0.0.0:9980

Linux/Mac:

docker run --rm -it --volume "$(pwd):/data" --publish 127.0.0.1:9999:9999 --publish 127.0.0.1:9980:9980 objectboxio/sync-server-trial --model /data/objectbox-model.json --unsecured-no-authentication --admin-bind 0.0.0.0:9980

7Add Sync Client to App

Add these imports to your main application class:

import io.objectbox.sync.Sync; import io.objectbox.sync.SyncClient; import io.objectbox.sync.SyncCredentials;

Add this code after creating your ObjectBox store:

if (Sync.isAvailable()) { SyncClient syncClient = Sync.client(store, "ws://127.0.0.1:9999", SyncCredentials.none()).build(); syncClient.start(); logger.info("Sync client started"); }

 Note: Important: Never close ObjectBox store while sync is active (generally, there is rarely ever need to close the store, so if you feel you need to, be very careful with this)
Only sync entities that don't have relationships to non-synced entities
Vector embeddings are not yet syncable (reach out to us if you need this!)
Keep the store open throughout application lifecycle
To test, run app with different database paths and add data in one instance, verify it syncs to the other
 

ObjectBox Data Sync Server & MongoDB Connector Updates

by Vivien | Jun 2, 2025 | Data Sync, Mobile Database, ObjectBox, Release

We’re excited to announce the latest updates to ObjectBox Sync Server with our recent 2025-06-02 and 2025-05-27 releases. These updates bring significant improvements to data handling, authentication, and user interface, making your data synchronization experience even smoother.

Powering Up Your Data Flow

Exciting news for developers! Starting from late May 2025, ObjectBox Sync Server trials are publicly available as Docker images on Docker Hub. This means you can now effortlessly pull our Sync Server trial directly with a simple command:

1	docker pull objectboxio/sync-server-trial

This provides a straightforward, no-fuss way to start testing the Sync Server with your data. Each trial gives you 30 days per dataset to explore the full spectrum of ObjectBox Sync capabilities, allowing you to experience its power and ease of use firsthand.

New “JSON to Native” External Property Type

Managing complex, nested JSON structures and mapping them to native database objects can be cumbersome and lead to data integrity issues. One of the most powerful additions in the 2025-05-27 release is the new “JSON to native” property type mapping. This feature allows you to convert strings to nested documents in MongoDB, providing a more elegant way to handle complex data structures. Note: This feature requires client version 4.3 or newer to function correctly.

Here’s how you can implement it in your applications:

// Java

@ExternalType(ExternalPropertyType.JSON_TO_NATIVE)

private String myNestedDocumentJson;

// Kotlin

@ExternalType(ExternalPropertyType.JSON_TO_NATIVE)

var name: String? = myNestedDocumentJson

// Dart

@ExternalType(type: ExternalPropertyType.jsonToNative)

String? myNestedDocumentJson;

// Swift

// objectbox: externalType="jsonToNative"

var myNestedDocumentJson: String?

Key Advantages of “JSON to Native”

You can use your preferred JSON API to access the data
It supports nested documents and arrays
The order of keys is preserved, unlike with flex properties

Increased Maximum Sync Message Size

We’ve increased the maximum Sync message size to 32 MB, allowing for larger data transfers between clients and the server. This improvement is particularly useful for applications that need to synchronize larger chunks of data or complex documents. Clients version 4.3.0 and above are required.

Enhanced JWT Authentication

JWT authentication has been improved with more flexible options for public key configurations. Public key URLs can now refer directly to PEM public key or X509 certificate files, in addition to the previously supported JSON formats.

This means you can now use the following formats for your public key URL:

Key-value JSON file
JWKS (JSON Web Key Set)
PEM public keyfile
PEM certificate file

This enhancement provides more flexibility when integrating ObjectBox Sync Server with various authentication providers and existing security infrastructures..

Admin UI Improvements

The 2025-06-02 release includes several user experience improvements to the Admin UI:

Resolved issues on the GraphQL page for more reliable interactions
Enhanced menu UI with improved icons and optimized padding for better visual clarity and navigation

Getting Started with the ObjectBox Sync Server Trial (including the MongoDB Connector)

If you haven’t tried ObjectBox Sync Server yet, now is a great time to start! With our publicly available Docker images, you can quickly set up and start testing (just ensure Docker is installed on your system):

docker run --rm -it \

--volume "$(pwd):/data" \

--publish 127.0.0.1:9999:9999 \

--publish 127.0.0.1:9980:9980 \

--user $UID \

objectboxio/sync-server-trial \

--model /data/objectbox-model.json \

--unsecured-no-authentication \

--admin-bind 0.0.0.0:9980

Note: this assumes you already have an existing data model (objectbox-model.json) ready. If you don’t, you can use the existing ObjectBox Sync Examples repository for a quick start.
Then, access the Admin UI by opening your web browser and navigate to http://127.0.0.1:9980
Follow the on-screen instructions in the Admin UI to activate your 30-day trial per dataset.

Or just go here to register and follow the step by step guide to get syncing now.
If you are using Java, you can also follow this 7 easy steps to sstart syncing your data in minutes.

What’s Next?

We’re continuously working to improve ObjectBox Sync to make your data synchronization experience seamless and robust. Stay tuned for more updates and improvements in the coming months!

For detailed information about these features, please refer to our documentation:

Data Sync Alternatives: Offline vs. Online Solutions

by Anastasia | Feb 5, 2025 | Data Sync, Edge Database, Mobile Database, ObjectBox, Open Source, SQlite

Ever waited to order or pay with a waiter holding their ordering device in the air for a signal? These moments show why offline-first Data Sync is essential. With more and more services relying on the availability of on-device apps and the IoT market projected to hit $1.1 trillion by 2026, choosing the right solution – particularly online-only or offline-first data sync – is more crucial than ever. In this blog, we discuss their differences and highlight common Data Sync alternatives.

What is Data Sync?

Data synchronization (Sync) aligns data between two or more devices to maintain consistency over time. It is an essential component in applications ranging from IoT and mobile apps to cloud computing. Challenges in data synchronization include asynchrony, conflicts, and managing data across flaky networks.

Data Sync vs. Data Replication

Data Synchronization is often confused with Data Replication. Nevertheless, they serve different purposes:

Data Replication: A unidirectional process (works in one direction only) that duplicates data across storage locations to ensure availability and prevent loss. It is simple but limited in its application, and efficiency, and lacks conflict management.
Data Synchronization: A bidirectional process that harmonizes all or a subset of data between two or more devices. It ensures consistency across devices and entails conflict resolution. It is inherently more complex but also more flexible.

Online vs Offline Solutions: Why Offline Sync Matters

Online-only synchronization solutions rely entirely on cloud infrastructure, requiring a stable internet connection to function. While these tools offer simplicity and scalability, their dependency on constant cloud connectivity brings limitations: Online Data Sync solutions cannot guarantee response rates and their speed varies depending on the network. They do not work when offline or in on-premise settings. Using an Online Sync solution often entails sharing the data and might not comply with data privacy requirements. So, do read the terms and conditions.

Offline-first solutions (offline Sync) focus on local data storage and processing, ensuring the app remains fully functional even without an internet connection. When a network is available, the app synchronizes seamlessly with a server, the cloud, or other devices as needed. These solutions are ideal for on-premise scenarios with unreliable or no internet access, mission-critical applications that must always operate, real-time and high-performance use cases, as well as situations requiring high data privacy and data security compliance.

A less discussed, but in our view also relevant point, is sustainability. While there might be exceptions depending on the use case, for most applications offline-first solutions are more resourceful and therefore more sustainable. If CO2 footprint or battery usage is of concern to you, you might want to look into offline-first Data Sync alternatives.

Now, let’s have a look at current options:

Data Sync Alternatives

(If you are on mobile, click here for a view that’s optimized for mobile)

Solution

Company

Type

Offline Support

Self-hosted Sync

Decentralized Sync

Database

Type of DB

OS/Platforms

Languages

Open-Source Component

License

Other Considerations

Country

Firebase

Google
(Firebase was acquired by Google in 2014)

Online

Local cache only, no persistence, syncs when online

❌

Cloud: Firebase Realtime Database; Edge: Only caching, no DB (called Firestore)

Document store

iOS, Android, Web

Java
JavaScript
Objective-C
Swift
Kotlin
C++
Dart
C#
Python, Go, Node.js

❌

proprietory

Tied to Google Cloud, requires internet connectivity

🇺🇸

Supabase

Online

Limited

✅

❌

Cloud DB: PostgreSQL

Relational document store

Primarily a cloud solution

JavaScript/TypeScript
Flutter/Dart
C#
Swift
Kotlin
Python

✅

Apache License 2.0

Supabase is mainly designed as a SaaS, for use cases with constant connectivity

🇸🇬

ObjectBox Sync

ObjectBox

Offline-first

✅

In development

ObjectBox

Object-oriented embedded NoSQL DB

Android, Linux, Ubuntu,
Windows,
macOS, iOS,
QNX, Raspbian,
any POSIX system really,
any cloud (e.g. AWS/Azure/Google Cloud),
bare metal

C
C++
Java
Kotlin
Swift
Go
Flutter / Dart
Python

✅

DB: Open source bindings, Apache 2.0, closed core

Highly efficient (saves CPU, Memory, battery, and bandwidth); fully offline-first, supports on-premise settings, 100% cloud optional

🇩🇪

Couchbase (Lite + Couchbase Sync Gateway)

Couchbase (a merger of Couch One and Membase)

Online

✅

The CE Sync is a bare minimum and typically not usable; Self-hosted Sync with Couchbase Servers is available as part of their Enterprise offering

✅ as part of the Enterprise offering; gets expensive quickly

Edge: Couchbase Lite; Server: Couchbase

Multi-model NoSQL document-oriented database

Couchbase Lite: iOS, Android, macOS, Linux, Windows, Raspbian and Raspberry Pi OS

Couchbase Sync Gateway: Red Hat Enterprise Linux (RHEL) 9.x, Alma Linux 9.x, Rocky Linux 9.x, Ubuntu, Debian (11.x, 12.x), Windows Server 2022

.Net
C
Go
Java
JavaScript info
Kotlin
PHP
Python
Ruby
Scala

✅

Couchbase Lite is available under different licenses; the open source Community Edition does not get regular updates and misses many features especially around Sync (e.g. it does not include Delta Sync making it slow and expensive)

Typically requires Couchbase servers, quickly gets expensive

🇺🇸

MongoDB Realm + Atlas Device Sync

MongoDB
(Realm was acquired by MongoDB in 2019)

Offline-First

✅

Cloud-based sync only

❌

Cloud: MongoDB, Edge: Mongo Realm DB

MongoDB: NoSQL document store; RealmDB: Embedded NoSQL DB

MongoDB: Linux
OS X
Solaris
Windows
Mongo Realm DB:
Android, iOS

more than 20 languages, e.g. Java, C, C#, C++

✅

MongoDB changed its license from open source (AGPL) to MongoDB Inc.’s Server Side Public License (SSPL) in 2018. RealmDB is open source under the Apache 2.0 License. The Data Sync was proprietary.

Deprecated (in Sep 2024); End-of-life in Sep 2025; ObjectBox offers a migration option

🇺🇸

While SQLite does not offer a sync solution out-of-the-box, various vendors have built something on top, or integrated with SQLite giving them offline persistence.

Key Considerations for Choosing a Data Sync Solution

When selecting a synchronization solution, consider:

Connectivity Requirements: Will the application function in offline environments; how will it work with flaky network conditions; how is the user experience when there is intermittent connectivity?
Data Privacy & Security: How critical is it to ensure sensitive data remains local? Data compliance? How important is it that data is not breached?
Scalability and Performance: What are the expected data loads and network constraints? How important is speed for the users? Is there any need to guarantee QoS parameters? How much will the cloud and networking costs be?
Conflict Resolution: How does the solution handle data conflicts?
Delta Sync: Does the solution always synchronize all data or only changes (data delta)? Can a subset of data be synchronized? How efficient is the Sync protocol (affecting costs and speed)?

The Shift Towards Edge Computing

The trend toward Edge Computing highlights the growing preference for offline-first solutions. By processing and storing data closer to its source, Edge Computing reduces cloud dependency, enhances privacy, and improves efficiency. Data synchronization plays an important role in this shift, ensuring seamless operation across decentralized networks.

Offline and online synchronization solutions each have their merits, but the rise of edge computing and data privacy concerns has propelled offline Sync to the forefront. Developers must assess their application’s unique requirements to select the most appropriate synchronization method. As the industry evolves, hybrid and offline-first solutions are going to dominate, offering the best balance of functionality, privacy, and performance.

First on-device Vector Database (aka Semantic Index) for iOS

by Uwe | Jul 24, 2024 | AI, Edge AI, Edge Database, Mobile Database, ObjectBox, Swift, vector database

Easily empower your iOS and macOS apps with fast, private, and sustainable AI features. All you need is a Small Language Model (SLM; aka “small LLM”) and ObjectBox – our on-device vector database built for Swift apps. This gives you a local semantic index for fast on-device AI features like RAG or GenAI that run without an internet connection and keep data private.

The recently demonstrated “Apple Intelligence” features are precisely that: a combination of on-device AI models and a vector database (semantic index). Now, ObjectBox Swift enables you to add the same kind of AI features easily and quickly to your iOS apps right now.

Not developing with Swift? We also have a Flutter / Dart binding (works on iOS, Android, desktop), a Java / Kotlin binding (works on Android and JVM), or one in C++ for embedded devices.

Enabling Advanced AI Anywhere, Anytime

Typical AI apps use data (e.g. user-specific data, or company-specific data) and multiple queries to enhance and personalize the quality of the model’s response and perform complex tasks. And now, for the very first time, with the release of ObjectBox 4.0, this will be possible locally on restricted devices.

Local AI Tech Stack Example for on-device RAG

Swift on-device Vector Database and search for iOS and MacOS

With the ObjectBox Swift 4.0 release, it is possible to create a scalable vector index on floating point vector properties. It’s a very special index that uses an algorithm called HNSW. It’s scalable because it can find relevant data within millions of entries in a matter of milliseconds.
Let’s pick up the cities example from our vector search documentation. Here, we use cities with a location vector and want to find the closest cities (a proximity search). The Swift class for the City entity shows how to define an HNSW index on the location:

// objectbox: entity

class City {

var id: Id = 0

var name: String?

// objectbox:hnswIndex: dimensions=2

var location: [Float]?

}

</code>

Inserting City objects with a float vector and HNSW index works as usual, the indexing happens behind the scenes:

let box: Box<city> = store.box()

try box.put([

City("Barcelona", [41.385063, 2.173404]),

City("Nairobi", [-1.292066, 36.821945]),

City("Salzburg", [47.809490, 13.055010]),

])

</city></code>

To then find cities closest to a location, we do a nearest neighbor search using the new query condition and “find with scores” methods. The nearest neighbor condition accepts a query vector, e.g. the coordinates of Madrid, and a count to limit the number of results of the nearest neighbor search, here we want at max 2 cities. The find with score methods are like a regular find, but in addition return a score. This score is the distance of each result to the query vector. In our case, it is the distance of each city to Madrid.

let madrid = [40.416775, -3.703790] // query vector

// Prepare a Query object to search for the 2 closest neighbors:

let query = try box

.query { City.location.nearestNeighbors(queryVector: madrid, maxCount: 2) }

.build()

let results = try query.findWithScores()

for result in results {

print("City: \(result.object.name), distance: \(result.score)")

}

The ObjectBox on-device vector database empowers AI models to seamlessly interact with user-specific data — like texts and images — directly on the device, without relying on an internet connection. With ObjectBox, data never needs to leave the device, ensuring data privacy.

Thus, it’s the perfect solution for developers looking to create smarter apps that are efficient and reliable in any environment. It enhances everything from personalized banking apps to robust automotive systems.

ObjectBox: Optimized for Resource Efficiency

At ObjectBox, we specialize on efficiency that comes from optimized code. Our hearts beat for creating highly efficient and capable software that outperforms alternatives on small and big hardware. ObjectBox maximizes speed while minimizing resource use, extending battery life, and reducing CO₂ emissions.

With this expertise, we took a unique approach to vector search. The result is not only a vector database that runs efficiently on constrained devices but also one that outperforms server-side vector databases (see first benchmark results; on-device benchmarks coming soon). We believe this is a significant achievement, especially considering that ObjectBox still upholds full ACID properties (guaranteeing data integrity).

Cloud/server vector databases vs. On-device/Edge vector databases

Also, keep in mind that ObjectBox is a fully capable database. It allows you to store complex data objects along with vectors. Thus, you have the full feature set of a database at hand. It empowers hybrid search, traceability, and powerful queries.

Use Cases / App ideas

ObjectBox can be used for a million different things, from empowering generative AI features in mobile apps to predictive maintenance on ECUs in cars to AI-enhanced games. For iOS apps, we expect to see the following on-device AI use cases very soon:

Across all categories we’ll see Chat-with-files apps:
- Travel: Imagine chatting to your favorite travel guide offline, anytime, anywhere. No need to carry bulky paper books, or scroll through a long PDF on your mobile.
- Research: Picture yourself chatting with all the research papers in your field. Easily compare studies and findings, and quickly locate original quotes.

Chat-with-files apps (across verticals & categories)

Travel: Imagine chatting to your favorite travel guide offline, anytime, anywhere. No need to carry bulky paper books, or scroll through a long PDF on your mobile
Research: Picture yourself chatting with all the research papers in your field. Easily compare studies and findings, and quickly locate original quotes.
Education: Educational apps featuring “chat-with-your-files” functionality for learning materials and research papers. But going beyond that, they generate quizzes and practice questions to help people solidify knowledge.

Lifestyle – from Coaching to Health

- Health: Apps offering personalized recommendations based on scientific research, your preferences, habits, and individual health data. This includes data tracked from your device, lab results, and doctoral diagnosis.

Productivity: Personal assistants for all areas of life

- Family Management: Interact with assistants tailored to specific roles. Imagine a parent’s assistant that monitors school channels, chat groups, emails, and calendars. Its goal is to automatically add events like school plays, remind you about forgotten gym bags, and even suggest birthday gifts for your child’s friends.

- Professional Assistants: Imagine being a busy sales rep on the go, juggling appointments and travel. A powerful on-device sales assistant can do more than just automation. It can prepare contextual and personalized follow-ups instantly. For example, by summarizing talking points, attaching relevant company documents, and even suggesting who to CC in your emails.

Run the local AI Stack with a Language Model (SLM, LLM)

Recent Small Language Models (SMLs) already demonstrate impressive capabilities while being small enough to run on e.g. mobile phones. To run the model on-device of an iPhone or a macOS computer, you need a model runtime. On Apple Silicone the best choice in terms of performance typically MLX – a framework brought to you by Apple machine learning research. It supports the hardware very efficiently by supporting CPU/GPU and unified memory.

To summarize, you need these three components to run on-device AI with an semantic index:

- ObjectBox: vector database for the semantic index
- Models: choose an embedding model and a language model to match your requirements
- MLX as the model runtime

Start building next generation on-device AI apps today! Head over to our vector search documentation and Swift documentation for details.

The on-device Vector Database for Android and Java

by Markus | May 29, 2024 | AI, Android, Benchmarks, Edge AI, Edge Database, Mobile Database, ObjectBox

ObjectBox 4.0 is the very first on-device, local vector database for Android and Java developers to enhance their apps with local AI capabilities (Edge AI). A vector database facilitates advanced vector data processing and analysis, such as measuring semantic similarities across different document types like images, audio files, and texts. A classic use case would be to enhance a Large Language Model (LLM), or a Small Language Model (SLM, like e.g. the Phi-3), with your domain expertise, your proprietary knowledge, and / or your private data. Combining the power of AI models with a specific knowledge base empowers high-quality, perfectly matching results a generic model simply cannot provide. This is called “retrieval-augmented generation” (RAG). Because ObjectBox works on-device, you can now do on-device RAG with data that never leaves the device and therefore stays 100% private. This is your chance to explore this technology on-device.

Vector Search (Similarity Search)

With this release, it is possible to create a scalable vector index on floating point vector properties. It’s a very special index that uses an algorithm called HNSW (Hierarchical Navigable Small World). It’s scalable because it can find relevant data within millions of entries in a matter of milliseconds.

We pick up the example used in our vector search documentation. In short, we use cities with a location vector to perform proximity search. Here is the City entity and how to define a HNSW index on the location:

@Entity

data class City(

@Id var id: Long = 0,

var name: String? = null,

@HnswIndex(dimensions = 2) var location: FloatArray? = null

)

Vector objects are inserted as usual (the indexing is done automatically behind the scenes):

val box = store.boxFor(City::class)

box.put(

City(name = "Barcelona", location = floatArrayOf(41.385063f, 2.173404f)),

City(name = "Nairobi", location = floatArrayOf(-1.292066f, 36.821945f)),

City(name = "Salzburg", location = floatArrayOf(47.809490f, 13.055010f))

)

To perform a nearest neighbor search, use the new nearestNeighbors(queryVector, maxResultCount) query condition and the new “find with scores” query methods (the score is the distance to the query vector). For example, let’s find the 2 closest cities to Madrid:

val madrid = floatArrayOf(40.416775f, -3.703790f)

val query = box

.query(City_.location.nearestNeighbors(madrid, 2))

.build()

val results = query.findWithScores()

for (result in results) {

println("City: ${result.get().id}, distance: ${result.score}")

}

Vector Embeddings

In the cities example above, the vectors were straight forward: they represent latitude and longitude. Maybe you already have vector data as part of your data. But often, you don’t. So where do you get the vector emebeddings of texts, images, video, audio files from?

For most AI applications, vectors are created by a embedding model. There are plenty of embedding models to choose from, but first you have to decide if it should run in the cloud or locally. Online embeddings are the easier way to get started and great for first testing; you can set up an account at your favorite AI provider and create embeddings online (only).

Depending on how much you care about privacy, you can also run embedding models locally and create your embeddings on your own device. There are a couple of choices for desktop / server hardware, e.g. check these on-device embedding models. For Android, MediaPipe is a good start as it has embedders for text and images.

Updated open source benchmarks 2024 (CRUD)

A new release is also a good occasion to update our open source benchmarks. The Android performance benchmark app provides many more options, but here are the key results:

CRUD is short for the basic operations a database does: create, read, update and delete. It’s an important metric for the general efficiency of a database.

Disclaimer 1: our focus is the “Object” performance (you may find a hint for that in our product name 🙂); so e.g. relational systems may perform a bit better when you directly work with raw columns and rows.

Disclaimer 2: ObjectBox delete performance was cut off at 800k per second to keep the Y axis within reasonable bounds. The actually measured value was 2.5M deletes per second.

Disclaimer 3: there cannot be enough disclaimers on any performance benchmark. It’s a complex topic where details matter. It’s best if you make your own picture for your own use case. We try to give a fair “arena” with our open source benchmarks, so it could be a starting point for you.

Feedback and Outlook: On-device vector search Benchmarks for Android coming soon

We’re still working on a lot of stuff (as always ;)) and with on-device / local vector search being a completely new technology for Android, we need your feedback, creativity and support more than ever. We’ll also soon release benchmarks on the vector search. Follow us on LinkedIn, GitHub, or Twitter to keep up-to-date.

The first On-Device Vector Database: ObjectBox 4.0

by Markus | May 16, 2024 | Edge AI, Edge Computing, Edge Database, Mobile Database, ObjectBox, Open Source

The new on-device vector database enables advanced AI applications on small restricted devices like mobile phones, Raspberry Pis, medical equipment, IoT gadgets and all the smart things around you. It is the missing piece to a fully local AI stack and the key technology to enable AI language models to interact with user specific data like text and images without an Internet connection and cloud services.

An AI Technology Enabler

Recent AI language models (LLMs) demonstrated impressive capabilities while being small enough to run on e.g. mobile phones. Recent examples include Gemma, Phi3 and OpenELM. The next logical step from here is to use these LLMs for advanced AI applications that go beyond a mere chat. A new generation of apps is currently evolving. These apps create “flows” with user specific data and multiple queries to the LLM to perform complex tasks. This is also known as RAG (retrieval augmented generation), which, in its simplest form, allows one to chat with your documents. And now, for the very first time, this will be possible to do locally on restricted devices using a fully fledged embedded database.

What is special about ObjectBox Vector Search?

We know restricted devices. Where others see limitations, we see the potential and we have repeatedly demonstrated creating superefficient software for these. And thus maximizing speed, minimizing resource use, saving battery life and CO2. With this knowledge, we approached vector search in a unique way.

Efficient memory management is the key. The challenge with vector data is that on the one hand, it consumes a lot of memory – while on the other hand, relevant vectors must be present in memory to compute distances between vectors efficiently. For this, we introduced a special multi-layered caching that gives the best performance for the full range of devices; from memory-constrained small devices to large machines that can keep millions of vectors in memory. This worked out so well that we saw ObjectBox outperform several vector databases built for servers (open source benchmarks coming soon). This is no small feat given that ObjectBox still holds up full ACID properties, e.g. caching must be transaction-aware.

Also, keep in mind that ObjectBox is a fully capable database that allows you to store complex data objects along with vectors. From an ObjectBox data model point of view, a vector is “just” another property type. This allows you to store all your data (vectors along with objects) in a single database. This “one database” approach also includes queries. You can already combine vector search with other conditions. Note that some limitations still apply with this initial release. Full hybrid search is close to being finished and will be part of one of the next releases.

In short, the following features make ObjectBox a unique vector database:

Embedded Database that runs inside your application without latency
Vector search based is state-of-the-art HNSW algorithm that scales very well with growing data volume
HNSW is tightly integrated within our internal database. Vector Search doesn’t just run “on top of database persistence”.
With this deep integration we do not need to keep all vectors in memory.
Multi-layered caching: if a vector is not in-memory, ObjectBox fetches it from disk.
Not just a vector database: you can store any data in ObjectBox, not just vectors. You won’t need a second database.

Low minimum hardware requirements: e.g. an old Raspberry Pi comfortably runs ObjectBox smoothly.
Low memory footprint: ObjectBox itself just takes a few MB of memory. The entire binary is only about 3 MB (compressed around 1 MB).
Scales with hardware: efficient resource usage is also an advantage when running on more capable devices like the latest phones, desktops and servers.
ObjectBox additionally offers commercial editions, e.g. a Server Cluster mode, GraphQL, and of course, ObjectBox Sync, our data synchronization solution.

Why is this relevant? AI anywhere & anyplace

With history repeating itself, we think AI is in a “mainframe era” today. Just like clunky computers from decades before, AI is restricted to big and very expensive machines running far away from the user. In the future, AI will become decentralized, shifting to the user and their local devices. To support this shift, we created the ObjectBox vector database. Our vision is a future where AI can assist everyone, anytime, and anywhere, with efficiency, privacy, and sustainability at its core.

What do we launch today?

Today, we are releasing ObjectBox 4.0 with Vector Search for a variety of languages:

Python* : Github, blog post follows today
Android/Java: GitHub
Dart/Flutter: GitHub
C: GitHub

*) We acknowledge Python’s popularity within the AI community and thus have invested significantly in our Python binding over the last months to make it part of this initial release. Since we still want to smooth out some rough edges with Python, we decided to label Python an alpha release. Expect Python to quickly catch up and match the comfort of our more established language bindings soon (e.g. automatic ID and model handling).

Let’s get you started right away? Check our Vector Search documentation to see how to use it!

One more thing: ObjectBox Open Source Database (OSS)

We are also very happy to announce that we will fully open source the core of ObjectBox. As a company we follow the open core model. Since we still have some cleaning up to do, this will happen in one of the next releases, likely 4.1.

“Release week”

With today’s initial releases, we are far from done yet. Starting next Tuesday, you can expect additional announcements from us. Follow us to get the news as soon as it is released.

What’s next?

This is our very first version of a “vector database”. And while we are very happy with this release, there are still so many things to do! For example, we will optimize vector search by adding vector quantization and integrate it more tightly with our data synchronization. We are also focusing on expanding our solution’s reach through strategic partnerships. If you think you are a good fit, let us know. And as always, we are very eager to get some feedback from you! Take care.

« Older Entries