by Vivien | Jun 12, 2025 | Data Sync, Mobile Database, ObjectBox

ObjectBox Data Sync Setup Steps for Java (5 Minute tutorial)

Note: ObjectBox Data Sync always includes the MongoDB Connector.

1. Register for Trial

Go to https://objectbox.io/user-portal/
Register for ObjectBox Sync trial

2. Pull Docker Image

docker pull objectboxio/sync-server-trial

3. Update Build Configuration

Go to your gradle build file and make this change:

// Change from:
// apply plugin: 'io.objectbox'
// To:
apply plugin: 'io.objectbox.sync'

4. Add @Sync to Entities

Add the import and annotation to each entity you want to sync:

import io.objectbox.annotation.Entity;
import io.objectbox.annotation.Sync; // Add this import

@Sync // Add this annotation
@Entity
public class YourEntity {
    // Your entity fields (no relationships to non-synced entities)
}

5. Generate Data Model

Build project → find gradle-support/objectbox-models/default.json
Copy to project root as objectbox-model.json

6. Start Sync Server

Windows:

docker run --rm -it --volume "%cd%:/data" --publish 127.0.0.1:9999:9999 --publish 127.0.0.1:9980:9980 objectboxio/sync-server-trial --model /data/objectbox-model.json --unsecured-no-authentication --admin-bind 0.0.0.0:9980

Linux/Mac:

docker run --rm -it --volume "$(pwd):/data" --publish 127.0.0.1:9999:9999 --publish 127.0.0.1:9980:9980 objectboxio/sync-server-trial --model /data/objectbox-model.json --unsecured-no-authentication --admin-bind 0.0.0.0:9980

7. Add Sync Client to App

Add these imports to your main application class:

import io.objectbox.sync.Sync;
import io.objectbox.sync.SyncClient;
import io.objectbox.sync.SyncCredentials;

Add this code after creating your ObjectBox store:

if (Sync.isAvailable()) {
    SyncClient syncClient = Sync.client(store, "ws://127.0.0.1:9999", SyncCredentials.none()).build();
    syncClient.start();
    logger.info("Sync client started");
}

Notes:

Important: Never close the ObjectBox store while sync is active. There is rarely a need to close the store at all, so if you feel you need to, be very careful.
Only sync entities that don't have relationships to non-synced entities.
Vector embeddings are not yet syncable (reach out to us if you need this!).
Keep the store open throughout the application lifecycle.
To test, run the app with different database paths, add data in one instance, and verify that it syncs to the other.
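Before starting the sync client, it can help to confirm that the Sync Server from step 6 is actually listening on its published port. The following is a minimal sketch using only the JDK; the class name `SyncServerProbe` and the one-second timeout are illustrative assumptions and not part of the ObjectBox API.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of the ObjectBox API): checks whether a TCP
// port accepts connections, e.g. the Sync Server port published in step 6.
final class SyncServerProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean up = isReachable("127.0.0.1", 9999, 1000);
        System.out.println(up
                ? "Sync Server port is reachable"
                : "Sync Server port is not reachable; is the Docker container running?");
    }
}
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the server accepted the data model or that sync is working.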

    

ObjectBox Data Sync Server & MongoDB Connector Updates

by Vivien | Jun 2, 2025 | Data Sync, Mobile Database, ObjectBox, Release

We’re excited to announce the latest updates to ObjectBox Sync Server with our recent 2025-06-02 and 2025-05-27 releases. These updates bring significant improvements to data handling, authentication, and user interface, making your data synchronization experience even smoother.

Powering Up Your Data Flow

Exciting news for developers! Starting from late May 2025, ObjectBox Sync Server trials are publicly available as Docker images on Docker Hub. This means you can now effortlessly pull our Sync Server trial directly with a simple command:

docker pull objectboxio/sync-server-trial

This provides a straightforward, no-fuss way to start testing the Sync Server with your data. Each trial gives you 30 days per dataset to explore the full spectrum of ObjectBox Sync capabilities, allowing you to experience its power and ease of use firsthand.

New “JSON to Native” External Property Type

Managing complex, nested JSON structures and mapping them to native database objects can be cumbersome and lead to data integrity issues. One of the most powerful additions in the 2025-05-27 release is the new “JSON to native” property type mapping. This feature allows you to convert strings to nested documents in MongoDB, providing a more elegant way to handle complex data structures. Note: This feature requires client version 4.3 or newer to function correctly.

Here’s how you can implement it in your applications:

// Java

@ExternalType(ExternalPropertyType.JSON_TO_NATIVE)

private String myNestedDocumentJson;

// Kotlin

@ExternalType(ExternalPropertyType.JSON_TO_NATIVE)

var myNestedDocumentJson: String? = null

// Dart

@ExternalType(type: ExternalPropertyType.jsonToNative)

String? myNestedDocumentJson;

// Swift

// objectbox: externalType="jsonToNative"

var myNestedDocumentJson: String?

Key Advantages of “JSON to Native”

You can use your preferred JSON API to access the data
It supports nested documents and arrays
The order of keys is preserved, unlike with flex properties

Increased Maximum Sync Message Size

We’ve increased the maximum Sync message size to 32 MB, allowing for larger data transfers between clients and the server. This improvement is particularly useful for applications that need to synchronize larger chunks of data or complex documents. Clients version 4.3.0 and above are required.

Enhanced JWT Authentication

JWT authentication has been improved with more flexible options for public key configurations. Public key URLs can now refer directly to PEM public key or X509 certificate files, in addition to the previously supported JSON formats.

This means you can now use the following formats for your public key URL:

Key-value JSON file
JWKS (JSON Web Key Set)
PEM public keyfile
PEM certificate file

This enhancement provides more flexibility when integrating ObjectBox Sync Server with various authentication providers and existing security infrastructures.
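To make the "PEM public key file" format more concrete, here is a small JDK-only sketch that writes an RSA public key as PEM text and reads it back. It only illustrates what such a file contains. The class and method names are hypothetical, RSA is an assumed key type, and how the Sync Server itself fetches and parses keys is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Arrays;
import java.util.Base64;

// Illustrative only: shows the structure of a PEM public key file.
final class PemPublicKeyDemo {

    private static final String HEADER = "-----BEGIN PUBLIC KEY-----";
    private static final String FOOTER = "-----END PUBLIC KEY-----";

    // Encodes a public key as PEM text (Base64 of the X.509 SubjectPublicKeyInfo bytes).
    static String toPem(PublicKey key) {
        byte[] newline = "\n".getBytes(StandardCharsets.US_ASCII);
        String body = Base64.getMimeEncoder(64, newline).encodeToString(key.getEncoded());
        return HEADER + "\n" + body + "\n" + FOOTER + "\n";
    }

    // Parses PEM text back into an RSA public key (RSA is an assumption of this sketch).
    static PublicKey parseRsaPublicKey(String pem) throws Exception {
        String base64 = pem.replace(HEADER, "").replace(FOOTER, "").replaceAll("\\s", "");
        byte[] der = Base64.getDecoder().decode(base64);
        return KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(der));
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        PublicKey original = generator.generateKeyPair().getPublic();

        String pem = toPem(original);
        System.out.println(pem);

        PublicKey parsed = parseRsaPublicKey(pem);
        System.out.println("Round trip ok: "
                + Arrays.equals(original.getEncoded(), parsed.getEncoded()));
    }
}
```

A PEM certificate file looks similar but uses `-----BEGIN CERTIFICATE-----` markers and wraps the public key in an X.509 certificate.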

Admin UI Improvements

The 2025-06-02 release includes several user experience improvements to the Admin UI:

Resolved issues on the GraphQL page for more reliable interactions
Enhanced menu UI with improved icons and optimized padding for better visual clarity and navigation

Getting Started with the ObjectBox Sync Server Trial (including the MongoDB Connector)

If you haven’t tried ObjectBox Sync Server yet, now is a great time to start! With our publicly available Docker images, you can quickly set up and start testing (just ensure Docker is installed on your system):

docker run --rm -it \
--volume "$(pwd):/data" \
--publish 127.0.0.1:9999:9999 \
--publish 127.0.0.1:9980:9980 \
--user $UID \
objectboxio/sync-server-trial \
--model /data/objectbox-model.json \
--unsecured-no-authentication \
--admin-bind 0.0.0.0:9980

Note: this assumes you already have an existing data model (objectbox-model.json) ready. If you don’t, you can use the existing ObjectBox Sync Examples repository for a quick start.
Then, access the Admin UI by opening your web browser and navigating to http://127.0.0.1:9980
Follow the on-screen instructions in the Admin UI to activate your 30-day trial per dataset.

Or just go here to register and follow the step by step guide to get syncing now.
If you are using Java, you can also follow these 7 easy steps to start syncing your data in minutes.

What’s Next?

We’re continuously working to improve ObjectBox Sync to make your data synchronization experience seamless and robust. Stay tuned for more updates and improvements in the coming months!

For detailed information about these features, please refer to our documentation.

Why Edge AI is crucial for retail and POS systems in 2025

by Vivien | Feb 19, 2025 | AI, Edge AI, Edge Computing, Edge Database, IoT, Mobile Database, vector database

In recent years, the retail industry’s growth has been modest, with annual rates ranging from 1.5% to 3.5% depending on the sector. Competition and rising consumer expectations for seamless omnichannel experiences have squeezed profit margins. With AI advancing so rapidly, there’s a great opportunity to embrace innovative solutions that boost efficiency and help create new revenue streams. Accordingly, IDC (2025) expects that by 2026, 90% of retail tools will embed AI algorithms. Furthermore, by 2027, over 45% of major retailers will apply Edge AI for faster decision-making and store-specific assortment planning, selection, allocation, and replenishment. Let’s have a closer look at how retailers can leverage Edge AI no matter their size and budgets.

Defining Edge AI in Retail Contexts

Edge AI refers to decentralized artificial intelligence systems that process data locally on in-store devices, e.g. POS terminals, smart shelves, Raspberry Pis, mobile phones, or cameras, rather than relying on distant cloud servers. Because this architecture does not depend on internet connectivity, it works offline and with minimal latency. Both offline capability and speed are critical for applications like fraud detection and checkout automation. Accordingly, IDC emphasizes that 45% of retailers now prioritize “near-the-network” edge deployments, where AI models run locally on in-store servers or IoT devices, balancing cost and performance.

Key Components of Edge AI Systems

For Edge AI to deliver real-time, offline-capable intelligence, its architecture must integrate on-device databases, local processing, and efficient data synchronization. These three pillars ensure seamless AI-powered retail operations without dependence on the cloud, minimizing latency, costs, and privacy concerns.

Figure: Edge AI system architecture in retail, integrating local processing, real-time data sync, and various applications like POS or signage

1. Local Data Collection, Sync, and Storage

Retail generates vast real-time data from IoT sensors, POS transactions, smart cameras, and RFID tags. To ensure instant processing and uninterrupted availability you need:

On-device data storage: All kinds of devices from IoT sensors to cameras capture data. Depending on the device capabilities, with small on-device databases, data can be stored and used directly on the devices.
Local central server: A centralized on-premise device (e.g. a PC, a Raspberry Pi, or more capable hardware) ensures operations continue even if individual devices are resource-limited or offline.
Bi-directional on-premise data sync: Local syncing between devices and with a central on-site server ensures better decisions and fail-safe operations. It keeps all devices up-to-date without internet dependence.

2. Local Data Processing & Real-Time AI Decision-Making

Processing data where it is generated is critical for speed, privacy, and resilience:

On-device AI models: Small, quantized AI models (SLMs) like Microsoft’s Phi-3-mini (3.8B parameters, <2GB memory footprint) can run directly on many devices (e.g. tablets and POS systems), enabling real-time fraud detection, checkout automation, and personalized recommendations.
Local on-premise AI models: Larger SLMs or LLMs run on the more capable in-store hardware for security, demand forecasting, or store optimization.
On-device & on-premise vector databases: AI models leverage on-device vector databases to structure and index data for real-time AI-driven insights (e.g., fraud detection, smart inventory management), fast similarity searches, and real-time decision-making.
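The "<2GB" figure for a 3.8B-parameter model follows from simple arithmetic once the weights are quantized. The sketch below assumes 4-bit weights (0.5 bytes per parameter), which is an assumption for illustration; it counts only the weights and ignores activations, caches, and runtime overhead. The class name is hypothetical.

```java
// Back-of-the-envelope estimate of model weight size. Real memory use is
// higher because activations, caches, and runtime overhead are ignored.
final class ModelFootprint {

    // Returns the size of the weights in gigabytes (1 GB = 1e9 bytes).
    static double weightsInGb(double parameters, int bitsPerWeight) {
        double bytes = parameters * bitsPerWeight / 8.0;
        return bytes / 1e9;
    }

    public static void main(String[] args) {
        double phi3MiniParams = 3.8e9; // parameter count quoted in the text

        // 16-bit weights: 3.8e9 * 2 bytes = 7.6e9 bytes
        System.out.printf("16-bit: %.1f GB%n", weightsInGb(phi3MiniParams, 16));

        // 4-bit weights (assumed quantization): 3.8e9 * 0.5 bytes = 1.9e9 bytes
        System.out.printf(" 4-bit: %.1f GB%n", weightsInGb(phi3MiniParams, 4));
    }
}
```

Under these assumptions, quantization is what moves such a model from roughly 7.6 GB of weights to under 2 GB, within reach of tablets and POS hardware.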

3. Hybrid Data Sync: Local First, Selective Cloud Sync

Selective Cloud Sync: Bi-directional cloud data sync extends the on-premise data sync. Selected data, such as aggregated insights (e.g., sales trends, shrinkage patterns), payment processing, and selected learnings, is synced with the cloud to enable enterprise-wide analytics and compliance, remote monitoring and additional backup, and optimized centralized decision-making.
Cloud Database & Backend Infrastructure: A cloud-based database acts as the global repository. It integrates data from multiple locations to store aggregated insights & long-term trends for AI model refinement and enterprise reporting, facilitating cross-location comparisons.
Centralized cloud AI model: A centralized cloud AI model is optional for larger setups. It can be used to continuously learn from local insights, refining AI recommendations and operational efficiencies across all connected stores.

Use Cases of Edge AI for Retailers

Edge AI is unlocking new efficiencies for retailers by enabling real-time, offline-capable intelligence across customer engagement, marketing, in-store operations, and supply chains.

Key applications of Edge AI in retail, driving personalization, operational efficiency, and smarter decision-making.

Enhancing Customer Experiences in Retail Stores with Edge AI – Examples

Edge AI transforms the shopping experience, enabling retailers to offer more streamlined and more personalized services based on real-time data, thereby boosting customer satisfaction and sales. Key benefits include:

Realtime Product Recommendations: Using cognitive neural networks, retailers can respond instantly to a customer’s actions, such as browsing and purchasing, to recommend products that align with their preferences. An Accenture study found that 75% of consumers wish they could identify options that meet their needs more quickly and easily.
In-store experience: AI tracks customer movement and analyses purchase patterns, optimizing store layout and product placement. A large global furniture retailer’s in-store analytics led to a more than 10 percent rise in in-store traffic and high sales growth within a month.
Contactless Checkout: AI-driven self-checkouts let cameras capture the products customers select, bypassing the need to scan product codes and streamlining both standard and automated checkout processes. For example, Amazon’s Just Walk Out technology allows customers to pick up items and leave the store without traditional checkout, enhancing convenience and reducing wait times.
Real-Time Inventory Tracking: Smart shelves monitor inventory levels in real time, triggering automatic reorders and preventing stockouts. For example, a study proposed a smart shelf design capable of detecting the location and weight of items, ensuring accurate inventory counts and timely replenishment.

Retail operational excellence and cost optimization with Edge AI – Examples

Edge AI also significantly enhances operational efficiency, especially in-store, reduces losses, and helps lower costs (while also improving sustainability):

Supply Chain Management: Edge AI enhances supply chain operations by decentralizing data processing, enabling real-time analysis and faster decision-making. This leads to optimized inventory levels, more accurate demand forecasting, and reduced operational costs. For example, Walmart’s pioneering use of GenAI in supply chain management has driven a 100x productivity boost, enabling more accurate demand forecasting, optimized inventory, and reduced waste. As reported in its Q2 2025 earnings call, these improvements trimmed operational costs by 20%.
Loss Prevention: Retail shrink, exacerbated by inflation-driven shoplifting and self-checkout vulnerabilities, costs the industry over $100 billion annually. Advanced sensors and intelligent cameras combined with Edge AI help detect early signs of theft or fraud, allowing security measures to intervene promptly and independently of an internet connection.
Waste Reduction: Grocery chains like Tesco use Edge AI to analyze the expiry dates of goods and ripeness of produce, dynamically pricing items nearing expiration. This approach can reduce food waste by up to 40%. Food waste is a huge social, economic, and environmental challenge: if it were a country, it would be the world’s third-largest emitter of greenhouse gases. Edge AI in retail could play a pivotal role in food waste avoidance.
Energy Savings: Smart sensors and Edge AI can be used to optimize the use of energy for lighting, heating, ventilation, water use, etc. For example, 45 Broadway, a 32-story office building in Manhattan, implemented an AI system that analyzes real-time data, including temperature, humidity, sun angle, and occupancy patterns, to proactively adjust HVAC settings. This integration led to a 15.8% reduction in HVAC-related energy consumption, saving over $42,000 annually and reducing carbon emissions by 37 metric tons in just 11 months.

Conclusion: Edge AI as Retail’s Strategic Imperative

Edge AI is a true game-changer for retailers in 2025. Faced with rising costs and fierce competition, stores need faster insights and better local experiences to stand out. Therefore, according to IDC, 90% of retail tools will embed AI by 2026, with edge solutions expected to help 45% of retailers optimize local assortments. Meanwhile, according to McKinsey, 44% of retailers that have implemented AI already reduced operational costs, while the majority have seen increases in revenue.

Yet, Edge AI isn’t just about running AI models locally. It’s about creating an autonomous, resilient system where on-device vector databases, local processing, and hybrid data sync work together. This combination enables real-time retail intelligence while keeping costs low, data private, and operations uninterrupted. To stay ahead, businesses should invest in edge-ready infrastructure with on-device vector databases and data sync that works on-premise at their core. Those who hesitate risk losing ground to nimble competitors who have already tapped into real-time, in-store intelligence.

Hybrid systems, combining lightning-fast offline-first edge response times with the power of the cloud, are becoming the norm. IDC projects that 78% of retailers will adopt these setups by 2026, saving an average of $3.6 million per store annually. In an inflation-driven market, Edge AI isn’t just a perk – it’s a critical strategy for thriving in the future of retail. By leveraging Edge AI-powered on-device databases, retailers gain the speed, efficiency, and reliability needed to stay competitive in an AI-driven retail landscape.

Data Sync Alternatives: Offline vs. Online Solutions

by Anastasia | Feb 5, 2025 | Data Sync, Edge Database, Mobile Database, ObjectBox, Open Source, SQlite

Ever waited to order or pay with a waiter holding their ordering device in the air for a signal? These moments show why offline-first Data Sync is essential. With more and more services relying on the availability of on-device apps and the IoT market projected to hit $1.1 trillion by 2026, choosing the right solution – particularly online-only or offline-first data sync – is more crucial than ever. In this blog, we discuss their differences and highlight common Data Sync alternatives.

What is Data Sync?

Data synchronization (Sync) aligns data between two or more devices to maintain consistency over time. It is an essential component in applications ranging from IoT and mobile apps to cloud computing. Challenges in data synchronization include asynchrony, conflicts, and managing data across flaky networks.

Data Sync vs. Data Replication

Data Synchronization is often confused with Data Replication. Nevertheless, they serve different purposes:

Data Replication: A unidirectional process (works in one direction only) that duplicates data across storage locations to ensure availability and prevent loss. It is simple but limited in its applications and efficiency, and it lacks conflict management.
Data Synchronization: A bidirectional process that harmonizes all or a subset of data between two or more devices. It ensures consistency across devices and entails conflict resolution. It is inherently more complex but also more flexible.
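The difference can be sketched with two in-memory maps. This is a toy model, not a real sync protocol: the class name is hypothetical, the timestamps are made-up logical clocks, and "last write wins" is an assumed conflict policy chosen for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model contrasting one-way replication with two-way synchronization.
final class ReplicationVsSync {

    // A value plus the logical time of its last write.
    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Replication: one direction only; the target becomes a copy of the source.
    static void replicate(Map<String, Versioned> source, Map<String, Versioned> target) {
        target.clear();
        target.putAll(source);
    }

    // Synchronization: both directions; on conflict the newer write wins.
    static void synchronize(Map<String, Versioned> a, Map<String, Versioned> b) {
        Map<String, Versioned> merged = new HashMap<>(a);
        for (Map.Entry<String, Versioned> entry : b.entrySet()) {
            merged.merge(entry.getKey(), entry.getValue(),
                    (left, right) -> right.timestamp > left.timestamp ? right : left);
        }
        a.clear();
        a.putAll(merged);
        b.clear();
        b.putAll(merged);
    }

    public static void main(String[] args) {
        Map<String, Versioned> phone = new HashMap<>();
        Map<String, Versioned> tablet = new HashMap<>();
        phone.put("order-1", new Versioned("2x espresso", 10));
        tablet.put("order-1", new Versioned("3x espresso", 12)); // newer edit of the same record
        tablet.put("order-2", new Versioned("1x tea", 11));      // exists only on the tablet

        synchronize(phone, tablet);
        System.out.println(phone.size());               // 2
        System.out.println(phone.get("order-1").value); // 3x espresso
    }
}
```

With `replicate`, the tablet's newer edit would simply be overwritten or never reach the phone, depending on the direction; `synchronize` has to decide between the two versions, which is where the extra complexity comes from.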

Online vs Offline Solutions: Why Offline Sync Matters

Online-only synchronization solutions rely entirely on cloud infrastructure, requiring a stable internet connection to function. While these tools offer simplicity and scalability, their dependency on constant cloud connectivity brings limitations: Online Data Sync solutions cannot guarantee response rates and their speed varies depending on the network. They do not work when offline or in on-premise settings. Using an Online Sync solution often entails sharing the data and might not comply with data privacy requirements. So, do read the terms and conditions.

Offline-first solutions (offline Sync) focus on local data storage and processing, ensuring the app remains fully functional even without an internet connection. When a network is available, the app synchronizes seamlessly with a server, the cloud, or other devices as needed. These solutions are ideal for on-premise scenarios with unreliable or no internet access, mission-critical applications that must always operate, real-time and high-performance use cases, as well as situations requiring high data privacy and data security compliance.
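The core idea of offline-first can be sketched in a few lines: writes always go to local storage first, and a queue of pending changes is drained whenever a connection is available. The class below is a hypothetical toy using in-memory lists in place of a real on-device database and server; it is not how any particular product implements sync.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy offline-first store: local writes never wait for the network.
final class OfflineFirstStore {

    private final List<String> localData = new ArrayList<>();        // stands in for the on-device database
    private final Deque<String> pendingUploads = new ArrayDeque<>(); // changes not yet sent
    private final List<String> server = new ArrayList<>();           // stands in for the remote side
    private boolean online = false;

    // Stores the record locally, queues it for upload, and sends it if possible.
    void save(String record) {
        localData.add(record);
        pendingUploads.addLast(record);
        flushIfOnline();
    }

    // Called when connectivity changes; drains the queue once online.
    void setOnline(boolean online) {
        this.online = online;
        flushIfOnline();
    }

    private void flushIfOnline() {
        while (online && !pendingUploads.isEmpty()) {
            server.add(pendingUploads.pollFirst());
        }
    }

    List<String> localData() { return localData; }
    List<String> serverData() { return server; }
    int pendingCount() { return pendingUploads.size(); }

    public static void main(String[] args) {
        OfflineFirstStore store = new OfflineFirstStore();
        store.save("order-1"); // offline: stored locally and queued
        store.save("order-2");
        System.out.println("local=" + store.localData().size() + " pending=" + store.pendingCount());

        store.setOnline(true); // connection is back: the queue is drained
        System.out.println("server=" + store.serverData().size() + " pending=" + store.pendingCount());
    }
}
```

An online-only design would instead fail or block in `save` while disconnected, which is exactly the waiter-waiting-for-signal situation.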

A less discussed, but in our view also relevant point, is sustainability. While there might be exceptions depending on the use case, for most applications offline-first solutions are more resource-efficient and therefore more sustainable. If CO2 footprint or battery usage is of concern to you, you might want to look into offline-first Data Sync alternatives.

Now, let’s have a look at current options:

Data Sync Alternatives


Solution	Company	Type	Offline Support	Self-hosted Sync	Decentralized Sync	Database	Type of DB	OS/Platforms	Languages	Open-Source Component	License	Other Considerations	Country
Firebase	Google (Firebase was acquired by Google in 2014)	Online	Local cache only, no persistence, syncs when online	❌	❌	Cloud: Firebase Realtime Database; Edge: Only caching, no DB (called Firestore)	Document store	iOS, Android, Web	Java, JavaScript, Objective-C, Swift, Kotlin, C++, Dart, C#, Python, Go, Node.js	❌	Proprietary	Tied to Google Cloud, requires internet connectivity	🇺🇸
Supabase	Supabase	Online	Limited	✅	❌	Cloud DB: PostgreSQL	Relational document store	Primarily a cloud solution	JavaScript/TypeScript, Flutter/Dart, C#, Swift, Kotlin, Python	✅	Apache License 2.0	Supabase is mainly designed as a SaaS, for use cases with constant connectivity	🇸🇬
ObjectBox Sync	ObjectBox	Offline-first	✅	✅	In development	ObjectBox	Object-oriented embedded NoSQL DB	Android, Linux, Ubuntu, Windows, macOS, iOS, QNX, Raspbian, any POSIX system really, any cloud (e.g. AWS/Azure/Google Cloud), bare metal	C, C++, Java, Kotlin, Swift, Go, Flutter / Dart, Python	✅	DB: Open source bindings, Apache 2.0, closed core	Highly efficient (saves CPU, memory, battery, and bandwidth); fully offline-first, supports on-premise settings, 100% cloud optional	🇩🇪
Couchbase (Lite + Couchbase Sync Gateway)	Couchbase (a merger of Couch One and Membase)	Online	✅	The CE Sync is a bare minimum and typically not usable; Self-hosted Sync with Couchbase Servers is available as part of their Enterprise offering	✅ as part of the Enterprise offering; gets expensive quickly	Edge: Couchbase Lite; Server: Couchbase	Multi-model NoSQL document-oriented database	Couchbase Lite: iOS, Android, macOS, Linux, Windows, Raspbian and Raspberry Pi OS; Couchbase Sync Gateway: Red Hat Enterprise Linux (RHEL) 9.x, Alma Linux 9.x, Rocky Linux 9.x, Ubuntu, Debian (11.x, 12.x), Windows Server 2022	.NET, C, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, Scala	✅	Couchbase Lite is available under different licenses; the open source Community Edition does not get regular updates and misses many features especially around Sync (e.g. it does not include Delta Sync making it slow and expensive)	Typically requires Couchbase servers, quickly gets expensive	🇺🇸
MongoDB Realm + Atlas Device Sync	MongoDB (Realm was acquired by MongoDB in 2019)	Offline-first	✅	Cloud-based sync only	❌	Cloud: MongoDB; Edge: Mongo Realm DB	MongoDB: NoSQL document store; RealmDB: Embedded NoSQL DB	MongoDB: Linux, OS X, Solaris, Windows; Mongo Realm DB: Android, iOS	More than 20 languages, e.g. Java, C, C#, C++	✅	MongoDB changed its license from open source (AGPL) to MongoDB Inc.’s Server Side Public License (SSPL) in 2018. RealmDB is open source under the Apache 2.0 License. The Data Sync was proprietary.	Deprecated (in Sep 2024); End-of-life in Sep 2025; ObjectBox offers a migration option	🇺🇸

While SQLite does not offer a sync solution out-of-the-box, various vendors have built something on top, or integrated with SQLite giving them offline persistence.

Key Considerations for Choosing a Data Sync Solution

When selecting a synchronization solution, consider:

Connectivity Requirements: Will the application function in offline environments; how will it work with flaky network conditions; how is the user experience when there is intermittent connectivity?
Data Privacy & Security: How critical is it to ensure sensitive data remains local? Are there data compliance requirements? How important is it that data is not breached?
Scalability and Performance: What are the expected data loads and network constraints? How important is speed for the users? Is there any need to guarantee QoS parameters? How much will the cloud and networking costs be?
Conflict Resolution: How does the solution handle data conflicts?
Delta Sync: Does the solution always synchronize all data or only changes (data delta)? Can a subset of data be synchronized? How efficient is the Sync protocol (affecting costs and speed)?
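To illustrate the Delta Sync question, the sketch below compares the last synced snapshot of a small key-value dataset with its current state and derives only the changes that would need to be transmitted. It is a simplified, hypothetical example; real sync protocols track changes incrementally rather than diffing full snapshots.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Toy delta computation between two snapshots of a key-value dataset.
final class DeltaSync {

    // The changes since the last sync: added or modified entries, and removed keys.
    static final class Delta {
        final Map<String, String> upserts = new HashMap<>();
        final Set<String> removals = new TreeSet<>();
    }

    // Compares the last synced snapshot with the current state.
    static Delta diff(Map<String, String> lastSynced, Map<String, String> current) {
        Delta delta = new Delta();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            // New key (old value is null) or changed value.
            if (!Objects.equals(lastSynced.get(entry.getKey()), entry.getValue())) {
                delta.upserts.put(entry.getKey(), entry.getValue());
            }
        }
        for (String key : lastSynced.keySet()) {
            if (!current.containsKey(key)) {
                delta.removals.add(key);
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, String> lastSynced = Map.of("a", "1", "b", "2", "c", "3");
        Map<String, String> current = Map.of("a", "1", "b", "20", "d", "4");

        Delta delta = diff(lastSynced, current);
        // Only "b" (changed), "d" (added), and "c" (removed) need to travel;
        // the unchanged "a" is not sent again.
        System.out.println("upserts=" + delta.upserts.size() + " removals=" + delta.removals.size());
    }
}
```

The more data stays unchanged between syncs, the larger the saving compared to always transferring the full dataset, which directly affects bandwidth costs and speed.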

The Shift Towards Edge Computing

The trend toward Edge Computing highlights the growing preference for offline-first solutions. By processing and storing data closer to its source, Edge Computing reduces cloud dependency, enhances privacy, and improves efficiency. Data synchronization plays an important role in this shift, ensuring seamless operation across decentralized networks.

Offline and online synchronization solutions each have their merits, but the rise of edge computing and data privacy concerns has propelled offline Sync to the forefront. Developers must assess their application’s unique requirements to select the most appropriate synchronization method. As the industry evolves, hybrid and offline-first solutions are going to dominate, offering the best balance of functionality, privacy, and performance.

Top Small Language Models (SLMs) and how local vector databases add power

by Vivien | Jan 20, 2025 | AI, Edge AI, Edge Database, Mobile Database, vector database

Can Small Language Models (SLMs) really do more with less? In this article, we discuss the unique strengths of SLMs, learn about the top SLMs, local vector databases, and how SLMs + local vector databases are shaping the future of AI, prioritizing privacy, immediacy, and sustainability.

The Evolution of Language Models

In the world of artificial intelligence (AI), bigger models were once seen as better. Large Language Models (LLMs) amazed everyone with their ability to write, translate, and analyze complex texts. But they come with big problems too: high costs, slow processing, and huge energy demands. For example, OpenAI’s latest o3 model can cost up to $6,000 per task. The annual energy consumption of GPT-3.5 is equivalent to powering more than 4000 US households for a year. That’s a huge price to pay, both financially and environmentally.

Now, Small Language Models (SLMs) are stepping into the spotlight, enabling sophisticated AI to run directly on devices (local AI) like your phone, laptop, or even a smart home assistant. These models not only reduce costs and energy consumption but also bring the power of AI closer to the user, ensuring privacy and real-time performance.

What Are Small Language Models (SLMs)?

LLMs are designed to understand and generate human language. Small Language Models (SLMs) are compact versions of LLMs. So, the key difference between SLMs and LLMs is their size. While LLMs like GPT-4 are designed with hundreds of billions of parameters, SLMs use only a fraction of that. There is no strict definition of SLM vs. LLM yet. At this moment, SLM sizes can be as small as single-digit million parameters and go up to several billion parameters. Some authors suggest 8B parameters as the limit for SLMs. However, in our view, that raises the question of whether we need a definition for Tiny Language Models (TLMs).

Advantages and disadvantages of SLM

According to Deloitte’s latest tech trends report, SLMs are gaining increasing importance in the AI landscape due to their cost-effectiveness, efficiency, and privacy advantages. Small Language Models (SLMs) bring a range of benefits, particularly for local AI applications, but they also come with trade-offs.

Benefits of SLMs

Privacy: By running on-device, SLMs keep sensitive information local, eliminating the need to send data to the cloud.
Offline Capabilities: Local AI powered by SLMs functions seamlessly without internet connectivity.
Speed: SLMs require less computational power, enabling faster inference and smoother performance.
Sustainability: With lower energy demands for both training and operation, SLMs are more environmentally friendly.
Accessibility: Affordable training and deployment, combined with minimal hardware requirements, make SLMs accessible to users and businesses of all sizes.

Limitations of SLMs

The main disadvantage is the flexibility and quality of SLM responses: SLMs typically cannot tackle the same broad range of tasks as LLMs at the same quality. However, in certain areas, they already match their larger counterparts. For example, Artificial Analysis AI Review 2024 highlights that GPT-4o-mini (July 2024) has a similar Quality Index to GPT-4 (March 2023), while being 100x cheaper in price.

A recent study comparing various SLMs highlights the growing competitiveness of these models, demonstrating that in specific tasks, SLMs can achieve performance levels comparable to much larger models.

Overcoming limitations of SLMs

A combination of SLMs with local vector databases is a game-changer. With a local vector database, you can not only broaden the variety of tasks and improve the quality of answers, but do so specifically in the areas that are relevant to the use case you are solving. E.g. you can add internal company knowledge, specific product manuals, or personal files to the SLM. In short, you can provide the SLM with context and additional knowledge that has not been part of its training via a local vector database. In this combination, an SLM can already today be as powerful as an LLM for your specific case and context (your tasks, your apps, your business). We’ll dive into this a bit more later.
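How a vector database supplies context to an SLM can be sketched as follows: documents are stored with embedding vectors, the most similar ones to the question are retrieved, and they are placed into the prompt. This is a minimal illustration with hand-made three-dimensional vectors and a brute-force search; the class name, the example texts, and the prompt wording are hypothetical. In practice the embeddings come from an embedding model, and a vector database uses an index instead of scanning every document.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy retrieval step: find the most similar documents and build a prompt from them.
final class LocalContextRetrieval {

    // A stored text with its embedding vector.
    static final class Doc {
        final String text;
        final float[] embedding;

        Doc(String text, float[] embedding) {
            this.text = text;
            this.embedding = embedding;
        }
    }

    // Cosine similarity of two non-zero vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Brute-force nearest-neighbor search: most similar documents first.
    static List<Doc> topK(List<Doc> docs, float[] query, int k) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingDouble((Doc d) -> -cosine(d.embedding, query)));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    // Puts the retrieved texts in front of the question.
    static String buildPrompt(String question, List<Doc> context) {
        StringBuilder prompt = new StringBuilder("Answer using only this context:\n");
        for (Doc doc : context) {
            prompt.append("- ").append(doc.text).append('\n');
        }
        return prompt.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Return policy: 30 days with receipt", new float[] {0.9f, 0.1f, 0.0f}),
                new Doc("Store opens at 9am", new float[] {0.0f, 1.0f, 0.1f}),
                new Doc("Warranty covers 2 years", new float[] {0.7f, 0.3f, 0.1f}));

        float[] queryEmbedding = {1.0f, 0.2f, 0.0f}; // made-up embedding of the question
        List<Doc> context = topK(docs, queryEmbedding, 2);
        System.out.println(buildPrompt("Can I return this item?", context));
    }
}
```

The resulting prompt is what would be passed to the SLM, so the model answers from your own data rather than only from what it saw during training.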

In the following, we’ll have a look at the current landscape of SLMs – including the top SLMs – in a handy comparison matrix.

Top SLM Models


Model Name	Size (Parameters)	Company/ Team	License	Source	Quality claims
DistilBERT	66 M	Hugging Face	Apache 2	Hugging Face	"40% less parameters than google-bert/bert-base-uncased, runs 60% faster while preserving over 95% of BERT’s performances"
MobileLLM	1.5 B	Meta	Pre-training code for MobileLLM open sourced (Attribution-NonCommercial 4.0 International)	Arxiv.org	"2.7%/4.3% accuracy boost over preceding 125M/350M state-of-the-art models" "close correctness to LLaMA-v2 7B in API calling tasks"
TinyGiant (xLAM-1B)	1.3 B	Salesforce	Training set open sourced (Creative Commons Public Licenses); trained model will be open sourced	Announcement Related Research on Arxiv.org	"outperforming models 7x its size, including GPT-3.5 & Claude"
Gemma 2B	2 B	Google	Gemma license (not open source per definition, but seemingly pretty much unrestricted use), training data not shared	Huggingface	"The Gemma performs well on the Open LLM leaderboard. But if we compare Gemma-2b (2.51 B) with PHI-2 (2.7 B) on the same benchmarks, PHI-2 easily beats Gemma-2b."
Phi-3	3.8 B, 7 B	Microsoft	MIT License	Microsoft News	iPhone 14: Phi-3-mini processing speed of 12 tokens per second. From the H2O Danube3 benchmarks you can see that the Phi-3 model shows top performance compared to similar size models, oftentimes beating the Danube3
OpenELM	270M, 450M, 1.1B, 3B	Apple	Apple License, but pretty much reads like you can do as much with it as a permissive oss license (of course not use their logo)	Huggingface GitHub	OpenELM 1.1 B shows 1.28% (Zero Shot Tasks), 2.36% (OpenLLM Leaderboard), and 1.72% (LLM360) higher accuracy compared to OLMo 1.2 B, while using 2× less pretraining data
H2O Danube3	3-500M, 3-4B	H2O.ai	Apache 2.0	Arxiv.org Huggingface	"competitive performance compared to popular models of similar size across a wide variety of benchmarks including academic benchmarks, chat benchmarks, as well as fine-tuning benchmarks"
GPT-4o mini	~8B (rumoured)	OpenAI	Proprietary	Announcement	GPT-4o mini scores 82% on MMLU and currently outperforms GPT-4 on chat preferences in LMSYS leaderboard. GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o
Gemini 1.5 Flash 8B	8B	Google	Proprietary	Announcement on Google for Developers	Smaller and faster variant of 1.5 Flash features half the price, twice the rate limits, and lower latency on small prompts compared to its forerunner. Nearly matches 1.5 Flash on many key benchmarks.
Llama 3.1 8B	8B	Meta	Llama 3.1 Community	Huggingface Artificial Analysis	MMLU score of 69.4% and a Quality Index across evaluations of 53. Faster compared to average, with an output speed of 157.7 tokens per second. Low latency (0.37s TTFT), small context window (128k).
Mistral-7B	7B	Mistral	Apache 2.0	Huggingface Artificial Analysis	MMLU score 60.1%. Mistral 7B significantly outperforms Llama 2 13B on all metrics, and is on par with Llama 34B (since Llama 2 34B was not released, we report results on Llama 34B). It is also vastly superior in code and reasoning benchmarks. Was the best model for its size in autumn 2023.
Ministral	3B, 8B	Mistral	Mistral Research License	Huggingface Artificial Analysis	Claimed (by Mistral) to be the world's best Edge models. Ministral 3B has MMLU score of 58% and Quality index across evaluations of 51. Ministral 8B has MMLU score of 59% and Quality index across evaluations of 53.
Granite	2B, 8B	IBM	Apache 2.0	Huggingface IBM Announcement	Granite 3.0 8B Instruct matches leading similarly-sized open models on academic benchmarks while outperforming those peers on benchmarks for enterprise tasks and safety.
Qwen 2.5	0.5B, 1.5B, 3B, 7B	Alibaba Cloud	Apache 2.0 (0.5B, 1.5B, 7B) Qwen Research (3B)	Huggingface Qwen Announcement	Models specializing in coding and solving Math problems. For 7B model, MMLU score 74.2%, context window (128k).
Phi-4	14 B	Microsoft	MIT License	Huggingface Artificial Analysis	Quality Index across evaluations of 77, MMLU 85%. Supports a 16K token context window, ideal for long-text processing. Outperforms Phi-3 and matches or outperforms Qwen 2.5 and GPT-4o mini on many metrics.

SLM Use Cases – best choice for running local AI

SLMs are perfect for on-device or local AI applications. On-device / local AI is needed in scenarios that involve hardware constraints, demand real-time or guaranteed response rates, require offline functionality, or must comply with strict data privacy and security requirements. Here are some examples:

Mobile Applications: Chatbots or translation tools that work seamlessly on phones even when not connected to the internet.
IoT Devices: Voice assistants, smart appliances, and wearable tech running language models directly on the device.
Healthcare: Embedded in medical devices, SLMs allow patient data to be analyzed locally, preserving privacy while delivering real-time diagnostics.
Industrial Automation: SLMs process language on edge devices, increasing uptime and reducing latency in robotics and control systems.

By processing data locally, SLMs not only enhance privacy but also ensure reliable performance in environments where connectivity may be limited.

On-device Vector Databases and SLMs: A Perfect Match

Imagine a digital assistant on your phone that goes beyond generic answers, leveraging your company’s (and/or your personal) data to deliver precise, context-aware responses – without sharing this private data with any cloud or AI provider. This becomes possible when Small Language Models are paired with local vector databases. Using a technique called Retrieval-Augmented Generation (RAG), SLMs access the additional knowledge stored in the vector database, enabling them to provide personalized, up-to-date answers. Whether you’re troubleshooting a problem, exploring business insights, or staying informed on the latest developments, this combination ensures tailored and relevant responses.
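To make the retrieval step of RAG more concrete, here is a minimal sketch in plain Java. It is an illustration under stated assumptions, not a definitive implementation: the class `RagSketch`, its methods, the sample texts, and the three-dimensional "embeddings" are all hypothetical, and a simple in-memory list stands in for the local vector database. In a real app, the embeddings would come from an embedding model and the similarity search would be done by the database's vector index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Minimal RAG retrieval sketch. All names are hypothetical; the in-memory
// list stands in for a local vector database.
class RagSketch {

    // A stored text chunk together with its embedding vector.
    static final class Chunk {
        final String text;
        final float[] embedding;

        Chunk(String text, float[] embedding) {
            this.text = text;
            this.embedding = embedding;
        }
    }

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Return the k chunks most similar to the query embedding (linear scan).
    static List<Chunk> topK(List<Chunk> store, float[] query, int k) {
        List<Chunk> sorted = new ArrayList<>(store);
        sorted.sort(Comparator.comparingDouble((Chunk c) -> -cosine(c.embedding, query)));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    // Build the prompt for the SLM: retrieved context first, question last.
    static String buildPrompt(List<Chunk> context, String question) {
        StringBuilder sb = new StringBuilder("Answer using only this context:\n");
        for (Chunk c : context) {
            sb.append("- ").append(c.text).append('\n');
        }
        return sb.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        List<Chunk> store = Arrays.asList(
            new Chunk("The VPN endpoint changed in March.", new float[] {0.9f, 0.1f, 0.0f}),
            new Chunk("Lunch is served at noon.", new float[] {0.0f, 0.2f, 0.9f}),
            new Chunk("Reset the VPN token in the IT portal.", new float[] {0.8f, 0.3f, 0.1f}));
        float[] query = {1.0f, 0.2f, 0.0f}; // made-up embedding of the question
        System.out.println(buildPrompt(topK(store, query, 2), "How do I fix my VPN?"));
    }
}
```

The resulting prompt would then be passed to the SLM running on the device, so neither the question nor the retrieved company data has to leave it.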

Key Benefits of using a local tech stack with SLMs and a local vector database

Privacy. SLMs inherently provide privacy advantages by operating on-device, unlike larger models that rely on cloud infrastructure. To maintain this privacy advantage when integrating additional data, a local vector database is essential. ObjectBox is a leading example of a local database that ensures sensitive data remains local.
Personalization. Vector databases give you a way to enhance the capabilities of SLMs and adapt them to your needs. For instance, you can integrate internal company data or personal device information to offer highly contextualized outputs.
Quality. Using additional context-relevant knowledge reduces hallucinations and increases the quality of the responses.
Traceability. As long as you store metadata alongside the vector embeddings, every piece of knowledge retrieved from the local vector database can be traced back to its source.
Offline-capability. Deploying SLMs directly on edge devices removes the need for internet access, making them ideal for scenarios with limited or no connectivity.
Cost-Effectiveness. By retrieving and caching the most relevant data to enhance the response of the SLM, vector databases reduce the workload of the SLM, saving computational resources. This makes them ideal for edge devices, like smartphones, where power and computing resources are limited.
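The traceability point from the list above can be sketched in a few lines of Java. This is only an illustration with hypothetical names (`TraceableStore`, `Entry`, the sample texts and file names): each embedding is stored together with a `source` field, so whatever the retrieval step returns can be cited. The linear scan is a stand-in for the index lookup a vector database would perform.

```java
import java.util.Arrays;
import java.util.List;

// Traceability sketch: every embedding is stored with metadata naming its
// source. Class, field, and file names are hypothetical.
class TraceableStore {

    static final class Entry {
        final float[] embedding;
        final String text;
        final String source; // metadata kept next to the vector

        Entry(float[] embedding, String text, String source) {
            this.embedding = embedding;
            this.text = text;
            this.source = source;
        }
    }

    // Squared Euclidean distance; smaller means more similar.
    static double distance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Linear scan for the nearest entry; returns null for an empty list.
    static Entry nearest(List<Entry> entries, float[] query) {
        Entry best = null;
        for (Entry e : entries) {
            if (best == null || distance(e.embedding, query) < distance(best.embedding, query)) {
                best = e;
            }
        }
        return best;
    }

    // Attach the source to the retrieved text so the answer can cite it.
    static String withCitation(Entry e) {
        return e.text + " [source: " + e.source + "]";
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry(new float[] {0f, 1f}, "Pump P-7 needs service every 500 h.", "maintenance-manual.pdf, p. 12"),
            new Entry(new float[] {1f, 0f}, "Shift B starts at 14:00.", "shift-plan.xlsx"));
        System.out.println(withCitation(nearest(entries, new float[] {0.1f, 0.9f})));
    }
}
```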

Use case: Combining SLMs and local Vector Databases in Robotics

Imagine a warehouse robot that organizes inventory, assists workers, and ensures smooth operations. By integrating SLMs with local vector databases, the robot can process natural language commands, retrieve relevant context, and adapt its actions in real time – all without relying on cloud-based systems.

For example:

A worker says, “Can you bring me the red toolbox from section B?”
The SLM processes the request and consults the vector database, which stores information about the warehouse layout, inventory locations, and specific task history.
Using this context, the robot identifies the correct toolbox, navigates to section B, and delivers it to the worker.

The future of AI is – literally – in our hands

AI is becoming more personal, efficient, and accessible, and Small Language Models are driving this transformation. By enabling sophisticated local AI, SLMs deliver privacy, speed, and adaptability in ways that larger models cannot. Combined with technologies like vector databases, they make it possible to provide affordable, tailored, real-time solutions without compromising data security. The future of AI is not just about doing more – it’s about doing more where it matters most: right in your hands.

Learn about the rise and significance of Small Language Models in AI in this article.

The Critical Role of Databases for Edge AI

by Vivien | Nov 11, 2024 | AI, Data Sync, Edge AI, Edge Computing, Edge Database, Mobile Database, Sync

Edge AI vs. Cloud AI

Edge AI is where Edge Computing meets AI

What is Edge AI? Edge AI (also: “on-device AI”, “local AI”) brings artificial intelligence to applications at the network’s edge, such as mobile devices, IoT, and other embedded systems such as interactive kiosks. Edge AI combines AI with Edge Computing, a decentralized paradigm designed to bring computing as close as possible to where data is generated and utilized.

What is Cloud AI? In contrast, cloud AI refers to an architecture where applications rely on data and AI models hosted on distant cloud infrastructure. The cloud offers extensive storage and processing power.

An Edge for Edge AI: The Cloud

Example: Edge-Cloud AI setup with a secure, two-way Data Sync architecture

Today, there is a broad spectrum of application architectures combining Edge Computing and Cloud Computing, and the same applies to AI. For example, “Apple Intelligence” performs many AI tasks directly on the phone (on-device AI) while sending more complex requests to a private, secure cloud. This approach combines the best of both worlds – with the cloud giving an edge to the local AI rather than the other way around. Let’s have a look at the advantages on-device AI brings to the table.
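A hybrid setup of this kind needs some rule for deciding which requests stay on the device and which go to the cloud. The following Java sketch shows one possible shape of such a rule. It is purely illustrative: the class `HybridRouter`, the word-count heuristic, and the threshold of 50 words are assumptions made for this example and do not describe how Apple or any other vendor actually routes requests.

```java
// Hybrid edge-cloud routing sketch. The threshold and the complexity
// heuristic are hypothetical placeholders, not a real product's logic.
class HybridRouter {

    enum Target { ON_DEVICE, CLOUD }

    // Assumed size limit for prompts the local model handles well.
    static final int MAX_LOCAL_WORDS = 50;

    // Crude complexity estimate: number of whitespace-separated words.
    static int wordCount(String prompt) {
        String trimmed = prompt.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Prefer the device; use the cloud only for large requests while online.
    static Target route(String prompt, boolean online) {
        if (!online) {
            return Target.ON_DEVICE; // offline: the local model is the only option
        }
        return wordCount(prompt) <= MAX_LOCAL_WORDS ? Target.ON_DEVICE : Target.CLOUD;
    }

    public static void main(String[] args) {
        System.out.println(route("Set a timer for ten minutes", true));
    }
}
```

Defaulting to the device and treating the cloud as the exception mirrors the idea that the cloud gives an edge to the local AI, rather than the other way around.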

Benefits of Local AI on the Edge

Enhanced Privacy. Local data processing reduces the risk of breaches.
Faster Response Rates. Processing data locally cuts down travel time for data, speeding up responses.
Increased Availability. On-device processing makes apps fully offline-capable. Operations can continue smoothly during internet or data center disruptions.
Sustainability/costs. Keeping data where it is produced and used minimizes data transfers, cutting networking costs and reducing energy consumption—and with it, CO2 emissions.

Challenges of Local AI on the Edge

Data Storage and Processing: Local AI requires an on-device database that runs on a wide variety of edge devices (mobile, IoT, embedded) and performs complex tasks such as vector search locally on the device with minimal resource consumption.
Data Sync: It’s vital to keep data consistent across devices, which requires a robust bi-directional Data Sync solution. Building such a solution in-house requires specialized tech talent, is non-trivial and time-consuming, and creates an ongoing maintenance burden.
Small Language Models: Small Language Models (SLMs) like Phi-2 (Microsoft Research), TinyStories (HuggingFace), and Mini-Giants (arXiv) are efficient and resource-friendly but often need a local vector database to improve response accuracy. An on-device vector database enables semantic search over private, contextual information directly on the device, which reduces latency and yields more relevant outputs. For complex queries requiring larger models, a database that works both on-device and in the cloud (or on a large on-premise server) provides the scalability and flexibility that on-device AI applications need.

On-device AI Use Cases

On-device AI is revolutionizing numerous sectors by enabling real-time data processing wherever and whenever it’s needed. It enhances security systems, improves customer experiences in retail, supports predictive maintenance in industrial environments, and facilitates immediate medical diagnostics. On-device AI is essential for personalizing in-car experiences, delivering reliable remote medical care, and powering personal AI assistants on mobile devices—always keeping user privacy intact.

Personalized In-Car Experience: Features like climate control, lighting, and entertainment can be adjusted dynamically in vehicles based on real-time inputs and user habits, improving comfort and satisfaction. Recent studies, such as one by MHP, emphasize the increasing consumer demand for these AI-enabled features. This demand is driven by a desire for smarter, more responsive vehicle technology.

Remote Care: In healthcare, on-device AI enables local data processing that’s crucial for swift diagnostics and treatment. This secure, offline-capable technology aligns with health regulations like HIPAA and boosts emergency response speeds and patient care quality.

Personal AI Assistants: Today’s personal AI assistants often depend on the cloud, raising privacy issues. However, some companies, including Apple, are shifting towards on-device processing for basic tasks and secure, anonymized cloud processing for more complex functions, enhancing user privacy.

ObjectBox for On-Device AI – an edge for everyone

The continuum from Edge to Cloud

ObjectBox supports AI applications from edge to cloud. It stands out as the first on-device vector database, enabling powerful Edge AI on mobile, IoT, and other embedded devices with minimal hardware needs. It works offline and supports efficient, private AI applications with a seamless bi-directional Data Sync solution that can run completely on-premise, plus optional integration with MongoDB for enhanced backend features and cloud AI.

Interested in extending your AI to the edge? Let’s connect to explore how we can transform your applications.
