Benchmarks Archives

The on-device Vector Database for Android and Java

by Markus | May 29, 2024 | AI, Android, Benchmarks, Edge AI, Edge Database, Mobile Database, ObjectBox

ObjectBox 4.0 is the very first on-device, local vector database for Android and Java developers to enhance their apps with local AI capabilities (Edge AI). A vector database facilitates advanced vector data processing and analysis, such as measuring semantic similarities across different document types like images, audio files, and texts. A classic use case would be to enhance a Large Language Model (LLM), or a Small Language Model (SLM, like e.g. the Phi-3), with your domain expertise, your proprietary knowledge, and / or your private data. Combining the power of AI models with a specific knowledge base empowers high-quality, perfectly matching results a generic model simply cannot provide. This is called “retrieval-augmented generation” (RAG). Because ObjectBox works on-device, you can now do on-device RAG with data that never leaves the device and therefore stays 100% private. This is your chance to explore this technology on-device.

Vector Search (Similarity Search)

With this release, it is possible to create a scalable vector index on floating point vector properties. It’s a very special index that uses an algorithm called HNSW (Hierarchical Navigable Small World). It’s scalable because it can find relevant data within millions of entries in a matter of milliseconds.

We pick up the example used in our vector search documentation. In short, we use cities with a location vector to perform proximity search. Here is the City entity and how to define a HNSW index on the location:

@Entity

data class City(

@Id var id: Long = 0,

var name: String? = null,

@HnswIndex(dimensions = 2) var location: FloatArray? = null

)

Vector objects are inserted as usual (the indexing is done automatically behind the scenes):

val box = store.boxFor(City::class)

box.put(

City(name = "Barcelona", location = floatArrayOf(41.385063f, 2.173404f)),

City(name = "Nairobi", location = floatArrayOf(-1.292066f, 36.821945f)),

City(name = "Salzburg", location = floatArrayOf(47.809490f, 13.055010f))

)

To perform a nearest neighbor search, use the new nearestNeighbors(queryVector, maxResultCount) query condition and the new “find with scores” query methods (the score is the distance to the query vector). For example, let’s find the 2 closest cities to Madrid:

val madrid = floatArrayOf(40.416775f, -3.703790f)

val query = box

.query(City_.location.nearestNeighbors(madrid, 2))

.build()

val results = query.findWithScores()

for (result in results) {

println("City: ${result.get().id}, distance: ${result.score}")

}

Vector Embeddings

In the cities example above, the vectors were straight forward: they represent latitude and longitude. Maybe you already have vector data as part of your data. But often, you don’t. So where do you get the vector emebeddings of texts, images, video, audio files from?

For most AI applications, vectors are created by a embedding model. There are plenty of embedding models to choose from, but first you have to decide if it should run in the cloud or locally. Online embeddings are the easier way to get started and great for first testing; you can set up an account at your favorite AI provider and create embeddings online (only).

Depending on how much you care about privacy, you can also run embedding models locally and create your embeddings on your own device. There are a couple of choices for desktop / server hardware, e.g. check these on-device embedding models. For Android, MediaPipe is a good start as it has embedders for text and images.

Updated open source benchmarks 2024 (CRUD)

A new release is also a good occasion to update our open source benchmarks. The Android performance benchmark app provides many more options, but here are the key results:

CRUD is short for the basic operations a database does: create, read, update and delete. It’s an important metric for the general efficiency of a database.

Disclaimer 1: our focus is the “Object” performance (you may find a hint for that in our product name 🙂); so e.g. relational systems may perform a bit better when you directly work with raw columns and rows.

Disclaimer 2: ObjectBox delete performance was cut off at 800k per second to keep the Y axis within reasonable bounds. The actually measured value was 2.5M deletes per second.

Disclaimer 3: there cannot be enough disclaimers on any performance benchmark. It’s a complex topic where details matter. It’s best if you make your own picture for your own use case. We try to give a fair “arena” with our open source benchmarks, so it could be a starting point for you.

Feedback and Outlook: On-device vector search Benchmarks for Android coming soon

We’re still working on a lot of stuff (as always ;)) and with on-device / local vector search being a completely new technology for Android, we need your feedback, creativity and support more than ever. We’ll also soon release benchmarks on the vector search. Follow us on LinkedIn, GitHub, or Twitter to keep up-to-date.

ObjectBox Database Java / Kotlin 3.0 + CRUD Benchmarks

by Markus | Oct 19, 2021 | Android, Benchmarks, Mobile Database, ObjectBox, Release

The Android database for superfast Java / Kotlin data persistence goes 3.0. Since our first 1.0-release in 2017 (Android-first, Java), we have released C/C++, Go, Flutter/Dart, Swift bindings, as well as Data Sync and we’re thrilled that ObjectBox has been used by over 800,000 developers.

We love our Java / Kotlin community ❤️ who have been with us since day one. So, with today’s post, we’re excited to share a feature-packed new major release for Java Database alongside CRUD performance benchmarks for MongoDB Realm, Room (SQLite) and ObjectBox.

What is ObjectBox?

ObjectBox is a high performance database and an alternative to SQLite and Room. ObjectBox empowers developers to persist objects locally on Mobile and IoT devices. It’s a NoSQL ACID-compliant object database with an out-of-the-box Data Sync providing fast and easy access to decentralized edge data (Early Access).

New Query API

A new Query API is available that works similar to our existing Dart/Flutter Query API and makes it easier to create nested conditions:

// Kotlin

// equal AND (less OR oneOf)

val query = box.query(

User_.firstName equal "Joe"

and (User_.age less 12

or (User_.stamp oneOf longArrayOf(1012))))

.order(User_.age)

.build()

// Java

// equal AND (less OR oneOf)

Query<User> query = box.query(

User_.firstName.equal("Joe")

.and(User_.age.less(12)

.or(User_.stamp.oneOf(new long[]{1012}))))

.order(User_.age)

.build();

In Kotlin, the condition methods are also available as infix functions. This can help make queries easier to read:

1	val query = box.query(User_.firstName equal "Joe").build()

Unique on conflict replace strategy

One unique property in an @Entity can now be configured to replace the object in case of a conflict (“onConflict”) when putting a new object.

// Kotlin

@Entity

data class Example(

@Id

var id: Long = 0,

@Unique(onConflict = ConflictStrategy.REPLACE)

var uniqueKey: String? = null

)

// Java

@Entity

public class Example {

@Id

public long id;

@Unique(onConflict = ConflictStrategy.REPLACE)

String uniqueKey;

}

This can be helpful when updating existing data with a unique ID different from the ObjectBox ID. E.g. assume an app that downloads a list of playlists where each has a modifiable title (e.g. “My Jam”) and a unique String ID (“playlist-1”). When downloading an updated version of the playlists, e.g. if the title of “playlist-1” has changed to “Old Jam”, it is now possible to just do a single put with the new data. The existing object for “playlist-1” is then deleted and replaced by the new version.

Built-in string array and map support

String array or string map properties are now supported as property types out-of-the-box. For string array properties it is now also possible to find objects where the array contains a specific item using the new containsElement condition.

// Kotlin

@Entity

data class Example(

@Id

var id: Long = 0,

var stringArray: Array<String>? = null,

var stringMap: MutableMap<String, String>? = null

)

// matches [“first”, “second”, “third”]

box.query(Example_.stringArray.containsElement(“second”)).build()

// Java

@Entity

public class Example {

@Id

public long id;

public String[] stringArray;

public Map<String, String> stringMap;

}

// matches [“first”, “second”, “third”]

box.query(Example_.stringArray.containsElement(“second”)).build();

Kotlin Flow, Android 12 and more

Kotlin extension functions were added to obtain a Flow from a BoxStore or Query:

1	val flow = box.query().subscribe().toFlow()

Data Browser has added support for apps targeting Android 12.

For details on all changes, please check the ObjectBox for Java changelog.

Room (SQLite), Realm & ObjectBox CRUD performance benchmarks

We compared against the Android databases, MongoDB Realm and Room (on top of SQLite) and are happy to share that ObjectBox is still faster across all four major database operations: Create, Read, Update, Delete.

Android database comparative benchmarks for ObjectBox, Realm, and Room

We benchmarked ObjectBox along with Room 2.3.0 using SQLite 3.22.0 and MongoDB Realm 10.6.1 on an Samsung Galaxy S9+ (Exynos) mobile phone with Android 10. All benchmarks were run 10+ times and no outliers were discovered, so we used the average for the results graph above. Find our open source benchmarking code on GitHub and as always: feel free to check them out yourself. More to come soon, follow us on Twitter or sign up to our newsletter to stay tuned (no spam ever!).

Using a fast on-device database matters

A fast local database is more than just a “nice-to-have.” It saves device resources, so you have more resources (CPU, Memory, battery) left for other resource-heavy operations. Also, a faster database allows you to keep more data locally with the device and user, thus improving privacy and data ownership by design. Keeping data locally and reducing data transferal volumes also has a significant impact on sustainability.

Sustainable Data Sync

Some data, however, you might want or need to synchronize to a backend. Reducing overhead and synchronizing data selectively, differentially, and efficiently reduces bandwidth strain, resource consumption, and cloud / Mobile Network usage – lowering the CO2 emissions too. Check out ObjectBox Data Sync, if you are interested in an out-of-the-box solution.

Get Started with ObjectBox for Java / Kotlin Today

ObjectBox is free to use and you can get started right now via this getting-started article, or follow this video.

Already an ObjectBox Android database user and ready to take your application to the next level? Check out ObjectBox Data Sync, which solves data synchronization for edge devices, out-of-the-box. It’s incredibly efficient and (you guessed it) superfast 😎

We ❤️ your Feedback

We believe, ObjectBox is super easy to use. We are on a mission to make developers’ lives better, by building developer tools that are intuitive and fun to code with. Now it’s your turn: let us know what you love, what you don’t, what do you want to see next? Share your feedback with us, or check out GitHub and up-vote the features you’d like to see next in ObjectBox.

ObjectBox Dart/Flutter v0.11 Database: Performance & Relations

by Markus | Feb 2, 2021 | Benchmarks, Mobile Database, ObjectBox, Release | 4 comments

Flutter Databases are few. Therefore, we’re happy to take a big step towards 1.0 with this ObjectBox Dart v0.11 release, improving performance and bringing the much-desired relations support known from other ObjectBox DB language bindings to Dart/Flutter.

For those of you new to ObjectBox: ObjectBox is a superfast NoSQL object database for Flutter / Dart and here is how you can save data in your Dart / Flutter apps:

@Entity()

class Person {

int id;

String firstName;

String lastName;

}

final store = Store(getObjectBoxModel());

final box = store.box<Person>();

var person = Person()

..firstName = "Joe"

..lastName = "Green";

final id = box.put(person); // Create

person = box.get(id); // Read

person.lastName = "Black";

box.put(person); // Update

box.remove(person.id); // Delete

// find all people whose name start with letter 'J'

final query = box.query(Person_.firstName.startsWith('J')).build();

final people = query.find(); // find() returns List<Person>

To learn about more ObjectBox features, like relations, queries and data sync, check our ObjectBox pub.dev page.

How fast is ObjectBox Dart? Performance Benchmarks

Speed is important for data persistence solutions. Accordingly, we wanted to test how ObjectBox compares performance-wise to other Flutter Dart database options. Therefore, we looked for libraries with comparable levels of storage abstraction and feature set – so not just plain SQL/Key-value storage but also ORM-like features. There doesn’t seem to be that much choice…

We looked at some two popular approaches: sqflite a SQLite wrapper for Flutter (no Dart Native support), and Hive, a key-value store with Class-adapters which seems still popular although its technology is phased out (see below). As a third alternative we pulled in Firestore, which does not really fit as it is no local database, but would be fun to compare anyway.

What we tested

To get an overview of the databases, we tested CRUD operations (create, read, update, delete). Each test was run multiple times and executed manually outside of the measured time. Data preparation and evaluation was also done outside of the measured time.

We tried to keep the test implementations as close as possible to each other while picking the approaches recommended by the docs for each database. We open sourced the test code at https://github.com/objectbox/objectbox-dart-performance if you want to have a closer look.

Performance Benchmark Results

Looking at the results, we can see ObjectBox performing significantly faster than sqflite across the board, with up to 70 times speedup in case of create & update operations. Compared to Hive, the results are a little more varied, with Hive being faster at reading objects than ObjectBox (we come to that later in our outlook), and ObjectBox being faster at creating objects, about four times faster at updates and three times for deletes. As a mostly-online database, it becomes clear that Firestore’s performance is not really comparable.

Implementation notes

ObjectBox: This release largely boosted the performance. The remaining bottlenecks are due to Dart itself and how it allows to modify byte buffers. There’s potential to double the speed if we look at other languages supported by ObjectBox. And if that’s not happening soon, we’d still have the option to do some low-level hacks…

Sqflite: a wrapper around SQLite, which is a relational database without direct support for Dart objects. Each dart object field is mapped to a column in the database, as per sqflite docs, i.e. converting between the Dart class and a Map.

Hive: We’ve tested with the latest Hive release, which is technically discontinued. The author hit two architectural roadblocks (RAM usage and queries) and is currently in the process to do a rewrite from scratch.
Update: strictly speaking it’s not straightforward to directly compare e.g. ObjectBox vs. Hive. In Hive, the high read numbers result from Dart objects already cached in memory. If the objects are fetched using the async API from disk, the numbers drop by factor 1000.

Firestore: This is totally apples and oranges, but we still decided to include Firebase/Firestore as it seems at least somewhat popular to “persist data”. It’s quite Cloud centric and thus offers limited offline features. For example, in order to use batches (“transactions”), an internet connection is required to “commit”. Also, due to its low performance, the test configuration was different: batches of 500 objects and only 10 runs.

Test setup

We ran the benchmarks as a Flutter app on a Android 10 device with a Kirin 980 CPU. The app executed all operations in batches of 10.000 objects, with each batch forming a single transaction. Each test was run 50 times, averaging the results over all the runs. This ensured the VM warmup (optimization) during the first run and garbage collections don’t affect the overall result significantly. (We care about accurate benchmarks; read more about our benchmarking best practices here.)

Outlook

With this latest release, we’re not far away from a stable API for a 1.0 release (🎉), so please share your thoughts and feedback. For the next release, we’ll add features like async operations, more relation types and some smaller improvements. We are also working on an ObjectBox variant for the Web platform that is planned close to the 1.0 release. And of course there is ObjectBox Data Sync for Flutter/Dart. If you want to be first in line to try, drop us a line, we can put you on the shortlist.

ObjectBox EdgeX v1.1 – database with ARM32 support

by Uwe | Nov 27, 2019 | Benchmarks, IoT, Open Source, Release

With EdgeX Foundry just reaching v1.1, we continue to provide ObjectBox as an embedded high-performance database backend so you can start using ObjectBox EdgeX v1.1 right away. If you need data reliability and high-speed database operations, ObjectBox is for you. Additionally, starting with ObjectBox EdgeX 1.1, you can use it on 32-bit ARM devices.

Combining the speed and size advantages of ObjectBox on the EdgeX platform, we empower companies to analyze more data locally on the machine, enabling new use cases.

With ObjectBox-backed EdgeX we’re bringing the efficiency, performance and small footprint of the ObjectBox database to all EdgeX applications. It is fully compatible, so you can use it as a drop-in replacement: you call against the same REST and Go EdgeX APIs. As simple as that;no need to change any code.

Performance comparison of EdgeX database backends

EdgeX Foundry comes with a choice of two database engines: MongoDB and Redis. ObjectBox EdgeX brings an alternative to Redis and MongoDB to the table. Because ObjectBox is an embedded database, optimized for high speed and ease of use while also delivering data reliability, it enables a new set of use cases. As we all know, benchmarks are hard to do. This is why all our benchmarks are open source and we invite you to check them out for yourself. To give you a quick impression of how you could benefit from using ObjectBox, let’s have a look at how each compares in basic database operations on “Device Readings”, one of the most performance intensive data points.

Note: The Read and Write operations (all CRUD (Create, Read, Update, Delete) operations are measured in objects / second). The benchmarks test internal EdgeX database layer performance, not the REST APIs throughput.

These benchmarks provide a good perspective why you should consider ObjectBox with EdgeX. Benchmark sources are available publicly in ObjectBox EdgeX github repo.

So, why is ObjectBox EdgeX faster?

First of all, you are probably aware of the phrase “Lies, damned lies, and ~~statistics~~ benchmarks”. Of course, you should look at performance for yourself and consider based on your specific use case needs. That’s why we make our benchmarks available as open source. It is a good starting point.

To make it easier to compare ObjectBox (in addition to our open source benchmarks) here are some of the high-level “unfair advantages” that make ObjectBox fast:

Objects: As you can derive from its name, ObjectBox is all about for objects. It’s highly optimized for persisting objects. The EdgeX architecture and Go sources are a great fit here as it puts Go’s objects (structs) in the center of its interface. This means, we can omit costly transformations and this helps with speed.
Embedded database: Redis and MongoDB are client/server databases running in separate processes. ObjectBox, however, is running in the same process as EdgeX itself (each EdgeX microservice, to be precise). This has definite efficiency advantages, but it also comes with some restrictions: Whereas you can put Redis/MongoDB in separate Dockers or machines, this option is not available for ObjectBox yet.
Transaction merging: ObjectBox can execute individual write operations in a common database transaction. This means, we can reduce the costly transactions for a number of write operations. This is a great way to add on performance, delaying the transaction end by single digit milliseconds.

Get started with ObjectBox EdgeX

The simplest way to get started is to fetch the latest docker-compose.yml and start the containers:

1	wget https://raw.githubusercontent.com/objectbox/edgex-objectbox/fuji/bin/docker-compose.ymldocker-compose up -d

You can check the status of your running services by going to http://localhost:8500/. At this point, you have the REST services running at their respective ports, available to access from your EdgeX applications.

Find more details, instructions for ARM32, and sources in our GitHub repo at https://github.com/objectbox/edgex-objectbox.

If you’re new to EdgeX, find out all about the open source IoT Edge Platform here. The EdgeX project is led by the Linux Foundation and supported by many industry players, including Dell, IBM, and Fujitsu.

We love to hear from you ?

We’re very interested to hear about the challenges you are facing on the edge and in IoT. As performance experts, we are always embracing a tough challenge. Reach out to us to set up a pilot project.

Is there something you are missing? Please do reach out to us. We want to make ObjectBox the best edge data persistence layer available. We love to receive your feedback.

What next?

Find out more about ObjectBox EdgeX and get started, go directly to GitHub or download the snap on Snapcraft.

Get Started with ObjectBox on EdgeX >>

How to benchmark database performance – and ObjectBox

by Uwe | Aug 27, 2019 | Android, Benchmarks, Insights, IoT

Benchmarking the performance of databases is a science in and of itself and it’s hard to get reliable and comparable results. Therefore, we decided to note down some standard patterns and pitfalls when doing database benchmarks we learned about over the years. We are including specific notes and things to consider when benchmarking ObjectBox.

Designing a benchmarking test

Phrase your research question

When you want to benchmark databases, you will usually have a specific use case in mind and want to answer questions regarding that case. Therefore, before writing your first line of code, have a look at what it is you want to benchmark, why, and what statement you want to be able to make at the end. Just writing down a simple question like “Is X faster than Y doing A?” can help you verify that the finished benchmark actually measures what you want it to.

Start documenting

Whatever you do, document each step diligently. Benchmarks that are not properly documented are challenging to reproduce, and thus of limited worth. So, the mantra in benchmarking really is: Document, document, document. That way it is also easier to go back later and make adjustments if needed.

Select the sample

If you have a clear use case and goal in mind, this probably determines the type of databases you are going to consider for your benchmark. Generally speaking, benchmarks should compare databases of similar type and not mix approaches that are too different. A database might be designed to run on a cluster of servers and not, like ObjectBox, on a constrained device (e.g. a smartphone, Raspberry Pi or IoT device). Or data may be stored as documents (e.g. NoSQL) and not in a relational table-like structure (e.g. SQL); or on disk and not in-memory. Consistency guarantees can also make a difference (ACID-compliance vs. not transactionally safe). Document why you chose these databases for comparison.

Become an expert

Once you have decided on what databases and features to compare, familiarize yourself with the APIs of the products, e.g. by reading the documentation or looking at code examples. Making wrong assumptions about how a feature works can skew your results and make a product appear much faster or slower than it actually is. If the code is open source, it can also help to dive into the code to see how things actually work. For example: an insert function runs asynchronously so the function call returns immediately, but the actual insert still executes in the background. To correctly measure the actual time it takes to complete the insert, you need to do some additional work.

Things to watch out for when coding benchmarks

Let’s move on to some specific coding tips. A benchmark is only as good as its time measurement. Check what APIs your test platform offers to get the time; they might be affected by how and from which thread they are called. Make sure, you find an accurate way to measure the time that does not skew up your results. For example, on Android there are numerous possibilities beyond using currentTimeMillis().

If a measured code block executes so fast it is near the available precision of your clock (e.g. time is in milliseconds, but code takes microseconds), consider running it multiple times and measure the total time of all runs.

Next, before starting each measurement, make sure to free up memory and clear references no longer in use by the previous measurement or setup code. If the environment your benchmark is written for uses a garbage collector, check if it can be triggered manually to free memory (e.g. System.gc() in Java). Otherwise each consecutive measurement might be skewed due to less and less memory being available or a garbage collector halting execution to free memory. In your benchmarking results, you should look out for strange results like a continual decrease in performance from run to run.

Furthermore, take into account that some runtime environments, for example the Java Virtual Machine, do just-in-time compilation. This can cause a delay the first time code is executed, but provide better performance on subsequent executions. The effect of this on the final result can however be minimized by proper testing procedure, e.g. by running a code block multiple times instead of once and measuring the total execution time.

Then something obvious, but easily overlooked, is to ensure that between the start and end of a measurement only the functionality that you actually want to compare is executed. So avoid or turn off logging (a seemingly innocent string concatenation can skew results) and construct test data outside of a measured code block.

To be absolutely sure that your code is doing what you intended it to, use a profiler to inspect resource usage during a trial run. IDEs like Android Studio and Xcode come with an embedded profiler, and there are also several standalone profilers to choose from.

Optimizing benchmarks for meaningful results – with ObjectBox examples

Before benchmarking the chosen databases, make sure you understand their differences and default settings to adjust all settings to be comparable. In the following section, we will go through the most important differences and settings you need to look out for based on ObjectBox as an example.

Transactions, Durability, Consistency, ACID – how to make your benchmarks comparable

First, be aware of the impact the use of transactions or lack thereof can have. For databases, committing a transaction is an expensive operation as it requires waiting until the disk has safely stored the data. If possible, group multiple operations into a single transaction. For example, in the ObjectBox code snippet below, there is a notable speed-up when wrapping multiple box operations into a transaction block.

boxStore.runInTx(() -＞ {

// Would be slower if run outside of a transaction.

for (User user : allUsers) {

if (modify(user)) {

box.put(user);

} else {

box.remove(user);

}

});

Speaking of transactions, also check if using bulk operations is possible. These also use transactions to speed up execution. E.g. instead of performing a put on each entity in a list, put the whole list.

// Total cost: allUsers.size transactions.

for (User user : allUsers) {

modify(user);

box.put(user); // Implicit transaction

}

// Total cost: 1 transaction.

for (User user : allUsers) {

modify(user);

}

box.put(allUsers); // Implicit transaction.

The ObjectBox transactions docs provide more details and are available for Java, Swift or Go – though the basic principle is the same across languages.

Second, and closely related to transactions, are durability guarantees when writing data. This is about the “D” in the popular ACID acronym (Atomic, Consistent, Isolated and Durable). ObjectBox transactions and standard (non-async) operations are fully ACID-compliant.

Thus, pay close attention to what durability modes other tested databases guarantee, or respectively, which durability mode you want to measure. Most NoSQL databases don’t give hard durability guarantees. Some provide an extra command or special mode to enforce durability. Therefore, if your use case needs to ensure data is actually stored safely after a write operation, you would need to enable this durability for other databases when comparing to ObjectBox.

On the other hand, if you are interested in scenarios that emphasize performance over durability, you should look into the OjectBox async APIs. Those don’t come with durability guarantees unless you define “checkpoints” in your code to wait for async operations.

Indexing – how to make your database queries efficient

Third, when measuring query operations, see if you can use indexes, another typical database optimization. If the database has an index on a property that is used in a query condition, it can find matches much faster.

// Adding an index in a fictitious User class:

@Index String name;

// Makes this query faster:

userBox.query()

.equal(User_.name, "Jane")

.build().find();

An index makes queries “scalable” – the more objects are stored, the more an index makes sense. Without an index, a database has to do a “full scan” over all potential results.

Number of objects and the disk bottleneck – how to measure the database and not the disk

And lastly, keep an eye on the number of objects you operate on. For example, if you put a single object, something like 99.99% percent will be spent on disk I/O. Thus, if you test this on several databases, the chance of getting about the same results on all databases is quite high. The limiting factor is always the disk. So if you want to measure the efficiency of converting objects into their persisted counterparts instead, you should look at much higher object counts to factor the disk out of the equation. Depending on the disk and device speed, a bulk “put” of 10K, 100K or 1 million objects will make more sense to measure in this context.

Multi threaded tests – how to set writers and readers

ObjectBox is build upon a multiversion concurrency control foundation, and thus is ready for multi threaded access. Each thread will have a consistent transactional view of the data. ObjectBox differentiates between “readers” (a thread currently in a read transaction) and “writers” (a thread currently in a write transaction). Readers never block; no matter what goes on in other threads. However, a writer will block other writers when using standard transactions. Writers are sequential; only a single writer can be run at any time. Thus, if your load is write-heavy in multiple threads, you may want to look at the asynchronous APIs of ObjectBox. These handle write operations very efficiently; no matter how many threads are involved.

Extending the database size limit – size matters

By default, ObjectBox limits the database size to 1 GB to avoid filling up the disk by accident (e.g. your code has a bug in a data insertion loop). So if you use large data sets to benchmark, we recommended increasing the maximum database size when building the box store.

MyObjectBox.builder()

.maxSizeInKByte(10 * 1024 * 1024 /* 10 GB */)

.build()

Preparing the benchmarks: devices and platforms

Once your benchmark code is ready, it’s time to set up the test device(s). The first step is to ensure other apps or processes are not doing (too much) work while your measurements run. Ideally, you would use a clean device with no apps installed or services configured. This is especially true for mobile devices: once you connect them and they start charging, the operating system might wake numerous apps to perform background work or network updates. You can somewhat avoid this by switching on airplane mode. And to be safe, wait a few seconds after connecting the device.

Also ensure, this is again important for mobile devices, that your device does not enter a low power mode which reduces performance. For example on Android, keeping the screen on typically prevents that. When running on a laptop, check whether the power supply is plugged in and which power mode your operating system is set to. You may explicitly want to test in a certain power state or with the default behavior.

And just to mention it, make sure to turn off any profiling or monitoring tools that you used during building your benchmark. They can significantly skew your results.

When running on multiple devices, pay attention to the differences in hardware, like available memory and processor model, but also in software, like the operating system version. Different hardware and software might have different optimizations that can skew your results. Again, do not forget to document your device and software setup, so results can be properly interpreted and reproduced.

Running the database benchmark and collecting results

When it is time to run the benchmark it’s good to run it not once, not twice, but many times. This minimizes the impact of the various side effects we discussed above. Output the results of each run into a comma separated (CSV) or tabulator separated (TSV) format, that can easily be imported in a spreadsheet application for analysis. You can look at how the ObjectBox benchmark does it.

Once you have collected some data, verify its quality. You might have already spotted outliers while perusing the results. Alternatively, calculate the average of all measurements and see if it deviates a lot from their median value. If you are familiar with it, look for a high variance value instead. Too many outliers or a high variance might hint at side effects of your benchmark code or device setup you have not considered. Better double check to be sure.

Also coming soon ObjectBox time series which will provide users an intuitive dashboard to see patterns behind the data. It will further help users to track thousands of data points/second in real-time.

How to publish meaningful database performance benchmarks

The final step is to share your results with the world (or just your team). Make sure it’s clear what exactly you have measured and how you have arrived at those results. So include which device and software was used, how they were configured, what you did before running the benchmark and how the benchmark was run.

If possible, share the generated raw data so others can verify that your calculations, and remember to lead to the published results. Even better, publish the source code as well, so others can run it on other devices or help you spot and fix issues.

Last not least: Be careful to draw conclusions. Rather let the data speak for itself. Respond to questions and feedback. Be honest, if you learn your benchmarks may be skewed and update them. In the end, everyone wins by getting more meaningful results.

Your content goes here. Edit or remove this text inline or in the module Content settings. You can also style every aspect of this content in the module Design settings and even apply custom CSS to this text in the module Advanced settings.