Edge Database Comparison: SQLite and SQLite alternatives

SQLite and SQLite alternatives for the Mobile and IoT edge

Updated comparison of mobile databases / edge databases

Note: This is an updated version of an earlier Mobile Database Comparison. Last Update: 2020.

What is a mobile database?

Wikipedia defines a mobile database as “Mobile computing devices (e.g., smartphones and PDAs) store and share data over a mobile network, or a database which is actually stored by the mobile device.” We use the term only in the latter sense: a database that runs locally on the mobile device (as the edge device) itself and stores its data on that device. We therefore also call it an “on-device” database.

What is an edge database?

The term edge database is too young to have a Wikipedia article. However, we see it used increasingly in the IoT industry. In the field of IoT applications, it is important to distinguish databases that run locally – “on the edge” – from those that run “in the cloud”. Mobile databases are a subset of edge databases; what sets them apart is the device they run on, and in practice that comes down to operating system support: some edge databases simply do not run on Android and / or iOS. Such databases are small enough for Edge Computing and do qualify as edge databases, but they are not suited for typical mobile devices and are therefore not mobile databases.

What is an edge device?

An edge device may be any device from a sensor, to an IoT gateway, to a car, to a Raspberry Pi, to a mobile phone (smartphone), to an on-premise server. Typically, the challenge arises when running on the smaller, more restricted devices. Generally, any database can run on a big on-premise server or cloud infrastructure with practically unlimited resources, but only a few fit on a Raspberry Pi Zero. The other way around is no issue: an edge database can run well on a server. As a rule of thumb, we therefore look at databases that run on devices of roughly Raspberry Pi size.

Edge Devices to run mobile /edge databases on

What are the advantages and disadvantages of working with SQLite?

SQLite is easily the most established edge database and probably the only “established” mobile database. SQLite is in the public domain and maintained by Richard Hipp. It has been around since 2000 and has been embedded in iOS and Android from the beginning. SQLite is a relational database.

Advantages
  • Toolchain, e.g. DB browser
  • No dependencies, is included with Android and iOS
  • Developers can define exactly the data schema they want
  • Developers have full control, e.g. handwritten SQL queries
  • SQL is a powerful and established query language, and SQLite supports most of it
  • Debuggable data: developers can grab the database file and analyze it
  • Rock-solid, widely used technology, established since the year 2000

Disadvantages
  • Using SQLite means a lot of boilerplate code and thus inefficiencies (also in the long run with the app maintenance)
  • 1 MB BLOB Limitation on Android
  • No compile time checks (e.g. SQL queries)
  • The performance of SQLite is unreliable
  • SQL is another language to master
  • SQL queries can get long and complicated
  • Testability (how to mock a database?)
  • Especially when database views are involved, maintainability may suffer with SQLite
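
A few of these points can be made concrete with a minimal sketch based on Python's built-in sqlite3 module. The `note` table, the `Note` class and all function names are invented for illustration. The sketch shows the hand-written mapping between objects and rows (boilerplate), a query with a typo that only fails when it is executed (no compile time checks), and an in-memory database as one possible answer to the testability question.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Note:
    id: int
    text: str


def open_db(path=":memory:"):
    # An in-memory database needs no file and no server, which is handy for tests.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS note (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )
    return conn


def insert_note(conn, text):
    # Boilerplate: every object field has to be mapped to a column by hand.
    cur = conn.execute("INSERT INTO note (text) VALUES (?)", (text,))
    conn.commit()
    return cur.lastrowid


def load_notes(conn):
    # The same mapping is needed again, in reverse, when reading.
    rows = conn.execute("SELECT id, text FROM note ORDER BY id").fetchall()
    return [Note(id=row[0], text=row[1]) for row in rows]


def broken_query(conn):
    # The misspelled column name is only detected when this statement runs.
    return conn.execute("SELECT txt FROM note").fetchall()
```

Calling `broken_query` raises `sqlite3.OperationalError` at runtime, which is exactly the kind of error a compile time check would have caught earlier.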

 

 

What are SQLite alternatives?

There are plenty of alternatives to working with SQLite directly. If you simply want to avoid writing lots of SQL and boilerplate code, you can use an object abstraction on top of SQLite. This abstraction layer is usually an ORM (object/relational mapper), e.g. greenDAO. While an ORM makes it easy to use SQLite at the beginning, there typically comes a point “where you hit SQLite”; so even when using an abstraction layer, you need to understand SQLite and SQL in the long run.
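
To illustrate what such an abstraction layer does, here is a deliberately tiny mapper written with Python's sqlite3 and dataclasses modules. It is a hypothetical sketch, not how greenDAO or any real ORM is implemented: `TinyMapper` and `Sensor` are made-up names, and the mapper assumes that every field is stored as text.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Sensor:
    name: str
    location: str


class TinyMapper:
    """Hypothetical mini-ORM: derives table and columns from a dataclass."""

    def __init__(self, conn, cls):
        self.conn = conn
        self.cls = cls
        self.table = cls.__name__.lower()
        self.columns = [f.name for f in fields(cls)]
        cols = ", ".join(f"{c} TEXT" for c in self.columns)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {self.table} ({cols})")

    def put(self, obj):
        # The SQL is generated, so the application code never sees it.
        marks = ", ".join("?" for _ in self.columns)
        self.conn.execute(
            f"INSERT INTO {self.table} ({', '.join(self.columns)}) VALUES ({marks})",
            astuple(obj),
        )

    def all(self):
        rows = self.conn.execute(
            f"SELECT {', '.join(self.columns)} FROM {self.table} ORDER BY rowid"
        )
        return [self.cls(*row) for row in rows]
```

As soon as the application needs joins, indexes, or schema migrations, a layer like this no longer suffices and SQL has to be written by hand again, which is the point “where you hit SQLite”.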

However, if you rather seek a complete replacement for SQLite, there are a few alternative databases: Couchbase Lite, Interbase, LevelDB, ObjectBox, Oracle Berkeley DB (formerly Oracle’s mobile database was “Oracle Database Lite”), Realm (now Mongo Realm), SnappyDB, SQL Anywhere, and UnQLite.

Obviously, if you are also looking for alternatives that run in the cloud, there are a lot of cloud / server options that you can use as a replacement, e.g. Firebase. With these, however, your app will not work offline, response times will be slower than with an on-device database and cannot be guaranteed, and, last but not least, you will have much higher networking / cloud costs. You can find out more about the benefits of Edge Computing.

To give you an overview, we have compiled a small comparison table:

Edge Database | Android / iOS* | Type of data stored | Sync Central | Sync P2P | Offline Sync | Data level encryption | License / business model | Short description | Minimum Footprint size | Company
Azure SQL Edge (in preview) | No | Relational DB | No | No | No | will provide encryption | Proprietary | Designed as a SQL database for the IoT edge; however, due to the footprint it is no edge database | 500 MB+ | Microsoft
Couchbase Mobile (prior Couchbase Lite) | Android / iOS | JSON Documents / NoSQL db | Yes | Yes | No | Database encryption with SQLCipher (256-bit AES) | Apache 2.0 | Embedded / portable database with P2P and central synchronization (sync) support. Secure SSL. | < 3.5 MB | Couchbase
extremeDB | iOS | In-memory relational DB, hybrid persistence | No | No | No | AES encryption | Proprietary | Embedded relational database | 200 kB | McObject LLC
ForestDB | Android / iOS | Key-value pairs / NoSQL db | No | No | No | No | Apache 2.0 | Portable lightweight key-value store, NoSQL database |  | 
InterBase ToGo / IBLite | Android / iOS | Relational | No | No | No | 256 bit AES strength encryption | Proprietary | Embeddable SQL database. | 400 KB | Embarcadero
LevelDB | Android / iOS | Key-value pairs / NoSQL db | No | No | No | No | New BSD | Portable lightweight key-value store, NoSQL, no index support; benchmarks from 2011 have been removed unfortunately. | 350 kB | LevelDB Team
LiteDB | Android / iOS (with Xamarin only) | NoSQL document store | No | No | No | Salted AES | MIT license | A .Net embedded NoSQL database | 350 kB | 
Mongo Realm (acquired in 2019) | Android / iOS | Object Database | Yes | No |  | Yes | Proprietary with Apache 2.0 License APIs | Embedded object database | 5 MB+ | Realm Inc
ObjectBox | Android / iOS / Linux / Windows / any POSIX | Object-oriented NoSQL edge database for high-performance on edge devices in Mobile and IoT | Yes | WIP | Yes | transport encryption; additional encryption upon request | Apache 2.0 and Proprietary | Embedded object-oriented NoSQL high-performance edge database with out-of-the-box data synchronization; fully ACID compliant; benchmarks available. | < 1 MB | ObjectBox
Oracle Database Lite | Android / iOS | Relational | Yes | Yes | No | 128-bit AES Standard encryption | Proprietary | Portable with P2P and central sync support as well as support for sync with SQLite | < 1 MB | Oracle Corporation
redis DB | No | K/V in-memory store, typically used as cache | No | No | No | TLS/SSL-based encryption can be enabled for data in motion. | Three clause BSD license, RSAL and Proprietary | High-performance in-memory Key Value store with optional durability | An empty instance uses ~ 3 MB of memory. | redislabs (the original author of redis left in 2020)
Snappy DB | Android | Key-value pairs / NoSQL db | No | No |  | No | Apache 2.0 | Portable lightweight key-value store, NoSQL database based on LevelDB |  | Nabil HACHICHA
SQL Anywhere | Android / iOS | Relational | Dependent | No |  | AES-FIPS cipher encryption for full database or selected tables | Proprietary | Embedded / portable database with central sync support with a stationary database |  | Sybase iAnywhere
SQLite | embedded on iOS and Android | Relational | No | No |  | No, use SQLCipher to encrypt SQLite | Public domain | C programming library; probably 90% market share (very personal assumption, 2016) | 500 KiB | Hwaci
SQL Server Compact | Android / iOS | Relational | No | No |  | Yes | Proprietary | Small-footprint embedded / portable database for Microsoft Windows mobile devices and desktops, supports synchronization with Microsoft SQL Server | 2 MB | Microsoft
UnQLite | Android / iOS | Key-value pairs / document store / NoSQL db | No | No |  |  | 2-Clause BSD | Portable lightweight embedded db; self-contained C library without dependency. |  | Symisc systems

Side note: According to the database of databases there are more than 700 databases as of 2020. However, that list does include hobby projects. DB-engines “only” lists databases that have significant traction and are well-maintained; they still count more than 300 databases as of 2020.

If you are interested in an indication of the diffusion rate of databases and mobile databases, check out the following database popularity ranking: http://db-engines.com/en/ran.

Thanks for reading and sharing. Please let us know what you’re missing.

Time Series & Objects: Using Data on the Edge

Many IoT projects collect both time series data and other types of data. Typically, this means they will run two databases: a time-series database and a traditional database or key/value store. This creates fragmentation and overhead, which is why ObjectBox TS brings together the best of both worlds in one database (DB). ObjectBox TS is a hybrid database: an extremely fast object-oriented DB plus a time-series extension, specially optimized for time series data. In combination with its tiny footprint, ObjectBox is a perfect match for IoT applications running on the edge. The out-of-the-box synchronization takes care of synchronizing selected data sets super efficiently, and it works offline and online, on-premise and in the cloud.

What is time series data?

There are a lot of different types of data that are used in IoT applications. Time-series is one of the most common data types in analytics, high-frequency inspections, and maintenance applications for IIoT / Industry 4.0 and smart mobility. Time series tracks data points over time, most often taken at equally spaced intervals. Typical data sources are sensor data, events, clicks, temperature – anything that changes over time.

Why use time series data on the edge?

Time-series data sets are usually collected from a lot of sensors, which sample at a high rate – which means that a lot of data is being collected.

For example, if a Raspberry Pi gateway collects 20 data points/second, that typically means 1200 entries a minute, each measuring e.g. 32 degrees. As temperatures rarely change significantly in short time frames, does all of this data need to go to the cloud? Unless you need to know the exact temperature in a central location multiple times per second, the answer is no. Sending all data to the cloud is a waste of resources, causing high cloud costs without providing immediate, real-time insights.
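
One common way to act on this, sketched here in plain Python under simple assumptions, is to aggregate on the edge and forward only a summary per minute. The function name, the summary fields and the simulated constant 32-degree readings are illustrative choices, not part of any ObjectBox API.

```python
from statistics import mean


def summarize_per_minute(readings):
    """Collapse (timestamp_seconds, value) pairs into one summary per minute."""
    buckets = {}
    for timestamp, value in readings:
        minute = int(timestamp // 60)
        buckets.setdefault(minute, []).append(value)
    return [
        {
            "minute": minute,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for minute, values in sorted(buckets.items())
    ]


# Two minutes of simulated data at 20 readings per second (assumed values).
readings = [(i / 20, 32.0) for i in range(2 * 60 * 20)]
summaries = summarize_per_minute(readings)
```

In this example, 2400 raw readings shrink to two summary records, each covering 1200 readings, while the raw data stays available on the device for local analysis.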

The Best of Both Worlds: time series + object oriented data persistence

With ObjectBox you aren’t limited to only using time series data. ObjectBox TS is optimized for time series data, but ObjectBox is a robust object oriented database solution that can store any data type. With ObjectBox, model your world in objects and combine this with the power of time-series data to identify patterns in your data, on the device, in real time. By combining time series data with more complex data types, ObjectBox empowers new use cases on the edge based on a fast and easy all-in-one data persistence solution. 

Bring together different data streams for a fusion of data; mix and match sensor data with the ObjectBox time series dashboard and find patterns in your data. On top, ObjectBox takes care of synchronizing selected data between devices (cloud / on-premise) efficiently for you.

Get a complete picture of your data in one place

Use Case: Automotive (Process Optimization)

Most manufacturers, whether they produce cars, food, or utilities, have been optimizing production for a long time. However, there are still many cases in which costly manual processes prevail. One such example is automotive varnish. In some cases, while the inspection is automatic and intelligent, a lot of cars need to be touched up by hand because the factors leading to the errors in the paint have not yet been discovered. While there is a lot of internal expert know-how available from the factory workers, their gut feel is typically not enough to adapt production processes.

How can this be improved using time series and object data? 

The cars (objects) are typically already persisted, including all the mass customization and model information. If all data from the manufacturing site, including sensor data like temperature, humidity, and spray speed (all time-series data), is now persisted and added to each car object, any kind of correlation between production site variables, individual car properties, and varnish quality can be detected. The gut feel of the factory workers gives a great starting point for quick wins in the analysis and for detecting patterns, before longer-term effects and AI / automatic learning kick in to optimize the factory setup and reduce the need for paint touch-ups as much as possible.
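
The idea of attaching time series data to objects can be sketched in a few lines of plain Python. Everything here is invented for illustration: the `Car` fields, the humidity samples and the defect counts are made-up data, and a real analysis would need far more cars and variables before a correlation means anything.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Car:
    vin: str
    paint_start: float  # seconds since shift start
    paint_end: float
    defects: int        # touch-ups found at inspection


def mean_in_window(series, start, end):
    """Average of the (timestamp, value) samples that fall into [start, end)."""
    values = [v for t, v in series if start <= t < end]
    return mean(values)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented sample data: one humidity sample every 10 s and three cars.
humidity = [(t, 40.0 + t / 10) for t in range(0, 300, 10)]
cars = [
    Car("A", 0, 100, defects=1),
    Car("B", 100, 200, defects=2),
    Car("C", 200, 300, defects=3),
]
humidity_per_car = [mean_in_window(humidity, c.paint_start, c.paint_end) for c in cars]
r = pearson(humidity_per_car, [c.defects for c in cars])
```

Because the sample data was constructed so that humidity and defects rise together, `r` comes out at (almost exactly) 1; with real factory data, a value close to 0 would suggest looking at a different variable.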

Use Case: Smart Grids

Utility grid loads shift continually throughout the day, affecting grid efficiency, pricing, and energy delivery. Using Smart Grids, utility companies can increase efficiency and reliability in real time. In order to get insights from Smart Grids, companies need to collect a large volume of data from existing systems. A huge portion of this data is time series, e.g. usage and load statistics. On top, they incorporate other forms of data, e.g. asset relationship data, weather conditions, and customer profiles. Using visualization and analytical tools, these data types can be brought together to generate business insights and actionable operative goals.

ObjectBox TS: time series with objects

By storing and processing both time series data and objects on the edge, developers can gather complex data sets and get real-time insight, even when offline. Combining these data types gives a fuller understanding and context for data – not only what happens over time, but what other factors could be influencing results. Using a fast hybrid edge database allows developers to save resources while maintaining speed and efficiency. By synchronizing useful data to the cloud, real-time data can be used for both immediate action and post-event analysis.

Get in touch with our team to get a virtual demo of ObjectBox TS, or check out the sample GitHub repo to see more about the code.

Why Edge Computing is More Relevant in 2020 Than Ever

The world has recently been forced to digitize – both more quickly and to a greater extent; coronavirus has created the need to remodel how work, socializing, production, entertainment, and supply chains function. Despite decades of digitization efforts, the pandemic has made digitization challenges apparent. Many companies and countries now realize that they have fallen behind. And those that have not yet digitized are hit hardest by the pandemic. [1] With people leaning heavily on online digital solutions, internet infrastructure is at its capacity limit. [2] Accordingly, users are seeing broadband speeds drop by as much as half. [3] In Europe, governments even asked Netflix, Amazon Prime, YouTube and other streaming services to reduce their streaming quality to improve network speeds. [4]

These challenges bring to light the growing need for an alternative to cloud computing. Cloud computing is an inherently centralized computing paradigm. Edge Computing helps overcome many of the disadvantages of centralized computing. Edge Computing is inherently decentralized and keeps data local, at the ‘edge’ of the network. Edge Computing is ideal for both data-intensive content and latency-sensitive applications. Edge Computing makes efficient use of local data and reduces the amount of traffic in the network.

Coronavirus accelerates the need to digitize

It was clear even before the outbreak that internet infrastructure was struggling to keep up with growing data volumes. However, the pandemic has made broadband limitations more apparent to everyday users.

Projections estimate that by 2025 there will be 20 billion IoT devices [5] and 1.7MB of data created per second per person. It is slow, expensive, and wasteful to send all of this data to the cloud for storage and processing. This practice overburdens bandwidth and data center infrastructure. It makes projects expensive and unsustainable. Working with the data locally, on the edge, where it was produced and is used, is more efficient than sending everything to the cloud and back. It brings reduced latency, reduced cloud usage and costs, independence from a network connection, more secure data and heightened data privacy – and even reduces CO2. Indeed, prior to the pandemic, edge computing was on the strategic roadmap for over 50% of mobility decision makers. [6]

As the world begins to recover from the coronavirus pandemic, digitization efforts will no doubt increase. We will see intelligent systems implemented across industries and value chains, accelerating innovation and, alongside it, data volumes and the resulting strain on network bandwidth. Edge computing is a key technology to ensure that this digitalization is both scalable and sustainable.

Edge Computing takes the ‘edge’ off bandwidth strain

What is Edge Computing?

Today, over 90% of enterprise data is sent to the cloud to be stored and processed. By 2025, this number will drop to just 25%. [7] The remaining data is stored and used on the device it was created on. This is called edge computing. It entails that data is stored and used locally, on the “edge” of the network, e.g. a smart phone or IoT device. Edge computing delivers faster decision making, local and offline data processing, as well as reduced data transfer to the cloud (e.g. filtered, computed, extra- or interpolated data), which saves both bandwidth and cloud storage costs. 
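
As one example of what “filtered” data can mean in practice, the following sketch implements a simple dead-band filter in Python: a value is forwarded only when it differs from the last forwarded value by at least a threshold. The function name, the threshold of 0.5 and the sample values are assumptions chosen for illustration.

```python
def deadband_filter(values, threshold):
    """Yield only values that moved at least `threshold` away from the last value sent."""
    last_sent = None
    for value in values:
        if last_sent is None or abs(value - last_sent) >= threshold:
            last_sent = value
            yield value


# Assumed temperature samples collected on the device.
samples = [32.0, 32.1, 32.0, 32.6, 32.7, 33.4, 33.3]
to_cloud = list(deadband_filter(samples, threshold=0.5))
```

Here only three of the seven samples (32.0, 32.6 and 33.4) would leave the device; the right threshold depends on the sensor and on how much precision is needed centrally.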

The Edge complements the Cloud

Although some might set cloud and edge in competition, the reality is that edge computing and cloud computing are both useful and relevant technologies. Both have different strengths and ideal use cases. Together they can provide the best of both worlds: decentralized local storage and processing, making efficient use of hardware on the edge and central storing and processing of some data, enabling additional centralized insights, data backups (redundancy), and remote access. To combine the best of both worlds, relevant and useful data must be synchronized between the edge and cloud in a smart and efficient way.  

Edge computing is an ideal technology to reduce the strain on data centers, so those functions that need cloud connection have adequate bandwidth; while those use cases that benefit from reduced latency and offline functionality are optimized on the edge.

The Edge: interface between the Physical and the Digital World

Edge devices handle the interface between the physical world and the cloud, enabling a whole set of new use cases. “Data-driven experiences are rich, immersive and immediate. But they’re also delay-intolerant data hogs”. [8] They therefore need to happen locally, on the edge. We may see edge computing enabling new forms of remote engagement [9], particularly in a post-corona environment.

Edge devices can be anything from a thermostat or small sensor to a fridge or mobile phone or car – and they are part of our direct physical world and use data from their local environment to enable new use cases. Think self-stocking fridges, self-driving cars, drone-delivered pizzas. In the same way, Edge Computing is the key to the first real world search engine. I am waiting for it every day: “Hey Google, where are my keys?” Within a location like a house, the concepts and technologies to enable such a real-world search engine are all clear and available – it is just a matter of time and ongoing digitization. The basis will need to be a fast and sustainable edge infrastructure. 

Sustainability on the Edge

Centralized data centers consume a lot of energy, produce a lot of carbon emissions and cause significant electronic waste. [10] While there is a positive trend towards green data centers, an even more sustainable approach is to cut unnecessary cloud traffic, central computation and storage as much as possible by shifting computation to the edge. Edge Computing strategies that harness the power of already deployed, available hardware (e.g. smartphones, machines, desktops, gateways) make the solution even more sustainable.

Intelligent Edge: AI and Edge advance hand in hand

The growth of Artificial Intelligence (AI) and the Edge will go hand in hand. As more and more data is generated at the edge of the network, there will be a greater demand for intelligent data processing and structured optimization to reduce raw data loads going to the cloud. [11] Edge AI will have the power to work with data on local devices, keeping data streams more useful and usable. In the near future, Machine Learning applications will have the ability to learn and create unique, localized, decentralized insights on the edge – based on local inputs.

“With Edge AI, personalization features that we want from the app can be achieved on device. Transferring data over networks and into cloud-based servers allows for latency. At each endpoint, there are security risks involved in the data transfer”. [12] This is part of the reason why the Edge AI Software market is forecasted to reach a volume of 1.12 trillion dollars by 2023. The development of AI accelerators that improve model inferencing on the edge, namely from NVIDIA, Intel and Google, is helping to make AI on the edge more viable. [13] A fast edge database is a necessary base technology to enable more AI on the edge.

Edge Computing – an answer to Data Privacy concerns and a need for Resilience

As data collection grows in both breadth and depth, there is a stronger need for data privacy and security. Edge computing is one way to tackle this challenge: keeping data where it is produced, locally, makes data ownership clear and data less likely to be attacked and compromised. If compromised, the data compromised is clearly defined, making notification and subsequent actions manageable. ObjectBox, in its core and as an edge technology, is designed to keep data private, on those devices it was created on, and only share select data as needed. 

The more our private and working lives as well as the larger economy depend on digitalization, the more important it is that systems, underlying computing paradigms as well as networks have strong resilience and security. In computer networking, resilience is the ability to “provide and maintain an acceptable level of service in the face of faults and challenges to normal operation.” [14]

Edge Computing shifts computer workloads – the collection, processing, and storage of data – from central locations (like the cloud) to the edge of the network, to many individual devices such as cell phones. Accordingly, any strain is distributed across many devices. Therefore, the risk of a total breakdown is reduced: if one device stops working, the rest keep working. Depending on the setup, the individual devices could even compensate for devices that have a problem.

The same applies to security risks: Even if data from one device is compromised, all other data sets are still safe; the loss is thus very limited and clear. Overall, as a complement to the cloud, edge computing provides improved strength and security in local networks around the world. These local infrastructures can relieve the pressure on the existing complex dependencies, and in turn make the wider system more resilient and flexible. With Edge Computing, crisis response can therefore in all likelihood be faster, better informed, and more effective. [15]

Why Corona-Tracking-Apps need to work on the edge

There has been quite some debate about taking a centralized versus decentralized approach to Corona-Tracking-Apps. [16] Many people are rightly worried about their data. Edge Computing – storing most parts of the data locally, on the user’s device – could be a great way to avoid unnecessary data sharing and keep data ownership clear. At the same time, data would by and large be much more secure and less likely to be attacked and hacked, as there is far less data to be gained from any single attack. An intelligent syncing mechanism then makes sure that the data which needs to be shared is shared in a selective, transparent and secure way.

UPDATE 01.05.2020: The German government changed its initial decision and will now be using a decentralized approach, storing as much data as possible on the edge, for the Corona-Tracking-App.

The next few years will see big cultural changes in both our personal and professional lives – a portion of those changes will be driven by increased digitalization. Edge computing is an important paradigm to ensure these changes are sustainable, scalable, and secure. Ultimately, we have the chance to rise from this crisis with new insights, new innovation, and a more sustainable future.

1. https://www.netzoekonom.de/2020/04/11/die-oekonomie-nach-corona-digitalisierung-und-automatisierung-in-hoechstgeschwindigkeit/
2. https://www.cnet.com/news/coronavirus-has-made-peak-internet-usage-into-the-new-normal/
3. https://www.nytimes.com/2020/03/26/business/coronavirus-internet-traffic-speed.html
4. https://www.theverge.com/2020/3/27/21195358/streaming-netflix-disney-hbo-now-youtube-twitch-amazon-prime-video-coronavirus-broadband-network
5. https://www.gartner.com/imagesrv/books/iot/iotEbook_digital.pdf
6. https://www.forbes.com/sites/forrester/2019/12/02/predictions-2020-edge-computing-makes-the-leap/#1aba50104201
7. https://www.gartner.com/smarterwithgartner/what-edge-computing-means-for-infrastructure-and-operations-leaders/
8. https://www.iotworldtoday.com/2020/03/19/ai-at-the-edge-still-mostly-consumer-not-enterprise-market/
9. https://www.accenture.com/us-en/insights/high-tech/edge-processing-remote-viewership
10. https://link.springer.com/article/10.1007/s12053-019-09833-8
11. https://www.forbes.com/sites/cognitiveworld/2020/04/16/edge-ai-is-the-future-intel-and-udacity-are-teaming-up-to-train-developers/#232c8fab68f2
12. https://www.forbes.com/sites/cognitiveworld/2020/04/16/edge-ai-is-the-future-intel-and-udacity-are-teaming-up-to-train-developers/#232c8fab68f2
13. https://www.forbes.com/sites/janakirammsv/2019/07/15/how-ai-accelerators-are-changing-the-face-of-edge-computing/#2c1304ce674f
14. https://en.wikipedia.org/wiki/Resilience_(network)
15. https://www.coindesk.com/how-edge-computing-can-make-us-more-resilient-in-a-crisis
16. https://venturebeat.com/2020/04/13/what-privacy-preserving-coronavirus-tracing-apps-need-to-succeed/

Connecting database performance and business value – a fast edge database is a money saver

We frequently get asked: “Why does database performance matter?” “What is the business value of database speed?”

As a developer, it seems clear that database performance matters. At the very least, a fast database that gives you out-of-the-box speed saves time and nerves during development. Any piece of the tech stack that works superfast makes a developer’s job easier. But there is more to it. And in the following, we will reason why and how database performance impacts businesses to hopefully inspire ideas on how to quantify this for your business case.

Data should be available when needed, where needed

We all dream of a future transformed by data. Cars that drive themselves to be repaired before a failure occurs. Fridges that are restocked while we are at work. Reducing resource waste to an absolute minimum. Building sustainable cities and communities.[1] It is truly amazing what is possible today…

Then reality hits: Before you can implement amazing solutions to make the world a better place for everyone, someone needs to solve the technical challenges, including hidden requirements. For example: you need the necessary data, and you need it available when needed, where needed. This often isn’t that simple. Data persistence, database speed, and data synchronization are typical non-functional or “hidden” requirements. These are prerequisite technologies that allow the application to access, process and possibly display the data required to answer a request (from another application or from a user), and thus enable the functionalities / features. All in all, this is a pretty fundamental requirement. And it pays off to build your app on top of a solid foundation: if you do, every feature you dream up, now or later, will be easier and faster to implement.

Functional and non-functional requirements – the hidden challenges of your IoT project

While you need data in any application, most often no one will write down where and how to handle it as a user story or requirement. As opposed to features, e.g. “being able to search for names in the address book”, data persistence, database speed, and often even data synchronization are “hidden requirements”. Data is just expected to be available where needed, when needed. Whether the data you need really will be available when you need it depends strongly on the database the application is using and where this database runs. On top, the mechanisms you employ to exchange data between different devices (end devices, servers, …) matter.

Hidden requirements are one of the major reasons why the Industry 4.0 dream is still in many respects a dream and not a reality – in Europe at least – despite having been a topic for more than 10 years. [2]

Database performance 

What is a database?

A database is a piece of software that allows the storage and systematic use of digital information. A database typically allows developers to store, access, search, update, query, and otherwise manipulate data in the database via a developer language or API. These types of operations are done within an application, in the background, typically hidden from end users. Most applications need a database as part of their technology stack.

What is database performance?

We like and therefore use the following definition from Craig Mullins (2002): “Database performance can be defined as the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.” [3]
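
Throughput, the central quantity in this definition, is simply the amount of work done per unit of time. As a rough sketch of how it can be measured, the following Python snippet times a batch insert into an in-memory SQLite table using the standard sqlite3 module. The table layout and row count are arbitrary assumptions, and the result is in no way a benchmark: it varies with hardware, storage, and transaction settings.

```python
import sqlite3
import time


def measure_insert_throughput(n_rows=10_000):
    """Return (rows stored, inserts per second) for one batch insert."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reading (ts REAL, value REAL)")
    rows = [(float(i), 32.0) for i in range(n_rows)]

    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO reading VALUES (?, ?)", rows)
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
    conn.close()
    return count, count / elapsed
```

Measuring the same workload with one transaction per row, or against a file on the target device instead of memory, usually gives very different numbers, which is why throughput figures are only meaningful together with a description of the setup.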

Why does it matter if the database runs on the edge or in the cloud?

An edge database holds data on the (end) devices, where the data is used – and typically additionally sends some parts of the data to a central place like an on-premise server or the cloud. As opposed to this, a server / cloud-based database holds all data on the server / in the cloud. Where the data sits determines from where, when and how it can be accessed. If all data is on a central server or in the cloud, the prerequisite to accessing this data is a working network connection.

Online

Offline

It follows that edge applications are based upon a distributed computing paradigm, allowing edge devices to be autonomous. On the other hand, cloud-based applications are based on the centralized computing paradigm, where one central instance is in charge, with all other devices being dependent upon this central instance. This significantly affects the response time of the application, the availability of the application, and, last but not least, the bandwidth needed for the application, which also translates into cloud costs.

Location matters: while a fast database gives you fast response times, if the database sits in the cloud and is called from edge devices, you need to factor in the time it takes to send the request and receive the response. And with any network, you can neither guarantee response times nor ensure that the connection is always available. While this is not database performance itself, it strongly affects application performance.
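A back-of-the-envelope sketch of this point: the time an application waits for its data is the network round trip plus the database lookup. All numbers below are illustrative assumptions, not measurements, and `response_time_ms` is a hypothetical helper.

```python
def response_time_ms(db_lookup_ms: float, network_round_trip_ms: float = 0.0) -> float:
    """Time until the application has its answer: network round trip plus lookup."""
    return network_round_trip_ms + db_lookup_ms


# Assumed values for illustration only.
# Edge: the database runs on the device, so there is no network hop.
edge_ms = response_time_ms(db_lookup_ms=1.0)
# Cloud: even with a faster lookup, the round trip dominates.
cloud_ms = response_time_ms(db_lookup_ms=0.5, network_round_trip_ms=80.0)

print(f"edge:  {edge_ms:.1f} ms")
print(f"cloud: {cloud_ms:.1f} ms")
```

With these assumed numbers, the remote setup is slower overall even though its database lookup is twice as fast, and the network part is the one that cannot be guaranteed.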

The impact of database performance on your business

Database performance matters, whether your solution needs the speed to react in (near) real time, to keep your users (customers, employees, …) happy, productive, and buying, or simply to save costs on stronger edge hardware and the cloud. “Considering that even a single moment of latency or downtime can cost companies thousands of dollars, the speed advantages of edge computing cannot be overlooked.” [4]

The necessity of database speed for mission-critical, security-relevant, (near) real-time functionality

If you need near real-time functionality, every piece of the tech stack matters, but the database has a particularly strong impact on the response times of your application. Consider autonomous driving, healthcare and security applications, or IIoT solutions for production lines: any application supporting such a scenario needs to respond quickly and reliably. “This is not the same as a lag in loading your favorite cat pictures. A lag in a moving vehicle scenario is a matter of life and death.” [5]

Accordingly, if end devices like cars, smartphones, health trackers, or machines on the factory floor are involved, a purely cloud-based application is not an option. Data needs to be stored and used directly on the devices. Thus, an edge database is necessary, ideally an extremely fast one.

Examples of use cases with a need for database speed

Autonomous driving capabilities are a special edge computing case that requires significant compute power to run the algorithms in real time within the control unit of the car. As can easily be deduced from first-hand driving experience, every millisecond counts during this kind of constant information processing and instantaneous decision making. Information processing speed and reliability (guaranteed QoS parameters) are of the essence for autonomous driving.

Moving to a purely monetary example, let’s consider roadside tolling. In roadside tolling, the edge devices on the side of the road need to process the information from a moving vehicle in order to identify the car, bill according to usage, and detect violators. Ideally, they even inform the car owner of the result. As the car is constantly moving and may be going fast, all of this needs to happen in a very short amount of time. A very fast database lookup on the edge is key to avoiding lost revenue and delivering good customer service.

For a final example, let us look at additive manufacturing. 3D printers use layering techniques with a variety of materials to quickly create custom-designed parts. During the layering process, the controller needs to quickly and efficiently incorporate small changes in the environment (e.g. an increase in temperature) to ensure the quality and accuracy of the part. Faster and more precise manufacturing is currently limited by the I/O throughput. With a fast database, the I/O throughput is higher, allowing for more complex and more finely detailed production.

In short: a superfast database is not a nice-to-have, it is a must-have. The speed a database delivers out of the box is critical for such applications.

The impact of database speed on Sales, Conversions, Retention (or at least, nerves) 

There is a reason Google pushes companies to optimize their websites and mobile applications for performance: a wealth of research and evidence suggests that the response times of websites and mobile applications impact user behavior significantly. [6] Moreover, several studies provide evidence that response times impact actual buying behavior. [7] While there is less research on other digital applications, such as desktop apps or workplace software, some studies have shown that having to work with slow applications decreases employee satisfaction and productivity. [8]

The impact of database speed on battery, CPU, hardware and related resources

Another typical hidden requirement is resource efficiency with regard to CPU, RAM, disk space, and battery / electricity. For any application running in the cloud, these requirements are absorbed in the backend because the cloud can scale. It “only” adds to cloud costs (and is a waste of energy, not to mention all the infrastructure / hardware enabling that waste).

On the edge, you typically work with restricted devices, meaning you can only use the device’s own resources, which can be pretty limited. Therefore, inefficient applications can push a device to its limits, leading to e.g. slow response times, crashes, and battery drain. Security is a necessary cross-stack functionality that often impacts performance as well. While data that stays on the edge is harder to attack, edge data needs to be protected just like data in the cloud.

How database performance impacts the business value of your IoT application

All applications on one device share the available hardware capabilities; resource allocation is managed by the operating system. Accordingly, the more resources an application or the database uses, the fewer resources are available for other uses. The faster a database executes its operations, the less CPU it uses, the less battery / electricity, and typically also the less memory. In practice, that means there are more resources available on the device to run e.g. Edge AI or Edge ML applications.

From a business value perspective that means:

  • You can save on hardware costs (CPU, RAM, disk, …): either do more on existing / chosen hardware, upgrade hardware later, or choose smaller and thus less expensive hardware.
  • You can save on energy and cloud costs: the more efficient the application, the lower the electricity and cloud costs. This can add up tremendously as projects scale.
  • You can add more features, deliver more functionality, and make your application more secure within a given environment.
  • You can deliver a smooth, fast user experience, enabling applications that respond in near real time.

In sum, database performance clearly impacts the cost structure and the value you can deliver.

Database performance impacts business value, directly and indirectly

As projects scale in size and scope, hidden requirements like database performance often become apparent. At scale, small issues, like delayed data or growing data volumes, become big headaches. Ideally, these sorts of requirements would be at the heart of the design stage of any project and budgeted for at the beginning. The choice of database clearly has a huge impact on the business success of IoT applications.

[1] See https://www.weforum.org/agenda/2018/01/effect-technology-sustainability-sdgs-internet-things-iot/ for IoT impact on Sustainable Development Goals (SDG)
[2] https://restart-project.eu/much-know-industry-4-0/
https://www.mdpi.com/2076-3387/9/3/71/pdf
[3] Craig Mullins, Database Administration: The Complete Guide to Practices and Procedures, 2002
[4] https://www.vxchnge.com/blog/the-5-best-benefits-of-edge-computing
[5] https://www.zdnet.com/article/why-autonomous-vehicles-will-rely-on-edge-computing-and-not-the-cloud/
[6] https://developers.google.com/web/fundamentals/performance/why-performance-matters
https://www.thinkwithgoogle.com/intl/en-154/insights-inspiration/research-data/need-mobile-speed-how-mobile-latency-impacts-publisher-revenue/
https://www.machmetrics.com/speed-blog/how-does-page-load-time-affect-your-site-revenue
https://datadome.co/bot-management-protection/website-performance-how-to-increase-your-business-by-blocking-bots/
[7] https://developers.google.com/web/fundamentals/performance/why-performance-matters
https://www.thinkwithgoogle.com/intl/en-154/insights-inspiration/research-data/need-mobile-speed-how-mobile-latency-impacts-publisher-revenue/
https://www.machmetrics.com/speed-blog/how-does-page-load-time-affect-your-site-revenue
https://datadome.co/bot-management-protection/website-performance-how-to-increase-your-business-by-blocking-bots/
[8] https://drum.lib.umd.edu/handle/1903/1233
https://www.tandfonline.com/doi/abs/10.1080/01449290500196963

 

Car Tolling – A case for Edge Computing

Governments often face tight budgets for infrastructure development; car tolling is increasingly seen as the answer for raising funds¹, making it more and more prevalent. From 2008 to 2018 the total length of tolled roads in Europe increased by 23%² and tolling revenue in Europe increased by 37%³ to €31.3 bn. per year; similarly, from 2010 to 2015 the United States experienced a 63% increase in transponders and 52% more tolling revenue, resulting in $13.8 bn. in 2015. On top of that, despite car sharing efforts, car ownership and traffic are still increasing in many countries, e.g. Germany, France and India. Increasing amounts of traffic, devices, and data points push current tolling solutions to their limits. Taking data to the edge in new and existing tolling solutions, for example with the ObjectBox data storage and synchronization solution, can make tolling more efficient and reliable.

Setting the stage: a typical car tolling situation

A national infrastructure company has deployed several hundred car tolling stations all over the country. These stations automatically recognize passing cars either visually, by detecting license plates, or wirelessly, e.g. by receiving data from an RFID transponder in the car. To ensure that only eligible cars pass through the tolling station and that violators are fined, the tolling station software has to look up the gathered vehicle information, among millions of entries, as fast as possible. If the data look-up is not fast enough, or the data on the roadside units / tolling stations isn’t up to date and in sync with the central data, the tolling station loses money.
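To illustrate the look-up step, here is a minimal sketch of an on-station index keyed by license plate. It uses a plain Python dictionary in place of a real edge database; the record layout, the plate format, and the rule that unknown plates count as violations are all assumptions made for this example.

```python
from enum import Enum


class Decision(Enum):
    ELIGIBLE = "eligible"
    VIOLATION = "violation"


def build_local_index(records):
    """Index (plate, has_valid_account) pairs by plate for constant-time look-ups."""
    return {plate: has_valid_account for plate, has_valid_account in records}


def check_vehicle(index, plate):
    """Decide locally, without a network call. Unknown plates count as violations here."""
    return Decision.ELIGIBLE if index.get(plate, False) else Decision.VIOLATION


# Hypothetical local copy of the central data: every tenth plate has no valid account.
records = [(f"AB-{n:06d}", n % 10 != 0) for n in range(100_000)]
index = build_local_index(records)

print(check_vehicle(index, "AB-000123").value)
print(check_vehicle(index, "AB-000120").value)
print(check_vehicle(index, "ZZ-999999").value)
```

The point of the sketch is that the decision is made from data already on the station, so its speed depends on the local look-up rather than on a remote call.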

“The importance of mobile apps is increasing for Kapsch TrafficCom so that we see ObjectBox’ edge computing database solution as an interesting future base technology for all types of mobility apps.”

Peter Ummenhofer

Executive VP Solution Management, Kapsch TrafficCom

Why edge computing and fast lookup are key to today’s car tolling systems

In general, modern nationwide tolling infrastructure consists of three systems: tolling stations operated by the respective agencies, central open road tolling (also called mobile tolling), and central transaction clearing houses. Within this infrastructure, all data related to violators, as well as other operational information, needs to be synchronized consistently between these three systems, with as little delay as possible. If it is not, car tolling system operators face high monetary losses every day, on top of other problems.

Today’s car tolling systems are based on the fundamental idea that cars do not need to stop to be checked or charged. Thus, as the cars move quickly through the scanning area, the challenge of implementing a car tolling system relates directly to the amount of data that needs to be searched within a very short time frame. To be successful, this process needs to happen in near real time. From a development perspective, the difficulties are rooted in:

  • accessing data from a remote location (speed of communication, speed of network)
  • keeping data in synchronization with car tolling stations that are closer to the drivers and/or roadside units
  • database speed on remote servers
  • database speed on roadside units (car tolling edge devices)
  • limitations of existing hardware as some systems are quite old, and rolling out new hardware is expensive
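The time constraint behind these points can be sketched with simple arithmetic: the time a vehicle spends in the scanning area is the available budget, and every processing step has to fit into it. The speed, zone length, and step durations below are assumed values for illustration, not figures from a real tolling system.

```python
def time_in_zone_ms(speed_kmh, zone_length_m):
    """Milliseconds a vehicle spends inside the scanning area."""
    speed_m_per_s = speed_kmh / 3.6
    return zone_length_m / speed_m_per_s * 1000.0


def remaining_budget_ms(budget_ms, step_times_ms):
    """Budget left after all steps; a negative value means the car is already gone."""
    return budget_ms - sum(step_times_ms.values())


# Assumed scenario: a car at 120 km/h crossing a 10 m scanning area.
budget = time_in_zone_ms(speed_kmh=120, zone_length_m=10)

# Assumed step durations in milliseconds.
remote = {"plate recognition": 100, "network round trip": 150, "remote database lookup": 60}
local = {"plate recognition": 100, "local database lookup": 5}

remaining_remote = remaining_budget_ms(budget, remote)
remaining_local = remaining_budget_ms(budget, local)

print(f"budget: {budget:.0f} ms")
print(f"remaining with remote lookup: {remaining_remote:.0f} ms")
print(f"remaining with local lookup:  {remaining_local:.0f} ms")
```

Under these assumptions the remote variant overruns the budget, while the local variant leaves room for slower hardware or additional checks.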

Furthermore, it is possible that stations shut down from time to time, due to the weather, power outages, vandalism or simply technical failures. However, tolling providers generally need to provide strict uptime guarantees and thus service level agreements often include penalty fees in case of excessive downtime. Such events cost the providers substantial amounts of money – and data loss, i.e. undetected violators, even more so.

Adding to this, privacy and legal requirements differ from country to country and increase the complexity of the systems and timings. For example, in Austria the pictures and the derived license plate information may only be used for checking; if no violation is detected, they must be deleted irrecoverably¹⁰. The data of potential violators, on the other hand, may be stored for the sole purpose of toll collection or prosecution, but only for a maximum of three years.
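Such a rule can be expressed as a small retention check that the station software runs on every stored record. The sketch below is a simplified reading of the rule as described in this article, not a legal interpretation; the function name, the parameters, and the approximation of three years as 1095 days are assumptions.

```python
from datetime import datetime, timedelta

# "Three years" approximated as 3 * 365 days for this sketch.
MAX_VIOLATOR_RETENTION = timedelta(days=3 * 365)


def must_delete(is_potential_violator: bool, captured_at: datetime, now: datetime) -> bool:
    """Return True if a captured record has to be deleted."""
    if not is_potential_violator:
        # No violation detected: the data may not be kept after the check.
        return True
    # Potential violator: keep only while the maximum retention period has not passed.
    return now - captured_at > MAX_VIOLATOR_RETENTION


captured = datetime(2020, 1, 1)
print(must_delete(False, captured, datetime(2020, 1, 1)))
print(must_delete(True, captured, datetime(2021, 1, 1)))
print(must_delete(True, captured, datetime(2023, 6, 1)))
```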

How fast data storage and syncing can help in car tolling

To solve these problems, a data storage and data synchronization solution like ObjectBox can be deployed on every type of tolling station, i.e. open and static stations, as well as on the central server. From a technical point of view, this is not a problem, because the ObjectBox library supports virtually all platforms and operating systems. Financially, it is considerably cheaper to update software than to upgrade hardware.

With the library installed, ObjectBox Sync ensures that the vehicle data in each station’s internal memory is kept up to date with the central server, so the station makes its decisions based on the most accurate data every time. Additionally, the other systems involved in the tolling infrastructure consistently receive the most recent information with no further effort required.

Because ObjectBox is particularly reliable (ACID compliant) and well-tested, deploying the synchronization solution also means that station shutdowns or internet connection issues are no longer a problem. The stations’ operating company will no longer lose violators’ information for technical reasons.
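The general pattern behind this is store-and-forward: write every record to local transactional storage first, and forward it once the connection works. The sketch below is a generic illustration that uses Python’s built-in `sqlite3` module as a stand-in for local storage; it is not the ObjectBox Sync API, and the `Outbox` class, its methods, and the payload strings are hypothetical.

```python
import sqlite3


class Outbox:
    """Stores records locally first and forwards them when the uplink works."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS outbox ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
            )

    def record(self, payload):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))

    def pending(self):
        rows = self.conn.execute("SELECT payload FROM outbox ORDER BY id").fetchall()
        return [payload for (payload,) in rows]

    def flush(self, send):
        """Forward pending records in order; return how many were sent."""
        sent = 0
        rows = self.conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
        for row_id, payload in rows:
            try:
                send(payload)
            except ConnectionError:
                break  # uplink down: keep the remaining records for the next attempt
            with self.conn:
                self.conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            sent += 1
        return sent


def offline(payload):
    """Simulates a broken connection to the central server."""
    raise ConnectionError("no uplink")


received = []  # stands in for the central server
outbox = Outbox()
outbox.record("violation:AB-000120")
outbox.record("violation:ZZ-999999")

sent_offline = outbox.flush(offline)          # nothing is sent, nothing is lost
pending_offline = outbox.pending()
sent_online = outbox.flush(received.append)   # connection is back
print(sent_offline, len(pending_offline), sent_online, len(received))
```

A record is only removed from the local store after it has been handed over successfully, so an outage delays delivery but does not drop data.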

Summary – Car tolling is moving to the edge

As this case study shows, edge computing is a very good fit for modern infrastructure. In the context of car tolling, speed, reliable data storage, and synchronization are indispensable, which makes ObjectBox an effective solution for today’s requirements and for future technological advancements.

If you are interested in learning more, feel free to get in touch with us! We appreciate any kind of feedback.