
On-Device AI Goes Mainstream on Android

This article is a recap of my Droidcon Berlin 2025 talk, so the hands-on, practical part focuses on Android and Mobile AI. You can find the slides here. In the talk, we discussed why the shift towards Edge AI is so important, especially for Android developers, what opportunities it opens up, how developers can start building with on-device AI today, and what to expect.

Edge AI may also be called On-device AI, Mobile AI, or Local AI.

Artificial Intelligence (AI) is shifting from the cloud to the edge — onto our phones, cars, and billions of connected devices. This move, often described as Edge AI (What is Edge AI?), unlocks AI experiences that are private, fast, and sustainable.


Why Edge AI Now?

Two megatrends are converging:

  • Edge Computing - processing data where it is created: on the device, locally, at the edge of the network. This is called "Edge Computing", and it is growing.
  • AI - AI capabilities and adoption are expanding rapidly, and need no further explanation.
Edge AI: Where Edge Computing and AI intersect

Where these two trends overlap, at the intersection, we speak of Edge AI (also called local AI or on-device AI; for the mobile subset, "Mobile AI").

The shift to Edge AI is driven by use cases that:

  • need to work offline
  • have to comply with specific privacy or data requirements
  • generate more data than the available bandwidth can transfer
  • need to meet real-time or other quality-of-service (QoS) response requirements
  • are not economically viable when using the cloud / a cloud AI
  • aim to be sustainable
Edge AI drivers (benefits)

If you're interested in the sustainability aspect, see also: Why Edge Computing matters for a sustainable future

It's not Edge AI vs. Cloud AI - the reality is Hybrid AI

Of course, while we see a market shift towards Edge Computing, there is no Edge Computing vs. Cloud Computing: the two complement each other, and the question is mainly: how much edge does your use case need?
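To make "how much edge do you need?" concrete, here is a minimal, purely illustrative sketch of a per-request routing decision in a hybrid app. The class and parameter names are hypothetical, not from any real library; real apps would base this on richer signals (connectivity state, model availability, user consent).

```java
// Hypothetical sketch: deciding per request how much "edge" a use case needs.
public class AiRouter {
    public enum Target { ON_DEVICE, CLOUD }

    public static Target route(boolean offline, boolean privacySensitive,
                               boolean latencyCritical, boolean complexTask) {
        // Offline, privacy-sensitive, or latency-critical work stays on the device.
        if (offline || privacySensitive || latencyCritical) return Target.ON_DEVICE;
        // Heavy, complex tasks may fall back to a larger cloud model.
        return complexTask ? Target.CLOUD : Target.ON_DEVICE;
    }

    public static void main(String[] args) {
        System.out.println(route(false, true, false, true));  // privacy wins: ON_DEVICE
        System.out.println(route(false, false, false, true)); // complex, no constraints: CLOUD
    }
}
```

The point of the sketch: the edge/cloud split is a per-use-case (even per-request) decision, not a one-time architectural verdict.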


Every shift in computing is empowered by core technologies


What are the core technologies empowering Edge AI?

If every megashift in computing is powered by core tech, what are the core technologies empowering the shift to Edge AI?

Typically, Mobile AI apps need three core components:

  1. An on-device AI model (e.g. a Small Language Model, SLM)
  2. A vector database
  3. Data sync for hybrid architectures (Data Sync Alternatives)
The core technologies empowering Edge AI

A look at AI models

The trend that "bigger is better" has been broken: the rise of SLMs and small AI models

Large foundation models (LLMs) remain costly and centralized. In contrast, Small Language Models (SLMs) bring similar capabilities in a lightweight, resource-efficient way.

SLM quality and cost comparison
  • Up to 100x cheaper to run
  • Faster, with lower energy consumption
  • Near-Large-Model quality in some cases

This makes them ideal for local AI scenarios: assistants, semantic search, or multimodal apps running directly on-device. However....

Frontier AI Models are still getting bigger and costs are skyrocketing


Why this matters for developers: Monetary and hidden costs of using Cloud AI

Running cloud AI comes at a cost:

  • Monetary costs: the cloud cost conundrum (Andreessen Horowitz 2021) is fueled by cloud AI; margins shrink as data center and AI bills grow (Gartner 2025)
  • Dependency: a few tech giants hold all major AI models, the data, and the know-how, and they make the rules (e.g. thin AI layers on top of huge cloud AI models will fade away due to vertical integration)
  • Data privacy & compliance: sending data around adds risk, and so does sharing it (what are you agreeing to?)
  • Sustainability: large models consume far more energy, and transmitting data unnecessarily does too (think of buying apples from New Zealand in Germany instead of local produce) (Sustainable Future with Edge Computing).

What about Open Source AI Models?

Yes, they are an option, but be mindful of the potential risks and caveats. With commercial models, part of what you pay for is freedom from liability risks; with open-source models, that responsibility stays with you.


While SLMs are all the rage, Edge AI is really about specialised AI models (at this moment...)


On-device Vector Databases are the second essential piece of the Edge AI Tech Stack

  • Vector databases are basically the databases for AI applications: AI models work with vectors (vector embeddings), and vector databases make storing and working with these embeddings easy and efficient.
  • Vector databases offer powerful vector search and querying capabilities, provide additional context and filtering mechanisms, and give AI applications a long-term memory.
  • Most AI applications need a vector database, e.g. for Retrieval-Augmented Generation (RAG) or agentic AI, but vector databases are also used to make AI apps more efficient, e.g. by reducing LLM calls and providing faster responses.
Note: On-device (or Edge) vector databases have a small footprint (a couple of MB, not hundreds of MB) and are optimized for efficiency on resource-restricted devices.

Edge vector databases, or on-device vector databases, are still rare. Some server- and cloud-oriented vector databases have recently begun positioning themselves for edge use. However, their relatively large footprint (several hundred MB) makes them unsuitable for truly resource-constrained embedded devices, although they can of course work on laptops and local PCs. More importantly, solutions derived by scaling down larger systems are generally not optimized for restricted environments, resulting in higher computational demands and increased battery consumption, which in the end also leads to increased costs.
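To illustrate the core operation a vector database provides, here is a minimal, self-contained sketch of nearest-neighbor search over embeddings by cosine similarity. It is a toy brute-force scan; a real on-device vector database would use an index structure instead, but the query semantics are the same. All names here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of vector search: find the stored embedding closest to a query.
public class TinyVectorSearch {
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Brute-force nearest neighbor over an in-memory map of id -> embedding.
    static String nearest(Map<String, float[]> store, float[] query) {
        String best = null;
        double bestScore = -2; // cosine similarity is always >= -1
        for (Map.Entry<String, float[]> e : store.entrySet()) {
            double score = cosine(e.getValue(), query);
            if (score > bestScore) { bestScore = score; best = e.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, float[]> store = new HashMap<>();
        // Pretend these are embeddings of OCR'd screenshot text.
        store.put("invoice.png",  new float[]{0.9f, 0.1f, 0.0f});
        store.put("vacation.png", new float[]{0.0f, 0.2f, 0.9f});
        System.out.println(nearest(store, new float[]{0.8f, 0.2f, 0.1f})); // invoice.png
    }
}
```

In practice, the embeddings come from an on-device embedding model, and the database handles persistence, indexing, and filtering on top of this similarity search.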

Vector Databases

Developer Story: On-device AI Screenshot Searcher Example App

To test the waters, I built a Screenshot Searcher app with the ObjectBox vector database:

  • OCR text extraction with ML Kit
  • Semantic search with MediaPipe and ObjectBox
  • Image similarity search with TensorFlow Lite and ObjectBox
  • Image categorization with ML Kit Image Labeling

This was easy and took less than a day. However, I learned more from the things I tried that weren't easy... ;)

What I learned about text classification (and what hopefully helps you)

On-device Text Classification Learnings

In short: see the finetuning section below - without finetuning, no model; without a model, no text classification.

What I learned about finetuning (and what hopefully helps you)

Finetuning learnings (exemplary, based on finetuning with the DBpedia dataset)

The finetuning attempt failed - I will try again ;)

What I learned about integrating an SLM (Google's Gemma)

Integrating Gemma was super straightforward; it worked on-device in less than an hour. Just don't try to use the Android emulator (AVD): running Gemma on the emulator is not recommended, and it also did not work for me.

Using Gemma on Android

In this example app, we are using Gemma to enhance the screenshot search with an additional AI layer:

  • Generate intelligent summaries from OCR text
  • Create semantic categories and keywords
  • Enhance search queries with synonyms and related terms
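As an example of the last point, the query-enhancement step boils down to building a prompt for the on-device SLM. The sketch below is hypothetical (the prompt wording and helper name are mine, not from the app); the resulting string would then be passed to Gemma, e.g. via MediaPipe's LLM Inference API.

```java
// Hypothetical sketch of query expansion with an on-device SLM.
public class QueryExpansion {
    static String buildPrompt(String query) {
        return "List up to 5 synonyms or closely related search terms for: \""
                + query + "\". Answer as a comma-separated list only.";
    }

    public static void main(String[] args) {
        String prompt = buildPrompt("flight ticket");
        System.out.println(prompt);
        // The prompt would then be sent to the local model, for example:
        // String expansion = llmInference.generateResponse(prompt);
        // and the returned terms added to the vector / keyword search.
    }
}
```

Constraining the output format ("comma-separated list only") makes the small model's response easy to parse on-device.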

Overall assessment of the practical, hands-on state of On-device AI on Android

It's already fairly easy - and vibe coding an Edge AI app is very doable. While I would of course recommend the latter only for prototyping and testing, it is amazing what you can already do on-device with AI, even without being a developer!

Final Tech Stack

Key Questions to Ask Yourself

  • How much edge vs. cloud do you need?
  • Which tasks benefit from local inference?
  • What data must remain private?
  • How can you make your app cost-efficient long term?

How to Get Started


Conclusion

We’re at an inflection point: AI is moving from centralized, cloud-based services to decentralized, personal on-device AI. With SLMs, vector databases, and data sync, developers can now build AI apps that are:

  • Private
  • Offline-first
  • Cost-efficient
  • Sustainable

The future of AI is not just big — it's also small, local, and synced.

AI Anytime Anywhere Future