On-Device AI Goes Mainstream on Android
This article is a recap of my Droidcon Berlin 2025 talk, so the hands-on, practical part focuses on Android and Mobile AI. You can find the slides here. In the talk, we discussed why the shift towards Edge AI is so important - especially for Android developers - what opportunities it opens up, how developers can start building with on-device AI today, and what to expect.
Artificial Intelligence (AI) is shifting from the cloud to the edge — onto our phones, cars, and billions of connected devices. This move, often described as Edge AI (What is Edge AI?), unlocks AI experiences that are private, fast, and sustainable.
Why Edge AI Now?
Two megatrends are converging:
- Edge Computing - processing data where it is created: on the device, locally, at the edge of the network. This is called "Edge Computing", and it is growing fast.
- AI - AI capabilities and adoption are expanding rapidly and need no further explanation.
--> Where these two trends intersect, we speak of Edge AI (also called local AI or on-device AI, or, for the mobile subset, "Mobile AI").
The shift to Edge AI is driven by use cases that:
- need to work offline
- have to comply with specific privacy / data requirements
- generate more data than the available bandwidth allows to transfer
- need to meet realtime or specific response-rate (QoS) requirements
- are not economically viable when using the cloud / a cloud AI
- want to be sustainable
If you're interested in the sustainability aspect, see also: Why Edge Computing matters for a sustainable future
It's not Edge AI vs. Cloud AI - the reality is Hybrid AI
Of course, while we see a market shift towards Edge Computing, it is not Edge Computing vs. Cloud Computing - the two complement each other, and the main question is: How much edge does your use case need?
What are the core technologies empowering Edge AI?
Every major shift in computing has been enabled by core technologies. So, which core technologies are powering the shift to Edge AI?
Typically, Mobile AI apps need three core components:
- An on-device AI model (e.g. SLM)
- A vector database
- Data sync for hybrid architectures (Data Sync Alternatives)
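To make this stack concrete on Android, the dependency setup could look roughly like the sketch below. The artifact coordinates correspond to the libraries used later in this article (MediaPipe, ML Kit, ObjectBox), but the version numbers are illustrative assumptions, and ObjectBox additionally requires its Gradle plugin.

```kotlin
// app/build.gradle.kts - illustrative dependencies for a typical Mobile AI stack
dependencies {
    // On-device LLM/SLM runtime (MediaPipe LLM Inference, e.g. for Gemma)
    implementation("com.google.mediapipe:tasks-genai:0.10.14")
    // On-device OCR (ML Kit text recognition, bundled Latin model)
    implementation("com.google.mlkit:text-recognition:16.0.0")
    // On-device vector database (also apply the ObjectBox Gradle plugin)
    implementation("io.objectbox:objectbox-kotlin:4.0.0")
}
```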
A look at AI models
The trend to "bigger is better" has been broken - the rise of SLM and Small AI models
Large foundation models (LLMs) remain costly and centralized. In contrast, Small Language Models (SLMs) bring similar capabilities in a lightweight, resource-efficient way.
- Up to 100x cheaper to run
- Faster, with lower energy consumption
- Near-Large-Model quality in some cases
This makes them ideal for local AI scenarios: assistants, semantic search, or multimodal apps running directly on-device. However...
Frontier AI Models are still getting bigger and costs are skyrocketing
Why this matters for developers: Monetary and hidden costs of using Cloud AI
Running cloud AI comes at a cost:
- Monetary Costs: Cloud cost conundrum (Andreessen Horowitz 2021) is fueled by cloud AI; margins shrink as data center and AI bills grow (Gartner 2025)
- Dependency: Few tech giants hold all major AI models, the data, and the know-how, and they make the rules (e.g. thin AI layers on top of huge cloud AI models will fade away due to vertical integration)
- Data privacy & compliance: Sending data around adds risk, and so does sharing it (what exactly are you agreeing to?)
- Sustainability: Large models consume far more energy, and transmitting data unnecessarily consumes far more energy too (think of it as buying apples from New Zealand while living in Germany instead of buying local produce) (Sustainable Future with Edge Computing).
What about Open Source AI Models?
Yes, they are an option, but be mindful of potential risks and caveats. Keep in mind that when you pay for a commercial model, part of what you pay for is being shielded from liability risks.
While SLMs are all the rage, Edge AI is really about specialised AI models (at least at this moment...)
On-device Vector Databases are the second essential piece of the Edge AI Tech Stack
- Vector databases are basically the databases for AI applications. AI models work with vectors (vector embeddings) and vector databases make working with vector embeddings easy and efficient.
- Vector databases offer powerful vector search and querying capabilities, provide additional context and filtering mechanisms and give AI applications a long-term memory.
- Most AI applications need a vector database, e.g. for Retrieval Augmented Generation (RAG) or agentic AI, but vector databases are also used to make AI apps more efficient, e.g. by reducing LLM calls and providing faster responses.
On-device (or Edge) vector databases have a small footprint (a couple of MB, not hundreds of MB) and are optimized for efficiency on resource-restricted devices.
Edge Vector databases, or on-device vector databases, are still rare. Some server- and cloud-oriented vector databases have recently begun positioning themselves for edge use. However, their relatively large footprint (several hundred MB) makes them unsuitable for truly resource-constrained embedded devices, while they can of course work on laptops and local PCs. More importantly, solutions derived from scaling down from larger systems are generally not optimized for restricted environments, resulting in higher computational demands and increased battery consumption, which in the end also leads to increased costs.
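To give a feel for what this looks like in practice, here is a minimal Kotlin sketch of on-device vector search with ObjectBox (the database used in the example app below). The entity, property names, and the 384 dimensions are illustrative assumptions; the HNSW index annotation and nearest-neighbor query follow ObjectBox's documented vector search API, but check the current docs for exact signatures.

```kotlin
import io.objectbox.BoxStore
import io.objectbox.annotation.Entity
import io.objectbox.annotation.HnswIndex
import io.objectbox.annotation.Id

// Illustrative entity: a screenshot with its OCR text and a vector embedding.
// The 384 dimensions assume a small sentence-embedding model.
@Entity
data class Screenshot(
    @Id var id: Long = 0,
    var ocrText: String = "",
    @HnswIndex(dimensions = 384)
    var embedding: FloatArray = floatArrayOf()
)

// Find the 10 screenshots most similar to a query embedding.
// Screenshot_ is the query-property class generated by the ObjectBox plugin.
fun findSimilar(store: BoxStore, queryEmbedding: FloatArray): List<Screenshot> {
    val box = store.boxFor(Screenshot::class.java)
    val query = box.query(
        Screenshot_.embedding.nearestNeighbors(queryEmbedding, 10)
    ).build()
    // findWithScores() also returns distance scores; here we keep only the entities.
    return query.findWithScores().map { it.get() }
}
```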
Developer Story: On-device AI Screenshot Searcher Example App
To test the waters, I built a Screenshot Searcher app with ObjectBox Vector Database:
- OCR text extraction with ML Kit
- Semantic search with MediaPipe and ObjectBox
- Image similarity search with TensorFlow Lite and ObjectBox
- Image categorization with ML Kit Image Labeling
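To illustrate the first step, extracting text from a screenshot with ML Kit's on-device text recognizer can look roughly like this minimal sketch (bitmap loading and proper error handling are left out; the recognized text would then be embedded and stored in the vector database):

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

// Run on-device OCR on a screenshot bitmap and pass the extracted text to a callback.
fun extractText(screenshot: Bitmap, onResult: (String) -> Unit) {
    val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
    val image = InputImage.fromBitmap(screenshot, 0) // 0 = rotation in degrees
    recognizer.process(image)
        .addOnSuccessListener { visionText -> onResult(visionText.text) }
        .addOnFailureListener { onResult("") } // in a real app: log and retry
}
```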
This was easy and took less than a day. However, I learned more from the things I tried that weren't easy... ;)
What I learned about text classification (and what will hopefully help you)
--> See finetuning below... without finetuning, no model, and without a model, no text classification.
What I learned about finetuning (and what will hopefully help you)
--> Finetuning failed --> I will try again ;)
What I learned about integrating an SLM (Google's Gemma)
Integrating Gemma was super straightforward; it worked on-device in less than an hour (just don't try to run Gemma on the Android emulator (AVD) - it's not recommended, and it also did not work for me).
In this example app, we are using Gemma to enhance the screenshot search with an additional AI layer (see the sketch after this list):
- Generate intelligent summaries from OCR text
- Create semantic categories and keywords
- Enhance search queries with synonyms and related terms
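For reference, here is a minimal sketch of what the Gemma integration looks like with MediaPipe's LLM Inference API, assuming the Gemma model file has already been downloaded or pushed to the device. The model path, token limit, and prompt are illustrative assumptions, and option names may differ slightly between MediaPipe versions.

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Summarize OCR text from a screenshot with an on-device Gemma model.
// In a real app, create LlmInference once (model loading is expensive),
// reuse it, and call it off the main thread; this is a blocking call.
fun summarizeScreenshot(context: Context, ocrText: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma.bin") // assumed location of the model file
        .setMaxTokens(512)
        .build()
    val llm = LlmInference.createFromOptions(context, options)

    val prompt = "Summarize the following screenshot text in one sentence " +
        "and suggest three keywords:\n$ocrText"
    return llm.generateResponse(prompt)
}
```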
Overall assessment of the practical, hands-on state of On-device AI on Android
It's already fairly easy - and vibe coding an Edge AI app is very doable. While of course I would recommend the latter only for prototyping and testing, it is amazing what you can already do on-device with AI, even without being a developer!
Key Questions to Ask Yourself
- How much edge vs. cloud do you need?
- Which tasks benefit from local inference?
- What data must remain private?
- How can you make your app cost-efficient long term?
How to Get Started
- Learn about Local AI
- Explore Vector Databases
- Prototype with the On-device AI Screenshot Searcher Example
- Consider Data Sync for hybrid apps
- Read more on Empowering Edge AI with Databases
Conclusion
We’re at an inflection point: AI is moving from centralized, cloud-based services to decentralized, personal on-device AI. With SLMs, vector databases, and data sync, developers can now build AI apps that are:
- Private
- Offline-first
- Cost-efficient
- Sustainable
The future of AI is not just big — it's also small, local, and synced.