Running Python 3.11 on Android Without a Server

One of the core promises of Forge OS is that the agent can write and run Python: not in a cloud sandbox, not via a remote API, but right there on the device. This post explains how we made that work, what the real constraints are, and where the edges of the system are.

Why Chaquopy

There are a few ways to run Python on Android. You can ship a full CPython interpreter as a native library, use a transpiler like Brython (Python-to-JS, not useful here), or use Chaquopy, a Gradle plugin that bundles CPython and a JNI bridge so you can call Python from Kotlin and vice versa.

We chose Chaquopy for a few reasons. It supports Python 3.11, which is current enough to matter. It handles the JNI plumbing so we don't have to. It integrates cleanly with the Android build system. And it has a pip block in the Gradle config that lets us declare Python dependencies at build time, which means they're bundled in the APK rather than downloaded at runtime.

What we ship

The packages bundled in the APK:

numpy==1.26.2
pillow==11.0.0
requests
beautifulsoup4
pandas
lxml
python-dateutil
pyyaml
openpyxl
xlrd / xlwt
psutil

This covers the vast majority of what an agent needs for data processing, web scraping, file manipulation, and system inspection. We deliberately kept the list conservative: every package adds APK size, and because we target ARM64 only, any package with native code is included only if it builds for arm64-v8a.
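As a quick sanity check, a script can confirm that the bundled packages are visible before relying on them. The sketch below is illustrative, not part of Forge OS: the mapping and the `check_available` helper are names invented here. The one detail worth knowing is that several pip names differ from the module names you actually import.

```python
from importlib.util import find_spec

# pip name -> top-level import name (these differ for several packages).
BUNDLED_IMPORT_NAMES = {
    "numpy": "numpy",
    "pillow": "PIL",
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "pandas": "pandas",
    "lxml": "lxml",
    "python-dateutil": "dateutil",
    "pyyaml": "yaml",
    "openpyxl": "openpyxl",
    "xlrd": "xlrd",
    "xlwt": "xlwt",
    "psutil": "psutil",
}


def check_available(import_names):
    """Map each pip name to True if its top-level module can be located.

    find_spec only locates the module; it does not import it.
    """
    return {
        pip_name: find_spec(module) is not None
        for pip_name, module in import_names.items()
    }


if __name__ == "__main__":
    for pip_name, ok in check_available(BUNDLED_IMPORT_NAMES).items():
        print(f"{pip_name:16} {'ok' if ok else 'missing'}")
```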

The ARM64-only decision

Forge OS targets arm64-v8a only. We dropped armeabi-v7a (32-bit) and x86_64 (emulators). This was a deliberate tradeoff: every modern Android phone sold in the last several years is 64-bit ARM. Supporting 32-bit would roughly double the native library size in the APK for a tiny fraction of real-world devices. Supporting x86_64 would help emulator testing but adds complexity we didn't need at this stage.

If you're running Forge OS on an emulator, you'll need an ARM64 system image. x86_64 emulators won't work.

AST-based import filtering

Giving an LLM the ability to run arbitrary Python on your phone is a significant trust decision. We needed a way to prevent the agent from importing things it shouldn't: subprocess, os.system, socket-level networking outside the approved channels, and so on.

We built an AST-based import filter. Before any script runs, we parse it with Python's ast module and walk the tree looking for import statements. If the script tries to import a blocked module, execution is refused and the agent gets an error explaining why. The filter runs in Python itself, using the standard library's parser, which means it's fast and doesn't require a separate parser or external tool.
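A minimal sketch of this kind of filter, using only the standard library, might look like the following. The blocklist contents and the function names are assumptions made for illustration; the real list comes from the security policy.

```python
import ast

# Hypothetical blocklist for illustration only.
BLOCKED = frozenset({"subprocess", "socket", "ctypes", "os.system"})


def _imported_names(node):
    """Return the dotted names a single AST node imports, if any."""
    if isinstance(node, ast.Import):
        return [alias.name for alias in node.names]
    if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
        # "from os import system" is checked as "os.system".
        return [f"{node.module}.{alias.name}" for alias in node.names]
    return []


def find_blocked_imports(source, blocked=BLOCKED):
    """Return the sorted blocked names that `source` imports statically."""
    tree = ast.parse(source)  # raises SyntaxError on invalid code
    hits = set()
    for node in ast.walk(tree):
        for name in _imported_names(node):
            parts = name.split(".")
            # Check every dotted prefix, so "socket.x" matches "socket".
            for i in range(1, len(parts) + 1):
                prefix = ".".join(parts[:i])
                if prefix in blocked:
                    hits.add(prefix)
    return sorted(hits)


if __name__ == "__main__":
    print(find_blocked_imports("import json\nfrom os import system"))
```

A static check like this only sees import statements. Something like `import os` followed by a call to `os.system(...)`, or a dynamic `__import__` call, needs additional checks on attribute access and call nodes.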

The blocked list is configurable via the security policy layer. Users can tighten or loosen it from Settings → Advanced. The agent can't modify the policy; that's a human-only control.

Timeouts and output capture

Scripts run with a configurable timeout (default: 30 seconds). If a script exceeds the timeout, it's killed and the agent receives a timeout error. This prevents runaway loops from blocking the agent indefinitely.
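To make the timeout behavior concrete, here is a sketch of the general pattern on desktop CPython, where the script runs in a child interpreter that can be killed cleanly. This is an assumption for illustration: the post does not describe the mechanism Forge OS uses on Android, and `run_with_timeout` is a made-up name.

```python
import subprocess
import sys


def run_with_timeout(source, timeout_s=30.0):
    """Run `source` in a child interpreter and kill it after timeout_s seconds."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,  # keep the child's output off our own streams
            text=True,
            timeout=timeout_s,  # on expiry the child is killed, then this raises
        )
    except subprocess.TimeoutExpired:
        return {"timed_out": True, "returncode": None}
    return {"timed_out": False, "returncode": done.returncode}


if __name__ == "__main__":
    print(run_with_timeout("while True: pass", timeout_s=1.0))
    print(run_with_timeout("print('done')"))
```

Running the script in a separate process is what makes the kill reliable: a thread stuck in a tight loop cannot be forcibly stopped from within the same interpreter.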

stdout and stderr are captured and returned to the agent as part of the tool result. The agent sees exactly what a human would see if they ran the script in a terminal. This matters for debugging: the agent can read error messages and adjust its approach.
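One way to capture output when running a script in-process is to redirect both streams for the duration of the run. The sketch below assumes that approach; the function name and the shape of the result are hypothetical, not the actual Forge OS tool result format.

```python
import contextlib
import io
import traceback


def run_and_capture(source):
    """Execute `source` in-process and return its stdout and stderr as strings."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        try:
            code = compile(source, "<agent-script>", "exec")
            exec(code, {"__name__": "__main__"})
        except Exception:
            # Print the traceback to (redirected) stderr, as a terminal would.
            traceback.print_exc()
    return {"stdout": out.getvalue(), "stderr": err.getvalue()}


if __name__ == "__main__":
    result = run_and_capture("print('rows: 3')\nvalue = 1 / 0")
    print(result["stdout"], end="")
    print(result["stderr"], end="")
```

Because the traceback lands in the captured stderr, the agent gets the exception type, message, and line number it needs to correct the script.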

The "save as skill" feature

One of the more useful things we added: if a script works well, the agent can save it as a reusable skill. Skills are named Python functions stored in the workspace that the agent can call by name on future turns without rewriting the code. This is how the agent builds up a library of useful utilities over time.
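As an illustration of the idea, a skill store can be as simple as one Python file per skill in a workspace directory, loaded by name on demand. Everything in this sketch is an assumption: the file layout, the `save_skill` and `call_skill` helpers, and the rule that each file defines a function with the same name as the skill.

```python
import importlib.util
import tempfile
from pathlib import Path


def save_skill(skills_dir, name, source):
    """Store `source` as <skills_dir>/<name>.py; it must define a function `name`."""
    if not name.isidentifier():
        raise ValueError(f"invalid skill name: {name!r}")
    path = Path(skills_dir) / f"{name}.py"
    path.write_text(source, encoding="utf-8")
    return path


def call_skill(skills_dir, name, *args, **kwargs):
    """Load the saved skill module and call its function of the same name."""
    path = Path(skills_dir) / f"{name}.py"
    spec = importlib.util.spec_from_file_location(f"skill_{name}", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, name)(*args, **kwargs)


if __name__ == "__main__":
    slugify_source = (
        "def slugify(text):\n"
        "    return '-'.join(text.lower().split())\n"
    )
    with tempfile.TemporaryDirectory() as workspace:
        save_skill(workspace, "slugify", slugify_source)
        print(call_skill(workspace, "slugify", "Hello Forge OS"))
```

In a real system, a saved skill would presumably go through the same import filter and timeout as any other script before it runs.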

Shell execution

In addition to Python, the agent can run shell commands via a separate executor. Shell execution is more restricted than Python: the command runs in a sandboxed environment with a limited PATH and no access to system directories outside the workspace. It's useful for things like grep, find, file manipulation, and calling command-line tools that are available on the device.
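The restrictions described above can be approximated with a command allowlist, a workspace path check, and a minimal environment. This is a rough sketch under stated assumptions, not the Forge OS executor: the allowlist, the helper names, and the default PATH value are all hypothetical, and a textual check like this is not a real sandbox on its own.

```python
import shlex
import subprocess
from pathlib import Path

# Hypothetical allowlist; the post only names grep and find as examples.
ALLOWED_COMMANDS = {"grep", "find", "ls", "cat"}


def prepare_command(command_line, workspace):
    """Split a command line; reject commands or paths outside the policy."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {command_line!r}")
    root = Path(workspace).resolve()
    for arg in argv[1:]:
        if arg.startswith("-"):
            continue  # treated as an option, not a path (a crude heuristic)
        target = (root / arg).resolve()
        if target != root and root not in target.parents:
            raise PermissionError(f"path escapes workspace: {arg!r}")
    return argv, root


def run_restricted(command_line, workspace, path="/system/bin", timeout_s=30.0):
    """Run an allowed command inside the workspace with a minimal PATH."""
    argv, root = prepare_command(command_line, workspace)
    return subprocess.run(
        argv,
        cwd=root,
        env={"PATH": path},  # assumed value; nothing else is inherited
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

Passing an argument list rather than a shell string means no shell ever interprets the command, so pipes, redirects, and command substitution are simply not available to it.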

Where the edges are

A few things that don't work and why:

What we expect to work well

Data processing with pandas and numpy should be fast and reliable on ARM64. Web scraping with requests and BeautifulSoup works exactly as you'd expect; it's standard Python, nothing Android-specific. File manipulation, YAML/JSON parsing, and Excel reading and writing are all solid. psutil gives the agent visibility into device resource usage. The combination covers a wide range of real automation tasks.

The Python runtime is one of the things we're most confident about. It's not a toy sandbox; it's a real Python 3.11 interpreter running on your phone, with a real package ecosystem, real output capture, and real security controls. That's what makes Forge OS an agent rather than a chatbot.

The Forge OS Team