NexaSDK is a unified AI inference toolkit powered by the custom-built NexaML engine, designed for peak performance across NPUs, GPUs, and CPUs. Unlike wrappers dependent on existing runtimes, NexaML is built from scratch at the kernel level, enabling Day-0 support for new architectures including LLMs, multimodal, audio, and vision models across GGUF, MLX, and .nexa formats.
The platform features NPU-first architecture with support for Qualcomm, Apple Neural Engine, Intel, and AMD NPUs, Android SDK for mobile deployment, full multimodality support for image, audio, and text, cross-platform compatibility for desktop, mobile, automotive, and IoT, OpenAI-compatible API with function calling, and low-level control with one-line execution for both Python and C++ implementations.
Use Cases:
π€ Supported chipmakers
NexaSDK is an easy-to-use developer toolkit for running any AI model locally β across NPUs, GPUs, and CPUs β powered by our NexaML engine, built entirely from scratch for peak performance on every hardware stack. Unlike wrappers that depend on existing runtimes, NexaML is a unified inference engine built at the kernel level. Itβs what lets NexaSDK achieve Day-0 support for new model architectures (LLMs, multimodal, audio, vision). NexaML supports 3 model formats: GGUF, MLX, and Nexa AI's own .nexa format.
| Features | NexaSDK | Ollama | llama.cpp | LM Studio |
|---|---|---|---|---|
| NPU support | β NPU-first | β | β | β |
| Android SDK support | β NPU/GPU/CPU support | β οΈ | β οΈ | β |
| Support any model in GGUF, MLX, NEXA format | β Low-level Control | β | β οΈ | β |
| Full multimodality support | β Image, Audio, Text | β οΈ | β οΈ | β οΈ |
| Cross-platform support | β Desktop, Mobile, Automotive, IoT | β οΈ | β οΈ | β οΈ |
| One line of code to run | β | β | β οΈ | β |
| OpenAI-compatible API + Function calling | β | β | β | β |
Legend:
β
Supported |
β οΈ Partial or limited support |
β No
Multi-agent financial platform with AI-powered stock selection, research, and automated crypto trading on Binance, OKX, and Hyperliquid with local-first data storage.
GUI tool to remove Windows 11 advertisements introduced in April 2024 update by modifying Registry keys across File Explorer, Start Menu, and Settings.
Invisible desktop AI assistant that captures screen and audio for real-time meeting notes, context-aware answers, and proactive summaries with OpenAI, Gemini, or local models.
Rust-based context engineering tool that converts codebases into structured LLM prompts with CLI, Python SDK, and MCP server for AI agents and automation.
Self-hosted desktop streaming solution with built-in virtual displays, native client resolution matching, HDR support, and permission-based client management for Moonlight.
Privacy-first desktop investment tracker with local-only data storage, comprehensive portfolio analytics, multi-currency support, and extensible addon system built with Rust and Tauri.