βœ“
Nexa Sdk Download

Unified AI inference SDK with custom NexaML engine providing Day-0 model support across NPU, GPU, and CPU with GGUF, MLX, and .nexa format compatibility.

⭐ 6,159 stars on GitHub
Latest Release: v0.2.64

About Software

NexaSDK is a unified AI inference toolkit powered by the custom-built NexaML engine, designed for peak performance across NPUs, GPUs, and CPUs. Unlike wrappers dependent on existing runtimes, NexaML is built from scratch at the kernel level, enabling Day-0 support for new architectures including LLMs, multimodal, audio, and vision models across GGUF, MLX, and .nexa formats.

The platform features NPU-first architecture with support for Qualcomm, Apple Neural Engine, Intel, and AMD NPUs, Android SDK for mobile deployment, full multimodality support for image, audio, and text, cross-platform compatibility for desktop, mobile, automotive, and IoT, OpenAI-compatible API with function calling, and low-level control with one-line execution for both Python and C++ implementations.

Use Cases:

  • Run LLMs and VLMs locally across NPU, GPU, and CPU with one command
  • Execute models in GGUF, MLX, and .nexa formats on desktop and mobile
  • Deploy AI on Qualcomm NPU, Apple Neural Engine, Intel NPU, and AMD NPU
  • Access Day-0 support for latest models like Qwen3-VL, Granite4, and Gemma3n
  • Build AI applications with Android SDK and cross-platform Python SDK

Downloads

v0.2.64 December 09, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.63 December 03, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.62 December 02, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.61 December 01, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.60 November 22, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.60-ane November 18, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.59 November 15, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.58 November 14, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
v0.2.57 November 07, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.56 November 03, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.55 November 02, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.54 October 30, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.53 October 23, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.52 October 21, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.50 October 20, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.49 October 14, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.48 October 14, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.47 October 10, 2025
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.46 October 09, 2025
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.45 October 09, 2025
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.43 October 07, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.42 October 07, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.41 October 07, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.40 October 06, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.39 October 04, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.38 October 02, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.37 September 28, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe
v0.2.36 September 28, 2025
nexa-cli_macos_arm64.pkgpkg
nexa-cli_macos_x86_64.pkgpkg
nexa-cli_windows_arm64.exeexe
nexa-cli_windows_x86_64.exeexe
nexa-cli_windows_x86_64_cuda.exeexe

Package Info

Last Updated
Dec 09, 2025
Latest Version
v0.2.64
License
Apache-2.0
Total Versions
28

README

    🀝 Supported chipmakers 
      
        
        
        
      
    
  


    


    



    


    

NexaSDK - Run any AI model on any backend

NexaSDK is an easy-to-use developer toolkit for running any AI model locally β€” across NPUs, GPUs, and CPUs β€” powered by our NexaML engine, built entirely from scratch for peak performance on every hardware stack. Unlike wrappers that depend on existing runtimes, NexaML is a unified inference engine built at the kernel level. It’s what lets NexaSDK achieve Day-0 support for new model architectures (LLMs, multimodal, audio, vision). NexaML supports 3 model formats: GGUF, MLX, and Nexa AI's own .nexa format.

βš™οΈ Differentiation

Features NexaSDK Ollama llama.cpp LM Studio
NPU support βœ… NPU-first ❌ ❌ ❌
Android SDK support βœ… NPU/GPU/CPU support ⚠️ ⚠️ ❌
Support any model in GGUF, MLX, NEXA format βœ… Low-level Control ❌ ⚠️ ❌
Full multimodality support βœ… Image, Audio, Text ⚠️ ⚠️ ⚠️
Cross-platform support βœ… Desktop, Mobile, Automotive, IoT ⚠️ ⚠️ ⚠️
One line of code to run βœ… βœ… ⚠️ βœ…
OpenAI-compatible API + Function calling βœ… βœ… βœ… βœ…
  Legend:
  βœ… Supported   |  
  ⚠️ Partial or limited support    |  
  ❌ No

Recent Wins

  • πŸ“£ Release Nexa AI’s AutoNeural-VL-1.5B, an NPU-native vision–language model built for real-time in-car assistants, delivering 14Γ— lower latency, 3Γ— faster decode, and 4Γ— longer context on Qualcomm SA8295P β€” now also runnable on Qualcomm X Elite laptops.
  • πŸ“£ Support Mistral AI's Ministral-3-3B across Qualcomm Hexagon NPU, Apple Neural Engine, GPU and CPU.
  • πŸ“£ Release Linux SDK for NPU/GPU/CPU. See Linux SDK Doc (https://docs.nexa.ai/nexa-sdk-docker/overview).
  • πŸ“£ Support Apple Neural Engine for Granite-4.0 (https://huggingface.co/NexaAI/Granite-4-Micro-ANE), Qwen3 (https://huggingface.co/NexaAI/Qwen3-0.6B-ANE), Gemma3 (https://huggingface.co/NexaAI/Gemma3-1B-ANE), and Parakeetv3 (https://huggingface.co/NexaAI/parakeet-tdt-0.6b-v3-ane). Download NexaSDK for ANE here (https://nexa-model-hub-bucket.s3.us-west-1.amazonaws.com/public/nexa_sdk/downloads/nexa-cli_macos_arm64_ane.pkg).
  • πŸ“£ Support Android SDK for NPU/GPU/CPU. See Android SDK Doc (https://docs.nexa.ai/nexa-sdk-android/overview) and Android SDK Demo App.
  • πŸ“£ Support SDXL-turbo image generation on AMD NPU. See AMD blog : Advancing AI with Nexa AI (https://www.amd.com/en/developer/resources/technical-articles/2025/advancing-ai-with-nexa-ai--image-generation-on-amd-npu-with-sdxl.html).
  • Support Android Python SDK for NPU/GPU/CPU. See Android Python SDK Doc (https://docs.nexa.ai/nexa-sdk-android/python) and Android Python SDK Demo App.
  • πŸ“£ Day-0 Support for Qwen3-VL-4B and 8B in GGUF, MLX, .nexa format for NPU/GPU/CPU. We are the only framework that supports the GGUF format. Featured in Qwen's post about our partnership (https://x.com/Alibaba_Qwen/status/1978154384098754943).
  • πŸ“£ Day-0 Support for IBM Granite 4.0 on NPU/GPU/CPU. NexaML engine were featured right next to vLLM, llama.cpp, and MLX in IBM's blog (https://x.com/IBM/status/1978154384098754943).
  • πŸ“£ Day-0 Support for Google EmbeddingGemma on NPU. We are featured in Google's social post (https://x.com/googleaidevs/status/1969188152049889511).
  • πŸ“£ Supported vision capability for Gemma3n: First-ever Gemma-3n (https://sdk.nexa.ai/model/Gemma3n-E4B) multimodal inference for GPU & CPU, in GGUF format.
  • πŸ“£ Intel NPU Support DeepSeek-r1-distill-Qwen-1.5B (https://sdk.nexa.ai/model/DeepSeek-R1-Distill-Qwen-1.5B-Intel-NPU) and Llama3.2-3B (https://sdk.nexa.ai/model/Llama3.2-3B-Intel-NPU)
  • πŸ“£ Apple Neural Engine Support for real-time speech recognition with Parakeet v3 model (https://sdk.nexa.ai/model/parakeet-v3-ane)
See full README on repository.