Krillin AI is a privacy-first video translation and dubbing tool powered by large language models that supports bidirectional translation across 100 languages. It offers one-click deployment with a full processing pipeline optimized for platforms like YouTube, TikTok, Bilibili, and Douyin, handling both landscape and portrait video formats.
The platform integrates high-accuracy speech recognition via Whisper and Parakeet, intelligent subtitle segmentation using LLM context, professional translation maintaining natural semantics, and voice cloning through CosyVoice. It supports automatic video composition, cross-platform deployment on Windows, Linux, and macOS, and offers both desktop and server versions for maximum flexibility.
Quick Start
KrillinAI is a versatile audio and video localization and enhancement solution developed by Krillin AI. This minimalist yet powerful tool integrates video translation, dubbing, and voice cloning, and supports both landscape and portrait formats so that output displays correctly on all major platforms (Bilibili, Xiaohongshu, Douyin, WeChat Video, Kuaishou, YouTube, TikTok, etc.). With its end-to-end workflow, you can turn raw material into polished, ready-to-use cross-platform content in just a few clicks.
🎯 One-click Start: No complex environment configuration required, automatic dependency installation, ready to use immediately, with a new desktop version for easier access!
📥 Video Acquisition: Supports yt-dlp downloads or local file uploads
📜 Accurate Recognition: High-accuracy speech recognition based on Whisper
🧠 Intelligent Segmentation: Subtitle segmentation and alignment using LLM
🔄 Terminology Replacement: One-click replacement of professional vocabulary
🌍 Professional Translation: LLM translation with context to maintain natural semantics
🎙️ Voice Cloning: Offers selected voice tones from CosyVoice or custom voice cloning
🎬 Video Composition: Automatically processes landscape and portrait videos and subtitle layout
💻 Cross-Platform: Supports Windows, Linux, macOS, providing both desktop and server versions
The image below shows the subtitle file generated from a 46-minute local video that was imported and processed with one click, without any manual adjustments. There are no omissions or overlaps, the segmentation is natural, and the translation quality is very high.
[Image: Alignment Effect]
https://github.com/user-attachments/assets/bba1ac0a-fe6b-4947-b58d-ba99306d0339
https://github.com/user-attachments/assets/0b32fad3-c3ad-4b6a-abf0-0865f0dd2385
https://github.com/user-attachments/assets/c2c7b528-0ef8-4ba9-b8ac-f9f92f6d4e71
All local models in the table below support automatic installation of both the executable and the model files; you only need to pick one, and Klic will prepare everything for you.
| Service Source | Supported Platforms | Model Options | Local/Cloud | Remarks |
|---|---|---|---|---|
| OpenAI Whisper | All Platforms | - | Cloud | Fast speed and good effect |
| FasterWhisper | Windows/Linux | tiny/medium/large-v2 (recommended medium+) | Local | Faster speed, no cloud service cost |
| WhisperKit | macOS (M-series only) | large-v2 | Local | Native optimization for Apple chips |
| WhisperCpp | All Platforms | large-v2 | Local | Supports all platforms |
| Alibaba Cloud ASR | All Platforms | - | Cloud | Avoids network issues in mainland China |
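As a rough illustration of how the table can be read, the sketch below encodes its rows as data and filters them by operating system and by whether a local (non-cloud) service is required. It is not part of KrillinAI and does not reflect its configuration format: the names `AsrOption`, `ASR_OPTIONS`, and `candidates` are hypothetical, and the M-series requirement for WhisperKit is only noted in a comment, not checked.

```python
from dataclasses import dataclass

# Hypothetical helper types for illustration only; not KrillinAI's API.
ALL_PLATFORMS = frozenset({"windows", "linux", "macos"})


@dataclass(frozen=True)
class AsrOption:
    name: str
    platforms: frozenset  # operating systems listed in the table
    models: tuple         # model options listed in the table ("-" becomes empty)
    local: bool           # True for "Local", False for "Cloud"


# Rows transcribed from the table above.
ASR_OPTIONS = (
    AsrOption("OpenAI Whisper", ALL_PLATFORMS, (), False),
    AsrOption("FasterWhisper", frozenset({"windows", "linux"}),
              ("tiny", "medium", "large-v2"), True),
    # The table restricts WhisperKit to M-series Macs; chip type is not modeled here.
    AsrOption("WhisperKit", frozenset({"macos"}), ("large-v2",), True),
    AsrOption("WhisperCpp", ALL_PLATFORMS, ("large-v2",), True),
    AsrOption("Alibaba Cloud ASR", ALL_PLATFORMS, (), False),
)


def candidates(platform: str, local_only: bool = False) -> list[str]:
    """Return the service names from the table that fit the given platform."""
    platform = platform.lower()
    if platform not in ALL_PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    return [
        option.name
        for option in ASR_OPTIONS
        if platform in option.platforms and (option.local or not local_only)
    ]


if __name__ == "__main__":
    print(candidates("linux", local_only=True))
    print(candidates("macos"))
```

Keeping the rows as plain data makes it easy to extend the lookup if further services are added to the table.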