KanTV ("Kan" is the Chinese Pinyin for the Hanzi "看", meaning "watch/listen" in English) is an open source project focused on studying and practising state-of-the-art AI technology in real scenarios on Android phones/devices, such as online-TV playback, online-TV transcription (real-time subtitles), online-TV language translation, and online-TV video and audio recording, all working at the same time. It is derived from original , with many enhancements and new features:
- Watch online TV and local media with my customized FFmpeg 6.1; its source code can be found in external/ffmpeg, in accordance with FFmpeg's license
- Record online TV to automatically generate videos (useful for short-video creators who need source material, but please respect the IPR of the original content creator/provider); record the video/audio content of online TV to gather video/audio data that may be required or useful for AI R&D activity
- AI subtitles: real-time English subtitles for English online TV (aka OTT TV), powered by the great & excellent & amazing whisper.cpp. PoC finished on Xiaomi 14; a Xiaomi 14 or another powerful Android phone is HIGHLY recommended for the real-time subtitle feature, otherwise unexpected behavior may occur
- 2D graphic performance
- Set up a customized playlist, then use this software to watch the content of that playlist for R&D activity
- UI refactor (closer to a real commercial Android application; only English is currently supported as the UI language)
- Well-maintained "workbench" for ASR (Automatic Speech Recognition) researchers/developers who are interested in practising state-of-the-art AI tech (such as whisper.cpp) in real scenarios on Android phones/devices
- Well-maintained "workbench" for LLM (Large Language Model) researchers/developers who are interested in practising state-of-the-art AI tech (such as llama.cpp) in real scenarios on Android phones/devices, or in running/experiencing LLM models (such as llama-2-7b, baichuan2-7b, qwen1_5-1_8b, gemma-2b) on Android phones/devices using the magic llama.cpp
- Well-maintained "workbench" for GGML beginners to study and practise the GGML inference framework on Android phones/devices
- Well-maintained "workbench" for NCNN beginners to study and practise the NCNN inference framework on Android phones/devices
- Android turn-key project for AI researchers (who might not be familiar with regular Android software development), developers, and beginners focused on edge/device-side AI learning and R&D activity. Some AI R&D activities (AI algorithm validation, AI model validation, performance benchmarks in ASR, LLM, TTS, NLP, CV, and other fields) can be done very easily with the Android Studio IDE plus a powerful Android phone
(depends on zhouwg#121 and https://github.com/zhouwg/kantv/issues/176 )
git clone https://github.com/zhouwg/kantv.git
cd kantv
git checkout master
cd kantv
- Build docker image
docker build build -t kantv --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) --build-arg USER_NAME=$(whoami)
- Run docker container
# map source code directory into docker container
docker run -it --name=kantv --volume=`pwd`:/home/`whoami`/kantv kantv
# in docker container
. build/envsetup.sh
./build/prebuild-download.sh
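Running `docker run --name=kantv` a second time fails while a container with that name still exists. The following is a minimal sketch of one way to handle that, assuming the image and the container are both called `kantv` as in the commands above; the function name `kantv_container` is hypothetical and not part of the project.

```shell
# Sketch only: start the build container, reusing it when it already exists.
# Assumption: image name and container name are identical (default "kantv").
kantv_container() {
    name="${1:-kantv}"
    if docker ps -a --format '{{.Names}}' | grep -qx "$name"; then
        # A container from an earlier run exists: restart it and attach.
        docker start -ai "$name"
    else
        # First run: create the container with the source tree mounted.
        docker run -it --name="$name" \
            --volume="$(pwd):/home/$(whoami)/kantv" "$name"
    fi
}

# Usage (from the top of the source tree):
#   kantv_container
```

Reusing the container keeps anything already downloaded inside it, at the cost of also keeping stale state; removing the container with `docker rm kantv` gives a clean start.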
- Prerequisites
- tools & utilities
Host OS information:
uname -a
Linux 5.8.0-43-generic #49~20.04.1-Ubuntu SMP Fri Feb 5 09:57:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/issue
Ubuntu 20.04.2 LTS \n \l
sudo apt-get update
sudo apt-get install build-essential -y
sudo apt-get install cmake -y
sudo apt-get install curl -y
sudo apt-get install wget -y
sudo apt-get install python -y
sudo apt-get install tcl expect -y
sudo apt-get install nginx -y
sudo apt-get install git -y
sudo apt-get install vim -y
sudo apt-get install spawn-fcgi -y
sudo apt-get install u-boot-tools -y
sudo apt-get install ffmpeg -y
sudo apt-get install openssh-client -y
sudo apt-get install nasm -y
sudo apt-get install yasm -y
sudo apt-get install openjdk-17-jdk -y
sudo dpkg --add-architecture i386
sudo apt-get install lib32z1 -y
sudo apt-get install -y android-tools-adb android-tools-fastboot autoconf \
    automake bc bison build-essential ccache cscope curl device-tree-compiler \
    expect flex ftp-upload gdisk acpica-tools libattr1-dev libcap-dev \
    libfdt-dev libftdi-dev libglib2.0-dev libhidapi-dev libncurses5-dev \
    libpixman-1-dev libssl-dev libtool make \
    mtools netcat python-crypto python3-crypto python-pyelftools \
    python3-pycryptodome python3-pyelftools python3-serial \
    rsync unzip uuid-dev xdg-utils xterm xz-utils zlib1g-dev
sudo apt-get install python3-pip -y
sudo apt-get install indent -y
pip3 install meson ninja
echo "export PATH=/home/`whoami`/.local/bin:\$PATH" >> ~/.bashrc
or run the script below after fetching the project's source code:
./build/prebuild.sh
- Android Studio
download and install Android Studio manually
- vim settings
borrowed from http://ffmpeg.org/developer.html#Editor-configuration
set ai
set nu
set expandtab
set tabstop=4
set shiftwidth=4
set softtabstop=4
set noundofile
set nobackup
set fileformat=unix
set undodir=~/.undodir
set cindent
set cinoptions=(0
" Allow tabs in Makefiles.
autocmd FileType make,automake set noexpandtab shiftwidth=8 softtabstop=8
" Trailing whitespace and tabs are forbidden, so highlight them.
highlight ForbiddenWhitespace ctermbg=red guibg=red
match ForbiddenWhitespace /\s\+$\|\t/
" Do not highlight spaces at the end of line while typing on that line.
autocmd InsertEnter * match ForbiddenWhitespace /\t\|\s\+\%#\@<!$/
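Once the packages from the tools & utilities step are installed, a quick check can confirm that the expected commands are actually on `PATH` before a build is attempted. This is a small sketch rather than a project script; the helper name `missing_tools` is hypothetical, and the tool list in the usage comment is only a sample of the packages named above.

```shell
# Sketch only: print every command from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command resolves.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage: an empty result means every listed tool was found.
#   missing_tools cmake curl wget git nasm yasm adb
```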
- Download android-ndk-r26c to prebuilts/toolchain; skip this step if android-ndk-r26c already exists
. build/envsetup.sh
./build/prebuild-download.sh
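The "skip if it already exists" condition can be expressed as a guard around the download step. The sketch below assumes the NDK unpacks to `prebuilts/toolchain/android-ndk-r26c` under the project root, which is inferred from the step description; the function names are hypothetical.

```shell
# Sketch only: download the NDK when it is not already in place.
# Assumption: the NDK lives at <root>/prebuilts/toolchain/android-ndk-r26c.
ndk_present() {
    root="${1:-.}"
    [ -d "$root/prebuilts/toolchain/android-ndk-r26c" ]
}

fetch_ndk_if_missing() {
    root="${1:-.}"
    if ndk_present "$root"; then
        echo "android-ndk-r26c already present, skipping download"
    else
        # Run the project's own scripts from the source root, in a subshell
        # so the caller's working directory and environment stay untouched.
        (cd "$root" && . build/envsetup.sh && ./build/prebuild-download.sh)
    fi
}

# Usage (from the top of the source tree):
#   fetch_ndk_if_missing .
```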
- Modify ggml/CMakeLists.txt and ncnn/CMakeLists.txt accordingly if the target Android device is a Xiaomi 14 or another Android phone based on the Qualcomm Snapdragon 8 Gen 3 SoC
- Modify ggml/CMakeLists.txt and ncnn/CMakeLists.txt accordingly if the target is a Qualcomm SoC based Android phone and the QNN backend should be enabled for the inference framework on that phone
- Remove the hardcoded debug flag in Android NDK (android-ndk issue)
# open $ANDROID_NDK/build/cmake/android.toolchain.cmake for ndk < r23
# or $ANDROID_NDK/build/cmake/android-legacy.toolchain.cmake for ndk >= r23
# delete "-g" line
list(APPEND ANDROID_COMPILER_FLAGS
  -g
  -DANDROID
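The manual edit described above can also be scripted. The following sketch assumes the flag sits on a line of its own, as in the excerpt, and deletes every such line while keeping a `.bak` copy next to the file; the function name `strip_ndk_debug_flag` is hypothetical, and it is worth diffing the result against the backup before building.

```shell
# Sketch only: delete standalone "-g" lines from an NDK toolchain file.
# Assumption: the flag appears alone on its line, as in the excerpt above.
strip_ndk_debug_flag() {
    file="$1"
    [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }
    # Match lines that contain only optional whitespace and "-g".
    # The -i.bak form works with both GNU and BSD sed and keeps a backup.
    sed -i.bak '/^[[:space:]]*-g[[:space:]]*$/d' "$file"
}

# Usage, for ndk >= r23:
#   strip_ndk_debug_flag "$ANDROID_NDK/build/cmake/android-legacy.toolchain.cmake"
```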
. build/envsetup.sh
- Option 1: Build APK from source code with the Android Studio IDE
- Option 2: Build APK from source code on the command line
. build/envsetup.sh
lunch 1
./build-all.sh android
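After a command-line build, the resulting APK still has to reach a device. Where the build places the APK is not stated here, so the sketch below simply searches a directory tree for `*.apk` files and installs each one with `adb install -r`; both function names are hypothetical helpers, not project scripts.

```shell
# Sketch only: locate built APK files and install them on an attached device.
# Assumption: the output directory is unknown, so the whole tree is searched.
find_apks() {
    find "${1:-.}" -type f -name '*.apk' | sort
}

install_apks() {
    find_apks "${1:-.}" | while IFS= read -r apk; do
        # -r replaces an existing installation and keeps its data.
        # stdin is redirected so adb cannot swallow the remaining file list.
        adb install -r "$apk" </dev/null
    done
}

# Usage (from the top of the source tree, with one device attached):
#   install_apks .
```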
This project focuses on learning and practising real AI tech on Android devices, so the Android APK does not collect or upload user data from the device. The APK should work well on any mainstream Android phone (issue reports from various Android phones are greatly welcomed), and the following four permissions are required:
- Access to storage is required to generate necessary temporary files
- Access to device information is required to obtain current phone network status information, distinguishing whether the current network is Wi-Fi or mobile when playing online TV
- Access to camera is needed for AI Agent
- Access to mic(audio recorder) is needed for AI Agent
Here is a short video demonstrating AI subtitles by running the great & excellent & amazing whisper.cpp on a Xiaomi 14 device - fully offline, on-device.
realtime-subtitle-by-whispercpp-demo-on-xiaomi14-finetune-20240324.mp4
Here is a screenshot demonstrating LLM inference by running the magic llama.cpp on a Xiaomi 14 device - fully offline, on-device.
Here is a screenshot demonstrating ASR inference by running the excellent whisper.cpp on a Xiaomi 14 device - fully offline, on-device.
Here are some screenshots demonstrating CV inference by running the excellent ncnn on a Xiaomi 14 device - fully offline, on-device.
- Android multimodal AI agent (ASR, LLM, TTS, CV, NLP, ...; an open source GPT-4o style multimodal AI agent on Android phones) by GGML + NCNN
- bugfix in UI layer (Java)
- bugfix in native layer (C/C++)
Be sure to review the open issues before contributing to project KanTV. We use GitHub issues for tracking requests and bugs; please see how to submit issue in this project.
Issue reports from various Android-based phones, and PRs to this project, are greatly welcomed.
- How to set up a customized KanTV server in a local dev env
- How to create a customized playlist for the kantv apk
- How to integrate proprietary/open source code into project KanTV for personal/proprietary/commercial R&D activity
- How to use whisper.cpp and ffmpeg to add subtitles to a video
- How to reduce the size of the built apk
- How to sign apk
- Acknowledgement
- ChangeLog
- F.A.Q
- AI inference framework
- GGML by Georgi Gerganov
- NCNN by Ni Hui
- AI application engine
- ASR engine whisper.cpp by Georgi Gerganov
- LLM engine llama.cpp by Georgi Gerganov
- TTS engine bark.cpp by PABannier
- Text2Image engine stablediffusion.cpp by leejet
- ASR engine sherpa-ncnn(an open-source ASR engine using next-generation Kaldi with ncnn) by k2-fsa
- Students: understand calculus, linear algebra, mathematical statistics, and probability theory; have some experience in C/C++/Java software development; want to learn Android software development (UI, NDK, audio/video with FFmpeg, streaming media such as HLS, RTMP, and WebRTC, on-device AI applications)
- Programmers: have solid experience in C/C++ software development, know little or nothing about real/hardcore AI technology, and want to learn AI technology in depth (know-how)
- Authors/maintainers of AI inference frameworks: compare the advantages of the two on-device inference frameworks ggml and ncnn on Android (why this project focuses only on ggml & ncnn)
- AI experts/algorithm engineers: debug and validate AI (ASR, TTS, CV, NLP, LLM, ...) algorithms and models on Android devices with the framework provided in this project (how to validate an AI algorithm/model on Android using this project)
Copyright (c) 2021 - 2023 Project KanTV
Copyright (c) 2024 - Authors of Project KanTV
Licensed under Apache License 2.0 or later