Silicon-Based Digital Human SDK
A 2D virtual digital human SDK that can be driven in real time by voice.
- Low deployment cost: no customer technical team is required; supports low-cost, rapid deployment on a wide range of terminals and large screens.
- Low network dependency: usable in scenarios such as subway, bank, and government virtual-assistant self-service.
- Diverse functions: meets the varied needs of the video, media, customer service, finance, and broadcasting industries.
Customized AI anchors, smart customer service, and other multi-scene avatar rentals are available, letting customers deploy quickly and operate at low cost:

- Exclusive avatar customization: create a custom virtual-assistant avatar, with a choice of low-cost or deep avatar generation.
- Broadcast content customization: define exclusive broadcast content for training, announcements, and similar scenarios.
- Real-time interactive Q&A: supports real-time dialogue and a custom Q&A database, covering consulting inquiries, voice chat, virtual companions, vertical-scenario customer service, and more.
Item | Description
---|---
System | Android 7.0 (API level 24) through Android 13 (API level 33).
CPU Architecture | armeabi-v7a, arm64-v8a
Hardware Requirements | 4-core CPU or better, 4 GB of RAM or more, and at least 500 MB of available storage.
Network | Supports Wi-Fi and mobile networks. If the cloud Q&A database is used, the bandwidth actually available to the digital human should be 10 Mbps or higher.
Development IDE | Android Studio Giraffe 2022.3.1 Patch 2
Memory Requirements | At least 400 MB of memory available to the digital human.
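A runtime pre-flight check against these requirements can be sketched with standard Android APIs (the helper name and exact thresholds below are illustrative, mirroring the table above):

```kotlin
import android.app.ActivityManager
import android.content.Context
import android.os.Build
import android.os.StatFs

// Verifies the ABI, RAM, storage, and OS requirements from the table above
fun meetsSdkRequirements(context: Context): Boolean {
    val abiOk = Build.SUPPORTED_ABIS.any { it == "arm64-v8a" || it == "armeabi-v7a" }
    val am = context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
    val memInfo = ActivityManager.MemoryInfo().also { am.getMemoryInfo(it) }
    val memOk = memInfo.totalMem >= 4L * 1024 * 1024 * 1024                  // 4 GB RAM
    val storageOk = StatFs(context.filesDir.path).availableBytes >= 500L * 1024 * 1024
    val osOk = Build.VERSION.SDK_INT >= 24                                   // Android 7.0+
    return abiOk && memOk && storageOk && osOk
}
```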
Add the following dependencies in build.gradle:
```groovy
dependencies {
    // Reference the SDK project
    implementation project(":duix-sdk")
    // The SDK uses ExoPlayer to handle audio (required)
    implementation 'com.google.android.exoplayer:exoplayer:2.14.2'
    // The cloud Q&A interface uses the SSE component (optional)
    implementation 'com.squareup.okhttp3:okhttp-sse:4.10.0'
    ...
}
```
Permission requirements: add the following to AndroidManifest.xml:
```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.CHANGE_WIFI_STATE" />
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
    <uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
    <uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
    <uses-permission android:name="android.permission.WAKE_LOCK" />
</manifest>
```
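All of the permissions above have the normal protection level, so they are granted at install time and no runtime permission request is needed.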
Construct the DUIX object and register the event callback in the render page's onCreate():
```kotlin
duix = DUIX(mContext, baseDir, modelDir, mDUIXRender) { event, msg, info ->
    when (event) {
        ai.guiji.duix.sdk.client.Constant.CALLBACK_EVENT_INIT_READY -> {
            initOK()
        }
        ai.guiji.duix.sdk.client.Constant.CALLBACK_EVENT_INIT_ERROR -> {
            // Initialization failed; msg carries the error detail
        }
        // ...
    }
}
// The result is reported asynchronously through the callback above
duix?.init()
```
DUIX constructor parameters:
Parameter | Type | Description
---|---|---
context | Context | System context
baseDir | String | Folder holding the configuration files that drive the model. You manage it yourself: unzip the resource archive to external storage and pass the folder path.
modelDir | String | Folder holding the model files. You manage it yourself: unzip the resource archive to external storage and pass the folder path.
render | RenderSink | Rendering data interface. The SDK provides a default rendering component that implements this interface, or you can implement it yourself.
callback | Callback | Handler for the various events raised by the SDK.
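The two directory parameters can be prepared by unzipping the resource archives to app-private external storage. A minimal sketch, assuming hypothetical folder names ("duix/gj_dh_res" and "duix/model" are placeholders; use whatever your resource package ships with):

```kotlin
import java.io.File

// Called from an Activity/Context; unzip your resource archives to these
// locations before constructing DUIX. The folder names are illustrative.
val root = getExternalFilesDir(null)
val baseDir = File(root, "duix/gj_dh_res").absolutePath
val modelDir = File(root, "duix/model").absolutePath
```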
Refer to the LiveActivity demo for a complete example:
```kotlin
object : Callback {
    override fun onEvent(event: String, msg: String, info: Any) {
        when (event) {
            "init.ready" -> {
                // SDK model initialized successfully
            }
            "init.error" -> {
                // Initialization failed
                Log.e(TAG, "init error: $msg")
            }
            // ...
        }
    }
}
```
The RenderSink interface receives rendering frame data. The SDK provides an implementation of this interface, DUIXRenderer.java; you can also implement the interface yourself to customize rendering. RenderSink is defined as follows:
```java
/**
 * Rendering pipeline; rendering data is returned through this interface.
 */
public interface RenderSink {
    // The buffer data in the frame is arranged in BGR order
    void onVideoFrame(ImageFrame imageFrame);
}
```
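Since RenderSink is a plain interface, a custom implementation can also wrap the default renderer. A minimal pass-through sketch (the class name InspectingSink is illustrative; only onVideoFrame() comes from the SDK):

```kotlin
// Wraps an existing RenderSink (e.g. DUIXRenderer) so each BGR frame can be
// observed or post-processed before it is forwarded for drawing.
class InspectingSink(private val downstream: RenderSink) : RenderSink {
    override fun onVideoFrame(imageFrame: ImageFrame) {
        // Inspect or modify the frame's BGR buffer here if needed
        downstream.onVideoFrame(imageFrame)
    }
}
```

Passing InspectingSink(mDUIXRender) as the render parameter when constructing DUIX keeps the default display while exposing every frame.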
Use DUIXRenderer together with the DUIXTextureView control for a simple rendering display. The control supports transparency, and its background and foreground can be set freely:
```kotlin
override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    // ...
    mDUIXRender = DUIXRenderer(mContext, binding.glTextureView)
    binding.glTextureView.setEGLContextClientVersion(GL_CONTEXT_VERSION)
    binding.glTextureView.setEGLConfigChooser(8, 8, 8, 8, 16, 0)    // Transparency
    binding.glTextureView.isOpaque = false                          // Transparency
    binding.glTextureView.setRenderer(mDUIXRender)
    binding.glTextureView.renderMode =
        GLSurfaceView.RENDERMODE_WHEN_DIRTY // Must be set after setRenderer()
    duix = DUIX(mContext, baseDir, modelDir, mDUIXRender) { event, msg, _ ->
        // Handle SDK events here
    }
    // ...
}
```
After initialization succeeds, you can play audio to drive the avatar:
```kotlin
duix?.playAudio(wavPath)
```
Parameter description:
Parameter | Type | Description
---|---|---
wavPath | String | Local file path or HTTPS URL of a 16 kHz sample-rate, mono WAV file.
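For example (both the file path and the URL below are hypothetical; either way the audio must be 16 kHz mono WAV):

```kotlin
import java.io.File

// Local file under app-private external storage (illustrative path)
duix?.playAudio(File(getExternalFilesDir(null), "audio/hello_16k.wav").absolutePath)
// Or a network address (illustrative URL)
duix?.playAudio("https://example.com/audio/hello_16k.wav")
```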
Audio playback status and progress callback:
```kotlin
object : Callback {
    override fun onEvent(event: String, msg: String, info: Any) {
        when (event) {
            // ...
            "play.start" -> {
                // Audio playback started
            }
            "play.end" -> {
                // Audio playback completed
            }
            "play.error" -> {
                // Audio playback error
            }
            "play.progress" -> {
                // Audio playback progress
            }
        }
    }
}
```
Call this interface to stop the broadcast while the digital human is speaking.
Function definition:
```java
boolean stopAudio();
```
Example call:
```kotlin
duix?.stopAudio()
```
If the model supports action segments, this function plays them; when the model contains multiple segments, one is chosen at random.
Function definition:
```java
void motion();
```
Example call:
```kotlin
duix?.motion()
```
If your build uses code obfuscation, add the following to proguard-rules.pro:
```
-keep class com.btows.ncnntest.** { *; }
-dontwarn com.squareup.okhttp3.**
-keep class com.squareup.okhttp3.** { *; }
```
- The base configuration folder and the corresponding model folder paths must be configured correctly for driving and rendering to work.
- Avoid playing very large audio files: loading a large file consumes significant CPU and can make rendering stutter (see the guard sketch below).
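A simple guard for the second note, as a sketch (the 10 MB ceiling is an assumption; tune it to your device profile, or split long audio into several playAudio() calls):

```kotlin
import java.io.File

// Refuses to play files above an assumed size ceiling to avoid render stutter
fun playGuarded(path: String) {
    val file = File(path)
    if (file.length() <= 10L * 1024 * 1024) {
        duix?.playAudio(path)
    } else {
        Log.w(TAG, "Audio too large (${file.length()} bytes); split it first")
    }
}
```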
3.0.4
1. Fixed an issue where the default low-precision float in GL on some devices caused the avatar to render incorrectly.
3.0.3
1. Optimized local rendering.
Module | Description
---|---
ExoPlayer | Media player
okhttp | Networking framework
onnx | AI inference framework
ncnn | High-performance neural network inference framework