
internal : Failed to initialize session: %s% INTERNAL: CalculatorGraph::Run() failed #5724

Open
Arya-Hari opened this issue Nov 12, 2024 · 1 comment
Assignees
talumbau

Labels
os:linux-non-arm: Issues on linux distributions which run on x86-64 architecture. DOES NOT include ARM devices.
task:LLM inference: Issues related to MediaPipe LLM Inference Gen AI setup
type:bug: Bug in the Source Code of MediaPipe Solution

Comments

@Arya-Hari

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)?

No

OS Platform and Distribution

Linux Ubuntu 16.04

Mobile device if the issue happens on mobile device

Pixel 7a

Browser and version if the issue happens on browser

No response

Programming Language and version

Python

MediaPipe version

No response

Bazel version

No response

Solution

LLM Inference

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

I created a .tflite file for the Llama 3.2 1B model using ai-edge-torch and created the Task Bundle as instructed in the documentation (see the bundling sketch after the code below). After pushing the .task file to the device, I modified the MediaPipe example (which was written for Gemma) to use Llama. When I run it, I get an error.

Describe the expected behaviour

I should be able to run inference without any issue.

Standalone code/steps you may have used to try to get what you need

// InferenceModel.kt after modification
package com.google.mediapipe.examples.llminference

import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference
import java.io.File
import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.SharedFlow
import kotlinx.coroutines.flow.asSharedFlow

class InferenceModel private constructor(context: Context) {
    private var llmInference: LlmInference

    private val modelExists: Boolean
        get() = File(MODEL_PATH).exists()

    private val _partialResults = MutableSharedFlow<Pair<String, Boolean>>(
        extraBufferCapacity = 1,
        onBufferOverflow = BufferOverflow.DROP_OLDEST
    )
    val partialResults: SharedFlow<Pair<String, Boolean>> = _partialResults.asSharedFlow()

    init {
        if (!modelExists) {
            throw IllegalArgumentException("Model not found at path: $MODEL_PATH")
        }

        val options = LlmInference.LlmInferenceOptions.builder()
            .setModelPath(MODEL_PATH)
            .setMaxTokens(1024)
            .setResultListener { partialResult, done ->
                _partialResults.tryEmit(partialResult to done)
            }
            .build()

        llmInference = LlmInference.createFromOptions(context, options)
    }

    fun generateResponseAsync(prompt: String) {
        // The Gemma-specific prompt prefix from the original example has been removed;
        // the prompt is passed to the Llama model unchanged.
        llmInference.generateResponseAsync(prompt)
    }

    companion object {
        // NB: Make sure the filename is *unique* per model you use!
        // Weight caching is currently based on filename alone.
        private const val MODEL_PATH = "/data/local/tmp/llm/llama.task"
        private var instance: InferenceModel? = null

        fun getInstance(context: Context): InferenceModel {
            return if (instance != null) {
                instance!!
            } else {
                InferenceModel(context).also { instance = it }
            }
        }
    }
}
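
For reference, the Task Bundle creation step mentioned above would look roughly like the following. This is a minimal sketch based on the bundler API described in the MediaPipe LLM Inference documentation; the file paths and the start/stop token strings are assumptions for Llama 3.2, not the exact values used for this issue.

# Sketch: packaging an externally converted TF Lite model into a .task bundle.
# All paths and token strings here are placeholders / assumptions.
from mediapipe.tasks.python.genai import bundler

config = bundler.BundleConfig(
    tflite_model="llama_3_2_1b.tflite",   # model exported by ai-edge-torch
    tokenizer_model="tokenizer.model",    # tokenizer model file for Llama 3.2
    start_token="<|begin_of_text|>",      # assumed Llama 3.2 start token
    stop_tokens=["<|eot_id|>"],           # assumed Llama 3.2 stop token(s)
    output_filename="llama.task",         # bundle later pushed to /data/local/tmp/llm/
)
bundler.create_bundle(config)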

Other info / Complete Logs

Error -
internal: Failed to initialize session: %sINTERNAL: CalculatorGraph::Run() failed: Calculator::Open() for node "odml.infra.TfLitePrefillDecodeRunnerCalculator" failed; RET_CHECK failure (external/odml/odml/infra/genai/inference/utils/tflite_utils/tflite_llm_utils.cc:59) std::find_if(signature_keys.begin(), signature_keys.end(), [&](const std::string* key) { return *key == required_key; }) != signature_keys.end()
@Arya-Hari Arya-Hari added the type:bug Bug in the Source Code of MediaPipe Solution label Nov 12, 2024
@kuaashish kuaashish added task:LLM inference Issues related to MediaPipe LLM Inference Gen AI setup os:linux-non-arm Issues on linux distributions which run on x86-64 architecture. DOES NOT include ARM devices. labels Nov 13, 2024
@talumbau talumbau self-assigned this Nov 14, 2024
@talumbau
Contributor

The error indicates that your TF Lite model does not have the two required signatures: "prefill" and "decode". Thus, I think something went wrong in the step where you "created .tflite file using ai-edge-torch for Llama 3.2 1B". Our typical conversion scripts enforce the creation of those two signatures for a converted language model. Can you double check that the TF Lite file has those signatures and also post the conversion code you used?
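
A quick way to do that check, as a minimal sketch assuming TensorFlow is installed (the model path below is a placeholder):

# Inspect which signatures the converted TF Lite model actually exports.
# The model path is a placeholder; point it at your converted file.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="llama_3_2_1b.tflite")
print(interpreter.get_signature_list())
# The LLM Inference runtime expects "prefill" and "decode" signature keys here.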
