Google Gemini API Upload file larger than 20 MB #3203
georgezhai123 started this conversation in Ideas & Feedback
@georgezhai123 I don't know whether this was possible when you asked two months ago, but the AI SDK docs say you can upload a large file with Gemini's native File API and then pass the resulting file URI in the prompt.
This is my test code (run with Deno), and it appears to work:

import process from "node:process";
import { extname } from "jsr:@std/path";
import mime from "mime";
import { generateText } from "ai";
import { google } from "@ai-sdk/google";
import { GoogleAIFileManager } from "@google/generative-ai/server";

const apiKey = process.env.GOOGLE_GENERATIVE_AI_API_KEY;
if (!apiKey) {
  console.log("Please provide a Google API key.");
  process.exit(1);
}

// The file manager talks to Gemini's native File API.
const fileManager = new GoogleAIFileManager(apiKey);

async function main() {
  console.log(Deno.args);
  if (Deno.args.length === 0) {
    console.log("Please provide a file path.");
    return;
  }

  // Derive the MIME type from the file extension.
  const filepath = Deno.args[0];
  const fileExt = extname(filepath);
  const mimeType = mime.getType(fileExt);
  if (!mimeType) {
    console.log("Unsupported file type.");
    return;
  }
  console.log(`File path: ${filepath}`);
  console.log(`MIME type: ${mimeType}`);

  // Upload through the File API instead of attaching the bytes to the prompt.
  console.log("Uploading file...");
  const geminiFile = await fileManager.uploadFile(filepath, { mimeType });
  console.log("File uploaded: ", geminiFile.file.uri);
  const fileUri = geminiFile.file.uri;

  // Reference the uploaded file by URI in a "file" content part.
  const response = await generateText({
    model: google("gemini-2.0-flash-exp"),
    temperature: 0.0,
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Summarize the document.",
          },
          {
            type: "file",
            data: fileUri,
            mimeType: mimeType,
          },
        ],
      },
    ],
  });
  console.log(response.text);
}

await main();

Hope this helps.
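One caveat with the upload step: as far as I know, the File API can report an upload as still processing (mostly for larger media such as video) before it becomes usable in a prompt. The sketch below polls until the file is ready. The names `waitUntilActive`, `UploadedFile`, and `UploadState` are hypothetical helpers of mine, and the state strings are an assumption about what the File API returns, so check them against the SDK version you use.

```typescript
// Hypothetical local types; the state names are assumed to mirror the File API.
type UploadState = "PROCESSING" | "ACTIVE" | "FAILED";

interface UploadedFile {
  name: string;
  uri: string;
  state: UploadState;
}

// Polls `getFile` until the file is ACTIVE, fails fast on FAILED,
// and gives up after `maxAttempts` lookups.
async function waitUntilActive(
  name: string,
  getFile: (name: string) => Promise<UploadedFile>,
  intervalMs: number = 2000,
  maxAttempts: number = 30,
): Promise<UploadedFile> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const file = await getFile(name);
    if (file.state === "ACTIVE") {
      return file;
    }
    if (file.state === "FAILED") {
      throw new Error(`Processing failed for ${name}`);
    }
    // Still processing: wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out waiting for ${name}`);
}
```

The lookup is passed in as a function so the helper can be tried with a stub. In the script above it could wrap the file manager's own metadata lookup, called between the upload and `generateText`; treat that wiring as untested on my side.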
Currently, the Gemini model doesn't allow uploading a file larger than 20 MB when it is attached to the prompt. However, Google has a File API that supports files larger than 20 MB. If this feature could be implemented, it would be very useful for Gemini's long-context capability.
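To make the request concrete, here is a minimal sketch of the routing this feature would need: attach small files to the prompt directly, and send anything over the limit through the File API. `chooseUploadStrategy` and `INLINE_LIMIT_BYTES` are hypothetical names, and reading "20 MB" as 20 * 1024 * 1024 bytes is my assumption, not something the Gemini docs are quoted on here.

```typescript
// Assumption: the 20 MB inline limit is interpreted as 20 MiB.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

type UploadStrategy = "inline" | "file-api";

// Decides how a file of the given size should reach the model:
// inline with the prompt, or uploaded first through the File API.
function chooseUploadStrategy(
  sizeBytes: number,
  limitBytes: number = INLINE_LIMIT_BYTES,
): UploadStrategy {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError(`Invalid file size: ${sizeBytes}`);
  }
  return sizeBytes <= limitBytes ? "inline" : "file-api";
}
```

A caller would check the file size first (for example with `Deno.stat`) and only take the upload path when the result is `"file-api"`.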