feat(ai): add support for Valibot schemas #3015

Draft
wants to merge 1 commit into
base: main

Conversation

fabian-hiller

This is a draft PR as there are a few questions that need to be answered before I can finalize the implementation.

The initial idea is to support Valibot for the parameters property in addition to Zod. Valibot has seen extreme growth in recent months, increasing the likelihood that users of the AI SDK will want to use Valibot instead of raw JSON Schema or Zod. Our library offers the advantage that the API design is modular, and schemas typically require only a few hundred bytes.

To make Valibot a great match for AI tools (we are also in contact with OpenAI), we have added a number of new features in recent weeks. Valibot now supports title and description metadata actions that are compatible with JSON Schema. Furthermore, we implemented an official toJsonSchema function that reliably converts our schemas to JSON Schema format.

Valibot's API seems to be stable. We are only holding back our v1 RC release because there are efforts in the background to make Zod, Valibot and other schema libraries more compatible through common interface properties, which would make these schema libraries easier to integrate.

This PR adds a valibotSchema function to the ui-utils package that works similarly to zodSchema. However, it is unclear to me whether it is preferable to export valibotSchema from the ai package or to integrate it into the asSchema function. The latter would allow users to pass Valibot schemas directly to parameters without a wrapper function, resulting in a smoother DX.

import { generateText, tool } from 'ai';
import * as v from 'valibot';

const result = await generateText({
  model: yourModel,
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      parameters: v.object({
        location: v.pipe(
          v.string(),
          v.description('The location to get the weather for')
        ),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  toolChoice: 'required',
  prompt: '...',
});


socket-security bot commented Sep 15, 2024

New and removed dependencies detected. Learn more about Socket for GitHub ↗︎

Package New capabilities Transitives Size Publisher
npm/@valibot/[email protected] None 0 44.4 kB fabian-hiller
npm/[email protected] environment, filesystem, network, shell, unsafe +10 98.9 MB vercel-release-bot
npm/[email protected] None +1 6.4 MB fb, gnoff, react-bot, ...2 more
npm/[email protected] environment 0 237 kB react-bot
npm/[email protected] None 0 1.46 MB fabian-hiller

🚮 Removed packages: npm/[email protected], npm/[email protected], npm/[email protected]

View full report↗︎


socket-security bot commented Sep 15, 2024

👍 Dependency issues cleared. Learn more about Socket for GitHub ↗︎

This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored.

View full report↗︎

@lgrammel
Collaborator

My preferred way would be to have a separate @ai-sdk/valibot package for now. I'll look into this tomorrow.

@fabian-hiller
Author

Thank you for your response. I recommend waiting a few days before taking any action, as we are currently trying to standardize a common interface for TS schema libraries. If our ideas are well received by other schema libraries as well as libraries like the Vercel AI SDK, the integration of schema libraries will become much easier and less complicated.

@lgrammel
Collaborator

@fabian-hiller do you have a timeline for the standardization? I could also just release the valibot integration now and then update once the standard is out.

@fabian-hiller
Author

We are trying to release a v0 or v1 of the Standard Schema this week. I will keep you posted.

@fabian-hiller
Author

I have not forgotten you. I will probably give you an update in the next few days.

@fabian-hiller
Author

fabian-hiller commented Oct 1, 2024

Standard Schema v1 beta and Valibot v1 beta are now available. Below is some pseudocode on how the Vercel AI SDK could implement it. Standard Schema can reduce implementation effort and prevent or minimize vendor lock-in. We will be adding more documentation and examples to the Standard Schema repository soon. Let us know what you think.

import type { StandardSchema, InferOutput } from "@standard-schema/spec";
import { toJsonSchema as valibotToJsonSchema } from "@valibot/to-json-schema";
import type { JSONSchema7 } from "json-schema";
import { zodToJsonSchema } from "zod-to-json-schema";

export type Schema<OBJECT = unknown> = Validator<OBJECT> & {
  // ...
};

export function standardSchema<T extends StandardSchema>(
  schema: T,
): Schema<InferOutput<T>> {
  let jsonSchema7: JSONSchema7 | undefined;
  if (schema["~vendor"] === "zod") {
    jsonSchema7 = zodToJsonSchema(schema);
  } else if (schema["~vendor"] === "valibot") {
    jsonSchema7 = valibotToJsonSchema(schema);
  } else {
    throw new Error("Unsupported schema vendor");
  }
  return jsonSchema(jsonSchema7, {
    validate: async (value) => {
      const result = await schema["~validate"]({ value });
      if (result.issues) {
        console.warn("Issues", result.issues);
        return { success: false, error: new Error(result.issues[0].message) };
      }
      return { success: true, value: result.value };
    },
  });
}

@fabian-hiller
Author

The main idea is that libraries like the Vercel AI SDK implement Standard Schema to easily support any schema that follows that spec. This simplifies and improves the DX on both sides. Library authors are no longer required to understand and implement a specific schema library or, in the worst case, maintain a bunch of adapters in their own source code. Users, on the other hand, no longer need to wrap their schema in an adapter function to make it compatible.

The Vercel AI SDK could accept both JSON Schema and Standard Schema as input for the parameters property. If the input contains a "~standard" key, it can be passed to the standardSchema function, which converts it to a unified schema interface.
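
The key-based detection described above could be sketched roughly as follows. This is only an illustration: StandardLike, JsonSchemaLike, Normalized, isStandardLike and normalizeParameters are hypothetical names and are not part of the AI SDK or the Standard Schema spec. The guard looks only at the presence of the "~standard" key and assumes nothing about its value.

```typescript
// Hypothetical stand-in types for illustration only.
type StandardLike = { '~standard': unknown };
type JsonSchemaLike = {
  type?: string;
  properties?: Record<string, unknown>;
  required?: string[];
};

type Normalized =
  | { kind: 'standard'; schema: StandardLike }
  | { kind: 'json'; schema: JsonSchemaLike };

// True when the input is a non-null object that carries a "~standard" key.
function isStandardLike(input: unknown): input is StandardLike {
  return typeof input === 'object' && input !== null && '~standard' in input;
}

// Tags the input so later code can pick the matching conversion path.
function normalizeParameters(input: StandardLike | JsonSchemaLike): Normalized {
  if (isStandardLike(input)) {
    return { kind: 'standard', schema: input };
  }
  return { kind: 'json', schema: input };
}
```

With a check like this, a caller could hand either kind of schema to parameters, and the wrapper would only be needed internally.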

@MentalGear

Great initiative on the StandardSchema. This is definitely the way forward as it massively reduces implementation overhead while increasing interoperability.

@lgrammel
Collaborator

@fabian-hiller I really like the idea. However, until Standard Schema supports conversion to JSON Schema and potentially OpenAPI schema, it is only a half-way solution for the needs of the AI SDK (tool calling, structured outputs). We would still need dependencies on the JSON Schema converters, which in turn depend on the schema libraries. I'll think more about how to apply it to the AI SDK.

@MentalGear

MentalGear commented Oct 15, 2024

Indeed, I would expect JSON export to be next on the task list for Standard Schema, since it is the most universal format and would give other validation library authors an additional incentive to adopt Standard Schema and get JSON export for free.

@fabian-hiller
Author

Yes, in general it is a good idea to add it to the Standard Schema spec, but there is one drawback: bundle size. Being forced to attach our toJsonSchema function to every schema is probably not an option for Valibot, because we need a tree-shakable solution that does not unnecessarily increase the bundle size of every schema when this functionality is never used.
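
To illustrate the trade-off, here is a minimal sketch in which schemas are plain data and the converter is a standalone function. MiniSchema, JsonSchema and this toJsonSchema are hypothetical stand-ins written for this example; they are not Valibot's real types or its real converter. In a real package the converter would live in its own module, so a bundler could drop it whenever it is never imported, which is not possible if it is a method attached to every schema object.

```typescript
// Hypothetical mini schema model: plain data, no conversion method attached.
type StringSchema = { kind: 'string'; description?: string };
type ObjectSchema = { kind: 'object'; entries: Record<string, MiniSchema> };
type MiniSchema = StringSchema | ObjectSchema;

type JsonSchema =
  | { type: 'string'; description?: string }
  | { type: 'object'; properties: Record<string, JsonSchema>; required: string[] };

// Standalone converter: only code that references it pulls it into the bundle.
function toJsonSchema(schema: MiniSchema): JsonSchema {
  if (schema.kind === 'string') {
    return schema.description === undefined
      ? { type: 'string' }
      : { type: 'string', description: schema.description };
  }
  const properties: Record<string, JsonSchema> = {};
  for (const key of Object.keys(schema.entries)) {
    properties[key] = toJsonSchema(schema.entries[key]);
  }
  // This sketch treats every entry as required.
  return { type: 'object', properties, required: Object.keys(schema.entries) };
}
```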

@fabian-hiller
Author

fabian-hiller commented Nov 1, 2024

@lgrammel another option for the Vercel SDK could be a new function called standardSchema (similar to jsonSchema) that takes a Standard Schema as the first argument and a matching toJsonSchema function as the second. This way the SDK does not have to add external dependencies, and it is still easy to use if it is officially documented in the docs.

Example code:

export function standardSchema<TSchema extends StandardSchema>(
  schema: TSchema,
  toJsonSchema: (schema: TSchema) => JSONSchema7
): Schema<InferOutput<TSchema>> {
  return jsonSchema(toJsonSchema(schema), {
    validate: (value) => {
      const result = schema['~validate']({ value });
      return result.issues
        ? { success: false, error: new SchemaError(result.issues) }
        : { success: true, value: result.value };
    },
  });
}

import { toJsonSchema } from '@valibot/to-json-schema';
import { generateText, standardSchema, tool } from 'ai';
import * as v from 'valibot';

const result = await generateText({
  model: yourModel,
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      parameters: standardSchema(
        v.object({
          location: v.pipe(
            v.string(),
            v.description('The location to get the weather for')
          ),
        }),
        toJsonSchema
      ),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  toolChoice: 'required',
  prompt: '...',
});

@lgrammel
Collaborator

lgrammel commented Nov 1, 2024

@fabian-hiller unnecessary overhead for users to care about the JSON Schema mapping imo, and not much advantage over toJsonSchema. Plus I'll probably need to add a toOpenAPISchema as well for some providers. My favorite solution (once Valibot 1.0 is out) is to have an @ai-sdk/valibot package:

import { valibotSchema } from '@ai-sdk/valibot';
import { generateText, tool } from 'ai';
import * as v from 'valibot';

const result = await generateText({
  model: yourModel,
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      parameters: valibotSchema(
        v.object({
          location: v.pipe(
            v.string(),
            v.description('The location to get the weather for')
          ),
        })
      ),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  // ...
});

Then we can document it in the SDK as well, and for users it's as easy as: I use Valibot, so I use the Valibot schema adapter of the AI SDK (vs. knowing that Valibot uses Standard Schema and knowing the name of the Valibot-to-JSON-Schema package).
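
For comparison, the two proposals in this thread can be read as the same mechanism with the converter bound at different places. The sketch below shows that under simplified assumptions. Every name in it (ValidatingSchema, ToolSchema, genericSchema, makeAdapter, vendorSchema) is a hypothetical stand-in defined locally, not an export of the AI SDK, Valibot, or Standard Schema.

```typescript
// Hypothetical stand-ins for illustration only.
type Issue = { message: string };
type Result<T> = { ok: true; value: T } | { ok: false; issues: Issue[] };
interface ValidatingSchema<T> {
  validate(value: unknown): Result<T>;
}
type JsonSchema = Record<string, unknown>;
type ToolValidation<T> = { success: true; value: T } | { success: false; error: Error };
interface ToolSchema<T> {
  jsonSchema: JsonSchema;
  validate(value: unknown): ToolValidation<T>;
}
type Converter = (schema: ValidatingSchema<unknown>) => JsonSchema;

// Two-argument form: the caller supplies the converter on every call.
function genericSchema<T>(schema: ValidatingSchema<T>, toJson: Converter): ToolSchema<T> {
  return {
    jsonSchema: toJson(schema),
    validate(value: unknown): ToolValidation<T> {
      const result = schema.validate(value);
      if (result.ok) {
        return { success: true, value: result.value };
      }
      const message = result.issues.map((issue) => issue.message).join('; ');
      return { success: false, error: new Error(message) };
    },
  };
}

// Adapter form: a vendor package binds the converter once and exports the result.
function makeAdapter(toJson: Converter) {
  return <T>(schema: ValidatingSchema<T>): ToolSchema<T> => genericSchema<T>(schema, toJson);
}

// Hand-written schema and fixed converter so the sketch runs without any library.
const locationSchema: ValidatingSchema<{ location: string }> = {
  validate(value: unknown): Result<{ location: string }> {
    const location = (value as { location?: unknown } | null | undefined)?.location;
    return typeof location === 'string'
      ? { ok: true, value: { location } }
      : { ok: false, issues: [{ message: 'Expected an object with a string "location"' }] };
  },
};

const vendorSchema = makeAdapter(() => ({
  type: 'object',
  properties: { location: { type: 'string' } },
  required: ['location'],
}));

const toolParameters = vendorSchema(locationSchema);
```

In the adapter form the user never names the converter, which is the ergonomic point made above. The cost is one small package per schema library.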

@MentalGear

MentalGear commented Nov 2, 2024 via email

@fabian-hiller
Author

AI SDKs are not the main use case Standard Schema is optimized for. Read more here: https://github.com/standard-schema/standard-schema?tab=readme-ov-file#background
