More rewording (#61)

* update to new model * Add setup instructions and change wording
fw-ai · Feb 7, 2024 · 926dc9e · 926dc9e
1 parent 4fecd1c
commit 926dc9e
Show file tree

Hide file tree

Showing 4 changed files with 289 additions and 152 deletions.
diff --git a/examples/function_calling/fireworks_functions_information_extraction.ipynb b/examples/function_calling/fireworks_functions_information_extraction.ipynb
@@ -17,9 +17,9 @@
       "source": [
         "# Summarize Anything - Information Extraction via [Fireworks Function Calling](https://readme.fireworks.ai/docs/function-calling)\n",
         "\n",
-        "This is inspired by awesome colab notebook by [Deepset](https://colab.research.google.com/github/anakin87/notebooks/blob/main/information_extraction_via_llms.ipynb). Checkout there OSS LLM Orchestration framework [haystack](https://haystack.deepset.ai/).\n",
+        "This is inspired by awesome colab notebook by [Deepset](https://colab.research.google.com/github/anakin87/notebooks/blob/main/information_extraction_via_llms.ipynb). Check out there OSS LLM Orchestration framework [haystack](https://haystack.deepset.ai/).\n",
         "\n",
-        "In this experiment, we will use function calling ability of [Fireworks Function Calling](https://readme.fireworks.ai/docs/function-calling) model to generate structured information from unstrucutred data.\n",
+        "In this experiment, we will use function calling ability of [Fireworks Function Calling](https://readme.fireworks.ai/docs/function-calling) model to generate structured information from unstructured data.\n",
         "\n",
         "🎯 Goal: create an application that, given a text (or URL) and a specific structure provided by the user, extracts information from the source.\n",
         "\n",
@@ -55,6 +55,16 @@
         "id": "w6TPh-3lOTDP"
       }
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Setup\n",
+        "Let's install the dependencies needed for the demo first and import any dependencies needed."
+      ],
+      "metadata": {
+        "id": "CuitKPhdd5Ze"
+      }
+    },
     {
       "cell_type": "code",
       "source": [
@@ -82,6 +92,17 @@
         "from IPython.display import HTML, display"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Setup your API Key\n",
+        "\n",
+        "In order to use the Fireworks AI function calling model, you must first obtain Fireworks API Keys. If you don't already have one, you can one by following the instructions [here](https://readme.fireworks.ai/docs/quickstart)."
+      ],
+      "metadata": {
+        "id": "pLVxmW58eIIw"
+      }
+    },
     {
       "cell_type": "code",
       "source": [
@@ -105,7 +126,9 @@
       "source": [
         "## Introduction\n",
         "\n",
-        "The [documentation](https://readme.fireworks.ai/docs/function-calling) for FW function calling details the API we can use to specify the list of tools/functions available to the model. We will use the described API to test out the structured response usecase."
+        "The [documentation](https://readme.fireworks.ai/docs/function-calling) for FW function calling details the API we can use to specify the list of tools/functions available to the model. We will use the described API to test out the structured response usecase.\n",
+        "\n",
+        "Before we can begin, let's give the function calling model a go with a simple toy example and examine it's output."
       ]
     },
     {
@@ -138,7 +161,7 @@
         "messages = [\n",
         "    {\n",
         "         \"role\": \"system\",\n",
-        "         \"content\": \"You are a helpful assistant with access to tools. Use them wisely and don't image parameter values\",\n",
+        "         \"content\": \"You are a helpful assistant with access to tools. Use them wisely and don't imagine parameter values\",\n",
         "    },\n",
         "    {\n",
         "        \"role\": \"user\",\n",
@@ -208,13 +231,16 @@
         "id": "4zvaVXFcmy2v"
       },
       "source": [
-        "All good! ✅"
+        "The model outputs the function that should be called along with arguments under the `tool_calls` field. This field contains the arguments to be used for calling the function as JSON Schema and the `name` field contains the name of the function to be called.\n",
+        "\n",
+        "\n",
+        "The output demonstrates a sample input & output to function calling model. All good! ✅"
       ]
     },
     {
       "cell_type": "markdown",
       "source": [
-        "# Document Retrieval & Clean Up\n",
+        "## Document Retrieval & Clean Up\n",
         "\n",
         "Before we can get started with extracting the right set of information. We need to first obtaint the document given a url & then clean it up. For cleaning up HTML, we will use [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/)."
       ],
@@ -272,6 +298,21 @@
       "execution_count": null,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Setup Information Extraction using Function Calling\n",
+        "\n",
+        "After we have obtained clean data from a html page given a url, we are going to send this data to function calling model. Along with sending the cleaned html, we are also going to send it the schema in which we expect the model to produce output. This schema is sent under the tool specification of chat completion call.\n",
+        "\n",
+        "For this notebook, we use the `animal_info_tools` schema to extract information from species info pages of [Rain Forest Alliance](https://www.rainforest-alliance.org/). There are several attributes about the animal we want the model to extract from the web page e.g. `weight`, `habitat`, `diet` etc. Additionally, we specify some attributes as `required` forcing the model to always output this information regardless of the input. Given, we would be supplying the model with species information pages, we expect this information to be always present.\n",
+        "\n",
+        "**NOTE** We set the temperature to 0.0 to get reliable and consistent output across calls. In this particular example, we want the model to produce the right answer rather than creative answer."
+      ],
+      "metadata": {
+        "id": "sGHBcM90gNTH"
+      }
+    },
     {
       "cell_type": "code",
       "source": [
@@ -391,7 +432,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### Let's learn about Capybara"
+        "### Let's learn about Capybara\n",
+        "\n",
+        "Given the schema, we expect the model to produce some basic information like `weight`, `habitat`, `diet` & `predators` for Capybara. You can visit the [webpage](https://www.rainforest-alliance.org/species/capybara/) to see the source of the truth."
       ],
       "metadata": {
         "id": "0kVJ8IfSI-Dx"
@@ -426,6 +469,15 @@
         }
       ]
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "You can see the model correctly identifies the correct weight - `100 lbs` for the Capybara even though the webpage mentions the weight in `kgs` too. It also identifies the correct habitat etc. for the animal.  "
+      ],
+      "metadata": {
+        "id": "iiAq5zoSiy9C"
+      }
+    },
     {
       "cell_type": "markdown",
       "source": [
@@ -591,13 +643,11 @@
       ]
     },
     {
-      "cell_type": "code",
+      "cell_type": "markdown",
       "source": [],
       "metadata": {
         "id": "cJ3baFWLQg6L"
-      },
-      "execution_count": null,
-      "outputs": []
+      }
     }
   ],
   "metadata": {
@@ -616,4 +666,4 @@
   },
   "nbformat": 4,
   "nbformat_minor": 0
-}
+}
diff --git a/examples/function_calling/fireworks_functions_qa.ipynb b/examples/function_calling/fireworks_functions_qa.ipynb
@@ -17,13 +17,25 @@
         "id": "71a43144"
       },
       "source": [
-        "# Structure answers with Fireworks functions\n",
+        "# Structured answers with Fireworks functions\n",
         "\n",
-        "Fireworks (FW) function calling model allows has the ability to produced structured responses. This is often useful in question answering when you want to not only get the final answer but also supporting evidence, citation, etc.\n",
+        "Several real world applications of LLM require them to respond in a strucutred manner. This structured response could look like `JSON` or `YAML`. For e.g. answering research questions using arxiv along with citations. Instead of parsing the entire LLM response and trying to figure out the actual answer of the LLM vs the citations provided by the LLM, we can use function calling ability of the LLMs to answer questions in a structured way.\n",
         "\n",
-        "In this notebook we show how to use an LLM chain which uses FW functions as part of an overall retrieval pipeline."
+        "In this notebook, we demonstrate structured response generation ability of the Fireworks function calling model. We will build an application that can answer questions (along with citations) regarding the State of the Union speech of 2022."
       ]
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Setup\n",
+        "\n",
+        "Install all the dependencies and import the required python modules."
+      ],
+      "metadata": {
+        "id": "-7tAxHrBp4IQ"
+      },
+      "id": "-7tAxHrBp4IQ"
+    },
     {
       "cell_type": "code",
       "source": [
@@ -38,7 +50,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 2,
+      "execution_count": null,
       "id": "f059012e",
       "metadata": {
         "id": "f059012e"
@@ -51,6 +63,18 @@
         "import openai"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "##  Download & Clean the Content\n",
+        "\n",
+        "We are going to download the content using the python package `requests` and perform minor cleanup by removing several newlines. Even minimal cleanup should be good enough to obtain good results with the model."
+      ],
+      "metadata": {
+        "id": "tgbH6j3Lp-_x"
+      },
+      "id": "tgbH6j3Lp-_x"
+    },
     {
       "cell_type": "code",
       "source": [
@@ -62,7 +86,7 @@
         "id": "IcIybYoE35ro"
       },
       "id": "IcIybYoE35ro",
-      "execution_count": 3,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -75,9 +99,21 @@
         "id": "xTeisbO_4UI7"
       },
       "id": "xTeisbO_4UI7",
-      "execution_count": 4,
+      "execution_count": null,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Setup your API Key\n",
+        "\n",
+        "In order to use the Fireworks AI function calling model, you must first obtain Fireworks API Keys. If you don't already have one, you can one by following the instructions [here](https://readme.fireworks.ai/docs/quickstart)."
+      ],
+      "metadata": {
+        "id": "XBfEwDuiqQMT"
+      },
+      "id": "XBfEwDuiqQMT"
+    },
     {
       "cell_type": "code",
       "source": [
@@ -91,9 +127,21 @@
         "id": "ZlTFlhtB5baq"
       },
       "id": "ZlTFlhtB5baq",
-      "execution_count": 5,
+      "execution_count": null,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Define the Structure\n",
+        "\n",
+        "Let's define the strucutre in which we want our model to responsd. The JSON structure for function calling follows the conventions of [JSON Schema](https://json-schema.org/). Here we define a structure with `answer` and `citations` field."
+      ],
+      "metadata": {
+        "id": "JoHfdVFlqbjN"
+      },
+      "id": "JoHfdVFlqbjN"
+    },
     {
       "cell_type": "code",
       "source": [
@@ -129,13 +177,26 @@
         "id": "Zj-9l4m283b4"
       },
       "id": "Zj-9l4m283b4",
-      "execution_count": 6,
+      "execution_count": null,
       "outputs": []
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Perform Sanity Test\n",
+        "\n",
+        "Let's perform a sanity test by querying the speech for some basic information. This would ensure that our model setup is working correctly and the document is being processed correctly."
+      ],
+      "metadata": {
+        "id": "4tz7bwV-qset"
+      },
+      "id": "4tz7bwV-qset"
+    },
     {
       "cell_type": "code",
       "source": [
-        "messages = [\n",
+        "mp\n",
+        "essages = [\n",
         "    {\"role\": \"system\", \"content\": f\"You are a helpful assistant who is given document with following content: {clean_content}.\"\n",
         "     \t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\"Please reply in succinct manner and be truthful in the reply.\"},\n",
         "    {\"role\": \"user\", \"content\": \"What did the president say about Ketanji Brown Jackson?\"}\n",
@@ -145,7 +206,7 @@
         "id": "LcnDoz7H8jjE"
       },
       "id": "LcnDoz7H8jjE",
-      "execution_count": 7,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -163,7 +224,7 @@
         "id": "ENX3Fgcd_JfZ"
       },
       "id": "ENX3Fgcd_JfZ",
-      "execution_count": 8,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -179,7 +240,7 @@
         "outputId": "ee5a8472-167e-4167-f716-94378d8bd333"
       },
       "id": "0WzRJ5PgFAXc",
-      "execution_count": 9,
+      "execution_count": null,
       "outputs": [
         {
           "output_type": "stream",
@@ -225,7 +286,7 @@
         "id": "bF-o87oxD05g"
       },
       "id": "bF-o87oxD05g",
-      "execution_count": 10,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -261,7 +322,7 @@
         "id": "wYGPiSXfAysM"
       },
       "id": "wYGPiSXfAysM",
-      "execution_count": 11,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -277,7 +338,7 @@
         "outputId": "dcce465f-8d89-4937-f17c-4906dc142dcd"
       },
       "id": "UkWhQ4hPFMc_",
-      "execution_count": 12,
+      "execution_count": null,
       "outputs": [
         {
           "output_type": "stream",
@@ -351,7 +412,7 @@
         "id": "JxsGWpUcIGan"
       },
       "id": "JxsGWpUcIGan",
-      "execution_count": 13,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -381,7 +442,7 @@
         "id": "g-CYqzXIIUIl"
       },
       "id": "g-CYqzXIIUIl",
-      "execution_count": 14,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -399,7 +460,7 @@
         "id": "RylJZ8BiIewx"
       },
       "id": "RylJZ8BiIewx",
-      "execution_count": 15,
+      "execution_count": null,
       "outputs": []
     },
     {
@@ -415,7 +476,7 @@
         "outputId": "d3018c86-21dd-4b40-9a38-7f14bbec05cd"
       },
       "id": "qi_gNf-qI-CG",
-      "execution_count": 16,
+      "execution_count": null,
       "outputs": [
         {
           "output_type": "stream",
@@ -476,4 +537,4 @@
   },
   "nbformat": 4,
   "nbformat_minor": 5
-}
+}