Merge pull request #147 from deepset-ai/update-eval-with-haystack
Fix the cookbook video link and titles
bilgeyucel authored Oct 30, 2024
2 parents 0c19a83 + e275cd8 commit 5441a5c
Showing 1 changed file with 18 additions and 44 deletions.
62 changes: 18 additions & 44 deletions notebooks/evaluating_ai_with_haystack.ipynb
@@ -22,40 +22,7 @@
"\n",
"## 📺 Watch Along\n",
"\n",
"<iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/live/Dy-n_yC3Cto\" title=\"Evaluating AI with Haystack\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "toc",
"id": "WI3_y1HNGiqQ"
},
"source": [
">[Evaluating AI with Haystack](#scrollTo=uriHEO8pkgSo)\n",
"\n",
">[Building your pipeline](#scrollTo=C_WUXQzEQWv8)\n",
"\n",
">>[ARAGOG](#scrollTo=Dms5Ict6NGXq)\n",
"\n",
">[Human Evaluation](#scrollTo=zTbmQzeXQY1F)\n",
"\n",
">[Deciding on Metrics](#scrollTo=-U-QnCBqQcd6)\n",
"\n",
">[Building an Evaluation Pipeline](#scrollTo=yLkAcM_5Qfat)\n",
"\n",
">[Running Evaluation](#scrollTo=p76stWMQQmPD)\n",
"\n",
">>>[Run the RAG Pipeline](#scrollTo=rUfQQzusXhgk)\n",
"\n",
">>>[Run the Evaluation](#scrollTo=mfepD9HwXk4Q)\n",
"\n",
">[Analyzing Results](#scrollTo=mC_mIqdMQqZG)\n",
"\n",
">>[Evaluation Harness (Step 4, 5, and 6)](#scrollTo=OmkHqAsQZhFr)\n",
"\n",
">[Evaluation Frameworks](#scrollTo=gKfrFf1CebJJ)\n",
"\n"
"<iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/Dy-n_yC3Cto?si=LB0GdFP0VO-nJT-n\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>"
]
},
{
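The link fix in the hunk above replaces a `/live/` URL with the `/embed/` form in the iframe `src`. A minimal sketch of that rewrite follows, assuming the video ID is the last path segment, as it is in both URLs shown in the diff. The helper name `to_embed_url` is hypothetical and is not part of the notebook or of any YouTube API; the `?si=` query parameter from the new URL is dropped here for simplicity.

```python
from urllib.parse import urlparse


def to_embed_url(url: str) -> str:
    """Rewrite a youtube.com /live/<id> or /embed/<id> URL to the /embed/<id> form.

    Hypothetical helper for illustration; it handles only the two path
    shapes that appear in this diff and raises on anything else.
    """
    parsed = urlparse(url)
    # Split the path into non-empty segments, e.g. ["live", "Dy-n_yC3Cto"].
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) != 2 or parts[0] not in ("live", "embed"):
        raise ValueError(f"Unsupported YouTube URL shape: {url}")
    video_id = parts[1]
    return f"https://www.youtube.com/embed/{video_id}"


if __name__ == "__main__":
    print(to_embed_url("https://www.youtube.com/live/Dy-n_yC3Cto"))
```

Raising on unknown URL shapes, rather than guessing, keeps a malformed link from being silently written into the notebook.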
@@ -91,7 +58,7 @@
"id": "C_WUXQzEQWv8"
},
"source": [
"# 1. Building your pipeline"
"## 1. Building your pipeline"
]
},
{
@@ -100,7 +67,7 @@
"id": "Dms5Ict6NGXq"
},
"source": [
"## ARAGOG\n",
"### ARAGOG\n",
"\n",
"This dataset is based on the paper [Advanced Retrieval Augmented Generation Output Grading (ARAGOG)](https://arxiv.org/pdf/2404.01037). It's a\n",
"collection of papers from ArXiv covering topics around Transformers and Large Language Models, all in PDF format.\n",
@@ -113,7 +80,14 @@
"- ground-truth answers\n",
"- questions\n",
"\n",
"Source: https://github.com/deepset-ai/haystack-evaluation/blob/main/datasets/README.md"
"Get the dataset [here](https://github.com/deepset-ai/haystack-evaluation/blob/main/datasets/README.md)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Indexing Pipeline"
]
},
{
@@ -276,7 +250,7 @@
"embedding_model=\"sentence-transformers/all-MiniLM-L6-v2\"\n",
"document_store = InMemoryDocumentStore()\n",
"\n",
"files_path = \"/content/papers_for_questions\"\n",
"files_path = \"/content/papers_for_questions\" # <ENTER YOUR PATH HERE>\n",
"pipeline = Pipeline()\n",
"pipeline.add_component(\"converter\", PyPDFToDocument())\n",
"pipeline.add_component(\"cleaner\", DocumentCleaner())\n",
@@ -412,7 +386,7 @@
"id": "zTbmQzeXQY1F"
},
"source": [
"# 2. Human Evaluation"
"## 2. Human Evaluation"
]
},
{
@@ -543,7 +517,7 @@
"id": "-U-QnCBqQcd6"
},
"source": [
"# 3. Deciding on Metrics\n",
"## 3. Deciding on Metrics\n",
"\n",
"* **Semantic Answer Similarity**: SASEvaluator compares the embedding of a generated answer against a ground-truth answer based on a common embedding model.\n",
"* **ContextRelevanceEvaluator** will assess the relevancy of the retrieved context to answer the query question\n",
@@ -556,7 +530,7 @@
"id": "yLkAcM_5Qfat"
},
"source": [
"# 4. Building an Evaluation Pipeline"
"## 4. Building an Evaluation Pipeline"
]
},
{
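The Semantic Answer Similarity metric listed under "Deciding on Metrics" compares the embedding of a generated answer against the embedding of a ground-truth answer. A minimal sketch of that comparison follows, assuming the two embeddings are compared with cosine similarity and the per-pair scores are averaged. The vectors are hypothetical toy values standing in for real model output, and the function names are illustrative; this is not the `SASEvaluator` implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_similarity(predicted: list[list[float]], truth: list[list[float]]):
    """Score each predicted/ground-truth embedding pair and return (mean, per-pair scores)."""
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("Need two non-empty lists of the same length")
    scores = [cosine_similarity(p, t) for p, t in zip(predicted, truth)]
    return sum(scores) / len(scores), scores


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings" (assumption: a real model would produce
    # vectors with hundreds of dimensions).
    predicted_embeddings = [[1.0, 0.0], [1.0, 0.0]]
    truth_embeddings = [[1.0, 0.0], [0.0, 1.0]]
    mean_score, per_pair = mean_similarity(predicted_embeddings, truth_embeddings)
    print(mean_score, per_pair)
```

Returning the per-pair scores alongside the mean makes it possible to inspect individual low-scoring answers rather than only the aggregate.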
@@ -582,7 +556,7 @@
"id": "p76stWMQQmPD"
},
"source": [
"# 5. Running Evaluation"
"## 5. Running Evaluation"
]
},
{
@@ -663,7 +637,7 @@
"id": "mC_mIqdMQqZG"
},
"source": [
"# 6. Analyzing Results"
"## 6. Analyzing Results"
]
},
{
@@ -3488,7 +3462,7 @@
"id": "gKfrFf1CebJJ"
},
"source": [
"# Evaluation Frameworks"
"## Evaluation Frameworks"
]
},
{