diff --git a/docs/community/.DS_Store b/docs/community/.DS_Store
new file mode 100644
index 00000000..5008ddfc
Binary files /dev/null and b/docs/community/.DS_Store differ
diff --git a/docs/community/community.md b/docs/community/community.md
new file mode 100644
index 00000000..a053e1c9
--- /dev/null
+++ b/docs/community/community.md
@@ -0,0 +1,97 @@
+---
+layout: default
+title: Community
+nav_order: 5
+has_children: true
+description: community resources, getting help and sharing ideas
+permalink: /community
+---
+
+# Community
+
+COMING SOON ...
+
+
+{: .note}
+> Contributions to `llmware` are governed by our [Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md).
+
+{: .warning}
+> Have you found a security issue? Then please jump to [Security Vulnerabilities](#security-vulnerabilities).
+
+On this page, we provide information about ``llmware`` contributions.
+There are **two ways** to contribute.
+The first is by making **code contributions**, and the second is by making contributions to the **documentation**.
+Please look at our [contribution suggestions](#how-can-you-contribute) if you need inspiration, or take a look at [open issues](#open-issues).
+
+Contributions to `llmware` are welcome from everyone.
+Our goal is to make the process simple, transparent, and straightforward.
+We are happy to receive suggestions on how the process can be improved.
+
+## How can you contribute?
+
+{: .note}
+> If you have never contributed before, look for issues with the tag [``good first issue``](https://github.com/llmware-ai/llmware/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).
+
+The most common ways to contribute are to add new features, fix bugs, add tests, or add documentation.
+You can visit the [issues](https://github.com/llmware-ai/llmware/issues) page of the project and search for tags such as
+``bug``, ``enhancement``, ``documentation``, or ``test``.
+
+
+Here is a non-exhaustive list of contributions you can make.
+
+1. Code refactoring
+2. Add support for new text databases
+3. Add support for new vector databases
+4. Fix bugs
+5. Add usage examples (see for example the issues [jupyter notebook - more examples and better support](https://github.com/llmware-ai/llmware/issues/508) and [google colab examples and start up scripts](https://github.com/llmware-ai/llmware/issues/507))
+6. Add experimental features
+7. Improve code quality
+8. Improve documentation in the docs (what you are reading right now)
+9. Improve documentation by adding or updating docstrings in modules, classes, methods, or functions (see for example [Add docstrings](https://github.com/llmware-ai/llmware/issues/219))
+10. Improve test coverage
+11. Answer questions in our [Discord channel](https://discord.gg/MhZn5Nc39h), especially in the [technical support forum](https://discord.com/channels/1179245642770559067/1218498778915672194)
+12. Post projects in which you use ``llmware`` in our Discord forum [made with llmware](https://discord.com/channels/1179245642770559067/1218567269471486012), ideally with a link to a public GitHub repository
+
+## Open Issues
+If you're interested in existing issues, you can
+
+- Look for issues; if you are new to the project, start with the `good first issue` label.
+- Provide answers to questions in our [GitHub discussions](https://github.com/llmware-ai/llmware/discussions)
+- Provide help with bug or enhancement issues.
+  - Ask questions, reproduce the issues, or provide solutions.
+  - Open a pull request to fix the issue.
+
+
+
+## Security Vulnerabilities
+**If you believe you've found a security vulnerability, then please _do not_ submit an issue ticket or pull request or otherwise publicly disclose the issue.**
+Please follow the process at [Reporting a Vulnerability](https://github.com/llmware-ai/llmware/blob/main/Security.md).
+
+
+
+## GitHub workflow
+
+We follow the [``fork-and-pull``](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) Git workflow.
+
+1. [Fork](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo) the repository on GitHub.
+2. Clone your fork to your local machine with `git clone git@github.com:<your-github-username>/llmware.git`.
+3. Create a branch with `git checkout -b my-topic-branch`.
+4. Run the test suite by navigating to the tests/ folder and running `./run-tests.py -s` to ensure there are no failures.
+5. [Commit](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/committing-changes-to-a-pull-request-branch-created-from-a-fork) changes to your own branch, then push to GitHub with `git push origin my-topic-branch`.
+6. Submit a [pull request](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) so that we can review your changes.
+
+Remember to [synchronize your forked repository](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo#keep-your-fork-synced) _before_ submitting proposed changes upstream. If you have an existing local repository, please update it before you start, to minimize the chance of merge conflicts.
+
+```shell
+git remote add upstream git@github.com:llmware-ai/llmware.git
+git fetch upstream
+git checkout upstream/main -b my-topic-branch
+```
+
+## Community
+Questions and discussions are welcome in any shape or form.
+Please feel free to join our community on our Discord channel, where we are active daily.
+You are also welcome if you just want to post an idea!
+
+- [Discord Channel](https://discord.gg/MhZn5Nc39h)
+- [GitHub discussions](https://github.com/llmware-ai/llmware/discussions)
diff --git a/docs/faq.md b/docs/community/faq.md
similarity index 98%
rename from docs/faq.md
rename to docs/community/faq.md
index e972b0cb..59a672d6 100644
--- a/docs/faq.md
+++ b/docs/community/faq.md
@@ -1,8 +1,10 @@
 ---
 layout: default
-title: Freqently Asked Questions
-nav_order: 12
-permalink: /faq
+title: FAQ
+parent: Community
+nav_order: 1
+description: frequently asked questions about llmware
+permalink: /community/faq
 ---
 # Frequently Asked Questions (FAQ)
diff --git a/docs/community/join_our_community.md b/docs/community/join_our_community.md
new file mode 100644
index 00000000..5d26c847
--- /dev/null
+++ b/docs/community/join_our_community.md
@@ -0,0 +1,71 @@
+---
+layout: default
+title: Join Our Community
+parent: Community
+nav_order: 4
+description: join the llmware community to get help and share ideas
+permalink: /community/join_our_community
+---
+# Join the LLMWare Community
+___
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
diff --git a/docs/community/need_help.md b/docs/community/need_help.md
new file mode 100644
index 00000000..afc99187
--- /dev/null
+++ b/docs/community/need_help.md
@@ -0,0 +1,71 @@
+---
+layout: default
+title: Need Help
+parent: Community
+nav_order: 3
+description: where to get help with llmware
+permalink: /community/need_help
+---
+# Need Help
+___
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
diff --git a/docs/troubleshooting.md b/docs/community/troubleshooting.md
similarity index 97%
rename from docs/troubleshooting.md
rename to docs/community/troubleshooting.md
index 4ef5c009..6812f9dd 100644
--- a/docs/troubleshooting.md
+++ b/docs/community/troubleshooting.md
@@ -1,9 +1,10 @@
 ---
 layout: default
 title: Troubleshooting
-nav_order: 8
-description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
-permalink: /troubleshooting
+parent: Community
+nav_order: 2
+description: common troubleshooting issues and solutions
+permalink: /community/troubleshooting
 ---
 # Common Troubleshooting Issues
 ___
diff --git a/docs/components/.DS_Store b/docs/components/.DS_Store
new file mode 100644
index 00000000..5008ddfc
Binary files /dev/null and b/docs/components/.DS_Store differ
diff --git a/docs/components/agent_inference_server.md b/docs/components/agent_inference_server.md
new file mode 100644
index 00000000..27898db9
--- /dev/null
+++ b/docs/components/agent_inference_server.md
@@ -0,0 +1,193 @@
+---
+layout: default
+title: Agent Inference Server
+parent: Components
+nav_order: 12
+description: setting up an inference server for agent processes
+permalink: /components/agent_inference_server
+---
+# Agent Inference Server
+---
+
+LLMWare supports multiple deployment options, including the use of REST APIs to implement most model invocations.
+
+To set up an inference server for Agent processes:
+
+```python
+
+""" This example shows how to set up an inference server that can be used in conjunction with agent-based workflows.
+
+    This script covers both the server-side deployment and the steps taken on the client side to deploy
+    in an Agent example.
+
+    Note: this example builds off two other examples:
+
+        1.  "examples/Models/launch_llmware_inference_server.py"
+        2.  "examples/SLIM-Agents/agent-llmfx-getting-started.py"
+
+"""
+
+
+from llmware.models import ModelCatalog, LLMWareInferenceServer
+
+# *** SERVER SIDE SCRIPT ***
+
+base_model = "llmware/bling-tiny-llama-v0"
+LLMWareInferenceServer(base_model,
+                       model_catalog=ModelCatalog(),
+                       secret_api_key="demo-test",
+                       home_path="/home/ubuntu/",
+                       verbose=True).start()
+
+# this will start a Flask-based server, which will display the launched IP address and port, e.g.,
+# "Running on " ip_address = "http://127.0.0.1:8080"
+
+
+# *** CLIENT SIDE AGENT PROCESS ***
+
+
+from llmware.agents import LLMfx
+
+
+def create_multistep_report_over_api_endpoint():
+
+    """ This is derived from the script in the example agent-llmfx-getting-started.py. """
+
+    customer_transcript = "My name is Michael Jones, and I am a long-time customer. " \
+                          "The Mixco product is not working currently, and it is having a negative impact " \
+                          "on my business, as we can not deliver our products while it is down. " \
+                          "This is the fourth time that I have called. My account number is 93203, and " \
+                          "my user name is mjones. Our company is based in Tampa, Florida."
+
+    # create an agent using LLMfx class
+    agent = LLMfx()
+
+    # copy the ip address from the Flask launch readout
+    ip_address = "http://127.0.0.1:8080"
+
+    # inserting this line below into the agent process sets the 'api endpoint' execution to "ON"
+    # all agent function calls will be deployed over the API endpoint on the remote inference server
+    # to "switch back" to local execution, comment out this line
+
+    agent.register_api_endpoint(api_endpoint=ip_address,
+                                api_key="demo-test",
+                                endpoint_on=True)
+
+    # to explicitly turn the api endpoint "on" or "off"
+    # agent.switch_endpoint_on()
+    # agent.switch_endpoint_off()
+
+    agent.load_work(customer_transcript)
+
+    # load tools individually
+    agent.load_tool("sentiment")
+    agent.load_tool("ner")
+
+    # load multiple tools
+    agent.load_tool_list(["emotions", "topics", "intent", "tags", "ratings", "answer"])
+
+    # start deploying tools and running various analytics
+
+    # first conduct three 'soft skills' initial assessments using 3 different models
+    agent.sentiment()
+    agent.emotions()
+    agent.intent()
+
+    # alternative way to execute a tool, passing the tool name as a string
+    agent.exec_function_call("ratings")
+
+    # call multiple tools concurrently
+    agent.exec_multitool_function_call(["ner", "topics", "tags"])
+
+    # the 'answer' tool is a quantized question-answering model - ask an 'inline' question
+    # the optional 'key' assigns the output to a dictionary key for easy consolidation
+    agent.answer("What is a short summary?", key="summary")
+
+    # prompting tool to ask a quick question as part of the analytics
+    response = agent.answer("What is the customer's account number and user name?", key="customer_info")
+
+    # you can 'unload_tool' to release it from memory
+    agent.unload_tool("ner")
+    agent.unload_tool("topics")
+
+    # at end of processing, show the report that was automatically aggregated by key
+    report = agent.show_report()
+
+    # displays a summary of the activity in the process
+    activity_summary = agent.activity_summary()
+
+    # list of the responses gathered
+    for i, entries in enumerate(agent.response_list):
+        print("update: response analysis: ", i, entries)
+
+    output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}
+
+    return output
+```
+
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
+
diff --git a/docs/components/agents.md b/docs/components/agents.md
new file mode 100644
index 00000000..fd3c3a80
--- /dev/null
+++ b/docs/components/agents.md
@@ -0,0 +1,136 @@
+---
+layout: default
+title: Agents
+parent: Components
+nav_order: 4
+description: agent-based workflows with function-calling SLIM models
+permalink: /components/agents
+---
+# Agents
+---
+
+Agents with Function Calls and SLIM Models 🔥
+
+llmware is designed to enable Agent- and LLM-based function calls using small language models built for local and private
+deployment, with the ability to leverage open source models to conduct complex RAG and knowledge-based workflow automation.
+
+The key elements in llmware:
+
+- **SLIM models** - 18 function-calling small language models, each optimized for a specific extraction, classification, generation, or
+summarization activity, and generating Python dictionaries and lists as output.
+
+- **LLMfx class** - enables a wide range of agent-based processes.
+
+Here is an example to get started:
+
+```python
+
+from llmware.agents import LLMfx
+
+text = ("Tesla stock fell 8% in premarket trading after reporting fourth-quarter revenue and profit that "
+        "missed analysts’ estimates. The electric vehicle company also warned that vehicle volume growth in "
+        "2024 'may be notably lower' than last year’s growth rate. Automotive revenue, meanwhile, increased "
+        "just 1% from a year earlier, partly because the EVs were selling for less than they had in the past. "
+        "Tesla implemented steep price cuts in the second half of the year around the world. In a Wednesday "
+        "presentation, the company warned investors that it’s 'currently between two major growth waves.'")
+
+# create an agent using LLMfx class
+agent = LLMfx()
+
+# load text to process
+agent.load_work(text)
+
+# load 'models' as 'tools' to be used in analysis process
+agent.load_tool("sentiment")
+agent.load_tool("extract")
+agent.load_tool("topics")
+agent.load_tool("boolean")
+
+# run function calls using different tools
+agent.sentiment()
+agent.topics()
+agent.extract(params=["company"])
+agent.extract(params=["automotive revenue growth"])
+agent.xsum()
+agent.boolean(params=["is 2024 growth expected to be strong? (explain)"])
+
+# at end of processing, show the report that was automatically aggregated by key
+report = agent.show_report()
+
+# displays a summary of the activity in the process
+activity_summary = agent.activity_summary()
+
+# list of the responses gathered
+for i, entries in enumerate(agent.response_list):
+    print("update: response analysis: ", i, entries)
+
+output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal}
+
+```
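+
+The tools loaded above map to function-calling SLIM models in the catalog. To enumerate the full set available, you can
+query the ModelCatalog - a small sketch, assuming the `list_function_call_models` lookup on the catalog:
+
+```python
+
+from llmware.models import ModelCatalog
+
+# list the function-calling (SLIM) models registered in the catalog
+tools = ModelCatalog().list_function_call_models()
+for i, tool in enumerate(tools):
+    print(f"function-calling model {i}: {tool}")
+```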
+
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
+
diff --git a/docs/architecture.md b/docs/components/components.md
similarity index 98%
rename from docs/architecture.md
rename to docs/components/components.md
index a62ef557..23cb5293 100644
--- a/docs/architecture.md
+++ b/docs/components/components.md
@@ -1,9 +1,10 @@
 ---
 layout: default
-title: Architecture
-nav_order: 5
-description: overview of the major modules and classes of LLMWare
-permalink: /architecture
+title: Components
+nav_order: 2
+has_children: true
+description: llmware key architectural components, modules and classes
+permalink: /components
 ---
 # LLMWare Architecture
 ---
diff --git a/docs/components/data_stores.md b/docs/components/data_stores.md
new file mode 100644
index 00000000..51bbc6e1
--- /dev/null
+++ b/docs/components/data_stores.md
@@ -0,0 +1,102 @@
+---
+layout: default
+title: Data Stores
+parent: Components
+nav_order: 9
+description: integrated text collection and vector database options
+permalink: /components/data_stores
+---
+# Data Stores
+---
+
+Simple-to-Scale Database Options - integrated data stores from laptop to parallelized cluster.
+
+```python
+
+from llmware.configs import LLMWareConfig
+
+# to set the collection database - mongo, sqlite, postgres
+LLMWareConfig().set_active_db("mongo")
+
+# to set the vector database (or declare when installing)
+# -- options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas
+LLMWareConfig().set_vector_db("milvus")
+
+# for fast start - no installations required
+LLMWareConfig().set_active_db("sqlite")
+LLMWareConfig().set_vector_db("chromadb")   # try also faiss and lancedb
+
+# for single postgres deployment
+LLMWareConfig().set_active_db("postgres")
+LLMWareConfig().set_vector_db("postgres")
+
+# to install mongo, milvus, postgres - see the docker-compose scripts as well as examples
+
+```
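+
+To confirm the current settings, the active database selections can be read back from the same configuration object - a
+small sketch, assuming the `get_active_db` and `get_vector_db` accessors that mirror the setters above:
+
+```python
+
+from llmware.configs import LLMWareConfig
+
+# read back the current database selections
+print("collection db: ", LLMWareConfig().get_active_db())
+print("vector db:     ", LLMWareConfig().get_vector_db())
+```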
+
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
+
diff --git a/docs/components/embedding_models.md b/docs/components/embedding_models.md
new file mode 100644
index 00000000..85e2ee9e
--- /dev/null
+++ b/docs/components/embedding_models.md
@@ -0,0 +1,144 @@
+---
+layout: default
+title: Embedding Models
+parent: Components
+nav_order: 6
+description: supported embedding models and how to use them
+permalink: /components/embedding_models
+---
+# Embedding Models
+---
+
+llmware supports 30+ embedding models out of the box in the default ModelCatalog, with easy extensibility to add other
+popular open source embedding models from HuggingFace or Sentence Transformers.
+
+To get a list of the currently supported embedding models:
+
+```python
+from llmware.models import ModelCatalog
+embedding_models = ModelCatalog().list_embedding_models()
+for i, models in enumerate(embedding_models):
+    print(f"embedding models: {i} - {models}")
+```
+
+Supported popular models include:
+- Sentence Transformers - `all-MiniLM-L6-v2`, `all-mpnet-base-v2`
+- Jina AI - `jinaai/jina-embeddings-v2-base-en`, `jinaai/jina-embeddings-v2-small-en`
+- Nomic - `nomic-ai/nomic-embed-text-v1`
+- Industry BERT - `industry-bert-insurance`, `industry-bert-contracts`, `industry-bert-asset-management`, `industry-bert-sec`, `industry-bert-loans`
+- OpenAI - `text-embedding-ada-002`, `text-embedding-3-small`, `text-embedding-3-large`
+
+We also support top embedding models from BAAI, thenlper, llmrails/ember, Google, and Cohere. We are constantly looking to add new innovative open source models to this list,
+so please let us know if you are looking for support for a specific embedding model - usually within 1-2 days, we can test it and add it to the ModelCatalog.
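+
+Any embedding model in the catalog can also be loaded and used directly, outside of a Library - a minimal sketch, assuming
+the loaded model class exposes an `embedding` method for generating vectors from text:
+
+```python
+from llmware.models import ModelCatalog
+
+# load an embedding model directly from the catalog and embed a sample text
+# (sketch - assumes an `embedding` method on the loaded model class)
+model = ModelCatalog().load_model("mini-lm-sbert")
+vector = model.embedding("Embedding models map text into dense vectors.")
+print("embedding generated - type: ", type(vector))
+```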
+
+# Using an Embedding Model
+
+Embedding models in llmware can be installed directly by `ModelCatalog().load_model("model_name")`, but in most cases,
+the name of the embedding model will be passed to the `install_new_embedding` handler in the Library class when creating a new
+embedding. Once that is completed, the embedding model is captured in the Library metadata on the LibraryCard as part of the
+embedding record for that library, and as a result, often does not need to be referenced explicitly again, e.g.,
+
+```python
+
+from llmware.library import Library
+
+library = Library().create_new_library("my_library")
+
+# parses the content from the documents in the file path, text chunks and indexes in a text collection database
+library.add_files(input_folder_path="/local/path/to/my_files", chunk_size=400, max_chunk_size=600, smart_chunking=1)
+
+# creates embeddings - and keeps synchronized records of which text chunks have been embedded to enable incremental use
+library.install_new_embedding(embedding_model_name="jinaai/jina-embeddings-v2-small-en",
+                              vector_db="milvus",
+                              batch_size=100)
+```
+
+Once the embeddings are installed on the library, you can look up the embedding status to see the updated embeddings, and confirm that
+the model has been correctly captured:
+
+```python
+
+from llmware.library import Library
+library = Library().load_library("my_library")
+embedding_record = library.get_embedding_status()
+print("\nupdate: embedding record - ", embedding_record)
+```
+
+And then you can run semantic retrievals on the Library, using the Query class in the retrieval module, e.g.:
+
+```python
+from llmware.library import Library
+from llmware.retrieval import Query
+library = Library().load_library("my_library")
+# queries are constructed by creating a Query object, and passing a library as input
+query_results = Query(library).semantic_query("my query", result_count=20)
+for qr in query_results:
+    print("my query results: ", qr)
+```
+
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
+
diff --git a/docs/components/gguf.md b/docs/components/gguf.md
new file mode 100644
index 00000000..e7088cd7
--- /dev/null
+++ b/docs/components/gguf.md
@@ -0,0 +1,186 @@
+---
+layout: default
+title: GGUF
+parent: Components
+nav_order: 14
+description: running quantized GGUF models with the packaged llama.cpp engine
+permalink: /components/gguf
+---
+# GGUF
+---
+
+llmware packages its own build of the llama.cpp backend engine to enable running quantized models in GGUF format, which provides an
+effective packaging for running small language models on both CPUs and GPUs, with fast loading and inference.
+
+The GGUF capability is implemented in the models.py module in the class `GGUFGenerativeModel`, with an extensive set of interfaces and
+configurations provided in the gguf_configs.py module (which for most users and use cases does not need to be adjusted).
+
+Using a GGUF model is the same as using any other model in the ModelCatalog, e.g.,
+
+```python
+from llmware.models import ModelCatalog
+
+gguf_model = ModelCatalog().load_model("phi-3-gguf")
+response = gguf_model.inference("What are the benefits of small specialized language models?")
+print("response: ", response)
+```
+
+# GGUF Platform Support
+
+Within the llmware library, we currently package 6 separate builds of the gguf llama.cpp engine for the following platforms:
+
+## Mac M1/M2/M3
+  - with Accelerate: "libllama_mac_metal.dylib"
+  - without Accelerate: "libllama_mac_metal_no_acc.dylib" (note: if you have an old Mac OS installed, it may not have full Accelerate support)
+  - By default on Mac M1/M2/M3, it will attempt to use the Accelerate (faster) back-end, and if that fails, it will automatically revert to the no-acc version.
+
+## Windows
+  - CUDA version
+  - CPU version
+  - Will look for CUDA drivers, and if found, will try to use the CUDA build, but if that fails, it will automatically revert to the CPU version.
+
+## Linux
+  - CUDA version
+  - CPU version
+  - Will look for CUDA drivers, and if found, will try to use the CUDA build, but if that fails, it will automatically revert to the CPU version.
+
+
+# Troubleshooting CUDA on Windows and Linux
+
+Requirement: Nvidia CUDA 12.1+
+- How to check: run `nvcc --version` and `nvidia-smi` - if not found, the drivers are either not installed or not in `$PATH` and need to be configured.
+- If you have older drivers (e.g., v11), you will need to update them.
+
+# Bring your own custom llama.cpp gguf backend
+
+If you have a unique system requirement, or are looking to optimize for a particular BLAS library with your own build, you can bring your own:
+build llama_cpp from source and apply custom build settings - or find a prebuilt llama_cpp library in the community that matches your platform. Happy to help if you share the requirements.
+
+```python
+from llmware.gguf_configs import GGUFConfigs
+GGUFConfigs().set_config("custom_lib_path", "/path/to/your/custom/llama_cpp_backend")
+
+# ... and then load and run the model as usual - the GGUF model class will look at this config and load the llama.cpp found at the custom lib path.
+```
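+
+As noted in the troubleshooting section above, the CUDA builds depend on a working Nvidia toolchain. The same checks can be
+scripted before loading a model - a generic sketch using only the Python standard library (not part of llmware):
+
+```python
+import subprocess
+
+# check for the Nvidia tools referenced in the troubleshooting notes above
+for cmd in (["nvcc", "--version"], ["nvidia-smi"]):
+    try:
+        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
+        print(f"{cmd[0]} found: ", result.stdout.splitlines()[0])
+    except (FileNotFoundError, subprocess.CalledProcessError):
+        print(f"{cmd[0]} not found - drivers may not be installed or not on $PATH")
+```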
+
+# Streaming GGUF
+
+```python
+
+""" This example illustrates how to use the stream method for GGUF models for fast streaming of inference,
+especially for real-time chat interactions.
+
+    Please note that the stream method has been implemented for GGUF models starting in llmware-0.2.13. This applies to
+any model with the GGUFGenerativeModel class, and generally includes models with names that end in "gguf".
+
+    See also the chat UI example in the UI examples folder.
+
+    We would recommend using a chat-optimized model, and have included a representative list below.
+"""
+
+
+from llmware.models import ModelCatalog
+from llmware.gguf_configs import GGUFConfigs
+
+# sets an absolute output maximum for the GGUF engine - normally set by default at 256
+GGUFConfigs().set_config("max_output_tokens", 1000)
+
+chat_models = ["phi-3-gguf",
+               "llama-2-7b-chat-gguf",
+               "llama-3-instruct-bartowski-gguf",
+               "openhermes-mistral-7b-gguf",
+               "zephyr-7b-gguf",
+               "tiny-llama-chat-gguf"]
+
+model_name = chat_models[0]
+
+# maximum output can be set optionally at any number up to the "max_output_tokens" set
+model = ModelCatalog().load_model(model_name, max_output=500)
+
+text_out = ""
+
+token_count = 0
+
+# prompt = "I am interested in gaining an understanding of the banking industry. What topics should I research?"
+prompt = "What are the benefits of small specialized LLMs?"
+
+# since model.stream provides a generator, use as follows to consume the generator
+
+for streamed_token in model.stream(prompt):
+
+    text_out += streamed_token
+    if text_out.strip():
+        print(streamed_token, end="")
+
+    token_count += 1
+
+# final output text and token count
+
+print("\n\n***total text out***: ", text_out)
+print("\n***total tokens***: ", token_count)
+```
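+
+The GGUFConfigs settings used above can also be read back to confirm the current engine configuration - a small sketch,
+assuming the `get_config` accessor that mirrors `set_config`:
+
+```python
+from llmware.gguf_configs import GGUFConfigs
+
+# read back a GGUF engine setting - mirrors the set_config call used above
+print("max_output_tokens: ", GGUFConfigs().get_config("max_output_tokens"))
+```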
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
+
diff --git a/docs/components/library.md b/docs/components/library.md
new file mode 100644
index 00000000..e6dcc25a
--- /dev/null
+++ b/docs/components/library.md
@@ -0,0 +1,150 @@
+---
+layout: default
+title: Library
+parent: Components
+nav_order: 7
+description: ingest, organize and index a knowledge collection at scale
+permalink: /components/library
+---
+# Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed.
+---
+
+Library is the main organizing construct for unstructured information in LLMWare. Users can create one large library with all types of different content, or
+create multiple libraries, with each library comprising a specific logical collection of information on a
+particular subject matter, project/case/deal, or even different accounts/users/departments.
+
+Each Library consists of the following components:
+
+1. Collection on a Database - this is the core of the Library, and is created through parsing of documents, which
+are then automatically chunked and indexed in a text collection database. This is the basis for retrieval,
+and the collection that will be used to track any number of vector embeddings that can be
+attached to a library collection.
+
+2. File archives - found in the llmware_data path, within Accounts, there is a folder structure for each Library.
+All file-based artifacts for the Library are organized in these folders, including copies of all files added to
+the library (very useful for retrieval-based applications), images extracted and indexed from the source
+documents, as well as derived artifacts such as NLP outputs, knowledge graphs and datasets.
+
+3. Library Catalog - each Library is registered in the LibraryCatalog table, with a unique library_card that has
+the key attributes and statistics of the Library.
+
+When a Library object is passed to the Parser, the parser will automatically route all information into the
+Library structure.
+
+The Library also exposes convenience methods to easily install embeddings on a library, including tracking of
+incremental progress.
+
+To parse into a Library, there is the very useful convenience method `add_files`, which will invoke the Parser,
+collate and route the files within a selected folder path, check for duplicate files, execute the parsing,
+text chunking and insertion into the database, and update all of the Library state automatically.
+
+Libraries are the main index constructs that are used in executing a Query. Pass the library object when
+constructing the Query object, and then all retrievals (text, semantic and hybrid) will be executed against
+the content in that Library only.
+
+
+```python
+
+from llmware.library import Library
+
+# to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)
+
+# step 1 - create a library, which is the 'knowledge-base container' construct
+# - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name})
+# - embeddings and queries are run against a library
+
+lib = Library().create_new_library("my_library")
+
+# step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types
+# - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB
+
+lib.add_files("/folder/path/to/my/files")
+
+# to install an embedding on a library - pick an embedding model and vector_db
+lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500)
+
+# to add a second embedding to the same library (mix-and-match models + vector db)
+lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)
+
+# easy to create multiple libraries for different projects and groups
+
+finance_lib = Library().create_new_library("finance_q4_2023")
+finance_lib.add_files("/finance_folder/")
+
+hr_lib = Library().create_new_library("hr_policies")
+hr_lib.add_files("/hr_folder/")
+
+# pull library card with key metadata - documents, text chunks, images, tables, embedding record
+lib_card = Library().get_library_card("my_library")
+
+# see all libraries
+all_my_libs = Library().get_all_library_cards()
+
+```
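+
+As described above, a Library can also be passed directly to the Parser, which will route all parser output into the
+library structure - a short sketch, assuming the `Parser(library=...)` constructor and `ingest` convenience method
+(`add_files` wraps this same flow):
+
+```python
+
+from llmware.library import Library
+from llmware.parsers import Parser
+
+# create a library and pass it to the parser - parser output is routed into the library automatically
+lib = Library().create_new_library("parsing_demo_lib")
+parser = Parser(library=lib)
+
+# routes files by type, parses, text chunks and indexes into the library's collection
+parser.ingest("/folder/path/to/my/files")
+```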
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+
+
+
+---
+
+---
+
diff --git a/docs/components/model_catalog.md b/docs/components/model_catalog.md
new file mode 100644
index 00000000..a6263c1f
--- /dev/null
+++ b/docs/components/model_catalog.md
@@ -0,0 +1,263 @@
+---
+layout: default
+title: Model Catalog
+parent: Components
+nav_order: 2
+description: access all models the same way with easy lookup
+permalink: /components/model_catalog
+---
+# Model Catalog
+
+Access all models the same way with easy lookup, regardless of underlying implementation.
+
+- 150+ models in the catalog, with 50+ RAG-optimized BLING, DRAGON and Industry BERT models
+- 18 SLIM function-calling small language models for Agent use cases
+- Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models
+- Easy to extend to add custom models - see examples
+
+Generally, all models can be identified using either the `model_name` or `display_name`, which provides some flexibility
+to expose a more "UI friendly" name or an informal short-name for a commonly-used model.
+
+The default model list is implemented in the model_configs.py module, which is then generally accessed in the models.py module through
+the `ModelCatalog` class, which also provides the ability to add models of various types, over-write the default list by loading a custom
+model catalog from a json file, and other useful interfaces into the list of models.
+
+```python
+
+from llmware.models import ModelCatalog
+from llmware.prompts import Prompt
+
+# all models accessed through the ModelCatalog
+models = ModelCatalog().list_all_models()
+
+# to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter
+my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")
+output = my_model.inference("what is the future of AI?", add_context="Here is the article to read")
+
+# to integrate model into a Prompt
+prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
+response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information")
+```
+
+# ADD a Custom GGUF to the ModelCatalog
+
+```python
+import time
+import re
+from llmware.models import ModelCatalog
+from llmware.prompts import Prompt
+
+# Step 1 - register new gguf model - we will pick the popular LLama-2-13B-chat-GGUF
+
+ModelCatalog().register_gguf_model(model_name="TheBloke/Llama-2-13B-chat-GGUF-Q2",
+                                   gguf_model_repo="TheBloke/Llama-2-13B-chat-GGUF",
+                                   gguf_model_file_name="llama-2-13b-chat.Q2_K.gguf",
+                                   prompt_wrapper="my_version_inst")
+
+# Step 2 - if the prompt_wrapper is a standard, e.g., Meta's <INST>, then no need to do anything else
+# -- however, if the model uses a custom prompt wrapper, then we need to define that too
+# -- in this case, we are going to create our "own version" of the Meta <INST> wrapper
+
+ModelCatalog().register_new_finetune_wrapper("my_version_inst", main_start="<INST>", llm_start="</INST>")
+
+# Once we have completed these two steps, we are done - and can begin to use the model like any other
+
+prompter = Prompt().load_model("TheBloke/Llama-2-13B-chat-GGUF-Q2")
+
+question_list = ["I am interested in gaining an understanding of the banking industry. What topics should I research?",
+                 "What are some tips for creating a successful business plan?",
+                 "What are the best books to read for a class on American literature?"]
+
+
+for i, entry in enumerate(question_list):
+
+    start_time = time.time()
+    print("\n")
+    print(f"query - {i + 1} - {entry}")
+
+    response = prompter.prompt_main(entry)
+
+    # Print results
+    time_taken = round(time.time() - start_time, 2)
+    llm_response = re.sub("\n\n", "\n", response['llm_response'])
+    print(f"llm_response - {i + 1} - {llm_response}")
+    print(f"time_taken - {i + 1} - {time_taken}")
+```
+
+# ADD an Ollama Model
+
+```python
+
+from llmware.models import ModelCatalog
+
+# Step 1 - register your Ollama models in llmware ModelCatalog
+# -- these two lines will register: llama2 and mistral models
+# -- note: assumes that you have previously cached and installed both of these models with ollama locally
+
+# register llama2
+ModelCatalog().register_ollama_model(model_name="llama2", model_type="chat", host="localhost", port=11434)
+
+# register mistral - note: if you are using ollama defaults, then OK to register with ollama model name only
+ModelCatalog().register_ollama_model(model_name="mistral")
+
+# optional - confirm that model was registered
+my_new_model_card = ModelCatalog().lookup_model_card("llama2")
+print("\nupdate: confirming - new ollama model card - ", my_new_model_card)
+
+# Step 2 - start using the Ollama model like any other model in llmware
+
+print("\nupdate: calling ollama llama 2 model ...")
+
+model = ModelCatalog().load_model("llama2")
+response = model.inference("why is the sky blue?")
+
+print("update: example #1 - ollama llama 2 response - ", response)
+
+# Tip: if you are loading the 'llama2' chat model from Ollama, note that it is already included in
+# the llmware model catalog under a different name, "TheBloke/Llama-2-7B-Chat-GGUF"
+# the llmware model name maps to the original HuggingFace repository, and is a nod to "TheBloke" who has
+# led the popularization of GGUF - and is responsible for creating most of the GGUF model versions.
+# -- llmware uses the "Q4_K_M" model by default, while Ollama generally prefers "Q4_0"
+
+print("\nupdate: calling Llama-2-7B-Chat-GGUF in llmware catalog ...")
+
+model = ModelCatalog().load_model("TheBloke/Llama-2-7B-Chat-GGUF")
+response = model.inference("why is the sky blue?")
+
+print("update: example #1 - [compare] - llmware / Llama-2-7B-Chat-GGUF response - ", response)
+
+# Now, let's try the Ollama Mistral model with a context passage
+
+model2 = ModelCatalog().load_model("mistral")
+
+context_passage = ("NASA’s rover Perseverance has gathered data confirming the existence of ancient lake "
+                   "sediments deposited by water that once filled a giant basin on Mars called Jezero Crater, "
+                   "according to a study published on Friday. The findings from ground-penetrating radar "
+                   "observations conducted by the robotic rover substantiate previous orbital imagery and "
+                   "other data leading scientists to theorize that portions of Mars were once covered in water "
+                   "and may have harbored microbial life. The research, led by teams from the University of "
+                   "California at Los Angeles (UCLA) and the University of Oslo, was published in the "
+                   "journal Science Advances. "
+                   "It was based on subsurface scans taken by the car-sized, six-wheeled "
+                   "rover over several months of 2022 as it made its way across the Martian surface from the "
+                   "crater floor onto an adjacent expanse of braided, sedimentary-like features resembling, "
+                   "from orbit, the river deltas found on Earth.")
+
+response = model2.inference("What are the top 3 points?", add_context=context_passage)
+
+print("\nupdate: calling ollama mistral model ...")
+
+print("update: example #2 - ollama mistral response - ", response)
+
+# Step 3 - using the ollama discovery API - optional
+
+discovery = model2.discover_models()
+print("\nupdate: example #3 - checking ollama model manifest list: ", discovery)
+
+if len(discovery) > 0:
+    # note: assumes that you have at least one model registered in ollama - otherwise, this may throw an error
+    for i, models in enumerate(discovery["models"]):
+        print("ollama models: ", i, models)
+```
+
+# Add a LM Studio Model
+
+```python
+from llmware.models import ModelCatalog
+from llmware.prompts import Prompt
+
+
+# one step process: add the open chat model to the Model Registry
+# key params:
+#   model_name      =   "my_open_chat_model1"
+#   api_base        =   uri_path to the proposed endpoint
+#   prompt_wrapper  =   alpaca | <INST> | chat_ml | hf_chat | human_bot
+#                       <INST>      -> Llama2-Chat
+#                       hf_chat     -> Zephyr-Mistral
+#                       chat_ml     -> OpenHermes - Mistral
+#                       human_bot   -> Dragon models
+#   model_type      =   "chat" (alternative: "completion")
+
+ModelCatalog().register_open_chat_model("my_open_chat_model1",
+                                        api_base="http://localhost:1234/v1",
+                                        prompt_wrapper="<INST>",
+                                        model_type="chat")
+
+# once registered, you can invoke like any other model in llmware
+
+prompter = Prompt().load_model("my_open_chat_model1")
+response = prompter.prompt_main("What is the future of AI?")
+
+
+# you can (optionally) register multiple open chat models with different api_base and model attributes
+
+ModelCatalog().register_open_chat_model("my_open_chat_model2",
+                                        api_base="http://localhost:5678/v1",
+                                        prompt_wrapper="hf_chat",
+                                        model_type="chat")
+```
+
+
+Need help or have questions?
+============================
+
+Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware).
+
+Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions).
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+<ul class="list-style-none">
+{% for contributor in site.github.contributors %}
+  <li class="d-inline-block mr-1">
+    <a href="{{ contributor.html_url }}">{{ contributor.login }}</a>
+  </li>
+{% endfor %}
+</ul>
+ + +--- +
+---
+
diff --git a/docs/components/prompt_with_sources.md b/docs/components/prompt_with_sources.md
new file mode 100644
index 00000000..07a7fb43
--- /dev/null
+++ b/docs/components/prompt_with_sources.md
@@ -0,0 +1,186 @@
+---
+layout: default
+title: Prompt with Sources
+parent: Components
+nav_order: 10
+description: the easiest way to combine knowledge retrieval with LLM inference
+permalink: /components/prompt_with_sources
+---
+# Prompt with Sources
+---
+Prompt with Sources is the easiest way to combine knowledge retrieval with an LLM inference, and provides several useful high-level methods to
+easily integrate a retrieval/query/parsing step into a prompt to be used as a source for running an inference on a model.
+
+This is best illustrated with a simple example:
+
+```python
+
+from llmware.prompts import Prompt
+
+# build a prompt and attach a model
+prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
+
+# add_source_document method: accepts any supported document type, parses the file, and creates text chunks
+# if a query is passed, then it will run a quick in-memory filtering search against the text chunks
+# the text chunks are packaged into sources with all of the accompanying metadata from the file, and made
+# available automatically in batches to be used in prompting -
+
+source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query")
+
+# to run inference with 'prompt with sources' -> source will be automatically added to the prompt
+responses = prompter.prompt_with_source("my query")
+
+# depending upon the size of the source (and batching relative to the model context window), there may be more than
+# a single inference run, so unpack potentially multiple responses
+
+for i, response in enumerate(responses):
+    print("response: ", i, response)
+```
+
+# FACT CHECKING
+
+Using prompt_with_source also provides integrated fact-checking methods that use the packaged source information to validate key
+elements from the llm_response.
+
+```python
+from llmware.prompts import Prompt
+
+prompter = Prompt().load_model("bling-answer-tool", temperature=0.0, sample=False)
+
+# the document is parsed, text-chunked, and then filtered by the query
+source = prompter.add_source_document("/local/folder/path", "my_document.pdf", query="exact filter query")
+
+# calling the LLM with 'source' information from the document automatically packaged into the prompt
+responses = prompter.prompt_with_source("my question to the document", prompt_name="default_with_context")
+
+# run several fact checks
+
+# checks for numbers match
+ev_numbers = prompter.evidence_check_numbers(responses)
+
+# looks for statistical overlap to identify potential sources for the llm response
+ev_sources = prompter.evidence_check_sources(responses)
+
+# builds set of comparison stats between the llm_response and the sources
+ev_stats = prompter.evidence_comparison_stats(responses)
+
+# identifies if a response is a "not found" response
+z = prompter.classify_not_found_response(responses, parse_response=True, evidence_match=True, ask_the_model=False)
+
+for r, response in enumerate(responses):
+    print("LLM Response: ", response["llm_response"])
+    print("Numbers: ", ev_numbers[r]["fact_check"])
+    print("Sources: ", ev_sources[r]["source_review"])
+    print("Stats: ", ev_stats[r]["comparison_stats"])
+    print("Not Found Check: ", z[r])
+```
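+
+After an inference cycle is complete, the attached sources can be reset before starting a new analysis with the same
+prompt object - a short sketch, assuming the `clear_source_materials` method on the Prompt class:
+
+```python
+from llmware.prompts import Prompt
+
+prompter = Prompt().load_model("bling-answer-tool")
+
+# ... attach sources and run prompt_with_source as shown above ...
+
+# reset the attached sources before starting a new analysis
+prompter.clear_source_materials()
+```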
Results - Two Options + +```python + +from llmware.prompts import Prompt +from llmware.retrieval import Query +from llmware.library import Library + +# build a prompt +prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") + +# Option A - run a query and then add the query_results to the prompt +my_lib = Library().load_library("my_library") +results = Query(my_lib).query("my query") + +source2 = prompter.add_source_query_results(results) + +# Option B - run a new query against a library and load directly into a prompt +source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15) + +``` + +# Add Other Sources + +```python + +from llmware.prompts import Prompt + +# build a prompt +prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") + +# add wikipedia articles as a source +wiki_source = prompter.add_source_wikipedia("topic", article_count=5, query="filter among retrieved articles") + +# add a website as a source +website_source = prompter.add_source_website("my_url", query="filter among website") + +# add an entire library (should be small, e.g., just a couple of documents) +source = prompter.add_source_library("my_library") + +``` + +A consolidated end-to-end sketch combining a source with fact-checking appears at the end of this page. + +Need help or have questions? +============================ + +Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). + +Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. +[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! 
+{% for contributor in site.github.contributors %} + {{ contributor.login }} +{% endfor %} 
+ + +--- +
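The methods on this page can also be combined into a single end-to-end flow. Below is a minimal sketch, using only the methods shown above; the folder path and file name are hypothetical placeholders:

```python
from llmware.prompts import Prompt

# end-to-end sketch: parse a document, prompt with the source, then fact-check the response
# note: "/local/folder/path" and "my_document.pdf" are hypothetical placeholders
prompter = Prompt().load_model("bling-answer-tool", temperature=0.0, sample=False)

# parse and text-chunk the document, filtering the chunks against a simple query
source = prompter.add_source_document("/local/folder/path", "my_document.pdf", query="base salary")

# run the inference - the filtered source is packaged into the prompt automatically
responses = prompter.prompt_with_source("What is the annual base salary?")

# validate the numbers in the response against the packaged source
ev_numbers = prompter.evidence_check_numbers(responses)

for i, response in enumerate(responses):
    print("llm response: ", response["llm_response"])
    print("numbers check: ", ev_numbers[i]["fact_check"])
```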
+--- + diff --git a/docs/components/query.md b/docs/components/query.md new file mode 100644 index 00000000..22bdc22e --- /dev/null +++ b/docs/components/query.md @@ -0,0 +1,115 @@ +--- +layout: default +title: Query +parent: Components +nav_order: 8 +description: overview of the major modules and classes of LLMWare +permalink: /components/query +--- +# Retrieval & Query +--- + +Query libraries with a mix of text, semantic, hybrid, metadata, and custom filters. The retrieval.py module implements the +`Query` class, which is the primary way that search and retrieval is performed. Each `Query` object requires that a Library is +passed as a mandatory parameter in the constructor. The Query object will operate against that +Library, and has access to all of the Library's specific attributes, metadata and methods. + +Retrievals in llmware leverage the Library abstraction as the primary unit against which a particular query or retrieval is +executed. This provides the ability to have multiple distinct knowledge bases, potentially aligned to different use cases, and/or +users, accounts and permissions. + +# Executing Queries + +```python +from llmware.retrieval import Query +from llmware.library import Library + +# step 1 - load a previously created library +lib = Library().load_library("my_library") + +# step 2 - create a query object +q = Query(lib) + +# step 3 - run lots of different queries (many other options in the examples) + +# basic text query +results1 = q.text_query("text query", result_count=20, exact_mode=False) + +# semantic query +results2 = q.semantic_query("semantic query", result_count=10) + +# combining a text query restricted to only certain documents in the library and an "exact" match to the query +results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True) + +# to apply a specific embedding (if multiple are installed on the library), pass the names when creating the query object +q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus") +results4 = q2.semantic_query("new semantic query") +``` + +A short sketch combining embedding installation with semantic retrieval appears at the end of this page. + +Need help or have questions? +============================ + +Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). + +Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. 
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! 
+{% for contributor in site.github.contributors %} + {{ contributor.login }} +{% endfor %} 
+ + +--- +
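As referenced above, here is a short sketch that installs an embedding on a library and then runs a semantic retrieval against it. "my_library" is assumed to be a previously created library, and the embedding model and vector db names follow the examples in these docs:

```python
from llmware.library import Library
from llmware.retrieval import Query

# sketch: create a vector embedding on a library, then query it semantically
lib = Library().load_library("my_library")

# install the embedding - this vectorizes the library's text chunks into the chosen vector db
lib.install_new_embedding(embedding_model_name="mini_lm_sbert", vector_db="chromadb", batch_size=100)

# run a semantic query against the new embedding
q = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="chromadb")
results = q.semantic_query("key contract terms", result_count=10)

for i, r in enumerate(results):
    # each result is a dictionary with the matching text and source metadata
    print("result: ", i, r)
```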
+--- + diff --git a/docs/components/rag_optimized_models.md b/docs/components/rag_optimized_models.md new file mode 100644 index 00000000..8f150f06 --- /dev/null +++ b/docs/components/rag_optimized_models.md @@ -0,0 +1,326 @@ +--- +layout: default +title: RAG Optimized Models +parent: Components +nav_order: 3 +description: overview of the major modules and classes of LLMWare +permalink: /components/rag_optimized_models +--- +# RAG Optimized Models +--- + +RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally. + +## Meet our Models + +- **SLIM model series:** small, specialized models fine-tuned for function calling and multi-step, multi-model Agent workflows. +- **DRAGON model series:** Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models. +- **BLING model series:** Small CPU-based RAG-optimized, instruct-following 1B-3B parameter models. +- **Industry BERT models:** out-of-the-box custom trained sentence transformer embedding models fine-tuned for the following industries: Insurance, Contracts, Asset Management, SEC. +- **GGUF Quantization:** we provide 'gguf' and 'tool' versions of many SLIM, DRAGON and BLING models, optimized for CPU deployment. + + + +```python +""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both +Pytorch and GGUF versions. """ + +import time +from llmware.prompts import Prompt + + +def hello_world_questions(): + + test_list = [ + + {"query": "What is the total amount of the invoice?", + "answer": "$22,500.00", + "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " + "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" + " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" + "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." + "If you have any questions concerning this invoice, contact Bia Hermes. " + "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, + + {"query": "What was the amount of the trade surplus?", + "answer": "62.4 billion yen ($416.6 million)", + "context": "Japan’s September trade balance swings into surplus, surprising expectations" + "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " + "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " + "billion yen. Data from Japan’s customs agency revealed that exports in September " + "increased 4.3% year on year, while imports slid 16.3% compared to the same period " + "last year. According to FactSet, exports to Asia fell for the ninth straight month, " + "which reflected ongoing China weakness. Exports were supported by shipments to " + "Western markets, FactSet added. — Lim Hui Jie"}, + + {"query": "When did the LISP machine market collapse?", + "answer": "1987.", + "context": "The attendees became the leaders of AI research in the 1960s." + " They and their students produced programs that the press described as 'astonishing': " + "computers were learning checkers strategies, solving word problems in algebra, " + "proving logical theorems and speaking English. By the middle of the 1960s, research in " + "the U.S. was heavily funded by the Department of Defense and laboratories had been " + "established around the world. 
Herbert Simon predicted, 'machines will be capable, " + "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " + "'within a generation ... the problem of creating 'artificial intelligence' will " + "substantially be solved'. They had, however, underestimated the difficulty of the problem. " + "Both the U.S. and British governments cut off exploratory research in response " + "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " + "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " + "as proving that artificial neural networks approach would never be useful for solving " + "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " + "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " + "AI research was revived by the commercial success of expert systems, a form of AI " + "program that simulated the knowledge and analytical skills of human experts. By 1985, " + "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " + "generation computer project inspired the U.S. and British governments to restore funding " + "for academic research. However, beginning with the collapse of the Lisp Machine market " + "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, + + {"query": "What is the current rate on 10-year treasuries?", + "answer": "4.58%", + "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " + "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " + "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " + "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " + "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " + "jobs. However, wages rose less than expected last month. Stocks posted a stunning " + "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " + "At its session low, the Dow had fallen as much as 198 points; it surged by more than " + "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " + "their lowest points in the day. Traders were unclear of the reason for the intraday " + "reversal. Some noted it could be the softer wage number in the jobs report that made " + "investors rethink their earlier bearish stance. Others noted the pullback in yields from " + "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " + "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " + "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " + "near its highest level in 14 years. The benchmark rate later eased from those levels, but " + "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " + "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " + "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " + "Capital Advisors. 
'We’ve had a lot of weakness in the market in recent weeks, and potentially " + "some oversold conditions.'"}, + + {"query": "Is the expected gross margin greater than 70%?", + "answer": "Yes, between 71.5% and 72.5%", + "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" + "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " + "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " + "50 basis points. GAAP and non-GAAP operating expenses are expected to be " + "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP " + "other income and expense are expected to be an income of approximately $100 " + "million, excluding gains and losses from non-affiliated investments. GAAP and " + "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." + "Highlights NVIDIA achieved progress since its previous earnings announcement " + "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " + "up 141% from the previous quarter and up 171% from a year ago. Announced that the " + "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " + "this quarter, with a second-generation version with HBM3e memory expected to ship " + "in Q2 of calendar 2024. "}, + + {"query": "What is Bank of America's rating on Target?", + "answer": "Buy", + "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from " + "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " + "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " + "soared more than 22%. Hotter than expected September consumer price index, consumer " + "inflation. The Social Security Administration announced a 3.2% cost-of-living " + "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " + "Cites consumer price index showing sticky retail inflation for the fourth time " + "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " + "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " + "Merchandising better. Freight and transportation better. Target to report quarter " + "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " + "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " + "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " + "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating. " + "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " + "Market email newsletter for free. Barclays cuts price targets on consumer products: " + "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " + "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. " + "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers " + "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek " + "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " + "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " + "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " + "Spotify (SPOT), says Morgan Stanley. 
The analysts hike price target to $190 per share " + "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " + "overweight (buy) rating but lowers price target to $139 per share from $150. " + "Sees “still challenging” environment into third-quarter print. The Club owns shares " + "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " + "to overweight from equal weight (buy from hold) but lowers price target to $224 per " + "share from $230. Risk reward upgrade. Best visibility of utility scale names."}, + + {"query": "What was the rate of decline in 3rd quarter sales?", + "answer": "20% year-on-year.", + "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " + "third quarter earnings that plunged. The Finnish telecommunications giant said that " + "it will reduce its cost base and increase operation efficiency to “address the " + "challenging market environment. The substantial layoffs come after Nokia reported " + "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " + "the period plunged by 69% year-on-year to 133 million euros."}, + + {"query": "What is a list of the key points?", + "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " + "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " + "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n" + "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", + "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " + "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " + "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " + "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " + "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " + "jobs. However, wages rose less than expected last month. Stocks posted a stunning " + "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " + "At its session low, the Dow had fallen as much as 198 points; it surged by more than " + "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " + "their lowest points in the day. Traders were unclear of the reason for the intraday " + "reversal. Some noted it could be the softer wage number in the jobs report that made " + "investors rethink their earlier bearish stance. Others noted the pullback in yields from " + "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " + "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " + "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " + "near its highest level in 14 years. The benchmark rate later eased from those levels, but " + "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " + "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " + "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " + "Capital Advisors. 
'We’ve had a lot of weakness in the market in recent weeks, and potentially " + "some oversold conditions.'"} + + ] + + return test_list + + +# this is the main script to be run + +def bling_meets_llmware_hello_world(model_name): + + t0 = time.time() + + # load the questions + test_list = hello_world_questions() + + print(f"\n > Loading Model: {model_name}...") + + # load the model + prompter = Prompt().load_model(model_name) + + t1 = time.time() + print(f"\n > Model {model_name} load time: {t1-t0} seconds") + + for i, entries in enumerate(test_list): + + print(f"\n{i+1}. Query: {entries['query']}") + + # run the prompt + output = prompter.prompt_main(entries["query"], context=entries["context"], + prompt_name="default_with_context", temperature=0.30) + + # print out the results + llm_response = output["llm_response"].strip("\n") + print(f"LLM Response: {llm_response}") + print(f"Gold Answer: {entries['answer']}") + print(f"LLM Usage: {output['usage']}") + + t2 = time.time() + + print(f"\nTotal processing time: {t2-t1} seconds") + + return 0 + + +if __name__ == "__main__": + + # list of 'rag-instruct' laptop-ready small bling models on HuggingFace + + pytorch_models = ["llmware/bling-1b-0.1", # most popular + "llmware/bling-tiny-llama-v0", # fastest + "llmware/bling-1.4b-0.1", + "llmware/bling-falcon-1b-0.1", + "llmware/bling-cerebras-1.3b-0.1", + "llmware/bling-sheared-llama-1.3b-0.1", + "llmware/bling-sheared-llama-2.7b-0.1", + "llmware/bling-red-pajamas-3b-0.1", + "llmware/bling-stable-lm-3b-4e1t-v0", + "llmware/bling-phi-3" # most accurate (and newest) + ] + + # Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM + gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"] + + # try a model from either the pytorch or gguf model list + # the newest (and most accurate) is 'bling-phi-3-gguf' + + bling_meets_llmware_hello_world(gguf_models[0]) + + # check out the model card on Huggingface for RAG benchmark test performance results and other useful information +``` + +A minimal single-inference sketch appears at the end of this page. + +Need help or have questions? +============================ + +Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). + +Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. 
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! 
+{% for contributor in site.github.contributors %} + {{ contributor.login }} +{% endfor %} 
+ + +--- +
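As a complement to the larger test script above, here is a minimal single-inference sketch. The context string is illustrative, and the model name is one of the quantized 'tool' models listed above:

```python
from llmware.prompts import Prompt

# minimal sketch: one question-over-context inference with a quantized RAG model
# the context string below is illustrative sample text
context = ("Services Vendor Inc. Front End Engineering Service $5,000.00. "
           "Back End Engineering Service $7,500.00. Total Amount $22,500.00. "
           "Payment is due within 30 days.")

# deterministic settings work well for fact-based question-answering
prompter = Prompt().load_model("bling-answer-tool", temperature=0.0, sample=False)

output = prompter.prompt_main("What is the total amount of the invoice?",
                              context=context,
                              prompt_name="default_with_context")

print("llm response: ", output["llm_response"])
print("usage: ", output["usage"])
```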
+--- + diff --git a/docs/release_history.md b/docs/components/release_history.md similarity index 83% rename from docs/release_history.md rename to docs/components/release_history.md index f4fd36b3..9ffab872 100644 --- a/docs/release_history.md +++ b/docs/components/release_history.md @@ -1,9 +1,10 @@ --- layout: default title: Release History -nav_order: 7 +parent: Components +nav_order: 15 description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. -permalink: /release_history +permalink: /components/release_history --- Release History --- @@ -24,10 +25,16 @@ All wheels are built and tested on: **Release Notes** ---**0.2.14** released in the week of May 19, 2024 - continued clean up and updating of dependencies - changes in download from Huggingface Hub, associated with changes from huggingface_hub API changes - there are some 'future warnings' that are coming from within the HuggingFace code. If any problems, please raise issue. +--**0.3.0** released in the week of June 4, 2024 - continued pruning of required dependencies with split of python dependencies into a small minimal set of requirements (~10 in requirements.txt) that are included in the pip install, with an additional set of optional dependencies provided as 'extras', reflected in both the requirements_extras.txt file, and available over pip with the added instruction - `pip3 install 'llmware[full]'`. Notably, commonly used libraries such as transformers, torch and openai are now in the 'extras' as most llmware use cases do not require them, and this greatly simplifies the ability to install llmware. The `welcome_to_llmware.sh` and `welcome_to_llmware_windows.sh` have also been updated to install both the 'core' and 'extra' set of requirements. Other subtle, but significant, architectural changes include offering more extensibility for adding new model classes, configurable global base model methods for post_init and register, a new InferenceHistory state manager, and enhanced logging options. ---**0.2.13** released in the week of May 12, 2024 - clean up of dependencies in both requirements.txt and Setup (PyPi) - install of vector db python sdk (e.g., pymilvus, chromadb, etc) is now required as a separate step outside of the pip3 install llmware - attempt to keep dependency matrix as simple as possible and avoid potential dependency conflicts on install, especially for packages which in turn have a large number of dependencies. If you run into any issues with install dependencies, please raise an issue. +--**0.2.15** released in the week of May 20, 2024 - removed pytorch dependency as a global import, and shifted to dynamically loading of torch in the event that it is called in a specific model class. This enables running most of llmware code and examples without pytorch or transformers loaded. The main areas of torch (and transformers) dependency is in using HFGenerativeModels and HFEmbeddingModels. + - note: we have seen some new errors caused with Pytorch 2.3 - which are resolved by down-leveling to `pip3 install torch==2.1` + - note: there are a couple of new warnings from within transformers and huggingface_hub libraries - these can be safely ignored. We have seen that keeping `local_dir_use_symlinks = False` when pulling model artifacts from Huggingface is still the safer option in some environments. 
+ +--**0.2.13** released in the week of May 12, 2024 - clean up of dependencies in both requirements.txt and Setup (PyPi) - install of vector db python sdk (e.g., pymilvus, chromadb, etc) is now required as a separate step outside of the pip3 install llmware - attempt to keep dependency matrix as simple as possible and avoid potential dependency conflicts on install, especially for packages which in turn have a large number of dependencies. If you run into any issues with install dependencies, please raise an issue. + + --**0.2.12** released in the week of May 5, 2024 - added Python 3.12 support, and deprecated the use of faiss for v3.12+. We have changed the "Fast Start" no-install option to use chromadb or lancedb rather than faiss. Refactoring of code especially with Datasets, Graph and Web Services as separate modules. --**0.2.11** released in the week of April 29, 2024 - updated GGUF libs for Phi-3 and Llama-3 support, and added new prebuilt shared libraries to support WhisperCPP. We are also deprecating support for Mac x86 going forward - will continue to support on most major components but not all new features going forward will be built specifically for Mac x86 (which Apple stopped shipping in 2022). Our intent is to keep narrowing our testing matrix to provide better support on key platforms. We have also added better safety checks for older versions of Mac OS running on M1/M2/M3 (no_acc option in GGUF and Whisper libs), as well as a custom check to find CUDA drivers on Windows (independent of Pytorch). diff --git a/docs/components/slim_models.md b/docs/components/slim_models.md new file mode 100644 index 00000000..2f752e1d --- /dev/null +++ b/docs/components/slim_models.md @@ -0,0 +1,304 @@ +--- +layout: default +title: SLIM Models +parent: Components +nav_order: 5 +description: overview of the major modules and classes of LLMWare +permalink: /components/slim_models +--- +# SLIM Models - Function Calling with Small Language Models +--- + +Generally, function-calling is a specialized capability of frontier language models, such as OpenAI GPT4. 
+ +We have adapted this concept to small language models through SLIMs (Structured Language Instruction Models), +which are 'single function' models fine-tuned to accept three main inputs to construct a prompt (see the common prompting structure below). + +As of June 2024, there are 18 distinct SLIM function calling models, with many more on the way, covering most common +extraction, classification, and summarization tasks: + +**Models List** +If you would like more information about any of the SLIM models, please check out their model card: + +- extract - extract custom keys - [slim-extract](https://www.huggingface.co/llmware/slim-extract) & [slim-extract-tool](https://www.huggingface.co/llmware/slim-extract-tool) +- summary - summarize function call - [slim-summary](https://www.huggingface.co/llmware/slim-summary) & [slim-summary-tool](https://www.huggingface.co/llmware/slim-summary-tool) +- xsum - title/headline function call - [slim-xsum](https://www.huggingface.co/llmware/slim-xsum) & [slim-xsum-tool](https://www.huggingface.co/llmware/slim-xsum-tool) +- ner - extract named entities - [slim-ner](https://www.huggingface.co/llmware/slim-ner) & [slim-ner-tool](https://www.huggingface.co/llmware/slim-ner-tool) +- sentiment - evaluate sentiment - [slim-sentiment](https://www.huggingface.co/llmware/slim-sentiment) & [slim-sentiment-tool](https://www.huggingface.co/llmware/slim-sentiment-tool) +- topics - generate topic - [slim-topics](https://www.huggingface.co/llmware/slim-topics) & [slim-topics-tool](https://www.huggingface.co/llmware/slim-topics-tool) +- sa-ner - combo model (sentiment + named entities) - [slim-sa-ner](https://www.huggingface.co/llmware/slim-sa-ner) & [slim-sa-ner-tool](https://www.huggingface.co/llmware/slim-sa-ner-tool) +- boolean - provides a yes/no output with explanation - [slim-boolean](https://www.huggingface.co/llmware/slim-boolean) & [slim-boolean-tool](https://www.huggingface.co/llmware/slim-boolean-tool) +- ratings - apply 1 (low) - 5 (high) rating - [slim-ratings](https://www.huggingface.co/llmware/slim-ratings) & [slim-ratings-tool](https://www.huggingface.co/llmware/slim-ratings-tool) +- emotions - assess emotions - [slim-emotions](https://www.huggingface.co/llmware/slim-emotions) & [slim-emotions-tool](https://www.huggingface.co/llmware/slim-emotions-tool) +- tags - auto-generate list of tags - [slim-tags](https://www.huggingface.co/llmware/slim-tags) & [slim-tags-tool](https://www.huggingface.co/llmware/slim-tags-tool) +- tags-3b - enhanced auto-generation tagging model - [slim-tags-3b](https://www.huggingface.co/llmware/slim-tags-3b) & [slim-tags-3b-tool](https://www.huggingface.co/llmware/slim-tags-3b-tool) +- intent - identify intent - [slim-intent](https://www.huggingface.co/llmware/slim-intent) & [slim-intent-tool](https://www.huggingface.co/llmware/slim-intent-tool) +- category - high-level category - [slim-category](https://www.huggingface.co/llmware/slim-category) & [slim-category-tool](https://www.huggingface.co/llmware/slim-category-tool) +- nli - assess if evidence supports conclusion - [slim-nli](https://www.huggingface.co/llmware/slim-nli) & [slim-nli-tool](https://www.huggingface.co/llmware/slim-nli-tool) +- sql - convert text into sql - [slim-sql](https://www.huggingface.co/llmware/slim-sql) & [slim-sql-tool](https://www.huggingface.co/llmware/slim-sql-tool) + +You may also want to check out these quantized 'answer' tools, which work well in conjunction with SLIMs for question-answer and summarization: +- bling-stablelm-3b-tool - 3b quantized RAG model - [bling-stablelm-3b-gguf](https://www.huggingface.co/llmware/bling-stablelm-3b-gguf) +- bling-answer-tool - 1b quantized RAG 
model - [bling-answer-tool](https://www.huggingface.co/llmware/bling-answer-tool) +- dragon-yi-answer-tool - 6b quantized RAG model - [dragon-yi-answer-tool](https://www.huggingface.co/llmware/dragon-yi-answer-tool) +- dragon-mistral-answer-tool - 7b quantized RAG model - [dragon-mistral-answer-tool](https://www.huggingface.co/llmware/dragon-mistral-answer-tool) +- dragon-llama-answer-tool - 7b quantized RAG model - [dragon-llama-answer-tool](https://www.huggingface.co/llmware/dragon-llama-answer-tool) + +All SLIM models have a common prompting structure: + +Inputs: + -- text passage - this is the core passage or piece of text that you would like the model to assess + -- function - classify, extract, generate - this is handled by default by the model class, so usually does + not need to be explicitly declared - but is an option for SLIMs that support more than one function + -- params - depends upon the model, used to configure/guide the behavior of the function call - optional for + some SLIMs + +Outputs: + -- structured python output, generally either a dictionary or list + +Main objectives: + -- enable function calling with small, locally-running models, + -- simplify prompts by defining specific functions and fine-tuning the model to respond accordingly + without 'prompt magic', and + -- standardize outputs so that they can be handled programmatically as part of a multi-step workflow. + + +```python + + +from llmware.models import ModelCatalog + + +def discover_slim_models(): + + """ Discover a list of SLIM tools in the Model Catalog. + + -- SLIMs are available in both traditional Pytorch and quantized GGUF packages. + -- Generally, we train/fine-tune in Pytorch and then package in 4-bit quantized GGUF for inference. + -- By default, we designate the GGUF versions with 'tool' or 'gguf' in their names. + -- GGUF versions are generally faster to load, faster for inference and use less memory in most environments.""" + + tools = ModelCatalog().list_llm_tools() + tool_map = ModelCatalog().get_llm_fx_mapping() + + print("\nList of SLIM model tools (GGUF) in the ModelCatalog\n") + + for i, tool in enumerate(tools): + model_card = ModelCatalog().lookup_model_card(tool_map[tool]) + print(f"{i} - tool: {tool} - " + f"model_name: {model_card['model_name']} - " + f"model_family: {model_card['model_family']}") + + return 0 + + +def hello_world_slim(): + + """ SLIM models can be identified in the ModelCatalog like any llmware model. Instead of using the standard + inference method, SLIM models are used with the function_call method, which prepares a special prompt + instruction and takes optional parameters. + + This example shows a series of function calls with different SLIM models. + + Please note that the first time they are used, the models will be pulled from the llmware Huggingface + repository, which will take a couple of minutes. Future calls will be much faster once the models are cached locally. """ + + print("\nExecuting Function Call Inferences with SLIMs\n") + + # Sentiment Analysis + + passage1 = ("This is one of the best quarters we can remember for the industrial sector " + "with significant growth across the board in new order volume, as well as price " + "increases in excess of inflation. We continue to see very strong demand, especially " + "in Asia and Europe. 
Accordingly, we remain bullish on the tier 1 suppliers and would " + "be accumulating more stock on any dips.") + + # here are the two key lines of code + model = ModelCatalog().load_model("slim-sentiment-tool") + response = model.function_call(passage1) + + print("sentiment response: ", response['llm_response']) + + # Named Entity Recognition + + passage2 = "Michael Johnson was a famous Olympic sprinter from the U.S. in the early 2000s." + + model = ModelCatalog().load_model("slim-ner-tool") + response = model.function_call(passage2) + + print("ner response: ", response['llm_response']) + + # Extract anything with Slim-extract + + passage3 = ("Adobe shares tumbled as much as 11% in extended trading Thursday after the design software maker " + "issued strong fiscal first-quarter results but came up slightly short on quarterly revenue guidance. " + "Here’s how the company did, compared with estimates from analysts polled by LSEG, formerly known as Refinitiv: " + "Earnings per share: $4.48 adjusted vs. $4.38 expected Revenue: $5.18 billion vs. $5.14 billion expected " + "Adobe’s revenue grew 11% year over year in the quarter, which ended March 1, according to a statement. " + "Net income decreased to $620 million, or $1.36 per share, from $1.25 billion, or $2.71 per share, " + "in the same quarter a year ago. During the quarter, Adobe abandoned its $20 billion acquisition of " + "design software startup Figma after U.K. regulators found competitive concerns. The company paid " + "Figma a $1 billion termination fee.") + + model = ModelCatalog().load_model("slim-extract-tool") + response = model.function_call(passage3, function="extract", params=["revenue growth"]) + + print("extract response: ", response['llm_response']) + + # Generate questions with Slim-Q-Gen + + model = ModelCatalog().load_model("slim-q-gen-tiny-tool", temperature=0.2, sample=True) + # supported params - "question", "multiple choice", "boolean" + response = model.function_call(passage3, params=['multiple choice']) + + print("question generation response: ", response['llm_response']) + + # Generate topic + + model = ModelCatalog().load_model("slim-topics-tool") + response = model.function_call(passage3) + + print("topics response: ", response['llm_response']) + + # Generate headline summary with slim-xsum + model = ModelCatalog().load_model("slim-xsum-tool", temperature=0.0, sample=False) + response = model.function_call(passage3) + + print("xsum response: ", response['llm_response']) + + # Generate boolean with optional '(explain)` in parameter + model = ModelCatalog().load_model("slim-boolean-tool") + response = model.function_call(passage3, params=["Did Adobe revenue increase? (explain)"]) + + print("boolean response: ", response['llm_response']) + + # Generate tags + model = ModelCatalog().load_model("slim-tags-tool", temperature=0.0, sample=False) + response = model.function_call(passage3) + + print("tags response: ", response['llm_response']) + + return 0 + + +def using_logits_and_integrating_into_process(): + + """ This example shows two key elements of function calling SLIM models - + + 1. Using Logit Information to indicate confidence levels, especially for classifications. + 2. Using the structured dictionary generated for programmatic handling in a larger process. 
+ + """ + + print("\nExample: using logits and integrating into process\n") + + text_passage = ("On balance, this was an average result, with earnings in line with expectations and " + "no big surprises to either the positive or the negative.") + + # two key lines (load_model + execute function_call) + additional logit_analysis step + sentiment_model = ModelCatalog().load_model("slim-sentiment-tool", get_logits=True) + response = sentiment_model.function_call(text_passage) + analysis = ModelCatalog().logit_analysis(response,sentiment_model.model_card, sentiment_model.hf_tokenizer_name) + + print("sentiment response: ", response['llm_response']) + + print("\nAnalyzing response") + for keys, values in analysis.items(): + print(f"{keys} - {values}") + + # two key attributes of the sentiment output dictionary + sentiment_value = response["llm_response"]["sentiment"] + confidence_level = analysis["confidence_score"] + + # use the sentiment classification as a 'if...then' decision point in a process + if "positive" in sentiment_value: + print("sentiment is positive .... will take 'positive' analysis path ...", sentiment_value) + else: + print("sentiment is negative .... will take 'negative' analysis path ...", sentiment_value) + + if "positive" in sentiment_value and confidence_level > 0.8: + print("sentiment is positive with high confidence ... ", sentiment_value, confidence_level) + + return 0 + + +if __name__ == "__main__": + + # discovering slim models in the llmware catalog + discover_slim_models() + + # running function call inferences + hello_world_slim() + + # doing interesting stuff with the output + using_logits_and_integrating_into_process() + +``` + + + +Need help or have questions? +============================ + +Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). + +Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discrod channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. +[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in Oktober 2022. + +## License + +`llmware` is distributed by an [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! +
+{% for contributor in site.github.contributors %} + {{ contributor.login }} +{% endfor %} 
+ + +--- +
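As referenced above, here is a short sketch chaining two SLIM function calls, using the sentiment classification as a routing decision. The passages are illustrative:

```python
from llmware.models import ModelCatalog

# sketch: classify sentiment first, then generate tags only for the negative passages
passages = ["Revenue fell 20% year-on-year and the full-year outlook was cut.",
            "New order volume grew strongly across all regions, well ahead of plan."]

sentiment_model = ModelCatalog().load_model("slim-sentiment-tool")
tags_model = ModelCatalog().load_model("slim-tags-tool", temperature=0.0, sample=False)

for text in passages:
    sentiment = sentiment_model.function_call(text)["llm_response"]["sentiment"]

    if "negative" in sentiment:
        # follow-up function call only on the negative passages
        tags = tags_model.function_call(text)["llm_response"]
        print("negative passage - tags: ", tags)
    else:
        print("skipping passage - sentiment: ", sentiment)
```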
+--- + diff --git a/docs/components/vector_databases.md b/docs/components/vector_databases.md new file mode 100644 index 00000000..cadb7db1 --- /dev/null +++ b/docs/components/vector_databases.md @@ -0,0 +1,173 @@ +--- +layout: default +title: Vector Databases +parent: Components +nav_order: 11 +description: overview of the major modules and classes of LLMWare +permalink: /components/vector_databases +--- +# Vector Databases +--- + +llmware supports the following vector databases: + + - Milvus and Milvus-Lite - `milvus` + - Postgres (PG Vector) - `postgres` + - Qdrant - `qdrant` + - ChromaDB - `chromadb` + - Redis - `redis` + - Neo4j - `neo4j` + - LanceDB - `lancedb` + - FAISS - `faiss` + - Mongo-Atlas - `mongo-atlas` + - Pinecone - `pinecone` + +In llmware, unstructured content is ingested and organized into a Library, and then embeddings are created against the +Library object, usually handled implicitly through the Library method `.install_new_embedding`. + +All embedding models are implemented through the embeddings.py module, and the `EmbeddingHandler` class, which routes +the embedding process to the vector db specific handler and provides a common set of utility functions. +In most cases, it is not necessary to call the vector db class explicitly. + +The design is intended to promote code re-use and to make it easy to experiment with different endpoint vector databases +without significant code changes, as well as to leverage the Library as the core organizing construct. + +# Select Vector DB +Selecting a vector database in llmware is generally done in one of two ways: + +1. Explicit Setting - `LLMWareConfig().set_vector_db("postgres")` + +2. Pass the name of the vector database at the time of installing the embeddings: + + `library.install_new_embedding(embedding_model_name=embedding_model, vector_db='milvus', batch_size=100)` + +# Install Vector DB + +No-install options: chromadb, lancedb, faiss, and milvus-lite + +API-based options: mongo-atlas, pinecone + +Install server options: + +Generally, we have found that Docker (and Docker Compose) is the easiest and most consistent way to install a vector +db across different platforms. + +1. milvus - we provide a docker-compose script in the root folder of the main repository, which installs mongodb as well. + +```bash +curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose_mongo_milvus.yaml +docker compose up -d +``` + +2. qdrant + +```bash +curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-qdrant.yaml +docker compose up -d +``` + +3. postgres and pgvector + +```bash +curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-pgvector.yaml +docker compose up -d +``` + +4. redis +```bash +curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-redis-stack.yaml +docker compose up -d +``` + +5. neo4j + +```bash +curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose-neo4j.yaml +docker compose up -d +``` + +# Configure Vector DB + +To configure a vector database in llmware, we provide configuration objects in the `configs.py` module to adjust +authentication, port/host information, and other common configurations. 
The usage pattern is as follows, through simple `get_config` and `set_config` methods: + +```python +from llmware.configs import MilvusConfig +MilvusConfig().set_config("lite", True) + +from llmware.configs import ChromaDBConfig +current_config = ChromaDBConfig().get_config("persistent_path") +ChromaDBConfig().set_config("persistent_path", "/new/local/path") +``` + +Configuration objects are provided for the following vector DBs: `MilvusConfig`, `ChromaDBConfig`, `QdrantConfig`, +`Neo4jConfig`, `LanceDBConfig`, `PineConeConfig`, `MongoConfig`, `PostgresConfig`. + +For out-of-the-box testing and development, most use cases will not require changing these configs. + +A small sketch combining the two selection options above appears at the end of this page. + +Need help or have questions? +============================ + +Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). + +Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. +[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! 
+{% for contributor in site.github.contributors %} + {{ contributor.login }} +{% endfor %} 
+ + +--- +
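Putting the two selection options together, here is a small sketch that sets a global vector db once and then installs an embedding against it. The library and model names follow the examples above, and it assumes that `install_new_embedding` falls back to the globally configured vector db when none is passed:

```python
from llmware.configs import LLMWareConfig
from llmware.library import Library

# set the vector db once, globally
LLMWareConfig().set_vector_db("chromadb")

# assumption: a subsequent embedding install defaults to the configured vector db
lib = Library().load_library("my_library")
lib.install_new_embedding(embedding_model_name="mini_lm_sbert", batch_size=100)
```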
+--- + diff --git a/docs/components/whisper_cpp.md b/docs/components/whisper_cpp.md new file mode 100644 index 00000000..8df900d0 --- /dev/null +++ b/docs/components/whisper_cpp.md @@ -0,0 +1,203 @@ +--- +layout: default +title: Whisper CPP +parent: Components +nav_order: 14 +description: overview of the major modules and classes of LLMWare +permalink: /components/whisper_cpp +--- +# Whisper CPP +--- + +llmware has an integrated WhisperCPP backend which enables fast, easy local voice-to-text processing. + +Whisper is a leading open source voice-to-text model from OpenAI - https://github.com/openai/whisper + +WhisperCPP is the implementation of Whisper packaged as a GGML deliverable - https://github.com/ggerganov/whisper.cpp + +Starting with llmware 0.2.11, we have integrated WhisperCPPModel as a new model class, +providing options for direct inference, and coming soon, integration into the Parser for easy text chunking and +parsing into a Library with other document types. + +llmware provides prebuilt shared libraries for WhisperCPP on the following platforms: + --Mac M series + --Linux x86 (no CUDA) + --Linux x86 (with CUDA) - really fast + --Windows x86 (currently CPU only) + +We have added three Whisper models to the default model catalog: + +1. ggml-base.en.bin - english-only base model +2. ggml-base.bin - multi-lingual base model +3. ggml-small.en-tdrz.bin - this is a 'tiny-diarize' implementation that has been fine-tuned to identify the +speakers and inserts special [_SOLM_] tags to indicate a conversation turn / change of speaker. + + Main repo: https://github.com/akashmjn/tinydiarize/ + Citation: @software{mahajan2023tinydiarize, + author = {Mahajan, Akash}, + month = {08}, + title = {tinydiarize: Minimal extension of Whisper for speaker segmentation with special tokens}, + url = {https://github.com/akashmjn/tinydiarize}, + year = {2023} + } + +To use WAV files, there is one additional Python dependency required: + --pip install librosa + --Note: this has been added to the default requirements.txt and PyPI build starting with 0.2.11 + +To use other popular audio/video file formats, such as MP3, MP4, M4A, etc., the following dependencies are +required: + --pip install pydub + --ffmpeg library - which can be installed as follows: + -- Linux: `sudo apt install ffmpeg` + -- Mac: `brew install ffmpeg` + -- Windows: direct download and install from the ffmpeg website + + +```python + +""" This example shows how to use llmware provided sample files for testing with WhisperCPP, integrated as of + llmware 0.2.11. + + # examples - "famous_quotes" | "greatest_speeches" | "youtube_demos" | "earnings_calls" + + -- famous_quotes - approximately 20 small .wav files with clips from old movies and speeches + -- greatest_speeches - approximately 60 famous historical speeches in english + -- youtube_demos - wav files of ~3 llmware youtube videos + -- earnings_calls - wav files of ~4 public company earnings calls (gathered from public investor relations) + + These sample files are hosted in a non-restricted AWS S3 bucket, and downloaded via the Setup method + `load_voice_sample_files`. There are two options: + + -- small_only = True: only pulls the 'famous_quotes' samples + -- small_only = False: pulls all of the samples (requires ~1.9 GB in total) + + Please note that all of these samples have been pulled from open public domain sources, including the + Internet Archive, e.g., https://archive.org. These sample files are being provided solely for the purpose of + testing the code scripts below. 
Please do not use them for any other purpose. + + To run these examples, please make sure to `pip install librosa` + """ + +import os +from llmware.models import ModelCatalog +from llmware.gguf_configs import GGUFConfigs +from llmware.setup import Setup + +# optional / to adjust various parameters of the model +GGUFConfigs().set_config("whisper_cpp_verbose", "OFF") +GGUFConfigs().set_config("whisper_cpp_realtime_display", True) + +# note: english is the default output - change to 'es' | 'fr' | 'de' | 'it' ... +GGUFConfigs().set_config("whisper_language", "en") +GGUFConfigs().set_config("whisper_remove_segment_markers", True) + + +def sample_files(example="famous_quotes", small_only=False): + + """ Execute a basic inference on a Voice-to-Text model, passing a file_path string """ + + voice_samples = Setup().load_voice_sample_files(small_only=small_only) + + examples = ["famous_quotes", "greatest_speeches", "youtube_demos", "earnings_calls"] + + if example not in examples: + print("choose one of the following - ", examples) + return 0 + + fp = os.path.join(voice_samples, example) + + files = os.listdir(fp) + + # these are the two key lines + whisper_base_english = "whisper-cpp-base-english" + + model = ModelCatalog().load_model(whisper_base_english) + + for f in files: + + if f.endswith(".wav"): + + prompt = os.path.join(fp, f) + + print(f"\n\nPROCESSING: prompt = {prompt}") + + response = model.inference(prompt) + + print("\nllm response: ", response["llm_response"]) + print("usage: ", response["usage"]) + + return 0 + + +if __name__ == "__main__": + + # pick among the four examples: famous_quotes | greatest_speeches | youtube_demos | earnings_calls + + sample_files(example="famous_quotes", small_only=False) +``` + +A stripped-down sketch for transcribing a single local file appears at the end of this page. + +Need help or have questions? +============================ + +Check out the [llmware videos](https://www.youtube.com/@llmware) and [GitHub repository](https://github.com/llmware-ai/llmware). + +Reach out to us on [GitHub Discussions](https://github.com/llmware-ai/llmware/discussions). + + +# About the project + +`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home). + +## Contributing +Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions). +You can also write an email or start a discussion on our Discord channel. +Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md). + +## Code of conduct +We welcome everyone into the ``llmware`` community. +[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository. + +## ``llmware`` and [AI Bloks](https://www.aibloks.com/home) +``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``. +The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service. +[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022. + +## License + +`llmware` is distributed under the [Apache-2.0 license](https://github.com/llmware-ai/llmware/blob/main/LICENSE). + +## Thank you to the contributors of ``llmware``! 
+{% for contributor in site.github.contributors %}
+- {{ contributor.login }}
+{% endfor %}
+
+---
+
diff --git a/docs/contributing/contributing.md b/docs/contributing/contributing.md
index e45497b6..394fae54 100644
--- a/docs/contributing/contributing.md
+++ b/docs/contributing/contributing.md
@@ -1,11 +1,12 @@
 ---
 layout: default
 title: Contributing
-nav_order: 10
+nav_order: 6
 has_children: true
 description: llmware contributions.
 permalink: /contributing
 ---
+
 # Contributing to llmware
 
 {: .note}
diff --git a/docs/examples/.DS_Store b/docs/examples/.DS_Store
new file mode 100644
index 00000000..5008ddfc
Binary files /dev/null and b/docs/examples/.DS_Store differ
diff --git a/docs/examples/agents.md b/docs/examples/agents.md
new file mode 100644
index 00000000..ef39e737
--- /dev/null
+++ b/docs/examples/agents.md
@@ -0,0 +1,98 @@
+---
+layout: default
+title: Agents
+parent: examples
+nav_order: 2
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/agents
+---
+# Agents
+
+
+ 🚀 Start Building Multi-Model Agents Locally on a Laptop 🚀
+===============
+
+**What is a SLIM?**
+
+**SLIMs** are **S**tructured **L**anguage **I**nstruction **M**odels, which are small, specialized 1-3B parameter LLMs,
+finetuned to generate structured outputs (Python dictionaries and lists, JSON and SQL) that can be handled programmatically, and
+stacked together in multi-step, multi-model Agent workflows - all running on a local CPU.
+
+**New SLIMs just released** - check out slim-extract, slim-summarize, slim-xsum, slim-sa-ner, slim-boolean and slim-tags-3b
+
+**Check out the new examples below marked with ⭐**
+🔥🔥🔥 Web Services & Function Calls ([code](web_services_slim_fx.py)) 🔥🔥🔥
+
+**Check out the Intro videos**
+[SLIM Intro Video](https://www.youtube.com/watch?v=cQfdaTcmBpY)
+
+There are 16 SLIM models, each delivered in two packages - a Pytorch/Huggingface FP16 model, and a
+quantized "tool" designed for fast inference on a CPU, using LLMWare's embedded GGUF inference engine. In most cases,
+we would recommend that you start with the "tools" version of the models.
+
+**Getting Started**
+
+We have several ready-to-run examples in this repository:
+
+| Example | Detail |
+|---------|--------|
+| 1. Getting Started with SLIM Models ([code](slims-getting-started.py) / [video](https://www.youtube.com/watch?v=aWZFrTDmMPc&t=196s)) | Install the models and run hello world tests to see the models in action. |
+| 2. Getting Started with Function-Calling Agent ([code](agent-llmfx-getting-started.py) / [video](https://www.youtube.com/watch?v=cQfdaTcmBpY)) | Generate a Structured Report with LLMfx |
+| 3. Multi-step Complex Analysis with Agent ([code](agent-multistep-analysis.py) / [video](https://www.youtube.com/watch?v=y4WvwHqRR60)) | Delivering Complex Research Analysis with SLIM Agents |
+| 4. Document Clustering ([code](document-clustering.py)) | Multi-faceted automated document analysis with Topics, Tags and NER |
+| 5. Two-Step NER Retrieval ([code](ner-retrieval.py)) | Using NER to extract a name, and then using it as the basis for a retrieval. |
+| 6. Using Sentiment Analysis ([code](sentiment-analysis.py)) | Using sentiment analysis on earnings transcripts with an 'if...then' condition |
+| 7. Text2SQL - Intro ([code](text2sql-getting-started-1.py)) | Getting Started with SLIM-SQL-TOOL and Basic Text2SQL Inference |
+| 8. Text2SQL - E2E ([code](text2sql-end-to-end-2.py)) | End-to-End Natural Language Query to SQL DB Query |
+| 9. Text2SQL - MultiStep ([code](text2sql-multistep-example-3.py)) | Extract a customer name using NER and use it in a Text2SQL query |
+| 10. ⭐ Web Services & Function Calls ([code](web_services_slim_fx.py)) | Generate 30 key financial analyses with SLIM function calls and web services |
+| 11. ⭐ Yes-No Questions with Explanations ([code](using_slim_boolean_model.py)) | Analyze earnings releases with SLIM Boolean |
+| 12. ⭐ Extracting Revenue Growth ([code](using_slim_extract_model.py)) | Extract revenue growth from earnings releases with SLIM Extract |
+| 13. ⭐ Summary as a Function Call ([code](using_slim_summary.py)) | Simple Summarization as a Function Call with List Length Parameters |
+| 14. ⭐ Handling Not Found Extracts ([code](not_found_extract_with_lookup.py)) | Multi-step Lookup strategy and handling not-found answers |
+| 15. ⭐ Extract + Lookup ([code](custom_extract_and_lookup.py)) | Extract Named Entity information and use for lookups with SLIM Extract |
+| 16. ⭐ Headline/Title as XSUM Function Call ([code](using_slim_xsum.py)) | eXtreme Summarization (XSUM) with SLIM XSUM |
+
+For information on all of the SLIM models, check out [LLMWare SLIM Model Collection](https://www.huggingface.co/llmware/).
+
+**Models List**
+If you would like more information about any of the SLIM models, please check out its model card:
+
+- extract - extract custom keys - [slim-extract](https://www.huggingface.co/llmware/slim-extract) & [slim-extract-tool](https://www.huggingface.co/llmware/slim-extract-tool)
+- summary - summarize function call - [slim-summary](https://www.huggingface.co/llmware/slim-summary) & [slim-summary-tool](https://www.huggingface.co/llmware/slim-summary-tool)
+- xsum - title/headline function call - [slim-xsum](https://www.huggingface.co/llmware/slim-xsum) & [slim-xsum-tool](https://www.huggingface.co/llmware/slim-xsum-tool)
+- ner - extract named entities - [slim-ner](https://www.huggingface.co/llmware/slim-ner) & [slim-ner-tool](https://www.huggingface.co/llmware/slim-ner-tool)
+- sentiment - evaluate sentiment - [slim-sentiment](https://www.huggingface.co/llmware/slim-sentiment) & [slim-sentiment-tool](https://www.huggingface.co/llmware/slim-sentiment-tool)
+- topics - generate topic - [slim-topics](https://www.huggingface.co/llmware/slim-topics) & [slim-topics-tool](https://www.huggingface.co/llmware/slim-topics-tool)
+- sa-ner - combo model (sentiment + named entities) - [slim-sa-ner](https://www.huggingface.co/llmware/slim-sa-ner) & [slim-sa-ner-tool](https://www.huggingface.co/llmware/slim-sa-ner-tool)
+- boolean - provides a yes/no output with explanation - [slim-boolean](https://www.huggingface.co/llmware/slim-boolean) & [slim-boolean-tool](https://www.huggingface.co/llmware/slim-boolean-tool)
+- ratings - apply 1 (low) - 5 (high) rating - [slim-ratings](https://www.huggingface.co/llmware/slim-ratings) & [slim-ratings-tool](https://www.huggingface.co/llmware/slim-ratings-tool)
+- emotions - assess emotions - [slim-emotions](https://www.huggingface.co/llmware/slim-emotions) & [slim-emotions-tool](https://www.huggingface.co/llmware/slim-emotions-tool)
+- tags - auto-generate list of tags - [slim-tags](https://www.huggingface.co/llmware/slim-tags) & [slim-tags-tool](https://www.huggingface.co/llmware/slim-tags-tool)
+- tags-3b - enhanced auto-generation tagging model - [slim-tags-3b](https://www.huggingface.co/llmware/slim-tags-3b) & [slim-tags-3b-tool](https://www.huggingface.co/llmware/slim-tags-3b-tool)
+- intent - identify intent - [slim-intent](https://www.huggingface.co/llmware/slim-intent) & [slim-intent-tool](https://www.huggingface.co/llmware/slim-intent-tool)
+- category - high-level category - [slim-category](https://www.huggingface.co/llmware/slim-category) & [slim-category-tool](https://www.huggingface.co/llmware/slim-category-tool)
+- nli - assess if evidence supports conclusion - [slim-nli](https://www.huggingface.co/llmware/slim-nli) & [slim-nli-tool](https://www.huggingface.co/llmware/slim-nli-tool)
+- sql - convert text into sql - [slim-sql](https://www.huggingface.co/llmware/slim-sql) & [slim-sql-tool](https://www.huggingface.co/llmware/slim-sql-tool)
+
+You may also want to check out these quantized 'answer' tools, which work well in conjunction with SLIMs for question-answer and summarization:
+- bling-stablelm-3b-tool - 3b quantized RAG model - [bling-stablelm-3b-gguf](https://www.huggingface.co/llmware/bling-stablelm-3b-gguf)
+- bling-answer-tool - 1b quantized RAG model - [bling-answer-tool](https://www.huggingface.co/llmware/bling-answer-tool)
+- dragon-yi-answer-tool - 6b quantized RAG model - [dragon-yi-answer-tool](https://www.huggingface.co/llmware/dragon-yi-answer-tool)
+- dragon-mistral-answer-tool - 7b quantized RAG model - [dragon-mistral-answer-tool](https://www.huggingface.co/llmware/dragon-mistral-answer-tool)
+- dragon-llama-answer-tool - 7b quantized RAG model - [dragon-llama-answer-tool](https://www.huggingface.co/llmware/dragon-llama-answer-tool)
+
+
+**Set up**
+No special setup for SLIMs is required other than to install llmware >=0.2.6, e.g., `pip3 install llmware`.
+
+**Platforms:**
+- Mac M1, Mac x86, Windows, Linux (Ubuntu 22 preferred, supported on Ubuntu 20+)
+- RAM: 16 GB minimum
+- Python 3.9, 3.10, 3.11 (note: not supported on 3.12 yet)
+- llmware >= 0.2.6 version
+
+
+### **Let's get started! 🚀**
+
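+As a taste of the function-calling workflow these examples cover, here is a minimal sentiment call - a sketch
+that assumes the LLMfx agent interface (`load_tool` / `load_work` / `sentiment`) used in the getting-started
+examples above, with an illustrative text passage:
+
+```python
+
+# minimal sketch - one SLIM function call with the LLMfx agent
+# assumes the load_tool / load_work / sentiment interface referenced above;
+# the text passage is an illustrative placeholder
+
+from llmware.agents import LLMfx
+
+agent = LLMfx()
+
+# pulls down the quantized slim-sentiment-tool the first time it is used
+agent.load_tool("sentiment")
+
+# load the text passage as the current work item for the agent
+agent.load_work("The quarter was a disaster, with sales down 20% year-on-year.")
+
+# returns a structured dictionary that can be handled programmatically
+response = agent.sentiment()
+print("sentiment: ", response)
+```
+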
diff --git a/docs/examples/datasets.md b/docs/examples/datasets.md
new file mode 100644
index 00000000..59c883fe
--- /dev/null
+++ b/docs/examples/datasets.md
@@ -0,0 +1,134 @@
+---
+layout: default
+title: Datasets
+parent: examples
+nav_order: 10
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/datasets
+---
+# Datasets - Introduction by Examples
+
+llmware provides powerful capabilities to transform raw unstructured information into various model-ready datasets.
+
+```python
+
+import os
+import json
+
+from llmware.library import Library
+from llmware.setup import Setup
+from llmware.dataset_tools import Datasets
+from llmware.retrieval import Query
+
+def build_and_use_dataset(library_name):
+
+    # Setup a library and build a knowledge graph - Datasets will use the data in the knowledge graph
+    print (f"\n > Creating library {library_name}...")
+    library = Library().create_new_library(library_name)
+    sample_files_path = Setup().load_sample_files()
+    library.add_files(os.path.join(sample_files_path,"SmallLibrary"))
+    library.generate_knowledge_graph()
+
+    # Create a Datasets object from library
+    datasets = Datasets(library)
+
+    # Build a basic dataset useful for industry domain adaptation for fine-tuning embedding models
+    print (f"\n > Building basic text dataset...")
+
+    basic_embedding_dataset = datasets.build_text_ds(min_tokens=500, max_tokens=1000)
+    dataset_location = os.path.join(library.dataset_path, basic_embedding_dataset["ds_id"])
+
+    print (f"\n > Dataset:")
+    print (f"(Files referenced below are found in {dataset_location})")
+
+    print (f"\n{json.dumps(basic_embedding_dataset, indent=2)}")
+    sample = datasets.get_dataset_sample(datasets.current_ds_name)
+
+    print (f"\nRandom sample from the dataset:\n{json.dumps(sample, indent=2)}")
+
+    # Other Dataset Generation and Usage Examples:
+
+    # Build a simple self-supervised generative dataset - extracts text and splits into 'text' & 'completion'
+    # Several generative "prompt_wrappers" are available - chat_gpt | alpaca | human_bot
+    basic_generative_completion_dataset = datasets.build_gen_ds_targeted_text_completion(prompt_wrapper="alpaca")
+
+    # Build generative self-supervised training sets created by pairing 'header_text' with 'text'
+    xsum_generative_completion_dataset = datasets.build_gen_ds_headline_text_xsum(prompt_wrapper="human_bot")
+    topic_prompter_dataset = datasets.build_gen_ds_headline_topic_prompter(prompt_wrapper="chat_gpt")
+
+    # Filter a library by a key term as part of building the dataset
+    filtered_dataset = datasets.build_text_ds(query="agreement", filter_dict={"master_index":1})
+
+    # Pass a set of query results to create a dataset from those results only
+    query_results = Query(library=library).query("africa")
+    query_filtered_dataset = datasets.build_text_ds(min_tokens=250, max_tokens=600, qr=query_results)
+
+    return 0
+```
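+
+Each dataset is written to the library's `dataset_path` under its `ds_id`, as shown in the example above. As a
+minimal sketch of inspecting the output on disk - assuming the dataset files are written as .jsonl (one JSON
+record per line) - you could preview a few rows as follows:
+
+```python
+
+# minimal sketch - preview a few rows of a generated dataset
+# assumes the build_and_use_dataset example above has been run, and that the
+# dataset files are written as .jsonl (one json record per line)
+
+import os
+import json
+
+def preview_dataset(dataset_location, max_rows=3):
+
+    for fname in os.listdir(dataset_location):
+
+        if fname.endswith(".jsonl"):
+
+            with open(os.path.join(dataset_location, fname), "r") as f:
+                for i, line in enumerate(f):
+                    if i >= max_rows:
+                        break
+                    print(json.dumps(json.loads(line), indent=2))
+
+            break
+```
+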
+
+For more examples, see the [datasets examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets/) in the main repo.
+
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+- {{ contributor.login }}
+{% endfor %}
+
+--- + diff --git a/docs/examples/embedding.md b/docs/examples/embedding.md new file mode 100644 index 00000000..dab2cc34 --- /dev/null +++ b/docs/examples/embedding.md @@ -0,0 +1,222 @@ +--- +layout: default +title: Embedding +parent: examples +nav_order: 5 +description: overview of the major modules and classes of LLMWare +permalink: /examples/embedding +--- +# Embedding - Introduction by Examples +We introduce ``llmware`` through self-contained examples. + +```python + +""" This example is a fast start with Milvus Lite, which is a 'no-install' file-based version of Milvus, intended +for rapid prototyping. A couple of key points to note: + + -- Platform - per Milvus docs, Milvus Lite is designed for Mac and Linux (not on Windows currently) + -- PyMilvus - need to `pip install pymilvus>=2.4.2` + -- within LLMWare: set MilvusConfig().set_config("lite", True) +""" + +import os +from llmware.library import Library +from llmware.retrieval import Query +from llmware.setup import Setup +from llmware.status import Status +from llmware.models import ModelCatalog +from llmware.configs import LLMWareConfig, MilvusConfig + +from importlib import util + +if not util.find_spec("pymilvus"): + print("\nto run this example with pymilvus, you need to install pymilvus: pip3 install pymilvus>=2.4.2") + + +def setup_library(library_name): + + """ Note: this setup_library method is provided to enable a self-contained example to create a test library """ + + # Step 1 - Create library which is the main 'organizing construct' in llmware + print ("\nupdate: Creating library: {}".format(library_name)) + + library = Library().create_new_library(library_name) + + # check the embedding status 'before' installing the embedding + embedding_record = library.get_embedding_status() + print("embedding record - before embedding ", embedding_record) + + # Step 2 - Pull down the sample files from S3 through the .load_sample_files() command + # --note: if you need to refresh the sample files, set 'over_write=True' + print ("update: Downloading Sample Files") + + sample_files_path = Setup().load_sample_files(over_write=False) + + # Step 3 - point ".add_files" method to the folder of documents that was just created + # this method parses the documents, text chunks, and captures in database + + print("update: Parsing and Text Indexing Files") + + library.add_files(input_folder_path=os.path.join(sample_files_path, "Agreements"), + chunk_size=400, max_chunk_size=600, smart_chunking=1) + + return library + + +def install_vector_embeddings(library, embedding_model_name): + + """ This method is the core example of installing an embedding on a library. 
+        -- two inputs - (1) a pre-created library object and (2) the name of an embedding model """
+
+    library_name = library.library_name
+    vector_db = LLMWareConfig().get_vector_db()
+
+    print(f"\nupdate: Starting the Embedding: "
+          f"library - {library_name} - "
+          f"vector_db - {vector_db} - "
+          f"model - {embedding_model_name}")
+
+    # *** this is the one key line of code to create the embedding ***
+    library.install_new_embedding(embedding_model_name=embedding_model_name, vector_db=vector_db, batch_size=100)
+
+    # note: for using llmware as part of a larger application, you can check the real-time status by polling Status()
+    #   --both the EmbeddingHandler and Parsers write to Status() at intervals while processing
+    update = Status().get_embedding_status(library_name, embedding_model_name)
+    print("update: Embeddings Complete - Status() check at end of embedding - ", update)
+
+    # Start using the new vector embeddings with Query
+    sample_query = "incentive compensation"
+    print("\n\nupdate: Run a sample semantic/vector query: {}".format(sample_query))
+
+    # queries are constructed by creating a Query object, and passing a library as input
+    query_results = Query(library).semantic_query(sample_query, result_count=20)
+
+    for i, entries in enumerate(query_results):
+
+        # each query result is a dictionary with many useful keys
+
+        text = entries["text"]
+        document_source = entries["file_source"]
+        page_num = entries["page_num"]
+        vector_distance = entries["distance"]
+
+        # to see all of the dictionary keys returned, uncomment the line below
+        # print("update: query_results - all - ", i, entries)
+
+        # for display purposes only, we will only show the first 125 characters of the text
+        if len(text) > 125:
+            text = text[0:125] + " ... "
+
+        print("\nupdate: query results - {} - document - {} - page num - {} - distance - {} "
+              .format(i, document_source, page_num, vector_distance))
+
+        print("update: text sample - ", text)
+
+    # let's take a look at the library embedding status again at the end to confirm embeddings were created
+    embedding_record = library.get_embedding_status()
+
+    print("\nupdate: embedding record - ", embedding_record)
+
+    return 0
+
+
+if __name__ == "__main__":
+
+    # Fast Start configuration - will use no-install embedded sqlite
+    # -- if you have installed Mongo or Postgres, then change the .set_active_db accordingly
+
+    LLMWareConfig().set_active_db("sqlite")
+
+    # set the "lite" flag in MilvusConfig to True -> to use the server version, set to False (which is the default)
+    MilvusConfig().set_config("lite", True)
+    LLMWareConfig().set_vector_db("milvus")
+
+    # Step 1 - create library
+    library = setup_library("ex2_milvus_lite")
+
+    # Step 2 - Select any embedding model in the LLMWare catalog
+
+    # to see a list of the embedding models supported, uncomment the lines below and print the list
+    embedding_models = ModelCatalog().list_embedding_models()
+
+    # for i, models in enumerate(embedding_models):
+    #     print("embedding models: ", i, models)
+
+    # for this first embedding, we will use a very popular and fast sentence transformer
+    embedding_model = "mini-lm-sbert"
+
+    # note: if you want to swap out "mini-lm-sbert" for OpenAI 'text-embedding-ada-002', uncomment these lines:
+    # embedding_model = "text-embedding-ada-002"
+    # os.environ["USER_MANAGED_OPENAI_API_KEY"] = ""
+
+    # run the core script
+    install_vector_embeddings(library, embedding_model)
+```
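+
+If you want to survey the available embedding models before picking one (as the commented lines in the example
+suggest), a small stand-alone snippet can print the catalog - a sketch, assuming each catalog entry is a
+dictionary that includes a `model_name` key:
+
+```python
+
+# small helper - print the embedding models available in the llmware catalog
+# assumes each catalog entry is a dict that includes a 'model_name' key
+
+from llmware.models import ModelCatalog
+
+for i, model_card in enumerate(ModelCatalog().list_embedding_models()):
+    print("embedding model: ", i, model_card.get("model_name", model_card))
+```
+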
+
+For more examples, see the [embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding/) in the main repo.
+
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+- {{ contributor.login }}
+{% endfor %}
+
+---
+
diff --git a/docs/examples/examples.md b/docs/examples/examples.md
new file mode 100644
index 00000000..eb0f6aaf
--- /dev/null
+++ b/docs/examples/examples.md
@@ -0,0 +1,25 @@
+---
+layout: default
+title: Examples
+nav_order: 4
+has_children: true
+description: examples, recipes and use cases
+permalink: /examples
+---
+
+llmware offers a wide range of examples to cover the lifecycle of building RAG and Agent based applications using
+small language models:
+
+  - [Parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing) - ~14 stand-alone parsing examples for all common document types, including options for parsing in memory, outputting to JSON, parsing custom configured CSV and JSON files, running OCR on embedded images found in documents, table extraction, image extraction, text chunking, zip files, and web sources.
+  - [Embedding examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Embedding) - ~15 stand-alone embedding examples to show how to use ~10 different vector databases and a wide range of leading open source embedding models (including sentence transformers).
+  - [Retrieval examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval) - ~10 stand-alone examples illustrating different query and retrieval techniques - semantic queries, text queries, document filters, page filters, 'hybrid' queries, author search, using query state, and generating bibliographies.
+  - [Dataset examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Datasets) - ~5 stand-alone examples to show 'next steps' of how to leverage a Library to re-package content into various datasets and automated NLP analytics.
+  - [Fast start example #1-Parsing](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-1-create_first_library.py) - shows the basics of parsing.
+  - [Fast start example #2-Embedding](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-2-build_embeddings.py) - shows the basics of building embeddings.
+  - [CustomTable examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables) - ~5 examples to start building structured tables that can be used in conjunction with LLM-based workflows.
+
+  - [Models examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models) - ~20 examples showing a wide range of different model inferences and use cases, including the ability to integrate Ollama models, OpenChat (e.g., LMStudio) models, using LLama-3 and Phi-3, bringing your own models into the ModelCatalog, and configuring sampling settings.
+  - [Prompts examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts) - ~5 examples that illustrate how to use Prompt as an integrated workflow for integrating knowledge sources, managing prompt history, and applying fact-checking.
+  - [SLIM-Agents examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/SLIM-Agents) - ~20 examples showing how to build multi-model, multi-step Agent processes using locally-running SLIM function calling models.
+  - [Fast start example #3-Prompts and Models](https://www.github.com/llmware-ai/llmware/tree/main/fast_start/example-3-prompts_and_models.py) - getting started with model inference.
+ diff --git a/docs/examples.md b/docs/examples/getting_started.md similarity index 100% rename from docs/examples.md rename to docs/examples/getting_started.md diff --git a/docs/examples/models.md b/docs/examples/models.md new file mode 100644 index 00000000..ba6aab50 --- /dev/null +++ b/docs/examples/models.md @@ -0,0 +1,498 @@ +--- +layout: default +title: Models +parent: Learn +nav_order: 3 +description: overview of the major modules and classes of LLMWare +permalink: /examples/models +--- +# Models + +We introduce ``llmware`` through self-contained examples. + +```python + + +""" This example demonstrates prompting local BLING models with provided context - easy to select among different +BLING models between 1B - 4B, including both Pytorch versions and GGUF quantized versions, and to swap out the +hello_world questions with your own test set. + + NOTE: if you are running on a CPU with limited memory (e.g., <16 GB of RAM), we would recommend sticking to +the 1B parameter models, or using the quantized GGUF versions. You may get out-of-memory errors and/or very +slow performance with ~3B parameter Pytorch models. Even with 16 GB+ of RAM, the 3B Pytorch models should run but +will be slow (without GPU acceleration). """ + + +import time +from llmware.prompts import Prompt + + +def hello_world_questions(): + + test_list = [ + + {"query": "What is the total amount of the invoice?", + "answer": "$22,500.00", + "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " + "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" + " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" + "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." + "If you have any questions concerning this invoice, contact Bia Hermes. " + "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, + + {"query": "What was the amount of the trade surplus?", + "answer": "62.4 billion yen ($416.6 million)", + "context": "Japan’s September trade balance swings into surplus, surprising expectations" + "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " + "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " + "billion yen. Data from Japan’s customs agency revealed that exports in September " + "increased 4.3% year on year, while imports slid 16.3% compared to the same period " + "last year. According to FactSet, exports to Asia fell for the ninth straight month, " + "which reflected ongoing China weakness. Exports were supported by shipments to " + "Western markets, FactSet added. — Lim Hui Jie"}, + + {"query": "What was Microsoft's revenue in the 3rd quarter?", + "answer": "$52.9 billion", + "context": "Microsoft Cloud Strength Drives Third Quarter Results \nREDMOND, Wash. — April 25, 2023 — " + "Microsoft Corp. 
today announced the following results for the quarter ended March 31, 2023," + " as compared to the corresponding period of last fiscal year:\n· Revenue was $52.9 billion" + " and increased 7% (up 10% in constant currency)\n· Operating income was $22.4 billion " + "and increased 10% (up 15% in constant currency)\n· Net income was $18.3 billion and " + "increased 9% (up 14% in constant currency)\n· Diluted earnings per share was $2.45 " + "and increased 10% (up 14% in constant currency).\n"}, + + {"query": "When did the LISP machine market collapse?", + "answer": "1987.", + "context": "The attendees became the leaders of AI research in the 1960s." + " They and their students produced programs that the press described as 'astonishing': " + "computers were learning checkers strategies, solving word problems in algebra, " + "proving logical theorems and speaking English. By the middle of the 1960s, research in " + "the U.S. was heavily funded by the Department of Defense and laboratories had been " + "established around the world. Herbert Simon predicted, 'machines will be capable, " + "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " + "'within a generation ... the problem of creating 'artificial intelligence' will " + "substantially be solved'. They had, however, underestimated the difficulty of the problem. " + "Both the U.S. and British governments cut off exploratory research in response " + "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " + "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " + "as proving that artificial neural networks approach would never be useful for solving " + "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " + "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " + "AI research was revived by the commercial success of expert systems, a form of AI " + "program that simulated the knowledge and analytical skills of human experts. By 1985, " + "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " + "generation computer project inspired the U.S. and British governments to restore funding " + "for academic research. However, beginning with the collapse of the Lisp Machine market " + "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."}, + + {"query": "When will employment start?", + "answer": "April 16, 2012.", + "context": "THIS EXECUTIVE EMPLOYMENT AGREEMENT (this “Agreement”) is entered " + "into this 2nd day of April, 2012, by and between Aphrodite Apollo " + "(“Executive”) and TestCo Software, Inc. (the “Company” or “Employer”), " + "and shall become effective upon Executive’s commencement of employment " + "(the “Effective Date”) which is expected to commence on April 16, 2012. " + "The Company and Executive agree that unless Executive has commenced " + "employment with the Company as of April 16, 2012 (or such later date as " + "agreed by each of the Company and Executive) this Agreement shall be " + "null and void and of no further effect."}, + + {"query": "What is the current rate on 10-year treasuries?", + "answer": "4.58%", + "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " + "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " + "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. 
The tech-heavy " + "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " + "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " + "jobs. However, wages rose less than expected last month. Stocks posted a stunning " + "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " + "At its session low, the Dow had fallen as much as 198 points; it surged by more than " + "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " + "their lowest points in the day. Traders were unclear of the reason for the intraday " + "reversal. Some noted it could be the softer wage number in the jobs report that made " + "investors rethink their earlier bearish stance. Others noted the pullback in yields from " + "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " + "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " + "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " + "near its highest level in 14 years. The benchmark rate later eased from those levels, but " + "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " + "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s " + "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries " + "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially " + "some oversold conditions.'"}, + + {"query": "What is the governing law?", + "answer": "State of Massachusetts", + "context": "19. Governing Law and Procedures. This Agreement shall be governed by and interpreted " + "under the laws of the State of Massachusetts, except with respect to Section 18(a) of this Agreement," + " which shall be governed by the laws of the State of Delaware, without giving effect to any " + "conflict of laws provisions. Employer and Executive each irrevocably and unconditionally " + "(a) agrees that any action commenced by Employer for preliminary and permanent injunctive relief " + "or other equitable relief related to this Agreement or any action commenced by Executive pursuant " + "to any provision hereof, may be brought in the United States District Court for the federal " + "district in which Executive’s principal place of employment is located, or if such court does " + "not have jurisdiction or will not accept jurisdiction, in any court of general jurisdiction " + "in the state and county in which Executive’s principal place of employment is located, " + "(b) consents to the non-exclusive jurisdiction of any such court in any such suit, action o" + "r proceeding, and (c) waives any objection which Employer or Executive may have to the " + "laying of venue of any such suit, action or proceeding in any such court. Employer and " + "Executive each also irrevocably and unconditionally consents to the service of any process, " + "pleadings, notices or other papers in a manner permitted by the notice provisions of Section 8."}, + + {"query": "What is the amount of the base salary?", + "answer": "$200,000.", + "context": "2.2. Base Salary. For all the services rendered by Executive hereunder, during the " + "Employment Period, Employer shall pay Executive a base salary at the annual rate of " + "$200,000, payable semimonthly in accordance with Employer’s normal payroll practices. 
" + "Executive’s base salary shall be reviewed annually by the Board (or the compensation committee " + "of the Board), pursuant to Employer’s normal compensation and performance review policies " + "for senior level executives, and may be increased but not decreased. The amount of any " + "increase for each year shall be determined accordingly. For purposes of this Agreement, " + "the term “Base Salary” shall mean the amount of Executive’s base salary established " + "from time to time pursuant to this Section 2.2. "}, + + {"query": "Is the expected gross margin greater than 70%?", + "answer": "Yes, between 71.5% and 72.%", + "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows:" + "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP " + "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus " + "50 basis points. GAAP and non-GAAP operating expenses are expected to be " + "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP " + "other income and expense are expected to be an income of approximately $100 " + "million, excluding gains and losses from non-affiliated investments. GAAP and " + "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items." + "Highlights NVIDIA achieved progress since its previous earnings announcement " + "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, " + "up 141% from the previous quarter and up 171% from a year ago. Announced that the " + "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping " + "this quarter, with a second-generation version with HBM3e memory expected to ship " + "in Q2 of calendar 2024. "}, + + {"query": "What is Bank of America's rating on Target?", + "answer": "Buy", + "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from " + "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom " + "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index " + "soared more than 22%. Hotter than expected September consumer price index, consumer " + "inflation. The Social Security Administration issues announced a 3.2% cost-of-living " + "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. " + "Cites consumer price index showing sticky retail inflation for the fourth time " + "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites " + "risk/reward from depressed levels. Traffic could improve. Gross margin upside. " + "Merchandising better. Freight and transportation better. Target to report quarter " + "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), " + "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs " + "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, " + "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating." + "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the " + "Market email newsletter for free. Barclays cuts price targets on consumer products: " + "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from " + "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. " + "Coca-Cola (KO) to $59 from $70. 
Barclays cut PTs on housing-related stocks: Toll Brothers" + "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek" + "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on " + "third quarter of 19-cent per share drag on earnings. The buyer: investors led by " + "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for " + "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share " + "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps " + "overweight (buy) rating but lowers price target to $139 per share from $150. " + "Sees “still challenging” environment into third-quarter print. The Club owns shares " + "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) " + "to overweight from equal weight (buy from hold) but lowers price target to $224 per " + "share from $230. Risk reward upgrade. Best visibility of utility scale names."}, + + {"query": "Who is NVIDIA's partner for the driver assistance system?", + "answer": "MediaTek", + "context": "Automotive Second-quarter revenue was $253 million, down 15% from the previous " + "quarter and up 15% from a year ago. Announced that NVIDIA DRIVE Orin™ is powering " + "the new XPENG G6 Coupe SUV’s intelligent advanced driver assistance system. " + "Partnered with MediaTek, which will develop mainstream automotive systems on " + "chips for global OEMs, which integrate new NVIDIA GPU chiplet IP for AI and graphics."}, + + {"query": "What was the rate of decline in 3rd quarter sales?", + "answer": "20% year-on-year.", + "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following " + "third quarter earnings that plunged. The Finnish telecommunications giant said that " + "it will reduce its cost base and increase operation efficiency to “address the " + "challenging market environment. The substantial layoffs come after Nokia reported " + "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over " + "the period plunged by 69% year-on-year to 133 million euros."}, + + {"query": "What was professional visualization revenue in the quarter?", + "answer": "$379 million", + "context": "Gaming Second-quarter revenue was $2.49 billion, up 11% from the previous quarter and up " + "22% from a year ago. Began shipping the GeForce RTX™ 4060 family of GPUs, " + "bringing to gamers NVIDIA Ada Lovelace architecture and DLSS, starting at $299." + "Announced NVIDIA Avatar Cloud Engine, or ACE, for Games, a custom AI model " + "foundry service using AI-powered natural language interactions to transform games " + "by bringing intelligence to non-playable characters. Added 35 DLSS games, including " + "Diablo IV, Ratchet & Clank: Rift Apart, Baldur’s Gate 3 and F1 23, as well as Portal: " + "Prelude RTX, a path-traced game made by the community using NVIDIA’s RTX Remix creator tool." + "Professional Visualization Second-quarter revenue was $379 million, up 28% from the " + "previous quarter and down 24% from a year ago. Announced three new desktop " + "workstation RTX GPUs based on the Ada Lovelace architecture — NVIDIA RTX 5000, RTX 4500 " + "and RTX 4000 — to deliver the latest AI, graphics and real-time rendering, which are " + "shipping this quarter. 
Announced a major release of the NVIDIA Omniverse platform, " + "with new foundation applications and services for developers and industrial " + "enterprises to optimize and enhance their 3D pipelines with OpenUSD and " + "generative AI. Joined with Pixar, Adobe, Apple and Autodesk to form the " + "Alliance for OpenUSD to promote the standardization, development, evolution and " + "growth of Universal Scene Description technology."}, + + + {"query": "What is the executive's title?", + "answer": "Senior Vice President, Event Planning ('SVP') of the Workforce Optimization Division.", + "context": "2.1. Duties and Responsibilities and Extent of Service. During the Employment Period, " + "Executive shall serve as Senior Vice President, Event Planning (“SVP”) of the Employer’s " + "Workforce Optimization Division. In such role, Executive will report to the Board of " + "Directors of Employer (the “Board”) and shall devote substantially all of his business time " + "and attention and his best efforts and ability to the operations of Employer and its subsidiaries. " + "Executive shall be responsible for running Employer’s day-to-day operations and shall perform " + "faithfully, diligently and competently the duties and responsibilities of a SVP and such other " + "duties and responsibilities as directed by the Board and are consistent with such position. " + "The foregoing shall not be construed as preventing Executive from (a) making passive " + "investments in other businesses or enterprises consistent with Employer’s code of conduct, " + "or (b) engaging in any other business activity consistent with Employer’s code of conduct; " + "provided that Executive seeks and obtains the prior approval of the Board before engaging " + "in any other business activity. In addition, it shall not be a violation of this Agreement " + "for Executive to participate in civic or charitable activities, deliver lectures, fulfill " + "speaking engagements, teach at educational institutions, and/or manage personal investments " + "(subject to the immediately preceding sentence); provided that such activities do not " + "interfere in any substantial respect with the performance of Executive’s responsibilities " + "as an employee in accordance with this Agreement. Executive may also serve on one or more " + "corporate boards of another company (and committees thereof) upon giving advance notice " + "to the Board prior to commencing service on any other corporate board."}, + + {"query": "According to the CFO, what led to the increase in cloud revenue?", + "answer": "Focused execution by our sales teams and partners", + "context": "'The world's most advanced AI models " + "are coming together with the world's most universal user interface - natural language - " + "to create a new era of computing,' said Satya Nadella, chairman and chief " + "executive officer of Microsoft. 'Across the Microsoft Cloud, we are the platform " + "of choice to help customers get the most value out of their digital spend and innovate " + "for this next generation of AI.' 
'Focused execution by our sales teams and partners " + "in this dynamic environment resulted in Microsoft Cloud revenue of $28.5 billion, " + "up 22% (up 25% in constant currency) year-over-year,' said Amy Hood, executive " + "vice president and chief financial officer of Microsoft.\n"}, + + {"query": "Which company is located in Nevada?", + "answer": "North Industries", + "context": "To send notices to Blue Moon Tech, mail to their headquarters at: " + "555 California Street, San Francisco, California 94123. To send notices to North Industries, mail to" + "their principal U.S. offices at: 19832 32nd Avenue, Las Vegas, Nevada 23593.\nTo send notices " + "to Red River Industries, send to: One Red River Road, Stamford, Connecticut 08234."}, + + {"query": "When can termination after a material breach occur?", + "answer": "If the breach is not cured within 15 days of notice of the breach.", + "context": "This Agreement shall remain in effect until terminated. Either party may terminate this " + "agreement, any Statement of Work or Services Description for convenience by giving the other " + "party 30 days written notice. Either party may terminate this Agreement or any work order or " + "services description if the other party is in material breach or default of any obligation " + "that is not cured within 15 days’ notice of such breach. The TestCo agrees to pay all fees " + "for services performed and expenses incurred prior to the termination of this Agreement. " + "Termination of this Agreement will terminate all outstanding Statement of Work or Services " + "Description entered into under this agreement."}, + + {"query": "What is a headline summary in 10 words or less?", + "answer": "Joe Biden is the 46th President of the United States.", + "context": "Joe Biden's tenure as the 46th president of the United States began with " + "his inauguration on January 20, 2021. Biden, a Democrat from Delaware who " + "previously served as vice president under Barack Obama, " + "took office following his victory in the 2020 presidential election over " + "Republican incumbent president Donald Trump. Upon his inauguration, he " + "became the oldest president in American history."}, + + {"query": "Who are the two people that won elections in Georgia?", + "answer": "Jon Ossoff and Raphael Warnock", + "context": "Though Biden was generally acknowledged as the winner, " + "General Services Administration head Emily W. Murphy " + "initially refused to begin the transition to the president-elect, " + "thereby denying funds and office space to his team. " + "On November 23, after Michigan certified its results, Murphy " + "issued the letter of ascertainment, granting the Biden transition " + "team access to federal funds and resources for an orderly transition. " + "Two days after becoming the projected winner of the 2020 election, " + "Biden announced the formation of a task force to advise him on the " + "COVID-19 pandemic during the transition, co-chaired by former " + "Surgeon General Vivek Murthy, former FDA commissioner David A. Kessler, " + "and Yale University's Marcella Nunez-Smith. On January 5, 2021, " + "the Democratic Party won control of the United States Senate, " + "effective January 20, as a result of electoral victories in " + "Georgia by Jon Ossoff in a runoff election for a six-year term " + "and Raphael Warnock in a special runoff election for a two-year term. 
" + "President-elect Biden had supported and campaigned for both " + "candidates prior to the runoff elections on January 5.On January 6, " + "a mob of thousands of Trump supporters violently stormed the Capitol " + "in the hope of overturning Biden's election, forcing Congress to " + "evacuate during the counting of the Electoral College votes. More " + "than 26,000 National Guard members were deployed to the capital " + "for the inauguration, with thousands remaining into the spring."}, + + {"query": "What is the list of the top financial highlights for the quarter?", + "answer": "•Revenue: $52.9 million, up 10% in constant currency;\n" + "•Operating income: $22.4 billion, up 15% in constant currency;\n" + "•Net income: $18.3 billion, up 14% in constant currency;\n" + "•Diluted earnings per share: $2.45 billion, up 14% in constant currency.", + "context": "Microsoft Cloud Strength Drives Third Quarter Results \nREDMOND, Wash. — April 25, 2023 — " + "Microsoft Corp. today announced the following results for the quarter ended March 31, 2023," + " as compared to the corresponding period of last fiscal year:\n· Revenue was $52.9 billion" + " and increased 7% (up 10% in constant currency)\n· Operating income was $22.4 billion " + "and increased 10% (up 15% in constant currency)\n· Net income was $18.3 billion and " + "increased 9% (up 14% in constant currency)\n· Diluted earnings per share was $2.45 " + "and increased 10% (up 14% in constant currency).\n"}, + + {"query": "What is a list of the key points?", + "answer": "•Stocks rallied on Friday with stronger-than-expected U.S jobs data and increase in " + "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose " + "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n" + "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.", + "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data " + "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, " + "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy " + "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in " + "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 " + "jobs. However, wages rose less than expected last month. Stocks posted a stunning " + "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. " + "At its session low, the Dow had fallen as much as 198 points; it surged by more than " + "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during " + "their lowest points in the day. Traders were unclear of the reason for the intraday " + "reversal. Some noted it could be the softer wage number in the jobs report that made " + "investors rethink their earlier bearish stance. Others noted the pullback in yields from " + "the day’s highs. Part of the rally may just be to do a market that had gotten extremely " + "oversold with the S&P 500 at one point this week down more than 9% from its high earlier " + "this year. Yields initially surged after the report, with the 10-year Treasury rate trading " + "near its highest level in 14 years. The benchmark rate later eased from those levels, but " + "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back " + "in yields from where we were around 4.8%. 
[With] them pulling back a bit, I think that’s "
+                "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries "
+                "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially "
+                "some oversold conditions.'"}
+
+    ]
+
+    return test_list
+
+
+def bling_meets_llmware_hello_world(model_name):
+
+    """ Simple inference loop that loads a model and runs through a series of test questions. """
+
+    t0 = time.time()
+    test_list = hello_world_questions()
+
+    print(f"\n > Loading Model: {model_name}...")
+
+    prompter = Prompt().load_model(model_name)
+
+    t1 = time.time()
+    print(f"\n > Model {model_name} load time: {t1-t0} seconds")
+
+    for i, entries in enumerate(test_list):
+        print(f"\n{i+1}. Query: {entries['query']}")
+
+        # run the prompt
+        output = prompter.prompt_main(entries["query"], context=entries["context"],
+                                      prompt_name="default_with_context", temperature=0.30)
+
+        llm_response = output["llm_response"].strip("\n")
+        print(f"LLM Response: {llm_response}")
+        print(f"Gold Answer: {entries['answer']}")
+        print(f"LLM Usage: {output['usage']}")
+
+    t2 = time.time()
+    print(f"\nTotal processing time: {t2-t1} seconds")
+
+    return 0
+
+
+if __name__ == "__main__":
+
+    # list of 'rag-instruct' laptop-ready bling models on HuggingFace
+
+    model_list = ["llmware/bling-1b-0.1",
+                  "llmware/bling-tiny-llama-v0",
+                  "llmware/bling-1.4b-0.1",
+                  "llmware/bling-falcon-1b-0.1",
+                  "llmware/bling-cerebras-1.3b-0.1",
+                  "llmware/bling-sheared-llama-1.3b-0.1",
+                  "llmware/bling-sheared-llama-2.7b-0.1",
+                  "llmware/bling-red-pajamas-3b-0.1",
+                  "llmware/bling-stable-lm-3b-4e1t-v0",
+                  "llmware/bling-phi-3",
+
+                  # use GGUF models too
+                  "bling-phi-3-gguf",        # quantized bling-phi-3
+                  "bling-answer-tool",       # quantized bling-tiny-llama
+                  "bling-stablelm-3b-tool"   # quantized bling-stablelm-3b
+                  ]
+
+    # try the newest bling model - 'tiny-llama'
+    bling_meets_llmware_hello_world(model_list[1])
+```
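+
+To run a single ad hoc question outside of the test loop, the same `Prompt().load_model` / `prompt_main`
+pattern works on its own - a minimal sketch, with an illustrative context string:
+
+```python
+
+# minimal sketch - one ad hoc question with a quantized BLING model, using the
+# same Prompt().load_model / prompt_main pattern as the example above
+# (the context string below is an illustrative placeholder)
+
+from llmware.prompts import Prompt
+
+prompter = Prompt().load_model("bling-answer-tool")
+
+output = prompter.prompt_main("What is the total amount of the invoice?",
+                              context="Invoice #0001 - Total Amount Due: $22,500.00",
+                              prompt_name="default_with_context", temperature=0.30)
+
+print("llm response: ", output["llm_response"])
+```
+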
+
+For more examples, see the [models examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Models/) in the main repo.
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+- {{ contributor.login }}
+{% endfor %}
+
+---
+
diff --git a/docs/examples/notebooks.md b/docs/examples/notebooks.md
new file mode 100644
index 00000000..d534ee00
--- /dev/null
+++ b/docs/examples/notebooks.md
@@ -0,0 +1,41 @@
+---
+layout: default
+title: Notebooks
+parent: examples
+nav_order: 11
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/notebooks
+---
+# Notebooks - Introduction by Examples
+We introduce ``llmware`` through self-contained examples.
+
+# Understanding Google Colab and Jupyter Notebooks
+
+Welcome to our project documentation! A common point of confusion among developers new to data science and machine learning workflows is the relationship and differences between Google Colab and Jupyter Notebooks. This README aims to clarify these points so that everyone is on the same page.
+
+## What are Jupyter Notebooks?
+
+Jupyter Notebook is an open-source web application that lets you create and share documents that contain live code, equations, visualizations, and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
+
+## What is Google Colab?
+
+Google Colab (or Colaboratory) is a free Jupyter notebook environment that requires no setup and runs in the cloud. It offers a similar interface to Jupyter Notebooks and lets users write and execute Python in a web browser. Google Colab also provides free access to computing resources, including GPUs and TPUs, making it highly popular for machine learning and data analysis projects.
+
+## Key Similarities
+
+- **Interface:** Both platforms use the Jupyter Notebook interface, which supports mixing executable code, equations, visualizations, and narrative text in a single document.
+- **Language Support:** Primarily, both are used for executing Python code. However, Jupyter Notebooks support other languages such as R and Julia.
+- **Use Cases:** They are widely used for data analysis, machine learning, and education, allowing for easy sharing of results and methodologies.
+
+## Key Differences
+
+- **Execution Environment:** Jupyter Notebooks can be run locally on your machine or on a server, while Google Colab is hosted in the cloud.
+- **Access to Resources:** Google Colab provides free access to hardware accelerators (GPUs and TPUs), which is not inherently available in Jupyter Notebooks unless specifically set up by the user on their own servers.
+- **Collaboration:** Google Colab offers easier collaboration features, similar to Google Docs, letting multiple users work on the same notebook simultaneously.
+
+## Conclusion
+
+While Google Colab and Jupyter Notebooks might seem different, they are built on the same idea and offer similar functionality, with a few distinctions mainly in execution environment and access to computing resources. Understanding these platforms' capabilities can significantly enhance your data science and machine learning projects.
+
+We hope this guide has helped clarify the similarities and differences between Google Colab and Jupyter Notebooks. Happy coding!
+
diff --git a/docs/examples/parsing.md b/docs/examples/parsing.md
new file mode 100644
index 00000000..4cca6b65
--- /dev/null
+++ b/docs/examples/parsing.md
@@ -0,0 +1,69 @@
+---
+layout: default
+title: Parsing
+parent: examples
+nav_order: 4
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/parsing
+---
+# Parsing - Introduction by Examples
+We introduce ``llmware`` through self-contained examples.
+
+
+🚀 Parsing Examples 🚀
+===============
+
+**Parsing is the Humble Hero of Good RAG Pipelines**
+
+LLMWare supports parsing of a wide range of unstructured content types, and views parsing, text chunking and indexing as the first step in the pipeline. Like any pipeline, care and attention to getting "great input" is usually the key to "great output."
+
+Here, we show several key features of parsing with llmware:
+
+
+**Parsing PDFs like a Pro**
+
+- Configuring text chunking and extraction parameters - [**PDF Configuration**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_parser_new_configs.py)
+
+- PDF Table extraction - [**PDF Table**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/pdf_table_extraction.py)
+
+- Fallback to OCR - [**PDF-by-OCR**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_pdf_by_ocr.py)
+
+
+**Parsing Office Documents (Powerpoints, Word, Excel)**
+
+- Configuring text chunking and extraction parameters - [**Office Configuration**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/office_parser_new_configs.py)
+
+- Handling ZIPs and mixed file types - [**Microsoft IR Documents**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parsing_microsoft_ir_docs.py)
+
+- Running OCR on Images Extracted - [**OCR Embedded Doc Images**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/ocr_embedded_doc_images.py)
+
+
+**Parsing without a Database**
+
+- Parse in Memory - [**Parse in Memory**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_in_memory.py)
+
+- Parse directly into a Prompt - [**Parse in Prompt**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_into_prompt.py)
+
+- Parse to JSON file - [**Parse to JSON**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_to_json.py)
+
+
+**Other Content Types**
+
+- Custom CSV - [**Custom CSV files**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_csv_custom.py)
+
+- Custom JSON - [**Custom JSON files**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_jsonl_custom.py)
+
+- Images - [**OCR on Images**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_images.py)
+
+- Web/HTML - [**Website Extraction**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/parse_web_sources_in_memory.py)
+
+- Voice (WAV) - in Use_Cases - [**Parsing Great Speeches**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/parsing_great_speeches.py)
+
+For more examples, see the [parsing examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Parsing/) in the main repo.
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+### **Let's get started! 🚀**
+
+
diff --git a/docs/examples/prompts.md b/docs/examples/prompts.md
new file mode 100644
index 00000000..7d7b0f5c
--- /dev/null
+++ b/docs/examples/prompts.md
@@ -0,0 +1,263 @@
+---
+layout: default
+title: Prompts
+parent: examples
+nav_order: 6
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/prompts
+---
+# Prompts - Introduction by Examples
+We introduce ``llmware`` through self-contained examples.
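+
+As a warm-up before the end-to-end scenarios below, here is a minimal sketch of the pattern they all build on - loading a model into a Prompt and running a context-grounded inference. The calls follow the Model Catalog section of the overview; the context string is an abbreviated stand-in:
+
+```python
+# Minimal prompt sketch - the context string is an abbreviated stand-in
+from llmware.prompts import Prompt
+
+prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
+
+response = prompter.prompt_main("What is the total amount of the invoice?",
+                                context="Services Vendor Inc. ... Total Amount $22,500.00",
+                                prompt_name="default_with_context")
+
+print("llm response: ", response["llm_response"])
+```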
+
+# Basic RAG Scenario - Invoice Processing
+
+```python
+
+""" This example shows an end-to-end scenario for invoice processing that can be run locally and without a
+database. The example shows how to combine parsing with prompt_with_source to rapidly
+iterate through a batch of invoices and ask a set of questions, and then save the full output to both
+(1) .jsonl for integration into an upstream application/database and (2) a CSV for human review in Excel.
+
+    note: the sample code pulls from a public repo to load the sample invoice documents the first time -
+    please feel free to substitute with your own invoice documents (PDF/DOCX/PPTX/XLSX/CSV/TXT) if you prefer.
+
+    this example does not require a database or embedding
+
+    this example can be run locally on a laptop by setting 'run_on_cpu=True'
+    if 'run_on_cpu=False', then please see the example 'launch_llmware_inference_server.py'
+    to configure and set up a 'pop-up' GPU inference server in just a few minutes
+"""
+
+import os
+import re
+
+from llmware.prompts import Prompt, HumanInTheLoop
+from llmware.configs import LLMWareConfig
+from llmware.setup import Setup
+from llmware.models import ModelCatalog
+
+
+def invoice_processing(run_on_cpu=True):
+
+    # Step 1 - Pull down the sample files from S3 through the .load_sample_files() command
+    # --note: if you need to refresh the sample files, set 'over_write=True'
+    print("update: Downloading Sample Files")
+
+    sample_files_path = Setup().load_sample_files(over_write=False)
+    invoices_path = os.path.join(sample_files_path, "Invoices")
+
+    # Step 2 - simple sample query list - each question will be asked to each invoice
+    query_list = ["What is the total amount of the invoice?",
+                  "What is the invoice number?",
+                  "What are the names of the two parties?"]
+
+    # Step 3 - Load Model
+
+    if run_on_cpu:
+
+        # load local bling model that can run on cpu/laptop
+
+        # note: bling-1b-0.1 is the *fastest* & *smallest*, but will make more errors than larger BLING models
+        # model_name = "llmware/bling-1b-0.1"
+
+        # try the new bling-phi-3 quantized with gguf - most accurate
+        model_name = 'bling-phi-3-gguf'
+    else:
+
+        # use GPU-based inference server to process
+        # *** see the launch_llmware_inference_server.py example script to setup ***
+
+        server_uri_string = "http://11.123.456.789:8088"  # insert your server_uri_string
+        server_secret_key = "demo-test"
+        ModelCatalog().setup_custom_llmware_inference_server(server_uri_string, secret_key=server_secret_key)
+        model_name = "llmware-inference-server"
+
+    # attach inference server to prompt object
+    prompter = Prompt().load_model(model_name)
+
+    # Step 4 - main loop thru folder of invoices
+
+    for i, invoice in enumerate(os.listdir(invoices_path)):
+
+        # just in case (legacy on mac os file system - not needed on linux or windows)
+        if invoice != ".DS_Store":
+
+            print("\nAnalyzing invoice: ", str(i + 1), invoice)
+
+            for question in query_list:
+
+                # Step 4A - parses the invoices in memory and attaches as a source to the Prompt
+                source = prompter.add_source_document(invoices_path, invoice)
+
+                # Step 4B - executes the prompt on the LLM (with the loaded source)
+                output = prompter.prompt_with_source(question, prompt_name="default_with_context")
+
+                for response in output:
+                    print("LLM Response - ", question, " - ", re.sub(r"[\n]", " ", response["llm_response"]))
+
+                prompter.clear_source_materials()
+
+    # Save jsonl report with full transaction history to /prompt_history folder
+    print("\nupdate: prompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(), prompter.prompt_id))
+
+    prompter.save_state()
+
+    # Generate CSV report for easy Human review in Excel
+    csv_output = HumanInTheLoop(prompter).export_current_interaction_to_csv()
+
+    print("\nupdate: csv output for human review - ", csv_output)
+
+    return 0
+
+
+if __name__ == "__main__":
+
+    invoice_processing(run_on_cpu=True)
+```
+
+# Document Summarizer
+
+```python
+
+""" This example shows a packaged 'document_summarizer' prompt using the slim-summary-tool. It shows a variety of
+techniques to summarize documents generally larger than a LLM context window, and how to assemble multiple source
+batches from the document, as well as using a 'query' and 'topic' to focus on specific segments of the document. """
+
+import os
+
+from llmware.prompts import Prompt
+from llmware.setup import Setup
+
+
+def test_summarize_document(example="jd salinger"):
+
+    # pull a sample document (or substitute a file_path and file_name of your own)
+    sample_files_path = Setup().load_sample_files(over_write=False)
+
+    topic = None
+    query = None
+    fp = None
+    fn = None
+
+    if example not in ["jd salinger", "employment terms", "just the comp", "un resolutions"]:
+        print("example not found")
+        return []
+
+    if example == "jd salinger":
+        fp = os.path.join(sample_files_path, "SmallLibrary")
+        fn = "Jd-Salinger-Biography.docx"
+        topic = "jd salinger"
+        query = None
+
+    if example == "employment terms":
+        fp = os.path.join(sample_files_path, "Agreements")
+        fn = "Athena EXECUTIVE EMPLOYMENT AGREEMENT.pdf"
+        topic = "executive compensation terms"
+        query = None
+
+    if example == "just the comp":
+        fp = os.path.join(sample_files_path, "Agreements")
+        fn = "Athena EXECUTIVE EMPLOYMENT AGREEMENT.pdf"
+        topic = "executive compensation terms"
+        query = "base salary"
+
+    if example == "un resolutions":
+        fp = os.path.join(sample_files_path, "SmallLibrary")
+        fn = "N2126108.pdf"
+        # fn = "N2137825.pdf"
+        topic = "key points"
+        query = None
+
+    # optional parameters:  'query' - will select among blocks with the query term
+    #                       'topic' - will pass a topic/issue as the parameter to the model to 'focus' the summary
+    #                       'max_batch_cap' - caps the number of batches sent to the model
+    #                       'text_only' - returns just the summary text aggregated
+
+    kp = Prompt().summarize_document_fc(fp, fn, topic=topic, query=query, text_only=True, max_batch_cap=15)
+
+    print(f"\nDocument summary completed - {len(kp)} Points")
+    for i, points in enumerate(kp):
+        print(i, points)
+
+    return 0
+
+
+if __name__ == "__main__":
+
+    print(f"\nExample: Summarize Documents\n")
+
+    # 4 examples - ["jd salinger", "employment terms", "just the comp", "un resolutions"]
+    # -- "jd salinger" - summarizes key points about jd salinger from short biography document
+    # -- "employment terms" - summarizes the executive compensation terms across a 15-page document
+    # -- "just the comp" - queries to find a subset of the document and then summarizes the key terms
+    # -- "un resolutions" - summarizes the un resolutions document
+
+    summary_direct = test_summarize_document(example="employment terms")
+```
+
+For more examples, see the [prompt examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Prompts/) in the main repo.
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
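+
+Related to the invoice scenario above: outputs of `prompt_with_source` can also be fact-checked against the attached sources after inference. `evidence_check_sources` appears in the overview section of these docs; the file name below is hypothetical, and the exact shape of the returned review fields should be treated as an assumption:
+
+```python
+# Post-inference fact-checking sketch - the invoice file name is hypothetical,
+# and the "source_review" key is an assumption about the returned dict
+from llmware.prompts import Prompt
+
+prompter = Prompt().load_model("bling-phi-3-gguf")
+
+prompter.add_source_document("/path/to/invoices/", "invoice_001.pdf")
+responses = prompter.prompt_with_source("What is the total amount of the invoice?")
+
+# compare each response against the source evidence attached to the prompt
+fact_check = prompter.evidence_check_sources(responses)
+
+for r in fact_check:
+    print("llm response: ", r["llm_response"])
+    print("source review: ", r.get("source_review", "n/a"))
+
+prompter.clear_source_materials()
+```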
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+  {{ contributor.login }}
+{% endfor %}
+
+---
+--- + diff --git a/docs/examples/retrieval.md b/docs/examples/retrieval.md new file mode 100644 index 00000000..45fbfba7 --- /dev/null +++ b/docs/examples/retrieval.md @@ -0,0 +1,167 @@ +--- +layout: default +title: Retrieval +parent: examples +nav_order: 7 +description: overview of the major modules and classes of LLMWare +permalink: /examples/retrieval +--- +# Retrieval - Introduction by Examples +We introduce ``llmware`` through self-contained examples. + +# SEMANTIC Retrieval Example + +```python + +""" +This 'getting started' example demonstrates how to use basic semantic retrieval with the Query class + 1. Create a sample library + 2. Run a basic semantic query + 3. View the results +""" + +import os +from llmware.library import Library +from llmware.retrieval import Query +from llmware.setup import Setup +from llmware.configs import LLMWareConfig + + +def create_fin_docs_sample_library(library_name): + + print(f"update: creating library - {library_name}") + + library = Library().create_new_library(library_name) + sample_files_path = Setup().load_sample_files(over_write=False) + ingestion_folder_path = os.path.join(sample_files_path, "FinDocs") + parsing_output = library.add_files(ingestion_folder_path) + + print(f"update: building embeddings - may take a few minutes the first time") + + # note: if you have installed Milvus or another vector DB, please feel free to substitute + # note: if you have any memory constraints on laptop: + # (1) reduce embedding batch_size or ... + # (2) substitute "mini-lm-sbert" as embedding model + + library.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb",batch_size=200) + + return library + + +def basic_semantic_retrieval_example (library): + + # Create a Query instance + q = Query(library) + + # Set the keys that should be returned - optional - full set of keys will be returned by default + q.query_result_return_keys = ["distance","file_source", "page_num", "text"] + + # perform a simple query + my_query = "ESG initiatives" + query_results1 = q.semantic_query(my_query, result_count=20) + + # Iterate through query_results, which is a list of result dicts + print(f"\nQuery 1 - {my_query}") + for i, result in enumerate(query_results1): + print("results - ", i, result) + + # perform another query + my_query2 = "stock performance" + query_results2 = q.semantic_query(my_query2, result_count=10) + + print(f"\nQuery 2 - {my_query2}") + for i, result in enumerate(query_results2): + print("results - ", i, result) + + # perform another query + my_query3 = "cloud computing" + + # note: use of embedding_distance_threshold will cap results with distance < 1.0 + query_results3 = q.semantic_query(my_query3, result_count=50, embedding_distance_threshold=1.0) + + print(f"\nQuery 3 - {my_query3}") + for i, result in enumerate(query_results3): + print("result - ", i, result) + + return [query_results1, query_results2, query_results3] + + +if __name__ == "__main__": + + print(f"Example - Running a Basic Semantic Query") + + LLMWareConfig().set_active_db("sqlite") + + # step 1- will create library + embeddings with Financial Docs + lib = create_fin_docs_sample_library("lib_semantic_query_1") + + # step 2- run query against the library and embeddings + my_results = basic_semantic_retrieval_example(lib) +``` + +For more examples, see the [retrieval examples]((https://www.github.com/llmware-ai/llmware/tree/main/examples/Retrieval/) in the main repo. 
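+
+As a complement to the semantic example above, keyword-based retrieval uses the same Query object - `text_query` and the result keys below follow the overview section and the example above, reusing the library built there:
+
+```python
+# Keyword/text retrieval sketch - reuses the library built in the example above
+from llmware.library import Library
+from llmware.retrieval import Query
+
+lib = Library().load_library("lib_semantic_query_1")
+q = Query(lib)
+
+# exact_mode=True restricts results to chunks that contain the query terms
+results = q.text_query("ESG initiatives", result_count=10, exact_mode=True)
+
+for i, result in enumerate(results):
+    print("text query result: ", i, result["file_source"], result["page_num"])
+```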
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+  {{ contributor.login }}
+{% endfor %}
+
+---
+--- + diff --git a/docs/examples/structured_tables.md b/docs/examples/structured_tables.md new file mode 100644 index 00000000..ec2d2a92 --- /dev/null +++ b/docs/examples/structured_tables.md @@ -0,0 +1,225 @@ +--- +layout: default +title: Structured Tables +parent: examples +nav_order: 9 +description: overview of the major modules and classes of LLMWare +permalink: /examples/structured_tables +--- +# Structured Tables - Introduction by Examples +We introduce ``llmware`` through self-contained examples. + +```python + +""" This example shows the basic recipe for creating a CustomTable with LLMWare and a few of the basic methods + to quickly get started. + + In this example, we will build a very simple 'hello world' Files table, which we will build upon in a future + example by aggregating a more interesting and useful set of attributes from a LLMWare Library collection. + + CustomTable is designed to work with the text collection databases supported by LLMWare: + + SQL DBs --- Postgres and SQLIte + NoSQL DB --- Mongo DB + + Even though Mongo does not require a schema for inserting and retrieving information, the CustomTable method + will expect a defined schema to be provided (good best practice, in any case). """ + +from llmware.resources import CustomTable + + +def hello_world_custom_table(): + + # simple schema for a table to track Files/Documents + # note: the schema is a python dictionary, with named keys, and the value corresponding to the data type + # for sqlite and postgres, any standard sql data type should generally work + + files_schema = {"custom_doc_num": "integer", + "file_name": "text", + "comments": "text"} + + # create a CustomTable object + db_name = "sqlite" + table_name = "files_table_1000" + ct = CustomTable(db=db_name,table_name=table_name, schema=files_schema) + + # insert a few sample rows - each row is a dictionary with keys from the schema, and the *actual* values + r1 = {"custom_doc_num": 1, "file_name": "technical_manual.pdf", "comments": "very useful overview"} + ct.write_new_record(r1) + + r2 = {"custom_doc_num": 2, "file_name": "work_presentation.pptx", "comments": "need to save for future reference"} + ct.write_new_record(r2) + + r3 = {"custom_doc_num": 3, "file_name": "dataset.json", "comments": "will use in next project"} + ct.write_new_record(r3) + + # to see the entries - pull all items from the table + all_results = ct.get_all() + + print("\nTEST #1 - Retrieving All Elements") + for i, res in enumerate(all_results): + print("results: ", i, res) + + # look at the database schema + schema = ct.get_schema() + + print("\nTEST #2 - Getting the Table Schema") + print("schema: ", schema) + + schema_str = ct.sql_table_create_string() + + print("table create sql: ", schema_str) + + # perform a basic lookup with 'key' and 'value' + f = ct.lookup("custom_doc_num", 2) + + print("\nTEST #3 - Basic Lookup - 'custom_doc_num' = 2") + print("lookup: ", f) + + # if you prefer SQL, pass a SQL query directly (note: this will only work on Postgres and SQLite) + + if db_name == "sqlite": + + # note: our standard 'unpacking' of a row of sqlite includes the rowid attribute + custom_query = f"SELECT rowid, * FROM {table_name} WHERE custom_doc_num = 3;" + + elif db_name == "postgres": + custom_query = f"SELECT * FROM {table_name} WHERE custom_doc_num = 3;" + + elif db_name == "mongo": + custom_query = {"custom_doc_num": 3} + else: + print("must use either sqlite, postgres or mongo") + return -1 + + cf = ct.custom_lookup(custom_query) + + print("\nTEST #4 - Custom SQL Lookup - 
'custom_doc_num' = 3")
+    print("custom query lookup: ", cf)
+
+    print("\nTEST #5 - Making Updates and Deletes")
+
+    # to delete a record
+    ct.delete_record("custom_doc_num", 1)
+    print("deleted record")
+
+    # to update the values of a record
+    ct.update_record({"custom_doc_num": 2}, "file_name", "work_presentation_update_v2.pptx")
+    print("updated record")
+
+    updated_all_results = ct.get_all()
+
+    for i, res in enumerate(updated_all_results):
+        print("updated results: ", i, res)
+
+    print("\nTEST #6 - Delete Table - uncomment and set confirm=True")
+    # done? delete the table and start over
+    # -- note: confirm=True must be set
+    # ct.delete_table(confirm=False)
+
+    # look at all tables in the database
+    tables = ct.list_all_tables()
+
+    print("\nTEST #7 - View all of the tables on the DB")
+    for i, t in enumerate(tables):
+        print("tables: ", i, t)
+
+    return 0
+
+
+if __name__ == "__main__":
+
+    hello_world_custom_table()
+```
+
+
+These examples illustrate the use of the CustomTable class to quickly create SQL tables that can be used in conjunction with LLM-based workflows.
+
+1. [**Intro to CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/create_custom_table-1.py)
+
+   - Getting started with using CustomTables
+
+2. [**Loading CSV into CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_csv_into_custom_table-2a.py)
+
+   - Loading CSV into CustomTables
+
+3. [**Loading CSV into Library (Configured)**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_csv_w_config_options-2b.py)
+
+   - Loading CSV into Library
+
+4. [**Loading JSON into CustomTables**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_json_custom_table-3a.py)
+
+   - Loading JSON into CustomTable database
+
+5. [**Loading JSON into Library (Configured)**](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/loading_json_w_config_options-3b.py)
+
+   - Loading JSON into a library with configuration
+
+
+For more examples, see the [structured tables examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Structured_Tables/) in the main repo.
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+  {{ contributor.login }}
+{% endfor %}
+
+---
+---

diff --git a/docs/examples/ui.md b/docs/examples/ui.md
new file mode 100644
index 00000000..587a6fbd
--- /dev/null
+++ b/docs/examples/ui.md
@@ -0,0 +1,101 @@
+---
+layout: default
+title: UI
+parent: examples
+nav_order: 8
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/ui
+---
+# UI - Introduction by Examples
+We introduce ``llmware`` through self-contained examples.
+
+**UI Scenarios**
+
+We provide several 'UI' examples that show how to use LLMWare in a complex recipe combining different elements to accomplish a specific objective. While each example is still high-level, it is shared in the spirit of providing a framework 'starting point' that can be developed in more detail for a variety of common use cases. All of these examples use small, specialized models, running locally - 'Small, but Mighty'!
+
+
+1. [**GGUF Streaming Chatbot**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/gguf_streaming_chatbot.py)
+
+   - Locally deployed chatbot using leading open source chat models, including Phi-3-GGUF
+   - Uses Streamlit
+   - Core simple framework of ~20 lines using llmware and Streamlit
+
+2. [**Simple RAG UI with Streamlit**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/simple_rag_ui_with_streamlit.py)
+
+   - Simple RAG UI
+
+3. [**RAG UI with Query Topic with Streamlit**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/rag_ui_with_query_topic_with_streamlit.py)
+
+   - UI demonstrating a query topic filter in a RAG scenario
+
+4. [**Using Streamlit Chat UI**](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/using_streamlit_chat_ui.py)
+
+   - Basic Streamlit Chat UI
+
+
+For more examples, see the [UI examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/UI/) in the main repo.
+
+Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.
+
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+  {{ contributor.login }}
+{% endfor %}
+
+---
+---

diff --git a/docs/use_cases.md b/docs/examples/use_cases.md
similarity index 95%
rename from docs/use_cases.md
rename to docs/examples/use_cases.md
index f410712b..0b4287d1 100644
--- a/docs/use_cases.md
+++ b/docs/examples/use_cases.md
@@ -1,9 +1,10 @@
 ---
 layout: default
-title: Use Cases
-nav_order: 4
-description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
-permalink: /use_cases
+title: Use Cases
+parent: examples
+nav_order: 1
+description: overview of the major modules and classes of LLMWare
+permalink: /examples/use_cases
 ---
 🚀 Use Cases Examples 🚀
 ---
@@ -67,7 +68,9 @@ We provide several 'end-to-end' examples that show how to use LLMWare in a compl
   - Shows a variety of advanced parsing techniques with Office document formats packaged in ZIP archives
   - Extracts tables and images, runs OCR against the embedded images, exports the whole library, and creates dataset
-
+
+For more examples, see the [use cases examples](https://www.github.com/llmware-ai/llmware/tree/main/examples/Use_Cases/) in the main repo.
+
 Check back often - we are updating these examples regularly - and many of these examples have companion videos as well.

diff --git a/docs/getting_started/.DS_Store b/docs/getting_started/.DS_Store
new file mode 100644
index 00000000..5008ddfc
Binary files /dev/null and b/docs/getting_started/.DS_Store differ
diff --git a/docs/getting_started/clone_repo.md b/docs/getting_started/clone_repo.md
new file mode 100644
index 00000000..a2649d68
--- /dev/null
+++ b/docs/getting_started/clone_repo.md
@@ -0,0 +1,92 @@
+---
+layout: default
+title: Clone Repo
+parent: Getting Started
+nav_order: 3
+permalink: /getting_started/clone_repo
+---
+
+## ✍️ Working with the llmware GitHub repository
+
+The llmware repo can be pulled locally to get access to all the examples, or to work directly with the latest version of the llmware code.
+
+```bash
+git clone git@github.com:llmware-ai/llmware.git
+```
+
+We have provided a **welcome_to_llmware** automation script in the root of the repository folder. After cloning:
+- On Windows command line: `.\welcome_to_llmware_windows.sh`
+- On Mac / Linux command line: `sh ./welcome_to_llmware.sh`
+
+Alternatively, if you prefer to complete setup without the welcome automation script, then the next steps include:
+
+1. **install requirements.txt** - inside the /llmware path - e.g., ```pip3 install -r llmware/requirements.txt```
+
+2. **install requirements_extras.txt** - inside the /llmware path - e.g., ```pip3 install -r llmware/requirements_extras.txt``` (Depending upon your use case, you may not need all or any of these installs, but some of these will be used in the examples.)
+
+3. **run examples** - copy one or more of the example .py files into the root project path. (We have seen several IDEs that will attempt to run interactively from the nested /example path, and then not have access to the /llmware module - the easy fix is to just copy the example you want to run into the root path.)
+
+4. **install vector db** - no-install vector db options include milvus lite, chromadb, faiss and lancedb - which do not require a server install, but do require that you install the python sdk library for that vector db, e.g., `pip3 install pymilvus`, or `pip3 install chromadb`.
+If you look in [examples/Embedding](https://github.com/llmware-ai/llmware/tree/main/examples/Embedding), you will see examples for getting started with various vector DBs, and in the root of the repo, you will see easy-to-get-started docker compose scripts for installing milvus, postgres/pgvector, mongo, qdrant, neo4j, and redis.
+
+5. Note: we have recently seen issues with Pytorch==2.3 on some platforms - if you run into any issues, uninstalling Pytorch and downleveling to Pytorch==2.1 usually solves the problem.
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+  {{ contributor.login }}
+{% endfor %}
+
+---
+--- diff --git a/docs/fast_start.md b/docs/getting_started/fast_start.md similarity index 96% rename from docs/fast_start.md rename to docs/getting_started/fast_start.md index dec072f4..905c9392 100644 --- a/docs/fast_start.md +++ b/docs/getting_started/fast_start.md @@ -1,10 +1,11 @@ --- layout: default -title: Fast Start Series -nav_order: 3 -description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. -permalink: /fast_start +title: Fast Start +parent: Getting Started +nav_order: 4 +permalink: /getting_started/fast_start --- + Fast Start: Learning RAG with llmware through 6 examples --- diff --git a/docs/index.md b/docs/getting_started/getting_started.md similarity index 97% rename from docs/index.md rename to docs/getting_started/getting_started.md index c3ae186b..b22656aa 100644 --- a/docs/index.md +++ b/docs/getting_started/getting_started.md @@ -1,10 +1,12 @@ --- layout: default -title: Home | llmware +title: Getting Started nav_order: 1 -description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. -permalink: / +has_children: true +description: getting started with llmware +permalink: /getting_started --- + ## Welcome to
@@ -161,3 +163,4 @@ The company offers a Software as a Service (SaaS) Retrieval Augmented Generation
 ---
+

diff --git a/docs/getting_started/installation.md b/docs/getting_started/installation.md
new file mode 100644
index 00000000..87304d6b
--- /dev/null
+++ b/docs/getting_started/installation.md
@@ -0,0 +1,119 @@
+---
+layout: default
+title: Installation
+parent: Getting Started
+nav_order: 2
+permalink: /getting_started/installation
+---
+
+## Installation
+
+Set up
+
+`pip3 install llmware` or, if you prefer, clone the GitHub repo locally, e.g., `git clone git@github.com:llmware-ai/llmware.git`.
+
+Platforms:
+- Mac M1/M2/M3, Windows, Linux (Ubuntu 20 or Ubuntu 22 preferred)
+- RAM: 16 GB minimum
+- Python 3.9, 3.10, 3.11 (note: not supported on 3.12 - coming soon!)
+- Pull the latest version of llmware == 0.2.11 (as of end of April 2024)
+- Please note that we have updated the examples from the original versions, to use new features in llmware, so there may be minor differences with the videos, which are annotated in the comments in each example.
+
+
+## Wheel Archive
+
+- If you prefer, we also provide a set of recent wheels in the [wheel archives](https://www.github.com/llmware-ai/llmware/tree/main/wheel_archives) in this repository, which can be downloaded individually and used as follows:
+
+```bash
+pip3 install llmware-0.2.12-py3-none-any.whl
+```
+
+- We generally keep the main branch of this repository current with all changes, but we only publish new wheels to PyPI approximately once per week.
+
+___
+
+**Cloning the Repository**
+
+- If you prefer to clone the repository:
+
+```bash
+git clone git@github.com:llmware-ai/llmware.git
+```
+
+- The llmware package is contained entirely in the /llmware folder path, so you should be able to drop this folder (with all of its contents) into a project tree, and use the llmware module essentially the same as a pip install.
+
+- Please ensure that you are capturing and updating the /llmware/lib folder, which includes required compiled shared libraries. If you prefer, you can keep only those libs required for your OS platform.
+
+- After cloning the repo, we provide a short 'welcome to llmware' automation script, which can be used to install the project's requirements (from llmware/requirements.txt), install several optional dependencies that are commonly used in examples, copy several good 'getting started' examples into the root folder, and then run a 'welcome_example.py' script to get started using our models. To use the "welcome to llmware" script:
+
+Windows:
+```bash
+.\welcome_to_llmware_windows.sh
+```
+
+Mac/Linux:
+```bash
+sh ./welcome_to_llmware.sh
+```
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
+{% for contributor in site.github.contributors %}
+  {{ contributor.login }}
+{% endfor %}
+
+---
+--- diff --git a/docs/getting_started/overview.md b/docs/getting_started/overview.md new file mode 100644 index 00000000..be82cc88 --- /dev/null +++ b/docs/getting_started/overview.md @@ -0,0 +1,653 @@ +--- +layout: default +title: Overview +parent: Getting Started +nav_order: 1 +permalink: /getting_started/overview +--- + +## Welcome to +
+
+llmware
+
+
+## 🧰🛠️🔩 Building Enterprise RAG Pipelines with Small, Specialized Models
+
+`llmware` provides a unified framework for building LLM-based applications (e.g., RAG, Agents), using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process.
+
+ `llmware` has two main components:
+
+ 1.  **RAG Pipeline** - integrated components for the full lifecycle of connecting knowledge sources to generative AI models; and
+
+ 2.  **50+ small, specialized models** fine-tuned for key tasks in enterprise process automation, including fact-based question-answering, classification, summarization, and extraction.
+
+By bringing together both of these components, along with integrating leading open source models and underlying technologies, `llmware` offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications.
+
+Most of our examples can be run without a GPU server - get started right away on your laptop.
+
+## 🎯 Key features
+Writing code with `llmware` is based on a few main concepts:
+
+Model Catalog: Access all models the same way with easy lookup, regardless of underlying implementation. + + + +```python +# 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models +# Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models +# Easy to extend to add custom models - see examples + +from llmware.models import ModelCatalog +from llmware.prompts import Prompt + +# all models accessed through the ModelCatalog +models = ModelCatalog().list_all_models() + +# to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter +my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf") +output = my_model.inference("what is the future of AI?", add_context="Here is the article to read") + +# to integrate model into a Prompt +prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") +response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information") +``` + +
+ +
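+
+A usage note on the catalog pattern: the GGUF chat models also expose token streaming, which is the basis of the GGUF streaming chatbot in the UI examples. A minimal sketch - treat the `stream` generator and the sampling kwargs below as assumptions, not confirmed signatures:
+
+```python
+# Streaming sketch - stream() is used by the GGUF streaming chatbot UI example;
+# the exact signature and the load_model kwargs here are assumptions
+from llmware.models import ModelCatalog
+
+model = ModelCatalog().load_model("bling-phi-3-gguf", temperature=0.3, max_output=200)
+
+# tokens arrive incrementally - useful for chat-style UIs
+for token in model.stream("What is the future of AI?"):
+    print(token, end="")
+```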
+Library: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed. + +```python + +from llmware.library import Library + +# to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html) + +# step 1 - create a library, which is the 'knowledge-base container' construct +# - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name}) +# - embeddings and queries are run against a library + +lib = Library().create_new_library("my_library") + +# step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types +# - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB + +lib.add_files("/folder/path/to/my/files") + +# to install an embedding on a library - pick an embedding model and vector_db +lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500) + +# to add a second embedding to the same library (mix-and-match models + vector db) +lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100) + +# easy to create multiple libraries for different projects and groups + +finance_lib = Library().create_new_library("finance_q4_2023") +finance_lib.add_files("/finance_folder/") + +hr_lib = Library().create_new_library("hr_policies") +hr_lib.add_files("/hr_folder/") + +# pull library card with key metadata - documents, text chunks, images, tables, embedding record +lib_card = Library().get_library_card("my_library") + +# see all libraries +all_my_libs = Library().get_all_library_cards() + +``` +
+ +
+Query: query libraries with mix of text, semantic, hybrid, metadata, and custom filters. + +```python + +from llmware.retrieval import Query +from llmware.library import Library + +# step 1 - load the previously created library +lib = Library().load_library("my_library") + +# step 2 - create a query object and pass the library +q = Query(lib) + +# step 3 - run lots of different queries (many other options in the examples) + +# basic text query +results1 = q.text_query("text query", result_count=20, exact_mode=False) + +# semantic query +results2 = q.semantic_query("semantic query", result_count=10) + +# combining a text query restricted to only certain documents in the library and "exact" match to the query +results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True) + +# to apply a specific embedding (if multiple on library), pass the names when creating the query object +q2 = Query(lib, embedding_model_name="mini_lm_sbert", vector_db="milvus") +results4 = q2.semantic_query("new semantic query") +``` + +
+ +
+Prompt with Sources: the easiest way to combine knowledge retrieval with a LLM inference. + +```python + +from llmware.prompts import Prompt +from llmware.retrieval import Query +from llmware.library import Library + +# build a prompt +prompter = Prompt().load_model("llmware/bling-tiny-llama-v0") + +# add a file -> file is parsed, text chunked, filtered by query, and then packaged as model-ready context, +# including in batches, if needed, to fit the model context window + +source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query") + +# attach query results (from a Query) into a Prompt +my_lib = Library().load_library("my_library") +results = Query(my_lib).query("my query") +source2 = prompter.add_source_query_results(results) + +# run a new query against a library and load directly into a prompt +source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15) + +# to run inference with 'prompt with sources' +responses = prompter.prompt_with_source("my query") + +# to run fact-checks - post inference +fact_check = prompter.evidence_check_sources(responses) + +# to view source materials (batched 'model-ready' and attached to prompt) +source_materials = prompter.review_sources_summary() + +# to see the full prompt history +prompt_history = prompter.get_current_history() +``` + +
+ +
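+
+Prompt interactions can also be persisted for auditing - both calls below appear in the invoice processing example on the Prompts page; the context string is a placeholder:
+
+```python
+# Audit-trail sketch - save_state and the CSV export are used in the
+# invoice processing example; the context string here is a placeholder
+from llmware.prompts import Prompt, HumanInTheLoop
+
+prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
+prompter.prompt_main("what is the future of AI?", context="Insert Sources of information")
+
+# persist the full interaction history to the prompt_history folder
+prompter.save_state()
+
+# export the interaction to a CSV for human review in a spreadsheet
+csv_output = HumanInTheLoop(prompter).export_current_interaction_to_csv()
+print("csv output: ", csv_output)
+```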
+RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally. + +``` +""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both +Pytorch and GGUF versions. """ + +import time +from llmware.prompts import Prompt + + +def hello_world_questions(): + + test_list = [ + + {"query": "What is the total amount of the invoice?", + "answer": "$22,500.00", + "context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street " + "Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering" + " Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n" + "Make all checks payable to Services Vendor Inc. Payment is due within 30 days." + "If you have any questions concerning this invoice, contact Bia Hermes. " + "THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"}, + + {"query": "What was the amount of the trade surplus?", + "answer": "62.4 billion yen ($416.6 million)", + "context": "Japan’s September trade balance swings into surplus, surprising expectations" + "Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, " + "beating expectations from economists polled by Reuters for a trade deficit of 42.5 " + "billion yen. Data from Japan’s customs agency revealed that exports in September " + "increased 4.3% year on year, while imports slid 16.3% compared to the same period " + "last year. According to FactSet, exports to Asia fell for the ninth straight month, " + "which reflected ongoing China weakness. Exports were supported by shipments to " + "Western markets, FactSet added. — Lim Hui Jie"}, + + {"query": "When did the LISP machine market collapse?", + "answer": "1987.", + "context": "The attendees became the leaders of AI research in the 1960s." + " They and their students produced programs that the press described as 'astonishing': " + "computers were learning checkers strategies, solving word problems in algebra, " + "proving logical theorems and speaking English. By the middle of the 1960s, research in " + "the U.S. was heavily funded by the Department of Defense and laboratories had been " + "established around the world. Herbert Simon predicted, 'machines will be capable, " + "within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, " + "'within a generation ... the problem of creating 'artificial intelligence' will " + "substantially be solved'. They had, however, underestimated the difficulty of the problem. " + "Both the U.S. and British governments cut off exploratory research in response " + "to the criticism of Sir James Lighthill and ongoing pressure from the US Congress " + "to fund more productive projects. Minsky's and Papert's book Perceptrons was understood " + "as proving that artificial neural networks approach would never be useful for solving " + "real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period " + "when obtaining funding for AI projects was difficult, followed. In the early 1980s, " + "AI research was revived by the commercial success of expert systems, a form of AI " + "program that simulated the knowledge and analytical skills of human experts. By 1985, " + "the market for AI had reached over a billion dollars. At the same time, Japan's fifth " + "generation computer project inspired the U.S. 
and British governments to restore funding "
+                    "for academic research. However, beginning with the collapse of the Lisp Machine market "
+                    "in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."},
+
+        {"query": "What is the current rate on 10-year treasuries?",
+         "answer": "4.58%",
+         "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data "
+                    "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, "
+                    "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy "
+                    "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in "
+                    "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 "
+                    "jobs. However, wages rose less than expected last month. Stocks posted a stunning "
+                    "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. "
+                    "At its session low, the Dow had fallen as much as 198 points; it surged by more than "
+                    "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during "
+                    "their lowest points in the day. Traders were unclear of the reason for the intraday "
+                    "reversal. Some noted it could be the softer wage number in the jobs report that made "
+                    "investors rethink their earlier bearish stance. Others noted the pullback in yields from "
+                    "the day’s highs. Part of the rally may just be due to a market that had gotten extremely "
+                    "oversold, with the S&P 500 at one point this week down more than 9% from its high earlier "
+                    "this year. Yields initially surged after the report, with the 10-year Treasury rate trading "
+                    "near its highest level in 14 years. The benchmark rate later eased from those levels, but "
+                    "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back "
+                    "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s "
+                    "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries "
+                    "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially "
+                    "some oversold conditions.'"},
+
+        {"query": "Is the expected gross margin greater than 70%?",
+         "answer": "Yes, between 71.5% and 72.5%",
+         "context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows: "
+                    "Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP "
+                    "gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus "
+                    "50 basis points. GAAP and non-GAAP operating expenses are expected to be "
+                    "approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP "
+                    "other income and expense are expected to be an income of approximately $100 "
+                    "million, excluding gains and losses from non-affiliated investments. GAAP and "
+                    "non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items. "
+                    "Highlights NVIDIA achieved progress since its previous earnings announcement "
+                    "in these areas: Data Center Second-quarter revenue was a record $10.32 billion, "
+                    "up 141% from the previous quarter and up 171% from a year ago. Announced that the "
+                    "NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping "
+                    "this quarter, with a second-generation version with HBM3e memory expected to ship "
+                    "in Q2 of calendar 2024."},
+
+        {"query": "What is Bank of America's rating on Target?",
+         "answer": "Buy",
+         "context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from "
+                    "my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom "
+                    "of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index "
+                    "soared more than 22%. Hotter than expected September consumer price index, consumer "
+                    "inflation. The Social Security Administration announced a 3.2% cost-of-living "
+                    "adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. "
+                    "Cites consumer price index showing sticky retail inflation for the fourth time "
+                    "in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites "
+                    "risk/reward from depressed levels. Traffic could improve. Gross margin upside. "
+                    "Merchandising better. Freight and transportation better. Target to report quarter "
+                    "next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), "
+                    "the off-price juggernaut behind T.J. Maxx, Marshalls and HomeGoods. Goldman Sachs "
+                    "tactical buy trades on Club names Wells Fargo (WFC), which reports quarter Friday, "
+                    "Humana (HUM) and Nvidia (NVDA). BofA initiates Snowflake (SNOW) with a buy rating. "
+                    "If you like this story, sign up for Jim Cramer’s Top 10 Morning Thoughts on the "
+                    "Market email newsletter for free. Barclays cuts price targets on consumer products: "
+                    "UTZ Brands (UTZ) to $16 per share from $17. Kraft Heinz (KHC) to $36 per share from "
+                    "$38. Cyclical drag. J.M. Smucker (SJM) to $129 from $160. Secular headwinds. "
+                    "Coca-Cola (KO) to $59 from $70. Barclays cut PTs on housing-related stocks: Toll Brothers "
+                    "(TOL) to $74 per share from $82. Keeps underweight. Lowers Trex (TREX) and Azek "
+                    "(AZEK), too. Goldman Sachs (GS) announces sale of fintech platform and warns on "
+                    "third quarter of 19-cent per share drag on earnings. The buyer: investors led by "
+                    "private equity firm Sixth Street. Exiting a mistake. Rise in consumer engagement for "
+                    "Spotify (SPOT), says Morgan Stanley. The analysts hike price target to $190 per share "
+                    "from $185. Keeps overweight (buy) rating. JPMorgan loves elf Beauty (ELF). Keeps "
+                    "overweight (buy) rating but lowers price target to $139 per share from $150. "
+                    "Sees “still challenging” environment into third-quarter print. The Club owns shares "
+                    "in high-end beauty company Estee Lauder (EL). Barclays upgrades First Solar (FSLR) "
+                    "to overweight from equal weight (buy from hold) but lowers price target to $224 per "
+                    "share from $230. Risk reward upgrade. Best visibility of utility scale names."},
+
+        {"query": "What was the rate of decline in 3rd quarter sales?",
+         "answer": "20% year-on-year.",
+         "context": "Nokia said it would cut up to 14,000 jobs as part of a cost cutting plan following "
+                    "third quarter earnings that plunged. The Finnish telecommunications giant said that "
+                    "it will reduce its cost base and increase operation efficiency to “address the "
+                    "challenging market environment.” The substantial layoffs come after Nokia reported "
+                    "third-quarter net sales declined 20% year-on-year to 4.98 billion euros. Profit over "
+                    "the period plunged by 69% year-on-year to 133 million euros."},
+
+        {"query": "What is a list of the key points?",
+         "answer": "•Stocks rallied on Friday with stronger-than-expected U.S. jobs data and an increase in "
+                   "Treasury yields;\n•Dow Jones gained 195.12 points;\n•S&P 500 added 1.59%;\n•Nasdaq Composite rose "
+                   "1.35%;\n•U.S. economy added 438,000 jobs in August, better than the 273,000 expected;\n"
+                   "•10-year Treasury rate trading near the highest level in 14 years at 4.58%.",
+         "context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data "
+                    "and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, "
+                    "or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy "
+                    "Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in "
+                    "August, the Labor Department said. Economists polled by Dow Jones expected 273,000 "
+                    "jobs. However, wages rose less than expected last month. Stocks posted a stunning "
+                    "turnaround on Friday, after initially falling on the stronger-than-expected jobs report. "
+                    "At its session low, the Dow had fallen as much as 198 points; it surged by more than "
+                    "500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during "
+                    "their lowest points in the day. Traders were unclear of the reason for the intraday "
+                    "reversal. Some noted it could be the softer wage number in the jobs report that made "
+                    "investors rethink their earlier bearish stance. Others noted the pullback in yields from "
+                    "the day’s highs. Part of the rally may just be due to a market that had gotten extremely "
+                    "oversold, with the S&P 500 at one point this week down more than 9% from its high earlier "
+                    "this year. Yields initially surged after the report, with the 10-year Treasury rate trading "
+                    "near its highest level in 14 years. The benchmark rate later eased from those levels, but "
+                    "was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back "
+                    "in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s "
+                    "helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries "
+                    "Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially "
+                    "some oversold conditions.'"}
+
+        ]
+
+    return test_list
+
+
+# this is the main script to be run
+
+def bling_meets_llmware_hello_world(model_name):
+
+    t0 = time.time()
+
+    # load the questions
+    test_list = hello_world_questions()
+
+    print(f"\n > Loading Model: {model_name}...")
+
+    # load the model
+    prompter = Prompt().load_model(model_name)
+
+    t1 = time.time()
+    print(f"\n > Model {model_name} load time: {t1-t0} seconds")
+
+    for i, entries in enumerate(test_list):
+
+        print(f"\n{i+1}. Query: {entries['query']}")
+
+        # run the prompt
+        output = prompter.prompt_main(entries["query"], context=entries["context"],
+                                      prompt_name="default_with_context", temperature=0.30)
+
+        # print out the results
+        llm_response = output["llm_response"].strip("\n")
+        print(f"LLM Response: {llm_response}")
+        print(f"Gold Answer: {entries['answer']}")
+        print(f"LLM Usage: {output['usage']}")
+
+    t2 = time.time()
+
+    print(f"\nTotal processing time: {t2-t1} seconds")
+
+    return 0
+
+
+if __name__ == "__main__":
+
+    # list of 'rag-instruct' laptop-ready small bling models on HuggingFace
+
+    pytorch_models = ["llmware/bling-1b-0.1",                  # most popular
+                      "llmware/bling-tiny-llama-v0",           # fastest
+                      "llmware/bling-1.4b-0.1",
+                      "llmware/bling-falcon-1b-0.1",
+                      "llmware/bling-cerebras-1.3b-0.1",
+                      "llmware/bling-sheared-llama-1.3b-0.1",
+                      "llmware/bling-sheared-llama-2.7b-0.1",
+                      "llmware/bling-red-pajamas-3b-0.1",
+                      "llmware/bling-stable-lm-3b-4e1t-v0",
+                      "llmware/bling-phi-3"                    # most accurate (and newest)
+                      ]
+
+    # Quantized GGUF versions generally load faster and run nicely on a laptop with at least 16 GB of RAM
+    gguf_models = ["bling-phi-3-gguf", "bling-stablelm-3b-tool", "dragon-llama-answer-tool", "dragon-yi-answer-tool", "dragon-mistral-answer-tool"]
+
+    # try a model from either the pytorch or gguf model list
+    # the newest (and most accurate) is 'bling-phi-3-gguf'
+
+    bling_meets_llmware_hello_world(gguf_models[0])
+
+    # check out the model card on Huggingface for RAG benchmark test performance results and other useful information
+```
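+
+Any of these models can also be exercised outside of the test-set harness above by loading it directly from the ModelCatalog and running a single inference - a minimal sketch (the question and context strings below are illustrative, not part of the test set):
+
+```python
+
+from llmware.models import ModelCatalog
+
+# load a quantized GGUF model by its catalog name
+model = ModelCatalog().load_model("bling-phi-3-gguf", temperature=0.0, sample=False)
+
+# run a single context-grounded inference - the context string here is a made-up snippet
+response = model.inference("What was the rate of decline in third quarter sales?",
+                           add_context="Nokia reported third-quarter net sales declined 20% "
+                                       "year-on-year to 4.98 billion euros.")
+
+print("llm response: ", response["llm_response"])
+
+```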
+ +
+Simple-to-Scale Database Options - integrated data stores from laptop to parallelized cluster. + +```python + +from llmware.configs import LLMWareConfig + +# to set the collection database - mongo, sqlite, postgres +LLMWareConfig().set_active_db("mongo") + +# to set the vector database (or declare when installing) +# --options: milvus, pg_vector (postgres), redis, qdrant, faiss, pinecone, mongo atlas +LLMWareConfig().set_vector_db("milvus") + +# for fast start - no installations required +LLMWareConfig().set_active_db("sqlite") +LLMWareConfig().set_vector_db("chromadb") # try also faiss and lancedb + +# for single postgres deployment +LLMWareConfig().set_active_db("postgres") +LLMWareConfig().set_vector_db("postgres") + +# to install mongo, milvus, postgres - see the docker-compose scripts as well as examples + +``` + +
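+
+As a quick illustration of how these settings flow through the rest of the library, the sketch below creates a library against the fast-start combination - the library name and folder path are placeholders, and 'mini-lm-sbert' is one of the standard embedding models in the catalog:
+
+```python
+
+from llmware.configs import LLMWareConfig
+from llmware.library import Library
+
+# fast start - text collection in sqlite, vectors in chromadb
+LLMWareConfig().set_active_db("sqlite")
+LLMWareConfig().set_vector_db("chromadb")
+
+# any library created from this point on is stored in the active db
+library = Library().create_new_library("config_demo_lib")
+
+# parse and index documents from a local folder (placeholder path)
+library.add_files(input_folder_path="/path/to/my/files")
+
+# embeddings installed on the library land in the configured vector db
+library.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="chromadb")
+
+```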
+ +
+ + 🔥 Agents with Function Calls and SLIM Models 🔥 + +```python + +from llmware.agents import LLMfx + +text = ("Tesla stock fell 8% in premarket trading after reporting fourth-quarter revenue and profit that " + "missed analysts’ estimates. The electric vehicle company also warned that vehicle volume growth in " + "2024 'may be notably lower' than last year’s growth rate. Automotive revenue, meanwhile, increased " + "just 1% from a year earlier, partly because the EVs were selling for less than they had in the past. " + "Tesla implemented steep price cuts in the second half of the year around the world. In a Wednesday " + "presentation, the company warned investors that it’s 'currently between two major growth waves.'") + +# create an agent using LLMfx class +agent = LLMfx() + +# load text to process +agent.load_work(text) + +# load 'models' as 'tools' to be used in analysis process +agent.load_tool("sentiment") +agent.load_tool("extract") +agent.load_tool("topics") +agent.load_tool("boolean") + +# run function calls using different tools +agent.sentiment() +agent.topics() +agent.extract(params=["company"]) +agent.extract(params=["automotive revenue growth"]) +agent.xsum() +agent.boolean(params=["is 2024 growth expected to be strong? (explain)"]) + +# at end of processing, show the report that was automatically aggregated by key +report = agent.show_report() + +# displays a summary of the activity in the process +activity_summary = agent.activity_summary() + +# list of the responses gathered +for i, entries in enumerate(agent.response_list): + print("update: response analysis: ", i, entries) + +output = {"report": report, "activity_summary": activity_summary, "journal": agent.journal} + +``` + +
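+
+Each tool call above returns a dictionary, so the output can also be consumed programmatically as it is produced - a minimal sketch, with key names following the llmware sentiment examples (worth verifying against your installed version):
+
+```python
+
+# re-run the sentiment tool and capture the returned dict -
+# 'llm_response' holds the parsed function-call output
+sentiment = agent.sentiment()
+
+sentiment_value = sentiment["llm_response"]["sentiment"]
+confidence = sentiment["confidence_score"]
+
+# gate downstream logic on both the label and the model's confidence
+if "negative" in sentiment_value and confidence > 0.8:
+    print("high-confidence negative sentiment - flag for review")
+
+```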
+
+
+ 🚀 Start coding - Quick Start for RAG 🚀
+
+```python
+# This example illustrates a simple contract analysis
+# using a RAG-optimized LLM running locally
+
+import os
+import re
+from llmware.prompts import Prompt, HumanInTheLoop
+from llmware.setup import Setup
+from llmware.configs import LLMWareConfig
+
+
+def contract_analysis_on_laptop(model_name):
+
+    # In this scenario, we will:
+    # -- download a set of sample contract files
+    # -- create a Prompt and load a BLING LLM model
+    # -- parse each contract, extract the relevant passages, and pass questions to a local LLM
+
+    # Main loop - iterate through each contract:
+    #
+    # 1. parse the document in memory (convert from PDF file into text chunks with metadata)
+    # 2. filter the parsed text chunks with a "topic" (e.g., "governing law") to extract relevant passages
+    # 3. package and assemble the text chunks into a model-ready context
+    # 4. ask the LLM three key questions about each contract
+    # 5. print the results to the screen
+    # 6. save the results in both json and csv for further processing and review
+
+    # Load the llmware sample files
+
+    print(f"\n > Loading the llmware sample files...")
+
+    sample_files_path = Setup().load_sample_files()
+    contracts_path = os.path.join(sample_files_path, "Agreements")
+
+    # Query list - these are the 3 main topics and questions that we would like the LLM to analyze for each contract
+
+    query_list = {"executive employment agreement": "What are the names of the two parties?",
+                  "base salary": "What is the executive's base salary?",
+                  "vacation": "How many vacation days will the executive receive?"}
+
+    # Load the selected model by name that was passed into the function
+
+    print(f"\n > Loading model {model_name}...")
+
+    prompter = Prompt().load_model(model_name, temperature=0.0, sample=False)
+
+    # Main loop
+
+    for i, contract in enumerate(os.listdir(contracts_path)):
+
+        # excluding Mac file artifact (annoying, but fact of life in demos)
+        if contract != ".DS_Store":
+
+            print("\nAnalyzing contract: ", str(i+1), contract)
+
+            print("LLM Responses:")
+
+            for key, value in query_list.items():
+
+                # step 1 + 2 + 3 above - contract is parsed, text-chunked, filtered by topic key,
+                # ... and then packaged into the prompt
+
+                source = prompter.add_source_document(contracts_path, contract, query=key)
+
+                # step 4 above - calling the LLM with 'source' information already packaged into the prompt
+
+                responses = prompter.prompt_with_source(value, prompt_name="default_with_context")
+
+                # step 5 above - print out to screen
+
+                for r, response in enumerate(responses):
+                    print(key, ":", re.sub("[\n]", " ", response["llm_response"]).strip())
+
+            # we're done with this contract - clear the source from the prompt
+            prompter.clear_source_materials()
+
+    # step 6 above - saving the analysis to jsonl and csv
+
+    # save the jsonl report to the /prompt_history folder
+    print("\nPrompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(), prompter.prompt_id))
+    prompter.save_state()
+
+    # save a csv report that includes the model, response, prompt, and evidence for human-in-the-loop review
+    csv_output = HumanInTheLoop(prompter).export_current_interaction_to_csv()
+    print("csv output saved at: ", csv_output)
+
+
+if __name__ == "__main__":
+
+    # use local cpu model - try the newest - RAG finetune of Phi-3 quantized and packaged in GGUF
+    model = "bling-phi-3-gguf"
+
+    contract_analysis_on_laptop(model)
+
+```
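+
+As an optional add-on, each set of responses from `prompt_with_source` can be post-processed with llmware's evidence-checking methods before being accepted - a short sketch following the fact-checking examples in the library (the exact output keys are worth verifying against your installed version):
+
+```python
+
+# run inside the per-question loop, right after prompt_with_source
+responses = prompter.prompt_with_source(value, prompt_name="default_with_context")
+
+# verify that numbers in the answer are found in the source evidence
+checked = prompter.evidence_check_numbers(responses)
+
+for response in checked:
+    print("fact check: ", response.get("fact_check"))
+
+```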
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/platforms.md b/docs/getting_started/platforms.md
similarity index 96%
rename from docs/platforms.md
rename to docs/getting_started/platforms.md
index 1c2fb91c..7a225dfa 100644
--- a/docs/platforms.md
+++ b/docs/getting_started/platforms.md
@@ -1,9 +1,9 @@
 ---
 layout: default
-title: Platform Support
-nav_order: 2
-description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
-permalink: /platform_support
+title: Platforms Supported
+parent: Getting Started
+nav_order: 5
+permalink: /getting_started/platforms
 ---
 ___
 # Platform Support
diff --git a/docs/getting_started/working_with_docker.md b/docs/getting_started/working_with_docker.md
new file mode 100644
index 00000000..3e8d3e4a
--- /dev/null
+++ b/docs/getting_started/working_with_docker.md
@@ -0,0 +1,71 @@
+---
+layout: default
+title: Working with Docker
+parent: Getting Started
+nav_order: 6
+permalink: /getting_started/working_with_docker
+---
+
+## Working with Docker Scripts
+
+COMING SOON ...
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/learn/.DS_Store b/docs/learn/.DS_Store
new file mode 100644
index 00000000..5008ddfc
Binary files /dev/null and b/docs/learn/.DS_Store differ
diff --git a/docs/learn/advanced_techniques_for_rag.md b/docs/learn/advanced_techniques_for_rag.md
new file mode 100644
index 00000000..bb5ec746
--- /dev/null
+++ b/docs/learn/advanced_techniques_for_rag.md
@@ -0,0 +1,85 @@
+---
+layout: default
+title: Advanced RAG
+parent: Learn
+nav_order: 4
+description: advanced techniques for RAG
+permalink: /learn/advanced_techniques_for_rag
+---
+Advanced RAG Techniques
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Advanced RAG Techniques**
+- [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz)
+- [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2)
+- [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY)
+- [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP)
+- [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s)
+- [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/learn/core_rag_scenarios_running_locally.md b/docs/learn/core_rag_scenarios_running_locally.md
new file mode 100644
index 00000000..46b90c36
--- /dev/null
+++ b/docs/learn/core_rag_scenarios_running_locally.md
@@ -0,0 +1,86 @@
+---
+layout: default
+title: Core RAG Scenarios Running Locally
+parent: Learn
+nav_order: 2
+description: core RAG scenarios running locally
+permalink: /learn/core_rag_scenarios_running_locally
+---
+Core RAG Scenarios Run Locally
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Core RAG Scenarios**
+
+- [Use small LLMs for RAG for Contract Analysis (feat. LLMWare)](https://www.youtube.com/watch?v=8aV5p3tErP0)
+- [Invoice Processing with LLMware](https://www.youtube.com/watch?v=VHZSaBBG-Bo&t=10s)
+- [Evaluate LLMs for RAG with LLMWare](https://www.youtube.com/watch?v=s0KWqYg5Buk&t=105s)
+- [Fast Start to RAG with LLMWare Open Source Library](https://www.youtube.com/watch?v=0naqpH93eEU)
+- [Use Retrieval Augmented Generation (RAG) without a Database](https://www.youtube.com/watch?v=tAGz6yR14lw)
+- [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz)
+- [RAG with BLING on your laptop](https://www.youtube.com/watch?v=JjgqOZ2v5oU)
+- [DRAGON-7B-Models](https://www.youtube.com/watch?v=d_u7VaKu6Qk&t=37s)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/learn/integrated_voice_transcription_with_whisper_cpp.md b/docs/learn/integrated_voice_transcription_with_whisper_cpp.md
new file mode 100644
index 00000000..1a20bf9d
--- /dev/null
+++ b/docs/learn/integrated_voice_transcription_with_whisper_cpp.md
@@ -0,0 +1,78 @@
+---
+layout: default
+title: Voice Transcription with Whisper CPP
+parent: Learn
+nav_order: 6
+description: integrated voice transcription with whisper cpp
+permalink: /learn/integrated_voice_transcription_with_whisper_cpp
+---
+Integrated Voice Transcription with Whisper CPP
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Using Whisper CPP Models**
+- [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s)
+- [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG)
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/videos.md b/docs/learn/learn.md
similarity index 95%
rename from docs/videos.md
rename to docs/learn/learn.md
index 64ff41ba..9356e951 100644
--- a/docs/videos.md
+++ b/docs/learn/learn.md
@@ -1,11 +1,12 @@
 ---
 layout: default
-title: Videos
-nav_order: 6
-description: llmware is an integrated framework with over 50+ models for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows.
-permalink: /videos
+title: Learn
+nav_order: 3
+has_children: true
+description: key learning resources
+permalink: /learn
 ---
-llmware Youtube Video Channel
+Learn: Youtube Video Series
 ---
 
 **Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
@@ -114,3 +115,4 @@ The company offers a Software as a Service (SaaS) Retrieval Augmented Generation
 
 ---
 
+
diff --git a/docs/learn/other_topics.md b/docs/learn/other_topics.md
new file mode 100644
index 00000000..cf43d65f
--- /dev/null
+++ b/docs/learn/other_topics.md
@@ -0,0 +1,85 @@
+---
+layout: default
+title: Other Topics
+parent: Learn
+nav_order: 7
+description: other notable videos and topics
+permalink: /learn/other_topics
+---
+Other Notable Videos and Topics
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Some of our most recent videos**
+- [Fast Local Chatbot with Phi-3-GGUF](https://youtu.be/gzzEVK8p3VM?si=HTMWQtN9XuaqjmpK)
+- [Document Summarization](https://youtu.be/Ps3W-P9A1m8?si=mHvCcHvrKzndaNul)
+- [Agent Server](https://youtu.be/nsA6-ZdnkXg?si=v7iGhC_rpj8TWbbl)
+- [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz)
+- [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2)
+- [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/learn/parsing_embedding_data_extraction.md b/docs/learn/parsing_embedding_data_extraction.md
new file mode 100644
index 00000000..1fda2c5a
--- /dev/null
+++ b/docs/learn/parsing_embedding_data_extraction.md
@@ -0,0 +1,82 @@
+---
+layout: default
+title: Parsing, Embedding and Data Extraction
+parent: Learn
+nav_order: 5
+description: parsing, embedding and data extraction
+permalink: /learn/parsing_embedding_data_extraction
+---
+Parsing, Embedding, and Data Extraction
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Parsing, Embedding, Data Pipelines and Extraction**
+- [Advanced Parsing Techniques](https://youtu.be/dEsw8V_YBYY?si=B0GTVNhwfBYWkXyf)
+- [Ingest PDFs at Scale](https://www.youtube.com/watch?v=O0adUfrrxi8&t=10s)
+- [Install and Compare Multiple Embeddings with Postgres and PGVector](https://www.youtube.com/watch?v=Bncvggy6m5Q)
+- [Intro to Parsing and Text Chunking](https://youtu.be/2xDefZ4oBOM?si=YZzBUjDfQ0839EVF)
+- [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/learn/using_agents_functions_slim_models.md b/docs/learn/using_agents_functions_slim_models.md
new file mode 100644
index 00000000..b0481006
--- /dev/null
+++ b/docs/learn/using_agents_functions_slim_models.md
@@ -0,0 +1,90 @@
+---
+layout: default
+title: Using Agents & Function Calls with SLIM Models
+parent: Learn
+nav_order: 1
+description: using agents, function calls and SLIM models
+permalink: /learn/using_agents_functions_slim_models
+---
+Using Agents, Function Calls and SLIM Models
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Using Agents, Function Calls and SLIM models**
+- [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2)
+- [Sentiment Analysis](https://youtu.be/ERCHP21oAN8?si=fp6D4Tk9J2HdDRXa)
+- [SLIMS Playlist](https://youtube.com/playlist?list=PL1-dn33KwsmAHWCWK6YjZrzicQ2yR6W8T&si=TSFGqQ3ObOO5vDde)
+- [Agent-based Complex Research Analysis](https://youtu.be/y4WvwHqRR60?si=jX3KCrKcYkM95boe)
+- [Getting Started with SLIMs (with code)](https://youtu.be/aWZFrTDmMPc?si=lmo98_quo_2Hrq0C)
+- [SLIM Models Intro](https://www.youtube.com/watch?v=cQfdaTcmBpY)
+- [Text2SQL Intro](https://youtu.be/BKZ6kO2XxNo?si=tXGt63pvrp_rOlIP)
+- [Pop up LLMWare Inference Server](https://www.youtube.com/watch?v=qiEmLnSRDUA&t=20s)
+- [Hardest Problem in RAG - handling 'Not Found'](https://youtu.be/slDeF7bYuv0?si=j1nkdwdGr5sgvUtK)
+- [Extract Information from Earnings Releases](https://youtu.be/d6HFfyDk4YE?si=VmnIiWFmgBtR4DxS)
+- [Summary Function Calls](https://youtu.be/yNg_KH5cPSk?si=Yl94tp_vKA8e7eT7)
+- [Boolean Yes-No Function Calls](https://youtu.be/jZQZMMqAJXs?si=lU4YVI0H0tfc9k6e)
+- [Autogenerate Topics, Tags and NER](https://youtu.be/N6oOxuyDsC4?si=vo2Fd8VG5xTbH4SD)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---
diff --git a/docs/learn/using_quantized_gguf_models.md b/docs/learn/using_quantized_gguf_models.md
new file mode 100644
index 00000000..6d80f5e2
--- /dev/null
+++ b/docs/learn/using_quantized_gguf_models.md
@@ -0,0 +1,87 @@
+---
+layout: default
+title: Using Quantized GGUF Models
+parent: Learn
+nav_order: 3
+description: using quantized GGUF models
+permalink: /learn/using_quantized_gguf_models
+---
+Using Quantized GGUF Models
+---
+
+**Tutorial Videos** - check out our Youtube channel for high-impact 5-10 minute tutorials on the latest examples.
+
+Check back often as this list is always growing ...
+
+🎬 **Using GGUF Models**
+- [Using LM Studio Models](https://www.youtube.com/watch?v=h2FDjUyvsKE)
+- [Using Ollama Models](https://www.youtube.com/watch?v=qITahpVDuV0)
+- [Use any GGUF Model](https://www.youtube.com/watch?v=9wXJgld7Yow)
+- [Background on GGUF Quantization & DRAGON Model Example](https://www.youtube.com/watch?v=ZJyQIZNJ45E)
+- [Getting Started with Whisper.CPP](https://youtu.be/YG5u5AOU9MQ?si=5xQYZCILPSiR8n4s)
+- [Best Small RAG Model - Bling-Phi-3](https://youtu.be/cViMonCAeSc?si=L6jX0sRdZAmKtRcz)
+- [Agent Automation with Web Services for Financial Research](https://youtu.be/l0jzsg1_Ik0?si=oBGtALHLplouY9x2)
+- [Voice Transcription and Automated Analysis of Greatest Speeches Dataset](https://youtu.be/5y0ez5ZBpPE?si=PAaCIqYou8nCGxYG)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part I](https://youtu.be/7oMTGhSKuNY?si=_KSjuBnqArvWzYbx)
+- [Are you prompting wrong for RAG - Stochastic Sampling-Part II- Code Experiments](https://youtu.be/iXp1tj-pPjM?si=3ZeMgipY0vJDHIMY)
+
+
+# More information about the project - [see main repository](https://www.github.com/llmware-ai/llmware.git)
+
+
+# About the project
+
+`llmware` is © 2023-{{ "now" | date: "%Y" }} by [AI Bloks](https://www.aibloks.com/home).
+
+## Contributing
+Please first discuss any change you want to make publicly, for example on GitHub via raising an [issue](https://github.com/llmware-ai/llmware/issues) or starting a [new discussion](https://github.com/llmware-ai/llmware/discussions).
+You can also write an email or start a discussion on our Discord channel.
+Read more about becoming a contributor in the [GitHub repo](https://github.com/llmware-ai/llmware/blob/main/CONTRIBUTING.md).
+
+## Code of conduct
+We welcome everyone into the ``llmware`` community.
+[View our Code of Conduct](https://github.com/llmware-ai/llmware/blob/main/CODE_OF_CONDUCT.md) in our GitHub repository.
+
+## ``llmware`` and [AI Bloks](https://www.aibloks.com/home)
+``llmware`` is an open source project from [AI Bloks](https://www.aibloks.com/home) - the company behind ``llmware``.
+The company offers a Software as a Service (SaaS) Retrieval Augmented Generation (RAG) service.
+[AI Bloks](https://www.aibloks.com/home) was founded by [Namee Oberst](https://www.linkedin.com/in/nameeoberst/) and [Darren Oberst](https://www.linkedin.com/in/darren-oberst-34a4b54/) in October 2022.
+
+## License
+
+`llmware` is distributed under the [Apache-2.0 license](https://www.github.com/llmware-ai/llmware/blob/main/LICENSE).
+
+## Thank you to the contributors of ``llmware``!
    +{% for contributor in site.github.contributors %} +
+- [{{ contributor.login }}](https://github.com/{{ contributor.login }})
+{% endfor %}
+
    +
+---