From fddf77d2730432ee9d8e3f12d70411a3fd65e7f4 Mon Sep 17 00:00:00 2001 From: ruthenian8 Date: Wed, 18 Oct 2023 13:57:21 +0300 Subject: [PATCH 1/3] Create stub documents --- docs/source/user_guides.rst | 7 +++++++ docs/source/user_guides/advanced_features.rst | 7 +++++++ 2 files changed, 14 insertions(+) create mode 100644 docs/source/user_guides/advanced_features.rst diff --git a/docs/source/user_guides.rst b/docs/source/user_guides.rst index 024c84ecc..8ad7cef8d 100644 --- a/docs/source/user_guides.rst +++ b/docs/source/user_guides.rst @@ -23,6 +23,12 @@ for exploring the telemetry data collected from your conversational services. We show how to plug in the telemetry collection and configure the pre-built Superset dashboard shipped with DFF. +:doc:`Advanced features guide <./user_guides/advanced_features>` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``Advanced features guide`` demonstrates the advanced capabilities of the DFF library. +Not strictly necessary for starter projects, these features come into play when +scaling the project up or improving performance is desired. .. toctree:: :hidden: @@ -30,3 +36,4 @@ Superset dashboard shipped with DFF. 
user_guides/basic_conceptions user_guides/context_guide user_guides/superset_guide + user_guides/advanced_features diff --git a/docs/source/user_guides/advanced_features.rst b/docs/source/user_guides/advanced_features.rst new file mode 100644 index 000000000..2afdc8623 --- /dev/null +++ b/docs/source/user_guides/advanced_features.rst @@ -0,0 +1,7 @@ +Advanced Features +----------------- + +Features +~~~~~~~~ + +Features From 4eb2e362222cee7249c32e43c5bc598302f48259 Mon Sep 17 00:00:00 2001 From: ruthenian8 Date: Thu, 19 Oct 2023 15:02:54 +0300 Subject: [PATCH 2/3] add best practices guide --- docs/source/user_guides.rst | 4 +- docs/source/user_guides/advanced_features.rst | 163 +++++++++++++++++- 2 files changed, 160 insertions(+), 7 deletions(-) diff --git a/docs/source/user_guides.rst b/docs/source/user_guides.rst index 8ad7cef8d..515db7ace 100644 --- a/docs/source/user_guides.rst +++ b/docs/source/user_guides.rst @@ -26,8 +26,8 @@ Superset dashboard shipped with DFF. :doc:`Advanced features guide <./user_guides/advanced_features>` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -``Advanced features guide`` demonstrates the advanced capabilities of the DFF library. -Not strictly necessary for starter projects, these features come into play when +``Best practices guide`` demonstrates the best practices of development with the DFF library. +Not strictly necessary for starter projects, these practices come into play when scaling the project up or improving performance is desired. .. 
toctree:: diff --git a/docs/source/user_guides/advanced_features.rst b/docs/source/user_guides/advanced_features.rst index 2afdc8623..a13ce360a 100644 --- a/docs/source/user_guides/advanced_features.rst +++ b/docs/source/user_guides/advanced_features.rst @@ -1,7 +1,160 @@ -Advanced Features ------------------ +Best practices guide +----------------------- -Features -~~~~~~~~ +Setting up a Virtual Environment +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Features +A virtual environment provides a controlled and isolated setup for your bot, minimizing conflicts +and ensuring consistency across different setups. + +- If you already have a virtual environment and just need DFF as a component, install DFF using pip. If you need specific dependencies, install them using pip as well. +- If you prefer, you can clone the DFF GitHub repository and set up a virtual environment using the `make venv` command. This virtual environment will have all the necessary requirements for working with DFF. + +Script Design +~~~~~~~~~~~~~ + +The foundation of your bot's ability to engage in meaningful conversation lies in its script. +In DFF, script design is structured around dividing the script into distinct flows. +A flow represents a self-contained piece of dialogue encompassing a particular topic or function. +More on this can be found in the `basic guide <./basic_conceptions.rst>`__. + +- Creating a Script: A script is a dictionary where keys correspond to different flows, + which are used to divide a dialog into sub-dialogs and process them separately. + +- Begin by brainstorming and listing the primary functions and topics your bot needs to handle. + +- For each function or topic, create a separate flow. + Flows are dictionaries, with keys being nodes that represent the smallest unit of a dialog. + Each flow should have a clear entry and exit point.
+ +- Creating Nodes: A node contains the bot's response to a user's input + and a condition that determines the transition to another node, + whether within the current flow or another one. + +- Ensure that there's a logical progression within each flow, guiding the user from the beginning to the end of the conversation segment. + +Models +~~~~~~ + +Models are central to making your bot intelligent and responsive. +In the context of DFF, models may help in processing data and generating non-hardcoded responses. + +- Set up caching mechanisms to improve response times by reducing the need for recalculations. + You can straightforwardly cache the output of functions that leverage calls to NLU models + or use more complex solutions, like `GPTcache `_. + +- The `Dockerfile `_ in the DFF demo + illustrates caching a model using SentenceTransformer in a Docker container. + The model is constructed during image build, so that the weights that the Huggingface library + fetches from the web are downloaded in advance. At runtime, the fetched weights will be quickly read from the disk. + +- Use persistent context storages to hold the necessary information that your bot will need. + +Using Docker +~~~~~~~~~~~~ + +Docker simplifies the deployment of your bot by encapsulating it into containers. +The `docker-compose` file in the DFF repository provides a solid base for setting up your bot's environment. + +- Make sure that Docker and Docker Compose are installed on your machine. + +- Clone the GitHub-based distribution of DFF, which includes a `docker-compose.yml `_ file. + +- The `docker-compose.yml `_ file + demonstrates the setup of various database services like MySQL, PostgreSQL, Redis, MongoDB, and others using Docker Compose. + The file also showcases setting up other services and defines the network and volumes for data persistence. + Customize the provided file to match your bot's requirements, such as specifying dependencies and environment variables.
+ +- As a rule of thumb, most of the time you will need at least two docker containers: 1) The bot itself, containerized as a web application; + 2) Container for a database image. You can add the web app image to the docker-compose file and, optionally, add both containers + to a single docker profile. + +.. code-block:: + + web: + build: + context: web/ + volumes: + - ./web/:/app:ro + ports: + - 8000:8000 + env_file: + - ./.env + depends_on: + - psql + profiles: + - 'myapp' + psql: + # ... other options + profiles: + - 'myapp' + +- This allows you to control both containers with a single docker command. + +.. code-block:: + + docker-compose --profile myapp up + + +- Use Docker Compose commands to build and run your bot. + +Directory Structure +~~~~~~~~~~~~~~~~~~~ + +A well-organized directory structure is crucial for managing your bot's code, assets, and other resources effectively. The demo provided in the DFF repository serves as a good template. + +- Organize your scripts, models, and other resources in a logical, hierarchical manner. + +- Maintain a clean and well-documented codebase to facilitate maintenance and collaboration. + +- You can create a directory for your bot project following the structure outlined + in the `demo project `_. + +Testing and Load Testing +~~~~~~~~~~~~~~~~~~~~~~~~ + +Testing ensures that your bot functions as expected under various conditions, while load testing gauges its performance under high traffic. + +- Regular bot functionality can be covered by simple end-to-end tests that include user requests and bot replies. + Tests of this kind can be automated using the Pytest framework. + The demo project includes an `example `_ of such a testing suite. + +- Optimize your bot's performance by identifying bottlenecks during I/O operations and other levels. + Utilize tools like Locust for load testing to ensure your bot scales well under high load conditions. 
+ Additionally, profile and benchmark different context storages to choose the most efficient one for your dialog service. + +.. note:: + + More in the `profiling user guide <#>`_. + +- Profiling with Locust: DFF recommends using Locust for load testing to measure the scalability of each component in your pipeline, + especially when integrated into a web server application like Flask or FastAPI. + +- Profiling Context Storages: Benchmarking the performance of database bindings is crucial. + DFF provides tools for measuring the speed and reliability of various context storage solutions like JSON, + Pickle, PostgreSQL, MongoDB, Redis, MySQL, SQLite, and YDB. + +Make use of telemetry +~~~~~~~~~~~~~~~~~~~~~ + +Another great way to measure the efficiency of your bot is to employ the telemetry mechanisms +that come packaged with DFF's GitHub distribution. Telemetry data can then be viewed +and played with by means of the integrated Superset dashboard. + +.. note:: + + For more information on working with Telemetry data, you can consult + the `Stats Tutorial <../tutorials/tutorials.stats.1_extractor_functions.py>`_ + and the `Superset Guide <./superset_guide.rst>`__. + +Choosing a Database +~~~~~~~~~~~~~~~~~~~ + +The choice of database technology affects your bot's performance and ease of data management. + +- Evaluate the data requirements of your bot as well as the capabilities of your hardware + (server or local machine) to determine the most suitable database technology. + +- Set up and configure the database, ensuring it meets your bot’s data storage, retrieval, and processing needs. + +- DFF supports various databases like JSON, Pickle, SQLite, PostgreSQL, MySQL, MongoDB, Redis, and Yandex Database. 
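The end-to-end "happy path" tests mentioned in the Testing section above can be sketched in plain Python; the echo pipeline and the ``happy_path`` turns below are hypothetical stand-ins for illustration, not the actual DFF demo suite.

```python
# A minimal sketch of happy-path testing: each turn pairs a user request
# with the reply the bot is expected to produce. run_pipeline is a trivial
# echo stub standing in for a real DFF pipeline.
happy_path = [
    ("hi", "echo: hi"),
    ("how are you?", "echo: how are you?"),
    ("bye", "echo: bye"),
]

def run_pipeline(request: str) -> str:
    # A real test would pass the request through the DFF pipeline instead.
    return f"echo: {request}"

def test_happy_path():
    # Walk the scripted dialog turn by turn and compare replies.
    for request, expected_reply in happy_path:
        assert run_pipeline(request) == expected_reply

test_happy_path()
```

Tests of this shape can be collected and run automatically by Pytest, which discovers functions prefixed with ``test_``.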
From 3723a507593554ddbd9e2abd3728968f1c00f2b9 Mon Sep 17 00:00:00 2001 From: ruthenian8 Date: Thu, 19 Oct 2023 16:54:14 +0300 Subject: [PATCH 3/3] add links and additional code snippets --- docs/source/user_guides/advanced_features.rst | 134 ++++++++++++------ 1 file changed, 93 insertions(+), 41 deletions(-) diff --git a/docs/source/user_guides/advanced_features.rst b/docs/source/user_guides/advanced_features.rst index a13ce360a..a6238f156 100644 --- a/docs/source/user_guides/advanced_features.rst +++ b/docs/source/user_guides/advanced_features.rst @@ -1,6 +1,13 @@ Best practices guide ----------------------- +Introduction +~~~~~~~~~~~~ + +When developing a conversational service with DFF, there are certain practices you +can follow to make the development process more efficient. In this guide, +we name some of the steps you can take to make the most of the DFF framework. + Setting up a Virtual Environment ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -10,6 +17,17 @@ and ensuring consistency across different setups. - If you already have a virtual environment and just need DFF as a component, install DFF using pip. If you need specific dependencies, install them using pip as well. - If you prefer, you can clone the DFF GitHub repository and set up a virtual environment using the `make venv` command. This virtual environment will have all the necessary requirements for working with DFF. +.. warning:: + + The code below is relevant for Linux-based systems or for Windows with Cygwin installed. + +.. code-block:: bash + + git clone https://github.com/deeppavlov/dialog_flow_framework.git + cd dialog_flow_framework + make venv + source venv/bin/activate + Script Design ~~~~~~~~~~~~~ @@ -48,8 +66,50 @@ In the context of DFF, models may help in processing data and generating non-har The model is constructed during image build, so that the weights that the Huggingface library fetches from the web are downloaded in advance. 
At runtime, the fetched weights will be quickly read from the disk. +.. code-block:: dockerfile + + # cache mfaq model + RUN ["python3", "-c", "from sentence_transformers import SentenceTransformer; _ = SentenceTransformer('clips/mfaq')"] + - Use persistent context storages to hold the necessary information that your bot will need. + +Directory Structure +~~~~~~~~~~~~~~~~~~~ + +A well-organized directory structure is crucial for managing your bot's code, assets, and other resources effectively. The demo provided in the DFF repository serves as a good template. + +- Organize your scripts, models, and other resources in a logical, hierarchical manner. + For instance, since DFF provides three types of standard callback functions, + namely conditions, responses, and processing functions, + it may be beneficial to use three separate files for those, i.e. ``conditions.py``, + ``processing.py``, and ``responses.py``. + +- Maintain a clean and well-documented codebase to facilitate maintenance and collaboration. + +- You can create a directory for your bot project following the structure outlined + in the `demo project `_. + +- Below is a simplified project tree that shows a minimal example of how files can be structured. + +..
code-block:: shell + + project/ + ├── myapp + │   ├── dialog_graph + │   │   ├── __init__.py + │   │   ├── conditions.py # Condition callbacks + │   │   ├── processing.py # Processing callbacks + │   │   ├── response.py # Response callbacks + │   │   └── script.py # DFF script and pipeline are constructed here + │   ├── dockerfile + │   ├── requirements.txt + │   ├── web_app.py # the web app imports the DFF pipeline from dialog_graph + │   └── test.py # End-to-end testing happy path is defined here + ├── ...Folders for other docker-based services, if applicable + ├── venv/ + └── docker-compose.yml + Using Docker ~~~~~~~~~~~~ @@ -69,73 +129,65 @@ The `docker-compose` file in the DFF repository provides a solid base for settin 2) Container for a database image. You can add the web app image to the docker-compose file and, optionally, add both containers to a single docker profile. -.. code-block:: - - web: - build: - context: web/ - volumes: - - ./web/:/app:ro - ports: - - 8000:8000 - env_file: - - ./.env - depends_on: - - psql - profiles: - - 'myapp' - psql: - # ... other options - profiles: - - 'myapp' +.. code-block:: yaml + + web: + build: + # source folder + context: myapp/ + volumes: + # folder forwarding + - ./web/:/app:ro + ports: + # port forwarding + - 8000:8000 + env_file: + # environment variables + - ./.env_file + depends_on: + - psql + profiles: + - 'myapp' + psql: + # ... other options + profiles: + - 'myapp' - This allows you to control both containers with a single docker command. -.. code-block:: +.. code-block:: shell - docker-compose --profile myapp up + docker-compose --profile myapp up - Use Docker Compose commands to build and run your bot. -Directory Structure -~~~~~~~~~~~~~~~~~~~ - -A well-organized directory structure is crucial for managing your bot's code, assets, and other resources effectively. The demo provided in the DFF repository serves as a good template. 
- -- Organize your scripts, models, and other resources in a logical, hierarchical manner. - -- Maintain a clean and well-documented codebase to facilitate maintenance and collaboration. - -- You can create a directory for your bot project following the structure outlined - in the `demo project `_. - Testing and Load Testing ~~~~~~~~~~~~~~~~~~~~~~~~ Testing ensures that your bot functions as expected under various conditions, while load testing gauges its performance under high traffic. - Regular bot functionality can be covered by simple end-to-end tests that include user requests and bot replies. - Tests of this kind can be automated using the Pytest framework. + Tests of this kind can be automated using the `Pytest framework `_. The demo project includes an `example `_ of such a testing suite. - Optimize your bot's performance by identifying bottlenecks during I/O operations and other levels. - Utilize tools like Locust for load testing to ensure your bot scales well under high load conditions. + Utilize tools like `Locust `_ for load testing to ensure your bot scales well under high load conditions. Additionally, profile and benchmark different context storages to choose the most efficient one for your dialog service. .. note:: More in the `profiling user guide <#>`_. -- Profiling with Locust: DFF recommends using Locust for load testing to measure the scalability of each component in your pipeline, +- Profiling with Locust: DFF recommends using `Locust `_ for load testing to measure the scalability of each component in your pipeline, especially when integrated into a web server application like Flask or FastAPI. - Profiling Context Storages: Benchmarking the performance of database bindings is crucial. DFF provides tools for measuring the speed and reliability of various context storage solutions like JSON, Pickle, PostgreSQL, MongoDB, Redis, MySQL, SQLite, and YDB. 
-Make use of telemetry -~~~~~~~~~~~~~~~~~~~~~ +Making Use of Telemetry +~~~~~~~~~~~~~~~~~~~~~~~ Another great way to measure the efficiency of your bot is to employ the telemetry mechanisms that come packaged with DFF's GitHub distribution. Telemetry data can then be viewed @@ -143,9 +195,9 @@ and played with by means of the integrated Superset dashboard. .. note:: - For more information on working with Telemetry data, you can consult - the `Stats Tutorial <../tutorials/tutorials.stats.1_extractor_functions.py>`_ - and the `Superset Guide <./superset_guide.rst>`__. + For more information on how to set up the telemetry and work with the data afterwards, you can consult + the `Stats Tutorial <../tutorials/tutorials.stats.1_extractor_functions.py>`_ + and the `Superset Guide <./superset_guide.rst>`__. Choosing a Database ~~~~~~~~~~~~~~~~~~~