Feature: Enhance observability #68

Open
wants to merge 24 commits into
base: develop

outerlook
Member

No description provided.

Added OpenTelemetry support to the broker package. This enables diagnostic tracing and metric queries across the distributed system, which helps identify performance bottlenecks, slow network requests, and similar issues. It is part of the observability work aimed at improving service reliability and performance monitoring.

Three new files were created under the `packages/broker/src/telemetry/setup/` directory to set up OpenTelemetry. The existing CLI index file now calls the setup.

New observability dependencies were added to package.json, and the Jest configuration file was updated to match the new testing requirements.
This commit adds OpenTelemetry support across the development network. METRICS_URL and TRACING_URL were added for all brokers to expose performance metrics and traces of broker interactions. A local Tempo backend for traces is set up alongside Prometheus for metrics and Grafana for visualization, and an OpenTelemetry Collector configuration was added for data ingestion and forwarding. This improves our ability to debug performance issues and understand network interactions in more granular detail.
Permits using broker in other packages
This new feature introduces metrics tracking to the broker service. With this implementation, we can now trace and observe system activities accurately. Several utilities were added for middleware handling (middlewares.ts and activeSpanDecorator.ts) and context management (context/index.ts and helpers.ts). Newly created metric counters include monitoring bytes read/written, number of messages read/written, HTTP queries performed, total stored messages/bytes and more.
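As a rough illustration of the counters listed above, the following sketch tracks messages and bytes read with plain in-memory counters. It deliberately does not use the OpenTelemetry API; `SketchCounter`, `createMessageMetrics`, and the metric names are hypothetical stand-ins, not the names used in this PR.

```typescript
// Hypothetical in-memory counter, standing in for an OpenTelemetry counter instrument.
class SketchCounter {
  private total = 0;

  constructor(public readonly name: string) {}

  // Counters are monotonic: only non-negative increments are accepted.
  add(value: number): void {
    if (value < 0) {
      throw new RangeError(`${this.name}: counters can only increase`);
    }
    this.total += value;
  }

  get value(): number {
    return this.total;
  }
}

// Groups the two counters and a helper that updates both per message.
// The metric names below are assumptions made for this example.
function createMessageMetrics() {
  const messagesRead = new SketchCounter('messages_read');
  const bytesRead = new SketchCounter('bytes_read');

  function observeMessage(payload: Uint8Array): void {
    messagesRead.add(1);
    bytesRead.add(payload.byteLength);
  }

  return { messagesRead, bytesRead, observeMessage };
}
```

Calling `observeMessage` once per incoming message keeps both counters in step, which is roughly what a read/write metrics collector needs to do regardless of the backing metrics library.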
Telemetry context and active span decorators were added to ConsensusManager, SystemRecovery, and QueryRequestHandler methods in the logStore package. This change is intended to enhance observability of these operations. Furthermore, observeMessageMetricsCollector was used in LogStorePlugin, which collects metrics for observing messages. Unnecessary comments were also removed from LogStorePlugin.
Integrated OpenTelemetry for better request tracing within the broker service. This change provides enhanced visibility into 'last', 'from', and 'range' queries in the dataQueryEndpoint, by creating spans and adding contexts to requests. This will greatly aid in monitoring and maintaining the system.
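The idea of wrapping each query type in a span can be sketched without any tracing library. The example below records a span-like entry (name, attributes, status, duration) around a handler. `InMemoryTracer`, `handleQuery`, the span names, and the attribute keys are all hypothetical and only stand in for the OpenTelemetry tracer used in the PR.

```typescript
type QueryKind = 'last' | 'from' | 'range';

interface SpanRecord {
  name: string;
  attributes: Record<string, string | number>;
  status: 'ok' | 'error';
  durationMs: number;
}

// Hypothetical recorder that plays the role of a tracer in this sketch.
class InMemoryTracer {
  readonly finished: SpanRecord[] = [];

  // Runs fn inside a "span": records the outcome and duration, rethrows errors.
  withSpan<T>(
    name: string,
    attributes: Record<string, string | number>,
    fn: () => T
  ): T {
    const start = Date.now();
    let status: 'ok' | 'error' = 'ok';
    try {
      return fn();
    } catch (err) {
      status = 'error';
      throw err;
    } finally {
      this.finished.push({ name, attributes, status, durationMs: Date.now() - start });
    }
  }
}

// Wraps one query execution; the span name and attribute keys are assumptions.
function handleQuery<T>(
  tracer: InMemoryTracer,
  kind: QueryKind,
  streamId: string,
  run: () => T
): T {
  return tracer.withSpan(
    `dataQuery.${kind}`,
    { 'query.kind': kind, 'stream.id': streamId },
    run
  );
}
```

Recording the span in `finally` means failed queries still show up in traces, marked with an error status, which is usually where tracing is most useful.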
…lisher

The BroadbandSubscriber and BroadbandPublisher classes have been updated to support the application of different kinds of middlewares for processing. The use of middlewares will allow enhancements in the data processing flow, thereby enabling the code to be more maintainable, readable, and expandable in the future. The default set of middlewares has also been defined. These can be overridden or new middlewares can be added as per the application needs.
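A minimal sketch of the middleware idea described above, assuming middlewares are plain functions applied in order and that a default set can be replaced or extended by the caller. `SketchPublisher`, `Middleware`, and `defaultMiddlewares` are hypothetical names and do not reflect the actual BroadbandPublisher API.

```typescript
// A middleware takes a message and returns the (possibly transformed) message.
type Middleware<T> = (message: T) => T;

// Hypothetical default set: here, a single middleware that trims whitespace.
const defaultMiddlewares: Middleware<string>[] = [(message) => message.trim()];

class SketchPublisher {
  // Callers may pass their own list to override or extend the defaults.
  constructor(
    private readonly middlewares: Middleware<string>[] = defaultMiddlewares
  ) {}

  // Applies each middleware in order, feeding the output of one into the next.
  process(message: string): string {
    return this.middlewares.reduce((acc, middleware) => middleware(acc), message);
  }
}

// Extending the defaults with an extra, application-specific middleware.
const upperCase: Middleware<string> = (message) => message.toUpperCase();
const customPublisher = new SketchPublisher([...defaultMiddlewares, upperCase]);
```

Keeping the defaults in an exported array lets callers either replace them entirely or spread them into a longer list, as `customPublisher` does.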
Relaxed the definition of the config parameter in the constructor from StrictConfig to Pick<StrictConfig, 'pool'> for the ReportPoller class for better flexibility. Also, the access modifier of kyvePool is updated from 'private' to 'protected' to make it accessible in inheriting classes.
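The effect of the two changes can be illustrated with a small sketch. The shapes of `StrictConfig` and the pool object below are invented for the example, and `ObserverReportPoller` is a hypothetical subclass; only the `Pick<StrictConfig, 'pool'>` parameter type and the `protected` modifier on `kyvePool` come from the commit message.

```typescript
// Hypothetical config shape; the real StrictConfig has other fields.
interface PoolConfig {
  id: number;
}

interface StrictConfig {
  pool: PoolConfig;
  privateKey: string;
}

class ReportPoller {
  // protected (rather than private) so that subclasses can read it.
  protected readonly kyvePool: PoolConfig;

  // Only the 'pool' part of the config is required here.
  constructor(config: Pick<StrictConfig, 'pool'>) {
    this.kyvePool = config.pool;
  }
}

// A subclass in another package can now reuse the pool without a full config.
class ObserverReportPoller extends ReportPoller {
  describePool(): string {
    return `pool ${this.kyvePool.id}`;
  }
}
```

With the narrower parameter type, a caller that has no `privateKey` (for example, a read-only consumer) can still construct the poller by passing `{ pool: { id: 7 } }`.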
Multiple changes were made in the broker package to embed OpenTelemetry metrics and traces into the testing process. This was accomplished by importing the 'metrics' and 'trace' modules from @opentelemetry/api into the test utils, and by defining functions that initialize and shut down the OpenTelemetry SDK within the test environment setup and teardown procedures. Also, new unit tests were added to ensure the correctness of metrics and traces.
Added CONFIG_TEST from ConfigTest and validateConfig from Config to the exports in "exports-esm.mjs". This resolves a previously existing conflict with the streamr re-exporting.
Adjusted the importOrder in the prettier configuration file (.prettierrc) to include a specific pattern for the 'startOpenTelemetry' module to make sure open telemetry is imported before others.
Implemented the ObserverPlugin class, responsible for managing the system messages and reports, and handling start and stop events.

Added a comprehensive configuration schema covering auth, plugins, and pool configurations. Set up detailed metrics for both system messages and reports, including observables for bytes, count, and lost messages, as well as bundled metrics such as the total number of reported queries and stored messages.

Created scripts for start-dev-server and start-prod-server to facilitate testing and deployment.

Added the utility function moduleFromMetaUrl, which converts a given meta URL to a NodeJS.Module object.
In this commit, the observer package is added to Dockerfile.base in dev-network, which incorporates the observer functionality into the Docker image. The necessary configuration files, observer.json and Dockerfile.observer, and a shell script, start-in-docker.sh, are also added to help start the observer in a Docker container. This change was necessary to support observer functionality during development.
Updated the .gitignore file to ensure that '.env' files within the 'dev-network/assets' directory are no longer ignored by Git, allowing for development-specific environment variables to be tracked. In addition, a '.env.observer' file has been created within the 'dev-network/assets/observer' directory. This file contains settings related to log level and telemetry data collection, including a workaround to allow 'node-fetch' to call 'arweave.net' with a self-signed certificate.
Added Grafana and OpenTelemetry Collector to the list of ports in ports.md. Grafana will enable better data visualization for Logstore, and OpenTelemetry Collector will enhance monitoring capabilities. Also, created observer.md with the observer information for Logstore DevNetwork. Updated the connect.sh script to ensure both new services can be accessed when SSH'ing into the server.
A CONFIG_PATH environment variable was introduced to allow customization of the configuration file path. Now, the system will check for this variable and use the corresponding file if it's available, otherwise, it will fall back to the default path. This provides flexibility in configurations for different deployment scenarios.

Also, we updated the privateKey and network id within the development-1.env.json.
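The CONFIG_PATH fallback described above might look roughly like the sketch below. It assumes the check is simply "variable set and non-empty" and does not test whether the file exists; `resolveConfigPath` and the default path are hypothetical, not taken from the PR.

```typescript
// Environment passed in explicitly so the function is easy to test.
type Env = Record<string, string | undefined>;

// Hypothetical default location; the real default path is not shown in this PR.
const DEFAULT_CONFIG_PATH = './config/default.json';

// Returns CONFIG_PATH when it is set to a non-empty value, else the default.
function resolveConfigPath(
  env: Env,
  defaultPath: string = DEFAULT_CONFIG_PATH
): string {
  const fromEnv = env.CONFIG_PATH?.trim();
  return fromEnv ? fromEnv : defaultPath;
}
```

In a Node.js process this would be called as `resolveConfigPath(process.env)`, so that a deployment can point the same binary at a different configuration file without code changes.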
Introduced a README markdown file for the 'Observer' package. This documentation provides an overview, key features, quick start guide, environment variables, configuration, and deployment steps for the package. The purpose of this addition is to aid users in understanding and working with the Observer package, which is central to inspecting network activities and collecting telemetry data.