Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(observability): otel integration add tracing #48

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

cchenggit
Copy link
Collaborator

No description provided.

@cchenggit cchenggit marked this pull request as draft October 14, 2024 17:39
Copy link

codecov bot commented Oct 14, 2024

Codecov Report

Attention: Patch coverage is 74.70449% with 107 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
test/helper/helper.go 59.09% 12 Missing and 6 partials ⚠️
pkg/tracing/opentelemetry.go 79.22% 11 Missing and 5 partials ⚠️
db/dao/dao.go 62.50% 12 Missing and 3 partials ⚠️
pkg/tracing/tracing.go 73.21% 11 Missing and 4 partials ⚠️
pkg/queue/redis/redis.go 33.33% 8 Missing and 2 partials ⚠️
test/tracing/types.go 72.22% 9 Missing and 1 partial ⚠️
proxy/middlewares/recovery.go 0.00% 7 Missing ⚠️
config/database.go 75.00% 4 Missing and 2 partials ⚠️
config/config.go 42.85% 3 Missing and 1 partial ⚠️
app/app.go 76.92% 2 Missing and 1 partial ⚠️
... and 1 more
Flag Coverage Δ
integration 68.70% <72.81%> (+0.67%) ⬆️
unit 13.17% <5.53%> (-0.68%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
admin/admin.go 71.42% <ø> (-0.99%) ⬇️
admin/api/api.go 93.54% <100.00%> (+0.21%) ⬆️
admin/api/middleware.go 96.00% <ø> (ø)
config/redis.go 100.00% <ø> (ø)
config/tracing.go 100.00% <100.00%> (ø)
db/dao/attempt_dao.go 100.00% <100.00%> (ø)
db/dao/attempt_detail_dao.go 91.17% <100.00%> (+1.52%) ⬆️
db/db.go 67.24% <100.00%> (+3.90%) ⬆️
dispatcher/dispatcher.go 76.85% <100.00%> (+0.99%) ⬆️
pkg/taskqueue/redis.go 84.40% <100.00%> (+2.48%) ⬆️
... and 14 more

... and 1 file with indirect coverage changes

@vm-001 vm-001 self-requested a review October 15, 2024 00:51
@cchenggit cchenggit marked this pull request as ready for review October 22, 2024 14:28
config.yml Outdated Show resolved Hide resolved
config.yml Outdated Show resolved Hide resolved
config.yml Outdated Show resolved Hide resolved
config.yml Outdated Show resolved Hide resolved
config.yml Show resolved Hide resolved
proxy/router/router.go Outdated Show resolved Hide resolved
pkg/tracing/opentelemetry/opentelemetry.go Outdated Show resolved Hide resolved
@vm-001
Copy link
Member

vm-001 commented Oct 23, 2024

Please resolve the conflict.


type OpentelemetryConfig config.OpenTelemetryConfig

func (o *OpentelemetryConfig) Setup(serviceName string, samplingRate float64, globalAttributes map[string]string) (trace.Tracer, io.Closer, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to make OpentelemetryConfig as a function arg instead of a receiver.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interface should be better to implement different tracing setup.

CapturedResponseHeaders []string `yaml:"captured_response_headers"`
SafeQueryParams []string `yaml:"safe_query_params"`
SamplingRate float64 `yaml:"sampling_rate" default:"1"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to specific the propagation type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otel.SetTextMapPropagator(autoprop.NewTextMapPropagator()) will config propagation type from env OTEL_PROPAGATORS, default is tracecontext,baggage

config/tracing.go Outdated Show resolved Hide resolved
ServiceName string `yaml:"service_name" default:"WebhookX"`
CapturedRequestHeaders []string `yaml:"captured_request_headers"`
CapturedResponseHeaders []string `yaml:"captured_response_headers"`
SafeQueryParams []string `yaml:"safe_query_params"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the safe_query_params is defined in the config.yml? do we really need it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safe_query_params defined in config.yml, should check again.

type OpenTelemetryConfig struct {
Protocol OtlpProtocol `yaml:"protocol" envconfig:"PROTOCOL" default:"http/protobuf"`
Endpoint string `yaml:"endpoint" envconfig:"ENDPOINT" default:"http://localhost:4318/v1/metrics"`
Headers map[string]string `yaml:"headers,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove it as it can be configured via Otel Envs? Or do you think this is required?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's necessary, webhookx should start up with a simple default config rather than letting user gain a deeper understanding about OTEL Envs.
It can be override by OTEL Envs if users have more requirements of tracing

vm-001
vm-001 previously approved these changes Oct 25, 2024
go.mod Outdated Show resolved Hide resolved
config.yml Outdated Show resolved Hide resolved
@vm-001
Copy link
Member

vm-001 commented Oct 26, 2024

  1. Tracing instruments should be NOOP(as much as possible) when tracing.enabled is false to avoid performance impact.
  2. DAO span name should be more detailed. e.g. dao.endpoints.list("dao.{opts.Table}.{operation}") instead of dao.list
  3. Use otelsql.WithSpanOptions() to filter out unnecessary span such as sql.conn.reset_session.

@cchenggit
Copy link
Collaborator Author

cchenggit commented Oct 27, 2024

  1. Tracing instruments should be NOOP(as much as possible) when tracing.enabled is false to avoid performance impact.

tracing.enabled added, and it has been inject to database and redis config, so that they can control whether initialize tracing in the component.

  1. DAO span name should be more detailed. e.g. dao.endpoints.list("dao.{opts.Table}.{operation}") instead of dao.list

good point, updated.

  1. Use otelsql.WithSpanOptions() to filter out unnecessary span such as sql.conn.reset_session.

omited spans:

sql.connector.connect
sql.conn.ping
sql.conn.prepare
sql.conn.reset_session
sql.rows

all available spans list below:

MethodConnectorConnect Method = "sql.connector.connect"
MethodConnPing         Method = "sql.conn.ping"
MethodConnExec         Method = "sql.conn.exec"
MethodConnQuery        Method = "sql.conn.query"
MethodConnPrepare      Method = "sql.conn.prepare"
MethodConnBeginTx      Method = "sql.conn.begin_tx"
MethodConnResetSession Method = "sql.conn.reset_session"
MethodTxCommit         Method = "sql.tx.commit"
MethodTxRollback       Method = "sql.tx.rollback"
MethodStmtExec         Method = "sql.stmt.exec"
MethodStmtQuery        Method = "sql.stmt.query"
MethodRows             Method = "sql.rows"

admin/api/api.go Outdated
@@ -95,7 +97,7 @@ func (api *API) assert(err error) {
func (api *API) Handler() http.Handler {
r := mux.NewRouter()

r.Use(panicRecovery)
r.Use(middlewares.PanicRecovery)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite that both panicRecovery in proxy and admin are shares most of logic, but I don't it's the right time to use the same middleware.

@@ -58,6 +59,10 @@ func (cfg Config) Validate() error {
return err
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Username string `yaml:"username" default:"webhookx"`
Password string `yaml:"password" default:""`
Database string `yaml:"database" default:"webhookx"`
tracingEnabled bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer replacing db driver at app initialize rather than having tracingEnabled everywhere.

config/redis.go Outdated
Port uint32 `yaml:"port" default:"6379"`
Password string `yaml:"password" default:""`
Database uint32 `yaml:"database" default:"0"`
tracingEnabled bool
Copy link
Member

@vm-001 vm-001 Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

dispatcher/dispatcher.go Show resolved Hide resolved
func init() {
var err error
cfg, err = config.Init()
if err != nil {
panic(err)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to be deleted?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDE format issue

type TracingConfig struct {
Enabled bool `yaml:"enabled" default:"false"`
Attributes Map `yaml:"attributes"`
Opentelemetry *OpenTelemetryConfig `yaml:"opentelemetry"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason to use pointer here

return nil
}
if cfg.SamplingRate > 1 || cfg.SamplingRate < 0 {
return errors.New("invalid sampling rate, must be [0,1]")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return errors.New("invalid sampling rate, must be [0,1]")
return errors.New("sampling_rate must be in the range [0, 1]")

type OpenTelemetryConfig struct {
Protocol OtlpProtocol `yaml:"protocol" envconfig:"PROTOCOL" default:"http/protobuf"`
Endpoint string `yaml:"endpoint" envconfig:"ENDPOINT" default:"http://localhost:4318/v1/metrics"`
Headers Map `yaml:"headers" envconfig:"HEADERS"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove Headers as it can be configured via
OTEL_EXPORTER_OTLP_TRACES_HEADERS

@@ -91,6 +93,12 @@ func NewRedisQueue(opts RedisTaskQueueOptions, logger *zap.SugaredLogger, metric
}

func (q *RedisTaskQueue) Add(ctx context.Context, tasks []*TaskMessage) error {
if tracer := tracing.TracerFromContext(ctx); tracer != nil {
tracingCtx, span := tracer.Start(context.Background(), "queue.add", trace.WithSpanKind(trace.SpanKindClient))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tracingCtx, span := tracer.Start(context.Background(), "queue.add", trace.WithSpanKind(trace.SpanKindClient))
tracingCtx, span := tracer.Start(ctx, "queue.add", trace.WithSpanKind(trace.SpanKindClient))

Copy link
Member

@vm-001 vm-001 Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change queue.* to taskqueue.* as it may get confused with the proxy queue.

@vm-001
Copy link
Member

vm-001 commented Oct 27, 2024

  1. manually start tracing span in queue/redis/redis
  2. The redisotel is basically covered by the manual span, so I think redisotel is not mandatory at this point, please remove it.

@vm-001 vm-001 added this to the 0.4.0 milestone Oct 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants