
[ADAP-1089] [Bug] Server socket closed when running DBT on GitHub Actions #701

Closed
2 tasks done
BeltranCunef opened this issue Jan 4, 2024 · 7 comments
Labels
bug Something isn't working wontfix This will not be worked on

Comments

@BeltranCunef

BeltranCunef commented Jan 4, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When running dbt run --full-refresh on GitHub Actions I get an error message that the server has closed the connection unexpectedly. I'm using dbt to do a full refresh of all my seeds and models, excluding materialized views and others that have the tag do_not_run.

Expected Behavior

Finishing the full refresh successfully.

Steps To Reproduce

A simple dbt run --full-refresh -t $TARGET --profiles-dir ./ --exclude tag:my_tag config.materialized:materialized_view where $TARGET is a GitHub variable.
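For context, a command like this typically runs inside a workflow step along these lines (an editor's sketch, not the reporter's actual workflow; the file name, job name, and the source of the TARGET variable are assumptions):

```yaml
# .github/workflows/dbt.yml (hypothetical reconstruction)
jobs:
  full-refresh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-redshift
      - run: dbt run --full-refresh -t "$TARGET" --profiles-dir ./ --exclude tag:my_tag config.materialized:materialized_view
        env:
          # Assumed to come from a repository-level GitHub variable
          TARGET: ${{ vars.TARGET }}
```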

Relevant log output

Completed with 1 error and 0 warnings:
01:51:21  
01:51:21    Runtime Error in model my_model (models/marts/my_folder/my_model.sql)
  BrokenPipe: server socket closed. Please check that client side networking configurations such as Proxies, firewalls, VPN, etc. are not affecting your network connection.

Environment

- OS: Ubuntu 22.04.3
- Python: 3.11.7
- dbt: 1.7.3

Which database adapter are you using with dbt?

redshift

Additional Context

No response

@BeltranCunef BeltranCunef added bug Something isn't working triage labels Jan 4, 2024
@github-actions github-actions bot changed the title [Bug] <Server socket closed when running DBT on GitHub Actions> [CT-3520] [Bug] <Server socket closed when running DBT on GitHub Actions> Jan 4, 2024
@BeltranCunef BeltranCunef changed the title [CT-3520] [Bug] <Server socket closed when running DBT on GitHub Actions> [Bug] Server socket closed when running DBT on GitHub Actions Jan 4, 2024
@dbeatty10 dbeatty10 transferred this issue from dbt-labs/dbt-core Jan 4, 2024
@github-actions github-actions bot changed the title [Bug] Server socket closed when running DBT on GitHub Actions [ADAP-1089] [Bug] Server socket closed when running DBT on GitHub Actions Jan 4, 2024
@dbeatty10
Contributor

Thanks for reaching out @BeltranCunef !

dbt-redshift needs a stable network connection. If the connection is lost, the Amazon Redshift Python driver will emit the error message you saw.

We're not planning on changing the requirement of a stable network connection for dbt-redshift, so I'm going to close this as "not planned".

@dbeatty10 dbeatty10 closed this as not planned Won't fix, can't repro, duplicate, stale Jan 4, 2024
@dbeatty10 dbeatty10 added wontfix This will not be worked on and removed triage labels Jan 4, 2024
@ag-serenis

ag-serenis commented Feb 11, 2024

@dbeatty10 we are having the same issue when running dbt core v1.7.3 from GitHub Actions, but not when running from local laptops. I would have assumed the connection from GitHub to be more stable than the one from my ISP.
So I am wondering whether this is a problem specific to running from GitHub Actions.
Also, we do not seem to have the same problem when we run an older version of dbt (v0.21.1).
As the keepalives_idle option is now deprecated in dbt-redshift, is there any way to further investigate or solve the issue?
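One hedged avenue for investigation (an editor's sketch, not a fix proposed in this thread): dbt-redshift documents the profile fields connect_timeout and retries, which can be set in profiles.yml. Whether they help with mid-query socket drops, as opposed to failures when opening a connection, is unverified. Profile name, host, and credentials below are placeholders:

```yaml
# profiles.yml (sketch -- names and credentials are placeholders)
my_profile:
  target: dev
  outputs:
    dev:
      type: redshift
      host: my-cluster.abc123.us-east-1.redshift.amazonaws.com
      user: my_user
      password: "{{ env_var('REDSHIFT_PASSWORD') }}"
      port: 5439
      dbname: analytics
      schema: public
      threads: 4
      connect_timeout: 30   # seconds to wait when opening a connection
      retries: 3            # connection retry attempts on failure
```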

@FridayPush

This error message is just now becoming a problem for us when testing locally or in CI. It also comes up every now and then on Slack. It's always a long-running model while no other model is running. For some, the deprecated keepalives_idle resolves the issue.
Today on a redshift serverless question: https://getdbt.slack.com/archives/CJARVS0RY/p1708354987376349
Mid Jan: https://getdbt.slack.com/archives/CJARVS0RY/p1704831868838559

But that no longer seems to help. Running a specific model alone that takes ~12 min to full refresh always throws a BrokenPipe error. Upgrading to 1.7 returns a buffer size error; 1.5.8 shows an error message about my internet connection/VPN/etc.
dbt-redshift==1.4.0 completes successfully every time.

@BrettM86

I have been having this issue on 1.7.3 with my deployments on both Dagster Cloud and through Fivetran's dbt deployment. Setting keepalives_idle and RA3 unfortunately doesn't resolve my issues. I'm also not on Redshift Serverless.

@dbeatty10 Is it possible we can get a second look into what's causing these issues? It seems to be more of a dbt issue than an "internet connection" issue, considering that multiple cloud deployments are failing and that reverting to dbt-redshift==1.4.0 solves all problems for most people.

It's also not happening to specific models: I have some staging models that typically execute in < 10 seconds but are being flagged as taking 9000 seconds to execute before a Broken Pipe error or connection timeout (set in my config).

Another time it's been brought up on slack Feb 21st: https://getdbt.slack.com/archives/CJARVS0RY/p1708535489807729

@Previatto

Hey @BrettM86, did you downgrade just dbt-redshift to 1.4.0? From what I understand, in dbt-redshift 1.5 they changed the driver from psycopg2 to redshift_connector. From my own research, it seems that redshift_connector has the parameter tcp_keepalive set to true by default, so I think that is why keepalives_idle was deprecated.
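At the OS level, what a driver-level tcp_keepalive option amounts to is roughly the SO_KEEPALIVE socket option. A minimal sketch (illustrative only; redshift_connector sets this on its own socket internally, and the TCP_KEEP* tuning constants are platform-dependent, hence the hasattr guard):

```python
import socket

# Enable OS-level keepalive probes on a TCP socket, as a driver's
# tcp_keepalive=True flag would do on its own connection socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# On Linux, the probe timing can be tuned further:
if hasattr(socket, "TCP_KEEPIDLE"):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before first probe
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before the connection is dropped

# Confirm the option took effect (nonzero means enabled).
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
sock.close()
```

If the probes themselves are being dropped somewhere between the runner and Redshift, keepalive alone would not prevent the BrokenPipe seen here.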

btw, are you, @FridayPush and @ag-serenis using the parameter --fail-fast?

(for @dbeatty10 too) I was plagued with the same broken-pipe error until I realized that it was happening when I used the --fail-fast parameter and a model had an error. As soon as the model errored (I was missing an alias, wrong column name, etc.), it would return the broken pipe, no matter how long it ran.

As soon as I removed the parameter, boom: a 17-minute run with no broken pipe, even with some models having errors. I added it back and the broken pipe came back with it. So either my connection magically stayed stable for 17 minutes only when I removed --fail-fast, or there is something more to it.

@FridayPush

FridayPush commented Mar 20, 2024

We are not using the --fail-fast flag. The models I was trying against weren't modified from normal runs; I was specifically full refreshing individual tables that had changed. With other tools and direct SQL queries I have no issues running the query (e.g. in PyCharm's SQL tools, running the compiled query to a temp table). Perhaps this is more related to the psycopg2-to-redshift_connector change. I was reminded of this thread when another user in #db-redshift on Slack complained about connection issues this morning.

I do think it's something related to 'micro-hiccups' in networking or similar. I have 10G symmetrical internet at my house and see no indication of networking issues with anything else, but the jobs from dbt Cloud do not trigger this often at all.

@dadadima

dadadima commented Aug 26, 2024

Adding a comment here just in case this is useful for someone still experiencing the issue.
What solved it for me was changing the GitHub Actions runner from ubuntu-latest to macos-latest. With this change I stopped experiencing the server-socket-related issues. I haven't had time to test the Windows runner (especially because my action had many bash commands).

My setup:
- Runner: macos-latest
- Python: 3.11.9
- dbt-core==1.8.3
- dbt-redshift==1.8.1
