-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatic validation after error #186
Comments
This seems to be fine when I simulate a connection drop with duckdb. Kjell, are you seeing a different behavior when the connection to your real database drops? drv <- duckdb::duckdb()
pool <- pool::dbPool(drv)
# All is fine
con <- pool::poolCheckout(pool)
con@conn_ref
#> <pointer: 0x600002eca180>
DBI::dbGetQuery(con, "SELECT 1")
#> 1
#> 1 1
# Connection drops
con <- unserialize(serialize(con, NULL))
con@conn_ref
#> <pointer: 0x0>
pool::poolReturn(con)
# Connection seems to be fine when checked out from pool again
con <- pool::poolCheckout(pool)
con@conn_ref
#> <pointer: 0x600002ec9fc0>
DBI::dbGetQuery(con, "SELECT 2")
#> 2
#> 1 2
pool::poolReturn(con) Created on 2024-08-14 with reprex v2.1.0 |
In my experience it only happens when a server drops the connection. Here is a reprex:
|
Thanks, confirmed. |
What happens if you pass a |
This is a public-facing server, it's easy to try various options. That said, both I still see a race condition: checkout, connection drop, issuing query. It seems to be easy for the pool package to just retry in the case of query failures, perhaps optionally, and perhaps only if the error text (🙀🙀🙀) matches a pattern. 0pool <- pool::dbPool(
drv = RMariaDB::MariaDB(),
validationInterval = 0,
host = "relational.fel.cvut.cz",
port = 3306,
user = "guest",
password = "ctu-relational"
)
con <- pool::poolCheckout(pool)
DBI::dbExecute(con, "SET SESSION wait_timeout=2;")
#> [1] 0
pool::poolReturn(con)
Sys.sleep(3)
con <- pool::poolCheckout(pool)
DBI::dbIsValid(con)
#> [1] TRUE
DBI::dbGetQuery(con, "SELECT 1")
#> 1
#> 1 1 Created on 2024-08-21 with reprex v2.1.0 1pool <- pool::dbPool(
drv = RMariaDB::MariaDB(),
validationInterval = 1,
host = "relational.fel.cvut.cz",
port = 3306,
user = "guest",
password = "ctu-relational"
)
con <- pool::poolCheckout(pool)
DBI::dbExecute(con, "SET SESSION wait_timeout=2;")
#> [1] 0
pool::poolReturn(con)
Sys.sleep(3)
con <- pool::poolCheckout(pool)
DBI::dbIsValid(con)
#> [1] TRUE
DBI::dbGetQuery(con, "SELECT 1")
#> 1
#> 1 1 Created on 2024-08-21 with reprex v2.1.0 defaultpool <- pool::dbPool(
drv = RMariaDB::MariaDB(),
validationInterval = 60,
host = "relational.fel.cvut.cz",
port = 3306,
user = "guest",
password = "ctu-relational"
)
con <- pool::poolCheckout(pool)
DBI::dbExecute(con, "SET SESSION wait_timeout=2;")
#> [1] 0
pool::poolReturn(con)
Sys.sleep(3)
con <- pool::poolCheckout(pool)
DBI::dbIsValid(con)
#> [1] TRUE
DBI::dbGetQuery(con, "SELECT 1")
#> Error: Server has gone away [2006] Created on 2024-08-21 with reprex v2.1.0 |
Example for race condition with pool <- pool::dbPool(
drv = RMariaDB::MariaDB(),
validationInterval = 1,
host = "relational.fel.cvut.cz",
port = 3306,
user = "guest",
password = "ctu-relational"
)
con <- pool::poolCheckout(pool)
DBI::dbExecute(con, "SET SESSION wait_timeout=2;")
#> [1] 0
pool::poolReturn(con)
Sys.sleep(1)
con <- pool::poolCheckout(pool)
Sys.sleep(2)
DBI::dbIsValid(con)
#> [1] TRUE
DBI::dbGetQuery(con, "SELECT 1")
#> Error: Server has gone away [2006] Created on 2024-08-21 with reprex v2.1.0 |
My understanding of the way we intend people to use (It has been quite a while since I’ve thought about this package so please excuse me if I’ve forgotten some important use case!) |
A classic case of "you're holding it the wrong way", it seems? 🙃 Do you think it's worth stressing this point in pool <- pool::dbPool(
drv = RMariaDB::MariaDB(),
validationInterval = 1,
host = "relational.fel.cvut.cz",
port = 3306,
user = "guest",
password = "ctu-relational"
)
DBI::dbExecute(pool, "SET SESSION wait_timeout=2;")
#> [1] 0
Sys.sleep(3)
DBI::dbIsValid(pool)
#> [1] TRUE
DBI::dbGetQuery(pool, "SELECT 1")
#> 1
#> 1 1 Created on 2024-08-21 with reprex v2.1.0 |
We totally could have a documentation problem. IIRC when Barbara and I originally wrote this package the intention was to have everyone do poolCheckout/poolReturn, it was Hadley’s idea to have the pool itself act like a connection. I don’t remember what the docs emphasized (and I’m about to board a plane so I won’t look right now). Is this all happening because there’s a real database out there with a session timeout that’s shorter than 60 seconds? |
No, the timeout could be arbitrarily long. I did assume that pool works this way, but never got the chance to try it in production. The first attempt wasn't successful, we picked shorter timeouts to be able to reprex. Happy to help review a documentation PR. |
After reviewing the docs, I think we might want to slightly reframe the use cases. In Shiny (shiny?) apps with database access, querying the data from a database will block not only the UI of the app currently carrying out the database operation, but also the UIs of all other apps managed by the same process. (We learned that the hard way.) It's rare that we need to serve multiple connections (because all database accesses are short-lived anyway, with implicit checkout + read + return in close succession), but more important to keep the connections stable in the presence of server settings we can't control. Async DB access is something that ADBC/adbi might be able to solve, but a few more things will need to fall into place before this happens. (Also, should it be async DB, or async DB + computation + ... in a separate R process handled by mirai?) Happy to discuss. |
I think you mean "session" instead of app? IIRC the relationship between R process and Shiny app is still only 1:1, i.e., you can't run more than one Shiny app simultaneously--but you can have more than one user simultaneously connected to a single app/process.
This is all true but I'm not sure what change in framing you're suggesting? If you use
Unfortunately I think the answer is "it depends"... 😄 |
Added a note to |
Thanks, the doc extension looks good. I have added my vision to #190 . These cover what I would add to make sure the reconnection part is covered too. Other than that, yes, the behavior is what I'd be looking for. Thank you for your support! Yes, session => app 🤦 It depends, yes. |
@krlmlr as we discussed last week, if the DB connection is known to block then a solution would be to use one dedicated mirai process to host the DBI / pool connection for the app (can even be ADBC) - something along the lines of This would operate independently of other async computations you might (or might not) want to run for the Shiny app using the default or another profile. Just in relation to the case you mentioned, not talking about broader implications (yet 😄). |
As Joe said, it depends. If we move just the database into one process, but have other time-consuming computations and move them to another process, a lot of time will be spent on communication. That's my primary concern and the reason I'm not pushing too hard for "only" database async at this time. I'd like to understand this space better before moving. |
I think this also depends on what we're trying to solve. (i) for keeping the Shiny app responsive, both will be async and not block, and (ii) if we're not moving around huge amounts of data, the pure communications overhead is likely to be minimal (and 1/100 of what you might have in mind if your frame of reference is pre-mirai).
I see it as an opportunity to solve both cases rather than either/or, as I agree that it probably 'depends'! If I follow my original idea, Just floating some thoughts. |
In some setups, the connection to the database server will be closed by the server or the client library, e.g., after a period of time. Such a connection still gives
TRUE
fordbIsValid()
, but running a query will fail.Can we extend
dbSendQuery()
anddbSendStatement()
to do a "smoke test" withSELECT 1
after failure, to see if the connection is still alive, and perhaps automatically attempt a reconnect?CC @kjellpk.
The text was updated successfully, but these errors were encountered: