Skip to content

Connect to your database (e.g. redshift or postgres) from your jupyter notebook to interactively run queries and see the output immediately.

License

Notifications You must be signed in to change notification settings

davidkuda/nbdbsession

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

nbdbsession

This code helps you to manage connections to sql databases in jupyter notebooks.

nbdbsession stands for "notebook database session".

I use notebooks very often to connect to databases like postgres. With the notebook sessions, I often work on PoCs (proof of concept), on presentations or on debugging.

catherinedevlin has created open source software that I love: ipython-sql. (Note: There was a fork of this project: jupy-sql. The original version is not actively maintained, and does not work with sqlalchemy version 2 that was released in January 2023)

This code -- nbdbsession -- lets you connect to sql databases from your notebook, and run queries.

How To Use nbdbsession

First, install it:

pip install nbdbsession

Then, in your git repository where you start your notebook, create a .settings.toml file with your database login credentials:

You need to define your database credentials in a toml file. The toml file should look like this:

# either: .settings.toml on top level of your git repo
# or:     $HOME/.nbdbsession.creds.toml

[davidkuda]
db_driver = "postgresql"
database = "dev"
user = "davidkuda"
password = "${DB_PASSWORD}" # you can use environment variables
db_url = "localhost"
port = 5439
# the ssh command is optional:
ssh_cmd = "ssh -fL 5432:db.kuda.ai:5432"

nbdbsession will scan three places for a toml file and will choose the first hit in this order:

  • env var $SETTINGS_FILE_PATH
  • (root directory of a git project)/.settings.toml
  • $HOME/.nbdbsession.creds.toml

Finally, you can connect to your database in your notebook by running the following code in a cell:

from nbdbsession.sqlconn import connect

connect("davidkuda") # note: this is the name as defined in .settings.toml

Once you have done that, you can run sql commands by prepending %sql (one line) %%sql (multi-line) in the notebook.

Run single line sql commands directly in your notebook:

%sql SELECT * FROM table LIMIT 10;

Run multi-line sql commands directly in your notebook:

%%sql
SELECT
    *
FROM
    table
LIMIT
    10;

Managing the conn without this repo

In the usual way, you would create a connection string, something like this:

%load_ext sql 

import parse

# make a connection string for your database connection
config = {
    "user": "postgres",
    "password": parse.quote("postgres"),
    "url": "127.0.0.1",
    "port": 5432,
    "database": "postgres",
}

conn_string = f'postgresql://{c["user"]}:{c["password"]}@{c["url"]}:{c["port"]}/{c["database"]}'
# results in: 'postgresql://postgres:[email protected]:5432/postgres'

%sql $conn_string

With the code in this repo, you reduce all that to one line:

from nbdbsession.sqlconn import connect

# this will enable ipython sql and use the conn str that you choose:
connect("staging")

%sql

About

Connect to your database (e.g. redshift or postgres) from your jupyter notebook to interactively run queries and see the output immediately.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages