Up and running with a data labeling application on top of your Databricks Lakehouse in 5 minutes.
- Databricks SQL endpoint
- Locally: Docker, Makefile
- Create or start an existing SQL endpoint of any size in your Databricks workspace
- On the local machine, clone the repository and create a .env file in the repository directory. Follow .env.sample for instructions.
- Build the demo container:
docker build -t dbsql-labeling-app-example-demo -f dockerfiles/Dockerfile.demo .
- Prepare the data. Warning: this script runs a TRUNCATE TABLE command on the table called labels in the given catalog and database, so make sure no such table already exists:
docker run \
-it \
--env-file=.env \
dbsql-labeling-app-example-demo \
python dbsql_labeling_app_example/loader.py
- Start the UI:
docker run \
-it \
--env-file=.env \
-p 8050:8050 \
dbsql-labeling-app-example-demo \
python dbsql_labeling_app_example/app.py
- Open http://localhost:8050 and enjoy the new application 🔥
- To stop the app, press Ctrl-C.
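Before building and running the containers, it can help to confirm that the local .env file defines every variable listed in .env.sample. The following is a minimal sketch, not part of the repository: the helper names `parse_env_keys` and `missing_keys` are hypothetical, and it assumes both files use plain `NAME=value` lines (the format `docker run --env-file` reads), with `#` marking comments.

```python
from pathlib import Path


def parse_env_keys(text: str) -> set[str]:
    """Return the variable names defined in NAME=value style text."""
    keys = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, _value = line.partition("=")
        # Only count lines that actually contain "=" and a non-empty name.
        if sep and name.strip():
            keys.add(name.strip())
    return keys


def missing_keys(sample_text: str, env_text: str) -> set[str]:
    """Names present in the sample file but absent from the env file."""
    return parse_env_keys(sample_text) - parse_env_keys(env_text)


if __name__ == "__main__":
    sample = Path(".env.sample")
    env = Path(".env")
    if sample.exists() and env.exists():
        missing = missing_keys(sample.read_text(), env.read_text())
        print("Missing variables:", sorted(missing) if missing else "none")
    else:
        print("Expected both .env.sample and .env in the current directory")
```

Run it from the repository directory. Variables that are only mentioned inside comments in .env.sample are not detected by this sketch.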
- Use the VSCode + DevContainers extension
- After the DevContainer starts, run this command to open the Poetry shell:
poetry shell
- For UI development with hot reloading, run:
DEBUG=True python dbsql_labeling_app_example/app.py
- For the ETL part, check the dbsql_labeling_app_example/loader.py source code.
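Because the loader truncates the labels table, a quick existence check before running it can prevent accidental data loss. The following is a minimal sketch under stated assumptions, not code from the repository: `labels_table_exists` is a hypothetical helper, it accepts any DB-API 2.0 cursor connected to the SQL endpoint, and it relies on the Databricks SQL `SHOW TABLES ... LIKE` statement. The catalog and database names are whatever you configured in .env.

```python
def labels_table_exists(cursor, catalog: str, database: str, table: str = "labels") -> bool:
    """Return True if a table named `table` exists in catalog.database.

    `cursor` is any DB-API 2.0 cursor connected to the SQL endpoint.
    """
    # Backtick-quote the identifiers so unusual names do not break the statement.
    cursor.execute(f"SHOW TABLES IN `{catalog}`.`{database}` LIKE '{table}'")
    # SHOW TABLES returns one row per matching table; no rows means no table.
    return len(cursor.fetchall()) > 0
```

One way to obtain such a cursor is the databricks-sql-connector package: `databricks.sql.connect(server_hostname=..., http_path=..., access_token=...)` returns a connection whose `cursor()` can be passed to the helper. If the helper returns True, back up or rename the existing table before running the loader. The statement fails with an error if the database itself does not exist.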