Skip to content

Latest commit

 

History

History
148 lines (115 loc) · 4.38 KB

README.md

File metadata and controls

148 lines (115 loc) · 4.38 KB

Accelerating Queries with SQLite Quickstart

Follow this quickstart to configure dataset acceleration using SQLite.

Tip: Open and refer to the SQLite Data Accelerator documentation while completing this quickstart.

Tip: Follow Advanced Data Refresh Quickstart to learn more about advanced data refresh scenarios, such as programmatically updating refresh_sql and triggering data refreshes.

Step 1. Initialize the Spice app

Ensure the Spice CLI is installed. If not, follow the Spice Getting Started guide to install.

Clone the Spice samples repository and navigate to the sqlite directory:

git clone https://github.com/spiceai/quickstarts.git
cd quickstarts/sqlite/accelerator

Start the Spice runtime

spice run

Output:

2024-09-10T06:54:36.184935Z  INFO runtime::metrics_server: Spice Runtime Metrics listening on 127.0.0.1:9090
2024-09-10T06:54:36.185086Z  INFO runtime::flight: Spice Runtime Flight listening on 127.0.0.1:50051
2024-09-10T06:54:36.187305Z  INFO runtime::http: Spice Runtime HTTP listening on 127.0.0.1:8090
2024-09-10T06:54:36.193225Z  INFO runtime::opentelemetry: Spice Runtime OpenTelemetry listening on 127.0.0.1:50052
2024-09-10T06:54:36.385124Z  INFO runtime: Initialized results cache; max size: 128.00 MiB, item ttl: 1s
2024-09-10T06:54:37.020990Z  INFO runtime: Dataset taxi_trips registered (s3://spiceai-demo-datasets/taxi_trips/2024/), results cache enabled.

Step 2. Run query against the dataset using the Spice SQL REPL

In a new terminal, start the Spice SQL REPL.

spice sql

Enter a query to display the longest taxi trips:

SELECT trip_distance, total_amount FROM taxi_trips ORDER BY trip_distance DESC LIMIT 10;

Output:

+---------------+--------------+
| trip_distance | total_amount |
+---------------+--------------+
| 312722.3      | 22.15        |
| 97793.92      | 36.31        |
| 82015.45      | 21.56        |
| 72975.97      | 20.04        |
| 71752.26      | 49.57        |
| 59282.45      | 33.52        |
| 59076.43      | 23.17        |
| 58298.51      | 18.63        |
| 51619.36      | 24.2         |
| 44018.64      | 52.43        |
+---------------+--------------+

Time: 2.1508365 seconds. 10 rows.

Step3. Enable SQLite Accelerator

Use text editor to open .spicepod.yaml and uncomment acceleration section. Save.

Before:

version: v1beta1
kind: Spicepod
name: spice_app
datasets:
  - from: s3://spiceai-demo-datasets/taxi_trips/2024/
    name: taxi_trips
    description: taxi trips in s3
    params:
      file_format: parquet
#   acceleration:
#     enabled: true
#     engine: sqlite
#     mode: file

After:

version: v1beta1
kind: Spicepod
name: spice_app
datasets:
- from: s3://spiceai-demo-datasets/taxi_trips/2024/
  name: taxi_trips
  description: taxi trips in s3
  params:
    file_format: parquet
  acceleration:
    enabled: true
    engine: sqlite
    mode: file

The following output is shown in the Spice runtime terminal confirming new configuration is applied.

2024-09-10T06:59:21.908667Z  INFO runtime: Unloaded dataset taxi_trips
2024-09-10T06:59:22.524295Z  INFO runtime: Dataset taxi_trips registered (s3://spiceai-demo-datasets/taxi_trips/2024/), acceleration (sqlite:file), results cache enabled.
2024-09-10T06:59:22.525789Z  INFO runtime::accelerated_table::refresh_task: Loading data for dataset taxi_trips
2024-09-10T06:59:39.244473Z  INFO runtime::accelerated_table::refresh_task: Loaded 2,964,624 rows (421.71 MiB) for dataset taxi_trips in 16s 718ms.

Run query to display the longest taxi trips again:

SELECT trip_distance, total_amount FROM taxi_trips ORDER BY trip_distance DESC LIMIT 10;

Output:

+---------------+--------------+
| trip_distance | total_amount |
+---------------+--------------+
| 312722.3      | 22.15        |
| 97793.92      | 36.31        |
| 82015.45      | 21.56        |
| 72975.97      | 20.04        |
| 71752.26      | 49.57        |
| 59282.45      | 33.52        |
| 59076.43      | 23.17        |
| 58298.51      | 18.63        |
| 51619.36      | 24.2         |
| 44018.64      | 52.43        |
+---------------+--------------+

Time: 0.193560667 seconds. 10 rows.

Observe query execution time decreased from 2.1508365 to 0.193560667 seconds.