content changes
Yibei990826 committed Nov 24, 2024
1 parent be9280b commit f246d93
Showing 1 changed file with 53 additions and 69 deletions.
122 changes: 53 additions & 69 deletions nbs/docs/getting-started/7_why_timegpt.ipynb
@@ -67,7 +67,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we compare the performance of TimeGPT against three forecasting models: the classical model (ARIMA), the machine learning model (LGBRegressor), and the deep learning model (N-HiTS), using a subset of data from the M5 Forecasting competition. We want to highlight three top-rated benefits our users love about TimeGPT:\n",
"In this notebook, we compare the performance of TimeGPT against three forecasting models: the classical model (ARIMA), the machine learning model (LightGBM), and the deep learning model (N-HiTS), using a subset of data from the M5 Forecasting competition. We want to highlight three top-rated benefits our users love about TimeGPT:\n",
"\n",
"🎯 **Accuracy**: TimeGPT consistently outperforms traditional models by capturing complex patterns with precision.\n",
"\n",
@@ -346,7 +346,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Model Fitting (TimeGPT, ARIMA, LGBRegressor, N-HiTS)"
"## 2. Model Fitting (TimeGPT, ARIMA, LightGBM, N-HiTS)"
]
},
{
@@ -393,8 +393,8 @@
"data": {
"text/plain": [
"metric\n",
"rmse 592.586609\n",
"smape 0.049402\n",
"rmse 592.609313\n",
"smape 0.049404\n",
"Name: TimeGPT, dtype: float64"
]
},
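> **Editor's note (not part of the commit):** the TimeGPT forecasting cell itself is collapsed in this diff; only its evaluation output is visible above. For orientation, a minimal call of the kind evaluated here might look like the sketch below. It assumes the `nixtla` client package, a valid API key, and the `df_train` frame and 28-day horizon used elsewhere in this notebook.
>
> ```python
> from nixtla import NixtlaClient
>
> # Hypothetical sketch: forecast the next 28 days for every series in df_train.
> nixtla_client = NixtlaClient(api_key='YOUR_API_KEY')  # assumed credential
> fcst_timegpt = nixtla_client.forecast(
>     df=df_train,  # long-format frame with unique_id, ds, y columns
>     h=28,         # forecast horizon in days, matching the notebook's setup
>     freq='D',     # daily frequency
> )
> ```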
@@ -416,7 +416,7 @@
"metadata": {},
"source": [
"### 2.2 Classical Models (ARIMA):\n",
"Secondly, we applied ARIMA, a classical statistical model, to the same forecasting task. Here, ARIMA struggled to capture the data's intricate, non-linear patterns, resulting in comparatively lower accuracy."
"Next, we applied ARIMA, a traditional statistical model, to the same forecasting task. Classical models use historical trends and seasonality to make predictions by relying on linear assumptions. However, they struggled to capture the complex, non-linear patterns within the data, leading to lower accuracy compared to other approaches. Additionally, ARIMA was slower due to its iterative parameter estimation process, which becomes computationally intensive for larger datasets."
]
},
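> **Editor's note (not part of the commit):** the ARIMA code cell is collapsed in this diff. For reference, a typical per-series setup with statsforecast's AutoARIMA looks roughly like the sketch below; the weekly season length and the `df_train` / 28-day horizon are assumptions carried over from the rest of the notebook.
>
> ```python
> from statsforecast import StatsForecast
> from statsforecast.models import AutoARIMA
>
> # Hypothetical sketch: fit one AutoARIMA per series and forecast 28 days ahead.
> sf = StatsForecast(
>     models=[AutoARIMA(season_length=7)],  # weekly seasonality assumed for daily M5 data
>     freq='D',
> )
> fcst_arima = sf.forecast(df=df_train, h=28)
> ```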
{
@@ -496,9 +496,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.3 Machine Learning Models (LGBMRegressor)\n",
"### 2.3 Machine Learning Models (LightGBM)\n",
"\n",
"Thirdly, we used machine learning model (LGBRegressor) for the same task. While LGBRegressor can capture seasonality and patterns, it requires detailed feature engineering, careful tuning, and domain knowledge to optimize performance."
"Thirdly, we used a machine learning model, LightGBM, for the same forecasting task, implemented through the automated pipeline provided by our mlforecast library.\n",
"While LightGBM can capture seasonality and patterns, achieving the best performance often requires detailed feature engineering, careful hyperparameter tuning, and domain knowledge. You can try our mlforecast library to simplify this process and get started quickly!"
]
},
{
@@ -521,37 +522,27 @@
"outputs": [],
"source": [
"import optuna\n",
"from mlforecast.auto import AutoMLForecast, AutoLightGBM"
"from mlforecast.auto import AutoMLForecast, AutoLightGBM\n",
"\n",
"# Suppress Optuna's logging output\n",
"optuna.logging.set_verbosity(optuna.logging.ERROR)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"metric\n",
"rmse 687.773744\n",
"smape 0.051448\n",
"Name: AutoLightGBM, dtype: float64"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"optuna.logging.set_verbosity(optuna.logging.ERROR)\n",
"\n",
"# Initialize an automated forecasting pipeline using AutoMLForecast.\n",
"mlf = AutoMLForecast(\n",
" models=[AutoLightGBM()],\n",
" freq='D',\n",
" season_length=7,\n",
" season_length=7, \n",
" fit_config=lambda trial: {'static_features': ['unique_id']}\n",
")\n",
"\n",
"# Fit the model to the training dataset.\n",
"mlf.fit(\n",
" df=df_train.astype({'unique_id': 'category'}),\n",
" n_windows=1,\n",
@@ -650,7 +641,7 @@
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "8605f576022d436fa3fe0205ddb28c62",
"model_id": "0c26ef6fd57a4ea5abd154adb2f31030",
"version_major": 2,
"version_minor": 0
},
@@ -664,7 +655,7 @@
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0d5c9a7a9d434d009d663962006251c4",
"model_id": "f35365801c1448c592757ae376217f50",
"version_major": 2,
"version_minor": 0
},
@@ -678,7 +669,7 @@
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5d7a2e04d6c64bbea8c4d24e76aa1315",
"model_id": "bab6fb21d0b642a98c42dcc8310a5a42",
"version_major": 2,
"version_minor": 0
},
@@ -702,7 +693,7 @@
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "594b4e4727054e569fdee9d0cca86acd",
"model_id": "42edfccc1470410da181e96bcef803c6",
"version_major": 2,
"version_minor": 0
},
@@ -723,24 +714,17 @@
}
],
"source": [
"#| echo: true\n",
"#| eval: false\n",
"# Initialize the N-HiTS model.\n",
"models = [NHITS(h=28, \n",
" input_size=28, \n",
" max_steps=100)]\n",
"\n",
"# Fit the model using training data\n",
"nf = NeuralForecast(models=models, freq='D')\n",
"nf.fit(df=df_train)\n",
"fcst_nhits = nf.predict()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since this machine doesn’t have GPU, the result is trained using Google Colabs."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -784,14 +768,14 @@
"\n",
"| **Model** | **RMSE** | **SMAPE** |\n",
"|------------------|----------|-----------|\n",
"| ARIMA | 1167.5 | 8.30% |\n",
"| LGBRegressor | 816.7 | 8.06% |\n",
"| N-HiTS | 748.6 | 6.06% |\n",
"| **TimeGPT** | **370.9**| **3.98%** |\n",
"\n",
"| ARIMA | 724.9 | 5.50% |\n",
"| LightGBM | 687.8 | 5.14% |\n",
"| N-HiTS | 605.0 | 5.34% |\n",
"| **TimeGPT** | **592.6**| **4.94%** |\n",
" \n",
"\n",
"#### Breakdown for Each Time-series\n",
"Followed below are the metrics for each individual time series groups. Our analysis shows that TimeGPT consistently outperforms the other models, achieving the best results for all but one group."
"Followed below are the metrics for each individual time series groups. TimeGPT consistently delivers accurate forecasts across all time series groups. In many cases, it performs as well as or better than data-specific models, showing its versatility and reliability across different datasets."
]
},
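> **Editor's note (not part of the commit):** the evaluation code that produces these tables is not shown in this hunk. A hedged sketch of how such per-series metrics can be computed with utilsforecast follows; the merged `evaluation_input` frame and its model column names are assumptions, not code from the commit.
>
> ```python
> from utilsforecast.evaluation import evaluate
> from utilsforecast.losses import rmse, smape
>
> # Hypothetical sketch: actuals joined with each model's forecasts, one column per model.
> # evaluation_input columns: unique_id, ds, y, TimeGPT, AutoARIMA, AutoLightGBM, NHITS
> evaluation_df = evaluate(
>     evaluation_input,
>     metrics=[rmse, smape],
> )
>
> # Aggregate across series to reproduce the overall comparison table.
> overall = evaluation_df.groupby('metric').mean(numeric_only=True)
> ```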
{
@@ -824,35 +808,35 @@
],
"source": [
"# | echo: false\n",
"# colors = [\n",
"# (\"#A9B9C3\", 0.5), # Grey-bluish color 1\n",
"# (\"#7A8D9D\", 0.5), # Grey-bluish color 2\n",
"# (\"#5B6D79\", 0.5), # Grey-bluish color 3\n",
"# ('#F95D6A', 0.75) # Green color for the last\n",
"# ]\n",
"colors = [\n",
" (\"#A9B9C3\", 0.5), # Grey-bluish color 1\n",
" (\"#7A8D9D\", 0.5), # Grey-bluish color 2\n",
" (\"#5B6D79\", 0.5), # Grey-bluish color 3\n",
" ('#F95D6A', 0.75) # Green color for the last\n",
"]\n",
"\n",
"\n",
"# # Filter evaluation data by metric and set unique_id as index\n",
"# rmse_df = evaluation_df[evaluation_df['metric'] == 'rmse'].set_index('unique_id')\n",
"# smape_df = evaluation_df[evaluation_df['metric'] == 'smape'].set_index('unique_id')\n",
"# Filter evaluation data by metric and set unique_id as index\n",
"rmse_df = evaluation_df[evaluation_df['metric'] == 'rmse'].set_index('unique_id')\n",
"smape_df = evaluation_df[evaluation_df['metric'] == 'smape'].set_index('unique_id')\n",
"\n",
"# # Plot function with custom colors and opacity\n",
"# def plot_metric(ax, df, title, ylabel):\n",
"# x = np.arange(len(df))\n",
"# bar_width = 0.2\n",
"# for i, (col, (color, alpha)) in enumerate(zip(df.columns[1:], colors)):\n",
"# ax.bar(x + i * bar_width, df[col], width=bar_width, label=col, color=color, alpha=alpha)\n",
"# ax.set(title=title, ylabel=ylabel, xticks=x + bar_width * (len(df.columns[1:]) - 1) / 2, xticklabels=df.index)\n",
"# ax.tick_params(axis='x', rotation=45)\n",
"# ax.legend()\n",
"# Plot function with custom colors and opacity\n",
"def plot_metric(ax, df, title, ylabel):\n",
" x = np.arange(len(df))\n",
" bar_width = 0.2\n",
" for i, (col, (color, alpha)) in enumerate(zip(df.columns[1:], colors)):\n",
" ax.bar(x + i * bar_width, df[col], width=bar_width, label=col, color=color, alpha=alpha)\n",
" ax.set(title=title, ylabel=ylabel, xticks=x + bar_width * (len(df.columns[1:]) - 1) / 2, xticklabels=df.index)\n",
" ax.tick_params(axis='x', rotation=45)\n",
" ax.legend()\n",
"\n",
"# # Generate side-by-side plots for RMSE and SMAPE\n",
"# fig, axes = plt.subplots(1, 2, figsize=(14, 6))\n",
"# plot_metric(axes[0], rmse_df, \"RMSE Comparison Across Models by Category\", \"RMSE\")\n",
"# plot_metric(axes[1], smape_df*100, \"%SMAPE Comparison Across Models by Category\", \"SMAPE\")\n",
"# Generate side-by-side plots for RMSE and SMAPE\n",
"fig, axes = plt.subplots(1, 2, figsize=(14, 6))\n",
"plot_metric(axes[0], rmse_df, \"RMSE Comparison Across Models by Category\", \"RMSE\")\n",
"plot_metric(axes[1], smape_df*100, \"%SMAPE Comparison Across Models by Category\", \"SMAPE\")\n",
"\n",
"# plt.tight_layout()\n",
"# plt.show()"
"plt.tight_layout()\n",
"plt.show()"
]
},
{
