diff --git a/Exercises/Session 3 - Optimization and Simulation in TT format (with solutions).ipynb b/Exercises/Session 3 - Optimization and Simulation in TT format (with solutions).ipynb new file mode 100644 index 0000000..e482003 --- /dev/null +++ b/Exercises/Session 3 - Optimization and Simulation in TT format (with solutions).ipynb @@ -0,0 +1,608 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "e4f44eb7", + "metadata": {}, + "source": [ + "$\\def\\tcoreleft{\\underset{\\tiny\\mid}{\\textcolor{MidnightBlue}{⦸}}}$\n", + "$\\def\\tcorecenter{\\underset{\\tiny\\mid}{\\textcolor{RedOrange}{⦿}}}$\n", + "$\\def\\tcoreright{\\underset{\\tiny\\mid}{\\textcolor{MidnightBlue}{\\oslash}}}$\n", + "

TMQS Workshop 2024 @ Zuse Institute Berlin

\n", + "

Summer School on Tensor Methods for Quantum Simulation

\n", + "

June 3 - 5, 2024

\n", + "

$\\tcoreleft - \\tcoreleft - \\tcoreleft - \\cdots - \\tcorecenter - \\cdots - \\tcoreright - \\tcoreright$

\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "58a30939-f570-45cd-a736-d6f21aeb2a0c", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "id": "6c3441f3-b5a9-42c4-ba21-2bc682b0d8ac", + "metadata": {}, + "source": [ + "## **Session 3 - Optimization and Simulation in TT format**" + ] + }, + { + "cell_type": "markdown", + "id": "6f6cc702", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "858a33fe-e16e-4dbb-b369-511d7c71fe71", + "metadata": {}, + "source": [ + "## Exercise 3.1\n", + "\n", + "Now, let's look at a well-known example for supervised learning problem which involves the classification of handwritten digits. One of the frequently used datasets in this context is the so-called MNIST dataset. It actually contains 60,000 training images and 10,000 test images with their corresponding classes (i.e., 0,...,9). To reduce computation time, we will examine a reduced dataset extracted from the MNIST dataset. This consists of images with 7x7 pixels, representing only the digits 0 and 1. We have 500 training images and 100 test images at our disposal.\n", + "\n", + "**a)**$\\quad$Load the dataset and take a look at some of the training images:\n", + "\n", + "> data = np.load('MNIST_full.npz')\n", + "\n", + "$\\hspace{0.8cm}$You can access the arrays ```x_train, y_train, x_test, y_test``` by, e.g., ```data['x_train']```.\n", + "\n", + "$\\hspace{0.8cm}$The arrays ```x_train``` and ```x_test``` have shape $49 \\times 500$ and $49 \\times 100$, respectively and contain the (flattened) images.\n", + "\n", + "$\\hspace{0.8cm}$The arrays ```y_train``` and ```y_test``` have shape $2 \\times 500$ and $2 \\times 100$, respectively and contain the corresponding classes (in one-hot encoding).\n", + "\n", + "**b)**$\\quad$For the construction of the transformed data tensor $\\mathbf{\\Theta}$, we choose the two basis functions $\\sin(\\alpha x)$ and $\\cos(\\alpha x)$ with $\\alpha=0.5 \\pi$.\n", + "\n", + "$\\hspace{0.8cm}$For this purpose, we use the functions from scikit_tt.data_driven.transform, i.e.,\n", + "\n", + "> import scikit_tt.data_driven.transform as tdt\n", + ">\n", + "> basis_list = []\n", + "> \n", + "> for i in range(order):\n", + "> \n", + ">    basis_list.append([tdt.Cos(i, alpha), tdt.Sin(i, alpha)])\n", + "\n", + "$\\hspace{0.8cm}$Note that ```order``` is simply the number of pixels.\n", + "\n", + "**c)**$\\quad$In the next step, we define the initial guess $\\mathbf{\\Xi}$ for the optimization problems\n", + "\n", + "$\\hspace{1.25cm}$$\\displaystyle \\min_{\\mathbf{\\Xi} \\in \\mathbb{T}} \\lVert \\mathbf{\\Xi}^\\top \\mathbf{\\Theta} - Y_i \\rVert_F$,\n", + "\n", + "$\\hspace{0.8cm}$where $Y_i$ denotes the $i$th row of ```y_train```. We specify that $\\mathbb{T}$ here consists only of tensor trains with a TT rank of 1, i.e.,\n", + "\n", + "> cores = [np.ones([1, 2, 1, 1]) for i in range(order)]\n", + ">\n", + "> initial_guess = TT(cores).ortho()\n", + "\n", + "**d)**$\\quad$Finally, we use the ARR routine from ```scikit_tt.data_driven.regression``` to optimize the tensors for the individual learning problems:\n", + "\n", + "> import scikit_tt.data_driven.regression as reg\n", + ">\n", + "> xi = reg.arr(x_train, y_train, basis_list, initial_guess, repeats=5, progress=False)\n", + "\n", + "**e)**$\\quad$To apply our coefficient tensors to the test data, we construct the corresponding transformed data tensor:\n", + "\n", + "> Theta = tdt.basis_decomposition(x_test, basis_list).transpose(cores=49)\n", + "\n", + "$\\hspace{0.8cm}$The corresponding (approximate) label vectors can then be computed by contracting the coefficient tensors with $\\mathbf{\\Theta}$. \n", + "\n", + "$\\hspace{0.8cm}$For example, the label vector for class 0 can be computed as follows:\n", + "\n", + "> xi_0 = TT(xi[0].cores + [np.ones([1,1,1,1])])\n", + ">\n", + "> y_0 = (xi_0.transpose()@Theta).matricize()\n", + "\n", + "$\\hspace{0.8cm}$Give these lines some thought!\n", + "\n", + "**e)**$\\quad$The row indices of the largest entries of $\\begin{pmatrix} - y_0 - \\\\ - y_1 - \\end{pmatrix}$ determine the detected labels. Compute the classification rate!" + ] + }, + { + "cell_type": "markdown", + "id": "bc6abad4-3b47-447c-9e87-86b44381164b", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "id": "185135f7-a5b7-4db9-9192-d8ff9a0a532f", + "metadata": {}, + "source": [ + "$\\textcolor{red}{\\textbf{SOLUTION:}}$" + ] + }, + { + "cell_type": "markdown", + "id": "d3a9188e-d384-4fc5-8022-34050c91807b", + "metadata": {}, + "source": [ + "**a)**" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "1a02a147-592d-4c28-981e-5859967afa0c", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZgAAAGdCAYAAAAv9mXmAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAWx0lEQVR4nO3dbWxUBb7H8d/QoYNiOwpaaG+H2lUiD6XItqy24IqCze1VgnF1dS9i19190U15sjFxq7nRfWL0xW7UoM2W3ctKNliy2eVho4DdrC16sUqrjb1oEBaSjkK3F+LOlCZ3gHLuC6+TrQhypuffw8x+P8lJnMmZnN9E4tfTKW3AcRxHAAB4bJzfAwAA2YnAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAE8GxvuDZs2d19OhR5eXlKRAIjPXlAQCj4DiOBgcHVVRUpHHjLnyPMuaBOXr0qCKRyFhfFgDgoVgspuLi4gueM+aBycvLkyQt1L8pqPFjfXkAwCic0Wm9qVdT/y2/kDEPzOdfFgtqvIIBAgMAGeX/f3rlxXzEwYf8AAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEykFZgXX3xRpaWlmjBhgioqKvTGG294vQsAkOFcB2bLli1au3atnnjiCb333nu65ZZbVFtbq76+Pot9AIAM5Towv/zlL/X9739fP/jBDzRz5kw9++yzikQiam5uttgHAMhQrgJz6tQpdXd3q6amZsTzNTU12rt375e+JplMKpFIjDgAANnPVWCOHz+u4eFhTZkyZcTzU6ZMUX9//5e+JhqNKhwOp45IJJL+WgBAxkjrQ/5AIDDiseM45zz3uaamJsXj8dQRi8XSuSQAIMME3Zx89dVXKycn55y7lYGBgXPuaj4XCoUUCoXSXwgAyEiu7mByc3NVUVGhtra2Ec+3tbWpurra02EAgMzm6g5GkhobG7VixQpVVlaqqqpKLS0t6uvrU319vcU+AECGch2Y+++/XydOnNBPfvITHTt2TGVlZXr11VdVUlJisQ8AkKECjuM4Y3nBRCKhcDisRVqmYGD8WF4aADBKZ5zTatd2xeNx5efnX/BcfhYZAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABNBvwfg0hUsnOr3BE/1t1z494dnov+Y8arfEzz3xH8+5PcEzxVH9/o9wRfcwQAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJhwHZg9e/Zo6dKlKioqUiAQ0LZt2wxmAQAynevADA0Nae7cuVq/fr3FHgBAlgi6fUFtba1qa2sttgAAsojrwLiVTCaVTCZTjxOJhPUlAQCXAPMP+aPRqMLhcOqIRCLWlwQAXALMA9PU1KR4PJ46YrGY9SUBAJcA8y+RhUIhhUIh68sAAC4x/D0YAIAJ13cwJ0+e1KFDh1KPjxw5op6eHk2aNEnTpk3zdBwAIHO5DkxXV5duu+221OPGxkZJUl1dnX772996NgwAkNlcB2bRokVyHMdiCwAgi/AZDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATQb8HZIvg1Cl+T/DcK927/J7gqWdOTPd7guf+azD73tN1/3rY7wmeS0b9XuAP7mAAACYIDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMuApMNBrV/PnzlZeXp4KCAt199906cOCA1TYAQAZzFZiOjg41NDSos7NTbW1tOnPmjGpqajQ0NGS1DwCQoYJuTt61a9eIxxs3blRBQYG6u7v1zW9+09NhAIDM5iowXxSPxyVJkyZNOu85yWRSyWQy9TiRSIzmkgCADJH2h/yO46ixsVELFy5UWVnZec+LRqMKh8OpIxKJpHtJAEAGSTswK1eu1Pvvv6+XX375guc1NTUpHo+njlgslu4lAQAZJK0vka1atUo7duzQnj17VFxcfMFzQ6GQQqFQWuMAAJnLVWAcx9GqVau0detWtbe3q7S01GoXACDDuQpMQ0ODNm/erO3btysvL0/9/f2SpHA4rMsuu8xkIAAgM7n6DKa5uVnxeFyLFi1SYWFh6tiyZYvVPgBAhnL9JTIAAC4GP4sMAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMuPqVyTi/Dx+/1u8Jnpve/l2/J3jqa//e4/cEz3363Zv9nuC5qQ8f8XsCPMIdDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAlXgWlublZ5ebny8/OVn5+vqqoq7dy502obACCDuQpMcXGxnn76aXV1damrq0u33367li1bpv3791vtAwBkqKCbk5cuXTri8c9//nM1Nzers7NTs2fP9nQYACCzuQrMPxoeHtbvf/97DQ0Nqaqq6rznJZNJJZPJ1ONEIpHuJQEAGcT1h/y9vb264oorFAqFVF9fr61bt2rWrFnnPT8ajSocDqeOSCQyqsEAgMzgOjA33HCDenp61NnZqR/+8Ieqq6vTBx98cN7zm5qaFI/HU0csFhvVYABAZnD9JbLc3Fxdf/31kqTKykrt27dPzz33nH71q1996fmhUEihUGh0KwEAGWfUfw/GcZwRn7EAACC5vIN5/PHHVVtbq0gkosHBQbW2tqq9vV27du2y2gcAyFCuAvO3v/1NK1as0LFjxxQOh1VeXq5du3bpjjvusNoHAMhQrgLzm9/8xmoHACDL8LPIAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJgI+j0gW9xx8/t+T/DckUdu8HsCvkLyqoDfEzx3oL/A7wmeu1b9fk/wBXcwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJkYVmGg0qkAgoLVr13o0BwCQLdIOzL59+9TS0qLy8nIv9wAAskRagTl58qSWL1+uDRs26KqrrvJ6EwAgC6QVmIaGBt15551asmTJV56bTCaVSCRGHACA7Bd0+4LW1la9++672rdv30WdH41G9eMf/9j1MABAZnN1BxOLxbRmzRr97ne/04QJEy7qNU1NTYrH46kjFoulNRQAkFlc3cF0d3drYGBAFRUVqeeGh4e1Z88erV+/XslkUjk5OSNeEwqFFAqFvFkLAMgYrgKzePFi9fb2jnju4Ycf1owZM/TYY4+dExcAwD8vV4HJy8tTWVnZiOcmTpyoyZMnn/M8AOCfG3+THwBgwvV3kX1Re3u7BzMAANmGOxgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJoJ+D8gW7xwr8XuC5259/r/9nuCprv/5mt8TPHdPYbvfEzz39r0z/J7guWG/B/iEOxgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAEwQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATrgLz1FNPKRAIjDimTp1qtQ0AkMGCbl8we/Zs/fnPf049zsnJ8XQQACA7uA5MMBjkrgUA8JVcfwZz8OBBFRUVqbS0VA888IAOHz58wfOTyaQSicSIAwCQ/VwF5qabbtKmTZu0e/dubdiwQf39/aqurtaJEyfO+5poNKpwOJw6IpHIqEcDAC59rgJTW1urb33rW5ozZ46WLFmiV155RZL00ksvnfc1TU1NisfjqSMWi41uMQAgI7j+DOYfTZw4UXPmzNHBgwfPe04oFFIoFBrNZQAAGWhUfw8mmUzqww8/VGFhoVd7AABZwlVgHn30UXV0dOjIkSN6++23de+99yqRSKiurs5qHwAgQ7n6EtnHH3+s73znOzp+/LiuueYa3Xzzzers7FRJSYnVPgBAhnIVmNbWVqsdAIAsw88iAwCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJggMAAAEwQGAGAi6PeAbFHUMOj3BM913lrp9wRP5Zxy/J7gube2fez3BM85pw/7PQEe4Q4GAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADAhOvAfPLJJ3rwwQc1efJkXX755brxxhvV3d1tsQ0AkMGCbk7+9NNPtWDBAt12223auXOnCgoK9Ne//lVXXnml0TwAQKZyFZhnnnlGkUhEGzduTD137bXXer0JAJAFXH2JbMeOHaqsrNR9992ngoICzZs3Txs2bLjga5LJpBKJxIgDAJD9XAXm8OHDam5u1vTp07V7927V19dr9erV2rRp03lfE41GFQ6HU0ckEhn1aADApS/gOI5zsSfn5uaqsrJSe/fuTT23evVq7du3T2+99daXviaZTCqZTKYeJxIJRSIRLdIyBQPjRzH90hIs/he/J3juxK3Z9T8DOacu+o96xsjb9p7fEzznnD7l9wRcwBnntNq1XfF4XPn5+Rc819UdTGFhoWbNmjXiuZkzZ6qvr++8rwmFQsrPzx9xAACyn6vALFiwQAcOHBjx3EcffaSSkhJPRwEAMp+rwDzyyCPq7OzUunXrdOjQIW3evFktLS1qaGiw2gcAyFCuAjN//nxt3bpVL7/8ssrKyvTTn/5Uzz77rJYvX261DwCQoVz9PRhJuuuuu3TXXXdZbAEAZBF+FhkAwASBAQCYIDAAABMEBgBggsAAAEwQGACACQIDADBBYAAAJggMAMAEgQEAmCAwAAATBAYAYILAAABMEBgAgAkCAwAwQWAAACYIDADABIEBAJhw/SuTR8txHEnSGZ2WnLG+uqGzSb8XeG741P/6PcFTzuls+gP3mTPOab8neM7JwveUTc7os38/n/+3/EICzsWc5aGPP/5YkUhkLC8JAPBYLBZTcXHxBc8Z88CcPXtWR48eVV5engKBgNl1EomEIpGIYrGY8vPzza4zlnhPl75sez8S7ylTjNV7chxHg4ODKioq0rhxF/6UZcy/RDZu3LivrJ6X8vPzs+YP0Od4T5e+bHs/Eu8pU4zFewqHwxd1Hh/yAwBMEBgAgImsDUwoFNKTTz6pUCjk9xTP8J4ufdn2fiTeU6a4FN/TmH/IDwD455C1dzAAAH8RGACACQIDADBBYAAAJrIyMC+++KJKS0s1YcIEVVRU6I033vB70qjs2bNHS5cuVVFRkQKBgLZt2+b3pFGJRqOaP3++8vLyVFBQoLvvvlsHDhzwe9aoNDc3q7y8PPWX3KqqqrRz506/Z3kmGo0qEAho7dq1fk8ZlaeeekqBQGDEMXXqVL9njconn3yiBx98UJMnT9bll1+uG2+8Ud3d3X7PkpSFgdmyZYvWrl2rJ554Qu+9955uueUW1dbWqq+vz+9paRsaGtLcuXO1fv16v6d4oqOjQw0NDers7FRbW5vOnDmjmpoaDQ0N+T0tbcXFxXr66afV1dWlrq4u3X777Vq2bJn279/v97RR27dvn1paWlReXu73FE/Mnj1bx44dSx29vb1+T0rbp59+qgULFmj8+PHauXOnPvjgA/3iF7/QlVde6fe0zzhZ5hvf+IZTX18/4rkZM2Y4P/rRj3xa5C1JztatW/2e4amBgQFHktPR0eH3FE9dddVVzq9//Wu/Z4zK4OCgM336dKetrc259dZbnTVr1vg9aVSefPJJZ+7cuX7P8Mxjjz3mLFy40O8Z55VVdzCnTp1Sd3e3ampqRjxfU1OjvXv3+rQKXyUej0uSJk2a5PMSbwwPD6u1tVVDQ0Oqqqrye86oNDQ06M4779SSJUv8nuKZgwcPqqioSKWlpXrggQd0+PBhvyelbceOHaqsrNR9992ngoICzZs3Txs2bPB7VkpWBeb48eMaHh7WlClTRjw/ZcoU9ff3+7QKF+I4jhobG7Vw4UKVlZX5PWdUent7dcUVVygUCqm+vl5bt27VrFmz/J6VttbWVr377ruKRqN+T/HMTTfdpE2bNmn37t3asGGD+vv7VV1drRMnTvg9LS2HDx9Wc3Ozpk+frt27d6u+vl6rV6/Wpk2b/J4myYefpjwWvvhrABzHMf3VAEjfypUr9f777+vNN9/0e8qo3XDDDerp6dHf//53/eEPf1BdXZ06OjoyMjKxWExr1qzRa6+9pgkTJvg9xzO1tbWpf54zZ46qqqp03XXX6aWXXlJjY6OPy9Jz9uxZVVZWat26dZKkefPmaf/+/WpubtZDDz3k87osu4O5+uqrlZOTc87dysDAwDl3NfDfqlWrtGPHDr3++utj+iscrOTm5ur6669XZWWlotGo5s6dq+eee87vWWnp7u7WwMCAKioqFAwGFQwG1dHRoeeff17BYFDDw8N+T/TExIkTNWfOHB08eNDvKWkpLCw8539gZs6cecl8U1NWBSY3N1cVFRVqa2sb8XxbW5uqq6t9WoUvchxHK1eu1B//+Ef95S9/UWlpqd+TTDiOo2QyM3+V9uLFi9Xb26uenp7UUVlZqeXLl6unp0c5OTl+T/REMpnUhx9+qMLCQr+npGXBggXnfIv/Rx99pJKSEp8WjZR1XyJrbGzUihUrVFlZqaqqKrW0tKivr0/19fV+T0vbyZMndejQodTjI0eOqKenR5MmTdK0adN8XJaehoYGbd68Wdu3b1deXl7qjjMcDuuyyy7zeV16Hn/8cdXW1ioSiWhwcFCtra1qb2/Xrl27/J6Wlry8vHM+E5s4caImT56c0Z+VPfroo1q6dKmmTZumgYEB/exnP1MikVBdXZ3f09LyyCOPqLq6WuvWrdO3v/1tvfPOO2ppaVFLS4vf0z7j7zex2XjhhReckpISJzc31/n617+e8d/++vrrrzuSzjnq6ur8npaWL3svkpyNGzf6PS1t3/ve91J/5q655hpn8eLFzmuvveb3LE9lw7cp33///U5hYaEzfvx4p6ioyLnnnnuc/fv3+z1rVP70pz85ZWVlTigUcmbMmOG0tLT4PSmFH9cPADCRVZ/BAAAuHQQGAGCCwAAATBAYAIAJAgMAMEFgAAAmCAwAwASBAQCYIDAAABMEBgBggsAAAEwQGACAif8D7ds0XBw+UqIAAAAASUVORK5CYII=", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "\n", + "data = np.load('MNIST_reduced.npz')\n", + "x_train = data['x_train']\n", + "y_train = data['y_train']\n", + "x_test = data['x_test']\n", + "y_test = data['y_test']\n", + "\n", + "plt.imshow(x_train[:,11].reshape([7,7]))\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "76766d73-a577-4e86-8ae6-8f8cd1c0a737", + "metadata": {}, + "source": [ + "**b)**" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "a97833e1-9bb5-48ed-a1e6-e5a32a169e14", + "metadata": {}, + "outputs": [], + "source": [ + "import scikit_tt.data_driven.transform as tdt\n", + "\n", + "order = 49\n", + "alpha = 0.5 * np.pi\n", + "basis_list = []\n", + "for i in range(order):\n", + " basis_list.append([tdt.Cos(i, alpha), tdt.Sin(i, alpha)])" + ] + }, + { + "cell_type": "markdown", + "id": "b3ff0acc-80e0-456e-ab39-348964498f31", + "metadata": {}, + "source": [ + "**c)**" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "0dd84cb1-3ad7-47c0-8cc9-fde62c75d8a0", + "metadata": {}, + "outputs": [], + "source": [ + "from scikit_tt.tensor_train import TT\n", + "\n", + "cores = [np.ones([1, 2, 1, 1]) for i in range(order)]\n", + "initial_guess = TT(cores).ortho()" + ] + }, + { + "cell_type": "markdown", + "id": "f3d10764-ddfa-41d7-8e9e-b464052c7be6", + "metadata": {}, + "source": [ + "**d)**" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "b77d22be-6549-44ce-b4e5-1b9cb2dea27f", + "metadata": {}, + "outputs": [], + "source": [ + "import scikit_tt.data_driven.regression as reg\n", + "\n", + "xi = reg.arr(x_train, y_train, basis_list, initial_guess, repeats=5, progress=False)" + ] + }, + { + "cell_type": "markdown", + "id": "0a63ef0d-8c1f-4113-b69e-985fd2175fb8", + "metadata": {}, + "source": [ + "**e)**" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "ef8e566d-367c-41ac-bc47-e05eb0b7fe7c", + "metadata": {}, + "outputs": [], + "source": [ + "Theta = tdt.basis_decomposition(x_test, basis_list).transpose(cores=49)\n", + "xi_0 = TT(xi[0].cores + [np.ones([1,1,1,1])])\n", + "xi_1 = TT(xi[1].cores + [np.ones([1,1,1,1])])\n", + "y_0 = (xi_0.transpose()@Theta).matricize()\n", + "y_1 = (xi_1.transpose()@Theta).matricize()" + ] + }, + { + "cell_type": "markdown", + "id": "28ebc97a-6721-4fb3-95b2-57e1e9f8c266", + "metadata": {}, + "source": [ + "**f)**" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "235e000f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "classification rate: 99.0\n" + ] + } + ], + "source": [ + "inds = np.argmax(np.vstack([y_0, y_1]), axis=0)\n", + "sol = np.zeros([2,100])\n", + "sol[inds, np.arange(0,100)] = 1\n", + "classification_rate = 100 - np.sum(np.abs(sol - y_test)) / 2\n", + "print('classification rate:', classification_rate)" + ] + }, + { + "cell_type": "markdown", + "id": "57638421-d883-4f7f-ab0f-aae0505cffca", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "c7140595-9875-4201-ad30-e186a6722ff8", + "metadata": {}, + "source": [ + "## Exercise 3.2 \n", + "\n", + "To conclude, let's take a look at a modified version of an established model for the CO oxidation at RuO2(110), which is a popular fruit-fly system in the theoretical study of heterogeneous catalysis (i.e., a type of catalysis in which the phase of the catalyst differs from that of the reactants). In multiscale models of heterogeneous catalysis, one crucial point is the solution of a Markovian master equation describing the stochastic reaction kinetics. This usually is too high-dimensional to be solved with standard numerical techniques\n", + "and one has to rely on sampling approaches based on the kinetic Monte Carlo method. Here, we want to break the curse of dimensionality for the direct solution of the Markovian master equation by exploiting the TT Format.\n", + "\n", + "The state space of the model is $3^d$, where $d$ is the number of reaction sites in our model (coupled in a nearest-neighbor fashion). That is, each site can be in three different states: $0 = \\text{empty}$, $1 = \\text{O-covered}$, $2 = \\text{CO-covered}$.\n", + "\n", + "There are seven elementary reactions:\n", + "\n", + "- **O$_2$ adsorption:** if two neighboring sites are empty, they can adsorb a O$_2$ molecule such that both sites are O-covered afterwards\n", + "- **CO adsorption:** if one site is empty, it can adsorb a CO molecule such that the site is CO-covered afterwards\n", + "- **O$_2$ desorption:** reversed process of the O$_2$ adsorption\n", + "- **CO desorption:** reversed process of the CO adsorption\n", + "- **CO$_2$ desorption:** if two neighboring sites are covered by CO and O, respectively, a (gaseous) CO$_2$ molecule can be formed\n", + "- **O diffusion:** an adsorbed O atom can move from one site to a neighboring one\n", + "- **CO diffusion:** an adsorbed O atom can move from one site to a neighboring one\n", + "\n", + "The corresponding reaction rate constants are \n", + "\n", + "- k_ad_o2=1\n", + "- k_ad_co=0.01\n", + "- k_de_o2=0.001 \n", + "- k_de_co=1\n", + "- k_de_co2=1\n", + "- k_diff_o=0.0001\n", + "- k_diff_co=0.0001\n", + "\n", + "**a)**$\\quad$First, we build the infinitesimal generator of our model. \n", + "\n", + "$\\hspace{0.8cm}$An entry (transition rate) of this tensor is equal to the reaction rate constant of the corresponding reaction (if there exists one). \n", + "\n", + "$\\hspace{0.8cm}$For this, we use the automatic construction of SLIM decompositions for nearest-neighbor interaction systems provided by Scikit-TT. \n", + "\n", + "$\\hspace{0.8cm}$In order to do this, you have to define two lists (of lists): ```single_cell_reactions``` and ```two_cell_reactions```.\n", + "\n", + "$\\hspace{0.8cm}$The list ```single_cell_reactions```is given by \n", + "\n", + "> single_cell_reactions = [[0, 2, k_ad_co], [2, 0, k_de_co]]\n", + "\n", + "$\\hspace{0.8cm}$The first number in each list represents the state of a site before the reaction, the second number is the state after the reaction.\n", + "\n", + "$\\hspace{0.8cm}$Define the list ```two_cell_reactions```! Here each entry is a list of the form\n", + "\n", + "> [state of site 1 before, state of site 1 afterwards, state of site 2 before, state of site 2 afterwards, reaction rate constant]```\n", + "\n", + "$\\hspace{0.8cm}$Then you can automatically construct the operator by\n", + "\n", + "> import scikit_tt.slim as slim\n", + "> \n", + "> operator = slim.slim_mme_hom([3] * d, single_cell_reactions, two_cell_reactions, cyclic=True)\n", + "\n", + "$\\hspace{0.8cm}$where ```d``` denotes the order of the system, i.e., the number of sites.\n", + "\n", + "$\\hspace{0.8cm}$The last argument ensures that we have a cyclic TT operator in order to mimic an infinite surface. Therefore, the last site is reaction-wise connected to the first one.\n", + "\n", + "**b)**$\\quad$We want to simulate the corresponding master equation up to time $T=1000$ until the system relaxed to a steady state.\n", + "\n", + "$\\hspace{0.8cm}$Define the initial value:\n", + "\n", + "> initial_value = tt.unit(operator.row_dims, [1] * operator.order)\n", + "\n", + "$\\hspace{0.8cm}$To simulate the system, we use the implicit Euler method, which is available in ```scikit_tt.solvers.ode```:\n", + "\n", + "> solution = ode.implicit_euler(operator, initial_value, initial_guess, step_sizes=[1]*T, repeats=2, progress=False)\n", + "\n", + "$\\hspace{0.8cm}$Here, the TT instance ```initial_guess``` defines the initial guess for each time step. You can use:\n", + "\n", + "> initial_guess = tt.ones(operator.row_dims, [1] * operator.order, ranks=r).ortho()\n", + "\n", + "$\\hspace{0.8cm}$where ```r``` is the rank of the TT representation.\n", + "\n", + "**c)**$\\quad$Run the simulations for $d=20$ (How many states do we consider here?) and $r=1,3,5$ and compute the turn-over frequencies (TOFs) for different reactions. \n", + "\n", + "$\\hspace{0.8cm}$A TOF measure how often a reaction is executed per unit time and unit surface.\n", + "\n", + "$\\hspace{0.8cm}$Compute the TOFs for the following reactions: CO adsorption, CO desorption, CO$_2$ desorption, O$_2$ adsorption, O$_2$ desorption.\n", + "\n", + "$\\hspace{0.8cm}$You can use the follwing routines:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "14d5b917-9670-448c-a794-c30955ba08d0", + "metadata": {}, + "outputs": [], + "source": [ + "def single_cell_tof(t, reactant_state, reaction_rate):\n", + " tt_tmp = [None] * t.order\n", + " for k in range(t.order):\n", + " tt_tmp[k] = tt.ones([1] * t.order, t.row_dims)\n", + " tt_tmp[k].cores[k] = np.zeros([1, 1, t.row_dims[k], 1])\n", + " tt_tmp[k].cores[k][0, 0, reactant_state, 0] = 1\n", + " turn_over_frequency = 0\n", + " for k in range(t.order):\n", + " turn_over_frequency += (reaction_rate / t.order) * (tt_tmp[k] @ t)\n", + " return turn_over_frequency\n", + "\n", + "\n", + "def two_cell_tof(t, reactant_states, reaction_rate):\n", + " tt_left = [None] * t.order\n", + " tt_right = [None] * t.order\n", + " for k in range(t.order):\n", + " tt_left[k] = tt.ones([1] * t.order, t.row_dims)\n", + " tt_right[k] = tt.ones([1] * t.order, t.row_dims)\n", + " tt_left[k].cores[k] = np.zeros([1, 1, t.row_dims[k], 1])\n", + " tt_left[k].cores[k][0, 0, reactant_states[0], 0] = 1\n", + " tt_right[k].cores[k] = np.zeros([1, 1, t.row_dims[k], 1])\n", + " tt_right[k].cores[k][0, 0, reactant_states[0], 0] = 1\n", + " if k > 0:\n", + " tt_left[k].cores[k - 1] = np.zeros([1, 1, t.row_dims[k - 1], 1])\n", + " tt_left[k].cores[k - 1][0, 0, reactant_states[1], 0] = 1\n", + " else:\n", + " tt_left[k].cores[-1] = np.zeros([1, 1, t.row_dims[-1], 1])\n", + " tt_left[k].cores[-1][0, 0, reactant_states[1], 0] = 1\n", + " if k < t.order - 1:\n", + " tt_right[k].cores[k + 1] = np.zeros([1, 1, t.row_dims[k + 1], 1])\n", + " tt_right[k].cores[k + 1][0, 0, reactant_states[1], 0] = 1\n", + " else:\n", + " tt_right[k].cores[0] = np.zeros([1, 1, t.row_dims[0], 1])\n", + " tt_right[k].cores[0][0, 0, reactant_states[1], 0] = 1\n", + " turn_over_frequency = 0\n", + " for k in range(t.order):\n", + " turn_over_frequency += (reaction_rate / t.order) * (tt_left[k] @ t) + (reaction_rate / t.order) * (tt_right[k] @ t)\n", + " if reactant_states[0]==reactant_states[1]:\n", + " turn_over_frequency *= 0.5\n", + " return turn_over_frequency" + ] + }, + { + "cell_type": "markdown", + "id": "3d344fde-867d-4256-bf37-2a014fe48de3", + "metadata": {}, + "source": [ + "$\\hspace{0.8cm}$Compare your results which the follwing results obtained by extensive kMC simulations:\n", + "\n", + "$\\hspace{1.4cm}$$\\text{TOF}_{\\text{CO adsorption}} = 3.7335184883e-05$\n", + "\n", + "$\\hspace{1.4cm}$$\\text{TOF}_{\\text{CO desorption}} = 1.5825615091e-05$\n", + "\n", + "$\\hspace{1.4cm}$$\\text{TOF}_{\\text{CO}_2~\\text{desorption}} = 2.1509901799e-05$\n", + "\n", + "$\\hspace{1.4cm}$$\\text{TOF}_{\\text{O}_2~\\text{adsorption}} = 1.0042688156e-03$\n", + "\n", + "$\\hspace{1.4cm}$$\\text{TOF}_{\\text{O}_2~\\text{desorption}} = 9.9351569748e-04])$\n" + ] + }, + { + "cell_type": "markdown", + "id": "83e132fb-95b5-43a8-98ca-0ed7c397767d", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "id": "1cb59312-1033-4dd0-886c-7a034eae661b", + "metadata": {}, + "source": [ + "$\\textcolor{red}{\\textbf{SOLUTION:}}$" + ] + }, + { + "cell_type": "markdown", + "id": "04858a85-b2af-4dfc-a867-56477780d7dc", + "metadata": {}, + "source": [ + "**a)**" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "95c7629b-afe0-4f7a-9223-60e890fbe0e9", + "metadata": {}, + "outputs": [], + "source": [ + "import scikit_tt.slim as slim\n", + "\n", + "def co_oxidation(order: int, cyclic: bool = True):\n", + "\n", + " # reaction rate constants\n", + " k_ad_co=0.01\n", + " k_ad_o2=1\n", + " k_de_co=1\n", + " k_de_o2=0.001\n", + " k_diff_co=0.0001\n", + " k_diff_o=0.0001\n", + " k_de_co2=1,\n", + "\n", + " # define state space\n", + " state_space = [3] * order\n", + " \n", + " # define operator using automatic construction of SLIM decomposition\n", + " single_cell_reactions = [[0, 2, k_ad_co], [2, 0, k_de_co]]\n", + " two_cell_reactions = [[0, 1, 0, 1, k_ad_o2], [1, 0, 1, 0, k_de_o2], [2, 0, 1, 0, k_de_co2],\n", + " [1, 0, 2, 0, k_de_co2], [1, 0, 0, 1, k_diff_o], [0, 1, 1, 0, k_diff_o],\n", + " [0, 2, 2, 0, k_diff_co], [2, 0, 0, 2, k_diff_co]]\n", + " # define operator\n", + " operator = slim.slim_mme_hom(state_space, single_cell_reactions, two_cell_reactions, cyclic=cyclic)\n", + " \n", + " return operator" + ] + }, + { + "cell_type": "markdown", + "id": "9fa39d73-1da2-4936-bd47-1421d66d2e19", + "metadata": {}, + "source": [ + "**b)**" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "id": "c389bb4e-b192-4c81-8b61-d3a6071836e1", + "metadata": {}, + "outputs": [], + "source": [ + "import scikit_tt.tensor_train as tt\n", + "\n", + "initial_value = tt.unit(operator.row_dims, [1] * operator.order)" + ] + }, + { + "cell_type": "markdown", + "id": "f9cfdea9-86a7-44dc-931d-a04113ac4037", + "metadata": {}, + "source": [ + "**c)**" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "id": "c9c6030d-ac33-439f-921d-a416d5a1968b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "rank: 1\n", + "closeness: 0.02049389927055082\n", + "CO adsorption: 0.9999985033320705\n", + "CO desorption: 0.9999988212074029\n", + "CO2 desorption: 0.9999982654390752\n", + "O2 adsorption: 0.9999999999998999\n", + "O2 desorption: 0.0065266118325683475\n", + " \n", + "rank: 3\n", + "closeness: 2.854591698491477e-05\n", + "CO adsorption: 0.03535894872072331\n", + "CO desorption: 0.025410651666284444\n", + "CO2 desorption: 0.03523120468111328\n", + "O2 adsorption: 0.002271096173509499\n", + "O2 desorption: 0.00026421613477488315\n", + " \n", + "rank: 5\n", + "closeness: 1.993406525169611e-05\n", + "CO adsorption: 0.036540695417844236\n", + "CO desorption: 0.037659732726426226\n", + "CO2 desorption: 0.050786659054062204\n", + "O2 adsorption: 0.0021937656676205898\n", + "O2 desorption: 0.0002735168507174837\n", + " \n" + ] + } + ], + "source": [ + "import scikit_tt.solvers.ode as ode\n", + "\n", + "tofs = np.array([3.7335184883e-05, 1.5825615091e-05, 2.1509901799e-05, 1.0042688156e-03, 9.9351569748e-04])\n", + "R = [1, 3, 5]\n", + "operator = co_oxidation(order=20, cyclic=True).ortho()\n", + "\n", + "for i in range(3):\n", + " r = R[i]\n", + " initial_guess = tt.ones(operator.row_dims, [1] * operator.order, ranks=r).ortho()\n", + " solution = ode.implicit_euler(operator, initial_value, initial_guess, step_sizes=[1]*1000, repeats=2, progress=False)\n", + " print('rank:', r)\n", + " print('closeness: ', (operator@solution[-1]).norm())\n", + " print('CO adsorption:', np.abs(single_cell_tof(solution[-1], 0, 0.01) - tofs[0])/tofs[0]) # CO adsorption\n", + " print('CO desorption:', np.abs(single_cell_tof(solution[-1], 2, 1) - tofs[1])/tofs[1]) # CO desorption\n", + " print('CO2 desorption:', np.abs(two_cell_tof(solution[-1], [2,1], 1) - tofs[2])/tofs[2]) # CO2 desorption\n", + " print('O2 adsorption:', np.abs(two_cell_tof(solution[-1], [0,0], 1) - tofs[3])/tofs[3]) # O2 adsorption\n", + " print('O2 desorption:', np.abs(two_cell_tof(solution[-1], [1,1], 0.001) - tofs[4])/tofs[4]) # O2 desorption\n", + " print(' ')" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}