
Migrating VMs from node to node not working #1256

Open
mmaced opened this issue May 3, 2024 · 9 comments
Labels
🐛 bug Something isn't working 🤷 can't reproduce

Comments

mmaced commented May 3, 2024

Hello,

I set up my Proxmox infrastructure with Terraform and the bpg/proxmox provider.

Now I want to migrate a VM from host01 to host02, but it's not working: instead of migrating, Terraform re-creates the VM and I lose my data.
I already tried migrate = true, but it still didn't work.

Is there a way to migrate a VM whenever I want using bpg/proxmox?

@mmaced mmaced added the 🐛 bug Something isn't working label May 3, 2024
bpg (Owner) commented May 3, 2024

Hi @mmacedo2000! 👋🏼

Migration support was added in #501, so in theory this should work; however, I haven't tested it in a while.

How did you try migrating the VM? By updating node_name and setting migrate = true in the VM resource? And is your VM a clone of another VM?
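
For reference, a minimal sketch of the change that should trigger a migration rather than a re-create (node names here are placeholders):

resource "proxmox_virtual_environment_vm" "example" {
  # ... rest of the VM configuration unchanged ...
  node_name = "pve1" # changed from "pve2"
  migrate   = true   # migrate on node change instead of destroy/re-create
}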

@bpg bpg added the ⌛ pending author's response Requested additional information from the reporter label May 4, 2024
mmaced (Author) commented May 6, 2024

> Hi @mmacedo2000! 👋🏼
>
> Migration support was added in #501, so in theory this should work; however, I haven't tested it in a while.
>
> How did you try migrating the VM? By updating node_name and setting migrate = true in the VM resource? And is your VM a clone of another VM?

Hello,

Yes, I am trying to migrate the VM by updating node_name and setting migrate = true.

The VM is created with proxmox_virtual_environment_vm, and I am also using cloud-init.

This is the VM creation configuration:

resource "proxmox_virtual_environment_vm" "vm" {
  name        = var.name
  description = var.description
  tags        = ["terraform"]
  #local.proxmox_tags
  node_name   = var.host_name
  migrate     = true
  cpu {
    cores = var.vm_cores
    numa  = true
    type  = "host"
  }

  memory {
    dedicated = var.vm_memory
  }

  agent {
    enabled = true
  }

  network_device {
    bridge = "vmbr0"
  }

  disk {
    datastore_id = var.disk_datasource_name
    file_id      = proxmox_virtual_environment_file.ubuntu_cloud_image.id
    interface    = "scsi0"
    size         = var.disk_size
  }

  serial_device {}

  operating_system {
    type = "l26"
  }

  initialization {
    datastore_id      = var.disk_datasource_name
    user_data_file_id = proxmox_virtual_environment_file.cloud_init.id
    ip_config {
      ipv4 {
        address = "dhcp"
      }
    }
  }
}

@bpg bpg removed the ⌛ pending author's response Requested additional information from the reporter label May 6, 2024
mmaced (Author) commented May 11, 2024

Hello,

Have you had a chance to look at this problem, @bpg?

Thanks.

bpg (Owner) commented May 12, 2024

Hi @mmaced, I can't reproduce this issue in my lab.

My template:

resource "proxmox_virtual_environment_vm" "ubuntu_vm" {
  name = "test"

  node_name = "pve2"
  vm_id     = 1000

  agent {
    enabled = true
  }

  cpu {
    cores    = 4
  }

  memory {
    dedicated = 4096
    #    hugepages = "any"
  }

  boot_order = ["virtio0", "scsi0"]

  disk {
    datastore_id = "local-lvm"
    file_id      = proxmox_virtual_environment_download_file.ubuntu_cloud_image.id
    interface    = "virtio0"
  }

  initialization {
    datastore_id = "local-lvm"
    ip_config {
      ipv4 {
        address = "dhcp"
      }
    }
    user_data_file_id = proxmox_virtual_environment_file.cloud_config.id
  }

  network_device {
    bridge = "vmbr0"
  }

}

resource "proxmox_virtual_environment_download_file" "ubuntu_cloud_image" {
  content_type        = "iso"
  datastore_id        = "local"
  node_name           = "pve2"
  url                 = "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
  overwrite_unmanaged = true
}
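
The proxmox_virtual_environment_file.cloud_config resource referenced above isn't shown in this comment; a plausible sketch, assuming a local snippets datastore on the same node (the datastore and file name match the apply log below; the cloud-config content itself is a placeholder):

resource "proxmox_virtual_environment_file" "cloud_config" {
  content_type = "snippets"
  datastore_id = "local"
  node_name    = "pve2"

  source_raw {
    # Hypothetical minimal cloud-config; it wires in the
    # data.local_file.ssh_public_key seen in the plan output below.
    data      = <<-EOF
      #cloud-config
      users:
        - name: ubuntu
          ssh_authorized_keys:
            - ${trimspace(data.local_file.ssh_public_key.content)}
    EOF
    file_name = "cloud-config.yaml"
  }
}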

After initial deployment, I changed

  node_name = "pve1"
  migrate   = true

in the TF template for the ubuntu_vm resource, and then ran apply again, which performed the migration:

data.local_file.ssh_public_key: Reading...
data.local_file.ssh_public_key: Read complete after 0s [id=84cdc3a1d1eaa8e821cd6dd4287f04ae996b8309]
proxmox_virtual_environment_download_file.ubuntu_cloud_image: Refreshing state... [id=local:iso/jammy-server-cloudimg-amd64.img]
proxmox_virtual_environment_file.cloud_config: Refreshing state... [id=local:snippets/cloud-config.yaml]
proxmox_virtual_environment_vm.ubuntu_vm: Refreshing state... [id=1000]

OpenTofu used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

OpenTofu will perform the following actions:

  # proxmox_virtual_environment_vm.ubuntu_vm will be updated in-place
  ~ resource "proxmox_virtual_environment_vm" "ubuntu_vm" {
        id                      = "1000"
      ~ migrate                 = false -> true
        name                    = "test"
      ~ node_name               = "pve2" -> "pve1"
        tags                    = []
        # (25 unchanged attributes hidden)

        # (6 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
proxmox_virtual_environment_vm.ubuntu_vm: Modifying... [id=1000]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 10s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 20s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 30s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 40s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 50s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 1m0s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 1m10s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 1m20s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Still modifying... [id=1000, 1m30s elapsed]
proxmox_virtual_environment_vm.ubuntu_vm: Modifications complete after 1m31s [id=1000]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
[Two screenshots of the Proxmox UI showing the migrated VM]

Could you post your terraform / tofu output from the plan or apply command when you're trying to migrate?

bpg (Owner) commented May 22, 2024

Hey @mmaced, could you post your TF output and ideally a debug log, so I can try to debug it?

mmaced (Author) commented Jun 10, 2024

Hello @bpg, sorry for the late response.

So, this is my VM configuration (initially created with migrate = true already; I don't know if that could be the problem).

##############################################
# Resource names and tags
##############################################
locals {
  proxmox_tags = {
    Service      = "proxmox",
    Availability = var.availability,
    App          = var.app,
    CreatedBy    = var.username,
    Type         = var.created_proccess,
    Pipeline     = var.pipeline_caller,
    PipelineID   = var.pipeline_id,
    CostCenter   = var.cost_center,
    ModifiedAt   = timestamp()
  }
}
##############################################
# Proxmox VM
##############################################
resource "proxmox_virtual_environment_vm" "vm" {
  name        = var.name
  description = var.description
  tags        = ["terraform"]
  #local.proxmox_tags
  node_name   = var.host_name
  migrate     = true
  protection  = true
  cpu {
    cores = var.vm_cores
    numa  = true
    type  = "host"
  }

  memory {
    dedicated = var.vm_memory
  }

  agent {
    enabled = true
  }

  network_device {
    bridge = "vmbr0"
  }

  disk {
    datastore_id = var.disk_datasource_name
    file_id      = "NAS:iso/jammy-server-cloudimg-amd64.img"
    interface    = "scsi0"
    size         = var.disk_size
  }

  serial_device {}

  operating_system {
    type = "l26"
  }

  initialization {
    datastore_id      = var.disk_datasource_name

    user_data_file_id = proxmox_virtual_environment_file.cloud_init.id
    ip_config {
      ipv4 {
        address = "dhcp"
      }
    }
  }
}

Now I will change the variable var.host_name to another host name and run Terraform.

[screenshot: terraform plan output]

As you can see, it destroys and recreates instead of migrating:

[screenshot: terraform apply output showing a resource being replaced]

bpg (Owner) commented Jun 10, 2024

Ah, that's a proxmox_virtual_environment_file resource being replaced, not the VM.

You'd probably need to put your cloud_init file on a shared datastore (Ceph, NFS, CIFS, etc.) to support this scenario.

Moving file resources between local datastores on different cluster nodes is not supported by the provider.
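
A minimal sketch of that setup, assuming a shared NFS datastore named NAS2 with the snippets content type enabled (datastore and node names are placeholders):

resource "proxmox_virtual_environment_file" "cloud_init" {
  content_type = "snippets"
  datastore_id = "NAS2"   # shared NFS datastore, visible from every node
  node_name    = "host01" # only determines which node performs the upload

  source_raw {
    data      = file("${path.module}/cloud-config.yaml")
    file_name = "cloud-config.yaml"
  }
}

With the snippet on shared storage, changing the VM's node_name should leave the file resource untouched in the plan.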

mmaced (Author) commented Jun 16, 2024

But this proxmox_virtual_environment_file is already set to be shared over NFS (NAS2).

I am also migrating the proxmox_virtual_environment_vm from one node_name to another, and it also tries to destroy and create instead of migrating.

resource "proxmox_virtual_environment_vm" "vm" {
name = var.name
description = var.description
tags = ["terraform"]
#local.proxmox_tags
node_name = var.host_name
migrate = true
protection = true

bpg (Owner) commented Jun 16, 2024

> But this proxmox_virtual_environment_file is already set to be shared over NFS (NAS2).

Great, then it doesn't need to be moved from node to node, as the file should be available under the same datastore name on all nodes.

> I am also migrating the proxmox_virtual_environment_vm from one node_name to another, and it also tries to destroy and create instead of migrating.

Could you share the terraform output of that attempt?
