Option for using qemu internal iscsi driver #171

fwiesel · 2023-04-25T10:30:11Z

Qemu has a user-space implementation of iscsi,
which can be used instead of the kernel-level one.

By relying only on user-space tooling, we need
fewer privileges and can run more easily in containerised environments.

With our VASA provider we cannot create disks with the default minimum of 1MB, as no LUNs below 4MB in size can be created. The need for creating such small volumes comes from the fact that finding the actual size of stream-optimized images is left to the vCenter, which grows the volume dynamically.

When creating a volume from an image, the size was set to 0 to let the vCenter figure out the uncompressed size. But with our VASA provider and thin provisioning we end up with empty volumes, presumably because the volume doesn't auto-grow properly. Since we know in advance the size the volume should have in the end, and we're not re-using the created VM as a template for others, we can create the volume initially with the user-requested size and don't have to resize it afterwards.

Until now, when creating a snapshot from an attached volume, only the additional unwanted disks got removed and the NICs/VIFs were kept. With NSX-T this is a problem, as it does not allow duplicate MAC addresses on the same logical switch. Therefore, we now remove all VIFs - or rather every device that has a "macAddress" attribute - in the clone process.
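The rule "every device that has a macAddress attribute" can be sketched as a plain attribute check. This is only an illustration: `vif_devices` is a hypothetical helper, and the device objects are stand-ins built with `SimpleNamespace`, not real vSphere device objects.

```python
from types import SimpleNamespace


def vif_devices(devices):
    """Return the devices that expose a ``macAddress`` attribute."""
    return [device for device in devices if hasattr(device, "macAddress")]


# Stand-in devices (hypothetical values): one disk and one NIC.
disk = SimpleNamespace(key=2000, capacityInKB=1024)
nic = SimpleNamespace(key=4000, macAddress="00:50:56:aa:bb:cc")

# Only the NIC is selected for removal from the clone.
to_remove = vif_devices([disk, nic])
```

Checking for the attribute instead of a list of device classes means any new NIC type is covered without further changes.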

* Adding-support-for-online-resize-cinder-volume
* pep8 fixes
* Added vmware_online_resize config option to disable online-resize

This filters out backends based on the shard defined in the backend's extra_capabilities and in the project's tags.

This addresses restoring VVol volumes from a swift backup. cinder-backup needs to upload the backup data to VMware via the HttpNfc API by performing an ImportVApp call. Since this operation is about to replace the existing backing, we want to keep the main logic in cinder-volume. Thus, we instruct cinder-backup how to build the spec for ImportVApp via a json-like syntax, since it seems that suds objects can't be pickled or simply can't be sent plain over RPC.

After a restore backup operation, we correct the name and backing uuid for the newly created backing. This commit also includes moving some values into constants and updating the unit tests. [SAP] fix VM related constants [SAP] vmdk driver - adding comment for terminate_connection()

Got notified about this by `tox -e pep8` ... didn't know.

When we set a datastore in maintenance, we remove the tag connecting it to the storage-profile. We do this to prohibit cinder from using that datastore for new volumes. But since cinder also checks the tags for finding a valid datastore on attachment, it does a costly and slow vMotion of the volume to another datastore. We don't want it to vMotion the volumes automatically, but rather want to do it on our own, as doing it on attachment makes attachments really slow. Setting `vmware_profile_check_on_attach` to `false` will disable the check on attachment. It's still done on resize, though.

Snapshots are only run through the scheduler to check the capacity is still available. The host is already defined and only that host is checked. Therefore, we can safely ignore snapshots in the `ShardFilter`.

Since it would be too much effort to change our blackbox tests to use multiple projects so they can test in all shards, we implement an override in the `ShardFilter` via scheduler_hints. Example: `os volume create --size 10 asdf --hint vcenter-shard=vc-a-1`
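The decisions described for the `ShardFilter` (match on the project's shards, let snapshots pass, honour a scheduler hint) can be summarised in a small sketch. This is not the filter's actual code: `backend_passes` and its parameters are hypothetical, and the shard name `vc-a-0` is made up for the example.

```python
def backend_passes(backend_shard, project_shards, hint_shard=None,
                   is_snapshot=False):
    """Decide whether a backend survives the (sketched) shard filter."""
    if is_snapshot:
        # Snapshots already have a host; the filter lets them through.
        return True
    if hint_shard is not None:
        # A scheduler hint overrides the shards taken from the project.
        return backend_shard == hint_shard
    # Default case: the backend's shard must be one of the project's shards.
    return backend_shard in project_shards


# Hypothetical project that lives in shard "vc-a-0".
project_shards = {"vc-a-0"}

default_result = backend_passes("vc-a-1", project_shards)
hinted_result = backend_passes("vc-a-1", project_shards, hint_shard="vc-a-1")
snapshot_result = backend_passes("vc-a-1", project_shards, is_snapshot=True)
```

With the hint from the example command, a backend in `vc-a-1` is accepted even though the project itself is not in that shard.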

This patch adds the 'custom-requirements.txt' file which is used by loci builds for cinder base container images. This patch adds the python-agentliveness package that we use to ensure the c-api/c-sch/c-bak/c-vol service is up.

This patch adds the capability reporting for thin provisioning support as well as max over subscription and reserve percentage. https://docs.openstack.org/cinder/queens/admin/blockstorage-over-subscription.html

This patch adds some missing requirements whose absence prevents cinder services from running in our environment.

This patch adds redis to the custom-requirements.txt which is needed by osprofiler

* chunkeddriver - improving the restore operation
  Compute all the incremental backups prior to writing them to the file, so that a single write operation is executed on the volume_file regardless of the number of incremental backups. This removes the need to seek back into the volume_file to overwrite it with incremental chunks.
* Fix add_object() to avoid unnecessary iteration over the same object
  The previous approach of iterating over enumerate() while inserting twice into the list did an extra, useless iteration over an object that had just been inserted. We switch to a while loop so that we can jump to the desired index after inserting the segments into the list.
* Add iterator methods in BackupRestoreHandle
  Since this was built to be used as an iterator, it's cleaner to use the Python iterator API and get rid of the has_next() and get_next() methods.
* Fix _clear_reader() to properly clear the reader if it's not needed
  It checks whether there are any more segments of the same object between the current index and the end of the segments list; if there are none, it closes the reader and removes it from the cache directly.
* Added a docstring for the Segment.of() method
* Create BackupRestoreHandleV1 for handling v1 metadata
  Since we're handling most of the restore process within the BackupRestoreHandle class, we're now moving the metadata versioning down to its own class (BackupRestoreHandleV1). DRIVER_VERSION_MAPPING should now refer to class names. Such classes should extend BackupRestoreHandle or at least take as constructor parameters:
    * chunked_driver - an instance of ChunkedBackupDriver
    * volume_id - the volume id
    * volume_file - the file handle where to write the data
  Additionally, such a class should implement the following methods:
    * add_backup(backup, metadata) - called for each backup
    * finish_restore() - called after the backups are iterated
* Make BackupRestoreHandle an abstract class
  BackupRestoreHandle does not implement the add_backup method, leaving classes that inherit from it to define their own backup and metadata handling, so it makes sense to make it abstract.
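The class contract described above (constructor parameters, an abstract add_backup, finish_restore, and the Python iterator API instead of has_next()/get_next()) could look roughly like the sketch below. Only the class names, the constructor parameters and the two method names come from the commit message; the internal `_segments` list, the `chunks` metadata key and the write loop are assumptions made for illustration and are not the driver's real implementation.

```python
import abc
import io


class BackupRestoreHandle(abc.ABC):
    """Sketch of an abstract restore handle that is also an iterator."""

    def __init__(self, chunked_driver, volume_id, volume_file):
        self._driver = chunked_driver
        self._volume_id = volume_id
        self._volume_file = volume_file
        self._segments = []  # assumption: raw byte chunks, in write order
        self._idx = -1

    @abc.abstractmethod
    def add_backup(self, backup, metadata):
        """Called for each backup; subclasses interpret the metadata."""

    def __iter__(self):
        return self

    def __next__(self):
        # Replaces the has_next()/get_next() pair with the iterator protocol.
        self._idx += 1
        if self._idx >= len(self._segments):
            raise StopIteration
        return self._segments[self._idx]

    def finish_restore(self):
        """Called after the backups are iterated; writes every segment."""
        for segment in self:
            self._volume_file.write(segment)


class BackupRestoreHandleV1(BackupRestoreHandle):
    def add_backup(self, backup, metadata):
        # Assumption: the metadata carries a list of byte chunks.
        self._segments.extend(metadata["chunks"])


volume_file = io.BytesIO()
handle = BackupRestoreHandleV1(None, "vol-1", volume_file)
handle.add_backup("backup-1", {"chunks": [b"ab", b"cd"]})
handle.finish_restore()
```

Because add_backup is abstract, trying to instantiate BackupRestoreHandle directly raises a TypeError, which makes a missing version-specific implementation visible early.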

This patch adds the concourse_unit_test_task file so we can run unit tests during the loci image build process.

There is a bug in setuptools that prevents python installs from working correctly, and you end up with the error "error: 'egg_base' must be a directory name (got `src`)". This patch upgrades the version of pip for running unit tests, which should fix the error.

This patch adds the import_data options during volume attach for migrating a volume. When a volume is attached locally for work during migration, the volume needs to be writeable in order for cinder to copy bits into the volume. The import_data section of the connection_properties instructs os-brick to create a write handle for the http connection to the volume. This is needed for migrating a volume from one shard to another, since cinder's generic volume copy takes over.

This patch fixes an issue that tempest found during test_volume_snapshot_backup: volume['migration_status'] was None and could not be dereferenced as a string.

This patch ensures that the connector is not None during force detach. This issue was found during a tempest run.

This patch adds the cinder host to the update volume stats notification in the log, so we can track which cinder host is updating at what time.

This patch adds some driver startup validation for the cinder.conf vmware_storage_profile setting. It makes sure the storage profile exists. This patch also adds some validation to the get_volume_stats, to make sure not to try and call vcenter when we didn't find any datastores during get_volume_stats() time. We simply log a warning and move on.

* Improve the deletion of object readers in BackupRestoreHandle
  This deletes an object reader right away once it is no longer needed, without waiting for the garbage collector to take care of it. As an effect, this should lower the memory consumption.
* Fix finish_restore to increment the current object index
  This fixes the high memory consumption caused by _clear_reader, which was not able to do its job because self._idx was not reflecting the correct value.
* fix pep8 - use delayed string interpolation

Drivers may need to create their own RPC communication channel to exchange vendor-specific logic. A driver can now specify additional RPC endpoints by appending it to the `self.additional_endpoints` list. The manager will always initialize that property as an empty list and take care of exposing the endpoints after the driver has been loaded. This is a feature which is already possible in Nova.
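As a rough sketch of that flow: the manager resets `additional_endpoints` to an empty list, the driver appends its own endpoint while it is being set up, and the manager then exposes the combined list. Apart from the `additional_endpoints` attribute, every name here (`VendorEndpoint`, `ExampleDriver`, `ExampleManager`, `rpc_endpoints`) is hypothetical and does not mirror cinder's actual classes.

```python
class VendorEndpoint:
    """Hypothetical vendor-specific RPC endpoint."""

    def get_info(self, context):
        return {"vendor": "example"}


class ExampleDriver:
    """Hypothetical driver that registers one extra endpoint."""

    def do_setup(self):
        self.additional_endpoints.append(VendorEndpoint())


class ExampleManager:
    """Hypothetical manager that owns the driver."""

    def __init__(self, driver):
        self.driver = driver
        # The manager always initializes the property as an empty list ...
        self.driver.additional_endpoints = []
        self.driver.do_setup()

    def rpc_endpoints(self):
        # ... and exposes whatever the driver appended after it was loaded.
        return [self] + self.driver.additional_endpoints


manager = ExampleManager(ExampleDriver())
endpoints = manager.rpc_endpoints()
```

Initializing the list in the manager means drivers that never touch the attribute still work unchanged.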

Implement driver's `migrate_volume`. It adds a new RPC server and client for communicating with the destination host to get information needed to build the relocate spec for a particular volume: datastore, resource pool, host, vcenter info, and to perform `move_backing_to_folder` after the volume was relocated, or, to create the backing on the destination host if it doesn't already exist.

[SAP] Update requirements.txt to include memcached for tooz

VMDKs are stream optimized, having different size than the actual volume. We have to set the correct vmdk_size to VmdkWriteHandle so that it can validate the upload progress correctly.

Tempest tests found a bug in the shard filter where creating a cinder group from an existing group or group snapshot always failed with not finding any hosts. This is due to the shard filter not getting much of any information about the request. This patch adds the project_id of the group being created to ensure that the shard filter can work properly.

[SAP] set the correct vmdk_size when restoring backups

[SAP] Fix the shard filter for group creation

This patch allows a snapshot creation request to go through the scheduler to pick a pool for the snapshot to live on. The backend picked by the scheduler is added to the snapshot metadata. This metadata field, '__cinder_internal_backend', is filtered out of requests fetching the snapshot information, so end users will never see it.

To enable this ability, a new config option, sap_allow_independent_snapshots, is added to the scheduler; it defaults to False (disabled). It allows creating a snapshot from a volume to happen on a completely different pool than the source volume.

This patch also alters create volume from snapshot, to allow a volume to be created from a different pool than the source volume that the snapshot was created from. All of this is not how upstream works or allows.

Fix snapshot view filtering

This patch fixes an issue iterating over the metadata from the snapshot views API. The code is supposed to filter out and remove the __cinder_internal keys in the metadata. This patch looks for the keys to remove first and then deletes them, instead of deleting them while iterating over the metadata array, which causes a Python exception.
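The "find the keys first, then delete" approach can be illustrated as follows. This is a minimal sketch that assumes the metadata is a plain dict; `strip_internal_keys` is a hypothetical helper name, not the function used in the views code.

```python
INTERNAL_PREFIX = "__cinder_internal"


def strip_internal_keys(metadata):
    """Remove internal keys from a metadata dict in two passes."""
    # Pass 1: collect the keys. Deleting inside a loop over the dict itself
    # would raise "RuntimeError: dictionary changed size during iteration".
    internal_keys = [key for key in metadata
                     if key.startswith(INTERNAL_PREFIX)]
    # Pass 2: delete them now that iteration over the dict has finished.
    for key in internal_keys:
        del metadata[key]
    return metadata


snapshot_metadata = {
    "__cinder_internal_backend": "host@backend#pool",  # made-up value
    "purpose": "nightly",
}
visible = strip_internal_keys(snapshot_metadata)
```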

Allow snapshots to be independent

This patch adds a new custom SAP config option, SAP_allow_independent_clone. It allows a clone of a volume to be scheduled on a different pool than the source volume.

Fixes a hanging thread due to eventlet/eventlet#432, which may get fixed for oslo.log in 5.0.1 with openstack/oslo.log@94b9dc3 (at the time of writing, master in the antelope cycle is constrained to 5.0.0).

This patch looks for a custom attribute 'cinder_state' on a datastore at get_volume_stats() time. If the custom attribute value is 'drain', then the datastore is marked as 'draining' and cinder shouldn't use it for provisioning new requests. The datastore will be marked as down in the pool stats for that datastore.

This patch updates the 141 DB version upgrade to first drop the existing quota_usages table constraint on quota_usages_project_id_key, which happens to be UNIQUE (project_id, resource, deleted). 141 by default adds a unique constraint on the same key as ALTER TABLE quota_usages ADD CONSTRAINT quota_usages_project_id_key UNIQUE (project_id, resource, race_preventer)

[SAP] drop existing SAP constraint

This patch changes how cinder sets the size of the vmdk file contents on the oslo.vmware rw_handle so that it uses a method on the object. This requires PR: sapcc/oslo.vmware#35

[SAP] Fix setting the restore handle size

This patch fixes the format strings in some calls to LOG. They were working in Train, but started failing in Wallaby. The old strings were `LOG.debug("foo bar {}", something)`. They are fixed to `LOG.debug("foo bar %s", something)`.
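Why the `{}` form cannot work is visible in how a log record builds its message. The sketch below uses the standard library `logging` module rather than oslo.log (an assumption made to keep it self-contained); `render` is a hypothetical helper that formats a record the way a handler would.

```python
import logging


def render(msg, *args):
    """Build a log record and return its final message text."""
    record = logging.LogRecord(
        "demo", logging.DEBUG, "demo.py", 0, msg, args, None)
    # getMessage() applies %-style interpolation: msg % args
    return record.getMessage()


something = "vol-1"

# %-style placeholder: the argument is interpolated when the record is emitted.
good = render("foo bar %s", something)

# str.format-style placeholder: "%" finds no conversion for the argument
# and raises TypeError instead of producing a message.
try:
    render("foo bar {}", something)
    brace_style_failed = False
except TypeError:
    brace_style_failed = True
```

Passing the argument separately, instead of pre-formatting the string, also keeps interpolation delayed until the message is really emitted.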

[SAP] Fix logging formats

Changes:
- eliminate whitespace in passenv values
- account for stricter allowlist checking
- removed skipsdist=True, which in tox 4 appears to prevent cinder from being installed in the testenvs
- made 4.0.0 the tox minversion, which means tox will have to update itself until it's available in distros, which in turn means that the default install_command needs to be used in the base testenv, which means a new base pyXX testenv is added to include our install_command that includes upper constraints
- added install_command to most testenvs since they don't inherit the correct one from the base testenv any more

Not strictly necessary for this patch, but I did them anyway:
- moved the api-ref and releasenotes testenvs to be closer to the docs testenv
- added reno as a dep for the 'venv' testenv, which our contributor docs say should be used to generate a new release note from the reno template, and which has apparently been broken for a while

This patch makes tox 4 the default so that we can hopefully catch problems locally before they block the gate.

SAP - The change to setup.py to add the py_modules line comes from setuptools >= 61 being automatically installed in the process and it's only overwritten with the version from constraints

Change-Id: I75e36fa100925bd486c9d4fdf8a33dd58347ce81

[SAP]Get ready for tox 4

This patch adds a simple check against the pool in the shard filter to only filter if the pool is reporting as a vendor_name of 'VMware', which is true only for vmdk.py and fcd.py drivers. If we deploy a netapp cinder driver, this will pass the shard filter as there is no reason to shard.

There is a failure for looking at the cinder_pool_state for a datastore when the state is None (not set). This patch ensures that the custom attribute is set to something before trying to call .lower() on the string.
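The guard amounts to checking that the value is set before calling `.lower()` on it. A minimal sketch, assuming the attribute value arrives as either `None` or a string; `is_draining` is a hypothetical helper, and the 'drain' value is the one named for the datastore custom attribute.

```python
def is_draining(state):
    """Return True only if the custom attribute is set and equals 'drain'."""
    # An unset attribute is None; calling None.lower() would raise
    # AttributeError, so check for a value first.
    return bool(state) and state.lower() == "drain"


unset_result = is_draining(None)
drain_result = is_draining("Drain")
other_result = is_draining("active")
```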

[SAP] Fix vmdk stats reporting

[SAP] Update shard filter to filter only on vmware

fwiesel · 2023-04-25T10:31:07Z

Draft, as there are no unit tests, and I haven't looked into the reverse operations (disk to image, and backup).

This patch adds the new action_track for cinder. The purpose of this module is to add a track() method which can be used to log business logic actions being taken on resources (volumes, snaps, etc) so we can then use these action logs to discover our most common failures and help debug/fix them. For now the only supported output is sending the actions into the standard cinder logfile. But we could add another adapter in the trace() method for sending the action logs to a rabbitmq queue, and/or a db table directly. This patch also adds calls throughout cinder code to track specific actions against volumes, such as create, delete, attach, migrate. Updated the test_attachments_manager test to use a MagicMock instead of a sentinel, as sentinels don't cover unknown attributes like MagicMock does.

[SAP] Add action_track to format specific logs for easier debugging

imitevm and others added 30 commits October 3, 2022 16:34

[SAP] implement size expansion when creating volume from template-based snapshot

ac86edf

[SAP] netapp/dataontap: ignore certificate

b8c064f

[SAP] Fix pep8 warnings

947cd24

[SAP] Online resize of cinder volumes (#24)

4a21cb6

[SAP] scheduler: Add ShardFilter

d97d831

[SAP] fix pep8

6da64e5

[SAP] run tox -e genopts for "scheduler: Add ShardFilter"

d7f97b6

[SAP] scheduler: ShardFilter lets snapshots pass

79663b4

[SAP] Add loci consumed custom requirements

3d74cab

[SAP] add reporting of thin provisioning

f1c2b19

[SAP] added some missing custom requirements

e4562bb

[SAP] need redis for osprofiler

9ca50d6

[SAP] add concourse_unit_test_task

9069c9b

[SAP] Fix when migration status is None

0d387ca

[SAP] fix force detach when connector is None

0341cf9

[SAP] add cinder host to notify log line

01e99ff

hemna and others added 22 commits November 2, 2022 12:48

Merge pull request #152 from sapcc/memcached

846ffd6

[SAP] Update requirements.txt to include memcached for tooz

[SAP] set the correct vmdk_size when restoring backups

e7c761b

Merge pull request #154 from sapcc/correct_vmdk_size

539df99

[SAP] set the correct vmdk_size when restoring backups

Merge pull request #155 from sapcc/shard_filter

31fb035

[SAP] Fix the shard filter for group creation

Merge pull request #158 from sapcc/wallaby_independent_snapshots

da60e86

Allow snapshots to be independent

Allow independent cloning of volumes

1957259

SAPCC: don't log in method called by eventlet.tpool.execute()

28a34d2

Merge pull request #162 from sapcc/fix_constraint

92dfac2

[SAP] drop existing SAP constraint

Fix setting the restore handle size

21d021b

Merge pull request #164 from sapcc/restore_size_fix

8e8665e

[SAP] Fix setting the restore handle size

[SAP] Fix logging formats

f738943

Merge pull request #167 from sapcc/sap_fix_logging

ceabe4b

[SAP] Fix logging formats

Merge pull request #168 from sapcc/sap_fix_tox

d235272

[SAP]Get ready for tox 4

[SAP] Fix vmdk stats reporting

a3d083c

Merge pull request #170 from sapcc/fix_stats

b6e6452

[SAP] Fix vmdk stats reporting

Merge pull request #166 from sapcc/vmware_sharding_only

9aab279

[SAP] Update shard filter to filter only on vmware

fwiesel requested a review from hemna April 25, 2023 10:30

hemna and others added 3 commits April 28, 2023 09:10

Merge pull request #163 from sapcc/walt-trace

4cca7e4

[SAP] Add action_track to format specific logs for easier debugging

Option for using qemu internal iscsi driver

f4feba9

fwiesel force-pushed the qemu_image_xfer branch from a3ba775 to f4feba9 Compare May 2, 2023 14:09

hemna force-pushed the stable/wallaby-m3 branch from 9fdd4d5 to 767aaad Compare September 13, 2023 20:27

hemna force-pushed the stable/wallaby-m3 branch from 2b59588 to 17e2b86 Compare August 27, 2024 15:13
