SAP: Add affinity UUID validation #182

Open

wants to merge 151 commits into base: stable/wallaby-m3
Conversation

@hemna commented Sep 27, 2023

This patch adds basic UUID validation to the cinder create volume API
request when scheduler hints are passed for volume affinity or
anti-affinity. The API now ensures that each hinted UUID refers to an
existing cinder volume; if it doesn't, the create call fails.
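For illustration, a minimal sketch of such a check, assuming the standard affinity hint keys (`same_host`/`different_host`) and a hypothetical helper name; the actual patch wires this into the create-volume API flow:

```python
from oslo_utils import uuidutils

from cinder import exception
from cinder import objects


def _validate_affinity_hint_uuids(context, scheduler_hints):
    """Reject affinity/anti-affinity hints that don't name real volumes."""
    for key in ('same_host', 'different_host'):
        hint = scheduler_hints.get(key)
        if hint is None:
            continue
        uuids = hint if isinstance(hint, list) else [hint]
        for vol_uuid in uuids:
            if not uuidutils.is_uuid_like(vol_uuid):
                raise exception.InvalidInput(
                    reason="scheduler hint %s is not a valid UUID" % vol_uuid)
            # Raises VolumeNotFound if no such volume exists.
            objects.Volume.get_by_id(context, vol_uuid)
```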

imitevm and others added 30 commits September 13, 2023 15:55
With our VASA provider we cannot create disks at the default minimum
of 1MB, as no LUNs below 4MB can be created. The need for creating
such small volumes comes from the fact that finding the actual size of
stream-optimized images is left to the vCenter, which grows the volume
dynamically.
When we're creating a volume from an image, the size was set to 0 to let
the vCenter figure out the uncompressed size. But with our VASA provider
and thin provisioning we end up with empty volumes, presumably
because the volume doesn't auto-grow properly.

Since we know in advance the size the volume should have in the end, and
we're not re-using the created VM as a template for others, we can
create the volume initially with the requested size and don't have to
resize it afterwards.
Until now, when creating a snapshot from an attached volume, only the
additional unwanted disks were removed and the NICs/VIFs were kept. With
NSX-T this is a problem, as it does not allow duplicate MAC addresses
on the same logical switch. Therefore, we now remove all VIFs - or
rather every device that has a "macAddress" attribute - in the
clone process.
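A sketch of the device filtering this implies (spec layout simplified here; the real clone spec is built via the vSphere API bindings):

```python
def _vif_removal_device_changes(devices):
    """Build deviceChange entries removing every NIC-like device.

    Any device carrying a 'macAddress' attribute is treated as a VIF,
    so clones can't introduce duplicate MACs on an NSX-T logical switch.
    """
    return [{'operation': 'remove', 'device': device}
            for device in devices if hasattr(device, 'macAddress')]
```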
* Adding-support-for-online-resize-cinder-volume
* pep8 fixes
* Added vmware_online_resize config option to disable online-resize
This filters out backends based on the shard defined in the backend's
extra_capabilities and in the project's tags.
This is addressing the fix for restoring VVol volumes from swift
backup. cinder-backup needs to upload the backup data to VMware
via the HttpNfc API by performing an ImportVApp call. Since this
operation is about to replace the existing backing, we want to
keep the main logic in cinder-volume. Thus, we instruct
cinder-backup how to build the spec for ImportVApp via a json-like
syntax, since it seems that suds objects can't be pickled or
simply can't be sent plain over RPC.
After a restore backup operation, we correct the name and backing uuid
for the newly created backing.
This commit also includes moving some values into constants and updating
the unit tests.

[SAP] fix VM related constants

[SAP] vmdk driver - adding comment for terminate_connection()
Got notified about this by `tox -e pep8` ... didn't know.
When we set a datastore in maintenance, we remove the tag connecting it
to the storage profile. We do this to prohibit cinder from using that
datastore for new volumes. But since cinder also checks the tags when
finding a valid datastore on attachment, it does a costly and slow
vMotion of the volume to another datastore. We don't want it to vMotion
the volumes automatically, but rather want to do it on our own, as doing
it on attachment makes attachments really slow.

Setting `vmware_profile_check_on_attach` to `false` will disable the
check on attachment. It's still done on resize, though.
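The option boils down to a guard like this hypothetical helper (the actual wiring in the driver may differ):

```python
def _should_check_profile(config, operation):
    """Return True when storage-profile compliance must be verified.

    Attach skips the check when vmware_profile_check_on_attach is
    false; resize always checks.
    """
    if operation == 'attach':
        return config.vmware_profile_check_on_attach
    return True
```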
Snapshots are only run through the scheduler to check that the capacity is
still available. The host is already defined and only that host is
checked. Therefore, we can safely ignore snapshots in the `ShardFilter`.
Since it would be too much effort to change our blackbox tests to use
multiple projects so they can test in all shards, we implement an
override in the `ShardFilter` via scheduler_hints.
Example:
  os volume create --size 10 asdf --hint vcenter-shard=vc-a-1
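Inside the filter, the override takes precedence over the project's shard tags; a sketch with an assumed capability key and a hypothetical helper method:

```python
class ShardFilter(object):
    """Sketch of the scheduler_hints override, not the real filter."""

    def backend_passes(self, backend_state, filter_properties):
        hints = filter_properties.get('scheduler_hints') or {}
        wanted = hints.get('vcenter-shard')
        if wanted is not None:
            # Explicit test override: only backends reporting the
            # requested shard pass, regardless of the project's tags.
            return backend_state.capabilities.get('vcenter-shard') == wanted
        # Normal path: match against the project's shard tags.
        return self._passes_project_shards(backend_state, filter_properties)
```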
This patch adds the 'custom-requirements.txt' file which is
used by loci builds for cinder base container images.

This patch adds the python-agentliveness package that we use
to ensure the c-api/c-sch/c-bak/c-vol service is up.
This patch adds the capability reporting for
thin provisioning support as well as max over subscription
and reserve percentage.

https://docs.openstack.org/cinder/queens/admin/blockstorage-over-subscription.html
This patch adds some missing requirements that prevented
cinder services from running in our environment.
This patch adds redis to custom-requirements.txt, which is
needed by osprofiler.
* chunkeddriver - improving the restore operation

Compute all the incremental backups prior to writing them to the
file, so that a single write operation is executed on the
volume_file regardless of the number of incremental backups.
This removes the need to seek back into the volume_file for
overwriting with incremental chunks.

* Fix add_object() to avoid unnecessary iteration over the same object

The previous approach of iterating with enumerate() while inserting
twice into the list was doing an extra, useless iteration over
an object that had just been inserted.
We switch to a while loop so that we can jump to the desired
index after inserting the segments into the list.

* Add iterator methods in BackupRestoreHandle

Since this was built to be used as an iterator, it's cleaner
to use the Python iterator API and get rid of the has_next()
and get_next() methods.

* Fix _clear_reader() to properly clear the reader if it's not needed

It checks whether there are any more segments of the same object
between the current index and the end of the segments list; if there
are none, it closes the reader and removes it from the cache directly.

* Added a docstring for the Segment.of() method

* Create BackupRestoreHandleV1 for handling v1 metadata

Since we're handling most of the restore process within the
BackupRestoreHandle class, we're now moving the metadata
versioning down to its own class (BackupRestoreHandleV1).

DRIVER_VERSION_MAPPING should now refer to class names.
Such classes should extend BackupRestoreHandle
or at least take the following constructor parameters:
* chunked_driver - an instance of ChunkedBackupDriver
* volume_id - the volume id
* volume_file - the file handle where to write the data

Additionally, such a class should implement the following methods:
* add_backup(backup, metadata) - called for each backup
* finish_restore() - called after the backups are iterated

* Make BackupRestoreHandle an abstract class

Since BackupRestoreHandle does not implement the add_backup method,
which lets classes inheriting from it define their own backup
and metadata handling, it makes sense to make it abstract.
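Putting the bullet points above together, the abstract handle might look roughly like this (segment handling condensed; only the add_backup/finish_restore names and constructor parameters are taken from the description):

```python
import abc


class BackupRestoreHandle(metaclass=abc.ABCMeta):
    """Merges backup segments and writes them to a volume file."""

    def __init__(self, chunked_driver, volume_id, volume_file):
        self._driver = chunked_driver
        self._volume_id = volume_id
        self._volume_file = volume_file
        self._segments = []
        self._idx = -1

    @abc.abstractmethod
    def add_backup(self, backup, metadata):
        """Register one (incremental) backup and its metadata."""

    def finish_restore(self):
        """Write the merged segments once all backups were added."""
        for segment in self:
            self._volume_file.write(segment.read())

    def __iter__(self):
        # Replaces the old has_next()/get_next() pair.
        return self

    def __next__(self):
        self._idx += 1
        if self._idx >= len(self._segments):
            raise StopIteration
        return self._segments[self._idx]
```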
This patch adds the concourse_unit_test_task file so we can run
unit tests during the loci image build process.
There is a bug in setuptools that prevents Python installs
from working correctly, and you end up with the error
"error: 'egg_base' must be a directory name (got `src`)".

This patch upgrades the version of pip used for running unit tests,
which should fix the error.
This patch adds the import_data options during volume attach
for migrating a volume. When a volume is attached locally
during migration, the volume needs to be writeable
in order for cinder to copy bits into it. The import_data
section of the connection_properties instructs os-brick to create
a write handle for the HTTP connection to the volume.

This is needed for migrating a volume from one shard to another,
since cinder's generic volume copy takes over.
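Conceptually, the attach flow passes something like the following to os-brick (the exact layout of connection_properties here is an assumption, not the patch's actual schema):

```python
def build_migration_connection_properties(volume, import_spec):
    """Sketch: connection properties for a writeable migration attach.

    The presence of the 'import_data' section tells os-brick to open a
    write handle for the HTTP connection (HttpNfc ImportVApp) instead
    of the default read-only handle.
    """
    return {
        'volume_id': volume.id,
        'import_data': import_spec,  # json-like spec built by cinder-volume
    }
```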
This patch fixes an issue that tempest found during
test_volume_snapshot_backup, where volume['migration_status'] was None
and couldn't be dereferenced as a string.
This patch ensures that the connector is not None during
force detach.  This issue was found during a tempest run.
This patch adds the cinder host to the update volume stats
notification in the log, so we can track which cinder host
is updating at what time.
This patch adds some driver startup validation for the cinder.conf
vmware_storage_profile setting.   It makes sure the storage profile
exists.

This patch also adds some validation to the get_volume_stats, to make
sure not to try and call vcenter when we didn't find any datastores
during get_volume_stats() time.   We simply log a warning and move on.

* Improve the deletion of object readers in BackupRestoreHandle

This deletes an object reader as soon as it is no longer needed,
instead of waiting for the garbage collector to take care
of it. As a result, this should lower memory consumption.

* Fix finish_restore to increment the current object index

This fixes the high memory consumption caused by _clear_reader,
which was not able to do its job because self._idx
was not reflecting the correct value.

* fix pep8 - use delayed string interpolation
Drivers may need to create their own RPC communication channel to
exchange vendor-specific logic.
A driver can now specify additional RPC endpoints by appending them
to the `self.additional_endpoints` list. The manager will always
initialize that property as an empty list and take care of
exposing the endpoints after the driver has been loaded.
This feature is already possible in Nova.
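A driver-side sketch of the mechanism, with a hypothetical endpoint class (the manager registers whatever has been appended once the driver is loaded):

```python
import oslo_messaging as messaging


class VMwareMigrationEndpoint(object):
    """Hypothetical vendor-specific endpoint served over the manager's
    RPC server."""

    target = messaging.Target(namespace='vmware', version='1.0')

    def get_relocate_info(self, ctxt, volume_id):
        # Return whatever the source host needs to build a relocate spec.
        return {'volume_id': volume_id}


class MyVendorDriver(object):  # stands in for the real driver base class
    def __init__(self):
        # In the real flow the manager initializes additional_endpoints
        # as an empty list; the driver only has to append to it.
        self.additional_endpoints = [VMwareMigrationEndpoint()]
```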
Implement driver's `migrate_volume`.
It adds a new RPC server and client for communicating with the
destination host to get information needed to build the
relocate spec for a particular volume: datastore, resource pool,
host, vcenter info, and to perform `move_backing_to_folder` after
the volume was relocated, or, to create the backing on the
destination host if it doesn't already exist.
hemna and others added 28 commits September 13, 2023 15:59
There is a failure when looking at the cinder_pool_state for a datastore
when the state is None (not set). This patch ensures that the custom
attribute is set to something before trying to call .lower() on the
string.
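The fix amounts to a None-guard before the comparison, along these lines (attribute access simplified):

```python
def _pool_state_matches(custom_attributes, expected):
    """Compare cinder_pool_state case-insensitively, tolerating None."""
    state = custom_attributes.get('cinder_pool_state')
    # The custom attribute may not be set at all, so guard before .lower().
    return state is not None and state.lower() == expected
```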
This patch adds a simple check against the pool in the shard filter
to only filter if the pool is reporting as a vendor_name of 'VMware',
which is true only for vmdk.py and fcd.py drivers.

If we deploy a netapp cinder driver, this will pass the shard filter
as there is no reason to shard.
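As an early-out in the filter, non-VMware pools simply pass; sketched as:

```python
def _is_vmware_pool(backend_state):
    """Only the vmdk.py and fcd.py drivers report this vendor_name."""
    return backend_state.capabilities.get('vendor_name') == 'VMware'
```

In `backend_passes()`, a pool for which this returns False (e.g. a netapp backend) would pass the filter without any shard check.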
This patch adds the new action_track module for cinder. The purpose of this
module is to add a track() method which can be used to log business-logic
actions being taken on resources (volumes, snaps, etc.) so we can
then use these action logs to discover our most common failures and help
debug/fix them.

For now the only supported output is sending the actions into the
standard cinder logfile. But we could add another adapter in the
track() method for sending the action logs to a rabbitmq queue, and/or
a db table directly.

This patch also adds calls throughout cinder code to track specific
actions against volumes, such as create, delete, attach, migrate.

Updated the test_attachments_manager test to use a MagicMock instead of
a sentinel, as sentinels don't cover unknown attributes like MagicMock
does.
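The module surface described above is essentially one function; a sketch with an assumed signature (only the logfile adapter exists so far):

```python
from oslo_log import log as logging

LOG = logging.getLogger(__name__)


def track(context, action, resource, message):
    """Record a business-logic action taken on a resource.

    Today this only writes to the standard cinder logfile; adapters for
    a rabbitmq queue or a db table could be added here later.
    """
    LOG.info("ACTION %(action)s %(resource)s: %(message)s",
             {'action': action, 'resource': resource, 'message': message})
```

A call site would then look like, e.g., `track(ctxt, 'create', volume.id, 'scheduling started')`.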
This patch adds the ability in the volume manager to check if a backend
has clones as snapshots and account for the space consumed on the pool.

All of our snapshots are clones and the volume manager doesn't account
for the space consumed on the pool/datastore.  Only the scheduler
accounts for that space consumed on the pool currently. With this patch
the volume manager will check all snapshots in the DB and properly
account for the space used on the pool. This is to help prevent
overcommit on a pool.
This patch removes the volume_has_snapshots_filter in the
conditional update when the backend that owns the volume has
clones as snapshots.

This effectively makes all vmware-driver-based volumes migratable
within a shard if the volume has snapshots.
When creating a volume from a snapshot, the hidden
__cinder_internal_backend metadata on the snapshot will
be used as the host for the volume creation. We do this now
instead of falling back to the snapshot's source volume, whose host
can be different than it was when the snapshot was taken, because
the source volume can be migrated now.

This ensures that the new volume from snapshot can copy the bits
from the snapshot (clone).
…also use it for temp directory.

This change will help ops find fcd objects on the datastore, as there is no longer any searchable object in vCenter.
This patch catches the VIMException when an fcd volume isn't found
during a delete_volume.  This is considered a success.
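Roughly like the following (oslo.vmware spells the exception VimException; the volumeops call here is a stand-in, not the driver's real method):

```python
from oslo_log import log as logging
from oslo_vmware import exceptions as vexc

LOG = logging.getLogger(__name__)


def delete_fcd_volume(volumeops, volume):
    try:
        volumeops.delete_fcd(volume)  # stand-in for the real call
    except vexc.VimException:
        # The FCD backing is already gone; deleting a volume that no
        # longer exists counts as success.
        LOG.warning("FCD for volume %s not found during delete; "
                    "treating as success.", volume.id)
```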
The envlist was missing from the tox.ini, which prevented running
'tox' without an -e option correctly.  The upper-constraints were
not being applied to the tox environment runtime, which allowed
SQLAlchemy 2.0.x to be installed and then the tests would fail
instantly.
Volumes that are created by a K8S cluster will get scheduled
in the same shard (vCenter).
This identifies the K8S cluster by looking at the metadata [1]
set by the CSI driver when creating the volumes.

[1] cinder.csi.openstack.org/cluster
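The lookup behind this is small; a sketch with a hypothetical helper (the metadata key is the one cited above):

```python
K8S_CLUSTER_KEY = 'cinder.csi.openstack.org/cluster'


def _get_k8s_cluster(filter_properties):
    """Return the K8S cluster a volume belongs to, or None."""
    spec = filter_properties.get('request_spec') or {}
    metadata = spec.get('volume_properties', {}).get('metadata') or {}
    return metadata.get(K8S_CLUSTER_KEY)
```

Volumes reporting the same cluster name are then pinned to the shard already hosting that cluster's volumes.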
This patch adds the new cinder/utils.py calculate_capacity_factors
to provide a detailed and more consistent, accurate view of the
various factors in determining virtual free space for a particular
backend. It takes into consideration total capacity, free space,
thin/thick provisioning in the volume type, thin/thick support in the
backend, as well as reserved percentage and max_over_subscription_ratio.

Since the vmware driver is configured to allow lazy creation of volumes,
the free space reported by the pool/datastore isn't a reliable
source of how much is free considering what has been requested from
cinder to allocate.

This patch calculates what should be free based upon the
total available capacity ( total - reserved ) and what cinder
has tracked as allocated against that backend.
If that calculated free is less than the reported free, then
the calculated free is what is reported as virtual_free_space.

There is a known issue in cinder with keeping track of the allocated
space in 2 places:
1) at startup cinder only considers volumes that are in-use and
   available. All other volumes in other states aren't used to
   calculate the allocated space. This is fixed here: #117

2) The allocated space isn't adjusted during volume migrations.
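The core of the computation, as described, can be sketched like this (a simplification of the real factors; parameter names are mine, not the function's actual signature):

```python
def virtual_free_space(total, reserved_percentage, reported_free,
                       allocated, max_over_subscription_ratio=1.0,
                       thin=False):
    """Sketch of the virtual-free calculation described above."""
    available = total - total * (reserved_percentage / 100.0)
    if thin:
        available *= max_over_subscription_ratio
    # Free space according to cinder's own allocation tracking.
    calculated_free = available - allocated
    # Lazy creation means the backend over-reports its free space, so
    # report the smaller of the two values.
    return min(calculated_free, reported_free)
```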
This patch reworks the CapacityFilter to use the new
utils.calculate_capacity_factors to determine the
virtual free space available.

The main issue is that the vmware driver does lazy creates.
This causes over-reporting of free space by the backend,
because the space hasn't been consumed yet.
So the amount of free space is not accurate with respect to
how much has been allocated by cinder. The updated calculate
capacity factors as well as the SAPCapacityFilter account for
the virtual free space by using cinder's allocated-capacity
tracking. If the free space reported is less than what cinder
thinks should be available, then the reported free space is used.

This relies on accurate reporting of the allocated capacity
by the driver. We know there is an issue with allocated capacity
not being reported correctly for migrated volumes, as well as with
accounting for existing volumes at startup. The startup issue
should be solved with this PR:
#117

Will have to do a follow-up to account for updating the allocated
capacity for migrated volumes.
We already ignore the same in the "normal" shard filter code, but also
need to do it in the k8s shard part of it - otherwise, VMs being in the
wrong shard will not get their volumes attached.

Change-Id: Idf5c8e25916d148bf5cecf5feb1ad2f35e83d7cf
[SAP] k8s shard filter ignores migrate-by-connector
Extend the SQL query to return all matching hosts, ordered desc.
For new volume creations we allow it in the dominant shard only,
however for existing volumes we allow the operation in all the
shards of the K8S cluster. This is a backward-compatibility with
existing K8S cluster spanning multiple shards in the same AZ.

Additionally, this commit fixes the availability_zone value to be
taken from filter_properties, which is always seeded by
populate_filter_properties()

For better debugging, we add a debug log when a backend is
filtered out by the K8S logic.

Change-Id: I42d526e8b1335e9025543a424f9203858a44663e
The message we pass into `action_track.track()` was wrongly a tuple
instead of a string, because we converted it from a LOG message and
forgot to convert the formatting arguments to actual formatting.

Change-Id: Ib861b59961a92ceabaa2874226ec367702ddac0c
ShardFilter: allow existing K8S clusters with more than 1 shard