This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

use error injection to verify operation in case of media issues #799

Open
okartau opened this issue Nov 2, 2020 · 1 comment
Labels
future: needs to be fixed in some future release

Comments

@okartau
Contributor

okartau commented Nov 2, 2020

We have not had a systematic approach to verifying and improving PMEM-CSI's behavior when the underlying media has errors. Because we deal with file system formatting and mounting, media-level errors are typically serious. Without a human operator interpreting the error messages, the PMEM-CSI plugin can't do much to fix the problem and continue, so the likely outcome is "fail and stop".
But we should at least try to make sure that the failure is not too ugly: no crashing without a helpful message, no looping forever, etc.

So far, error handling has mostly been based on scenarios that were detected through testing and use.
To improve coverage, we could use artificially generated errors. There is ndctl-inject-error, but it seems to be HW-specific. We could also investigate whether we can simply corrupt the media in the emulated use case and see what happens in the next run.
It is also worth investigating whether coverage in emulated cases is as good (i.e. as bad) as HW-based corruption can be.
I am not convinced that we should add such testing (probably quite slow) to the CI cycle, which already takes a long time. But having some tools to run such tests outside of CI (and also on HW) would be helpful.
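
As a rough illustration of the "corrupt media in the emulated case" idea, here is a minimal sketch that overwrites a byte range of a file with pseudo-random data. It assumes the emulated device is backed by a regular file that the test can write to. The helper name `corrupt_region` and its parameters are hypothetical and not part of PMEM-CSI or ndctl. Whether this kind of corruption resembles what real hardware errors look like is exactly the open question raised above.

```python
import os
import random


def corrupt_region(path, offset, length, seed=0):
    """Overwrite `length` bytes at `offset` in `path` with pseudo-random data.

    Hypothetical test helper: `path` is assumed to be a regular file that
    backs an emulated device. Returns the bytes that were written so a test
    can record what was damaged.
    """
    size = os.path.getsize(path)
    if offset < 0 or length <= 0 or offset + length > size:
        raise ValueError(
            f"region {offset}+{length} is outside file of size {size}"
        )

    # A fixed seed keeps the corruption reproducible between runs.
    rng = random.Random(seed)
    garbage = bytes(rng.getrandbits(8) for _ in range(length))

    # "r+b" modifies the file in place without truncating it.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(garbage)
        f.flush()
        os.fsync(f.fileno())
    return garbage
```

A test could corrupt a region where file system metadata is expected, then restart the driver against the same backing file and check that it fails with a clear message rather than crashing or looping.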

@pohly
Contributor

pohly commented Dec 18, 2020

The fault injection that is relevant for PMEM-CSI is making "mount" or "mkfs" fail. If file access fails at application runtime, then there isn't much that PMEM-CSI can do about it. We don't even get to know about it.
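
One way to inject such failures without special hardware is to wrap the code that invokes external commands, so that selected commands report an error instead of running. The following is only a sketch under that assumption. `FaultInjectingRunner` and `format_and_mount` are hypothetical names for illustration and do not reflect how PMEM-CSI actually invokes "mkfs" or "mount".

```python
import os
import subprocess


def real_run(argv):
    """Run a command for real and capture its output as text."""
    return subprocess.run(argv, capture_output=True, text=True)


class FaultInjectingRunner:
    """Command runner that fails selected commands instead of executing them.

    Hypothetical test helper: commands whose basename is in `fail_commands`
    return a non-zero exit status; everything else goes to `run`.
    """

    def __init__(self, fail_commands, run=real_run, returncode=1):
        self.fail_commands = set(fail_commands)
        self.run_real = run
        self.returncode = returncode
        self.calls = []  # every argv seen, for later inspection in tests

    def run(self, argv):
        self.calls.append(list(argv))
        name = os.path.basename(argv[0])
        if name in self.fail_commands:
            return subprocess.CompletedProcess(
                argv, self.returncode, stdout="",
                stderr=f"{name}: injected failure",
            )
        return self.run_real(argv)


def format_and_mount(runner, device, target):
    """Hypothetical stand-in for the code under test: format, then mount.

    Stops at the first failing command and raises an error that names the
    command, the device, and the captured stderr.
    """
    for argv in (["mkfs.ext4", "-F", device], ["mount", device, target]):
        result = runner.run(argv)
        if result.returncode != 0:
            raise RuntimeError(
                f"{argv[0]} failed for {device} "
                f"(exit {result.returncode}): {result.stderr.strip()}"
            )
```

A test would construct the runner with, for example, `{"mkfs.ext4"}` and then assert that the error is reported clearly and that "mount" is never attempted.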

pohly added the "future" label (needs to be fixed in some future release) Dec 18, 2020