/kind bug

What happened?

During CreateSnapshot, the CSI driver returns OK right after calling CreateSnapshot (IaaS).

Once CreateSnapshot (CSI) has returned OK, the CO considers the Snapshot "cut" per the CSI specification (meaning the Snapshot's content cannot be altered by future writes).

Once the "cut" is done, the CO may "thaw" the application, which may then resume writing to the Volume.

However, unlike EC2 behavior where "the point-in-time snapshot is created immediately", Outscale's Snapshot is only cut once the "completed" state is reached on the IaaS side:

> The data contained in a snapshot is considered cut when the snapshot is in the completed state.

This behavior could lead the CO to prematurely resume writes on the Volume and alter the Snapshot's content.
What you expected to happen?
As described in the CSI spec:

> CreateSnapshot is a synchronous call and it MUST block until the snapshot is cut

In the current Outscale API version, CreateSnapshot (CSI) should therefore block until the Snapshot (IaaS) state reaches "completed".
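A minimal sketch of this expected flow, assuming a hypothetical IaaSClient wrapper around the Outscale API (the interface and names below are illustrative, not the driver's actual types):

```go
package snapshot

import (
	"context"
	"fmt"
	"time"
)

// IaaSClient is an assumed abstraction over the two Outscale API calls
// involved; the real driver wraps the Outscale SDK instead.
type IaaSClient interface {
	CreateSnapshot(ctx context.Context, volumeID string) (snapshotID string, err error)
	ReadSnapshotState(ctx context.Context, snapshotID string) (state string, err error)
}

// createSnapshotBlocking creates the IaaS snapshot and only returns
// once its state is "completed", i.e. once the snapshot is cut.
func createSnapshotBlocking(ctx context.Context, c IaaSClient, volumeID string) (string, error) {
	id, err := c.CreateSnapshot(ctx, volumeID)
	if err != nil {
		return "", err
	}
	for {
		state, err := c.ReadSnapshotState(ctx, id)
		if err != nil {
			return "", err
		}
		if state == "completed" {
			return id, nil // safe to answer the CSI call: the snapshot is cut
		}
		select {
		case <-ctx.Done():
			// Do not poll forever; see the implementation risk below.
			return "", fmt.Errorf("snapshot %s not yet completed: %w", id, ctx.Err())
		case <-time.After(3 * time.Second):
		}
	}
}
```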
How to reproduce it (as minimally and precisely as possible)?

1. Create a loop that appends the current date to date.txt on a volume every second (a sketch follows this list)
2. Trigger CreateSnapshot (CSI)
3. Read the creation_time of the Snapshot
4. Restore the Snapshot to a new Volume and read the date file
5. Compare the dates between steps 3 and 4 => the restored Volume contains dates written after creation_time, which demonstrates the bug
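A minimal sketch of step 1, assuming the volume is mounted at /mnt/volume (adjust the path to your actual mount point):

```go
// repro.go: append the current time to date.txt once per second.
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	f, err := os.OpenFile("/mnt/volume/date.txt", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	for {
		fmt.Fprintln(f, time.Now().UTC().Format(time.RFC3339))
		f.Sync() // flush so the write is on the volume before the next tick
		time.Sleep(time.Second)
	}
}
```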
Anything else we need to know?:
Note that ready_to_use still switches to true as soon as a Snapshot (IaaS) reaches the "completed" state, since Outscale has no post-processing step (unlike EC2).
🔥IMPLEMENTATION RISK🔥
Waiting for the state to reach "completed" could easily make CSI calls time out, which is fine since the CO will retry CreateSnapshot again and again.
However, if each pending call is not stopped once its timeout is reached, every call may keep performing ReadSnapshots (IaaS) in an infinite loop and cause these issues:

- Runners stay busy running the same ReadSnapshots (IaaS) over and over, leading to useless API usage.
- All runners may be saturated by the same task, leaving the controller unable to respond and causing a denial of service.

The fix should therefore exit with an error instead of calling ReadSnapshots (IaaS) forever (after a fixed allocated time, after the first read, ...), as in the sketch below.
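One way to implement that bound, reusing the assumed IaaSClient from the sketch above (names remain illustrative): derive a context with a fixed allocated time so the polling loop is guaranteed to stop and free its runner, even if the incoming call's context is never cancelled.

```go
// waitCompletedBounded polls the snapshot state but gives up after
// maxWait, returning an error instead of reading forever. The CO will
// retry CreateSnapshot and the wait can resume on the same snapshot.
func waitCompletedBounded(ctx context.Context, c IaaSClient, snapshotID string, maxWait time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, maxWait)
	defer cancel()
	ticker := time.NewTicker(3 * time.Second)
	defer ticker.Stop()
	for {
		state, err := c.ReadSnapshotState(ctx, snapshotID)
		if err != nil {
			return err
		}
		if state == "completed" {
			return nil
		}
		select {
		case <-ctx.Done():
			// Fixed allocated time exhausted: fail fast and free the runner.
			return fmt.Errorf("timed out waiting for snapshot %s: %w", snapshotID, ctx.Err())
		case <-ticker.C:
		}
	}
}
```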
Environment
Driver version: <= 1.2.4
Yes, this theoretical bug could affect your implementation if it is based on Outscale's CSI snapshots.
We need to investigate this issue further. cc @outscale-hmi