Commit 9429bc6

fix(doc): Improve accurancy of snapshot documentation
Fix various minor errors:

- Drop some specifics on the cgroups v1 disclaimer, because all supported host kernel versions are "5.4+"
- Do not claim that creating a snapshot has no effect on the running VM, because that's not true.
- Cut down on some repeated and confusing information / examples near the end.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
1 parent d1f702d commit 9429bc6

1 file changed: +18 -33 lines changed

docs/snapshotting/snapshot-support.md

Lines changed: 18 additions & 33 deletions
@@ -122,7 +122,7 @@ the feature can be combined with guest_memfd support in Firecracker.

 ### Limitations

-- High snapshot latency on 5.4+ host kernels due to cgroups V1. We strongly
+- High snapshot restoration latency when cgroups V1 are in use. We strongly
 recommend to deploy snapshots on cgroups V2 enabled hosts for the implied
 kernel versions -
 [related issue](https://github.com/firecracker-microvm/firecracker/issues/2129).
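
For the cgroups limitation touched by this hunk, here is a minimal sketch of how a host could be checked for cgroups V2 before deploying snapshots on it. It assumes a Linux host with cgroups mounted at the conventional `/sys/fs/cgroup` and a `stat` that supports `-f -c %T` (GNU coreutils). The helper name `cgroup_version_for_fstype` is hypothetical and not part of Firecracker.

```shell
#!/usr/bin/env bash
# Sketch only: classify the cgroup setup from the filesystem type mounted at
# /sys/fs/cgroup. The function name is illustrative, not a Firecracker tool.
cgroup_version_for_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;      # unified hierarchy (cgroups V2)
    tmpfs)     echo "v1" ;;      # legacy or hybrid layout (cgroups V1 in use)
    *)         echo "unknown" ;; # not mounted, or a non-Linux host
  esac
}

# Fall back to "missing" when stat fails (path absent or unsupported flags).
fstype="$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo missing)"
echo "cgroups: $(cgroup_version_for_fstype "$fstype") (filesystem type: $fstype)"
```

A result of `v1` suggests the host is affected by the restoration latency described above.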
@@ -145,10 +145,11 @@ the feature can be combined with guest_memfd support in Firecracker.
 resumed from snapshot load memory on-demand from the snapshot and
 copy-on-write to anonymous memory.
 - Resuming from a snapshot is optimized for speed, while taking a snapshot
-involves some extra CPU cycles for synchronously writing dirty memory pages to
-the memory snapshot file. Taking a snapshot of a fresh microVM, on which dirty
-pages tracking is not enabled, results in the full contents of guest memory
-being written to the snapshot.
+involves some extra CPU cycles for synchronously writing memory pages to the
+memory snapshot file. Taking a full snapshot of a microVM, on which dirty page
+tracking is not enabled, results in the full contents of guest memory being
+written to the snapshot, and particularly, in all guest memory being faulted
+in.
 - The _memory file_ and _microVM state file_ are generated by Firecracker on
 snapshot creation. The disk contents are _not_ explicitly flushed to their
 backing files.
@@ -354,10 +355,12 @@ Enabling this support enables KVM dirty page tracking, so it comes at a cost
 (which consists of CPU cycles spent by KVM accounting for dirtied pages); it
 should only be used when needed.

-Creating a snapshot will **not** influence state, will **not** stop or end the
-microVM, it can be used as before, so the microVM can be resumed if you still
-want to use it. At this point, in case you plan to continue using the current
-microVM, you should make sure to also copy the disk backing files.
+Creating a snapshot has some minor effects on the currently running microVM:
+
+- The vsock device is [reset](#vsock-device-reset), causing the driver to
+terminate connection on resumption.
+- On x86_64, a notification for KVM-clock is injected to notify the guest about
+being paused.

 ### Resuming the microVM

@@ -382,8 +385,8 @@ ignored (microVM remains in the running state). **Effects**:
 ### Loading snapshots

 If you want to load a snapshot, you can do that only **before** the microVM is
-configured (the only resources that can be configured prior are the Logger and
-the Metrics systems) by sending the following API command:
+configured (the only resources that can be configured prior are the logger and
+the metrics systems) by sending the following API command:

 ```bash
 curl --unix-socket /tmp/firecracker.socket -i \
@@ -470,28 +473,10 @@ to the new Firecracker process as they were to the original one.
 - _on failure_: A specific error is reported and then the current Firecracker
 process is ended (as it might be in an invalid state).

-*Notes*: Please, keep in mind that only by setting to true
-`enable_diff_snapshots`, when loading a snapshot, or `track_dirty_pages`, when
-configuring the machine on a fresh microVM, you can then create a `diff`
-snapshot. Also, `track_dirty_pages` is not saved when creating a snapshot, so
-you need to explicitly set `enable_diff_snapshots` when sending
-`LoadSnapshot`command if you want to be able to do diff snapshots from a loaded
-microVM. Another thing that you should be aware of is the following: if a fresh
-microVM can create diff snapshots, then if you create a **full** snapshot, the
-memory file contains the whole guest memory, while if you create a **diff** one,
-that file is sparse and only contains the guest dirtied pages. With these in
-mind, some possible snapshotting scenarios are the following:
-
-- `Boot from a fresh microVM` -> `Pause` -> `Create snapshot` -> `Resume` ->
-`Pause` -> `Create snapshot` -> ... ;
-- `Boot from a fresh microVM` -> `Pause` -> `Create snapshot` -> `Resume` ->
-`Pause` -> `Resume` -> ... -> `Pause` -> `Create snapshot` -> ... ;
-- `Load snapshot` -> `Resume` -> `Pause` -> `Create snapshot` -> `Resume` ->
-`Pause` -> `Create snapshot` -> ... ;
-- `Load snapshot` -> `Resume` -> `Pause` -> `Create snapshot` -> `Resume` ->
-`Pause` -> `Resume` -> ... -> `Pause` -> `Create snapshot` -> ... ; where
-`Create snapshot` can refer to either a full or a diff snapshot for all the
-aforementioned flows.
+*Notes*: The `track_dirty_pages` configuration is not saved when creating a
+snapshot, so you need to explicitly set `track_dirty_pages` again when sending
+the `LoadSnapshot` command if you want to be able to do dirty page tracking
+based diff snapshots from a loaded microVM.

 It is also worth knowing, a microVM that is restored from snapshot will be
 resumed with the guest OS wall-clock continuing from the moment of the snapshot
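
The rewritten note in the last hunk says `track_dirty_pages` has to be set again on `LoadSnapshot`. A minimal sketch of such a request follows. The `track_dirty_pages` field and the socket path come from the diff above; the endpoint `/snapshot/load` and the other body fields (`snapshot_path`, `mem_backend`, `resume_vm`) are assumptions based on Firecracker's snapshot API and should be checked against the API specification for the version in use. The file paths and the helper name `load_snapshot_body` are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: load a snapshot with dirty page tracking re-enabled.
# Paths are placeholders; fields other than track_dirty_pages are assumptions.
API_SOCKET="${API_SOCKET:-/tmp/firecracker.socket}"
SNAPSHOT_PATH="${SNAPSHOT_PATH:-./snapshot_file}"
MEM_FILE_PATH="${MEM_FILE_PATH:-./mem_file}"

# Build the JSON request body (hypothetical helper, not a Firecracker tool).
load_snapshot_body() {
  cat <<EOF
{
  "snapshot_path": "${SNAPSHOT_PATH}",
  "mem_backend": {
    "backend_path": "${MEM_FILE_PATH}",
    "backend_type": "File"
  },
  "track_dirty_pages": true,
  "resume_vm": false
}
EOF
}

# Only send the request when a Firecracker API socket is actually present.
if [ -S "${API_SOCKET}" ]; then
  curl --unix-socket "${API_SOCKET}" -i \
    -X PUT 'http://localhost/snapshot/load' \
    -H 'Accept: application/json' \
    -H 'Content-Type: application/json' \
    -d "$(load_snapshot_body)"
fi
```

Without `"track_dirty_pages": true` in this request, the note above implies that dirty page tracking based diff snapshots would not be available on the restored microVM, even if tracking was enabled on the original one.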
