Closed
Description
Since training usually takes a few hours, and different datasets, objects, or styles call for different training parameters, is there a way to output the .bin every 1000 steps or so?
That would be very useful for checking whether the training is going in a good direction, and you could save GPU time by stopping early once you are happy with the results.
It's very frustrating when you are 1000 steps from completing a 10000-step training run and the runtime disconnects, so hours and hours of training are gone without a checkpoint.
Also, the ability to save and resume the training would be a dream.
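To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not taken from the actual training script: `train`, `save_checkpoint`, `load_checkpoint`, the `latest.bin` file name, and the dummy `weights` value are all made up for illustration, and `pickle` stands in for whatever the real code uses to write the .bin.

```python
import os
import pickle


def save_checkpoint(state, path):
    # Write to a temporary file first, then swap it in, so a disconnect
    # in the middle of a write cannot leave a half-written checkpoint.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    # Return the saved state, or None when there is nothing to resume from.
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def train(max_steps, save_every, out_dir, stop_after=None):
    # Hypothetical training loop: the "weights" update is a placeholder
    # for the real optimizer step.
    os.makedirs(out_dir, exist_ok=True)
    latest = os.path.join(out_dir, "latest.bin")

    # Resume from the last checkpoint if one exists, else start fresh.
    state = load_checkpoint(latest) or {"step": 0, "weights": 0.0}

    while state["step"] < max_steps:
        state["weights"] += 0.5  # stand-in for one training step
        state["step"] += 1

        # Save a numbered snapshot every `save_every` steps (and at the end),
        # plus a "latest" copy that a restarted run can pick up.
        if state["step"] % save_every == 0 or state["step"] == max_steps:
            numbered = os.path.join(out_dir, f"step_{state['step']}.bin")
            save_checkpoint(state, numbered)
            save_checkpoint(state, latest)

        # Only used to simulate a runtime disconnect in a demo.
        if stop_after is not None and state["step"] >= stop_after:
            break

    return state
```

With something like this, a run that dies at step 2500 could be restarted with the same arguments and would continue from the step-2000 checkpoint instead of from zero, and the numbered files could be tested along the way to decide whether to stop early.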