Description
Excited by the opportunities that Stable Diffusion has made for use with consumer hardware, I've started working with the code. I was looking to get an idea of what I'd need to do for something like #277 and studying the StableDiffusionPipeline class.
I'm concerned by the use of isinstance(self.scheduler, LMSDiscreteScheduler)
. If you can't implement a new Scheduler without meddling with the internal implementation of the Pipeline, that points to some breakdown in the API design. We don't want growing if isintance
trees that grow with each New Scheduler PR.
It also introduces an ordering dependency between components that weren't otherwise entangled:
It wasn't obvious to me at first glance, but scheduler.set_timesteps
modifies scheduler.sigmas
. So now when working with that code we have to remember that we have to do that scheduler initialization before we finish preparing the initial latents, whereas those were independent operations before.
Another worrying comment is this:
If "not all schedulers share the same signature," it's much harder to make and use Schedulers.
Suggestions
LMSDiscreteScheduler's step
function seems to want the step number instead of the time. Options:
- Change the signature of
Scheduler.step
to always include the index as well as the time. (It's readily available and cheap to have on the stack.) - Since LMS is linear (right there in the name!), could it take the usual
timestep
value and easily convert it back to its index value?
For DDIMScheduler's extra eta
parameter:
The **kwargs
on StableDiffusionPipeline.__call__
seems to be unused, and eta
seems to be passed straight through to scheduler.step
with no other interaction. We could take eta
out of the formal parameter list and rename kwargs
to extra_step_kwargs
.
I haven't yet looked at other uses of Scheduler outside of this one pipeline, but those are my initial thoughts.