Negative prompts are an extension of classifier-free guidance. Recall that this guidance is implemented in the pred_noise method of StableDiffusion:
StableDiffusion.pred_noise?
Signature: StableDiffusion.pred_noise(self, prompt_embedding, l, t, guidance_scale)
Docstring: <no docstring>
File: ~/Desktop/SlowAI/nbs/slowai/overview.py
Type: function
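The body of pred_noise is not shown here, so as a reminder, here is a minimal sketch of the standard classifier-free guidance combination it is assumed to perform. The function name `classifier_free_guidance` and the random arrays standing in for the UNet output are illustrative assumptions, and NumPy is used in place of torch so the snippet runs without loading the model.

```python
import numpy as np


def classifier_free_guidance(noise_uncond, noise_cond, guidance_scale):
    """Move the unconditional noise prediction toward the conditional one."""
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)


rng = np.random.default_rng(0)
# Stand-in for the UNet output on a batch of [unconditional, conditional] inputs
noise = rng.normal(size=(2, 4, 64, 64))
noise_uncond, noise_cond = noise[0:1], noise[1:2]

guided = classifier_free_guidance(noise_uncond, noise_cond, guidance_scale=7.5)
print(guided.shape)  # (1, 4, 64, 64)
```

With a guidance scale of 1 this reduces to the conditional prediction, and with a scale of 0 it reduces to the unconditional one.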
Let’s define a helper function to load StableDiffusion, as in the “Overview” notebook.
get_stable_diffusion
def get_stable_diffusion(
cls:type = StableDiffusion
):
get_simple_pipe
sd = get_stable_diffusion()
sd(
    prompt="a photo of a giraffe in Paris",
    guidance_scale=7.5,
    as_pil=True,
)
100%|██████████| 30/30 [00:04<00:00, 7.38it/s]
prompt_embedding is a rank-3 tensor of shape batch_size x seq_len x channels, where the batch size is 2 because it is the concatenation of the unconditional prompt embedding and the conditional prompt embedding.
sd.embed_prompt("a photo of a giraffe in paris").shape
We want to embed the negative prompt as well and run all three embeddings through the denoising UNet in a single forward pass. This brings the batch size to 3.
StableDiffusionWithNegativePromptA
def StableDiffusionWithNegativePromptA(
tokenizer:CLIPTokenizer, text_encoder:CLIPTextModel, scheduler:Any, unet:UNet2DConditionModel, vae:AutoencoderKL
)-> None :
sd = get_stable_diffusion(StableDiffusionWithNegativePromptA)
embedding = sd.embed_prompt("a photo of a giraffe in paris", "blurry")
embedding.shape
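To make the batching concrete, here is a rough sketch of how the three embeddings might be stacked. Everything in it is an assumption for illustration: `fake_embed` is a hypothetical stand-in for the tokenizer and text encoder, the sizes 77 and 768 are assumed CLIP dimensions, and the ordering unconditional, conditional, negative is one possible convention rather than necessarily the one used by StableDiffusionWithNegativePromptA.

```python
import numpy as np

SEQ_LEN, CHANNELS = 77, 768  # assumed text encoder sizes, for illustration only


def fake_embed(text):
    """Hypothetical stand-in for the text encoder: (1, seq_len, channels) per prompt."""
    rng = np.random.default_rng(sum(text.encode()))  # deterministic per prompt
    return rng.normal(size=(1, SEQ_LEN, CHANNELS))


def embed_prompt(prompt, negative_prompt):
    """Stack unconditional, conditional, and negative embeddings along the batch axis."""
    uncond = fake_embed("")
    cond = fake_embed(prompt)
    neg = fake_embed(negative_prompt)
    return np.concatenate([uncond, cond, neg], axis=0)


embedding = embed_prompt("a photo of a giraffe in paris", "blurry")
print(embedding.shape)  # (3, 77, 768)
```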
Next, we need to rewrite most of the denoising method so that it incorporates the negative guidance, with separate guidance scales for the positive and negative prompts.
StableDiffusionWithNegativePromptB
def StableDiffusionWithNegativePromptB(
tokenizer:CLIPTokenizer, text_encoder:CLIPTextModel, scheduler:Any, unet:UNet2DConditionModel, vae:AutoencoderKL
)-> None :
sd = get_stable_diffusion(StableDiffusionWithNegativePromptB)
embedding = sd.embed_prompt("a photo of a giraffe in paris", "blurry")
l = sd.init_latents()
epsilon = sd.pred_noise(embedding, l, t=0, guidance_scale_pos=7.5, guidance_scale_neg=2)
epsilon.shape
torch.Size([1, 4, 64, 64])
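The combination step inside the rewritten pred_noise is not shown, so the following is only a sketch of one common way to do it: push the prediction toward the prompt and away from the negative prompt, both measured relative to the unconditional prediction. The function name `guide_with_negative`, the batch ordering, and the exact formula are assumptions, and NumPy arrays stand in for the UNet output.

```python
import numpy as np


def guide_with_negative(noise, guidance_scale_pos, guidance_scale_neg):
    """Combine a batch of [unconditional, conditional, negative] noise predictions."""
    noise_uncond, noise_cond, noise_neg = np.split(noise, 3, axis=0)
    toward_prompt = noise_cond - noise_uncond    # direction of the prompt
    toward_negative = noise_neg - noise_uncond   # direction of the negative prompt
    return (
        noise_uncond
        + guidance_scale_pos * toward_prompt
        - guidance_scale_neg * toward_negative
    )


rng = np.random.default_rng(0)
noise = rng.normal(size=(3, 4, 64, 64))  # stand-in for the UNet output
epsilon = guide_with_negative(noise, guidance_scale_pos=7.5, guidance_scale_neg=2)
print(epsilon.shape)  # (1, 4, 64, 64)
```

Under this formulation, setting guidance_scale_neg to 0 recovers ordinary classifier-free guidance, which is a convenient sanity check.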
Finally, we incorporate the negative prompt into the class API.
StableDiffusionWithNegativePromptC
def StableDiffusionWithNegativePromptC(
tokenizer:CLIPTokenizer, text_encoder:CLIPTextModel, scheduler:Any, unet:UNet2DConditionModel, vae:AutoencoderKL
)-> None :
sd = get_stable_diffusion(StableDiffusionWithNegativePromptC)
sd(
    prompt="a photo of a labrador dog",
    negative_prompt="park, greenery, plants, flowers",
    guidance_scale=7.5,
    neg_guidance_scale=5,
    as_pil=True,
)
100%|██████████| 30/30 [00:06<00:00, 4.98it/s]
sd = get_stable_diffusion(StableDiffusionWithNegativePromptC)
sd(
    prompt="a photo of a labrador dog in a park",
    negative_prompt="greenery, plants, flowers",
    guidance_scale=7.5,
    neg_guidance_scale=5,
    as_pil=True,
)
100%|██████████| 30/30 [00:06<00:00, 4.97it/s]