StableDiffusion image-generation model

`StableDiffusion` class

keras_cv.models.StableDiffusion(img_height=512, img_width=512, jit_compile=True)

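A Keras implementation of Stable Diffusion.

Note that the StableDiffusion API, as well as the APIs of its sub-components (e.g. ImageEncoder, DiffusionModel), should be considered unstable at this point. We do not guarantee backwards compatibility for future changes to these APIs.

Stable Diffusion is a powerful image-generation model that can be used, among other things, to generate images from short text descriptions (called "prompts").

Arguments

img_height: int, height of the images to generate, in pixels. Only multiples of 128 are supported; the value provided is rounded to the nearest valid value. Defaults to 512.
img_width: int, width of the images to generate, in pixels. Only multiples of 128 are supported; the value provided is rounded to the nearest valid value. Defaults to 512.
jit_compile: bool, whether to compile the underlying models to XLA. This can lead to a significant speedup on some systems. Defaults to True.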
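To make the rounding of img_height and img_width concrete, here is a minimal sketch of nearest-multiple rounding. The helper name `round_to_multiple` is hypothetical and is not part of KerasCV, and the use of Python's built-in `round()` is an assumption; this page does not state how the library resolves exact ties.

```python
def round_to_multiple(value: int, multiple: int = 128) -> int:
    """Round `value` to the nearest multiple of `multiple`.

    Hypothetical helper for illustration only. Python's round() sends
    exact ties to the even multiple (e.g. 320 -> 256, not 384).
    """
    return round(value / multiple) * multiple


requested = [500, 512, 600, 700]
rounded = [round_to_multiple(v) for v in requested]
print(rounded)  # [512, 512, 640, 640]
```

Under this assumption, a request for a 500 x 700 image would be generated at 512 x 640. Passing sizes that are already multiples of 128 avoids any surprise.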

Example

from keras_cv.models import StableDiffusion
from PIL import Image

model = StableDiffusion(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=25,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")
print("saved at horse.png")

References

[source]

`StableDiffusionV2` class

keras_cv.models.StableDiffusionV2(img_height=512, img_width=512, jit_compile=True)

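A Keras implementation of Stable Diffusion v2.

Note that the StableDiffusion API, as well as the APIs of the StableDiffusionV2 sub-components (e.g. ImageEncoder, DiffusionModelV2), should be considered unstable at this point. We do not guarantee backwards compatibility for future changes to these APIs.

Stable Diffusion is a powerful image-generation model that can be used, among other things, to generate images from short text descriptions (called "prompts").

Arguments

img_height: int, height of the images to generate, in pixels. Only multiples of 128 are supported; the value provided is rounded to the nearest valid value. Defaults to 512.
img_width: int, width of the images to generate, in pixels. Only multiples of 128 are supported; the value provided is rounded to the nearest valid value. Defaults to 512.
jit_compile: bool, whether to compile the underlying models to XLA. This can lead to a significant speedup on some systems. Defaults to True.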

Example

from keras_cv.models import StableDiffusionV2
from PIL import Image

model = StableDiffusionV2(img_height=512, img_width=512, jit_compile=True)
img = model.text_to_image(
    prompt="A beautiful horse running through a field",
    batch_size=1,  # How many images to generate at once
    num_steps=25,  # Number of iterations (controls image quality)
    seed=123,  # Set this to always get the same image from the same prompt
)
Image.fromarray(img[0]).save("horse.png")
print("saved at horse.png")

References

[source]

`Decoder` class

keras_cv.models.stable_diffusion.Decoder(
    img_height, img_width, name=None, download_weights=True
)

Sequential groups a linear stack of layers into a Model.

Examples

import keras
import numpy as np

model = keras.Sequential()
model.add(keras.Input(shape=(16,)))
model.add(keras.layers.Dense(8))

# Note that you can also omit the initial `Input`.
# In that case the model doesn't have any weights until the first call
# to a training/evaluation method (since it isn't yet built):
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(4))
# model.weights not created yet

# Whereas if you specify an `Input`, the model gets built
# continuously as you are adding layers:
model = keras.Sequential()
model.add(keras.Input(shape=(16,)))
model.add(keras.layers.Dense(8))
len(model.weights)  # Returns "2"

# When using the delayed-build pattern (no input shape specified), you can
# choose to manually build your model by calling
# `build(batch_input_shape)`:
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(4))
model.build((None, 16))
len(model.weights)  # Returns "4"

# Note that when using the delayed-build pattern (no input shape specified),
# the model gets built the first time you call `fit`, `evaluate`, or
# `predict`, or the first time you call the model on some input data.
# Placeholder random data so that the `fit` call below can run:
x = np.random.rand(64, 16)
y = np.random.rand(64, 1)
model = keras.Sequential()
model.add(keras.layers.Dense(8))
model.add(keras.layers.Dense(1))
model.compile(optimizer='sgd', loss='mse')
# This builds the model for the first time:
model.fit(x, y, batch_size=32, epochs=10)

[source]

`DiffusionModel` class

keras_cv.models.stable_diffusion.DiffusionModel(
    img_height, img_width, max_text_length, name=None, download_weights=True
)

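A model grouping layers into an object with training/inference features.

There are three ways to instantiate a Model:

With the "Functional API"

You start from Input, chain layer calls to specify the model's forward pass, and finally create your model from the inputs and outputs: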

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

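Note: Only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of the model.

Example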

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

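Note that the backbone and activations models are not created with keras.Input objects, but with tensors that originate from keras.Input objects. Under the hood, the layers and weights are shared across these models, so you can train full_model and then use backbone or activations for feature extraction. The inputs and outputs of the model can also be nested structures of tensors, and the created models are standard Functional API models that support all the existing APIs.

By subclassing the `Model` class

In that case, you should define your layers in __init__() and implement the model's forward pass in call().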

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally give call() a training argument (boolean), which you can use to specify different behavior in training and inference:

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()

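Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

With the `Sequential` class

In addition, keras.Sequential is a special case of a model in which the model is purely a stack of single-input, single-output layers.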

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])

[source]

`ImageEncoder` class

keras_cv.models.stable_diffusion.ImageEncoder(download_weights=True)

ImageEncoder is the VAE encoder for StableDiffusion.

[source]

`NoiseScheduler` class

keras_cv.models.stable_diffusion.NoiseScheduler(
    train_timesteps=1000,
    beta_start=0.0001,
    beta_end=0.02,
    beta_schedule="linear",
    variance_type="fixed_small",
    clip_sample=True,
)

Arguments

train_timesteps: number of diffusion steps used to train the model.
beta_start: the starting `beta` value of inference.
beta_end: the final `beta` value.
beta_schedule: the beta schedule, a mapping from a beta range to a
    sequence of betas for stepping the model. Choose from `linear` or
    `quadratic`.
variance_type: options to clip the variance used when adding noise to
    the de-noised sample. Choose from `fixed_small`, `fixed_small_log`,
    `fixed_large`, `fixed_large_log`, `learned` or `learned_range`.
clip_sample: option to clip predicted sample between -1 and 1 for
    numerical stability.
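As a rough illustration of how beta_start, beta_end, train_timesteps and beta_schedule interact, the sketch below builds both schedules in plain Python. The function names `make_betas` and `cumulative_alphas` are hypothetical and not part of KerasCV. The definition of `quadratic` used here (interpolate the square roots linearly, then square) is an assumption based on the convention common in DDPM-style schedulers; this page does not spell it out.

```python
import math


def make_betas(train_timesteps=1000, beta_start=0.0001, beta_end=0.02,
               beta_schedule="linear"):
    """Return a list of `train_timesteps` betas (illustrative sketch only)."""
    if train_timesteps < 2:
        raise ValueError("train_timesteps must be at least 2")

    def linspace(start, stop, num):
        # Evenly spaced values from start to stop, both endpoints included.
        step = (stop - start) / (num - 1)
        return [start + step * i for i in range(num)]

    if beta_schedule == "linear":
        return linspace(beta_start, beta_end, train_timesteps)
    if beta_schedule == "quadratic":
        # Assumed convention: linear in sqrt(beta), then squared.
        roots = linspace(math.sqrt(beta_start), math.sqrt(beta_end),
                         train_timesteps)
        return [r ** 2 for r in roots]
    raise ValueError(f"Unknown beta_schedule: {beta_schedule!r}")


def cumulative_alphas(betas):
    """Running product of (1 - beta): the fraction of signal kept at each step."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


linear = make_betas()
quadratic = make_betas(beta_schedule="quadratic")
alphas_cumprod = cumulative_alphas(linear)
print(len(linear), linear[0], linear[-1])
```

Both schedules start at beta_start and end at beta_end; under the assumed definition, the quadratic schedule stays below the linear one in between, so it adds noise more slowly at early timesteps.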

[source]

`SimpleTokenizer` class

keras_cv.models.stable_diffusion.SimpleTokenizer(bpe_path=None)

[source]

`TextEncoder` class

keras_cv.models.stable_diffusion.TextEncoder(
    max_length, vocab_size=49408, name=None, download_weights=True
)

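A model grouping layers into an object with training/inference features.

There are three ways to instantiate a Model:

With the "Functional API"

You start from Input, chain layer calls to specify the model's forward pass, and finally create your model from the inputs and outputs: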

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

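Note: Only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of the model.

Example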

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

By subclassing the `Model` class

In that case, you should define your layers in __init__() and implement the model's forward pass in call().

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally give call() a training argument (boolean), which you can use to specify different behavior in training and inference:

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()

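Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

With the `Sequential` class

In addition, keras.Sequential is a special case of a model in which the model is purely a stack of single-input, single-output layers.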

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])

[source]

`TextEncoderV2` class

keras_cv.models.stable_diffusion.TextEncoderV2(
    max_length, vocab_size=49408, name=None, download_weights=True
)

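A model grouping layers into an object with training/inference features.

There are three ways to instantiate a Model:

With the "Functional API"

You start from Input, chain layer calls to specify the model's forward pass, and finally create your model from the inputs and outputs: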

inputs = keras.Input(shape=(37,))
x = keras.layers.Dense(32, activation="relu")(inputs)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

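Note: Only dicts, lists, and tuples of input tensors are supported. Nested inputs (e.g. lists of lists or dicts of dicts) are not supported.

A new Functional API model can also be created from intermediate tensors. This lets you quickly extract sub-components of the model.

Example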

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=128, height=128)(inputs)
conv = keras.layers.Conv2D(filters=32, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

By subclassing the `Model` class

In that case, you should define your layers in __init__() and implement the model's forward pass in call().

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally give call() a training argument (boolean), which you can use to specify different behavior in training and inference:

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(32, activation="relu")
        self.dense2 = keras.layers.Dense(5, activation="softmax")
        self.dropout = keras.layers.Dropout(0.5)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()

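Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

With the `Sequential` class

In addition, keras.Sequential is a special case of a model in which the model is purely a stack of single-input, single-output layers.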

model = keras.Sequential([
    keras.Input(shape=(None, None, 3)),
    keras.layers.Conv2D(filters=32, kernel_size=3),
])
