Keras Applications

Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

Weights are downloaded automatically when instantiating a model. They are stored at ~/.keras/models/.

Upon instantiation, the models will be built according to the image data format set in your Keras configuration file at ~/.keras/keras.json. For instance, if you have set image_data_format=channels_last, then any model loaded from this repository will get built according to the data format convention "Height-Width-Depth".
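
As a minimal sketch of this behavior (the choice of Xception and the listing of the cache directory are illustrative only, and the first call assumes network access for the weight download):

import os
import keras

# instantiating an application model downloads its ImageNet weights on first use
model = keras.applications.Xception(weights='imagenet')

# the active image data format comes from ~/.keras/keras.json
print(keras.config.image_data_format())  # e.g. 'channels_last'

# downloaded weight files are cached under ~/.keras/models/
print(os.listdir(os.path.expanduser('~/.keras/models')))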

Available models

| Model | Size (MB) | Top-1 Accuracy | Top-5 Accuracy | Parameters | Depth | Time (ms) per inference step (CPU) | Time (ms) per inference step (GPU) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Xception | 88 | 79.0% | 94.5% | 22.9M | 81 | 109.4 | 8.1 |
| VGG16 | 528 | 71.3% | 90.1% | 138.4M | 16 | 69.5 | 4.2 |
| VGG19 | 549 | 71.3% | 90.0% | 143.7M | 19 | 84.8 | 4.4 |
| ResNet50 | 98 | 74.9% | 92.1% | 25.6M | 107 | 58.2 | 4.6 |
| ResNet50V2 | 98 | 76.0% | 93.0% | 25.6M | 103 | 45.6 | 4.4 |
| ResNet101 | 171 | 76.4% | 92.8% | 44.7M | 209 | 89.6 | 5.2 |
| ResNet101V2 | 171 | 77.2% | 93.8% | 44.7M | 205 | 72.7 | 5.4 |
| ResNet152 | 232 | 76.6% | 93.1% | 60.4M | 311 | 127.4 | 6.5 |
| ResNet152V2 | 232 | 78.0% | 94.2% | 60.4M | 307 | 107.5 | 6.6 |
| InceptionV3 | 92 | 77.9% | 93.7% | 23.9M | 189 | 42.2 | 6.9 |
| InceptionResNetV2 | 215 | 80.3% | 95.3% | 55.9M | 449 | 130.2 | 10.0 |
| MobileNet | 16 | 70.4% | 89.5% | 4.3M | 55 | 22.6 | 3.4 |
| MobileNetV2 | 14 | 71.3% | 90.1% | 3.5M | 105 | 25.9 | 3.8 |
| DenseNet121 | 33 | 75.0% | 92.3% | 8.1M | 242 | 77.1 | 5.4 |
| DenseNet169 | 57 | 76.2% | 93.2% | 14.3M | 338 | 96.4 | 6.3 |
| DenseNet201 | 80 | 77.3% | 93.6% | 20.2M | 402 | 127.2 | 6.7 |
| NASNetMobile | 23 | 74.4% | 91.9% | 5.3M | 389 | 27.0 | 6.7 |
| NASNetLarge | 343 | 82.5% | 96.0% | 88.9M | 533 | 344.5 | 20.0 |
| EfficientNetB0 | 29 | 77.1% | 93.3% | 5.3M | 132 | 46.0 | 4.9 |
| EfficientNetB1 | 31 | 79.1% | 94.4% | 7.9M | 186 | 60.2 | 5.6 |
| EfficientNetB2 | 36 | 80.1% | 94.9% | 9.2M | 186 | 80.8 | 6.5 |
| EfficientNetB3 | 48 | 81.6% | 95.7% | 12.3M | 210 | 140.0 | 8.8 |
| EfficientNetB4 | 75 | 82.9% | 96.4% | 19.5M | 258 | 308.3 | 15.1 |
| EfficientNetB5 | 118 | 83.6% | 96.7% | 30.6M | 312 | 579.2 | 25.3 |
| EfficientNetB6 | 166 | 84.0% | 96.8% | 43.3M | 360 | 958.1 | 40.4 |
| EfficientNetB7 | 256 | 84.3% | 97.0% | 66.7M | 438 | 1578.9 | 61.6 |
| EfficientNetV2B0 | 29 | 78.7% | 94.3% | 7.2M | - | - | - |
| EfficientNetV2B1 | 34 | 79.8% | 95.0% | 8.2M | - | - | - |
| EfficientNetV2B2 | 42 | 80.5% | 95.1% | 10.2M | - | - | - |
| EfficientNetV2B3 | 59 | 82.0% | 95.8% | 14.5M | - | - | - |
| EfficientNetV2S | 88 | 83.9% | 96.7% | 21.6M | - | - | - |
| EfficientNetV2M | 220 | 85.3% | 97.4% | 54.4M | - | - | - |
| EfficientNetV2L | 479 | 85.7% | 97.5% | 119.0M | - | - | - |
| ConvNeXtTiny | 109.42 | 81.3% | - | 28.6M | - | - | - |
| ConvNeXtSmall | 192.29 | 82.3% | - | 50.2M | - | - | - |
| ConvNeXtBase | 338.58 | 85.3% | - | 88.5M | - | - | - |
| ConvNeXtLarge | 755.07 | 86.3% | - | 197.7M | - | - | - |
| ConvNeXtXLarge | 1310 | 86.7% | - | 350.1M | - | - | - |

The Top-1 and Top-5 accuracy refers to the model's performance on the ImageNet validation dataset.

Depth refers to the topological depth of the network. This includes activation layers, batch normalization layers, etc.

Time per inference step is the average of 30 batches and 10 repetitions.

  • CPU: AMD EPYC Processor (with IBPB) (92 cores)
  • RAM: 1.7T
  • GPU: Tesla A100
  • Batch size: 32

Depth counts the number of layers with parameters.
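
As a rough cross-check of the Parameters and Depth columns, here is a short sketch (MobileNetV2 is picked arbitrarily; the layer count can differ slightly from the table depending on how layers are enumerated):

import keras

model = keras.applications.MobileNetV2(weights='imagenet')

# total number of parameters (the "Parameters" column; ~3.5M for MobileNetV2)
print(f"parameters: {model.count_params():,}")

# number of layers that carry weights (roughly the "Depth" column)
layers_with_params = sum(1 for layer in model.layers if layer.weights)
print(f"layers with parameters: {layers_with_params}")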


Usage examples for image classification models

Classify ImageNet classes with ResNet50

import keras
from keras.applications.resnet50 import ResNet50
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]

Extract features with VGG16

import keras
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import numpy as np

model = VGG16(weights='imagenet', include_top=False)

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

features = model.predict(x)
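# for a 224x224 RGB input, `features` has shape (1, 7, 7, 512):
# the output of VGG16's final pooling layer (block5_pool), since include_top=False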

Extract features from an arbitrary intermediate layer with VGG19

import keras
from keras.applications.vgg19 import VGG19
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

img_path = 'elephant.jpg'
img = keras.utils.load_img(img_path, target_size=(224, 224))
x = keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

block4_pool_features = model.predict(x)
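# for a 224x224 RGB input, `block4_pool_features` has shape (1, 14, 14, 512)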

Fine-tune InceptionV3 on a new set of classes

from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit(...)

# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit(...)
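
The model.fit(...) calls above are placeholders. One way the data could be supplied is sketched below, assuming a hypothetical train_dir folder with one subfolder per class (200 classes, to match the Dense(200) head above); keras.utils.image_dataset_from_directory returns a tf.data.Dataset, so a TensorFlow-based input pipeline is assumed:

import keras
from keras.applications.inception_v3 import preprocess_input

# hypothetical layout: train_dir/<class_name>/*.jpg, one subfolder per class
train_ds = keras.utils.image_dataset_from_directory(
    'train_dir',
    label_mode='categorical',   # matches the categorical_crossentropy loss above
    image_size=(299, 299),      # InceptionV3's default input size
    batch_size=32,
)

# apply the InceptionV3-specific preprocessing to each batch of images
train_ds = train_ds.map(lambda images, labels: (preprocess_input(images), labels))

model.fit(train_ds, epochs=5)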

Build InceptionV3 over a custom input tensor

from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input

# this could also be the output of a different Keras model or layer
input_tensor = Input(shape=(299, 299, 3))  # InceptionV3 with include_top=True and imagenet weights requires 299x299 inputs

model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=True)