DFineObjectDetector class
keras_hub.models.DFineObjectDetector(
backbone,
num_classes,
bounding_box_format="yxyx",
preprocessor=None,
matcher_class_cost=2.0,
matcher_bbox_cost=5.0,
matcher_ciou_cost=2.0,
use_focal_loss=True,
matcher_alpha=0.25,
matcher_gamma=2.0,
weight_loss_vfl=1.0,
weight_loss_bbox=5.0,
weight_loss_ciou=2.0,
weight_loss_fgl=0.15,
weight_loss_ddf=1.5,
ddf_temperature=5.0,
prediction_decoder=None,
activation=None,
**kwargs
)
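D-FINE object detection model.
This class wraps the DFineBackbone and adds the final prediction and loss computation logic for end-to-end object detection. It is responsible for:
1. Defining the functional model that connects the DFineBackbone to the input layers.
2. Implementing the compute_loss method, which uses a Hungarian matcher to assign predictions to ground-truth targets and computes a weighted sum of multiple loss components (classification, bounding box, etc.).
3. Post-processing the raw outputs of the backbone into final, decoded predictions (boxes, labels, confidence scores) during inference.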
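The matching step in item 2 can be illustrated with a small, dependency-free sketch. It is not the library's implementation: the cost below combines only a class term and an L1 box term (the CIoU term and the focal-style class cost are left out), the helper names are hypothetical, and an exhaustive search over assignments stands in for the Hungarian algorithm, which finds the same minimum-cost assignment more efficiently.

```python
from itertools import permutations

# Default matcher weights taken from the constructor signature.
MATCHER_CLASS_COST = 2.0
MATCHER_BBOX_COST = 5.0


def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_cost(prediction, target):
    """Simplified pairwise cost (assumption: negative class probability plus L1)."""
    class_cost = -prediction["probs"][target["label"]]
    bbox_cost = l1_distance(prediction["box"], target["box"])
    return MATCHER_CLASS_COST * class_cost + MATCHER_BBOX_COST * bbox_cost


def match(predictions, targets):
    """Return one prediction index per target, minimizing the total cost.

    Exhaustive search; only practical for tiny inputs.
    """
    best_assignment, best_total = (), float("inf")
    for candidate in permutations(range(len(predictions)), len(targets)):
        total = sum(
            match_cost(predictions[p], targets[t])
            for t, p in enumerate(candidate)
        )
        if total < best_total:
            best_assignment, best_total = candidate, total
    return list(best_assignment)


# Three query predictions and two ground-truth targets (made-up values).
predictions = [
    {"probs": [0.1, 0.9], "box": [0.5, 0.5, 0.2, 0.2]},
    {"probs": [0.8, 0.2], "box": [0.1, 0.1, 0.1, 0.1]},
    {"probs": [0.5, 0.5], "box": [0.9, 0.9, 0.3, 0.3]},
]
targets = [
    {"label": 0, "box": [0.1, 0.1, 0.1, 0.1]},
    {"label": 1, "box": [0.5, 0.5, 0.2, 0.2]},
]
assignment = match(predictions, targets)
print(assignment)
```

Each target receives exactly one prediction, and the unmatched third query would be treated as background when the loss is computed.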
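Arguments
backbone: A keras_hub.models.Backbone instance, specifically a DFineBackbone, serving as the feature extractor for the object detector.
num_classes: The number of object classes to detect.
bounding_box_format: The format of the bounding boxes. Defaults to "yxyx". Must be a supported format (e.g., "yxyx", "xyxy").
preprocessor: An instance of DFineObjectDetectorPreprocessor for input data preprocessing.
matcher_class_cost: Float. Defaults to 2.0.
matcher_bbox_cost: Float. Defaults to 5.0.
matcher_ciou_cost: Float. Defaults to 2.0.
use_focal_loss: Boolean. Defaults to True.
matcher_alpha: Float. Defaults to 0.25.
matcher_gamma: Float. Defaults to 2.0.
weight_loss_vfl: Float. Defaults to 1.0.
weight_loss_bbox: Float. Defaults to 5.0.
weight_loss_ciou: Float. Defaults to 2.0.
weight_loss_fgl: Float. Defaults to 0.15.
weight_loss_ddf: Float. Defaults to 1.5.
ddf_temperature: Float. Defaults to 5.0.
prediction_decoder: A keras.layers.Layer instance that decodes raw predictions. If not provided, a NonMaxSuppression layer is used.
activation: Defaults to None.
Examples
Creating a DFineObjectDetector without labels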
import numpy as np
from keras_hub.src.models.d_fine.d_fine_object_detector import (
    DFineObjectDetector
)
from keras_hub.src.models.d_fine.d_fine_backbone import DFineBackbone
from keras_hub.src.models.hgnetv2.hgnetv2_backbone import HGNetV2Backbone
# Build the HGNetV2 feature extractor.
hgnetv2_backbone = HGNetV2Backbone(
    stem_channels=[3, 16, 16],
    stackwise_stage_filters=[
        [16, 16, 64, 1, 3, 3],
        [64, 32, 256, 1, 3, 3],
        [256, 64, 512, 2, 3, 5],
        [512, 128, 1024, 1, 3, 5],
    ],
    apply_downsample=[False, True, True, True],
    use_lightweight_conv_block=[False, False, True, True],
    depths=[1, 1, 2, 1],
    hidden_sizes=[64, 256, 512, 1024],
    embedding_size=16,
    use_learnable_affine_block=True,
    hidden_act="relu",
    image_shape=(256, 256, 3),
    out_features=["stage3", "stage4"],
)
# Initialize the backbone without labels.
backbone = DFineBackbone(
    backbone=hgnetv2_backbone,
    decoder_in_channels=[128, 128],
    encoder_hidden_dim=128,
    num_denoising=100,
    num_labels=80,
    hidden_dim=128,
    learn_initial_query=False,
    num_queries=300,
    anchor_image_size=(256, 256),
    feat_strides=[16, 32],
    num_feature_levels=2,
    encoder_in_channels=[512, 1024],
    encode_proj_layers=[1],
    num_attention_heads=8,
    encoder_ffn_dim=512,
    num_encoder_layers=1,
    hidden_expansion=0.34,
    depth_multiplier=0.5,
    eval_idx=-1,
    num_decoder_layers=3,
    decoder_attention_heads=8,
    decoder_ffn_dim=512,
    decoder_n_points=[6, 6],
    lqe_hidden_dim=64,
    num_lqe_layers=2,
    out_features=["stage3", "stage4"],
    image_shape=(256, 256, 3),
)
# Create the detector.
detector = DFineObjectDetector(
    backbone=backbone,
    num_classes=80,
    bounding_box_format="yxyx",
)
Creating a DFineObjectDetector with labels for the backbone
import numpy as np
from keras_hub.src.models.d_fine.d_fine_object_detector import (
    DFineObjectDetector
)
from keras_hub.src.models.d_fine.d_fine_backbone import DFineBackbone
from keras_hub.src.models.hgnetv2.hgnetv2_backbone import HGNetV2Backbone
# Define labels for the backbone.
labels = [
    {
        "boxes": np.array([[0.5, 0.5, 0.2, 0.2], [0.4, 0.4, 0.1, 0.1]]),
        "labels": np.array([1, 10])
    },
    {"boxes": np.array([[0.6, 0.6, 0.3, 0.3]]), "labels": np.array([20])},
]
hgnetv2_backbone = HGNetV2Backbone(
    stem_channels=[3, 16, 16],
    stackwise_stage_filters=[
        [16, 16, 64, 1, 3, 3],
        [64, 32, 256, 1, 3, 3],
        [256, 64, 512, 2, 3, 5],
        [512, 128, 1024, 1, 3, 5],
    ],
    apply_downsample=[False, True, True, True],
    use_lightweight_conv_block=[False, False, True, True],
    depths=[1, 1, 2, 1],
    hidden_sizes=[64, 256, 512, 1024],
    embedding_size=16,
    use_learnable_affine_block=True,
    hidden_act="relu",
    image_shape=(256, 256, 3),
    out_features=["stage3", "stage4"],
)
# Backbone is initialized with labels.
backbone = DFineBackbone(
    backbone=hgnetv2_backbone,
    decoder_in_channels=[128, 128],
    encoder_hidden_dim=128,
    num_denoising=100,
    num_labels=80,
    hidden_dim=128,
    learn_initial_query=False,
    num_queries=300,
    anchor_image_size=(256, 256),
    feat_strides=[16, 32],
    num_feature_levels=2,
    encoder_in_channels=[512, 1024],
    encode_proj_layers=[1],
    num_attention_heads=8,
    encoder_ffn_dim=512,
    num_encoder_layers=1,
    hidden_expansion=0.34,
    depth_multiplier=0.5,
    eval_idx=-1,
    num_decoder_layers=3,
    decoder_attention_heads=8,
    decoder_ffn_dim=512,
    decoder_n_points=[6, 6],
    lqe_hidden_dim=64,
    num_lqe_layers=2,
    out_features=["stage3", "stage4"],
    image_shape=(256, 256, 3),
    labels=labels,
    box_noise_scale=1.0,
    label_noise_ratio=0.5,
)
# Create the detector.
detector = DFineObjectDetector(
    backbone=backbone,
    num_classes=80,
    bounding_box_format="yxyx",
)
Using the detector for training
import numpy as np
from keras_hub.src.models.d_fine.d_fine_object_detector import (
    DFineObjectDetector
)
from keras_hub.src.models.d_fine.d_fine_backbone import DFineBackbone
from keras_hub.src.models.hgnetv2.hgnetv2_backbone import HGNetV2Backbone
# Initialize backbone and detector.
hgnetv2_backbone = HGNetV2Backbone(
    stem_channels=[3, 16, 16],
    stackwise_stage_filters=[
        [16, 16, 64, 1, 3, 3],
        [64, 32, 256, 1, 3, 3],
        [256, 64, 512, 2, 3, 5],
        [512, 128, 1024, 1, 3, 5],
    ],
    apply_downsample=[False, True, True, True],
    use_lightweight_conv_block=[False, False, True, True],
    depths=[1, 1, 2, 1],
    hidden_sizes=[64, 256, 512, 1024],
    embedding_size=16,
    use_learnable_affine_block=True,
    hidden_act="relu",
    image_shape=(256, 256, 3),
    out_features=["stage3", "stage4"],
)
backbone = DFineBackbone(
    backbone=hgnetv2_backbone,
    decoder_in_channels=[128, 128],
    encoder_hidden_dim=128,
    num_denoising=100,
    num_labels=80,
    hidden_dim=128,
    learn_initial_query=False,
    num_queries=300,
    anchor_image_size=(256, 256),
    feat_strides=[16, 32],
    num_feature_levels=2,
    encoder_in_channels=[512, 1024],
    encode_proj_layers=[1],
    num_attention_heads=8,
    encoder_ffn_dim=512,
    num_encoder_layers=1,
    hidden_expansion=0.34,
    depth_multiplier=0.5,
    eval_idx=-1,
    num_decoder_layers=3,
    decoder_attention_heads=8,
    decoder_ffn_dim=512,
    decoder_n_points=[6, 6],
    lqe_hidden_dim=64,
    num_lqe_layers=2,
    out_features=["stage3", "stage4"],
    image_shape=(256, 256, 3),
)
detector = DFineObjectDetector(
    backbone=backbone,
    num_classes=80,
    bounding_box_format="yxyx",
)
# Sample training data.
images = np.random.uniform(
    low=0, high=255, size=(2, 256, 256, 3)
).astype("float32")
bounding_boxes = {
    "boxes": [
        np.array([[10.0, 20.0, 20.0, 30.0], [20.0, 30.0, 30.0, 40.0]]),
        np.array([[15.0, 25.0, 25.0, 35.0]]),
    ],
    "labels": [
        np.array([0, 2]), np.array([1])
    ],
}
# Compile the model.
detector.compile(
    optimizer="adam",
    loss=detector.compute_loss,
)
# Train the model.
detector.fit(x=images, y=bounding_boxes, epochs=1, batch_size=1)
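The detector above is configured with bounding_box_format="yxyx", so each training box is read as (y_min, x_min, y_max, x_max). Annotations are often stored as "xyxy" instead. The sketch below shows the coordinate swap needed in that case; the helper name xyxy_to_yxyx is hypothetical and is not part of keras_hub.

```python
def xyxy_to_yxyx(box):
    """Reorder one box from (x_min, y_min, x_max, y_max) to (y_min, x_min, y_max, x_max)."""
    x_min, y_min, x_max, y_max = box
    return [y_min, x_min, y_max, x_max]


# A box stored as "xyxy" (made-up values).
xyxy_box = [20.0, 10.0, 30.0, 20.0]
yxyx_box = xyxy_to_yxyx(xyxy_box)
print(yxyx_box)
```

Applying the same swap a second time restores the original ordering, so the one helper covers both directions.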
Making predictions
import numpy as np
from keras_hub.src.models.d_fine.d_fine_object_detector import (
    DFineObjectDetector
)
from keras_hub.src.models.d_fine.d_fine_backbone import DFineBackbone
from keras_hub.src.models.hgnetv2.hgnetv2_backbone import HGNetV2Backbone
# Initialize backbone and detector.
hgnetv2_backbone = HGNetV2Backbone(
    stem_channels=[3, 16, 16],
    stackwise_stage_filters=[
        [16, 16, 64, 1, 3, 3],
        [64, 32, 256, 1, 3, 3],
        [256, 64, 512, 2, 3, 5],
        [512, 128, 1024, 1, 3, 5],
    ],
    apply_downsample=[False, True, True, True],
    use_lightweight_conv_block=[False, False, True, True],
    depths=[1, 1, 2, 1],
    hidden_sizes=[64, 256, 512, 1024],
    embedding_size=16,
    use_learnable_affine_block=True,
    hidden_act="relu",
    image_shape=(256, 256, 3),
    out_features=["stage3", "stage4"],
)
backbone = DFineBackbone(
    backbone=hgnetv2_backbone,
    decoder_in_channels=[128, 128],
    encoder_hidden_dim=128,
    num_denoising=100,
    num_labels=80,
    hidden_dim=128,
    learn_initial_query=False,
    num_queries=300,
    anchor_image_size=(256, 256),
    feat_strides=[16, 32],
    num_feature_levels=2,
    encoder_in_channels=[512, 1024],
    encode_proj_layers=[1],
    num_attention_heads=8,
    encoder_ffn_dim=512,
    num_encoder_layers=1,
    hidden_expansion=0.34,
    depth_multiplier=0.5,
    eval_idx=-1,
    num_decoder_layers=3,
    decoder_attention_heads=8,
    decoder_ffn_dim=512,
    decoder_n_points=[6, 6],
    lqe_hidden_dim=64,
    num_lqe_layers=2,
    out_features=["stage3", "stage4"],
    image_shape=(256, 256, 3),
)
detector = DFineObjectDetector(
    backbone=backbone,
    num_classes=80,
    bounding_box_format="yxyx",
)
# Sample test image.
test_image = np.random.uniform(
    low=0, high=255, size=(1, 256, 256, 3)
).astype("float32")
# Make predictions.
predictions = detector.predict(test_image)
# Access predictions.
boxes = predictions["boxes"]  # Shape: (1, 100, 4)
labels = predictions["labels"]  # Shape: (1, 100)
confidence = predictions["confidence"]  # Shape: (1, 100)
num_detections = predictions["num_detections"]  # Shape: (1,)
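The prediction arrays have a fixed number of slots per image, so downstream code usually keeps only the confident entries. The sketch below runs on a hand-built dictionary with the same keys as the output above; the helper filter_detections, the 0.5 threshold, and the assumption that unused slots carry a confidence of -1 are illustrative and not confirmed by this page.

```python
import numpy as np


def filter_detections(predictions, image_index=0, min_confidence=0.5):
    """Keep the detections of one image whose confidence reaches the threshold."""
    boxes = np.asarray(predictions["boxes"])[image_index]
    labels = np.asarray(predictions["labels"])[image_index]
    confidence = np.asarray(predictions["confidence"])[image_index]
    keep = confidence >= min_confidence  # Also drops slots padded with -1.
    return boxes[keep], labels[keep], confidence[keep]


# Stand-in for detector.predict(...) output: one image, three slots,
# the last slot being padding (assumed to be filled with -1).
fake_predictions = {
    "boxes": np.array(
        [[[10.0, 20.0, 50.0, 60.0], [5.0, 5.0, 30.0, 30.0], [-1.0, -1.0, -1.0, -1.0]]]
    ),
    "labels": np.array([[3, 7, -1]]),
    "confidence": np.array([[0.9, 0.3, -1.0]]),
    "num_detections": np.array([2]),
}
kept_boxes, kept_labels, kept_scores = filter_detections(fake_predictions)
print(kept_boxes, kept_labels, kept_scores)
```

With real model output, the same call would be `filter_detections(predictions)`, and the threshold can be tuned per application.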
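from_preset method
DFineObjectDetector.from_preset(preset, load_weights=True, **kwargs)
Instantiate a keras_hub.models.Task from a model preset.
A preset is a directory of configs, weights and other file assets used to save and load a pre-trained model. The preset can be passed as one of:
1. a built-in preset identifier like 'bert_base_en'
2. a Kaggle Models handle like 'kaggle://user/bert/keras/bert_base_en'
3. a Hugging Face handle like 'hf://user/bert_base_en'
4. a path to a local preset directory like './bert_base_en'
For any Task subclass, you can run cls.presets.keys() to list all built-in presets available on the class.
This constructor can be called in one of two ways: either from a task-specific base class like keras_hub.models.CausalLM.from_preset(), or from a model class like keras_hub.models.BertTextClassifier.from_preset(). If called from a base class, the subclass of the returned object is inferred from the config in the preset directory.
Arguments
preset: String. A built-in preset identifier, a Kaggle Models handle, a Hugging Face handle, or a path to a local directory.
load_weights: Boolean. If True, saved weights will be loaded into the model architecture. If False, all weights will be randomly initialized.
Examples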
# Load a Gemma generative task.
causal_lm = keras_hub.models.CausalLM.from_preset(
    "gemma_2b_en",
)
# Load a Bert classification task.
model = keras_hub.models.TextClassifier.from_preset(
    "bert_base_en",
    num_classes=2,
)
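| Preset | Parameters | Description |
|---|---|---|
| dfine_nano_coco | 3.79M | D-FINE Nano model, the smallest variant in the family, pretrained on the COCO dataset. Well suited to applications with limited computational resources. |
| dfine_small_coco | 10.33M | D-FINE Small model pretrained on the COCO dataset. Balances performance and computational efficiency. |
| dfine_small_obj2coco | 10.33M | D-FINE Small model first pretrained on Objects365 and then fine-tuned on COCO, combining broad feature learning with benchmark-specific adaptation. |
| dfine_small_obj365 | 10.62M | D-FINE Small model pretrained on the large-scale Objects365 dataset, improving its ability to recognize a wide variety of objects. |
| dfine_medium_coco | 19.62M | D-FINE Medium model pretrained on the COCO dataset. A solid baseline with strong performance for general-purpose object detection. |
| dfine_medium_obj2coco | 19.62M | D-FINE Medium model trained in two stages: pretraining on Objects365 followed by fine-tuning on COCO. |
| dfine_medium_obj365 | 19.99M | D-FINE Medium model pretrained on the Objects365 dataset. Benefits from a larger and more diverse pretraining corpus. |
| dfine_large_coco | 31.34M | D-FINE Large model pretrained on the COCO dataset. Offers high accuracy and is suitable for more demanding tasks. |
| dfine_large_obj2coco_e25 | 31.34M | D-FINE Large model pretrained on Objects365 and then fine-tuned on COCO for 25 epochs. A high-performance model with specialized tuning. |
| dfine_large_obj365 | 31.86M | D-FINE Large model pretrained on the Objects365 dataset for improved generalization and performance across diverse object categories. |
| dfine_xlarge_coco | 62.83M | D-FINE X-Large model, the largest COCO-pretrained variant in the family, designed for state-of-the-art performance where accuracy is the priority. |
| dfine_xlarge_obj2coco | 62.83M | D-FINE X-Large model pretrained on Objects365 and fine-tuned on COCO, the most powerful model in the family for COCO-style tasks. |
| dfine_xlarge_obj365 | 63.35M | D-FINE X-Large model pretrained on the Objects365 dataset, offering maximum performance by drawing on the large number of object categories seen during pretraining. |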
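Any preset name in the table can be passed to DFineObjectDetector.from_preset. As a rough sketch of choosing one programmatically, the snippet below copies the parameter counts from the table and picks the largest preset that fits a parameter budget. PRESET_PARAMS_M, pick_preset, and load_detector are hypothetical helpers written for this illustration; only from_preset comes from the API described above, and calling it requires keras_hub to be installed and the preset files to be downloadable.

```python
# Parameter counts in millions, copied from the preset table.
PRESET_PARAMS_M = {
    "dfine_nano_coco": 3.79,
    "dfine_small_coco": 10.33,
    "dfine_small_obj2coco": 10.33,
    "dfine_small_obj365": 10.62,
    "dfine_medium_coco": 19.62,
    "dfine_medium_obj2coco": 19.62,
    "dfine_medium_obj365": 19.99,
    "dfine_large_coco": 31.34,
    "dfine_large_obj2coco_e25": 31.34,
    "dfine_large_obj365": 31.86,
    "dfine_xlarge_coco": 62.83,
    "dfine_xlarge_obj2coco": 62.83,
    "dfine_xlarge_obj365": 63.35,
}


def pick_preset(max_params_m, name_suffix="_coco"):
    """Return the largest preset with the given suffix within the budget, or None."""
    candidates = {
        name: params
        for name, params in PRESET_PARAMS_M.items()
        if name.endswith(name_suffix) and params <= max_params_m
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def load_detector(preset_name):
    """Load a pretrained detector; keras_hub is imported lazily on first use."""
    import keras_hub

    return keras_hub.models.DFineObjectDetector.from_preset(preset_name)


chosen = pick_preset(20.0)
print(chosen)
```

Calling `load_detector(chosen)` would then fetch the weights and return a ready-to-use detector.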
backbone property
keras_hub.models.DFineObjectDetector.backbone
A keras_hub.models.Backbone model with the core architecture.
preprocessor property
keras_hub.models.DFineObjectDetector.preprocessor
A keras_hub.models.Preprocessor layer used to preprocess input.