Introduction

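This demo addresses the task of segmenting ships in radar imagery. Such images are rich in information and make it possible to observe objects regardless of the time of day or the weather, but interpreting them by hand takes considerable effort and expertise. Neural-network methods can significantly speed up processing and improve detection accuracy.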

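A U-Net architecture with a ResNet-18 backbone was chosen because it captures both local detail and global context, which is especially important when working with the sea surface and small objects.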

Importing libraries and configuring paths

using Pkg
# Pkg.add("DataLoaders")

using FileIO, Images
using ImageTransformations: imresize, Linear
using ProgressMeter, PyCall
using Printf, Dates
using CUDA
using Flux, Metalhead
using BSON: @save, @load
using Statistics

Set the paths to the training and test data (images and masks):

train_dir_imgs  = raw"data/train/imgs"
train_dir_masks = raw"data/train/masks"
test_dir_imgs   = raw"data/test/imgs"
test_dir_masks  = raw"data/test/masks";

Training hyperparameters

Here we set the input size (256, 256), the batch size, and the learning rate.

res = (256, 256)          
batch_size = 2
learning_rate = 1e-3

0.001

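Dataset

ImageDataset is a struct that stores the paths to the images and masks.
The getindex method loads an image and its mask, resizes both to the required size, and converts them to Float32 arrays. The mask is binarized and encoded as two channels (background and ship).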

function my_collate(samples)
    xs = first.(samples)
    ys = last.(samples)
    return (cat(xs...; dims=4),  
            cat(ys...; dims=4))   
end

my_collate (generic function with 1 method)

struct ImageDataset
    imgs::Vector{String}
    masks::Vector{String}
end

ImageDataset(folder_imgs::String, folder_masks::String) =
    ImageDataset(readdir(folder_imgs; join=true), readdir(folder_masks; join=true))

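function Base.getindex(ds::ImageDataset, i::Int)
    img = Gray.(load(ds.imgs[i]))                 # load the image as grayscale
    img = imresize(img, res; method=Linear())     # resize to the network input size
    x = Float32.(img)                             # convert to Float32
    H, W = size(x)
    x = reshape(x, H, W, 1)                       # add the channel dimension: H×W×1

    msk = Gray.(load(ds.masks[i]))                # load the mask as grayscale
    msk = imresize(msk, res)                      # resize to match the image
    m = Float32.(msk .> 0)                        # binarize: 1 = ship, 0 = background

    # Two-channel one-hot target: channel 1 = background, channel 2 = ship
    y = cat(1f0 .- m, m; dims=3)

    return x, y
end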

Base.length(ds::ImageDataset) = length(ds.imgs)
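
The two-channel target built in getindex can be checked on a tiny hand-made mask. The snippet below is a minimal sketch that uses only Base Julia; the 2×2 array `toy_mask` is a made-up stand-in for a resized mask, not data from the dataset.

```julia
# Hypothetical 2×2 "mask" with soft values, as might appear after resizing.
toy_mask = Float32[0.0 0.2; 0.0 1.0]

# Binarize: any value above zero counts as "ship".
m = Float32.(toy_mask .> 0)

# Stack background (1 - m) and ship (m) along a third dimension.
y = cat(1f0 .- m, m; dims=3)

println(size(y))        # spatial size plus two channels
println(y[:, :, 2])     # the ship channel equals the binarized mask
```

Because the two channels are complementary, every pixel sums to one across the channel dimension, which is the form that a softmax cross-entropy loss expects.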

Create train_loader and test_loader, which feed the data to the model in batches.

train_data = ImageDataset(train_dir_imgs, train_dir_masks)
test_data  = ImageDataset(test_dir_imgs,  test_dir_masks)

train_loader = Flux.DataLoader(train_data; batchsize=batch_size, collate=my_collate, parallel=false)
test_loader  = Flux.DataLoader(test_data;  batchsize=batch_size, collate=my_collate, parallel=false)

50-element DataLoader(::ImageDataset, batchsize=2, collate=my_collate)
  with first element:
  (256×256×1×2 Array{Float32, 4}, 256×256×2×2 Array{Float32, 4},)
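
The batch shapes reported above follow from concatenating samples along a fourth dimension, as my_collate does. Here is a small sketch of that step on random toy arrays (4×4 instead of 256×256, with three samples chosen arbitrarily for illustration), using only Base Julia.

```julia
# Toy samples shaped like the dataset's (image, mask) pairs, but 4×4 in size.
samples = [(rand(Float32, 4, 4, 1), rand(Float32, 4, 4, 2)) for _ in 1:3]

# Same logic as my_collate: split the pairs, then stack along dimension 4.
xs = cat(first.(samples)...; dims=4)
ys = cat(last.(samples)...; dims=4)

println(size(xs))   # height × width × 1 channel × batch
println(size(ys))   # height × width × 2 channels × batch
```

The last dimension is the batch index, which matches the width × height × channels × batch layout that Flux convolution layers operate on.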

Data visualization

In the cell below, we visualize a sample to see what we are working with.

img, mask = train_data[2]

to_rgb(x) = Gray.(dropdims(x; dims=3))

rgb = to_rgb(img)

mask_vis = Gray.(mask[:, :, 2])

hcat(rgb, mask_vis)

Defining the model

A UNet is built on a ResNet-18 backbone, the optimizer (Adam with weight decay) is configured, and GPU availability is checked.

model = UNet(res, 1, 2, Metalhead.backbone(Metalhead.ResNet(18; inchannels=1)))

device = CUDA.functional() ? gpu : cpu
model  = device(model)
θ      = Flux.params(model)

opt = Flux.Optimiser(WeightDecay(1e-6), Adam(learning_rate))

Flux.Optimise.Optimiser(Any[WeightDecay(1.0e-6), Adam(0.001, (0.9, 0.999), 1.0e-8, IdDict{Any, Any}())])
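
To give a feel for what the WeightDecay stage contributes, here is a rough sketch in plain Julia. It assumes that WeightDecay adds λ·θ to the gradient before the next optimizer in the chain sees it, which is my understanding of this Flux API rather than something stated in this example. The numbers are invented, and λ is set far larger than the 1e-6 used above so that the effect is visible.

```julia
λ = 0.1f0                      # illustrative decay factor (the model above uses 1e-6)
θ = Float32[1.0, -2.0]         # hypothetical parameter values
g = Float32[0.5, 0.5]          # hypothetical raw gradient

# Weight decay pulls each parameter toward zero by adding λ·θ to its gradient.
g_decayed = g .+ λ .* θ

println(g_decayed)
```

In the chain above, the decayed gradient would then be passed on to Adam, which computes the actual update.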

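Defining the training and validation functions

train_step: runs the forward pass, computes the logitcrossentropy loss, calculates the gradients, and updates the weights.

valid_step: evaluates the loss on the validation data.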

function train_step(model, θ, x, y, opt)
    loss_cpu = 0f0
    ∇ = Flux.gradient(θ) do
        ŷ = model(x)                              
        l = Flux.logitcrossentropy(ŷ, y; dims=3)
        loss_cpu = cpu(l)
        l
    end
    Flux.Optimise.update!(opt, θ, ∇)
    return loss_cpu
end

function valid_step(model, x, y)
    ŷ = model(x)
    l = Flux.logitcrossentropy(ŷ, y; dims=3)
    return float(l)
end

valid_step (generic function with 1 method)
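
The loss used in both functions is a per-pixel softmax cross-entropy taken over the channel dimension (dims=3) and averaged over all pixels. The sketch below re-implements that calculation in plain Julia on a toy tensor to show what the number means. The helpers `logsoftmax_dims` and `pixel_ce` are hypothetical names written for this illustration, not part of Flux, and the claim that they mirror Flux.logitcrossentropy with its default mean aggregation is an assumption.

```julia
using Statistics: mean

# Numerically stable log-softmax along the given dimension.
function logsoftmax_dims(z; dims)
    shifted = z .- maximum(z; dims=dims)
    return shifted .- log.(sum(exp.(shifted); dims=dims))
end

# Cross-entropy per pixel (sum over channels), then the mean over pixels.
pixel_ce(ŷ, y; dims=3) = mean(-sum(y .* logsoftmax_dims(ŷ; dims=dims); dims=dims))

# Toy case: equal logits for both classes, every pixel labelled "background".
ŷ = zeros(Float32, 2, 2, 2, 1)
y = cat(ones(Float32, 2, 2, 1, 1), zeros(Float32, 2, 2, 1, 1); dims=3)

loss = pixel_ce(ŷ, y)
println(loss)   # an undecided two-class prediction costs log(2) per pixel
```

A loss close to log(2) ≈ 0.693 therefore suggests the network is not yet separating the two classes, which is a handy reference point when reading the training log.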

Model training

This is the main training loop. First, we set the number of epochs.

mkpath("model");  # unlike mkdir, does not fail if the directory already exists

epochs = 50
for epoch in 1:epochs
    println("epoch: ", epoch)
    trainmode!(model)

    train_loss = 0f0
    for (x, y) in train_loader
        train_loss += train_step(model, θ, device(x), device(y), opt)
    end
    train_loss /= length(train_loader)
    @info "Epoch $epoch | Train Loss $train_loss"

    testmode!(model)
    validation_loss = 0f0
    for (x, y) in test_loader
        validation_loss += valid_step(model, device(x), device(y))
    end
    validation_loss /= length(test_loader)
    @info "Epoch $epoch | Validation Loss $validation_loss"

    # Save a checkpoint
    fn = joinpath("model", @sprintf("model_epoch_%03d.bson", epoch))
    @save fn model
    @info "Model saved to $fn"
end
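
The checkpoint naming used in the loop can be tried in isolation. The following sketch uses only the Printf standard library and a temporary directory; `ckpt_name` is a hypothetical helper introduced here for illustration.

```julia
using Printf

# Zero-padded epoch number, so that files sort in training order.
ckpt_name(epoch) = @sprintf("model_epoch_%03d.bson", epoch)

dir = joinpath(mktempdir(), "model")
mkpath(dir)
mkpath(dir)   # calling mkpath again on an existing directory is not an error

names = [ckpt_name(e) for e in (1, 10, 50)]
println(names)
```

Zero-padding to three digits keeps a plain alphabetical listing of the checkpoint folder in epoch order for up to 999 epochs.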

Saving the final model

After training, the model is moved to the CPU and saved separately to the file model1.bson.

model = cpu(model)
best_path = joinpath("model", "model1.bson")
@info "Saving model to $best_path"
@save best_path model

[ Info: Saving model to model/model1.bson

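Inference

Helper functions for inference

ship_probs: extracts the probabilities of the "ship" class from the network output.

predict_mask: returns a binary mask, and optionally the probability map, for an input image.

save_prediction: saves the predicted mask as an image.

scan_thresholds: helps choose a binarization threshold by printing statistics for a range of values.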

model_on_gpu(m) = any(x -> x isa CuArray, Flux.params(m))
to_dev(x, m)    = model_on_gpu(m) ? gpu(x) : x

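function ship_probs(ŷ)
    sz = size(ŷ)
    if sz[3] == 2                # channels in dimension 3 (H×W×2×N)
        p = softmax(ŷ; dims=3)
        return Array(@view p[:, :, 2, 1])
    elseif sz[1] == 2            # channels-first layout
        p = softmax(ŷ; dims=1)
        return Array(@view p[2, :, :, 1]) |> x -> permutedims(x, (2,1))
    else
        error("expected 2 channels in dimension 3 or 1, got size $sz")
    end
end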


function predict_mask(model, img_path; thr=0.35, return_probs=false)
    img = Gray.(load(img_path))
    img = imresize(img, res; method=Linear())
    x = Float32.(img)
    H, W = size(x)
    x = reshape(x, H, W, 1, 1)

    ŷ = model(to_dev(x, model))

    p_ship = ship_probs(ŷ)

    @info "ship prob stats" min=minimum(p_ship) max=maximum(p_ship) mean=mean(p_ship)

    mask = Float32.(p_ship .>= thr)

    return return_probs ? (mask, p_ship) : mask
end

function save_prediction(model, in_path, out_path; thr=0.35)
    m = predict_mask(model, in_path; thr=thr)
    save(out_path, Gray.(m))
    @info "saved" path=out_path positives=sum(m .> 0)
end

function scan_thresholds(model, img_path; ts=0.10:0.05:0.50)
    _, p = predict_mask(model, img_path; return_probs=true)
    for t in ts
        m = p .>= t
        @printf "thr=%.2f  positives=%6d  max=%.3f  mean=%.3f\n" t count(m) maximum(p) mean(p)
    end
end

scan_thresholds (generic function with 1 method)
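
The core of ship_probs and predict_mask is a softmax over the channel dimension followed by a threshold on the "ship" channel. The sketch below reproduces that step on a hand-made 2×2 logit tensor in plain Julia. `softmax_dims` is a hypothetical stand-in for the library softmax, and the logit values are invented.

```julia
# Plain-Julia softmax along a chosen dimension (stand-in for the library version).
function softmax_dims(z; dims)
    e = exp.(z .- maximum(z; dims=dims))
    return e ./ sum(e; dims=dims)
end

# Toy network output: H×W×2×1, channel 1 = background, channel 2 = ship.
logits = zeros(Float32, 2, 2, 2, 1)
logits[1, 1, 2, 1] = 3f0   # strong "ship" evidence at pixel (1, 1)
logits[2, 2, 1, 1] = 3f0   # strong "background" evidence at pixel (2, 2)

p = softmax_dims(logits; dims=3)
p_ship = p[:, :, 2, 1]               # probability map of the ship class
mask = Float32.(p_ship .>= 0.35)     # same default threshold as predict_mask

println(p_ship)
println(mask)
```

The two undecided pixels have a ship probability of 0.5, so they pass the 0.35 threshold. This shows how a threshold below 0.5 biases the mask toward detecting ships.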

For the selected SAR image, we compute the probability map, choose a threshold, and save the final mask as pred_mask.png.

img_path = raw"data/train/imgs/P0003_1200_2000_4200_5000.png"

scan_thresholds(model, img_path)


save_prediction(model, img_path, "pred_mask.png"; thr=0.50)
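
The threshold sweep performed by scan_thresholds can be illustrated on a small, made-up probability map. This is only a sketch in plain Julia: `toy_probs` is invented, and the range of thresholds is the same one scan_thresholds uses by default.

```julia
# Hypothetical 2×2 probability map for the "ship" class.
toy_probs = Float32[0.05 0.22; 0.43 0.90]

ts = 0.10:0.05:0.50
counts = [count(toy_probs .>= t) for t in ts]

for (t, c) in zip(ts, counts)
    println("thr=", t, "  positives=", c)
end
```

The number of positive pixels can only stay the same or shrink as the threshold rises. A sharp drop between two neighbouring thresholds is one heuristic for spotting a range where many uncertain pixels sit.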

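Conclusion

In this example, a UNet neural network with a ResNet backbone was trained to segment SAR images. The task is not a simple one, because ships in the images may be located in urban surroundings. Further improving the model's quality will require a more refined approach to training.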