Video analysis and conversion.
Video processing has become a key task in many fields, from computer vision and machine learning to creative content and scientific research. This example presents a solution in the Julia programming language: a set of algorithms for analyzing and processing video streams. It demonstrates a practical video processing system that includes detection of object boundaries (energy analysis), animation styling (cartoon effect), sepia, and other visual filters. Special attention is paid to memory usage, to keeping the data in the correct format, and to saving the results in MP4 format.
Let's start by loading the libraries:
- VideoIO is the main library for reading and writing video files. It opens the video, extracts the frames, exposes the parameters (FPS, resolution), and saves the result to MP4.
- Images: basic image operations such as loading, saving, color space conversion, and basic filters.
- ImageFiltering: advanced image filtering, including convolution kernels (Gaussian, Sobel), edge detection, blurring, and sharpening.
- FileIO is a universal interface for working with files of various formats.
- ColorTypes: definitions of color models (RGB, Gray) and access to color components (red, green, blue).
- FixedPointNumbers: fixed-point number types (N0f8), needed for the correct representation of pixels.
- ProgressMeter: processing progress indication and display of execution time.
- FFMPEG (optional): low-level access to video codecs via FFmpeg for fine-tuning encoding parameters.
All libraries work together: VideoIO reads the video and writes the result, Images and ImageFiltering process the frames, ColorTypes and FixedPointNumbers define the pixel types, and ProgressMeter shows progress.
import Pkg  # Pkg must be loaded before Pkg.add can be called
Pkg.add("FFMPEG")
Pkg.add("ProgressMeter")
Pkg.add("ColorTypes")
Pkg.add("FixedPointNumbers")
using VideoIO, Images, ImageFiltering, FileIO, ColorTypes, FixedPointNumbers
using ProgressMeter
import ColorTypes: red, green, blue
Next, we will declare auxiliary functions:
- prepare_frame converts frames from the PermutedDimsArray format into regular matrices for compatibility with image processing functions.
- to_video_format ensures correct conversion of color formats (RGB and Grayscale) to the N0f8 type required for compatibility with VideoIO when recording videos.
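# Materialize lazily permuted frames into a regular array
function prepare_frame(frame)
    if frame isa PermutedDimsArray
        return collect(frame)
    else
        return frame
    end
end

# Color frames are stored as 8-bit fixed-point RGB
function to_video_format(frame::AbstractArray{<:AbstractRGB})
    return RGB{N0f8}.(frame)
end

# Grayscale frames are stored as 8-bit fixed-point Gray
function to_video_format(frame::AbstractArray{<:Gray})
    return Gray{N0f8}.(frame)
end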
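To make the role of the N0f8 type more concrete, here is a minimal sketch in plain Julia of how a value from the range 0 to 1 can be stored in a single byte. The helpers quantize_n0f8 and dequantize_n0f8 are hypothetical names introduced for illustration and are not part of FixedPointNumbers; the clamping step is also an assumption of this sketch, since the real type may reject out-of-range values instead.

```julia
# Hypothetical helper: map a value from [0, 1] to one byte (0..255),
# which is the idea behind an 8-bit normalized fixed-point pixel component.
# Clamping out-of-range input is an assumption made for this sketch.
quantize_n0f8(x::Real) = round(UInt8, clamp(x, 0, 1) * 255)

# Inverse mapping: recover the approximate floating-point value.
dequantize_n0f8(b::UInt8) = b / 255

for x in (0.0, 0.5, 1.0, 1.3)
    b = quantize_n0f8(x)
    println(x, " -> byte ", Int(b), " -> ", dequantize_n0f8(b))
end
```

The round trip shows that only 256 distinct levels survive per channel, which is why the effects compute in floating point and convert to N0f8 only at the end.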
Next comes process_video_with_effect, the main processing function. It opens the source video and determines its parameters (frame size, FPS, frame count). Then, for each frame, it applies the effect through the handler function passed in effect_function, converts the result to a VideoIO-compatible format, and writes it to the output file, displaying progress as it goes. Extra keyword arguments are forwarded to the effect function, and max_frames, when positive, limits the number of frames processed.
function process_video_with_effect(
    input_path::String,
    output_path::String;
    effect_function::Function,
    max_frames::Int=0,
    effect_params...
)
    video = VideoIO.openvideo(input_path)
    frame_count = counttotalframes(video)
    max_frames > 0 && (frame_count = min(frame_count, max_frames))

    # Read one frame to learn the frame size, then rewind to the beginning.
    # seek expects a time in seconds, so seek(video, 1) would skip the first second.
    first_frame = read(video)
    seekstart(video)
    h, w = size(first_frame)
    fps = VideoIO.framerate(video)
    println("Processing: ", input_path)
    println("Size: ", w, "x", h, ", FPS: ", fps, ", Frames: ", frame_count)

    # The first processed frame tells the writer the output size and pixel type
    first_processed = effect_function(prepare_frame(first_frame); effect_params...)
    first_processed_video = to_video_format(first_processed)

    open_video_out(output_path, first_processed_video, framerate=fps) do writer
        p = Progress(frame_count; dt=1, desc="Applying effect...")
        for i in 1:frame_count
            eof(video) && break  # the reported frame count can be approximate
            frame = read(video)
            processed_frame = effect_function(prepare_frame(frame); effect_params...)
            processed_frame_video = to_video_format(processed_frame)
            write(writer, processed_frame_video)
            next!(p)
        end
    end
    close(video)
    println("\nVideo saved: ", output_path)
end
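The control flow of this function can be tried without any video file. The sketch below is a simplified stand-in, not the function itself: process_frames and brighten are hypothetical names, frames are plain matrices of brightness values, and the "writer" is just a vector. It shows how a positive max_frames caps the loop and how extra keyword arguments reach the effect.

```julia
# Hypothetical stand-in for the read/apply/write loop: frames come from any
# iterable, and processed frames are collected in a vector instead of a file.
function process_frames(frames, effect; max_frames::Int=0, effect_params...)
    output = []
    for (i, frame) in enumerate(frames)
        # A positive max_frames caps the number of processed frames
        max_frames > 0 && i > max_frames && break
        push!(output, effect(frame; effect_params...))
    end
    return output
end

# Toy effect with one keyword parameter, standing in for the real effects
brighten(frame; amount=0.1) = clamp.(frame .+ amount, 0.0, 1.0)

frames = [fill(0.2 * k, 2, 2) for k in 1:5]
result = process_frames(frames, brighten; max_frames=3, amount=0.5)
println(length(result), " frames processed")
println(result[1])
```

Forwarding the remaining keywords with effect_params... is what lets one driver serve effects with different parameters, such as edge_threshold or kernel_size.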
Function simple_cartoon_effect implements an algorithm for cartoon image stylization through a combination of blurring and border highlighting. First, the original frame is converted to grayscale for later boundary analysis. Then the Sobel operator is applied on both X and Y axes to calculate the brightness gradients, after which the energy of the boundaries is calculated as the sum of the squares of the gradients. The resulting border map is binarized with a given threshold, creating a mask of black and white contours. In parallel, the original color image is processed with Gaussian blur to simplify the color palette and create a soft background. The final result is formed by superimposing black contours from a binarized mask onto a blurred color image, which creates a characteristic cartoon effect with simplified color areas and clear black borders between them.
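function simple_cartoon_effect(frame::AbstractArray{<:AbstractRGB}; edge_threshold=0.1)
    frame = prepare_frame(frame)
    # Edge energy: sum of squared Sobel gradients of the grayscale frame
    gray = Gray.(frame)
    edges = imfilter(gray, Kernel.sobel()[1]).^2 + imfilter(gray, Kernel.sobel()[2]).^2
    # Binary contour mask
    edges = edges .> edge_threshold
    # Gaussian blur simplifies the color palette
    blurred = imfilter(frame, Kernel.gaussian(3))
    # Paint the contours black on top of the blurred frame
    result = copy(blurred)
    result[edges] .= RGB{N0f8}(0,0,0)
    return result
end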
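The edge-detection step described above can be reproduced on a tiny grayscale matrix in plain Julia. This is a sketch under stated assumptions: edge_energy is a hypothetical helper, the kernels are the standard unnormalized 3x3 Sobel kernels (a library may scale them differently, which changes the appropriate threshold), and border pixels are simply left at zero.

```julia
# Squared Sobel gradient magnitude at interior pixels of a grayscale matrix.
# Border pixels are left at zero to keep the sketch short.
function edge_energy(img::AbstractMatrix{<:Real})
    sobel_rows = [-1 -2 -1; 0 0 0; 1 2 1]   # responds to changes down the rows
    sobel_cols = [-1 0 1; -2 0 2; -1 0 1]   # responds to changes across columns
    h, w = size(img)
    energy = zeros(h, w)
    for j in 2:w-1, i in 2:h-1
        patch = @view img[i-1:i+1, j-1:j+1]
        g_rows = sum(patch .* sobel_rows)
        g_cols = sum(patch .* sobel_cols)
        energy[i, j] = g_rows^2 + g_cols^2
    end
    return energy
end

# Toy 5x6 image: dark left half, bright right half
img = [j <= 3 ? 0.0 : 1.0 for i in 1:5, j in 1:6]

# Binarize with a threshold, as the cartoon effect does
mask = edge_energy(img) .> 0.1
display(mask)
```

Only the two columns on either side of the brightness step exceed the threshold; flat regions produce zero energy and stay out of the mask.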
Function energy_effect calculates the energy map of the boundaries in the image using the Sobel operator. First, the original color frame is converted into a brightness matrix using the standard weighted sum of color components formula. Then convolutions with Sobel kernels along the X and Y axes are applied to calculate horizontal and vertical brightness gradients. For each pixel, the energy of the boundaries is calculated as the Euclidean norm of the gradients, which corresponds to the intensity of the brightness change at a given point. The resulting energy map is normalized to the maximum value to bring it to the range from 0 to 1 and returned as a grayscale image, where the brightness of each pixel corresponds to the strength of the boundary at this position, which visualizes the structural features and contours of the original image.
function energy_effect(frame::AbstractArray{<:AbstractRGB}; sobel_x=nothing, sobel_y=nothing)
    frame = prepare_frame(frame)
    h, w = size(frame)
    gray_buffer = Matrix{Float32}(undef, h, w)
    energy_buffer = Matrix{Float32}(undef, h, w)
    # Fall back to the default Sobel kernels unless both are supplied
    if sobel_x === nothing || sobel_y === nothing
        sobel_x, sobel_y = Kernel.sobel()
    end
    # Weighted sum of the color components gives the brightness
    for i in eachindex(frame)
        pixel = frame[i]
        gray_buffer[i] = 0.299f0 * red(pixel) + 0.587f0 * green(pixel) + 0.114f0 * blue(pixel)
    end
    gradient_x = imfilter(gray_buffer, sobel_x)
    gradient_y = imfilter(gray_buffer, sobel_y)
    # Gradient magnitude per pixel
    for i in eachindex(energy_buffer)
        energy_buffer[i] = sqrt(gradient_x[i]^2 + gradient_y[i]^2)
    end
    # Normalize to [0, 1]; a completely flat frame has zero energy everywhere,
    # so skip the division to avoid producing NaN pixels
    peak = maximum(energy_buffer)
    energy_normalized = peak > 0 ? energy_buffer ./ peak : energy_buffer
    return Gray.(energy_normalized)
end
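Two of the steps above, the weighted brightness and the normalization to the range 0 to 1, are easy to check on small numbers. The helpers luminance and normalize_energy below are hypothetical names used only for this sketch; returning zeros for a flat input is a choice made here rather than something the text prescribes.

```julia
# Weighted brightness of one RGB triple with components in [0, 1],
# using the same weights as energy_effect
luminance(r, g, b) = 0.299f0 * r + 0.587f0 * g + 0.114f0 * b

# Hypothetical helper: scale an energy map so that its maximum becomes 1.
# A map with no edges at all is returned as zeros (assumption of this sketch).
function normalize_energy(energy::AbstractMatrix{<:Real})
    peak = maximum(energy)
    return peak > 0 ? energy ./ peak : zero(energy)
end

println(luminance(0, 1, 0))                     # pure green
println(normalize_energy([0.0 2.0; 4.0 1.0]))   # strongest edge maps to 1
println(normalize_energy(zeros(2, 2)))          # flat frame stays at zero
```

Because the three weights add up to 1, a white pixel has brightness close to 1, and green contributes the most, which matches how the eye perceives brightness.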
Function sepia_effect applies a sepia filter to the image, which imitates an old photograph with a characteristic brownish tint. The algorithm maps each pixel through a linear combination of its color channels with fixed weights: the output red channel receives the largest weights and the output blue channel the smallest, which creates a warm tone. Each color component is clipped at 1 to prevent overflow, and the result is stored in a fixed-point format, ensuring compatibility with the video processing system. The end result is an image with a vintage sepia effect, where the details of the original are preserved but the color scheme shifts towards the brown and yellowish tones characteristic of historical photographic processes.
function sepia_effect(frame::AbstractArray{<:AbstractRGB})
    frame = prepare_frame(frame)
    result = similar(frame)
    for i in eachindex(frame)
        pixel = frame[i]
        r_val = red(pixel)
        g_val = green(pixel)
        b_val = blue(pixel)
        # Linear mix of the input channels for each output channel
        r = 0.393 * r_val + 0.769 * g_val + 0.189 * b_val
        g = 0.349 * r_val + 0.686 * g_val + 0.168 * b_val
        b = 0.272 * r_val + 0.534 * g_val + 0.131 * b_val
        # Clip at 1 so the value fits into N0f8
        result[i] = RGB{N0f8}(min(r,1.0), min(g,1.0), min(b,1.0))
    end
    return result
end
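To see why the clipping is needed, the same weights can be applied to a single pixel in plain Julia. In this sketch a pixel is an (r, g, b) tuple and sepia_pixel is a hypothetical helper, not part of the code above.

```julia
# Sepia transform for one pixel given as an (r, g, b) tuple with components
# in [0, 1]; the weights are the ones used in sepia_effect.
function sepia_pixel((r, g, b))
    r2 = 0.393 * r + 0.769 * g + 0.189 * b
    g2 = 0.349 * r + 0.686 * g + 0.168 * b
    b2 = 0.272 * r + 0.534 * g + 0.131 * b
    # The red and green weight rows sum to more than 1, so clip at the top
    return (min(r2, 1.0), min(g2, 1.0), min(b2, 1.0))
end

println(sepia_pixel((1.0, 1.0, 1.0)))   # white: red and green saturate
println(sepia_pixel((0.5, 0.5, 0.5)))   # mid-gray turns warm
println(sepia_pixel((0.0, 0.0, 0.0)))   # black stays black
```

For a white pixel the unclipped red and green values would be 1.351 and 1.203, so without min they would not fit into the 0 to 1 range; blue ends up lowest, which gives the yellowish-brown cast.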
Function blur_effect applies a blur effect to an image using a Gaussian filter, which reduces sharpness and softly smooths details. The strength of the blur is controlled by the kernel_size parameter, which is passed to Kernel.gaussian as the standard deviation of the Gaussian in pixels: the larger the value, the wider the kernel and the stronger the smoothing. The Gaussian kernel computes a weighted average of the pixels in a neighborhood, where the weights decrease along the Gaussian curve from the center to the edges, which gives a natural, smooth blur. The result is converted to a fixed-point format to ensure compatibility with the video processing system, while maintaining smooth color transitions and the soft texture of the blurred image.
function blur_effect(frame::AbstractArray{<:AbstractRGB}; kernel_size=5)
    frame = prepare_frame(frame)
    # kernel_size is used as the standard deviation of the Gaussian, in pixels
    blurred = imfilter(frame, Kernel.gaussian(kernel_size))
    return RGB{N0f8}.(blurred)
end
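The weighted average behind the blur can be illustrated with a one-dimensional set of Gaussian weights. This is a plain-Julia sketch: gaussian_weights is a hypothetical helper, and truncating the kernel at two standard deviations is an assumption made here, not necessarily what a library does.

```julia
# Normalized 1-D Gaussian weights for offsets -radius:radius.
# sigma is the standard deviation in pixels; the default truncation radius
# of 2*sigma is an assumption of this sketch.
function gaussian_weights(sigma::Real; radius::Int=ceil(Int, 2 * sigma))
    w = [exp(-(k^2) / (2 * sigma^2)) for k in -radius:radius]
    return w ./ sum(w)   # weights sum to 1, so overall brightness is preserved
end

w = gaussian_weights(1.0)
println(length(w), " weights: ", round.(w; digits=3))
println("wider kernel for sigma = 3: ", length(gaussian_weights(3.0)), " weights")
```

The weights are symmetric, peak at the center, and sum to 1; increasing sigma spreads them over more neighbors, which is why a larger parameter gives a stronger blur.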
Function sharpen_effect applies a sharpening effect to an image using a 3x3 Laplacian-like convolution kernel. The kernel has a central coefficient of 9 and surrounding values of -1, which enhances the high-frequency components of the image and emphasizes the contours of objects. After the convolution, the pixel values are limited to the range from 0 to 1 so that they stay within the valid bounds of the color space, and the result is converted to a fixed-point format to ensure compatibility with the video processing system. This process enhances the detail and texture of the image, making the edges clearer and more pronounced without significantly distorting the color balance.
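function sharpen_effect(frame::AbstractArray{<:AbstractRGB})
    frame = prepare_frame(frame)
    # 3x3 sharpening kernel; centered() places its origin at the middle element
    kernel = centered([-1 -1 -1; -1 9 -1; -1 -1 -1] / 1.0)
    sharpened = imfilter(frame, kernel)
    # clamp01 limits every color channel of each pixel to [0, 1]
    sharpened = clamp01.(sharpened)
    return RGB{N0f8}.(sharpened)
end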
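A quick way to understand this kernel is to apply it by hand to a few 3x3 patches of brightness values. The sketch below uses plain Julia; sharpen_at is a hypothetical helper that computes the response at the center of one patch.

```julia
# Response of the sharpening kernel centered on a 3x3 patch. The kernel is
# symmetric, so correlation and convolution give the same result.
sharpen_at(patch; kernel=[-1 -1 -1; -1 9 -1; -1 -1 -1]) = sum(patch .* kernel)

flat   = fill(0.5, 3, 3)                             # uniform region
dark   = [0.2 0.2 0.8; 0.2 0.2 0.8; 0.2 0.2 0.8]     # center on the dark side of an edge
bright = [0.2 0.8 0.8; 0.2 0.8 0.8; 0.2 0.8 0.8]     # center on the bright side

for (name, patch) in (("flat", flat), ("dark", dark), ("bright", bright))
    raw = sharpen_at(patch)
    println(name, ": raw = ", raw, ", clamped = ", clamp(raw, 0.0, 1.0))
end
```

Since the coefficients sum to 1, a uniform region keeps its brightness, while pixels next to an edge are pushed below 0 or above 1. That overshoot is what makes edges look crisper, and it is also why the values have to be limited to the 0 to 1 range before conversion.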
Function safe_cartoon_effect creates a more robust version of the cartoon effect with predictable data types. Unlike the basic implementation, it first converts the input image to the RGB{N0f8} format, so that all subsequent operations start from a known pixel type. The algorithm then converts the image to grayscale to analyze gradients and applies the Sobel operator on both axes to calculate the edge map. After the edges are binarized with the specified threshold, the resulting contour mask is superimposed on the blurred version of the image. The key difference from the simple version is that all operations are performed on data already converted to a known format, which helps avoid type incompatibility errors during subsequent video processing. The result is an image with a characteristic cartoon style, with soft color areas and clear black contours, that the main processing function can convert and write to a video file.
function safe_cartoon_effect(frame::AbstractArray{<:AbstractRGB}; edge_threshold=0.1)
    frame = prepare_frame(frame)
    # Normalize the input to a known pixel type first
    frame_rgb = RGB{N0f8}.(frame)
    # Edge energy: sum of squared Sobel gradients of the grayscale frame
    gray = Gray.(frame_rgb)
    edges = imfilter(gray, Kernel.sobel()[1]).^2 + imfilter(gray, Kernel.sobel()[2]).^2
    # Binary contour mask
    edges = edges .> edge_threshold
    blurred = imfilter(frame_rgb, Kernel.gaussian(3))
    # Paint the contours black on top of the blurred frame
    result = copy(blurred)
    result[edges] .= RGB{N0f8}(0,0,0)
    return result
end
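The final step, superimposing the contour mask on the blurred image, relies on logical indexing. Here is a minimal sketch in plain Julia, where pixels are (r, g, b) tuples and both the image and the mask are made-up toy data.

```julia
# Toy "blurred" image: every pixel is the same (r, g, b) tuple
blurred = fill((0.6, 0.4, 0.2), 3, 4)

# Toy edge mask: true marks pixels that should become black contours
edges = falses(3, 4)
edges[:, 2] .= true

# Copy first so the blurred image stays untouched, then paint every masked
# pixel black. Ref makes broadcasting treat the tuple as a single value.
result = copy(blurred)
result[edges] .= Ref((0.0, 0.0, 0.0))

display(result)
```

Indexing with a Boolean array selects exactly the pixels where the mask is true, so the contours are drawn in one broadcast assignment without an explicit loop.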
Now let's look at the results of all the algorithms described above. Taken together, the code forms a modular Julia video processing system with several visual effects. It is built around a central processing function that reads each frame of the video in turn, applies the selected effect through a dedicated filter function, converts the data to the correct format, and saves the result to an output video file with progress indication.
The key advantages of this approach are its modularity and its consistent handling of pixel types. Each effect is implemented as an independent function, which makes it easy to combine filters, add new transformations, and test them in isolation. The system relies on Julia's type system and multiple dispatch: to_video_format has separate methods for RGB and grayscale frames, and both convert to the N0f8 pixel types that VideoIO expects. This helps avoid common pixel format errors, especially when working with color spaces and fixed-point values.
Memory use also deserves attention: frames are processed one at a time, so the whole video is never held in memory, and energy_effect works in Float32 buffers sized to the frame. This matters when processing high-resolution video. The variety of implemented effects demonstrates the flexibility of the architecture.
process_video_with_effect("input.mp4", "energy_analysis.mp4",
effect_function=energy_effect, max_frames=50)
process_video_with_effect("input.mp4", "cartoon_safe.mp4",
effect_function=safe_cartoon_effect, edge_threshold=0.15, max_frames=50)
process_video_with_effect("input.mp4", "sepia.mp4",
effect_function=sepia_effect, max_frames=50)
process_video_with_effect("input.mp4", "blur.mp4",
effect_function=blur_effect, kernel_size=7, max_frames=50)
include("player.jl")
media_player(@__DIR__, mode="video")
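Because every effect is an ordinary function that takes a frame and returns a frame, effects can be chained with function composition. The sketch below uses toy effects on a matrix of brightness values; invert and darken are hypothetical stand-ins, not functions from the code above.

```julia
# Toy effects on a matrix of brightness values in [0, 1]
invert(frame) = 1.0 .- frame
darken(frame; factor=0.5) = frame .* factor

# Function composition: invert is applied first, then darken
invert_then_darken = darken ∘ invert

frame = [0.0 0.25; 0.5 1.0]
println(invert_then_darken(frame))
```

By the same pattern, a closure such as frame -> sepia_effect(blur_effect(frame)) should be usable as effect_function, since blur_effect returns an RGB frame that sepia_effect accepts.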
Conclusion
The practical value of the solution is emphasized by its readiness to work with real video files, support for standard formats and encoding parameters, as well as interactive tracking of processing progress.
The presented code serves as a reliable foundation for building more complex computer vision and multimedia processing systems, combining the academic rigor of algorithms with practical application.