Audio modem: reviving data transmission through sound

In the era of high-speed Internet, we have almost forgotten the distinctive sound of dial-up modems, which were once the only way to connect to the network. The technology is not a thing of the past, though: it has found a new lease of life in modern applications that transfer data between devices via sound. This article presents an implementation of an FSK modem. In this example, we build a system that transmits text data through audio signals using frequency-shift keying (FSK), the same principle that older modems used.

This example is relevant for developers interested in signal processing, researchers in the field of audio communications, and anyone who wants to understand how old modems work. It demonstrates how a simple but effective data transfer protocol can be implemented using only the audio capabilities of devices. The technique finds applications in offline data transmission, IoT devices, and redundant communication channels.

The example below implements:

FSK modulation on a grid of 96 frequencies in the range 1-5.5 kHz (a 4-bit symbol uses only the first 16 of them)
Splitting data into 4-bit nibbles
Simple data integrity check (instead of Reed-Solomon)
Sync signals at the beginning and end of transmission
Demodulation using FFT for frequency analysis
Saving to a WAV file
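
As a rough illustration of the frequency plan listed above, the following sketch builds the 96-point grid and measures the spacing between neighboring tones. The names `grid`, `spacing`, and `data_tones` are introduced here for illustration only and do not appear in the implementation below.

```julia
# Illustrative sketch: the frequency grid used for FSK symbols.
# `grid`, `spacing`, and `data_tones` are hypothetical names for this example.
grid = collect(range(1000.0, stop=5500.0, length=96))

spacing = grid[2] - grid[1]      # distance between neighboring tones, in Hz
data_tones = grid[1:16]          # a 4-bit nibble (0..15) needs only 16 tones

println("tones in grid: ", length(grid))
println("spacing: ", round(spacing, digits=2), " Hz")
println("data tones span ", data_tones[1], " to ", round(data_tones[end], digits=2), " Hz")
```

The spacing of roughly 47 Hz matters later for demodulation: the symbol duration has to be long enough for the spectrum analysis to tell neighboring tones apart.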

Now let's move on to the implementation. First, we load the auxiliary libraries.

# Load the required libraries, installing any that are missing
import Pkg
neededLibs = ["WAV", "FFTW", "LinearAlgebra"]
for lib in neededLibs
    try
        eval(Meta.parse("using $lib"))
    catch ex
        Pkg.add(lib)
        eval(Meta.parse("using $lib"))
    end
end

Now let's define the data structure for the FSK modulator:

sample_rate - sampling rate of the audio signal in Hz (higher values represent the tones more accurately)
bit_duration - how long each symbol (one 4-bit nibble) sounds, in seconds; it determines the transfer rate
frequencies - the set of tone frequencies; the first 16 encode the possible nibble values (0-15)
reed_solomon - enables or disables the integrity check (a simple XOR checksum in this demo, not actual Reed-Solomon error correction)

struct FSKModulator
    sample_rate::Int
    bit_duration::Float64
    frequencies::Vector{Float64}
    reed_solomon::Bool
end

Next, we define a wrapper constructor for the FSKModulator structure.

function FSKModulator(;sample_rate=44100, bit_duration=0.1, reed_solomon=true)
    frequencies = range(1000, stop=5500, length=96)
    FSKModulator(sample_rate, bit_duration, frequencies, reed_solomon)
end

The encode_data function converts a string into a sequence of 4-bit blocks (nibbles) and appends an integrity check. Instead of full Reed-Solomon coding, this example uses a simplified check: the XOR of all nibbles, which can detect some errors but cannot correct them.

How it works using the example of the character "A" (ASCII 65):

Byte: 01000001 (65 in binary)
The upper 4 bits: 0100 = 4
The lower 4 bits: 0001 = 1
The result: [4, 1]
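
The same split can be reproduced in a few lines. This is a minimal sketch of the bit operations only; the variable names `high`, `low`, and `restored` are chosen here for illustration.

```julia
# Split one byte into two 4-bit nibbles and put it back together.
byte = UInt8('A')                 # 0x41 = 65
high = byte >> 4                  # upper 4 bits -> 0x04
low  = byte & 0x0F                # lower 4 bits -> 0x01
restored = (high << 4) | low      # recombine into the original byte

println(Int(high), " ", Int(low), " ", Char(restored))   # prints "4 1 A"
```

The recombination step is the one the demodulator performs when it rebuilds bytes from pairs of received nibbles.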

function encode_data(modulator::FSKModulator, data::String)
    bytes = Vector{UInt8}(data)
    nibbles = UInt8[]
    for byte in bytes
        push!(nibbles, byte >> 4)   # The upper 4 bits
        push!(nibbles, byte & 0x0F) # The lower 4 bits
    end
    if modulator.reed_solomon
        # A full implementation would use Reed-Solomon here; this demo appends a simple XOR checksum
        checksum = reduce(⊻, nibbles; init=0x00)  # init avoids an error on empty input
        push!(nibbles, checksum)
    end
    nibbles
end
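
The XOR check can also be looked at in isolation. The sketch below uses two hypothetical helpers, `xor_checksum` and `is_consistent`, which are not part of the implementation in this article. It relies on the fact that XOR-ing all data nibbles together with their checksum always gives zero.

```julia
# Hypothetical helpers illustrating the XOR check (not part of the original code).
xor_checksum(nibbles) = reduce(⊻, nibbles; init=0x00)

# A frame is consistent when the XOR over data nibbles plus checksum is zero.
is_consistent(frame) = xor_checksum(frame) == 0x00

data = UInt8[0x4, 0x1, 0x4, 0x2]             # nibbles of "AB"
frame = vcat(data, xor_checksum(data))       # checksum = 4 ⊻ 1 ⊻ 4 ⊻ 2 = 3

corrupted = copy(frame)
corrupted[2] ⊻= 0x8                          # flip one bit in one nibble

println(is_consistent(frame))       # true
println(is_consistent(corrupted))   # false
```

A check of this kind only detects errors, and it misses errors that cancel each other out (for example, the same bit flipped in two different nibbles). Reed-Solomon coding could also correct errors, which is why the article describes the XOR as a simplification.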

The generate_tone function generates a pure sinusoidal tone of a given frequency and duration. Each frequency in the modulator's frequencies array represents a specific 4-bit symbol, so this function creates the "sound embodiment" of each symbol for transmission.

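function generate_tone(modulator::FSKModulator, frequency::Float64, duration::Float64)
    n_samples = round(Int, modulator.sample_rate * duration)
    # Sample instants spaced exactly 1/sample_rate apart, so the tone has the requested frequency
    t = (0:n_samples-1) ./ modulator.sample_rate
    0.5 .* sin.(2π * frequency .* t)
end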

Next, we combine the functions described above into a single algorithm. The result is a continuous audio signal in which different frequencies represent different data, much like the output of old modems.

function modulate(modulator::FSKModulator, data::String)
    nibbles = encode_data(modulator, data)
    signal = Float64[]
    # Sync tone (2 kHz, two symbol durations) marks the start and end of the frame
    sync_tone = generate_tone(modulator, 2000.0, modulator.bit_duration * 2)
    append!(signal, sync_tone)
    
    for nibble in nibbles
        # Nibble value v (0-15) maps to the (v + 1)-th frequency of the grid
        freq_index = min(nibble + 1, length(modulator.frequencies))
        frequency = modulator.frequencies[freq_index]
        tone = generate_tone(modulator, frequency, modulator.bit_duration)
        append!(signal, tone)
    end
    append!(signal, sync_tone)
    signal
end

The find_peak_frequency function determines the dominant frequency in an audio segment and maps it to a symbol value. Step by step, it works as follows:

Convert the time-domain signal into a frequency spectrum
Find the frequency with the maximum amplitude
Find the nearest frequency in the modulator's known set
Return the zero-based index of that frequency, which for data tones is the original 4-bit value (0-15)
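
The four steps above can be tried on a synthetic tone without any packages. The sketch below swaps in a naive DFT for FFTW's `fft` so that it stays self-contained. The sample rate, chunk length, and the three-entry `candidates` table are assumptions made for this demo, and `dft_mag` is a hypothetical helper.

```julia
# Naive DFT magnitude at 0-based bin k. Used here instead of FFTW's fft
# so the sketch runs without extra packages; it is far slower than an FFT.
dft_mag(x, k) = abs(sum(x[m + 1] * cis(-2π * k * m / length(x)) for m in 0:length(x)-1))

sample_rate = 8000                       # assumed demo values, not the article's settings
n = 400                                  # 50 ms worth of samples
tone_hz = 1200.0
x = [sin(2π * tone_hz * i / sample_rate) for i in 0:n-1]

mags = [dft_mag(x, k) for k in 0:n÷2-1]  # step 1: spectrum of the non-negative frequencies
peak_bin = argmax(mags) - 1              # step 2: strongest bin, as a 0-based index
peak_hz = peak_bin * sample_rate / n     # bin k sits at k * sample_rate / n

candidates = [1000.0, 1200.0, 1400.0]    # hypothetical tone table
symbol = argmin(abs.(candidates .- peak_hz)) - 1   # steps 3-4: nearest tone, 0-based value

println(peak_hz, " Hz -> symbol ", symbol)
```

The bin width is `sample_rate / n`, which equals one over the chunk duration. A 50 ms chunk therefore resolves the spectrum in 20 Hz steps, comfortably finer than the tone spacing of the modulator.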

function find_peak_frequency(signal_chunk::Vector{Float64}, sample_rate::Int, frequencies::Vector{Float64})
    n = length(signal_chunk)
    half = div(n, 2)
    fft_magnitude = abs.(fft(signal_chunk)[1:half])       # keep the non-negative frequencies
    freq_axis = (0:half-1) .* (sample_rate / n)           # bin k (0-based) sits at k * sample_rate / n
    peak_idx = argmax(fft_magnitude)
    peak_freq = freq_axis[peak_idx]
    closest_idx = argmin(abs.(frequencies .- peak_freq))  # nearest known tone
    UInt8(closest_idx - 1)                                # zero-based symbol value
end

The next function uses the previous one to demodulate the FSK signal, that is, to convert the sound back into data. Its operation can be divided into the following stages:

Sync: Skips the initial sync signal
Segmentation: Splits the signal into segments corresponding to each symbol
Frequency analysis: Determines the dominant frequency for each segment
Decoding: Maps frequencies to the original 4-bit values
Integrity check: Checks the checksum
Recovery: Collects bytes from pairs of 4-bit blocks
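
The sync and segmentation stages come down to index arithmetic. The sketch below assumes a frame laid out as a sync tone of two symbol durations, then the data symbols, then another sync tone of two symbol durations. `data_chunks` is a hypothetical helper written for this illustration.

```julia
# Hypothetical helper: sample ranges of the data symbols in a frame laid out as
# [sync: 2 symbols][data: N symbols][sync: 2 symbols].
function data_chunks(total_samples::Int, samples_per_symbol::Int)
    first_data = 2 * samples_per_symbol + 1              # first sample after the leading sync
    last_data  = total_samples - 2 * samples_per_symbol  # last sample before the trailing sync
    starts = first_data:samples_per_symbol:(last_data - samples_per_symbol + 1)
    return [i:(i + samples_per_symbol - 1) for i in starts]
end

spb = 2205                          # 44100 Hz * 0.05 s
total = 2spb + 53spb + 2spb         # 52 data nibbles + 1 checksum nibble, plus both syncs
chunks = data_chunks(total, spb)

println(length(chunks), " chunks, first ", first(chunks), ", last ", last(chunks))
```

Stopping before the trailing sync tone matters: if a piece of it were analyzed as a symbol, it would be read as an extra nibble and end up in the place where the checksum is expected.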

function demodulate(modulator::FSKModulator, signal::Vector{Float64})
    samples_per_bit = Int(round(modulator.sample_rate * modulator.bit_duration))
    nibbles = UInt8[]
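    start_idx = 2 * samples_per_bit + 1              # skip the leading sync tone
    stop_idx = length(signal) - 2 * samples_per_bit  # stop before the trailing sync tone
    
    # Only full symbols between the two sync tones are analyzed; otherwise part of
    # the trailing sync would be decoded as an extra nibble and mistaken for the checksum
    for i in start_idx:samples_per_bit:(stop_idx - samples_per_bit + 1)
        chunk = signal[i:(i + samples_per_bit - 1)]
        nibble = find_peak_frequency(chunk, modulator.sample_rate, modulator.frequencies)
        push!(nibbles, nibble)
    end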
    if modulator.reed_solomon && length(nibbles) > 1
        received_checksum = pop!(nibbles)
        calculated_checksum = reduce(⊻, nibbles)
        if received_checksum != calculated_checksum
            @warn "Checksum mismatch! Data may be corrupted."
        end
    end
    bytes = UInt8[]
    for i in 1:2:length(nibbles)
        if i + 1 <= length(nibbles)
            byte = (nibbles[i] << 4) | nibbles[i+1]
            push!(bytes, byte)
        end
    end
    String(bytes)
end

Finally, the last function saves the modulated signal in WAV format.

function save_to_wav(filename::String, signal::Vector{Float64}, sample_rate=44100)
    max_val = maximum(abs.(signal))
    if max_val > 0
        signal_normalized = signal ./ max_val
    else
        signal_normalized = signal
    end
    wavwrite(signal_normalized, filename, Fs=sample_rate)
end

Now let's implement a small test to verify our algorithm and listen to the resulting audio.

This test demonstrates the full cycle of the FSK modem. A modulator is created with a sampling rate of 44.1 kHz and a symbol duration of 50 ms. The string "Hello, FSK modulation! 123" is then converted into an audio signal by frequency-shift keying and saved to a WAV file. Finally, the signal is demodulated back into text, and the source and received messages are compared to verify that the data was transmitted correctly.
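
For a sense of scale, the length of the transmission and the raw data rate follow directly from these settings. The sketch below is a back-of-the-envelope calculation, assuming the integrity-check nibble is enabled and that each sync tone lasts two symbol durations. The variable names are chosen for this illustration.

```julia
# Back-of-the-envelope timing for the test message (illustrative names).
bit_duration = 0.05                      # seconds per symbol, as in the test
bits_per_symbol = 4                      # one nibble per tone
message = "Hello, FSK modulation! 123"

n_symbols = 2 * ncodeunits(message) + 1                        # two nibbles per byte + checksum
duration = n_symbols * bit_duration + 2 * (2 * bit_duration)   # data plus two sync tones
raw_rate = bits_per_symbol / bit_duration                      # bits per second while data is sent

println(n_symbols, " symbols, about ", round(duration, digits=2), " s, ",
        round(raw_rate, digits=1), " bit/s")
```

With these settings the 26-character message takes a little under three seconds of audio, at a raw rate of about 80 bit/s. Shortening the symbol duration raises the rate but coarsens the frequency resolution available to the demodulator.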

modulator = FSKModulator(sample_rate=44100, bit_duration=0.05)
message = "Hello, FSK modulation! 123"

println("Message modulation: \"$message\"")
signal = modulate(modulator, message)
save_to_wav("fsk_transmission.wav", signal, modulator.sample_rate)
println("The signal is saved in a WAV file")

include("$(@__DIR__)/player.jl")
media_player("fsk_transmission.wav")

println("\nDemodulation...")
received_message = demodulate(modulator, signal)
println("Received message: \"$received_message\"")

if message == received_message
    println("✓ The transfer is successful!")
else
    println("✗ Transmission error!")
    println("Expected: \"$message\"")
    println("Received:  \"$received_message\"")
end

Message modulation: "Hello, FSK modulation! 123"
The signal is saved in a WAV file

Demodulation...
Received message: "Hello, FSK modulation! 123"
✓ The transfer is successful!

Conclusion

The presented FSK modem implementation shows how a classical technique for transmitting data through sound can be reproduced in a modern environment.

The experiment confirmed that the approach works end to end: the string was encoded into a sequence of frequency tones, saved to a WAV file, and reconstructed exactly from the signal through FFT analysis. This illustrates the practical value of Engee for digital signal processing tasks and suggests directions for developing audio communication solutions where communication channels are limited.