
Audio modem: reviving data transmission through sound

In the era of high-speed Internet, we have almost forgotten the distinctive sound of dial-up modems, which were once the only way to connect to the network. But this technology is not a thing of the past: it has found a new lease of life in modern applications that transfer data between devices via sound. This article presents an implementation of an FSK modem. We will build a system that transmits text data through audio signals using frequency-shift keying (FSK), the same principle that older modems used.

This example is relevant for developers interested in signal processing, researchers in the field of audio communications, and anyone who wants to understand how old modems work. It shows how a simple but effective data transfer protocol can be implemented using only the audio capabilities of devices, a technique that finds applications in offline data transmission, IoT devices, and redundant communication channels.

The example below implements:

  1. FSK modulation with a grid of 96 frequencies in the range 1-5.5 kHz (the first 16 encode the nibble values 0-15)

  2. Splitting data into 4-bit nibbles

  3. A simple data integrity check (an XOR checksum instead of Reed-Solomon)

  4. Sync signals at the beginning and end of transmission

  5. Demodulation using FFT for frequency analysis

  6. Saving to a WAV file

Now let's move on to the implementation. First, we load the auxiliary libraries.

In [ ]:
# Load the libraries, installing any that are missing
import Pkg
neededLibs = ["WAV", "FFTW", "LinearAlgebra"]
for lib in neededLibs
    try
        eval(Meta.parse("using $lib"))
    catch ex
        Pkg.add(lib)
        eval(Meta.parse("using $lib"))
    end
end

Now let's define the data structure for the FSK modulator:

  • sample_rate - sampling rate of the audio signal in Hz (a higher rate resolves the tones more accurately)

  • bit_duration - how long each symbol (one 4-bit nibble) sounds, in seconds; it determines the transfer rate

  • frequencies - the set of tone frequencies in Hz; the first 16 encode the possible values 0-15

  • reed_solomon - enables or disables the integrity check (a simple XOR checksum in this example, not actual Reed-Solomon error correction)

In [ ]:
struct FSKModulator
    sample_rate::Int
    bit_duration::Float64
    frequencies::Vector{Float64}
    reed_solomon::Bool
end

Next, we define a keyword constructor for the FSKModulator structure that fills in default parameters.

In [ ]:
function FSKModulator(;sample_rate=44100, bit_duration=0.1, reed_solomon=true)
    frequencies = range(1000, stop=5500, length=96)
    FSKModulator(sample_rate, bit_duration, frequencies, reed_solomon)
end
Out[0]:
FSKModulator
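
How well the tones can be told apart depends on both the frequency grid and the symbol duration. The sketch below uses only Base Julia and the values from the constructor above to compare the spacing between neighbouring frequencies with the FFT bin width, 1 / bit_duration. The variable names are illustrative and not part of the modem code.

```julia
# Same frequency grid as in the constructor: 96 points from 1 kHz to 5.5 kHz
frequencies = collect(range(1000, stop=5500, length=96))
spacing = frequencies[2] - frequencies[1]      # distance between neighbouring tones, Hz
used = frequencies[1:16]                       # nibble values 0-15 use indices 1-16

for bit_duration in (0.1, 0.05)
    bin_width = 1 / bit_duration               # FFT resolution for one symbol, Hz
    println("bit_duration = $bit_duration s: bin width = $bin_width Hz, ",
            "tone spacing = $(round(spacing, digits=2)) Hz")
end
println("Highest frequency used for data: ", round(used[end], digits=1), " Hz")
```

As long as the bin width stays well below the tone spacing, the nearest-frequency search can separate adjacent symbols; shortening the symbols further would eventually break this condition.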

The function encode_data converts a string into a sequence of 4-bit blocks (nibbles) and appends an integrity check. In this example, a simplified check stands in for full-fledged Reed-Solomon coding: the algorithm calculates the XOR of all nibbles and appends the result as one extra nibble.

How it works using the example of the character "A" (ASCII 65):

  1. Byte: 01000001 (65 in binary)

  2. The upper 4 bits: 0100 = 4

  3. The lower 4 bits: 0001 = 1

  4. The result: [4, 1]

In [ ]:
function encode_data(modulator::FSKModulator, data::String)
    bytes = Vector{UInt8}(data)
    nibbles = UInt8[]
    for byte in bytes
        push!(nibbles, byte >> 4)   # Upper 4 bits
        push!(nibbles, byte & 0x0F) # Lower 4 bits
    end
    if modulator.reed_solomon
        # A Reed-Solomon implementation belongs here; for the demo we use a simple XOR
        checksum = reduce(xor, nibbles; init=0x00)
        push!(nibbles, checksum)
    end
    nibbles
end
Out[0]:
encode_data (generic function with 1 method)
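
To see what the XOR check can and cannot detect, here is a minimal sketch in Base Julia. The helper to_nibbles is a hypothetical stand-in for the splitting step of encode_data, and the corrupted values are made up for illustration.

```julia
# Split each byte into its upper and lower nibble (same scheme as encode_data)
to_nibbles(s::String) = [n for b in codeunits(s) for n in (b >> 4, b & 0x0F)]

nibbles  = to_nibbles("A")                    # UInt8[0x04, 0x01]
checksum = reduce(xor, nibbles; init=0x00)    # 0x04 xor 0x01 = 0x05
frame    = vcat(nibbles, checksum)

# The XOR over a valid frame (data plus checksum) is always zero
println(reduce(xor, frame; init=0x00) == 0x00)

# A single corrupted nibble changes the XOR, so the error is detected
corrupted = copy(frame)
corrupted[1] = 0x06
println(reduce(xor, corrupted; init=0x00) != 0x00)

# Two errors that flip the same bit cancel out and go unnoticed
masked = copy(frame)
masked[1] = xor(masked[1], 0x02)
masked[2] = xor(masked[2], 0x02)
println(reduce(xor, masked; init=0x00) == 0x00)
```

The check only detects errors; unlike Reed-Solomon, it cannot locate or correct them.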

The function generate_tone generates a pure sinusoidal tone of a given frequency and duration. Each frequency from the modulator's frequencies array represents a specific 4-bit symbol, so this function creates the "sound embodiment" of each symbol for transmission.

In [ ]:
function generate_tone(modulator::FSKModulator, frequency::Float64, duration::Float64)
    n = round(Int, modulator.sample_rate * duration)
    # Sample instants 0, 1/fs, 2/fs, ... so the time step is exactly one sample period
    t = (0:n-1) ./ modulator.sample_rate
    0.5 .* sin.(2π * frequency .* t)
end
Out[0]:
generate_tone (generic function with 1 method)
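
As a quick sanity check of a generated tone, the sketch below counts zero crossings to estimate its frequency. The standalone function tone is a hypothetical copy of the generator that assumes samples at t = 0, 1/fs, 2/fs, and so on; it does not depend on the FSKModulator struct.

```julia
# Standalone tone generator (hypothetical name), amplitude 0.5
function tone(sample_rate::Int, frequency::Float64, duration::Float64)
    n = round(Int, sample_rate * duration)
    t = (0:n-1) ./ sample_rate
    0.5 .* sin.(2π * frequency .* t)
end

x = tone(44100, 2000.0, 0.05)

# Each full cycle crosses zero twice, so frequency ≈ crossings / (2 * duration)
crossings = count(i -> x[i] * x[i+1] < 0, 1:length(x)-1)
estimate  = crossings / (2 * 0.05)

println("samples: ", length(x))
println("peak amplitude within 0.5: ", maximum(abs, x) <= 0.5)
println("estimated frequency: ", estimate, " Hz")
```

The estimate is coarse because it only counts whole crossings, which is one reason the demodulator relies on a spectrum instead.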

Next, we combine the functions described above into a single algorithm. The result of this code block is a continuous audio signal in which different frequencies represent different data, much like the output of old modems.

In [ ]:
function modulate(modulator::FSKModulator, data::String)
    nibbles = encode_data(modulator, data)
    signal = Float64[]
    # Sync tone: 2 kHz for two symbol periods, sent before and after the data
    sync_tone = generate_tone(modulator, 2000.0, modulator.bit_duration * 2)
    append!(signal, sync_tone)

    for nibble in nibbles
        # Nibble values 0-15 select frequencies 1-16 (Julia indices are 1-based)
        freq_index = min(nibble + 1, length(modulator.frequencies))
        frequency = modulator.frequencies[freq_index]
        tone = generate_tone(modulator, frequency, modulator.bit_duration)
        append!(signal, tone)
    end
    append!(signal, sync_tone)
    signal
end
Out[0]:
modulate (generic function with 1 method)
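
The layout of the resulting signal follows directly from these steps: a sync tone, one tone per nibble, and a closing sync tone. The sketch below works out the expected sample count for the test message used later in this example, assuming the checksum nibble is enabled. The variable names are illustrative.

```julia
sample_rate  = 44100
bit_duration = 0.05
message      = "Hello, FSK modulation! 123"

samples_per_symbol = round(Int, sample_rate * bit_duration)       # one data tone
sync_samples       = round(Int, sample_rate * bit_duration * 2)   # one sync tone
n_symbols          = 2 * ncodeunits(message) + 1                  # data nibbles + checksum

total_samples = 2 * sync_samples + n_symbols * samples_per_symbol

println("symbols:  ", n_symbols)
println("samples:  ", total_samples)
println("duration: ", total_samples / sample_rate, " s")
```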

The function find_peak_frequency determines the dominant frequency in an audio segment and maps it to a 4-bit value. Demodulating one segment takes the following steps:

  1. Convert the time-domain signal into a frequency spectrum

  2. Find the frequency with the maximum amplitude

  3. Find the nearest frequency in the modulator's known set

  4. Return the original 4-bit value (0-15)

In [ ]:
function find_peak_frequency(signal_chunk::Vector{Float64}, sample_rate::Int, frequencies::Vector{Float64})
    n = length(signal_chunk)
    fft_result = fft(signal_chunk)
    # Keep only the non-negative frequencies (bins 0 .. n÷2-1)
    fft_magnitude = abs.(fft_result[1:div(n,2)])
    # Bin k corresponds to the frequency k * sample_rate / n
    freq_axis = (0:div(n,2)-1) .* (sample_rate / n)
    peak_idx = argmax(fft_magnitude)
    peak_freq = freq_axis[peak_idx]
    # Nearest modulator frequency; array indices are 1-based, symbol values 0-based
    closest_idx = argmin(abs.(frequencies .- peak_freq))
    UInt8(closest_idx - 1)
end
Out[0]:
find_peak_frequency (generic function with 1 method)
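
The same peak search can be tried without any packages. In the sketch below a naive O(n²) DFT stands in for the fft call, and a sample rate of 8 kHz is assumed purely to keep the computation small; dft_magnitude and decode_chunk are hypothetical names, not part of the modem code.

```julia
# Naive DFT magnitude for bins 0 .. n÷2-1 (a slow stand-in for an FFT)
function dft_magnitude(x::Vector{Float64})
    n = length(x)
    [abs(sum(x[j+1] * cis(-2π * k * j / n) for j in 0:n-1)) for k in 0:div(n, 2)-1]
end

# Peak bin -> frequency in Hz -> nearest known tone -> symbol value (0-based)
function decode_chunk(chunk::Vector{Float64}, sample_rate::Int, frequencies::Vector{Float64})
    n = length(chunk)
    peak_freq = (argmax(dft_magnitude(chunk)) - 1) * sample_rate / n
    argmin(abs.(frequencies .- peak_freq)) - 1
end

sample_rate = 8000
n = 400                                            # 50 ms at 8 kHz, 20 Hz bin width
frequencies = collect(range(1000, stop=5500, length=96))[1:16]
t = (0:n-1) ./ sample_rate

symbol  = 9
chunk   = 0.5 .* sin.(2π * frequencies[symbol + 1] .* t)
decoded = decode_chunk(chunk, sample_rate, frequencies)
println("sent $symbol, decoded $decoded")
```

The peak bin rarely coincides with the transmitted frequency exactly, but the nearest-frequency step absorbs that error as long as it stays below half the tone spacing.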

The next function uses the previous one to demodulate the FSK signal, that is, to convert the sound back into data. Its work can be divided into the following stages:

  1. Sync: Skips the initial sync signal

  2. Segmentation: Splits the signal into segments corresponding to each symbol

  3. Frequency analysis: Determines the dominant frequency for each segment

  4. Decoding: Maps frequencies to the original 4-bit values

  5. Integrity check: Checks the checksum

  6. Recovery: Collects bytes from pairs of 4-bit blocks

In [ ]:
function demodulate(modulator::FSKModulator, signal::Vector{Float64})
    samples_per_bit = Int(round(modulator.sample_rate * modulator.bit_duration))
    nibbles = UInt8[]
    start_idx = 2 * samples_per_bit + 1
    
    # Stop before the trailing sync tone, which occupies the last two symbol periods
    for i in start_idx:samples_per_bit:(length(signal) - 2 * samples_per_bit)
        chunk_end = min(i + samples_per_bit - 1, length(signal))
        chunk = signal[i:chunk_end]
        if length(chunk) >= samples_per_bit ÷ 2
            nibble = find_peak_frequency(chunk, modulator.sample_rate, modulator.frequencies)
            push!(nibbles, nibble)
        end
    end
    if modulator.reed_solomon && length(nibbles) > 1
        received_checksum = pop!(nibbles)
        calculated_checksum = reduce(xor, nibbles; init=0x00)
        if received_checksum != calculated_checksum
            @warn "Checksum mismatch! Data may be corrupted."
        end
    end
    bytes = UInt8[]
    for i in 1:2:length(nibbles)
        if i + 1 <= length(nibbles)
            byte = (nibbles[i] << 4) | nibbles[i+1]
            push!(bytes, byte)
        end
    end
    String(bytes)
end
Out[0]:
demodulate (generic function with 1 method)
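
The segmentation stage depends on where the loop over chunk start positions ends. The sketch below uses a hypothetical frame of five symbols with sync tones of two symbol periods at each end, and compares two upper bounds for that loop. All names are illustrative.

```julia
samples_per_symbol = 2205
n_symbols = 5                                        # e.g. 4 data nibbles + 1 checksum
total = (2 + n_symbols + 2) * samples_per_symbol     # sync + data + sync

first_start = 2 * samples_per_symbol + 1             # first sample after the leading sync

# Bound that excludes the whole trailing sync tone
starts_tight = collect(first_start:samples_per_symbol:(total - 2 * samples_per_symbol))

# Bound that leaves only one symbol period at the end
starts_loose = collect(first_start:samples_per_symbol:(total - samples_per_symbol))

println("chunks with tight bound: ", length(starts_tight))
println("chunks with loose bound: ", length(starts_loose))
println("last data sample: ", starts_tight[end] + samples_per_symbol - 1)
```

With the looser bound, one extra chunk falls on the first half of the trailing sync tone and would be decoded as if it were a data symbol, which shifts the position of the checksum nibble.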

Finally, the last function saves the modulated signal in WAV format, normalizing it to a peak amplitude of 1 beforehand.

In [ ]:
function save_to_wav(filename::String, signal::Vector{Float64}, sample_rate=44100)
    max_val = maximum(abs.(signal))
    if max_val > 0
        signal_normalized = signal ./ max_val
    else
        signal_normalized = signal
    end
    wavwrite(signal_normalized, filename, Fs=sample_rate)
end
Out[0]:
save_to_wav (generic function with 2 methods)
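
The normalization step can be checked on its own, without writing a file. Below is a minimal sketch in Base Julia; normalize_peak is a hypothetical helper that mirrors the scaling done before the file is written.

```julia
# Scale a signal so that its largest absolute sample equals 1.
# An all-zero signal is returned unchanged to avoid dividing by zero.
function normalize_peak(x::AbstractVector{<:Real})
    m = maximum(abs, x)
    m > 0 ? x ./ m : copy(x)
end

quiet = 0.5 .* sin.(2π * 3 .* (0:99) ./ 100)   # test signal with peak below 0.5
loud  = normalize_peak(quiet)

println("peak before: ", maximum(abs, quiet))
println("peak after:  ", maximum(abs, loud))
println("silence stays silent: ", normalize_peak(zeros(4)) == zeros(4))
```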

Now let's run a small test to verify the algorithm and listen to the resulting audio.

This test demonstrates the full cycle of the FSK modem. A modulator is created with a sampling rate of 44.1 kHz and a symbol duration of 50 ms. The string "Hello, FSK modulation! 123" is then converted into an audio signal by frequency-shift keying and saved to a WAV file. Finally, the signal is demodulated back into text, and the source and received messages are compared to verify that the data was transmitted correctly.

In [ ]:
modulator = FSKModulator(sample_rate=44100, bit_duration=0.05)
message = "Hello, FSK modulation! 123"

println("Modulating message: \"$message\"")
signal = modulate(modulator, message)
save_to_wav("fsk_transmission.wav", signal, modulator.sample_rate)
println("Signal saved to WAV file")

include("$(@__DIR__)/player.jl")
media_player("fsk_transmission.wav")

println("\nDemodulating...")
received_message = demodulate(modulator, signal)
println("Received message: \"$received_message\"")

if message == received_message
    println("✓ Transmission successful!")
else
    println("✗ Transmission error!")
    println("Expected: \"$message\"")
    println("Received: \"$received_message\"")
end
Modulating message: "Hello, FSK modulation! 123"
Signal saved to WAV file
fsk_transmission.wav (1 of 1)
Demodulating...
Received message: "Hello, FSK modulation! 123"
✓ Transmission successful!
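
The symbol duration sets the data rate of the link. As a rough estimate, the sketch below computes the raw and net throughput, assuming 4 bits per symbol, one checksum nibble per message, and two sync tones that each last two symbol periods. Both helper functions are hypothetical.

```julia
# Raw rate: every symbol carries one nibble (4 bits)
raw_bitrate(bit_duration::Float64) = 4 / bit_duration

# Net rate: payload bits divided by the total transmission time
function net_bitrate(nbytes::Int, bit_duration::Float64)
    symbols  = 2 * nbytes + 1                  # data nibbles + checksum nibble
    duration = (symbols + 4) * bit_duration    # plus two sync tones of 2 periods each
    8 * nbytes / duration
end

for d in (0.1, 0.05)
    println("bit_duration = $d s: raw ≈ ", round(raw_bitrate(d), digits=1),
            " bit/s, net for 26 bytes ≈ ", round(net_bitrate(26, d), digits=1), " bit/s")
end
```

Halving the symbol duration doubles the rate, but it also widens the FFT bins, so the two parameters have to be chosen together.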

Conclusion

The presented FSK modem implementation shows how a classical technique for transmitting data through sound can be revived in a modern context.

The experiment confirmed that the approach works: a string was encoded into a sequence of frequency tones, saved to a WAV file, and accurately reconstructed through FFT analysis. This highlights the practical value of Engee for digital signal processing tasks and opens up prospects for developing audio communication solutions where communication channels are limited.