WAV.jl — read/write WAV audio files

# WAV — Module

The WAV package is a pure Julia library for reading and writing the WAV audio file format.

It provides wavread, wavwrite and wavappend functions to read, write, and append to WAV files. The function wavplay provides simple audio playback.

These functions behave similarly to the former MATLAB functions of the same name.

This module also provides wavread and wavwrite as load and save methods for format"WAV" to the FileIO package.

To read and write CUE and INFO chunks, there are experimental functions wav_cue_read, wav_cue_write, wav_info_read, wav_info_write.

Example

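using WAV

# Synthesize one second of a 1 kHz sine tone sampled at 8 kHz
fs = 8e3
t = 0.0:1/fs:prevfloat(1.0)
f = 1e3
y = sin.(2pi * f * t) * 0.1
wavwrite(y, "example.wav", Fs=fs)

# Read the file back, then append one second of a 2 kHz tone
y, fs = wavread("example.wav")
y = sin.(2pi * 2f * t) * 0.1
wavappend(y, "example.wav")

# Read the combined signal and play it
y, fs = wavread("example.wav")
wavplay(y, fs)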

Main functions

# WAV.wavread — Function

wavread(io::IO; subrange=:, format="double")
wavread(filename::AbstractString; subrange=:, format="double")

Reads the samples from a WAV file. The samples are converted to floating point values in the range −1.0 to 1.0 by default.

The available options, and the default values, are:

format selects the form of data returned:
- format="double" (default) returns double-precision floating point (Float64) values in the range −1.0 to 1.0.
- format="native" returns the values as encoded in the file.
- format="size" returns a 2-tuple (n, m) containing the number of samples 'n' and the number of channels 'm' in the file (like size(y)), rather than the regular 4-tuple (y, Fs, nbits, opt) with the actual samples in y.
subrange controls which samples are returned. The default (:) returns all samples in the file. Passing an integer N (or equivalently the range 1:N) returns the first N samples of each channel. Passing a UnitRange I:J returns length(I:J) consecutive samples from each channel, starting with the I-th sample.
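The options above can be tried out with a short sketch like the following. It writes a one-second test tone to a temporary file first, so the file name, tone frequency, and variable names are illustrative assumptions rather than part of the package.

```julia
using WAV

# Write a one-second, single-channel test tone to a temporary file
path = joinpath(mktempdir(), "tone.wav")
fs = 8000
t = (0:fs-1) ./ fs
y = 0.1 .* sin.(2pi .* 440 .* t)
wavwrite(y, path, Fs=fs)

# format="size" returns (samples, channels) instead of the 4-tuple
n, m = wavread(path, format="size")

# An integer subrange returns the first N samples of each channel
first100 = wavread(path, subrange=100)[1]

# A range I:J returns length(I:J) samples starting at sample I
middle = wavread(path, subrange=201:300)[1]

println((n, m), " ", size(first100), " ", size(middle))
```

Only the first element of the returned tuple is kept in the last two calls, since the sample rate and chunk list are not needed here.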

The function returns a 4-tuple with elements

y: A 2-dimensional array containing the waveform samples, where the row index represents the time axis and the column index the channel number
Fs: The sampling frequency in hertz
nbits: The number of bits used to encode each sample
opt: A vector of WAVChunk elements representing optional chunks found in the WAV file.

The elements in the opt vector depend on the contents of the WAV file. A WAVChunk is defined as

struct WAVChunk
    id::Symbol
    data::Vector{UInt8}
end

where id is the four-character chunk ID. All valid WAV files will contain a fmt chunk, with id==Symbol("fmt ") (note the trailing space).

In order to obtain the contents of the format chunk, call WAV.getformat(opt).
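As a rough illustration of the returned 4-tuple and the opt vector, the sketch below writes a tiny two-channel file of silence to a temporary location (an assumption made only to keep the example self-contained), lists the chunk IDs, and obtains the format chunk contents.

```julia
using WAV

# Write 16 samples of two-channel silence as 16-bit PCM
path = joinpath(mktempdir(), "stereo.wav")
wavwrite(zeros(Int16, 16, 2), path, Fs=44100)

y, fs, nbits, opt = wavread(path)

# Each element of opt is a WAVChunk with fields id and data
ids = [chunk.id for chunk in opt]

# Decode the raw bytes of the "fmt " chunk
fmt = WAV.getformat(opt)

println(size(y), " ", fs, " ", nbits, " ", ids)
println(fmt)
```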

The following methods are also defined to make this function compatible with MATLAB’s former wavread function:

wavread(filename::AbstractString, fmt::AbstractString) = wavread(filename, format=fmt)
wavread(filename::AbstractString, n) = wavread(filename, subrange=n)
wavread(filename::AbstractString, n, fmt) = wavread(filename, subrange=n, format=fmt)

Example

y, fs, nbits, opt = wavread("example.wav")

See also: wavwrite

# WAV.wavwrite — Function

wavwrite(y::AbstractArray, io::IO;
         Fs=8000, nbits=0, compression=0, chunks::Vector{WAVChunk}=WAVChunk[])
wavwrite(y::AbstractArray, filename::String;
         Fs=8000, nbits=0, compression=0, chunks::Vector{WAVChunk}=WAVChunk[])

Writes sample matrix y in RIFF/WAVE format to a file. Each column of the data represents a different channel. Stereo files contain two columns (left and right).

The second argument accepts either an IO object or a filename (String).

By default, the function chooses a sample rate of 8 kHz and an output data type and bits-per-sample number based on eltype(y). These defaults can be changed using optional keyword arguments:

Fs: sampling frequency in hertz
nbits: the number of bits used to encode each sample; the default (0) is an automatic choice based on the values of compression and eltype(y)
compression controls the type of encoding used in the file, and can be one of
- WAVE_FORMAT_PCM: UInt8, Int16, etc. integer encoding
- WAVE_FORMAT_IEEE_FLOAT: Float32 or Float64 encoding
- WAVE_FORMAT_ALAW: ITU-T G.711 A-law encoding (8-bit log-scale, used in European telephone networks)
- WAVE_FORMAT_MULAW: ITU-T G.711 µ-law encoding (8-bit log-scale, used in American telephone networks)
The default (0) is to pick an encoding automatically based on eltype(y) (see below).
chunks (default = WAVChunk[]): a vector of WAVChunk objects to be written to the file (in addition to the format chunk). See below for some utilities for creating CUE and INFO chunks.

Unless otherwise specified via nbits and compression, the type of the input array y determines the data format used in the generated file. Among the encodings commonly supported by other WAV-reading audio software, the function attempts to pick the one that best preserves the provided input type.

For Integer input arrays, and compression=WAVE_FORMAT_PCM or compression=0, the permitted sample-value ranges are:

`nbits`	`eltype(y)`	supported range	output type
0	UInt8, Int16	full range	UInt8, Int16
0	Int32	-2^23 ≤ y < 2^23	Int32
8	<: Integer	0 ≤ y ≤ 255	UInt8
16	<: Integer	-32768 ≤ y ≤ +32767	Int16
24	<: Integer	-2^23 ≤ y < 2^23	Int32
32	<: Integer	-2^31 ≤ y < 2^31	Int32

If y is a floating-point array, and compression=WAVE_FORMAT_IEEE_FLOAT or compression=0, the full range of Float32 values is supported (and the full range of Float64 values for nbits=64):

`nbits`	`eltype(y)`	supported range	output type
0	Float32, Float64	-Inf32 ≤ y ≤ +Inf32	Float32
32	Float32, Float64	-Inf32 ≤ y ≤ +Inf32	Float32
64	Float32, Float64	-Inf64 ≤ y ≤ +Inf64	Float64

If y is a floating-point array, and compression=WAVE_FORMAT_PCM, the input data ranges are:

`nbits`	`eltype(y)`	supported range	output type
8	Float32, Float64	-1.0 ≤ y ≤ +1.0	UInt8
16	Float32, Float64	-1.0 ≤ y ≤ +1.0	Int16
24	Float32, Float64	-1.0 ≤ y ≤ +1.0	Int32
32	Float32, Float64	-1.0 ≤ y ≤ +1.0	Int32

The output type column shows the element type returned by wavread(..., format="native").
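The mapping from input type to stored encoding can be checked with a small round trip. This is a minimal sketch under the assumption that temporary files and the short sample vectors below are acceptable stand-ins for real audio; the constant is written as WAV.WAVE_FORMAT_PCM so that it resolves whether or not it is exported.

```julia
using WAV

dir = mktempdir()

# Int16 input with default settings is stored as 16-bit PCM
yi = Int16[0, 1000, -1000, 32767]
wavwrite(yi, joinpath(dir, "int16.wav"), Fs=8000)
native_int = wavread(joinpath(dir, "int16.wav"), format="native")[1]

# Float64 input with default settings is stored as 32-bit IEEE float
yf = [0.0, 0.5, -0.5, 1.0]
wavwrite(yf, joinpath(dir, "float.wav"), Fs=8000)
native_float = wavread(joinpath(dir, "float.wav"), format="native")[1]

# Float64 input in the range -1.0 to 1.0, forced to 16-bit PCM
wavwrite(yf, joinpath(dir, "pcm16.wav"), Fs=8000, nbits=16,
         compression=WAV.WAVE_FORMAT_PCM)
native_pcm = wavread(joinpath(dir, "pcm16.wav"), format="native")[1]

println(eltype(native_int), " ", eltype(native_float), " ", eltype(native_pcm))
```

Reading with format="native" rather than the default "double" is what exposes the stored element type.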

The following methods are also defined to make this function compatible with MATLAB’s former wavwrite function:

wavwrite(y::Array, f::Real, filename::String) = wavwrite(y, filename, Fs=f)
wavwrite(y::Array, f::Real, N::Real, filename::String) = wavwrite(y, filename, Fs=f, nbits=N)
wavwrite(y::Array{T}, io::IO) where {T<:Integer} = wavwrite(y, io, nbits=sizeof(T)*8)
wavwrite(y::Array{T}, filename::String) where {T<:Integer} = wavwrite(y, filename, nbits=sizeof(T)*8)
wavwrite(y::Array{Int32}, io::IO) = wavwrite(y, io, nbits=24)
wavwrite(y::Array{Int32}, filename::String) = wavwrite(y, filename, nbits=24)
wavwrite(y::Array{T}, io::IO) where {T<:AbstractFloat} = wavwrite(y, io, nbits=sizeof(T)*8, compression=WAVE_FORMAT_IEEE_FLOAT)
wavwrite(y::Array{T}, filename::String) where {T<:AbstractFloat} = wavwrite(y, filename, nbits=sizeof(T)*8, compression=WAVE_FORMAT_IEEE_FLOAT)

Processing chunks

# WAV.WAVChunk — Type

WAVChunk(id, data) represents a RIFF chunk. Symbol id is the four-character chunk ID.

# WAV.wav_cue_read — Function

wav_cue_read(chunks::Vector{WAVChunk})

Takes a Vector{WAVChunk} (as returned by wavread) and returns a Vector{WAVMarker}, where a WAVMarker is defined as:

mutable struct WAVMarker
    label::String
    start_time::UInt32
    duration::UInt32
end

Field values start_time and duration are in samples.

Example

using WAV
x, fs, bits, in_chunks = wavread("in.wav")
markers = wav_cue_read(in_chunks)

# WAV.wav_cue_write — Function

wav_cue_write(markers::Dict{UInt32, WAVMarker})

Turns WAVMarkers into a Vector{WAVChunk} (as accepted by wavwrite). The key for the dictionary is the ID of the marker to be written to file.

Example

out_chunks = wav_cue_write(markers)
wavwrite(x, "out.wav", Fs=fs, nbits=16, compression=WAVE_FORMAT_PCM, chunks=out_chunks)

# WAV.wav_info_read — Function

wav_info_read(chunks::Vector{WAVChunk})::Dict{Symbol, String}

Given a list of chunks as returned by wavread, returns a Dict{Symbol, String} where the keys are symbols representing four-character RIFF INFO tag IDs (see https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info).

# WAV.wav_info_write — Function

wav_info_write(tags::Dict{Symbol, String})::Vector{WAVChunk}

Converts a dictionary of INFO tags into a list of WAV chunks appropriate for passing to wavwrite.

tags is a dictionary where the keys are symbols representing four-character RIFF INFO tag IDs (see https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info). The values of the dictionary correspond to the tag data.
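A possible round trip through wav_info_write and wav_info_read might look like the sketch below. The tag values, the temporary file, and the choice of the INAM (title) and ICMT (comment) tag IDs are illustrative assumptions; since these functions are described as experimental, the exact form of the strings read back is not guaranteed here.

```julia
using WAV

# INFO tags keyed by four-character RIFF INFO tag IDs
tags = Dict(:INAM => "Test tone", :ICMT => "Generated for illustration")
info_chunks = wav_info_write(tags)

# Write a short silent file that carries the INFO chunks
path = joinpath(mktempdir(), "tagged.wav")
wavwrite(zeros(Int16, 8, 1), path, Fs=8000, chunks=info_chunks)

# Read the chunks back and decode the INFO tags
opt = wavread(path)[4]
read_back = wav_info_read(opt)

println(read_back)
```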