Reading and writing bin and xml files

In this example we will compare reading bin and xml files in Engee and MATLAB.

Let's start with binary files (bin). They contain a raw data structure, making them difficult to interpret without special tools. Such files consist of a stream of bytes that can represent any data, be it text, images, audio or video. This data is stored in binary form, which requires special processing to extract the required information.

The main features of working with binary files are as follows.

  1. Data format: a binary file is a sequence of bytes. A single value usually spans several bytes, for example 4 bytes for an Int32 and 8 bytes for a Float64.
  2. Data interpretation: understanding the contents of a binary file requires knowledge of its internal structure. Without knowing the type, order and size of the stored values, a programme cannot interpret the file correctly.
  3. Data access: data is read and written as individual bytes or blocks of bytes. The programme must read the values back with the same types and in the same order in which they were written.
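
The point about interpretation can be illustrated with a minimal sketch that decodes the same four bytes under two different assumptions about their layout. The variable names are illustrative, and the order of the bytes printed depends on the machine's native byte order.

```julia
# Serialize one Int32 into an in-memory buffer and take the raw bytes.
io = IOBuffer()
write(io, Int32(12))
bytes = take!(io)                 # Vector{UInt8} holding 4 bytes

println(bytes)                    # raw bytes, in native byte order

# Decode the same 4 bytes in two different ways.
as_int32 = reinterpret(Int32, bytes)[1]      # one 32-bit integer
as_int16 = collect(reinterpret(Int16, bytes)) # two 16-bit integers

println(as_int32)
println(as_int16)
```

Only the first decoding recovers the value that was written. The bytes themselves carry no record of which interpretation is the intended one.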

Let's create a binary file for reading.

In [ ]:
Pkg.add(["EzXML"])
In [ ]:
open("file.bin", "w") do io
    write(io, Int32(12))   # Write an integer of type Int32
    write(io, Float64(3.14))  # Write a floating-point number of type Float64
    write(io, "Test\n")  # Write a string
end
Out[0]:
5

Reading a binary file in Engee:

In [ ]:
open("file.bin", "r") do io
    num_int = read(io, Int32)  # Read an integer of type Int32
    num_float = read(io, Float64)  # Read a floating-point number of type Float64
    str = String(read(io))  # Read the rest of the file as a string

    println("Integer: ", num_int)
    println("Floating-point number: ", num_float)
    print("String read: ", str)
end
Integer: 12
Floating-point number: 3.14
String read: Test
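
Because the layout is known, the file size and the byte offset of each value can be computed in advance. The following sketch writes the same three values to a scratch file (the path and the file name layout.bin are hypothetical), checks the size, and then jumps directly to the Float64 by skipping the 4-byte integer.

```julia
path = joinpath(mktempdir(), "layout.bin")   # hypothetical scratch file

open(path, "w") do io
    write(io, Int32(12))
    write(io, Float64(3.14))
    write(io, "Test\n")
end

# Expected size: 4 (Int32) + 8 (Float64) + 5 (bytes of "Test\n") = 17 bytes.
expected = sizeof(Int32) + sizeof(Float64) + sizeof("Test\n")
println(filesize(path) == expected)

# Skip the integer and read only the floating-point value.
value = open(path, "r") do io
    seek(io, sizeof(Int32))      # positions are counted from zero
    read(io, Float64)
end
println(value)
```

The offsets come only from the sizes of the types written earlier, which is why the reader has to agree with the writer on the exact sequence of types.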

Reading a binary file in MATLAB:

In [ ]:
using MATLAB
mat"""
fileID = fopen('file.bin', 'rb');
num_int = fread(fileID, 1, 'int32');
num_float = fread(fileID, 1, 'double');
str = fscanf(fileID, '%c');
fclose(fileID);
"""
Out[0]:
0.0
In [ ]:
mat"""fprintf('Integer read: ');
disp(num_int);"""
>> >> >> Integer read: >>     12

In [ ]:
mat"""fprintf('Floating-point number read: ');
disp(num_float);"""
>> >> >> Floating-point number read: >>     3.1400

In [ ]:
mat"""disp(['String read: ', str]);"""
>> >> >> String read: Test

The two versions are structurally similar, but the Engee code is more compact: typed read and write calls replace the fopen/fread/fclose sequence, and the do block closes the file automatically. Engee's type names also state their width directly (Int32, Float64), so there is no separate "double" name to remember.
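
As a brief illustration of how the type names correspond, the sketch below checks that Julia's alias for the C double type is Float64 itself and that the value occupies 8 bytes. This is why the 'double' precision string passed to fread in the MATLAB cell decodes what Engee wrote as a Float64. The variable names are illustrative.

```julia
# Julia's C-interop alias for the C `double` type is Float64 itself.
println(Cdouble === Float64)
println(sizeof(Float64))          # size in bytes

# Round-trip a value through its raw bytes, as a file write and read would.
io = IOBuffer()
write(io, 3.14)
raw = take!(io)                   # 8 raw bytes
restored = reinterpret(Float64, raw)[1]
println(restored == 3.14)
```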

Now let's move on to XML files (xml). They are designed to store structured data in the form of a tree. Each element of an XML file has its own tag, which defines the type of data contained in this element. This makes working with such files easier, as programs can analyse the structure of the file and retrieve the required data.

The main features of working with XML files are as follows.

  1. Data structure: an XML file is a tree in which each element (tag) describes the type and structure of the data it contains.
  2. Hierarchical structure: every element except the root has exactly one parent, and any element can have child elements.
  3. Markup: an XML file includes markup that defines how the data is organised, so a programme can locate and process each element by its name.
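
These properties can be seen by walking a small tree with EzXML. The sketch below assumes the EzXML package is installed. It parses a document from a string, then visits each genus element and its species children, reading attributes by name. The variable names are illustrative.

```julia
using EzXML

doc = parsexml("""
<primates>
    <genus name="Homo">
        <species name="sapiens">Human</species>
    </genus>
    <genus name="Pan">
        <species name="paniscus">Bonobo</species>
        <species name="troglodytes">Chimpanzee</species>
    </genus>
</primates>
""")

top = root(doc)                   # the single root element
println(nodename(top))

names = String[]
for genus in eachelement(top)             # children of the root
    for species in eachelement(genus)     # children of each genus
        # Attributes are read by name; nodecontent returns the element text.
        full = string(genus["name"], " ", species["name"],
                      " (", nodecontent(species), ")")
        push!(names, full)
        println(full)
    end
end
```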

The simplest way to read such a file is as a plain string. For detailed analysis of the structure, Engee provides the EzXML library. MATLAB's xmlread function, by contrast, returns a Java DOM object, so working with the tree there means calling the Java API.

In [ ]:
Pkg.add("EzXML")
using EzXML
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`
In [ ]:
using EzXML
doc = parsexml("""
<primates>
    <genus name="Homo">
        <species name="sapiens">Human</species>
    </genus>
    <genus name="Pan">
        <species name="paniscus">Bonobo</species>
        <species name="troglodytes">Chimpanzee</species>
    </genus>
</primates>
""")
# Save the document to a file
write("file.xml", doc)
Out[0]:
290

Reading an XML file in Engee:

In [ ]:
open("file.xml", "r") do io
    str = String(read(io))
    println(str)
end
<?xml version="1.0" encoding="UTF-8"?>
<primates>
    <genus name="Homo">
        <species name="sapiens">Human</species>
    </genus>
    <genus name="Pan">
        <species name="paniscus">Bonobo</species>
        <species name="troglodytes">Chimpanzee</species>
    </genus>
</primates>
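
Reading the file as a string returns the markup unchanged. When individual values are needed, the file can instead be parsed into a tree and queried. The following is a minimal sketch assuming EzXML is installed. It writes a small document to a hypothetical scratch file, parses it with readxml, and selects elements with an XPath expression.

```julia
using EzXML

path = joinpath(mktempdir(), "primates.xml")   # hypothetical scratch file
write(path, """
<primates>
    <genus name="Pan">
        <species name="paniscus">Bonobo</species>
        <species name="troglodytes">Chimpanzee</species>
    </genus>
</primates>
""")

doc = readxml(path)               # parse the file into a document tree

# XPath: every species element inside the genus whose name attribute is "Pan".
species = findall("//genus[@name='Pan']/species", doc)
common_names = [nodecontent(s) for s in species]
println(common_names)
```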

Reading an XML file in MATLAB:

In [ ]:
mat"""
fileID = fopen('file.xml', 'rb');
disp(fscanf(fileID, '%c')); fclose(fileID);
"""
>> >> >> >> <?xml version="1.0" encoding="UTF-8"?>
<primates>
    <genus name="Homo">
        <species name="sapiens">Human</species>
    </genus>
    <genus name="Pan">
        <species name="paniscus">Bonobo</species>
        <species name="troglodytes">Chimpanzee</species>
    </genus>
</primates>

Conclusion

In this example we wrote and read binary files and XML documents in both environments. The comparison shows that Engee needs less code for the same operations and names its data types more explicitly.