
Working with binary strings

This example shows how to work with binary strings in the Julia programming language: creating them, manipulating them, and performing basic string operations.

Introduction

What are binary strings and what are they used for?

Binary strings are sequences of bytes that can contain both text data and arbitrary binary data. Unlike regular text strings, binary strings can include non-printable characters, null bytes, and other special values. They are widely used for working with files, network protocols, data serialization, and other tasks where precise byte-level control over data contents is required.

Main part

Creating strings

In [ ]:
a = "123\x00 abc "
b = "456" * '\x09'
c = "789"
println(a)
println(b)
println(c)
123 abc 
456	
789

In this block, we create three strings:

  • String a contains the text "123" followed by a null byte (\x00), then a space, the letters "abc", and another space
  • String b is created by concatenating the string "456" with the tab character (\x09)
  • String c contains the plain string "789"
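
Because the null byte and the tab are invisible when printed, it can help to look at the raw bytes. The following is a minimal sketch using the standard functions codeunits, ncodeunits and repr; the name bytes_a is introduced here only for illustration.

```julia
# Same strings as in the cell above
a = "123\x00 abc "
b = "456" * '\x09'

# codeunits exposes the underlying UTF-8 bytes; collect copies them into a Vector{UInt8}
bytes_a = collect(codeunits(a))
println(bytes_a)

# ncodeunits counts bytes; the null byte is counted like any other byte
println(ncodeunits(a))   # 9
println(ncodeunits(b))   # 4

# repr shows escape sequences instead of the invisible characters themselves
println(repr(b))         # "456\t"
```

Printing with repr is often the easiest way to confirm that a non-printable byte is really present in a string.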

String comparison

In [ ]:
println("(a == b) is $(a == b)")
(a == b) is false

Here we compare the strings a and b for equality. The == operator checks whether two strings contain the same characters in the same order. Since a and b contain different data, the result is false.
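
For binary data it is worth noting that the comparison covers the whole string, including anything after an embedded null byte. A small sketch, using the hypothetical strings s1 and s2 and the standard function cmp for three-way comparison:

```julia
# Two strings that differ only after an embedded null byte
s1 = "abc\x00def"
s2 = "abc\x00xyz"

# == compares the full contents; the null byte does not terminate the string
println(s1 == s2)      # false

# cmp performs a three-way lexicographic comparison and returns -1, 0 or 1
println(cmp(s1, s2))   # -1, because 'd' sorts before 'x'
println(cmp(s1, s1))   # 0
```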

Copying strings

In [ ]:
A = a
B = b
C = c
println(A)
println(B)
println(C)
123 abc 
456	
789

In this block, we assign the strings a, b and c to the new variables A, B and C. Assignment in Julia binds a new name to the same object, so no new copy is made in memory. Because strings are immutable, sharing them this way is safe: neither name can be used to change the string the other refers to.
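
The effect of sharing can be illustrated with a short sketch. It also shows one possible way to obtain an independent, mutable copy of the bytes by converting the string to a Vector{UInt8}; the name bytes is an assumption made for this example.

```julia
a = "123\x00 abc "
A = a                      # A and a now name the same string

a = a * 'd'                # builds a new string and rebinds a; A is unaffected
println(repr(a))
println(repr(A))

# Converting to a byte vector gives a separate, mutable copy of the data
bytes = Vector{UInt8}(a)
bytes[1] = UInt8('X')      # modifies only the vector, not the string a
println(a[1])              # 1
println(Char(bytes[1]))    # X
```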

Checking the string for emptiness

In [ ]:
if length(a) == 0
    println("string a is empty")
else
    println("string a is not empty")
end
string a is not empty

We check whether the string a is empty using the function length(), which returns the number of characters in a string. If the length is zero, the string is empty; otherwise it is not.
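
As an alternative sketch, the standard function isempty expresses the same check directly. The example also contrasts length (characters) with ncodeunits (bytes), which matters once the data is not plain ASCII; the variable s is hypothetical.

```julia
a = "123\x00 abc "

println(isempty(a))        # false
println(isempty(""))       # true

# A string holding only a null byte still contains one byte, so it is not empty
println(isempty("\x00"))   # false

# length counts characters, ncodeunits counts bytes
s = "\u00e9"               # the character 'é': one character, two bytes in UTF-8
println(length(s))         # 1
println(ncodeunits(s))     # 2
```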

Adding a character to a string

In [ ]:
a = a * '\x64'
println(a)
123 abc d

We append the character \x64 (the letter 'd' in ASCII) to the end of the string a using the concatenation operator *. Note that a Julia Char is a 32-bit value that can represent any Unicode character.
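
The sketch below shows a few equivalent ways to write the same character, plus one possible approach for appending raw bytes through an IOBuffer, which may be preferable when many pieces are appended in a row. The names io and result are assumptions for this example.

```julia
# Three spellings of the same character
println('\x64' == 'd')      # true
println(Char(0x64) == 'd')  # true
println(sizeof(Char))       # 4 (bytes)

a = "123\x00 abc "

# string(...) concatenates its arguments, just like *
println(string(a, 'd') == a * 'd')   # true

# Appending via an IOBuffer: write the string, then a single raw byte
io = IOBuffer()
write(io, a)
write(io, 0x64)             # a UInt8 is written as one byte
result = String(take!(io))
println(result == a * 'd')  # true
```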

Extracting a substring

In [ ]:
e = a[1:6]
println(e)
123 a

We extract a substring from the string a, from index 1 through index 6 (inclusive). In Julia, indexing starts at 1, not 0 as in some other programming languages. String indices refer to byte positions; for ASCII data such as this, they coincide with character positions.
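
When a string contains multi-byte characters, byte indices and character positions no longer line up. The following sketch uses a hypothetical string s to show the difference, along with first for character-based slicing and codeunits for raw byte slicing.

```julia
s = "a\u00f1b"    # "añb": the character 'ñ' occupies bytes 2 and 3

# Index 2 is where 'ñ' starts, so this range is valid and includes the whole character
println(s[1:2] == "a\u00f1")   # true

# s[1:3] would throw a StringIndexError, because byte 3 is in the middle of 'ñ'

# first(s, n) takes the first n characters regardless of their byte width
println(first(s, 2) == "a\u00f1")   # true

# For a raw byte slice, index the code units instead of the string
raw = codeunits(s)[1:3]
println(raw)
```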

Repeating strings

In [ ]:
b4 = b ^ 4
println(b4)
456	456	456	456	

We repeat the string b four times using the exponentiation operator ^. This is a convenient way to create repetitive character sequences.
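
The ^ operator follows from * being used for concatenation: repeating a string is "multiplying" it by itself. A brief sketch showing that the standard function repeat gives the same result:

```julia
b = "456\t"

# repeat(s, n) and s ^ n produce the same string
println(repeat(b, 4) == b ^ 4)   # true

# Each repetition adds 4 bytes ("456" plus the tab)
println(ncodeunits(b ^ 4))       # 16

# repeat also accepts a single character
println(repeat('-', 10))         # ----------
```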

Replacing substrings

In [ ]:
r = replace(b4, "456" => "xyz")
println(r)
xyz	xyz	xyz	xyz	

We replace all occurrences of the substring "456" with "xyz" in the string b4 using the function replace(). The => operator creates a Pair that maps the pattern to its replacement.
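
As a small sketch of two variations: the count keyword of replace limits how many occurrences are replaced, and the pattern may also be a single character such as the tab. The names first_only and spaced are introduced here for illustration.

```julia
b4 = "456\t" ^ 4

# Replace only the first occurrence
first_only = replace(b4, "456" => "xyz"; count = 1)
println(repr(first_only))

# Replace a non-printable character: every tab becomes a space
spaced = replace(b4, '\t' => ' ')
println(repr(spaced))

# replace returns a new string; the original is unchanged
println(b4 == "456\t456\t456\t456\t")   # true
```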

Combining strings

In [ ]:
d = a * b * c
println(d)
123 abc d456	789

We combine the strings a, b and c into a single string d using the concatenation operator *. All three strings are joined into one sequence of characters.
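
Julia offers several equivalent ways to build the same result. The sketch below compares *, string, join and interpolation; the values of a, b and c are written out so the block runs on its own, and with_sep is a hypothetical name.

```julia
a = "123\x00 abc d"
b = "456\t"
c = "789"

d = a * b * c

# All of these produce the same string as a * b * c
println(string(a, b, c) == d)   # true
println(join([a, b, c]) == d)   # true
println("$a$b$c" == d)          # true

# join can also insert a separator between the parts
with_sep = join([a, b, c], "|")
println(repr(with_sep))
```

The join form is convenient when the parts are already in a collection or when a separator is needed.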

Conclusion

In this example, we looked at the basic operations of working with binary strings in Julia: creating strings with various characters (including non-printable ones), comparing strings, copying, checking for emptiness, adding characters, extracting substrings, repeating strings, replacing substrings, and combining strings. We've learned how to work with different types of characters, including null bytes, tab characters, and other special characters. These skills are useful for working with files, network protocols, and other tasks that require precise control over binary data.

The example was developed using materials from Rosetta Code