Engee documentation

Creating arrays of strings

Arrays of strings store text fragments and provide a set of functions for working with them. Arrays of strings can be indexed, reshaped, and combined in the same way as arrays of any other type. In this article, we will look at some functions for working with string arrays.


Each element of a string array is a string, that is, a sequence of n characters.

In [ ]:
str = "Hello, world"
Out[0]:
"Hello, world"

Let's create a matrix of strings using square brackets []. Spaces separate the elements within a row, and a semicolon starts a new row.

In [ ]:
satellites = ["Ganymede" "Europa" "Callisto";"Amalthea" "Rings of Jupiter" "Leda"]
Out[0]:
2×3 Matrix{String}:
 "Ganymede"  "Europa"            "Callisto"
 "Amalthea"  "Rings of Jupiter"  "Leda"

Arrays of strings support indexing. Let's use indexing to access the first row of the satellites matrix.

In [ ]:
satellites[1,:]
Out[0]:
3-element Vector{String}:
 "Ganymede"
 "Europa"
 "Callisto"

Let's access the second element in the second row of satellites.

In [ ]:
satellites[2,2]
Out[0]:
"Rings of Jupiter"

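Ranges, the colon, and end can be used with string arrays just as with arrays of other types. The following is a minimal sketch that rebuilds the satellites matrix from above; the variable names first_column, last_two, and last_element are illustrative only.

```julia
satellites = ["Ganymede" "Europa" "Callisto"; "Amalthea" "Rings of Jupiter" "Leda"]

# All rows of the first column
first_column = satellites[:, 1]

# Columns 2 through the last one in the first row
last_two = satellites[1, 2:end]

# Linear indexing walks the matrix column by column,
# so a single `end` refers to the bottom-right element
last_element = satellites[end]
```

Indexing with a colon or a range returns a vector, while indexing with two integers returns a single string.
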
You can determine the size of a given matrix using the function size().

In [ ]:
size(satellites)
Out[0]:
(2, 3)

You can find the number of array elements using the function length().

In [ ]:
length(satellites)
Out[0]:
6

You can also find the number of characters in each element of the string array. A dot placed before the parentheses broadcasts the function, that is, length() is applied to each element of the array separately.

In [ ]:
length.(satellites)
Out[0]:
2×3 Matrix{Int64}:
 8   6  8
 8  16  4
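
The same dot syntax works with other string functions. Here is a short sketch, assuming the original satellites matrix; the names upper_names, starts_with_c, and short_names are chosen for illustration.

```julia
satellites = ["Ganymede" "Europa" "Callisto"; "Amalthea" "Rings of Jupiter" "Leda"]

# Convert every element to upper case
upper_names = uppercase.(satellites)

# Check which elements start with the letter "C";
# the string "C" is treated as a scalar and reused for every element
starts_with_c = startswith.(satellites, "C")

# Keep only the first three characters of every element
short_names = first.(satellites, 3)
```

Each result has the same 2×3 shape as satellites, because broadcasting preserves the shape of the array.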

You can convert values of other types to a string using the function string(). For example, let's get the current date and time and convert the value to a string.

In [ ]:
using Dates

d = now()
string(d)
Out[0]:
"2024-05-20T12:28:05.120"

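The function string() also accepts several arguments and can be broadcast over an array. The sketch below shows both cases and, as one possible alternative for dates, Dates.format() with an explicit layout; the fixed timestamp and the variable names are assumptions made so that the result is reproducible.

```julia
using Dates

# Several arguments are converted to text and concatenated
label = string("x = ", 42, ", y = ", 1.5)

# Broadcasting converts every element of a numeric array
as_text = string.([1, 2, 3])

# Dates.format gives control over the layout of a date;
# a fixed timestamp is used here instead of now()
stamp = DateTime(2024, 5, 20, 12, 28, 5)
formatted = Dates.format(stamp, "yyyy-mm-dd HH:MM:SS")
```
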
Creating empty and missing strings

Arrays of strings can contain both empty strings and missing values. An empty string contains zero characters; when displayed, it appears as a pair of double quotes with nothing inside (""). A missing value plays the same role in a string array as NaN does in a numeric array: it marks the positions where values are absent. When a missing value is displayed, the result is missing.

You can create an empty string using the function String().

In [ ]:
str = String("")
Out[0]:
""

You can create a matrix of empty strings, for example, using the function fill().

In [ ]:
str = fill("",(2,3))
Out[0]:
2×3 Matrix{String}:
 ""  ""  ""
 ""  ""  ""

To create a missing string, assign the value missing to a variable.

In [ ]:
str = missing
Out[0]:
missing

You can create an array of strings that contains both empty strings and missing values.

In [ ]:
str = ["" "Ram" missing]
Out[0]:
1×3 Matrix{Union{Missing, String}}:
 ""  "Ram"  missing

Use the function ismissing() to determine which elements of the array are missing. Note that an empty string is not a missing value.

In [ ]:
ismissing.(str)
Out[0]:
1×3 BitMatrix:
 0  0  1
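
Once the missing elements are found, they usually need to be replaced or skipped. The following sketch shows one possible way to do this with coalesce() and skipmissing(); the vector labels and the placeholder text "n/a" are assumptions for illustration.

```julia
labels = ["", "Ram", missing]

# Replace every missing value with a default string
filled = coalesce.(labels, "n/a")

# Collect only the values that are not missing
present = collect(skipmissing(labels))

# Emptiness is checked on the non-missing values
empty_flags = isempty.(present)
```

After coalesce.() the array no longer contains missing values, so string functions can be applied to every element.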

Let's find which elements of satellites contain a space using the function occursin(), and then replace the spaces with hyphens using the function replace().

In [ ]:
TF = occursin.(" ", satellites)
Out[0]:
2×3 BitMatrix:
 0  0  0
 0  1  0
In [ ]:
satellites = replace.(satellites, " " => "-")
display(satellites)
2×3 Matrix{String}:
 "Ganymede"  "Europa"            "Callisto"
 "Amalthea"  "Rings-of-Jupiter"  "Leda"

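The logical matrix returned by occursin.() can also be used as a mask. Here is a minimal sketch, assuming the original satellites matrix with spaces; the names has_space, with_space, n_with_space, and positions are illustrative.

```julia
satellites = ["Ganymede" "Europa" "Callisto"; "Amalthea" "Rings of Jupiter" "Leda"]

# Logical mask: true where the element contains a space
has_space = occursin.(" ", satellites)

# Logical indexing returns the selected elements as a vector
with_space = satellites[has_space]

# Number of matching elements and their positions in the matrix
n_with_space = count(has_space)
positions = findall(has_space)
```
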
Splitting, combining, and sorting an array of strings

You can concatenate arrays of strings in the same way as arrays of any other type.

In [ ]:
str1 = ["a","b","c"];
str2 = ["d","e","f"];
str3 = ["g","h","i"];
str = [str1 str2 str3]
Out[0]:
3×3 Matrix{String}:
 "a"  "d"  "g"
 "b"  "e"  "h"
 "c"  "f"  "i"

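The separator inside the square brackets determines the direction of concatenation. The sketch below contrasts the two cases for vectors; the names stacked and side_by_side are illustrative.

```julia
str1 = ["a", "b", "c"]
str2 = ["d", "e", "f"]

# A semicolon stacks the vectors into one longer vector,
# the same as vcat(str1, str2)
stacked = [str1; str2]

# A space places the vectors side by side as columns,
# the same as hcat(str1, str2)
side_by_side = [str1 str2]
```
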
The function permutedims() allows you to transpose matrices with string elements.

In [ ]:
str = permutedims(str)
Out[0]:
3×3 Matrix{String}:
 "a"  "b"  "c"
 "d"  "e"  "f"
 "g"  "h"  "i"

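The function permutedims() can also be applied to a vector, in which case it returns a matrix with a single row. A small sketch with a hypothetical vector letters:

```julia
letters = ["a", "b", "c"]

# A 3-element vector becomes a 1×3 matrix
row = permutedims(letters)
```

The Julia documentation describes transpose() as intended for linear algebra and points to permutedims() for general data manipulation, which is why permutedims() is used for string arrays here.
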
To append text to strings, use the operator *. It appends text to each string but does not change the size of the string array.

In [ ]:
Name = ["Mary", "John", "Elizabeth", "Paul", "Ann"]
Name = [name * " Smith" for name in Name]
Out[0]:
5-element Vector{String}:
 "Mary Smith"
 "John Smith"
 "Elizabeth Smith"
 "Paul Smith"
 "Ann Smith"

With the broadcast form .* you can also concatenate arrays of strings element by element. For example, let's combine arrays of first and last names.

In [ ]:
Name = ["Mary", "John", "Elizabeth", "Paul", "Ann"];
Lastname = ["Jones", "Adams", "Young", "Burns", "Spencer"];
full_names = Name .* " " .* Lastname
Out[0]:
5-element Vector{String}:
 "Mary Jones"
 "John Adams"
 "Elizabeth Young"
 "Paul Burns"
 "Ann Spencer"

The reverse operation is performed by the function split(). It splits each string element of an array into substrings, using whitespace as the delimiter by default.

In [ ]:
full_names = split.(full_names)
Out[0]:
5-element Vector{Vector{SubString{String}}}:
 ["Mary", "Jones"]
 ["John", "Adams"]
 ["Elizabeth", "Young"]
 ["Paul", "Burns"]
 ["Ann", "Spencer"]
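
The result of split.() is a vector of vectors, so the individual parts can be extracted with first.() and last.(). The sketch below also shows split() with an explicit delimiter and join() as the inverse operation; the sample data and variable names are assumptions for illustration.

```julia
full_names = ["Mary Jones", "John Adams"]

# Split every element on whitespace
parts = split.(full_names)

# Take the first and the last part of every element
first_names = first.(parts)
last_names = last.(parts)

# Split a single string on an explicit delimiter
fields = split("Ganymede,Europa,Callisto", ",")

# join() glues the parts back together with the given separator
rejoined = join(fields, "; ")
```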

The function sort() allows you to sort string elements.

In [ ]:
sort(Name)
Out[0]:
5-element Vector{String}:
 "Ann"
 "Elizabeth"
 "John"
 "Mary"
 "Paul"

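The function sort() returns a new array and leaves the original unchanged. The following sketch shows a few common variations using the rev and by keyword arguments and the in-place version sort!(); the names descending, by_length, and sorted_in_place are illustrative.

```julia
Name = ["Mary", "John", "Elizabeth", "Paul", "Ann"]

# Reverse alphabetical order
descending = sort(Name, rev=true)

# Order by the number of characters instead of alphabetically
by_length = sort(Name, by=length)

# sort! modifies the array it is given, so a copy is sorted here
sorted_in_place = copy(Name)
sort!(sorted_in_place)
```
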
Besides the functions shown in these examples, there are many others for working with string arrays. You can learn more in the [Arrays](https://engee.com/helpcenter/stable/julia/base/arrays.html) section.