Engee documentation

Strings

The AbstractString type is the supertype of all string implementations in Julia. Strings are encodings of sequences of Unicode (https://unicode.org/) code points, as represented by the AbstractChar type. Julia makes a few assumptions about strings:

  • Strings are encoded as fixed-size "code units".

    • Code units can be extracted using codeunit(s, i).

    • The first code unit has index 1.

    • The last code unit has the index ncodeunits(s).

    • Any index i such that 1 ≤ i ≤ ncodeunits(s) is within the bounds.

  • Indexing of strings is performed in terms of these code units.

    • Characters are extracted using s[i] with a valid string index of i.

    • Each AbstractChar in a string is encoded by one or more code units.

    • Only the index of the first code unit of an AbstractChar is a valid index.

    • The encoding of an AbstractChar is independent of what precedes or follows it.

    • String encodings are self-synchronizing (https://en.wikipedia.org/wiki/Self-synchronizing_code), i.e. isvalid(s, i) is O(1).

Some string functions that extract code units, characters, or substrings from strings throw errors if they are passed out-of-bounds or invalid string indexes. These include codeunit(s, i) and s[i]. Functions that do string index arithmetic take a more relaxed approach to indexing and return the closest valid string index when the index is in bounds. When it is out of bounds, they behave as if there were an infinite number of characters padding each side of the string. Usually these imaginary padding characters have a code unit length of 1, but string types may choose different "imaginary" character sizes if that makes sense for their implementations (for example, substrings may pass index arithmetic through to the underlying string they provide a view into). Relaxed indexing functions include those intended for index arithmetic: thisind, nextind, and prevind. This model allows index arithmetic to work with out-of-bounds indexes as intermediate values, as long as they are never used to retrieve a character, which often helps avoid the need to code around edge cases.

See also codeunit, ncodeunits, thisind, nextind, prevind.
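The relaxed index arithmetic described above can be illustrated with a short sketch. It assumes a UTF-8 String in which each Greek letter occupies two code units; the variable names are arbitrary.

```julia
s = "αβγ"                      # three characters, six UTF-8 code units

# Valid character indexes are the first code unit of each character.
valid = [i for i in 1:ncodeunits(s) if isvalid(s, i)]

# thisind snaps an in-bounds index back to the start of its character.
start_of_beta = thisind(s, 4)

# nextind and prevind accept the out-of-bounds values 0 and ncodeunits(s) + 1,
# so a loop can step past either end without special-casing it.
first_index = nextind(s, 0)
last_index = prevind(s, ncodeunits(s) + 1)
past_end = nextind(s, last_index)

println(valid)
println((start_of_beta, first_index, last_index, past_end))
```

The out-of-bounds results are only intermediate values here: using past_end in s[past_end] would still throw.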

The AbstractChar type is the supertype of all character implementations in Julia. A character represents a Unicode code point and can be converted to an integer with the codepoint function to obtain the numeric value of the code point, or constructed from the same integer. These numeric values determine how characters are compared with < and ==, for example. A new type T <: AbstractChar should define at least a codepoint(::T) method and a T(::UInt32) constructor.

A given AbstractChar subtype may be able to represent only a subset of Unicode, in which case converting an unsupported UInt32 value throws an error. Conversely, the built-in Char type represents a superset of Unicode (which is needed to losslessly encode byte streams that contain invalid bytes), in which case converting a non-Unicode value to UInt32 throws an error. The isvalid function can be used to check which code points are representable in a given AbstractChar type.

Internally, an AbstractChar type may use a variety of encodings. Conversion with codepoint(char) does not expose this encoding, because it always returns the Unicode code point of the character. Calling print(io, c) for any c::AbstractChar produces an encoding determined by io (UTF-8 for all built-in IO types), converting to Char if necessary.

In contrast, write(io, c) may emit an encoding that depends on typeof(c), and read(io, typeof(c)) should read the same encoding as write. New AbstractChar types must provide their own implementations of write and read.

Char(c::Union{Number,AbstractChar})

Char is a 32-bit AbstractChar type that is the default representation of characters in Julia. Char is the type used for character literals such as 'x', and it is also the element type of String.

To losslessly represent arbitrary byte streams stored in a String, a Char value may hold information that cannot be converted to a Unicode code point; converting such a Char to UInt32 throws an error. The isvalid(c::Char) function can be used to check whether c represents a valid Unicode character.

codepoint(c::AbstractChar) -> Integer

Returns the Unicode code point (an unsigned integer) corresponding to the character c (or throws an exception if c does not represent a valid character). For Char, this is a UInt32 value, but AbstractChar types that represent only a subset of Unicode may return an integer of a different size (for example, UInt8).
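As a brief illustration of these conversions (a sketch using only Base functions, not an exhaustive treatment), the byte 0xff below is chosen as an example of data that is not valid UTF-8:

```julia
c = 'α'
cp = codepoint(c)              # UInt32 value of the code point, here U+03B1
roundtrip = Char(cp) == c      # a Char can be rebuilt from its code point

# Indexing a String that holds an invalid byte yields a Char that is not
# valid Unicode; codepoint throws for such a value.
bad = "\xff"[1]
bad_is_valid = isvalid(bad)
threw = try
    codepoint(bad)
    false
catch
    true
end

println((cp, roundtrip, bad_is_valid, threw))
```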

length(s::AbstractString) -> Int
length(s::AbstractString, i::Integer, j::Integer) -> Int

Returns the number of characters in the string s from index i to index j.

This is computed as the number of code unit indexes from i to j that are valid character indexes. With only a single string argument, this computes the number of characters in the entire string. With the arguments i and j, it computes the number of indexes between i and j inclusive that are valid indexes in the string s. In addition to in-bounds values, i may take the out-of-bounds value ncodeunits(s) + 1 and j may take the out-of-bounds value 0.

The time complexity of this operation is linear in general: the time it takes is proportional to the number of bytes or characters in the string, because the value is counted on the fly. This is in contrast to the length method for arrays, which is a constant-time operation.

See also isvalid, ncodeunits, lastindex, thisind, nextind, prevind.

Examples

julia> length("jμΛIα")
5
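The three-argument form can be sketched as follows. The example assumes a UTF-8 String whose characters each take two code units, so the valid character indexes are 1, 3, and 5.

```julia
s = "αβγ"                      # valid character indexes: 1, 3, 5

total = length(s)              # characters in the whole string
units = ncodeunits(s)          # code units (bytes for a String)

# Count the valid character indexes that fall inside each index range.
first_two = length(s, 1, 4)
last_two = length(s, 3, ncodeunits(s))

println((total, units, first_two, last_two))
```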
sizeof(str::AbstractString)

Returns the size of the string str in bytes. This is equal to the number of code units in str multiplied by the size (in bytes) of one code unit in str.

Examples

julia> sizeof("")
0

julia> sizeof("∀")
3
*(s::Union{AbstractString, AbstractChar}, t::Union{AbstractString, AbstractChar}...) -> AbstractString

Concatenates strings and/or characters, producing a String or an AnnotatedString (as appropriate). This is equivalent to calling the string or annotatedstring function with the same arguments. Concatenation of built-in string types always produces a value of type String, but other string types may choose to return a string of a different type.

Examples

julia> "Hello " * "world"
"Hello world"

julia> 'j' * "ulia"
"julia"
^(s::Union{AbstractString,AbstractChar}, n::Integer) -> AbstractString

Repeats a string or character n times. This can also be written as repeat(s, n).

See also repeat.

Examples

julia> "Test "^3
"Test Test Test "
string(n::Integer; base::Integer = 10, pad::Integer = 1)

Converts an integer n to a string in the given base, optionally specifying the number of digits to pad to.

See also digits, bitstring, and count_zeros.
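A few further conversions, shown as a sketch of how base and pad interact (the variable names are arbitrary):

```julia
hex = string(255, base = 16)           # hexadecimal digits, lowercase
bits = string(5, base = 2, pad = 8)    # zero-padded to eight binary digits
neg = string(-5, base = 2, pad = 8)    # the minus sign is not counted in the padding

# parse with the same base reverses the conversion.
back = parse(Int, hex, base = 16)

println((hex, bits, neg, back))
```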

Examples

julia> string(5, base = 13, pad = 4)
"0005"

julia> string(-13, base = 5, pad = 4)
"-0023"

string(xs...)

Creates a string from any values using the print function.

string should usually not be defined directly. Instead, define a print(io::IO, x::MyType) method. If string(x) for a certain type needs to be highly efficient, it may make sense to add a method to string and define print(io::IO, x::MyType) = print(io, string(x)) to ensure the functions are consistent.

See also String, repr, sprint, show.
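The recommendation to define print rather than string can be sketched as follows. Celsius is a hypothetical type introduced only for this illustration.

```julia
# Celsius is a hypothetical type used only for this illustration.
struct Celsius
    degrees::Float64
end

# Define print; string and string interpolation pick it up automatically.
Base.print(io::IO, t::Celsius) = print(io, t.degrees, " °C")

as_string = string(Celsius(21.5))
interpolated = "Today: $(Celsius(21.5))"
joined = string("T = ", Celsius(-3.0), "!")

println(as_string)
println(interpolated)
println(joined)
```

Because string is built on print, no separate string method is needed in this sketch.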

Examples

julia> string("a", 1, true)
"a1true"
repeat(s::AbstractString, r::Integer)

Repeats the string r times. This can be written as s^r.

See also ^.

Examples

julia> repeat("ha", 3)
"hahaha"
repeat(c::AbstractChar, r::Integer) -> String

Repeats the character r times. This can also be done by calling c^r.

Examples

julia> repeat('A', 3)
"AAA"
repr(x; context=nothing)

Creates a string from any value using the show function. You should not add methods to repr; define a show method instead.

The optional keyword argument context can be set to a :key=>value pair, a tuple of :key=>value pairs, or an IO or IOContext object whose attributes are used for the I/O stream passed to show.

Note that repr(x) is usually similar to how the value of x would be entered in Julia. To get a formatted version of x that is easier to read, call repr(MIME("text/plain"), x) instead; this is the form in which the REPL displays the value of x.

Compatibility: Julia 1.7

Passing a tuple to the context keyword argument requires Julia 1.7 or later.
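The difference between string (built on print) and repr (built on show) can be seen in a small sketch; the sample value is arbitrary:

```julia
x = "a\tb"

plain = string(x)      # the characters themselves, including a real tab
quoted = repr(x)       # Julia source form: surrounding quotes and an escaped \t

# Because repr produces source-like text, parsing and evaluating it
# gives back an equal value for simple cases such as this one.
roundtrip = eval(Meta.parse(quoted)) == x

println(length(plain))
println(quoted)
println(roundtrip)
```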

Examples

julia> repr(1)
"1"

julia> repr(zeros(3))
"[0.0, 0.0, 0.0]"

julia> repr(big(1/3))
"0.333333333333333314829616256247390992939472198486328125"

julia> repr(big(1/3), context=:compact => true)
"0.333333"
String(s::AbstractString)

Creates a new String from an existing AbstractString.

SubString(s::AbstractString, i::Integer, j::Integer=lastindex(s))
SubString(s::AbstractString, r::UnitRange{<:Integer})

Like getindex, but returns a view into the parent string s within the range i:j or r respectively, instead of making a copy.

The macro @views converts any slices of strings s[i:j] into substrings SubString(s, i, j) in the code block.
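The contrast between copying slices and views can be sketched as follows (variable names are arbitrary):

```julia
s = "hello world"

sub = SubString(s, 1, 5)          # view of the first five code units
copy_slice = s[7:11]              # ordinary slicing allocates a new String
view_slice = @views s[7:11]       # @views turns the slice into a SubString

println((sub, typeof(sub)))
println((typeof(copy_slice), typeof(view_slice)))
println(parent(sub) == s)         # the view still refers to the original string
```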

Examples

julia> SubString("abc", 1, 2)
"ab"

julia> SubString("abc", 1:2)
"ab"

julia> SubString("abc", 2)
"bc"
LazyString <: AbstractString

A lazy representation of string interpolation. This is useful when a string needs to be constructed in a context where performing the actual interpolation and string construction is unnecessary or undesirable (for example, in error paths of functions).

This type is designed to be cheap to construct at runtime, offloading as much work as possible either to the macro or to later printing operations.

Examples

julia> n = 5; str = LazyString("n is ", n)
"n is 5"

See also @lazy_str.

Compatibility: Julia 1.8

LazyString requires a Julia version of at least 1.8.

Extended help

Safety properties for concurrent programs

A lazy string itself does not introduce any concurrency problems, even if it is printed in multiple Julia tasks. However, if the print methods of a captured value can have a concurrency issue when invoked without synchronization, printing the lazy string may cause a problem. Furthermore, the print methods of the captured values may be invoked multiple times, though only one result will be returned.

Compatibility: Julia 1.9

LazyString is safe in the above sense in Julia 1.9 and later.

lazy"str"

Creates a LazyString using regular string interpolation syntax. Note that interpolations are evaluated when the LazyString is constructed, but printing is delayed until the first access to the string.

See the LazyString documentation for the safety properties for concurrent programs.

Examples

julia> n = 5; str = lazy"n is $n"
"n is 5"

julia> typeof(str)
LazyString
Compatibility: Julia 1.8

lazy"str" requires a Julia version of at least 1.8.
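A typical use in an error path might look like the sketch below. checked_sqrt is a hypothetical function introduced only for illustration, and the example assumes Julia 1.8 or later.

```julia
# checked_sqrt is a hypothetical function used only for this illustration.
function checked_sqrt(x::Real)
    # The message is only rendered if the error is actually displayed or inspected.
    x >= 0 || throw(ArgumentError(lazy"expected a non-negative value, got $x"))
    return sqrt(x)
end

ok = checked_sqrt(9.0)
msg = try
    checked_sqrt(-1)
    ""
catch err
    String(err.msg)    # forces the lazy string to be rendered
end

println(ok)
println(msg)
```

On the successful path, no message string is ever built.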

transcode(T, src)

Converts string data between Unicode encodings. src is either a String or a Vector{UIntXX} of UTF-XX code units, where XX is 8, 16, or 32. T indicates the encoding of the return value: String to return a (UTF-8 encoded) String, or UIntXX to return a Vector{UIntXX} of UTF-XX data. (The alias Cwchar_t can also be used as the integer type, for converting wchar_t* strings used by external C libraries.)

The transcode function succeeds as long as the input data can be reasonably represented in the target encoding; it always succeeds for conversions between UTF-XX encodings, even for invalid Unicode data.

Currently, conversion is supported only to and from UTF-8 encoding.
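A round trip through UTF-16 can be sketched as follows. The emoji is chosen because it lies outside the Basic Multilingual Plane and therefore needs a surrogate pair in UTF-16.

```julia
s = "a😀"                          # U+1F600 needs two UTF-16 code units

utf16 = transcode(UInt16, s)       # one unit for 'a', a surrogate pair for the emoji
utf8 = transcode(UInt8, utf16)     # back to UTF-8 code units
roundtrip = transcode(String, utf16)

println(utf16)
println(utf8 == codeunits(s))
println(roundtrip == s)
```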

Examples

julia> str = "αβγ"
"αβγ"

julia> transcode(UInt16, str)
3-element Vector{UInt16}:
 0x03b1
 0x03b2
 0x03b3

julia> transcode(String, transcode(UInt16, str))
"αβγ"
unsafe_string(p::Ptr{UInt8}, [length::Integer])

Copies a string from the address of a C-style (NUL-terminated) string encoded as UTF-8. (The pointer can be safely freed afterwards.) If length is specified (the length of the data in bytes), the string does not have to be NUL-terminated.

This function is labeled "unsafe" because it will crash if p is not a valid memory address for data of the requested length.
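The two calling forms can be sketched with a byte buffer that stands in for memory returned by a C library. The buffer contents are an assumption made for illustration; GC.@preserve keeps the array alive while its raw pointer is in use.

```julia
bytes = UInt8[0x68, 0x69, 0x00, 0x21]   # "hi", a NUL terminator, then '!'

whole, prefix, with_nul = GC.@preserve bytes begin
    p = pointer(bytes)
    (unsafe_string(p),        # stops at the first NUL byte
     unsafe_string(p, 1),     # exactly one byte, no terminator needed
     unsafe_string(p, 4))     # an explicit length may span an embedded NUL
end

println((whole, prefix, ncodeunits(with_nul)))
```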

ncodeunits(s::AbstractString) -> Int

Returns the number of code units in a string. Indexes that are in bounds for accessing this string must satisfy 1 ≤ i ≤ ncodeunits(s). Not all such indexes are valid character indexes: an index may not be the start of a character, but calling codeunit(s, i) with it still returns a code unit value.

Examples

julia> ncodeunits("The Julia Language")
18

julia> ncodeunits("∫eˣ")
6

julia> ncodeunits('∫'), ncodeunits('e'), ncodeunits('ˣ')
(3, 1, 2)

See also codeunit, checkbounds, sizeof, length, lastindex.

codeunit(s::AbstractString) -> Type{<:Union{UInt8, UInt16, UInt32}}

Returns the type of code unit for the specified string object. For ASCII, Latin-1, or UTF-8 encoded strings, this will be the type UInt8, for UCS-2 and UTF-16, the type UInt16, and for UTF-32, the type UInt32. The possible types of code units are not limited to these three options, but almost all widely used string encodings use one of them. Calling codeunit(s) is equivalent to typeof(codeunit(s,1)) when s is a non-empty string.

See also ncodeunits.


codeunit(s::AbstractString, i::Integer) -> Union{UInt8, UInt16, UInt32}

Returns the value of the code unit in the string s at the index i. Please note that

codeunit(s, i) :: codeunit(s)

That is, the value returned by codeunit(s, i) has the type returned by codeunit(s).

Examples

julia> a = codeunit("Hello", 2)
0x65

julia> typeof(a)
UInt8

See also ncodeunits and checkbounds.

codeunits(s::AbstractString)

Returns a vector-like object containing the code units of a string. By default this is a CodeUnits wrapper, but codeunits may optionally be defined for new string types if necessary.

Examples

julia> codeunits("Juλia")
6-element Base.CodeUnits{UInt8, String}:
 0x4a
 0x75
 0xce
 0xbb
 0x69
 0x61
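The relationship between codeunit, ncodeunits, and codeunits can be sketched as follows (variable names are arbitrary):

```julia
s = "Juλia"

# Reading every code unit by index gives the same values as codeunits(s).
by_index = [codeunit(s, i) for i in 1:ncodeunits(s)]
same = by_index == codeunits(s)

# A String can be rebuilt from a copy of its code units.
rebuilt = String(collect(codeunits(s)))

unit_type = codeunit(s)            # the code unit type, UInt8 for String

println((same, rebuilt, unit_type))
```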
ascii(s::AbstractString)

Converts a string to the String type and checks that it contains only ASCII data; otherwise throws an ArgumentError indicating the position of the first non-ASCII byte.

See also the isascii predicate, which can be used to filter or replace non-ASCII characters.
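One way to combine ascii with isascii is sketched below. ascii_only is a hypothetical helper name, not part of Base.

```julia
# Hypothetical helper: return an ASCII-only String, dropping other characters.
ascii_only(s::AbstractString) = ascii(filter(isascii, s))

cleaned = ascii_only("abcdeγfgh")

# Calling ascii directly on non-ASCII input throws an ArgumentError.
failed = try
    ascii("abcdeγfgh")
    false
catch err
    err isa ArgumentError
end

println((cleaned, failed))
```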

Examples

julia> ascii("abcdeγfgh")
ERROR: ArgumentError: invalid ASCII at index 6 in "abcdeγfgh"
Stacktrace:
[...]

julia> ascii("abcdefgh")
"abcdefgh"
Regex(pattern[, flags]) <: AbstractPattern

A type representing a regular expression. Regex objects can be used to match strings with the match function.

Regex objects can be created using the @r_str string macro. The Regex(pattern[, flags]) constructor is usually used if the pattern string needs to be interpolated. See the documentation of the string macro for details on flags.

To escape the interpolated variables, use \Q and \E (for example, Regex("\\Q$x\\E")).
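The effect of \Q and \E on interpolated text can be sketched as follows. The variable literal stands in for user-supplied input; the sketch assumes that input does not itself contain the sequence \E.

```julia
literal = "a.b"                        # text containing a regex metacharacter

unescaped = Regex(literal)             # here '.' matches any character
escaped = Regex("\\Q$literal\\E")      # \Q...\E makes the text match literally

println(occursin(unescaped, "aXb"))
println(occursin(escaped, "aXb"))
println(occursin(escaped, "a.b"))

# Flags are passed as a string in the second argument.
ci = Regex("^$literal\$", "i")
println(occursin(ci, "A-B"))
```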

@r_str -> Regex

Creates a regular expression, such as r"^[a-z]*$", without interpolation or unescaping (except for the quotation mark ", which must still be escaped). The regular expression also accepts one or more flags, listed after the closing quotation mark, that change its behavior:

  • i enables case-insensitive matching.

  • m treats the ^ and $ tokens as matching the start and end of individual lines, as opposed to the whole string.

  • s allows the . modifier to match newlines.

  • x enables "comment mode": whitespace between regular expression characters is ignored, except when escaped with \, and # is treated as the start of a comment (which is ignored up to the end of the line).

  • a enables ASCII mode (disables the UTF and UCP modes). By default, the \B, \b, \D, \d, \S, \s, \W, and \w sequences match based on Unicode character properties. With this flag, these sequences match only ASCII characters. This also applies to \u, which emits the specified character value directly as a single byte and does not attempt to encode it as UTF-8. Importantly, this flag allows matching against invalid UTF-8 strings by treating both the pattern and the subject as plain bytes (as if they were ISO/IEC 8859-1, or Latin-1, bytes) rather than as character encodings. In this case, this flag is often combined with s. It can be refined further by starting the pattern with (*UCP) or (*UTF).

If interpolation is needed, see Regex.

Examples

julia> match(r"a+.*b+.*?d$"ism, "Goodbye,\nOh, angry,\nBad world\n")
RegexMatch("angry,\nBad world")

The first three flags are activated for this regular expression.
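The individual flags can also be tried in isolation. The following is a small sketch with arbitrary sample strings:

```julia
case_insensitive = occursin(r"hello"i, "HELLO")

dot_default = occursin(r"a.b", "a\nb")       # '.' does not match a newline by default
dot_with_s = occursin(r"a.b"s, "a\nb")       # with s, it does

line_anchors = match(r"^b$"m, "a\nb\nc")     # ^ and $ match at line boundaries

# With x, unescaped whitespace and #-comments in the pattern are ignored.
commented = occursin(r"a b  # spaces and this comment are ignored"x, "ab")

println((case_insensitive, dot_default, dot_with_s, commented))
println(line_anchors.match)
```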

SubstitutionString(substr) <: AbstractString

Stores the given string substr as a SubstitutionString, for use in regular expression substitutions. It is most commonly constructed using the @s_str macro.

Examples

julia> SubstitutionString("Hello \\g<name>, it's \\1")
s"Hello \g<name>, it's \1"

julia> subst = s"Hello \g<name>, it's \1"
s"Hello \g<name>, it's \1"

julia> typeof(subst)
SubstitutionString{String}
@s_str -> SubstitutionString

Constructs a substitution string, used for regular expression substitutions. Within the string, sequences of the form \N refer to the Nth capture group in the regex, and \g<groupname> refers to the named capture group with the name groupname.

Examples

julia> msg = "#Hello# from Julia";

julia> replace(msg, r"#(.+)# from (?<from>\w+)" => s"FROM: \g<from>; MESSAGE: \1")
"FROM: Julia; MESSAGE: Hello"
@raw_str -> String

Creates a raw string without interpolation or escaping. The only exception is that quotation marks still have to be escaped. Backslashes escape both quotation marks and other backslashes, but only when a sequence of backslashes precedes the quotation mark character. Thus, 2n backslashes followed by a quotation mark encode n backslashes and the end of the literal, and 2n+1 backslashes followed by a quotation mark encode n backslashes with a quotation mark after them.
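The backslash rule can be checked with a short sketch. The Windows-style path is only an arbitrary sample value.

```julia
path = raw"C:\Users\name"      # backslashes not followed by a quote are literal
same_as_escaped = path == "C:\\Users\\name"

single = raw"\\"               # two backslashes before the closing quote encode one
quote_only = raw"\""           # one backslash before a quote encodes the quote itself
dollar = raw"$HOME"            # no interpolation takes place

println(path)
println((same_as_escaped, length(single), length(quote_only), dollar))
```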

Examples

julia> println(raw"\ $x")
\ $x

julia> println(raw"\"")
"

julia> println(raw"\\\"")
\"

julia> println(raw"\\x \\\"")
\\x \"
@b_str

Creates an immutable byte (UInt8) vector using string syntax.

Examples

julia> v = b"12\x01\x02"
4-element Base.CodeUnits{UInt8, String}:
 0x31
 0x32
 0x01
 0x02

julia> v[2]
0x32
@html_str -> Docs.HTML

Creates an HTML object from a literal string.

Examples

julia> html"Julia"
HTML{String}("Julia")
@text_str -> Docs.Text

Creates a Text object based on a literal string.

Examples

julia> text"Julia"
Julia
isvalid(value) -> Bool

Returns true if the given value is valid for its type, which currently can be either AbstractChar, String, or SubString{String}.

Examples

julia> isvalid(Char(0xd800))
false

julia> isvalid(SubString(String(UInt8[0xfe,0x80,0x80,0x80,0x80,0x80]),1,2))
false

julia> isvalid(Char(0xd799))
true
isvalid(T, value) -> Bool

Returns true if the given value is valid for that type. Currently, the type can be either AbstractChar or String. Values for AbstractChar can be of type AbstractChar or UInt32. Values for String can be of that type, SubString{String}, Vector{UInt8}, or a contiguous subarray thereof.

Examples

julia> isvalid(Char, 0xd800)
false

julia> isvalid(String, SubString("thisisvalid",1,5))
true

julia> isvalid(Char, 0xd799)
true
Compatibility: Julia 1.6

Support for values in the form of a subarray appeared in Julia 1.6.

isvalid(s::AbstractString, i::Integer) -> Bool

A predicate indicating whether the given index is the start of the encoding of a character in s. If isvalid(s, i) is true, then s[i] returns the character whose encoding starts at that index. If it is false, then s[i] throws an invalid index error or a bounds error, depending on whether i is in bounds. For isvalid(s, i) to be an O(1) function, the encoding of s must be self-synchronizing (https://en.wikipedia.org/wiki/Self-synchronizing_code). This is a basic assumption of Julia's generic string support.

See also getindex, iterate, thisind, nextind, prevind, length.

Examples

julia> str = "αβγdef";

julia> isvalid(str, 1)
true

julia> str[1]
'α': Unicode U+03B1 (category Ll: Letter, lowercase)

julia> isvalid(str, 2)
false

julia> str[2]
ERROR: StringIndexError: invalid index [2], valid nearby indices [1]=>'α', [3]=>'β'
Stacktrace:
[...]
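To avoid the StringIndexError shown above, indexes can be filtered with isvalid or obtained from eachindex. The following is a sketch using the same sample string:

```julia
str = "αβγdef"

# Keep only the code unit indexes at which a character starts.
by_predicate = [i for i in 1:ncodeunits(str) if isvalid(str, i)]

# eachindex yields the same valid indexes directly.
by_eachindex = collect(eachindex(str))

chars = [str[i] for i in by_predicate]

println(by_predicate)
println(chars == collect(str))
```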
match(r::Regex, s::AbstractString[, idx::Integer[, addopts]])

Searches for the first match of the regular expression r in s and returns a RegexMatch object containing the match, or nothing if the match failed. The matching substring can be retrieved by accessing m.match, and the captured sequences can be retrieved by accessing m.captures. The optional idx argument specifies the index at which to start the search.

Examples

julia> rx = r"a(.)a"
r"a(.)a"

julia> m = match(rx, "cabac")
RegexMatch("aba", 1="b")

julia> m.captures
1-element Vector{Union{Nothing, SubString{String}}}:
 "b"

julia> m.match
"aba"

julia> match(rx, "cabac", 3) === nothing
true
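Because match returns nothing on failure, callers usually check the result before using it. A minimal sketch follows; parse_range is a hypothetical helper introduced for illustration.

```julia
# Hypothetical helper: extract "lo-hi" from a string as a range, or return nothing.
function parse_range(s::AbstractString)
    m = match(r"(\d+)-(\d+)", s)
    m === nothing && return nothing
    lo, hi = parse.(Int, m.captures)   # both groups matched, so no capture is nothing
    return lo:hi
end

found = parse_range("range 10-20")
missing_range = parse_range("no numbers here")

# The optional index argument starts the search further into the string.
later = match(r"\d+", "a1b22", 3)

println((found, missing_range, later.match))
```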
eachmatch(r::Regex, s::AbstractString; overlap::Bool=false)

Searches for all matches of the regular expression r in s and returns an iterator over the matches. If overlap is true, the matching sequences are allowed to overlap indexes in the original string; otherwise they must come from distinct character ranges.

Examples

julia> rx = r"a.a"
r"a.a"

julia> m = eachmatch(rx, "a1a2a3a")
Base.RegexMatchIterator{String}(r"a.a", "a1a2a3a", false)

julia> collect(m)
2-element Vector{RegexMatch}:
 RegexMatch("a1a")
 RegexMatch("a3a")

julia> collect(eachmatch(rx, "a1a2a3a", overlap = true))
3-element Vector{RegexMatch}:
 RegexMatch("a1a")
 RegexMatch("a2a")
 RegexMatch("a3a")
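The iterator is often consumed in a comprehension rather than collected. The following sketch uses an arbitrary sample string:

```julia
text = "a1b22c333"

# Each element is a RegexMatch; .match is the matched text, .offset its start index.
numbers = [parse(Int, m.match) for m in eachmatch(r"\d+", text)]
offsets = [m.offset for m in eachmatch(r"\d+", text)]

println(numbers)
println(offsets)
```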
RegexMatch <: AbstractMatch

A type representing a single match of a Regex found in a string. Typically created by the match function.

The match field stores the substring of the entire matched string. The captures field stores the substrings for each capture group, indexed by number. To index by capture group name, the entire match object should be indexed instead, as shown in the examples. The location of the start of the match is stored in the offset field. The offsets field stores the locations of the start of each capture group, with 0 denoting a group that was not captured.

This type can be used as an iterator over the capture groups of the Regex, yielding the substrings captured in each group. Because of this, the captures of a match can be destructured. If a group was not captured, nothing is yielded instead of a substring.

Methods that accept a RegexMatch object are defined for iterate, length, eltype, keys, haskey, and getindex, where keys are the names or numbers of a capture group. See keys for more information.

Examples

julia> m = match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "11:30 in the morning")
RegexMatch("11:30", hour="11", minute="30", 3=nothing)

julia> m.match
"11:30"

julia> m.captures
3-element Vector{Union{Nothing, SubString{String}}}:
 "11"
 "30"
 nothing


julia> m["minute"]
"30"

julia> hr, min, ampm = m; # destructure capturing groups by iteration

julia> hr
"11"
keys(m::RegexMatch) -> Vector

Returns a vector of keys for all capture groups of the underlying regex. A key is included even if the capture group fails to match. That is, idx will be in the return value even if m[idx] == nothing.

Unnamed capture groups will have integer keys corresponding to their index. Named capture groups will have string keys.

Compatibility: Julia 1.7

This method was added in Julia 1.7.

Examples

julia> keys(match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "11:30"))
3-element Vector{Any}:
  "hour"
  "minute"
 3
isless(a::AbstractString, b::AbstractString) -> Bool

Tests whether string a comes before string b in alphabetical order (technically, in lexicographical order by Unicode code points).

Examples

julia> isless("a", "b")
true

julia> isless("β", "α")
false

julia> isless("a", "a")
false
==(a::AbstractString, b::AbstractString) -> Bool

Tests whether two strings are equal character by character (technically, Unicode code point by code point). If either string is an AnnotatedString, the string properties must match too.

Examples

julia> "abc" == "abc"
true

julia> "abc" == "αβγ"
false
cmp(a::AbstractString, b::AbstractString) -> Int

Compares two strings. Returns 0 if both strings have the same length and the character at each index is the same in both strings. Returns -1 if a is a prefix of b, or if a comes before b in alphabetical order. Returns 1 if b is a prefix of a, or if b comes before a in alphabetical order (technically, lexicographical order by Unicode code points).

Examples

julia> cmp("abc", "abc")
0

julia> cmp("ab", "abc")
-1

julia> cmp("abc", "ab")
1

julia> cmp("ab", "ac")
-1

julia> cmp("ac", "ab")
1

julia> cmp("α", "a")
1

julia> cmp("b", "β")
-1
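Since sorting relies on isless, the code point ordering described above determines the default sort order of strings. The following is a sketch with arbitrary sample words:

```julia
words = ["b", "a", "B", "α"]

# Uppercase ASCII sorts before lowercase ASCII, and both sort before Greek letters.
sorted = sort(words)

# isless and cmp agree: isless(a, b) holds exactly when cmp(a, b) is negative.
consistent = all(isless(a, b) == (cmp(a, b) < 0) for a in words, b in words)

println(sorted)
println(consistent)
```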
lpad(s, n::Integer, p::Union{AbstractChar,AbstractString}=' ') -> String

Converts s to a string and pads the resulting string on the left with p to make it n characters (in textwidth) long. If s is already n characters long, an equal string is returned. Pads with spaces by default.

Examples

julia> lpad("March", 10)
"     March"
Compatibility: Julia 1.7

In Julia 1.7, this function was changed to use textwidth rather than a raw character (code point) count.

rpad(s, n::Integer, p::Union{AbstractChar,AbstractString}=' ') -> String

Converts s to a string and pads the resulting string on the right with p to make it n characters (in textwidth) long. If s is already n characters long, an equal string is returned. Pads with spaces by default.

Examples

julia> rpad("March", 20)
"March               "
Compatibility: Julia 1.7

In Julia 1.7, this function was changed to use textwidth rather than a raw character (code point) count.
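lpad and rpad are often combined to align columns of text. The following sketch uses made-up sample rows:

```julia
rows = [("apple", 3), ("kiwi", 12), ("cherry", 105)]   # sample data for illustration

# Left-align the name in 8 columns, right-align the number in 4 columns.
lines = [rpad(name, 8, '.') * lpad(qty, 4) for (name, qty) in rows]

zero_padded = lpad(7, 3, '0')      # non-string arguments are converted first
untouched = lpad("abcdef", 3)      # longer strings are returned unchanged, not truncated

foreach(println, lines)
println((zero_padded, untouched))
```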

findfirst(pattern::AbstractString, string::AbstractString)
findfirst(pattern::AbstractPattern, string::String)

Finds the first occurrence of pattern in string. Equivalent to findnext(pattern, string, firstindex(string)).

Examples

julia> findfirst("z", "Hello to the world") # returns nothing, but not printed in the REPL

julia> findfirst("Julia", "JuliaLang")
1:5
findnext(pattern::AbstractString, string::AbstractString, start::Integer)
findnext(pattern::AbstractPattern, string::String, start::Integer)

Finds the next occurrence of pattern in string starting at position start. pattern can be either a string or a regular expression; in the latter case, string must be of type String.

The return value is a range of indexes where the matching sequence is found, such that s[findnext(x, s, i)] == x:

findnext("substring", string, i) == start:stop such that string[start:stop] == "substring" and i <= start, or nothing if there is no match.

Examples

julia> findnext("z", "Hello to the world", 1) === nothing
true

julia> findnext("o", "Hello to the world", 6)
8:8

julia> findnext("Lang", "JuliaLang", 2)
6:9
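findfirst and findnext can be combined to collect every occurrence of a pattern. find_all is a hypothetical helper written for this sketch; because it restarts one character after the start of each match, it also reports overlapping occurrences.

```julia
# Hypothetical helper: index ranges of all occurrences of pattern in s.
function find_all(pattern::AbstractString, s::AbstractString)
    ranges = UnitRange{Int}[]
    r = findfirst(pattern, s)
    while r !== nothing
        push!(ranges, r)
        # Resume one character after the start of the previous match.
        r = findnext(pattern, s, nextind(s, first(r)))
    end
    return ranges
end

hits = find_all("o", "Hello to the world")
overlapping = find_all("aa", "aaa")

println(hits)
println(overlapping)
```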
findnext(ch::AbstractChar, string::AbstractString, start::Integer)

Finds the next occurrence of the character ch in string starting at position start.

Compatibility: Julia 1.3

This method requires a Julia version of 1.3 or higher.

Examples

julia> findnext('z', "Hello to the world", 1) === nothing
true

julia> findnext('o', "Hello to the world", 6)
8
findlast(pattern::AbstractString, string::AbstractString)

Finds the last occurrence of pattern in string. Equivalent to findprev(pattern, string, lastindex(string)).

Examples

julia> findlast("o", "Hello to the world")
15:15

julia> findlast("Julia", "JuliaLang")
1:5
findlast(ch::AbstractChar, string::AbstractString)

Finds the last occurrence of the character ch in string.

Compatibility: Julia 1.3

This method requires a Julia version of 1.3 or higher.

Examples

julia> findlast('p', "happy")
4

julia> findlast('z', "happy") === nothing
true
findprev(pattern::AbstractString, string::AbstractString, start::Integer)

Finds the previous occurrence of pattern in string starting at position start.

The return value is a range of indexes where the matching sequence is found, such that s[findprev(x, s, i)] == x:

findprev("substring", string, i) == start:stop such that string[start:stop] == "substring" and stop <= i, or nothing if there is no match.

Examples

julia> findprev("z", "Hello to the world", 18) === nothing
true

julia> findprev("o", "Hello to the world", 18)
15:15

julia> findprev("Julia", "JuliaLang", 6)
1:5
occursin(needle::Union{AbstractString,AbstractPattern,AbstractChar}, haystack::AbstractString)

Determines whether the first argument is a substring of the second one. If needle is a regular expression, it checks whether haystack contains a match.

Examples

julia> occursin("Julia", "JuliaLang is pretty cool!")
true

julia> occursin('a', "JuliaLang is pretty cool!")
true

julia> occursin(r"a.a", "aba")
true

julia> occursin(r"a.a", "abba")
false

See also contains.


occursin(haystack)

Creates a function that checks whether its argument occurs in haystack, that is, a function equivalent to needle -> occursin(needle, haystack).

The returned function is of type Base.Fix2{typeof(occursin)}.

Compatibility: Julia 1.6

This method requires Julia 1.6 or later.

Examples

julia> search_f = occursin("JuliaLang is a programming language");

julia> search_f("JuliaLang")
true

julia> search_f("Python")
false
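The one-argument form is convenient as a predicate for higher-order functions. The following sketch assumes Julia 1.6 or later; is_vowel is just an illustrative name.

```julia
# A predicate that tests whether a character occurs in the string "aeiou".
is_vowel = occursin("aeiou")

vowels = filter(is_vowel, "julia")

# Count how many candidate words occur in a fixed haystack.
haystack = "JuliaLang is a programming language"
hits = count(occursin(haystack), ["Julia", "Lang", "Python"])

println((vowels, hits))
```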
reverse(s::AbstractString) -> AbstractString

Reverses a string. Technically, this function reverses the code points in a string, and its main utility is for reversed-order string processing, especially for reversed regular expression searches. See also reverseind, which converts indexes in s to indexes in reverse(s) and vice versa, and graphemes from the Unicode module, which operates on user-visible "characters" (graphemes) rather than code points. See also Iterators.reverse for reverse-order iteration without making a copy. Custom string types must implement the reverse function themselves and should typically return a string of the same type and encoding. If they return a string with a different encoding, they must also override reverseind for that string type so that s[reverseind(s, i)] == reverse(s)[i].
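The relationship between reverse and reverseind can be checked with a short sketch. It uses a multi-byte sample string so that the indexes are not simply consecutive.

```julia
s = "αβγ"
r = reverse(s)

# For every index i of the reversed string, reverseind gives the index of the
# same character in the original string.
matches = all(s[reverseind(s, i)] == r[i] for i in eachindex(r))

# Iterators.reverse walks the characters backwards without building a new string first.
backwards = join(Iterators.reverse(s))

println((r, matches, backwards))
```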

Examples

julia> reverse("JuliaLang")
"gnaLailuJ"

The examples below may give different results on different systems. The expected result is indicated in the comments.

Combining characters can produce unexpected results:

julia> reverse("ax̂e") # the circumflex is above x in the input and above e in the output
"êxa"

julia> using Unicode

julia> join(reverse(collect(graphemes("ax̂e")))) # reverses the graphemes; the circumflex is above x in both the input and the output
"ex̂a"
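The relationship between reverse and reverseind can be checked directly. The following is a minimal sketch on an arbitrarily chosen sample string (an assumption for illustration); it also shows Iterators.reverse visiting the characters back to front without first building a reversed copy.

```julia
s = "aαb"                      # 'α' occupies two code units in UTF-8
r = reverse(s)                 # code points in reverse order

# Every valid index i of r corresponds to index reverseind(s, i) of s.
ok = all(s[reverseind(s, i)] == r[i] for i in eachindex(r))

# Lazy reverse iteration over the characters of s.
chars = collect(Iterators.reverse(s))
```

Because indices refer to code units, the mapping is not simply length(s) - i + 1 once a multi-byte character is present.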
replace([io::IO], s::AbstractString, pat=>r, [pat2=>r2, ...]; [count::Integer])

Searches for the given pattern pat in s and replaces each occurrence with r. If count is specified, at most count occurrences are replaced. pat can be a single character, a vector or a set of characters, a string, or a regular expression. If r is a function, each occurrence is replaced with r(s), where s is the matched substring (when pat is an AbstractPattern or AbstractString) or character (when pat is an AbstractChar or a collection of AbstractChar). If pat is a regular expression and r is a SubstitutionString, capture group references in r are replaced with the corresponding matched text. To remove instances of pat from the string, set r to the empty String ("").

The return value is a new string after the replacements. If the io::IO argument is specified, the transformed string is written to io instead (and io is returned). (For example, this can be used together with an IOBuffer to reuse a pre-allocated buffer array in place.)

Multiple patterns can be specified. They are applied left to right simultaneously, so only one pattern is applied to any given character, and the patterns are applied to the input text, not to the replacements.

Compatibility: Julia 1.7

Using multiple patterns requires version 1.7.

Compatibility: Julia 1.10

The io::IO argument requires version 1.10.

Examples

julia> replace("Python is a programming language.", "Python" => "Julia")
"Julia is a programming language."

julia> replace("The quick foxes run quickly.", "quick" => "slow", count=1)
"The slow foxes run quickly."

julia> replace("The quick foxes run quickly.", "quick" => "", count=1)
"The  foxes run quickly."

julia> replace("The quick foxes run quickly.", r"fox(es)?" => s"bus\1")
"The quick buses run quickly."

julia> replace("abcabc", "a" => "b", "b" => "c", r".+" => "a")
"bca"
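The less common forms described above (a function as the replacement, a collection of characters as the pattern, several simultaneous patterns, and output to an IO object) can be sketched as follows. The sample strings are assumptions chosen for illustration; the last form assumes Julia 1.10 or later, as noted in the compatibility section.

```julia
# A function as the replacement: applied to each matched substring.
capitalized = replace("hello world", r"\w+" => uppercasefirst)

# A collection of characters as the pattern: any member is replaced.
spaced = replace("a-b_c", ['-', '_'] => ' ')

# Several patterns at once (Julia 1.7+): each is matched against the input,
# so the two letters are swapped rather than both ending up the same.
swapped = replace("ab", "a" => "b", "b" => "a")

# Writing into an existing IO object (Julia 1.10+) instead of returning a string.
io = IOBuffer()
replace(io, "Python rocks", "Python" => "Julia")
written = String(take!(io))
```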
eachsplit(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
eachsplit(str::AbstractString; limit::Integer=0, keepempty::Bool=false)

Splits str on occurrences of the delimiter(s) dlm and returns an iterator over the substrings. dlm can be any of the formats allowed by the first argument of findnext (that is, a string, regular expression, or function), or a single character or a collection of characters.

If dlm is omitted, it defaults to isspace.

Optional named arguments:

  • limit: the maximum size of the result; limit=0 means no maximum (the default);

  • keepempty: whether empty fields should be kept in the result. The default is false if the dlm argument is not specified, or true if it is specified.

See also split.

Compatibility: Julia 1.8

The eachsplit function requires a Julia version of at least 1.8.

Examples

julia> a = "Ma.rch"
"Ma.rch"

julia> b = eachsplit(a, ".")
Base.SplitIterator{String, String}("Ma.rch", ".", 0, true)

julia> collect(b)
2-element Vector{SubString{String}}:
 "Ma"
 "rch"
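Since eachsplit returns an iterator, a caller can stop after the fields it needs. A minimal sketch, assuming Julia 1.8 or later and a made-up input line:

```julia
line = "name=Ada=Lovelace"

# Take only the first field; the rest of the string is not split eagerly.
key = first(eachsplit(line, '='))

# limit=2 caps the result at two pieces; the remainder stays intact.
pair = collect(eachsplit(line, '='; limit=2))
```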
eachrsplit(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
eachrsplit(str::AbstractString; limit::Integer=0, keepempty::Bool=false)

Returns an iterator over SubStrings of str, produced by splitting on the delimiter(s) dlm and yielded in reverse order (from right to left). dlm can be any of the formats allowed by the first argument of findprev (that is, a string, a single character, or a function), or a collection of characters.

If dlm is omitted, it defaults to isspace, and keepempty defaults to false.

Optional named arguments:

  • If limit > 0, the iterator splits the string at most limit - 1 times and returns the rest unsplit. With limit < 1 (the default), the number of splits is unlimited.

  • keepempty: whether empty fields should be returned when iterating. The default is false if the dlm argument is not specified, or true if it is specified.

Note that unlike split, rsplit and eachsplit, this function iterates over the substrings from right to left as they occur in the input.

See also eachsplit and rsplit.

Compatibility: Julia 1.11

This function requires Julia 1.11 or later.

Examples

julia> a = "Ma.r.ch";

julia> collect(eachrsplit(a, ".")) == ["ch", "r", "Ma"]
true

julia> collect(eachrsplit(a, "."; limit=2)) == ["ch", "Ma.r"]
true
split(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
split(str::AbstractString; limit::Integer=0, keepempty::Bool=false)

Splits str into an array of substrings on occurrences of the delimiter(s) dlm. dlm can be any of the formats allowed by the first argument of findnext (that is, a string, regular expression, or function), or a single character or a collection of characters.

If dlm is omitted, it defaults to isspace.

Optional named arguments:

  • limit: the maximum size of the result; limit=0 means no maximum (the default);

  • keepempty: whether empty fields should be kept in the result. The default is false if the dlm argument is not specified, or true if it is specified.

See also rsplit and eachsplit.

Examples

julia> a = "Ma.rch"
"Ma.rch"

julia> split(a, ".")
2-element Vector{SubString{String}}:
 "Ma"
 "rch"
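The effect of the keepempty defaults and of a function delimiter is easiest to see side by side. A short sketch with made-up inputs:

```julia
# No delimiter: split on whitespace (isspace) and drop empty fields.
words = split("  the quick   fox ")

# Explicit delimiter: empty fields are kept by default.
fields = split("a,,b", ',')

# A function can serve as the delimiter predicate.
parts = split("2024-01-15T10:30", c -> c == '-' || c == 'T')
```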
rsplit(s::AbstractString; limit::Integer=0, keepempty::Bool=false)
rsplit(s::AbstractString, chars; limit::Integer=0, keepempty::Bool=true)

Works like split, but starts from the end of the string.

Examples

julia> a = "M.a.r.c.h"
"M.a.r.c.h"

julia> rsplit(a, ".")
5-element Vector{SubString{String}}:
 "M"
 "a"
 "r"
 "c"
 "h"

julia> rsplit(a, "."; limit=1)
1-element Vector{SubString{String}}:
 "M.a.r.c.h"

julia> rsplit(a, "."; limit=2)
2-element Vector{SubString{String}}:
 "M.a.r.c"
 "h"
strip([pred=isspace,] str::AbstractString) -> SubString
strip(str::AbstractString, chars) -> SubString

Removes leading and trailing characters from str: either those specified in the chars argument or those for which the pred function returns true.

By default, leading and trailing whitespace and delimiters are removed; see isspace for details.

The optional chars argument specifies which characters to remove: it can be a single character, or a vector or set of characters.

See also lstrip and rstrip.

Compatibility: Julia 1.2

The method that accepts a predicate function requires Julia 1.2 or later.

Examples

julia> strip("{3, 5}\n", ['{', '}', '\n'])
"3, 5"
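The predicate form and the SubString return type can be illustrated with a short sketch (the inputs are assumptions chosen for illustration):

```julia
# Predicate form (Julia 1.2+): remove leading and trailing digits.
core = strip(isdigit, "123abc456")

# Character-collection form: remove surrounding quotes and spaces.
unquoted = strip("  'value' ", [' ', '\''])

# The result is a view into the original string, not a copy.
is_view = core isa SubString
```

If an independent String is needed, the result can be passed to String.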
lstrip([pred=isspace,] str::AbstractString) -> SubString
lstrip(str::AbstractString, chars) -> SubString

Removes leading characters from str: either those specified in the chars argument or those for which the pred function returns true.

By default, leading whitespace and delimiters are removed; see isspace for details.

The optional chars argument specifies which characters to remove: it can be a single character, or a vector or set of characters.

See also strip and rstrip.

Examples

julia> a = lpad("March", 20)
"               March"

julia> lstrip(a)
"March"
rstrip([pred=isspace,] str::AbstractString) -> SubString
rstrip(str::AbstractString, chars) -> SubString

Removes trailing characters from str: either those specified in the chars argument or those for which the pred function returns true.

By default, trailing whitespace and delimiters are removed; see isspace for details.

The optional chars argument specifies which characters to remove: it can be a single character, or a vector or set of characters.

See also strip and lstrip.

Examples

julia> a = rpad("March", 20)
"March               "

julia> rstrip(a)
"March"
startswith(s::AbstractString, prefix::Union{AbstractString,Base.Chars})

Returns the value true if the string s begins with the value of the prefix argument, which can be a string, a character, or a tuple, vector, or set of characters. If the prefix is a tuple, vector, or set of characters, it checks whether the first character of the string s is included in this set.

See also endswith and contains.

Examples

julia> startswith("JuliaLang", "Julia")
true

startswith(io::IO, prefix::Union{AbstractString,Base.Chars})

Checks whether an IO object starts with a prefix, which can be a string, a character, or a tuple, vector, or set of characters. See also peek.


startswith(prefix)

Creates a function that checks whether its argument starts with prefix, that is, a function equivalent to y -> startswith(y, prefix).

The returned function is of type Base.Fix2{typeof(startswith)} and can be used to implement specialized methods.

Compatibility: Julia 1.5

The single-argument method startswith(prefix) requires Julia 1.5 or later.

Examples

julia> startswith("Julia")("JuliaLang")
true

julia> startswith("Julia")("Ends with Julia")
false
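The single-argument form is mainly useful as a predicate for higher-order functions. A minimal sketch, with a made-up list of names:

```julia
names = ["JuliaLang", "JuMP", "Python", "Java"]

# startswith(prefix) returns a one-argument predicate.
ju = filter(startswith("Ju"), names)

# The predicate works with other higher-order functions as well.
n_j = count(startswith('J'), names)
```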

startswith(s::AbstractString, prefix::Regex)

Returns true if the string s starts with the regular expression pattern prefix.

startswith does not compile the anchoring into the regular expression, but instead passes the anchoring as match_option to PCRE. If compile time is amortized, occursin(r"^...", s) is faster than startswith(s, r"...").

See also occursin and endswith.

Compatibility: Julia 1.2

This method requires Julia 1.2 or later.

Examples

julia> startswith("JuliaLang", r"Julia|Romeo")
true
endswith(s::AbstractString, suffix::Union{AbstractString,Base.Chars})

Returns the value true if the string s ends with the value of the suffix argument, which can be a string, a character, or a tuple, vector, or set of characters. If the suffix is a tuple, vector, or set of characters, it checks whether the last character of the string s is included in this set.

See also startswith and contains.

Examples

julia> endswith("Sunday", "day")
true

endswith(suffix)

Creates a function that checks whether its argument ends with suffix, that is, a function equivalent to y -> endswith(y, suffix).

The returned function is of type Base.Fix2{typeof(endswith)} and can be used to implement specialized methods.

Compatibility: Julia 1.5

The single-argument method endswith(suffix) requires Julia 1.5 or later.

Examples

julia> endswith("Julia")("Ends with Julia")
true

julia> endswith("Julia")("JuliaLang")
false

endswith(s::AbstractString, suffix::Regex)

Returns true if the string s ends with the regular expression pattern suffix.

endswith does not compile the anchoring into the regular expression, but instead passes the anchoring as match_option to PCRE. If compile time is amortized, occursin(r"...$", s) is faster than endswith(s, r"...").

See also occursin and startswith.

Compatibility: Julia 1.2

This method requires Julia 1.2 or later.

Examples

julia> endswith("JuliaLang", r"Lang|Roberts")
true
contains(haystack::AbstractString, needle)

Returns true if haystack contains needle. This is the same as occursin(needle, haystack), but is provided for consistency with startswith(haystack, needle) and endswith(haystack, needle).

See also occursin, in and issubset.

Examples

julia> contains("JuliaLang is pretty cool!", "Julia")
true

julia> contains("JuliaLang is pretty cool!", 'a')
true

julia> contains("aba", r"a.a")
true

julia> contains("abba", r"a.a")
false
Compatibility: Julia 1.5

The contains function requires Julia 1.5 or later.


contains(needle)

Creates a function that checks whether its argument contains needle, that is, a function equivalent to haystack -> contains(haystack, needle).

The returned function is of type Base.Fix2{typeof(contains)} and can be used to implement specialized methods.

first(s::AbstractString, n::Integer)

Returns a string consisting of the first n characters of the string s.

Examples

julia> first("∀ϵ≠0: ϵ²>0", 0)
""

julia> first("∀ϵ≠0: ϵ²>0", 1)
"∀"

julia> first("∀ϵ≠0: ϵ²>0", 3)
"∀ϵ≠"
last(s::AbstractString, n::Integer)

Returns a string consisting of the last n characters of the string s.

Examples

julia> last("∀ϵ≠0: ϵ²>0", 0)
""

julia> last("∀ϵ≠0: ϵ²>0", 1)
"0"

julia> last("∀ϵ≠0: ϵ²>0", 3)
"²>0"
uppercase(c::AbstractChar)

Converts c to uppercase.

See also lowercase and titlecase.

Examples

julia> uppercase('a')
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> uppercase('ê')
'Ê': Unicode U+00CA (category Lu: Letter, uppercase)

uppercase(s::AbstractString)

Returns the string s with all characters converted to uppercase.

See also lowercase, titlecase and uppercasefirst.

Examples

julia> uppercase("Julia")
"JULIA"
lowercase(c::AbstractChar)

Converts c to lowercase.

See also uppercase and titlecase.

Examples

julia> lowercase('A')
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

julia> lowercase('Ö')
'ö': Unicode U+00F6 (category Ll: Letter, lowercase)

lowercase(s::AbstractString)

Returns the string s with all characters converted to lowercase.

See also uppercase, titlecase and lowercasefirst.

Examples

julia> lowercase("STRINGS AND THINGS")
"strings and things"
titlecase(c::AbstractChar)

Converts c to titlecase. This may differ from uppercase for digraphs; compare the examples below.

See also uppercase and lowercase.

Examples

julia> titlecase('a')
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> titlecase('dž')
'Dž': Unicode U+01C5 (category Lt: Letter, titlecase)

julia> uppercase('dž')
'DŽ': Unicode U+01C4 (category Lu: Letter, uppercase)

titlecase(s::AbstractString; [wordsep::Function], strict::Bool=true) -> String

Capitalizes the first character of each word in the string s. If strict is true, all other characters are converted to lowercase; otherwise they are left unchanged. By default, all non-letter characters that begin a new grapheme are considered word separators. A predicate can be passed as the wordsep keyword argument to determine which characters should be considered word separators. See also uppercasefirst to capitalize only the first character in s.

See also uppercase, lowercase and uppercasefirst.

Examples

julia> titlecase("the JULIA programming language")
"The Julia Programming Language"

julia> titlecase("ISS - international space station", strict=false)
"ISS - International Space Station"

julia> titlecase("a-a b-b", wordsep = c->c==' ')
"A-a B-b"
uppercasefirst(s::AbstractString) -> String

Returns the string s with the first character converted to uppercase (strictly speaking, to Unicode title case). See also titlecase to capitalize the first character of every word in s.

See also lowercasefirst, uppercase, lowercase and titlecase.

Examples

julia> uppercasefirst("python")
"Python"
lowercasefirst(s::AbstractString)

Returns the string s with the first character converted to lowercase.

See also uppercasefirst, uppercase, lowercase and titlecase.

Examples

julia> lowercasefirst("Julia")
"julia"
join([io::IO,] iterator [, delim [, last]])

Joins the items of iterator into a single string, inserting the given delimiter (if any) between adjacent items. If last is given, it is used instead of delim between the last two items. Each item of iterator is converted to a string via print(io::IOBuffer, x). If io is given, the result is written to io rather than returned as a String.

Examples

julia> join(["apples", "bananas", "pineapples"], ", ", " and ")
"apples, bananas and pineapples"

julia> join([1,2,3,4,5])
"12345"
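The io form is convenient when the joined text is one part of a larger output. A minimal sketch, using an IOBuffer as a stand-in for any IO stream:

```julia
io = IOBuffer()
print(io, "values: ")
join(io, 1:3, ", ")            # appends the joined items to io
result = String(take!(io))

# delim and last together produce a natural-language list.
listing = join(["x", "y", "z"], ", ", " or ")
```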
chop(s::AbstractString; head::Integer = 0, tail::Integer = 1)

Removes the first head and the last tail characters from s. The call chop(s) removes the last character from s. If more characters than length(s) are requested to be removed, an empty string is returned.

See also chomp, startswith and first.

Examples

julia> a = "March"
"March"

julia> chop(a)
"Marc"

julia> chop(a, head = 1, tail = 2)
"ar"

julia> chop(a, head = 5, tail = 5)
""
chopprefix(s::AbstractString, prefix::Union{AbstractString,Regex}) -> SubString

Removes the prefix prefix from the string s. If the string s does not start with prefix, a string equal to s is returned.

See also chopsuffix.

Compatibility: Julia 1.8

This feature was first implemented in Julia 1.8.

Examples

julia> chopprefix("Hamburger", "Ham")
"burger"

julia> chopprefix("Hamburger", "hotdog")
"Hamburger"
chopsuffix(s::AbstractString, suffix::Union{AbstractString,Regex}) -> SubString

Removes the suffix suffix from the string s. If the string s does not end with suffix, a string equal to s is returned.

See also chopprefix.

Compatibility: Julia 1.8

This feature was first implemented in Julia 1.8.

Examples

julia> chopsuffix("Hamburger", "er")
"Hamburg"

julia> chopsuffix("Hamburger", "hotdog")
"Hamburger"
chomp(s::AbstractString) -> SubString

Removes a single trailing newline from a string.

See also chop.

Examples

julia> chomp("Hello\n")
"Hello"
thisind(s::AbstractString, i::Integer) -> Int

If i is in bounds in s, returns the index of the start of the character that code unit i belongs to. In other words, if i is the start of a character, returns i; if i is not the start of a character, rewinds to the start of the character and returns that index. If i is equal to 0 or ncodeunits(s)+1, returns i. In all other cases, a BoundsError is thrown.

Examples

julia> thisind("α", 0)
0

julia> thisind("α", 1)
1

julia> thisind("α", 2)
1

julia> thisind("α", 3)
3

julia> thisind("α", 4)
ERROR: BoundsError: attempt to access 2-codeunit String at index [4]
[...]

julia> thisind("α", -1)
ERROR: BoundsError: attempt to access 2-codeunit String at index [-1]
[...]
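A typical use is turning an arbitrary code unit offset into an index that is safe to use with s[i]. A minimal sketch on a sample string (an assumption for illustration) where every character takes two code units:

```julia
s = "αβγ"                      # three characters, six code units

# Code unit 4 is the second byte of 'β', which starts at index 3.
start = thisind(s, 4)
ch = s[start]

# Index 4 itself is not a valid character index.
valid = isvalid(s, 4)
```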
nextind(str::AbstractString, i::Integer, n::Integer=1) -> Int
  • If n == 1

    If i is in bounds in str, returns the index of the start of the character whose encoding starts after index i. In other words, if i is the start of a character, returns the start of the next character; if i is not the start of a character, moves forward to the start of a character and returns that index. If i is equal to 0, returns 1. If i is in bounds but greater than or equal to lastindex(str), returns ncodeunits(str)+1. Otherwise, a BoundsError is thrown.

  • If n > 1

    Equivalent to applying nextind with n==1 n times. The only difference is that if n is so large that applying nextind would reach ncodeunits(str)+1, each remaining iteration increases the returned value by 1. This means that in this case nextind can return a value greater than ncodeunits(str)+1.

  • If n == 0

    Returns i only if i is a valid index in str or is equal to 0. Otherwise, a StringIndexError or BoundsError is thrown.

Examples

julia> nextind("α", 0)
1

julia> nextind("α", 1)
3

julia> nextind("α", 3)
ERROR: BoundsError: attempt to access 2-codeunit String at index [3]
[...]

julia> nextind("α", 0, 2)
3

julia> nextind("α", 1, 2)
4
prevind(str::AbstractString, i::Integer, n::Integer=1) -> Int
  • If n == 1

    If i is in bounds in str, returns the index of the start of the character whose encoding starts before index i. In other words, if i is the start of a character, returns the start of the previous character; if i is not the start of a character, rewinds to the start of the character and returns that index. If i is equal to 1, returns 0. If i is equal to ncodeunits(str)+1, returns lastindex(str). Otherwise, a BoundsError is thrown.

  • If n > 1

    Equivalent to applying prevind with n==1 n times. The only difference is that if n is so large that applying prevind would reach 0, each remaining iteration decreases the returned value by 1. This means that in this case prevind can return a negative value.

  • If n == 0

    Returns i only if i is a valid index in str or is equal to ncodeunits(str)+1. Otherwise, a StringIndexError or BoundsError is thrown.

Examples

julia> prevind("α", 3)
1

julia> prevind("α", 1)
0

julia> prevind("α", 0)
ERROR: BoundsError: attempt to access 2-codeunit String at index [0]
[...]

julia> prevind("α", 2, 2)
0

julia> prevind("α", 2, 3)
-1
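The boundary values described above (ncodeunits(str)+1 for nextind, 0 for prevind) make it possible to write index loops without special-casing the ends of the string. A minimal sketch; the helper names char_starts and chars_backward are hypothetical and not part of Base:

```julia
# Hypothetical helper: start index of every character, front to back.
function char_starts(s::AbstractString)
    starts = Int[]
    i = firstindex(s)
    while i <= ncodeunits(s)
        push!(starts, i)
        i = nextind(s, i)      # finally yields ncodeunits(s) + 1
    end
    return starts
end

# Hypothetical helper: characters visited from the last to the first.
function chars_backward(s::AbstractString)
    out = Char[]
    j = prevind(s, ncodeunits(s) + 1)
    while j >= 1
        push!(out, s[j])
        j = prevind(s, j)      # finally yields 0
    end
    return out
end

forward = char_starts("αβγ")
backward = chars_backward("αβγ")
```

In everyday code, eachindex(s) and Iterators.reverse(s) cover the same ground; the explicit loops are only meant to show how the out-of-bounds return values terminate the iteration.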
textwidth(c)

Returns the number of columns needed to print a character.

Examples

julia> textwidth('α')
1

julia> textwidth('⛵')
2

textwidth(s::AbstractString)

Returns the number of columns needed to print a string.

Examples

julia> textwidth("March")
5
isascii(c::Union{AbstractChar,AbstractString}) -> Bool

Checks whether a single character or all elements of a string belong to the ASCII encoding.

Examples

julia> isascii('a')
true

julia> isascii('α')
false

julia> isascii("abc")
true

julia> isascii("αβγ")
false

For example, isascii can be used as a predicate function for filter or replace to remove or replace non-ASCII characters:

julia> filter(isascii, "abcdeγfgh") # removes non-ASCII characters
"abcdefgh"

julia> replace("abcdeγfgh", !isascii=>' ') # replaces non-ASCII characters with spaces
"abcde fgh"

isascii(cu::AbstractVector{CU}) where {CU <: Integer} -> Bool

Checks whether all values in the vector are ASCII encoded (from 0x00 to 0x7f). This function is intended to be used by other string implementations that require fast ASCII validation.

iscntrl(c::AbstractChar) -> Bool

Checks whether the character is a control character. Control characters are non-printable characters of the Latin-1 Unicode subset.

Examples

julia> iscntrl('\x01')
true

julia> iscntrl('a')
false
isdigit(c::AbstractChar) -> Bool

Checks whether the character is a decimal digit (0-9).

See also isletter.

Examples

julia> isdigit('❤')
false

julia> isdigit('9')
true

julia> isdigit('α')
false
isletter(c::AbstractChar) -> Bool

Checks whether the character is a letter. A character is considered a letter if it belongs to the general Unicode category "Letter", that is, its category code begins with "L".

See also isdigit.

Examples

julia> isletter('❤')
false

julia> isletter('α')
true

julia> isletter('9')
false
islowercase(c::AbstractChar) -> Bool

Checks whether the character is a lowercase letter (according to the Lowercase property from the Unicode standard).

See also isuppercase.

Examples

julia> islowercase('α')
true

julia> islowercase('Γ')
false

julia> islowercase('❤')
false
isnumeric(c::AbstractChar) -> Bool

Checks whether the character is numeric. A character is considered numeric if it belongs to the Unicode general category "Number", that is, its category code begins with "N".

Keep in mind that this broad category includes characters such as ¾ and ௰. To check whether a character is a decimal digit from 0 to 9, use isdigit.

Examples

julia> isnumeric('௰')
true

julia> isnumeric('9')
true

julia> isnumeric('α')
false

julia> isnumeric('❤')
false
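The difference between isnumeric and isdigit matters when parsing user input. A short sketch with made-up sample text:

```julia
# '¾' is in Unicode category No (Number, other): numeric, but not a digit 0-9.
frac = (isnumeric('¾'), isdigit('¾'))

# Keep only decimal digits when extracting a number from mixed text.
digits_only = filter(isdigit, "tel: 555-0199")
```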
isprint(c::AbstractChar) -> Bool

Checks whether a character is printable, including spaces, but not a control character.

Examples

julia> isprint('\x01')
false

julia> isprint('A')
true
ispunct(c::AbstractChar) -> Bool

Checks whether a character belongs to the general Unicode category of "Punctuation", that is, its category code begins with "P".

Examples

julia> ispunct('α')
false

julia> ispunct('/')
true

julia> ispunct(';')
true
isspace(c::AbstractChar) -> Bool

Checks whether the character is a whitespace character. This includes the ASCII characters \t, \n, \v, \f, \r and " ", the Latin-1 character U+0085, and all characters in the Unicode category Zs.

Examples

julia> isspace('\n')
true

julia> isspace('\r')
true

julia> isspace(' ')
true

julia> isspace('\x20')
true
isuppercase(c::AbstractChar) -> Bool

Checks whether the character is an uppercase letter (according to the Uppercase property from the Unicode standard).

See also islowercase.

Examples

julia> isuppercase('γ')
false

julia> isuppercase('Γ')
true

julia> isuppercase('❤')
false
isxdigit(c::AbstractChar) -> Bool

Checks whether the character is a valid hexadecimal digit. Note that this does not include the character x (as in the standard prefix 0x).

Examples

julia> isxdigit('a')
true

julia> isxdigit('x')
false
escape_string(str::AbstractString[, esc]; keep = ())::AbstractString
escape_string(io, str::AbstractString[, esc]; keep = ())::Nothing

General escaping of standard C and Unicode escape sequences. The first form returns the escaped string; the second prints the result to io.

Backslashes (\) are escaped with a double backslash ("\\"). Non-printable characters are escaped using standard C escape codes, the sequence "\0" for NUL (if there is no ambiguity), the Unicode code point (prefix "\u"), or the hexadecimal value (prefix "\x").

The optional esc argument specifies additional characters that must also be escaped with a backslash (when using the first form, the character " is also escaped by default).

The keep argument specifies a collection of characters that should remain unchanged. Note that esc takes precedence.

See also unescape_string, which performs the reverse operation.

Compatibility: Julia 1.7

The keep argument was first implemented in Julia 1.7.

Examples

julia> escape_string("aaa\nbbb")
"aaa\\nbbb"

julia> escape_string("aaa\nbbb"; keep = '\n')
"aaa\nbbb"

julia> escape_string("\xfe\xff") # invalid UTF-8
"\\xfe\\xff"

julia> escape_string(string('\u2135','\0')) # unambiguous
"ℵ\\0"

julia> escape_string(string('\u2135','\0','0')) # \0 would be ambiguous
"ℵ\\x000"
escape_raw_string(s::AbstractString, delim='"') -> AbstractString
escape_raw_string(io, s::AbstractString, delim='"')

Escapes a string in the manner used for parsing raw string literals. For each double quote character (") in the input string s (or delim, if specified), this function counts the number n of preceding backslashes (\) and then increases the number of backslashes from n to 2n+1 (even for n = 0). It also doubles a sequence of backslashes at the end of the string.

This escaping convention is used in raw strings and other non-standard string literals. (It is also the escaping convention expected by the Microsoft C/C++ compiler runtime when it parses a command-line string into the argv[] array.)

See also escape_string.

unescape_string(str::AbstractString, keep = ())::AbstractString
unescape_string(io, s::AbstractString, keep = ())::Nothing

General unescaping of standard C and Unicode escape sequences. The first form returns the unescaped string; the second prints the result to io. The keep argument specifies a collection of characters that (along with backslashes) are to be left unchanged.

The following escape sequences are recognized:

  • escaped backslash (\\);

  • escaped double quotes (\");

  • standard C escape sequences (\a, \b, \t, \n, \v, \f, \r, \e);

  • Unicode BMP code points (\u followed by 1-4 hexadecimal digits);

  • all Unicode code points (\U followed by 1-8 hexadecimal digits; maximum value = 0010ffff);

  • hexadecimal bytes (\x followed by 1-2 hexadecimal digits);

  • octal bytes (\ followed by 1-3 octal digits).

See also escape_string.

Examples

julia> unescape_string("aaa\\nbbb") # C escape sequence
"aaa\nbbb"

julia> unescape_string("\\u03c0") # Unicode
"π"

julia> unescape_string("\\101") # octal
"A"

julia> unescape_string("aaa \\g \\n", ['g']) # using the `keep` argument
"aaa \\g \n"
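Since unescape_string reverses escape_string, the two can be checked as a round trip. A minimal sketch with a made-up input containing a tab, double quotes, and a newline:

```julia
original = "tab\there, \"quoted\", newline\n"

escaped = escape_string(original)
restored = unescape_string(escaped)

# After escaping, no raw control characters remain in the string.
has_control = any(iscntrl, escaped)
```

The escaped form is suitable for a single-line log or a generated string literal, and unescape_string recovers the original text from it.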

AnnotatedString objects

The API for AnnotatedString objects is considered experimental and may be modified in different versions of Julia.

AnnotatedString{S <: AbstractString} <: AbstractString

A string with metadata in the form of annotated regions.

More specifically, this is a simple wrapper around any other AbstractString that allows regions of the wrapped string to be annotated with labeled values.

                           C
                    ┌──────┸─────────┐
  "this is an example annotated string"
  └──┰────────┼─────┘         │
     A        └─────┰─────────┘
                    B

The diagram above shows an AnnotatedString with three annotated regions (labeled A, B and C). Each annotation contains a label (Symbol) and a value (Any). These three pieces of information are stored as @NamedTuple{region::UnitRange{Int64}, label::Symbol, value}.

Labels do not have to be unique: the same region can have multiple annotations with the same label.

In general, the following properties should be preserved in the code written for AnnotatedString:

  • which characters an annotation applies to;

  • the order in which annotations are applied to each character.

In specific cases of using AnnotatedString, additional semantics may be introduced.

A consequence of these rules is that adjacent annotations with identical labels and values are equivalent to a single annotation covering the combined range.

See also AnnotatedChar, annotatedstring, annotations and annotate!.

Constructors

AnnotatedString(s::S<:AbstractString) -> AnnotatedString{S}
AnnotatedString(s::S<:AbstractString, annotations::Vector{@NamedTuple{region::UnitRange{Int64}, label::Symbol, value}})

An AnnotatedString can also be created with annotatedstring, which acts much like string but preserves any annotations present in the arguments.

Examples

julia> AnnotatedString("this is an example annotated string",
                    [(1:18, :A => 1), (12:28, :B => 2), (18:35, :C => 3)])
"this is an example annotated string"
AnnotatedChar{S <: AbstractChar} <: AbstractChar

A Char object with annotations.

More specifically, this is a simple wrapper around any other AbstractChar that holds a list of arbitrary labeled annotations (@NamedTuple{label::Symbol, value}) together with the wrapped character.

See also AnnotatedString, annotatedstring, annotations and annotate!.

Constructors

AnnotatedChar(s::S) -> AnnotatedChar{S}
AnnotatedChar(s::S, annotations::Vector{@NamedTuple{label::Symbol, value}})

Examples

julia> AnnotatedChar('j', :label => 1)
'j': ASCII/Unicode U+006A (category Ll: Letter, lowercase)
annotatedstring(values...)

Creates an AnnotatedString from any number of values using their printed representation (print).

This acts like string, but preserves any annotations present (in the form of AnnotatedString or AnnotatedChar values).

See also AnnotatedString and AnnotatedChar.

Examples

julia> annotatedstring("now a AnnotatedString")
"now a AnnotatedString"

julia> annotatedstring(AnnotatedString("annotated", [(1:9, :label => 1)]), ", and unannotated")
"annotated, and unannotated"
annotations(str::Union{AnnotatedString, SubString{AnnotatedString}},
            [position::Union{Integer, UnitRange}]) ->
    Vector{@NamedTuple{region::UnitRange{Int64}, label::Symbol, value}}

Gets all annotations that apply to str. If the position argument is specified, only annotations that overlap with position are returned.

Annotations are provided together with the regions they apply to, in the form of a vector of region-annotation tuples.

In accordance with the semantics described in AnnotatedString, the order of the returned annotations matches the order in which they were applied.

See also annotate!.


annotations(chr::AnnotatedChar) -> Vector{@NamedTuple{label::Symbol, value}}

Gets all annotations of chr as a vector of annotation pairs.

annotate!(str::AnnotatedString, [range::UnitRange{Int}], label::Symbol, value)
annotate!(str::SubString{AnnotatedString}, [range::UnitRange{Int}], label::Symbol, value)

Annotates a range of the string str (or the entire string) with a labeled value (label => value). To remove existing label annotations, use a value of nothing.

The order in which annotations are applied to str is semantically meaningful, as described in AnnotatedString.


annotate!(char::AnnotatedChar, label::Symbol, value::Any)

Annotates the character char with the pair label => value.