Strings

# Core.AbstractString — Type

The `AbstractString` type is the supertype of all string implementations in Julia. Strings are encodings of sequences of Unicode (https://unicode.org/) code points, as represented by the `AbstractChar` type. Julia makes a few assumptions about strings:

- Strings are encoded in terms of fixed-size "code units".
  - Code units can be extracted with `codeunit(s, i)`.
  - The first code unit has index `1`.
  - The last code unit has index `ncodeunits(s)`.
  - Any index `i` such that `1 ≤ i ≤ ncodeunits(s)` is in bounds.
- String indexing is done in terms of these code units.
  - Characters are extracted by `s[i]` with a valid string index `i`.
  - Each `AbstractChar` in a string is encoded by one or more code units.
  - Only the index of the first code unit of an `AbstractChar` is a valid index.
  - The encoding of an `AbstractChar` is independent of what precedes or follows it.
  - String encodings are self-synchronizing (https://en.wikipedia.org/wiki/Self-synchronizing_code), i.e. `isvalid(s, i)` is O(1).

Some string functions that extract code units, characters, or substrings from strings throw an error if they are passed out-of-bounds or invalid string indices. These include `codeunit(s, i)` and `s[i]`. Functions that do string index arithmetic take a more relaxed approach to indexing and return the closest valid string index when in bounds. When out of bounds, they behave as if there were an infinite number of characters padding each side of the string. Usually these imaginary padding characters have a code unit length of `1`, but string types may choose different "imaginary" character sizes if that makes sense for their implementations (for example, substrings may pass index arithmetic through to the underlying string they provide a view into). Relaxed indexing functions include those intended for index arithmetic: `thisind`, `nextind` and `prevind`. This model allows index arithmetic to work with out-of-bounds indices as intermediate values, as long as they are never used to retrieve a character, which often helps avoid the need to code around edge cases.

See also `codeunit`, `ncodeunits`, `thisind`, `nextind`, `prevind`.
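The code-unit indexing model described in this section can be seen on a small UTF-8 `String`. This is a minimal sketch: the sample string `s` is an arbitrary choice, and the values in the comments assume the built-in `String` type, whose code units are bytes.

```julia
s = "aλb"                 # 'λ' takes two UTF-8 code units; 'a' and 'b' take one each

n = ncodeunits(s)         # 4 code units in total
u = codeunit(s, 2)        # 0xce, the first code unit of 'λ'

# Only the first code unit of a character is a valid character index.
valid = [isvalid(s, i) for i in 1:n]   # [true, true, false, true]

# Index arithmetic is relaxed: it accepts index 3 even though s[3] would throw.
t  = thisind(s, 3)        # 2, the start of the character that covers index 3
nx = nextind(s, 2)        # 4, the start of the next character
p  = prevind(s, 1)        # 0, one step before the first character
e  = nextind(s, 4)        # 5, i.e. ncodeunits(s) + 1, one step past the end
```

The out-of-bounds results `0` and `5` are usable as intermediate values in index arithmetic, but indexing `s` with them throws.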

# Core.AbstractChar — Type

The `AbstractChar` type is the supertype of all character implementations in Julia. A character represents a Unicode code point and can be converted to an integer with the `codepoint` function to obtain the numerical value of the code point, or constructed from the same integer. These numerical values determine, for example, how characters are compared with `<` and `==`. A new `T <: AbstractChar` type should define, at minimum, a `codepoint(::T)` method and a `T(::UInt32)` constructor.

A given `AbstractChar` subtype may be capable of representing only a subset of Unicode, in which case conversion from an unsupported `UInt32` value may throw an error. Conversely, the built-in `Char` type represents a _superset_ of Unicode (in order to losslessly encode invalid byte streams), in which case conversion of a non-Unicode value to `UInt32` throws an error. The `isvalid` function can be used to check which code points are representable in a given `AbstractChar` type.

Internally, an `AbstractChar` type may use a variety of encodings. Conversion via `codepoint(char)` does not reveal this encoding, because it always returns the Unicode value of the character. Calling `print(io, c)` for any `c::AbstractChar` produces an encoding determined by `io` (UTF-8 for all built-in `IO` types), via conversion to `Char` if necessary.

In contrast, `write(io, c)` may emit an encoding that depends on `typeof(c)`, and `read(io, typeof(c))` should read the same encoding as `write`. New `AbstractChar` types must provide their own implementations of `write` and `read`.
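As a rough illustration of the minimum interface, the sketch below defines a hypothetical `ASCIIChar` type (the name and its field are assumptions, not part of Julia) that represents only the ASCII subset of Unicode. It relies on the generic `AbstractChar` fallbacks for conversion and comparison, and it deliberately leaves out the `write` and `read` methods that a complete type would need.

```julia
# Hypothetical character type restricted to ASCII; the name is illustrative only.
struct ASCIIChar <: AbstractChar
    byte::UInt8
    # Required constructor T(::UInt32); defining it as an inner constructor
    # suppresses the default constructors.
    function ASCIIChar(u::UInt32)
        u < 0x80 || throw(ArgumentError("code point $u is not ASCII"))
        return new(u % UInt8)
    end
end

# Required method codepoint(::T); this type returns a UInt8.
Base.codepoint(c::ASCIIChar) = c.byte

c = ASCIIChar('A')            # goes through the generic AbstractChar conversion
cp = codepoint(c)             # 0x41
same = (c == 'A')             # comparison falls back to the code point values
rejected = try
    ASCIIChar(UInt32('λ'))    # U+03BB is outside the supported subset
    false
catch err
    err isa ArgumentError
end
```

Throwing from the constructor is one way to signal an unsupported `UInt32` value; the exception type used here is a design choice of the sketch.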

# Core.Char — Type

Char(c::Union{Number,AbstractChar})

`Char` is a 32-bit `AbstractChar` type that is the default representation of characters in Julia. `Char` is the type used for character literals such as `'x'`, and it is also the element type of `String`.

In order to losslessly represent arbitrary byte streams stored in a `String`, a `Char` value may store information that cannot be converted to a Unicode code point: converting such a `Char` to `UInt32` throws an error. The `isvalid(c::Char)` function can be used to check whether `c` represents a valid Unicode character.

# Base.codepoint — Function

codepoint(c::AbstractChar) -> Integer

Returns the Unicode code point (an unsigned integer) corresponding to the character `c` (or throws an exception if `c` does not represent a valid character). For `Char`, this is a `UInt32` value, but `AbstractChar` types that represent only a subset of Unicode may return an integer of a different size (for example, `UInt8`).
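A short sketch of both behaviours, using an arbitrary valid character and a `Char` read from a string that holds a byte which is not valid UTF-8:

```julia
cp = codepoint('λ')      # 0x000003bb
T = typeof(cp)           # UInt32 for Char

bad = "\xff"[1]          # a Char that stores a byte which is not valid UTF-8
ok = isvalid(bad)        # false
threw = try
    codepoint(bad)       # no code point exists for this value
    false
catch
    true
end
```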

# Base.length — Method

length(s::AbstractString) -> Int
length(s::AbstractString, i::Integer, j::Integer) -> Int

Returns the number of characters in the string s from index i to index j.

This is computed as the number of code unit indices from `i` to `j` that are valid character indices. With only a single string argument, it computes the number of characters in the entire string. With `i` and `j` arguments, it computes the number of indices between `i` and `j` inclusive that are valid indices in the string `s`. In addition to in-bounds values, `i` may take the out-of-bounds value `ncodeunits(s) + 1` and `j` may take the out-of-bounds value `0`.

The time complexity of this operation is linear in general. That is, it takes time proportional to the number of bytes or characters in the string, because the value is counted on the fly. This is in contrast to the `length` method for arrays, which is a constant-time operation.

See also `isvalid`, `ncodeunits`, `lastindex`, `thisind`, `nextind`, `prevind`.

Examples

julia> length("jμΛIα")
5
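The two-index form can be illustrated with a small sketch; the sample string is an arbitrary choice, and the values in the comments assume the UTF-8 `String` type.

```julia
s = "αβγd"                 # three two-byte characters followed by one ASCII character

total = length(s)          # 4 characters
units = ncodeunits(s)      # 7 code units

# Between indices 1 and 3, the valid character indices are 1 ('α') and 3 ('β').
part = length(s, 1, 3)     # 2

# The documented out-of-bounds values describe an empty range.
none = length(s, ncodeunits(s) + 1, 0)   # 0
```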

# Base.sizeof — Method

sizeof(str::AbstractString)

Size, in bytes, of the string `str`. Equal to the number of code units in `str` multiplied by the size, in bytes, of one code unit in `str`.

Examples

julia> sizeof("")
0

julia> sizeof("∀")
3

# Base.:* — Method

*(s::Union{AbstractString, AbstractChar}, t::Union{AbstractString, AbstractChar}...) -> AbstractString

Concatenates strings and/or characters, producing a `String` or `AnnotatedString` (as appropriate). This is equivalent to calling the `string` or `annotatedstring` function with the same arguments. Concatenation of built-in string types always produces a value of type `String`, but other string types may return a string of a different type.

Examples

julia> "Hello " * "world"
"Hello world"

julia> 'j' * "ulia"
"julia"

# Base.:^ — Method

^(s::Union{AbstractString,AbstractChar}, n::Integer) -> AbstractString

Repeats a string or character `n` times. This can also be written as `repeat(s, n)`.
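A few small cases, given as a sketch with arbitrary inputs:

```julia
a = "ab"^3               # "ababab"
b = 'x'^2                # "xx", a String even though the input is a Char
c = repeat("ab", 3)      # same result as "ab"^3
d = "ab"^0               # "", repeating zero times gives the empty string
```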

See also `ncodeunits`.

# Base.codeunit — Function

codeunit(s::AbstractString, i::Integer) -> Union{UInt8, UInt16, UInt32}

Returns the value of the code unit in the string s at the index i. Please note that

codeunit(s, i) :: codeunit(s)

That is, the value returned by codeunit(s, i) has the type returned by codeunit(s).

Examples

julia> a = codeunit("Hello", 2)
0x65

julia> typeof(a)
UInt8

See also `ncodeunits` and `checkbounds`.

# Base.codeunits — Function

codeunits(s::AbstractString)

Returns a vector-like object containing the code units of a string. By default it returns a `CodeUnits` wrapper, but `codeunits` may optionally be defined for new string types if necessary.

Examples

julia> codeunits("Juλia")
6-element Base.CodeUnits{UInt8, String}:
 0x4a
 0x75
 0xce
 0xbb
 0x69
 0x61

# Base.ascii — Function

ascii(s::AbstractString)

Converts a string to the `String` type and checks that it contains only ASCII data; otherwise throws an `ArgumentError` indicating the position of the first non-ASCII byte.

See also the `isascii` predicate to filter or replace non-ASCII characters.

Examples

julia> ascii("abcdeγfgh")
ERROR: ArgumentError: invalid ASCII at index 6 in "abcdeγfgh"
Stacktrace:
[...]

julia> ascii("abcdefgh")
"abcdefgh"

# Base.Regex — Type

Regex(pattern[, flags]) <: AbstractPattern

A type representing a regular expression. `Regex` objects can be used to match strings with the `match` function.

`Regex` objects can be created using the `@r_str` string macro. The `Regex(pattern[, flags])` constructor is usually used if the `pattern` string needs to be interpolated. See the documentation of the string macro for details on flags.

To escape the interpolated variables, use \Q and \E (for example, Regex("\\Q$x\\E")).
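The effect of `\Q` and `\E` can be sketched as follows; the variable names and sample strings are arbitrary.

```julia
x = "1+1"                          # contains the regex metacharacter '+'

plain  = Regex(x)                  # '+' acts as a quantifier here
quoted = Regex("\\Q$x\\E")         # '+' is matched literally

a = occursin(plain,  "1+1=2")      # false: the pattern needs two adjacent '1's
b = occursin(plain,  "111")        # true
c = occursin(quoted, "1+1=2")      # true
d = occursin(quoted, "111")        # false
```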

# Base.@r_str — Macro

@r_str -> Regex

Constructs a regular expression, such as `r"^[a-z]*$"`, without interpolation or unescaping (except for the quotation mark `"`, which must still be escaped). The regex also accepts one or more flags, listed after the closing quotation mark, that change its behaviour:

- `i` enables case-insensitive matching.
- `m` treats the `^` and `$` tokens as matching the start and end of individual lines, rather than of the whole string.
- `s` allows the `.` modifier to match newlines.
- `x` enables "comment mode": whitespace between regex tokens is ignored, except when escaped with `\`, and `#` in the regex is treated as the start of a comment (which is ignored up to the end of the line).
- `a` enables ASCII mode (disables the `UTF` and `UCP` modes). By default, `\B`, `\b`, `\D`, `\d`, `\S`, `\s`, `\W` and `\w` match based on Unicode character properties. With this flag, these sequences match only ASCII characters. This also applies to `\u`, which emits the specified character value directly as a single byte and does not attempt to encode it as UTF-8. Importantly, this flag allows matching against invalid UTF-8 strings by treating both the pattern and the target as plain bytes (as if they were ISO/IEC 8859-1, or Latin-1, bytes) rather than as character encodings. In this case, the flag is often combined with `s`. The flag can be further refined by starting the pattern with `(*UCP)` or `(*UTF)`.

See `Regex` if interpolation is needed.

Examples

julia> match(r"a+.*b+.*?d$"ism, "Goodbye,\nOh, angry,\nBad world\n")
RegexMatch("angry,\nBad world")

The first three flags are activated for this regular expression.
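The individual flags can also be tried one at a time. The patterns and inputs below are arbitrary and serve only as a sketch.

```julia
ci   = occursin(r"julia"i, "JULIA")        # true: `i` ignores case
ml   = match(r"^b$"m, "a\nb\nc").match     # "b": `m` anchors at line boundaries
dot  = occursin(r"a.b", "a\nb")            # false: `.` does not match a newline by default
dots = occursin(r"a.b"s, "a\nb")           # true: `s` lets `.` match a newline
free = occursin(r"a b  # trailing comment"x, "ab")   # true: `x` ignores the spaces and the comment
```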

# Base.SubstitutionString — Type

SubstitutionString(substr) <: AbstractString

Stores the given string `substr` as a `SubstitutionString`, for use in regular expression substitutions. It is most commonly constructed using the `@s_str` macro.

Examples

julia> SubstitutionString("Hello \\g<name>, it's \\1")
s"Hello \g<name>, it's \1"

julia> subst = s"Hello \g<name>, it's \1"
s"Hello \g<name>, it's \1"

julia> typeof(subst)
SubstitutionString{String}

# Base.@s_str — Macro

@s_str -> SubstitutionString

Constructs a substitution string, used for regular expression substitutions. Within the string, sequences of the form `\N` refer to the Nth capture group in the regex, and `\g<groupname>` refers to the named capture group with the name `groupname`.

Examples

julia> msg = "#Hello# from Julia";

julia> replace(msg, r"#(.+)# from (?<from>\w+)" => s"FROM: \g<from>; MESSAGE: \1")
"FROM: Julia; MESSAGE: Hello"

# Base.@raw_str — Macro

@raw_str -> String

Creates a raw string without interpolation or escaping. The only exception is that quotation marks still have to be escaped. Backslashes escape both quotation marks and other backslashes, but only when a sequence of backslashes precedes the quotation mark character. Thus, 2n backslashes followed by a quotation mark encode n backslashes and the end of the literal, and 2n+1 backslashes followed by a quotation mark encode n backslashes with a quotation mark after them.

Examples

julia> println(raw"\ $x")
\ $x

julia> println(raw"\"")
"

julia> println(raw"\\\"")
\"

julia> println(raw"\\x \\\"")
\\x \"

# Base.@b_str — Macro

@b_str

Creates an immutable byte (`UInt8`) vector using string syntax.

Examples

julia> v = b"12\x01\x02"
4-element Base.CodeUnits{UInt8, String}:
 0x31
 0x32
 0x01
 0x02

julia> v[2]
0x32

# Base.Docs.@html_str — Macro

@html_str -> Docs.HTML

Creates an `HTML` object from a literal string.

Examples

julia> html"Julia"
HTML{String}("Julia")

# Base.Docs.@text_str — Macro

@text_str -> Docs.Text

Creates a Text object based on a literal string.

Examples

julia> text"Julia"
Julia

# Base.isvalid — Method

isvalid(value) -> Bool

Returns `true` if the given value is valid for its type, which currently can be either `AbstractChar`, `String` or `SubString{String}`.

Examples

julia> isvalid(Char(0xd800))
false

julia> isvalid(SubString(String(UInt8[0xfe,0x80,0x80,0x80,0x80,0x80]),1,2))
false

julia> isvalid(Char(0xd799))
true

# Base.isvalid — Method

isvalid(T, value) -> Bool

Returns `true` if the given value is valid for that type. Currently the type can be either `AbstractChar` or `String`. Values for `AbstractChar` can be of type `AbstractChar` or `UInt32`. Values for `String` can be of that type, `SubString{String}`, `Vector{UInt8}`, or a contiguous subarray thereof.

Examples

julia> isvalid(Char, 0xd800)
false

julia> isvalid(String, SubString("thisisvalid",1,5))
true

julia> isvalid(Char, 0xd799)
true

Compatibility: Julia 1.6

Support for values in the form of a subarray appeared in Julia 1.6.

# Base.isvalid — Method

isvalid(s::AbstractString, i::Integer) -> Bool

A predicate indicating whether the given index is the start of the encoding of a character in `s`. If `isvalid(s, i)` is `true`, then `s[i]` returns the character whose encoding starts at that index. If it is `false`, then `s[i]` raises an invalid index error or a bounds error, depending on whether `i` is in bounds. In order for `isvalid(s, i)` to be an O(1) function, the encoding of `s` must be self-synchronizing (https://en.wikipedia.org/wiki/Self-synchronizing_code). This is a basic assumption of Julia's generic string support.

See also `getindex`, `iterate`, `thisind`, `nextind`, `prevind`, `length`.

Examples

julia> str = "αβγdef";

julia> isvalid(str, 1)
true

julia> str[1]
'α': Unicode U+03B1 (category Ll: Letter, lowercase)

julia> isvalid(str, 2)
false

julia> str[2]
ERROR: StringIndexError: invalid index [2], valid nearby indices [1]=>'α', [3]=>'β'
Stacktrace:
[...]
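Since `isvalid(s, i)` marks exactly the indices at which characters start, it can be used to list them. The sketch below retypes the sample string and compares the result with `eachindex`, which yields the same indices directly.

```julia
str = "αβγdef"

# Collect every code unit index at which a character starts.
starts = [i for i in 1:ncodeunits(str) if isvalid(str, i)]   # [1, 3, 5, 7, 8, 9]

# eachindex visits the same indices without testing every code unit.
same = starts == collect(eachindex(str))                      # true
```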

# Base.match — Function

match(r::Regex, s::AbstractString[, idx::Integer[, addopts]])

Searches for the first match of the regular expression `r` in `s` and returns a `RegexMatch` object containing the match, or `nothing` if the match failed. The matching substring can be retrieved by accessing `m.match`, and the captured sequences by accessing `m.captures`. The optional `idx` argument specifies the index at which to start the search.

Examples

julia> rx = r"a(.)a"
r"a(.)a"

julia> m = match(rx, "cabac")
RegexMatch("aba", 1="b")

julia> m.captures
1-element Vector{Union{Nothing, SubString{String}}}:
 "b"

julia> m.match
"aba"

julia> match(rx, "cabac", 3) === nothing
true
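Because `match` returns `nothing` on failure, callers usually test the result before using it. A minimal sketch of that pattern, with an arbitrary pattern and input:

```julia
m = match(r"(\d+)-(\d+)", "range 10-20")

if m !== nothing
    lo, hi = parse.(Int, m.captures)   # 10 and 20
    start = m.offset                   # 7, the index where "10-20" begins
end

miss = match(r"(\d+)-(\d+)", "no digits here")   # nothing
```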

# Base.eachmatch — Function

eachmatch(r::Regex, s::AbstractString; overlap::Bool=false)

Searches for all matches of the regular expression `r` in `s` and returns an iterator over the matches. If `overlap` is `true`, the matching sequences are allowed to overlap indices in the original string; otherwise they must be from distinct character ranges.

Examples

julia> rx = r"a.a"
r"a.a"

julia> m = eachmatch(rx, "a1a2a3a")
Base.RegexMatchIterator{String}(r"a.a", "a1a2a3a", false)

julia> collect(m)
2-element Vector{RegexMatch}:
 RegexMatch("a1a")
 RegexMatch("a3a")

julia> collect(eachmatch(rx, "a1a2a3a", overlap = true))
3-element Vector{RegexMatch}:
 RegexMatch("a1a")
 RegexMatch("a2a")
 RegexMatch("a3a")
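The iterator is often consumed directly rather than collected. A short sketch with an arbitrary pattern and input:

```julia
text = "a1b22c333"

# Pull out just the matched substrings.
nums = [m.match for m in eachmatch(r"\d+", text)]            # ["1", "22", "333"]

# Or fold over the matches without building an intermediate vector.
total = sum(parse(Int, m.match) for m in eachmatch(r"\d+", text))   # 356
```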

# Base.RegexMatch — Type

RegexMatch <: AbstractMatch

A type representing a single match of a `Regex` found in a string. Typically created by the `match` function.

The `match` field stores the substring of the entire matched string. The `captures` field stores the substrings for each capture group, indexed by number. To index by capture group name, index the match object itself instead, as shown in the examples. The position at which the match starts is stored in the `offset` field. The `offsets` field stores the start positions of each capture group, with 0 denoting a group that was not captured.

This type can be used as an iterator over the capture groups of the `Regex`, yielding the substrings captured in each group. Because of this, the captures of a match can be destructured. If a group was not captured, `nothing` is yielded instead of a substring.

Methods that accept a `RegexMatch` object are defined for `iterate`, `length`, `eltype`, `keys`, `haskey` and `getindex`, where the keys are the names or numbers of the capture groups. See `keys` for more information.

Examples

julia> m = match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "11:30 in the morning")
RegexMatch("11:30", hour="11", minute="30", 3=nothing)

julia> m.match
"11:30"

julia> m.captures
3-element Vector{Union{Nothing, SubString{String}}}:
 "11"
 "30"
 nothing


julia> m["minute"]
"30"

julia> hr, min, ampm = m; # destructure capturing groups by iteration

julia> hr
"11"

# Base.keys — Method

keys(m::RegexMatch) -> Vector

Returns a vector of keys for all capture groups of the underlying regex. A key is included even if the capture group fails to match. That is, `idx` will be in the return value even if `m[idx] == nothing`.

Unnamed capture groups have integer keys corresponding to their index. Named capture groups have string keys.

Compatibility: Julia 1.7

This method was added in Julia 1.7.

Examples

julia> keys(match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "11:30"))
3-element Vector{Any}:
  "hour"
  "minute"
 3

# Base.isless — Method

isless(a::AbstractString, b::AbstractString) -> Bool

Tests whether string `a` comes before string `b` in alphabetical order (technically, in lexicographical order by Unicode code points).

Examples

julia> isless("a", "b")
true

julia> isless("β", "α")
false

julia> isless("a", "a")
false

# Base.:== — Method

==(a::AbstractString, b::AbstractString) -> Bool

Tests whether two strings are equal character by character (technically, Unicode code point by code point). If either string is an `AnnotatedString`, the string properties must match too.

Examples

julia> "abc" == "abc"
true

julia> "abc" == "αβγ"
false

# Base.cmp — Method

cmp(a::AbstractString, b::AbstractString) -> Int

Compares two strings. Returns `0` if both strings have the same length and the characters at each index are the same in both strings. Returns `-1` if `a` is a prefix of `b`, or if `a` comes before `b` in alphabetical order. Returns `1` if `b` is a prefix of `a`, or if `b` comes before `a` in alphabetical order (technically, lexicographical order by Unicode code points).

Examples

julia> cmp("abc", "abc")
0

julia> cmp("ab", "abc")
-1

julia> cmp("abc", "ab")
1

julia> cmp("ab", "ac")
-1

julia> cmp("ac", "ab")
1

julia> cmp("α", "a")
1

julia> cmp("b", "β")
-1

# Base.lpad — Function

lpad(s, n::Integer, p::Union{AbstractChar,AbstractString}=' ') -> String

Converts `s` to a string and pads the result on the left with `p` to make it `n` characters (in `textwidth`) long. If `s` is already `n` characters long, an equal string is returned. Pads with spaces by default.

Examples

julia> lpad("March", 10)
"     March"

Compatibility: Julia 1.7

In Julia 1.7, this function was changed to use `textwidth` rather than a raw character (code point) count.

# Base.rpad — Function

rpad(s, n::Integer, p::Union{AbstractChar,AbstractString}=' ') -> String

Converts `s` to a string and pads the result on the right with `p` to make it `n` characters (in `textwidth`) long. If `s` is already `n` characters long, an equal string is returned. Pads with spaces by default.

Examples

julia> rpad("March", 20)
"March               "

Compatibility: Julia 1.7

In Julia 1.7, this function was changed to use `textwidth` rather than a raw character (code point) count.
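A few further cases for `lpad` and `rpad`, given as a sketch with arbitrary inputs: a non-string argument, a multi-character pad, and an input that is already wider than `n`.

```julia
a = lpad(42, 6, '0')       # "000042": the number is converted to a string first
b = rpad("ab", 5, "xy")    # "abxyx": a multi-character pad is repeated and cut to fit
c = lpad("March", 3)       # "March": an input wider than n is returned without truncation
```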

# Base.findfirst — Method

findfirst(pattern::AbstractString, string::AbstractString)
findfirst(pattern::AbstractPattern, string::String)

Finds the first occurrence of `pattern` in `string`. Equivalent to `findnext(pattern, string, firstindex(string))`.

Examples

julia> findfirst("z", "Hello to the world") # returns nothing, but not printed in the REPL

julia> findfirst("Julia", "JuliaLang")
1:5

# Base.findnext — Method

findnext(pattern::AbstractString, string::AbstractString, start::Integer)
findnext(pattern::AbstractPattern, string::String, start::Integer)

Finds the next occurrence of `pattern` in `string`, starting at position `start`. `pattern` can be either a string or a regular expression, in which case `string` must be of type `String`.

The return value is the range of indices where the matching sequence is found, such that `s[findnext(x, s, i)] == x`:

`findnext("substring", string, i) == start:stop` such that `string[start:stop] == "substring"` and `i <= start`, or `nothing` if there is no match.

Examples

julia> findnext("z", "Hello to the world", 1) === nothing
true

julia> findnext("o", "Hello to the world", 6)
8:8

julia> findnext("Lang", "JuliaLang", 2)
6:9
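Chaining `findnext` calls gives every occurrence of a pattern. The helper below is hypothetical (its name is an assumption made for this sketch) and assumes a non-empty string pattern. Because it resumes one character after the start of each match, overlapping occurrences are reported too.

```julia
# Hypothetical helper, shown only to illustrate how `findnext` is chained.
function find_all_ranges(pattern::AbstractString, s::AbstractString)
    ranges = UnitRange{Int}[]
    r = findfirst(pattern, s)
    while r !== nothing
        push!(ranges, r)
        # Resume one character after the start of the previous match.
        r = findnext(pattern, s, nextind(s, first(r)))
    end
    return ranges
end

hits = find_all_ranges("o", "Hello to the world")   # [5:5, 8:8, 15:15]
```

Using `nextind` rather than `first(r) + 1` keeps the start position a valid character index when the text contains multi-byte characters.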

# Base.findnext — Method

findnext(ch::AbstractChar, string::AbstractString, start::Integer)

Finds the next occurrence of the character `ch` in `string`, starting at position `start`.

Compatibility: Julia 1.3

This method requires a Julia version of 1.3 or higher.

Examples

julia> findnext('z', "Hello to the world", 1) === nothing
true

julia> findnext('o', "Hello to the world", 6)
8

# Base.findlast — Method

findlast(pattern::AbstractString, string::AbstractString)

Finds the last occurrence of `pattern` in `string`. Equivalent to `findprev(pattern, string, lastindex(string))`.

Examples

julia> findlast("o", "Hello to the world")
15:15

julia> findfirst("Julia", "JuliaLang")
1:5

# Base.findlast — Method

findlast(ch::AbstractChar, string::AbstractString)

Finds the last occurrence of the character `ch` in `string`.

Compatibility: Julia 1.3

This method requires a Julia version of 1.3 or higher.

Examples

julia> findlast('p', "happy")
4

julia> findlast('z', "happy") === nothing
true

# Base.findprev — Method

findprev(pattern::AbstractString, string::AbstractString, start::Integer)

Finds the previous occurrence of pattern in string starting from the start position.

The returned value is the range of indexes in which a matching sequence is found, such that s[findprev(x, s, i)] == x:

findprev("substring", string, i) == start:stop, so string[start:stop] == "substring" and stop <= i, or nothing if there are no matches.

Examples

julia> findprev("z", "Hello to the world", 18) === nothing
true

julia> findprev("o", "Hello to the world", 18)
15:15

julia> findprev("Julia", "JuliaLang", 6)
1:5

# Base.occursin — Function

occursin(needle::Union{AbstractString,AbstractPattern,AbstractChar}, haystack::AbstractString)

Determines whether the first argument is a substring of the second one. If needle is a regular expression, it checks whether haystack contains a match.

Examples

julia> occursin("Julia", "JuliaLang is pretty cool!")
true

julia> occursin('a', "JuliaLang is pretty cool!")
true

julia> occursin(r"a.a", "aba")
true

julia> occursin(r"a.a", "abba")
false

# Base.eachsplit — Function

See also `split`.

Compatibility: Julia 1.8

The eachsplit function requires a Julia version of at least 1.8.

Examples

julia> a = "Ma.rch"
"Ma.rch"

julia> b = eachsplit(a, ".")
Base.SplitIterator{String, String}("Ma.rch", ".", 0, true)

julia> collect(b)
2-element Vector{SubString{String}}:
 "Ma"
 "rch"

# Base.eachrsplit — Function

eachrsplit(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
eachrsplit(str::AbstractString; limit::Integer=0, keepempty::Bool=false)

Returns an iterator over `SubString`s of `str`, produced by splitting on the delimiter(s) `dlm` and yielded in reverse order (from right to left). `dlm` can be any of the formats allowed by the first argument of `findprev` (that is, a string, a single character, or a function), or a collection of characters.

If `dlm` is omitted, it defaults to `isspace`, and `keepempty` defaults to `false`.

Optional keyword arguments:

- `limit`: if `limit > 0`, the iterator splits at most `limit - 1` times and then returns the rest of the string unsplit. With `limit < 1` (the default), the number of splits is unlimited.
- `keepempty`: whether empty fields should be returned when iterating. The default is `false` without a `dlm` argument and `true` with a `dlm` argument.

Note that unlike `split`, `rsplit` and `eachsplit`, this function iterates over the substrings from right to left as they occur in the input.

See also `eachsplit` and `rsplit`.

Compatibility: Julia 1.11

This function requires Julia 1.11 or later.

Examples

julia> a = "Ma.r.ch";

julia> collect(eachrsplit(a, ".")) == ["ch", "r", "Ma"]
true

julia> collect(eachrsplit(a, "."; limit=2)) == ["ch", "Ma.r"]
true

# Base.split — Function

split(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
split(str::AbstractString; limit::Integer=0, keepempty::Bool=false)

Splits the string `str` into an array of substrings on occurrences of the delimiter(s) `dlm`. `dlm` can be any of the formats allowed by the first argument of `findnext` (that is, a string, a regular expression, or a function), or a single character or a collection of characters.

If `dlm` is omitted, it defaults to `isspace`.

Optional keyword arguments:

- `limit`: the maximum size of the result; `limit=0` means no maximum (the default).
- `keepempty`: whether empty fields should be kept in the result. The default is `false` without a `dlm` argument and `true` with a `dlm` argument.

See also `rsplit` and `eachsplit`.
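The effect of the keyword arguments and of omitting `dlm` can be sketched with arbitrary inputs:

```julia
a = split("a,b,,c", ",")                    # ["a", "b", "", "c"]: empty field kept by default
b = split("a,b,,c", ","; keepempty=false)   # ["a", "b", "c"]
c = split("a,b,,c", ","; limit=2)           # ["a", "b,,c"]: at most two pieces
d = split("  a  b ")                        # ["a", "b"]: whitespace split drops empty fields
```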

Examples

julia> a = "Ma.rch"
"Ma.rch"

julia> split(a, ".")
2-element Vector{SubString{String}}:
 "Ma"
 "rch"

# Base.rsplit — Function

rsplit(s::AbstractString; limit::Integer=0, keepempty::Bool=false)
rsplit(s::AbstractString, chars; limit::Integer=0, keepempty::Bool=true)

Similar to `split`, but starting from the end of the string.

Examples

julia> a = "M.a.r.c.h"
"M.a.r.c.h"

julia> rsplit(a, ".")
5-element Vector{SubString{String}}:
 "M"
 "a"
 "r"
 "c"
 "h"

julia> rsplit(a, "."; limit=1)
1-element Vector{SubString{String}}:
 "M.a.r.c.h"

julia> rsplit(a, "."; limit=2)
2-element Vector{SubString{String}}:
 "M.a.r.c"
 "h"

# Base.strip — Function

strip([pred=isspace,] str::AbstractString) -> SubString
strip(str::AbstractString, chars) -> SubString

Removes leading and trailing characters from `str`, either those specified by `chars` or those for which the function `pred` returns `true`.

The default behaviour is to remove leading and trailing whitespace and delimiters: see `isspace` for precise details.

The optional `chars` argument specifies which characters to remove: it can be a single character, a vector, or a set of characters.

See also `lstrip` and `rstrip`.

Compatibility: Julia 1.2

The method that accepts a predicate function requires Julia 1.2 or later.
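The three call forms can be compared side by side. This is a sketch with arbitrary inputs; note that the result is a `SubString` view rather than a new `String`.

```julia
a = strip("  padded\n")            # "padded": default whitespace stripping
b = strip(isdigit, "123abc456")    # "abc": predicate form
c = strip("--x--", '-')            # "x": single-character form
T = typeof(a)                      # SubString{String}, a view into the original string
```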

Examples

julia> strip("{3, 5}\n", ['{', '}', '\n'])
"3, 5"

# Base.lstrip — Function

lstrip([pred=isspace,] str::AbstractString) -> SubString
lstrip(str::AbstractString, chars) -> SubString

Removes leading characters from `str`, either those specified by `chars` or those for which the function `pred` returns `true`.

The default behaviour is to remove leading whitespace and delimiters: see `isspace` for precise details.

The optional `chars` argument specifies which characters to remove: it can be a single character, a vector, or a set of characters.

See also `strip` and `rstrip`.

Examples

julia> a = lpad("March", 20)
"               March"

julia> lstrip(a)
"March"

# Base.rstrip — Function

rstrip([pred=isspace,] str::AbstractString) -> SubString
rstrip(str::AbstractString, chars) -> SubString

Removes trailing characters from `str`, either those specified by `chars` or those for which the function `pred` returns `true`.

The default behaviour is to remove trailing whitespace and delimiters: see `isspace` for precise details.

The optional `chars` argument specifies which characters to remove: it can be a single character, a vector, or a set of characters.

See also `strip` and `lstrip`.

Examples

julia> a = rpad("March", 20)
"March               "

julia> rstrip(a)
"March"

# Base.startswith — Function

startswith(s::AbstractString, prefix::Union{AbstractString,Base.Chars})

Returns the value true if the string s begins with the value of the prefix argument, which can be a string, a character, or a tuple, vector, or set of characters. If the prefix is a tuple, vector, or set of characters, it checks whether the first character of the string s is included in this set.

See also `endswith` and `contains`.

Examples

julia> startswith("JuliaLang", "Julia")
true

startswith(io::IO, prefix::Union{AbstractString,Base.Chars})

Checks whether an `IO` object starts with a prefix, which can be a string, a character, or a tuple, vector, or set of characters. See also `peek`.

startswith(prefix)

Creates a function that checks whether its argument starts with `prefix`, that is, a function equivalent to `y -> startswith(y, prefix)`.

The returned function is of type `Base.Fix2{typeof(startswith)}` and can be used to implement specialized methods.

Compatibility: Julia 1.5

The single-argument `startswith(prefix)` requires Julia 1.5 or later.

Examples

julia> startswith("Julia")("JuliaLang")
true

julia> startswith("Julia")("Ends with Julia")
false
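The single-argument form is mainly useful as a predicate for higher-order functions. A minimal sketch with an arbitrary list of names:

```julia
names = ["Julia", "Java", "June", "Rust"]

ju = filter(startswith("Ju"), names)     # ["Julia", "June"]

f = startswith("Ju")
isfix = f isa Base.Fix2                  # true
```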

startswith(s::AbstractString, prefix::Regex)

Returns `true` if the string `s` starts with the regex pattern `prefix`.

`startswith` does not compile the anchoring into the regular expression, but instead passes the anchoring as `match_option` to PCRE. If compile time is amortized, `occursin(r"^...", s)` is faster than `startswith(s, r"...")`.

See also `occursin` and `endswith`.

Compatibility: Julia 1.2

This method requires Julia 1.2 or later.

Examples

julia> startswith("JuliaLang", r"Julia|Romeo")
true

# Base.endswith — Function

endswith(s::AbstractString, suffix::Union{AbstractString,Base.Chars})

Returns the value true if the string s ends with the value of the suffix argument, which can be a string, a character, or a tuple, vector, or set of characters. If the suffix is a tuple, vector, or set of characters, it checks whether the last character of the string s is included in this set.

See also `startswith` and `contains`.

Examples

julia> endswith("Sunday", "day")
true

endswith(suffix)

Creates a function that checks whether its argument ends with a suffix, that is, a function equivalent to y -> endswith(y, suffix).

The returned function is of type Base.Fix2{typeof(endswith)} and can be used to implement specialized methods.

Compatibility: Julia 1.5

The single-argument method endswith(suffix) requires at least Julia 1.5.

Examples

julia> endswith("Julia")("Ends with Julia")
true

julia> endswith("Julia")("JuliaLang")
false

endswith(s::AbstractString, suffix::Regex)

Returns true if the string s ends with the regular expression pattern suffix.

endswith does not compile the anchoring into the regular expression, but instead passes the anchoring as match_option to PCRE. If compile time is amortized, occursin(r"...$", s) is faster than endswith(s, r"...").

See also occursin and startswith.

Compatibility: Julia 1.2

This method requires a Julia version of at least 1.2.

Examples

julia> endswith("JuliaLang", r"Lang|Roberts")
true

# Base.contains — Function

contains(haystack::AbstractString, needle)

Returns true if haystack contains needle. This is the same as occursin(needle, haystack), but is provided for consistency with startswith(haystack, needle) and endswith(haystack, needle).

See also occursin, in and issubset.

Examples

julia> contains("JuliaLang is pretty cool!", "Julia")
true

julia> contains("JuliaLang is pretty cool!", 'a')
true

julia> contains("aba", r"a.a")
true

julia> contains("abba", r"a.a")
false

Compatibility: Julia 1.5

The contains function requires at least Julia 1.5.

contains(needle)

Creates a function that checks whether its argument contains a needle, that is, a function equivalent to haystack -> contains(haystack, needle).

The returned function is of type Base.Fix2{typeof(contains)} and can be used to implement specialized methods.
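
As a brief sketch of the one-argument form, the returned predicate can be used with filter. The words vector is a made-up example, and the needle may be a string or a regular expression, as in the two-argument method.

```julia
# Hypothetical input, used only for illustration.
words = ["JuliaLang", "Python", "julialang.org", "Rust"]

# contains("Lang") is equivalent to haystack -> contains(haystack, "Lang").
exact_hits = filter(contains("Lang"), words)

# A case-insensitive regular expression also matches the lowercase form.
regex_hits = filter(contains(r"lang"i), words)

println(exact_hits)  # ["JuliaLang"]
println(regex_hits)  # ["JuliaLang", "julialang.org"]
```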

# Base.first — Method

first(s::AbstractString, n::Integer)

Returns a string consisting of the first n characters of the string s.

Examples

julia> first("∀ϵ≠0: ϵ²>0", 0)
""

julia> first("∀ϵ≠0: ϵ²>0", 1)
"∀"

julia> first("∀ϵ≠0: ϵ²>0", 3)
"∀ϵ≠"

# Base.last — Method

last(s::AbstractString, n::Integer)

Returns a string consisting of the last n characters of the string s.

Examples

julia> last("∀ϵ≠0: ϵ²>0", 0)
""

julia> last("∀ϵ≠0: ϵ²>0", 1)
"0"

julia> last("∀ϵ≠0: ϵ²>0", 3)
"²>0"

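Two properties of first and last that the examples above only hint at are sketched below: n counts characters rather than code units, and, as far as current Base behavior goes, asking for more characters than the string contains returns the whole string instead of raising an error.

```julia
s = "∀ϵ≠0"

# n counts characters: the single character "∀" occupies 3 code units in UTF-8.
head = first(s, 1)
head_units = ncodeunits(head)

# Requesting more characters than length(s) yields the entire string.
whole_from_first = first(s, 10)
whole_from_last = last(s, 10)

println(head)              # ∀
println(head_units)        # 3
println(whole_from_first)  # ∀ϵ≠0
```
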
# Base.Unicode.uppercase — Function

uppercase(c::AbstractChar)

Converts c to uppercase.

See also lowercase and titlecase.

Examples

julia> uppercase('a')
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> uppercase('ê')
'Ê': Unicode U+00CA (category Lu: Letter, uppercase)

uppercase(s::AbstractString)

Returns the string s with all characters converted to uppercase.

See also lowercase, titlecase and uppercasefirst.

Examples

julia> uppercase("Julia")
"JULIA"

# Base.Unicode.lowercase — Function

lowercase(c::AbstractChar)

Converts c to lowercase.

See also uppercase and titlecase.

Examples

julia> lowercase('A')
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

julia> lowercase('Ö')
'ö': Unicode U+00F6 (category Ll: Letter, lowercase)

lowercase(s::AbstractString)

Returns the string s with all characters converted to lowercase.

See also uppercase, titlecase and lowercasefirst.

Examples

julia> lowercase("STRINGS AND THINGS")
"strings and things"

# Base.Unicode.titlecase — Function

titlecase(c::AbstractChar)

Converts c to titlecase. This may differ from uppercase for digraphs; compare the examples below.

See also uppercase and lowercase.

Examples

julia> titlecase('a')
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> titlecase('ǆ')
'ǅ': Unicode U+01C5 (category Lt: Letter, titlecase)

julia> uppercase('ǆ')
'Ǆ': Unicode U+01C4 (category Lu: Letter, uppercase)

titlecase(s::AbstractString; [wordsep::Function], strict::Bool=true) -> String

Capitalizes the first character of each word in the string s. If strict is true, every other character is converted to lowercase; otherwise, the other characters are left unchanged. By default, all non-letter characters that begin a new grapheme are considered word separators. A predicate can be passed as the wordsep keyword argument to determine which characters should be considered word separators. See also uppercasefirst, which capitalizes only the first character in s.

See also uppercase, lowercase and uppercasefirst.

Examples

julia> titlecase("the JULIA programming language")
"The Julia Programming Language"

julia> titlecase("ISS - international space station", strict=false)
"ISS - International Space Station"

julia> titlecase("a-a b-b", wordsep = c->c==' ')
"A-a B-b"

# Base.Unicode.uppercasefirst — Function

uppercasefirst(s::AbstractString) -> String

Returns the string s with its first character converted to uppercase (technically, to title case in Unicode terms). See also titlecase, which capitalizes the first character of every word in s.

See also lowercasefirst, uppercase, lowercase and titlecase.

Examples

julia> uppercasefirst("python")
"Python"

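The difference between uppercasefirst and titlecase can be sketched with a side-by-side comparison. The input strings are arbitrary; titlecase is called with its default strict=true.

```julia
s = "hello wide world"

# uppercasefirst changes only the first character of the whole string.
only_first = uppercasefirst(s)

# titlecase capitalizes the first character of every word.
every_word = titlecase(s)

# uppercasefirst leaves the remaining characters untouched, whereas
# titlecase (with strict=true) lowercases them.
kept = uppercasefirst("jULIA")
lowered = titlecase("jULIA")

println(only_first)  # Hello wide world
println(every_word)  # Hello Wide World
println(kept)        # JULIA
println(lowered)     # Julia
```
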
# Base.Unicode.lowercasefirst — Function

lowercasefirst(s::AbstractString)

Returns the string s with its first character converted to lowercase.

See also uppercasefirst, uppercase, lowercase and titlecase.

Examples

julia> lowercasefirst("Julia")
"julia"

# Base.join — Function

join([io::IO,] iterator [, delim [, last]])

Joins the elements of iterator into a single string, inserting the delimiter delim (if specified) between adjacent elements. If last is given, it is used instead of delim between the last two elements. Each element of iterator is converted to a string via print(io::IOBuffer, x). If io is given, the result is written to io rather than returned as a String.

Examples

julia> join(["apples", "bananas", "pineapples"], ", ", " and ")
"apples, bananas and pineapples"

julia> join([1,2,3,4,5])
"12345"

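The io form and the handling of last with only two elements are sketched below, using an in-memory IOBuffer as the output stream. The element values are arbitrary.

```julia
# Writing into a stream: join returns nothing and the text goes to io.
io = IOBuffer()
join(io, ["apples", "bananas", "pineapples"], ", ", " and ")
written = String(take!(io))

# With exactly two elements, only `last` appears between them.
pair = join(["apples", "bananas"], ", ", " and ")

# Elements need not be strings; each one is printed in turn.
mixed = join((1, 2.5, 'c', "d"), "-")

println(written)  # apples, bananas and pineapples
println(pair)     # apples and bananas
println(mixed)    # 1-2.5-c-d
```
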
# Base.chop — Function

chop(s::AbstractString; head::Integer = 0, tail::Integer = 1)

Removes the first head and the last tail characters from the string s. The call chop(s) removes the last character from s. If more characters are requested to be removed than length(s), an empty string is returned.

See also chomp, startswith and first.

Examples

julia> a = "March"
"March"

julia> chop(a)
"Marc"

julia> chop(a, head = 1, tail = 2)
"ar"

julia> chop(a, head = 5, tail = 5)
""

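Since head and tail count characters, chop also works on strings with multi-byte characters. The sketch below additionally relies on an observation about current Base behavior that is not stated in the text above: the result is a SubString view of s, which can be copied with String when an independent string is needed.

```julia
s = "∀ϵ≠0"

# head and tail are measured in characters, not code units.
without_last = chop(s)
middle = chop(s, head = 1, tail = 1)

# Assumption based on current Base behavior: the result is a view into s.
is_view = chop(s) isa SubString
copied = String(chop(s))

println(without_last)  # ∀ϵ≠
println(middle)        # ϵ≠
```
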
# Base.chopprefix — Function

chopprefix(s::AbstractString, prefix::Union{AbstractString,Regex}) -> SubString

Removes the prefix prefix from the string s. If the string s does not start with prefix, a string equal to s is returned.
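
A short sketch of both outcomes described above, plus the Regex form of the prefix. It assumes Julia 1.8 or later, where chopprefix is available; the input strings are arbitrary.

```julia
# The prefix is present, so it is removed.
removed = chopprefix("Hamburger", "Ham")

# The prefix is absent, so the result is equal to the input.
unchanged = chopprefix("Hamburger", "hotdog")

# A regular expression is matched at the start of the string only.
by_regex = chopprefix("Hamburger", r"[A-Z]am")

println(removed)    # burger
println(unchanged)  # Hamburger
println(by_regex)   # burger
```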

`AnnotatedString` objects

The API for AnnotatedString objects is considered experimental and is subject to change between Julia versions.

# Base.AnnotatedString — Type

AnnotatedString{S <: AbstractString} <: AbstractString

A string with metadata, in the form of annotated regions.

More specifically, this is a simple wrapper around any other AbstractString that allows regions of the wrapped string to be annotated with labeled values.

                           C
                    ┌──────┸─────────┐
  "this is an example annotated string"
  └──┰────────┼─────┘         │
     A        └─────┰─────────┘
                    B

The diagram above represents an AnnotatedString with three annotated regions (labeled A, B and C). Each annotation holds a label (Symbol) and a value (Any). These three pieces of information are stored as a @NamedTuple{region::UnitRange{Int64}, label::Symbol, value}.

Labels do not need to be unique: the same region can hold multiple annotations with the same label.

In general, code written for AnnotatedString should preserve the following properties:

- Which characters an annotation is applied to.
- The order in which annotations are applied to each character.

In specific cases of using AnnotatedString, additional semantics may be introduced.

A consequence of these rules is that adjacent annotations with identical labels and values are equivalent to a single annotation covering the combined range.

See also AnnotatedChar, annotatedstring, annotations and annotate!.

Constructors

AnnotatedString(s::S<:AbstractString) -> AnnotatedString{S}
AnnotatedString(s::S<:AbstractString, annotations::Vector{@NamedTuple{region::UnitRange{Int64}, label::Symbol, value}})

An AnnotatedString can also be created with annotatedstring, which acts much like string but preserves any annotations present in the arguments.

Examples

julia> AnnotatedString("this is an example annotated string",
                    [(1:18, :A => 1), (12:28, :B => 2), (18:35, :C => 3)])
"this is an example annotated string"

# Base.AnnotatedChar — Type

AnnotatedChar{S <: AbstractChar} <: AbstractChar

A Char object with annotations.

More specifically, this is a simple wrapper around any other AbstractChar that holds a list of arbitrary labeled annotations (@NamedTuple{label::Symbol, value}) together with the wrapped character.

See also AnnotatedString, annotatedstring, annotations and annotate!.

Constructors

AnnotatedChar(s::S) -> AnnotatedChar{S}
AnnotatedChar(s::S, annotations::Vector{@NamedTuple{label::Symbol, value}})

Examples

julia> AnnotatedChar('j', :label => 1)
'j': ASCII/Unicode U+006A (category Ll: Letter, lowercase)

# Base.annotatedstring — Function

annotatedstring(values...)

Creates an AnnotatedString from any number of values using their printed representation (as produced by print).

This acts like string, but takes care to preserve any annotations present (in the form of AnnotatedString or AnnotatedChar values).

See also AnnotatedString and AnnotatedChar.

Examples

julia> annotatedstring("now a AnnotatedString")
"now a AnnotatedString"

julia> annotatedstring(AnnotatedString("annotated", [(1:9, :label => 1)]), ", and unannotated")
"annotated, and unannotated"

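The printing behavior described above can be sketched for plain, unannotated arguments. This assumes a Julia version in which Base.annotatedstring and Base.annotations exist (1.11 or later); since this API is experimental, the names are qualified with Base and the sketch avoids constructing annotations directly.

```julia
# Non-string values are converted using their printed representation.
s = Base.annotatedstring("count: ", 42, '!')

# The result is an AnnotatedString even when no argument carries annotations.
is_annotated = s isa Base.AnnotatedString
plain = String(s)
n_annotations = length(Base.annotations(s))

println(plain)          # count: 42!
println(n_annotations)  # 0
```
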
# Base.annotations — Function

annotations(str::Union{AnnotatedString, SubString{AnnotatedString}},
            [position::Union{Integer, UnitRange}]) ->
    Vector{@NamedTuple{region::UnitRange{Int64}, label::Symbol, value}}

Retrieves all annotations that apply to str. If position is specified, only annotations that overlap with position are returned.

Annotations are provided together with the regions they apply to, in the form of a vector of region-annotation tuples.

According to the semantics described in AnnotatedString, the order of the returned annotations corresponds to the order in which they were applied.