Strings
#
Core.AbstractString
— Type
The `AbstractString` type is the supertype of all string implementations in Julia. Strings are encodings of sequences of Unicode (https://unicode.org/) code points, as represented by the `AbstractChar` type. Julia makes a few assumptions about strings:
- Strings are encoded in terms of fixed-size "code units".
  - Code units can be extracted with `codeunit(s, i)`.
  - The first code unit has index `1`.
  - The last code unit has index `ncodeunits(s)`.
  - Any index `i` such that `1 ≤ i ≤ ncodeunits(s)` is in bounds.
- String indexing is done in terms of these code units.
  - Characters are extracted with `s[i]`, where `i` is a valid string index.
  - Each `AbstractChar` in a string is encoded by one or more code units.
  - Only the index of the first code unit of an `AbstractChar` is a valid index.
  - The encoding of an `AbstractChar` is independent of what precedes or follows it.
  - String encodings are self-synchronizing (https://en.wikipedia.org/wiki/Self-synchronizing_code), i.e. `isvalid(s, i)` is O(1).
Some string functions that extract code units, characters, or substrings from strings throw an error if they are passed out-of-bounds or invalid string indices. These include `codeunit(s, i)` and `s[i]`. Functions that perform string index arithmetic take a more relaxed approach to indexing and return the closest valid string index when the index is in bounds. When it is out of bounds, they behave as if there were an infinite number of characters padding each side of the string. Usually these imaginary padding characters have a code unit length of `1`, but string types may choose different "imaginary" character sizes if that makes sense for their implementations (for example, substrings may pass index arithmetic through to the underlying string they provide a view into). Relaxed indexing functions include those intended for index arithmetic: `thisind`, `nextind` and `prevind`. This model allows index arithmetic to work with out-of-bounds indices as intermediate values, as long as they are never used to retrieve a character, which often helps avoid the need to code around edge cases.
See also `codeunit`, `ncodeunits`, `thisind`, `nextind`, `prevind`.
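The difference between strict extraction and relaxed index arithmetic can be sketched as follows. This is a minimal illustration that assumes the built-in UTF-8 `String` type; the variable names are made up for the example.

```julia
s = "αβγ"                      # three characters, two UTF-8 code units each

n = ncodeunits(s)              # 6 code units in total
valid = [i for i in 1:n if isvalid(s, i)]   # indices where a character starts

# Index arithmetic is relaxed: these calls return indices instead of throwing
start_of_2 = thisind(s, 2)     # index 2 is inside 'α', so this steps back to 1
after_1    = nextind(s, 1)     # start of the character following index 1
before_1   = prevind(s, 1)     # 0: one imaginary character before the string
past_end   = nextind(s, n)     # n + 1: one imaginary character after the string

# Extraction is strict: s[2] would throw a StringIndexError
second_char = s[after_1]
```

Out-of-bounds results such as `0` or `n + 1` are safe as intermediate values, provided they are not used to index into the string.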
#
Core.AbstractChar
— Type
The `AbstractChar` type is the supertype of all character implementations in Julia. A character represents a Unicode code point and can be converted to an integer with the `codepoint` function to obtain the numerical value of the code point, or constructed from the same integer. These numerical values determine, for example, how characters are compared with `<` and `==`. New `T <: AbstractChar` types should define, at a minimum, a `codepoint(::T)` method and a `T(::UInt32)` constructor.
A given `AbstractChar` subtype may be capable of representing only a subset of Unicode, in which case conversion from an unsupported `UInt32` value may throw an error. Conversely, the built-in `Char` type represents a superset of Unicode (in order to losslessly encode invalid byte streams), in which case conversion of a non-Unicode value to `UInt32` throws an error. The `isvalid` function can be used to check which code points are representable in a given `AbstractChar` type.
Internally, an `AbstractChar` type may use a variety of encodings. Conversion via `codepoint(char)` does not reveal this encoding, because it always returns the Unicode value of the character. `print(io, c)` of any `c::AbstractChar` produces an encoding determined by `io` (UTF-8 for all built-in `IO` types), via conversion to `Char` if necessary.
In contrast, `write(io, c)` may emit an encoding that depends on `typeof(c)`, and `read(io, typeof(c))` should read the same encoding as `write`. New `AbstractChar` types must provide their own implementations of `write` and `read`.
#
Core.Char
— Type
Char(c::Union{Number,AbstractChar})
`Char` is a 32-bit `AbstractChar` type that is the default representation of characters in Julia. `Char` is the type used for character literals such as `'x'`, and it is also the element type of `String`.
To losslessly represent arbitrary byte streams stored in a `String`, a `Char` value may hold information that cannot be converted to a Unicode code point; converting such a `Char` to `UInt32` throws an error. The `isvalid(c::Char)` function can be used to check whether `c` represents a valid Unicode character.
#
Base.codepoint
— Function
codepoint(c::AbstractChar) -> Integer
Returns the Unicode code point (an unsigned integer) corresponding to the character `c` (or throws an exception if `c` does not represent a valid character). For `Char`, this is a `UInt32` value, but `AbstractChar` types that represent only a subset of Unicode may return a different-sized integer (for example, `UInt8`).
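As a brief sketch of the round trip between `Char` and its code point, and of how an invalid `Char` behaves (the variable names are illustrative, and the invalid character is obtained here by indexing a string that contains a stray `0xff` byte):

```julia
c = 'α'
cp = codepoint(c)              # UInt32 value of the code point
roundtrip = Char(cp)           # constructing a Char from the same integer

# A Char taken from a string with an invalid UTF-8 byte is not a valid
# Unicode character, so codepoint throws for it.
bad = "\xff"[1]
bad_is_valid = isvalid(bad)
threw = try
    codepoint(bad)
    false
catch
    true
end
```

Checking `isvalid(c)` before calling `codepoint(c)` is one way to avoid the exception when processing data of unknown origin.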
#
Base.length
— Method
length(s::AbstractString) -> Int
length(s::AbstractString, i::Integer, j::Integer) -> Int
Returns the number of characters in the string `s` from index `i` through index `j`.
This is computed as the number of code unit indices from `i` to `j` that are valid character indices. With only a single string argument, this computes the number of characters in the entire string. With `i` and `j` arguments, it computes the number of indices between `i` and `j` inclusive that are valid indices in the string `s`. In addition to in-bounds values, `i` may take the out-of-bounds value `ncodeunits(s) + 1` and `j` may take the out-of-bounds value `0`.
The time complexity of this operation is linear in general. That is, it takes time proportional to the number of bytes or characters in the string, because the value is counted on the fly. This is in contrast to the method for arrays, which is a constant-time operation.
Examples
julia> length("jμΛIα")
5
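The three-argument form and the permitted out-of-bounds values can be illustrated with the same string. This is only a sketch; the variable names are chosen for the example.

```julia
s = "jμΛIα"                    # 'μ', 'Λ' and 'α' each take two UTF-8 code units

total  = length(s)             # number of characters
units  = ncodeunits(s)         # number of code units (bytes for String)
first3 = length(s, 1, 3)       # characters starting at indices 1..3: 'j' and 'μ'

# i may be ncodeunits(s) + 1 and j may be 0; both describe empty ranges
past_end_count     = length(s, units + 1, units)
before_start_count = length(s, 1, 0)
```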
#
Base.sizeof
— Method
sizeof(str::AbstractString)
The size, in bytes, of the string `str`. It is equal to the number of code units in `str` multiplied by the size, in bytes, of one code unit in `str`.
Examples
julia> sizeof("")
0
julia> sizeof("∀")
3
#
Base.:*
— Method
*(s::Union{AbstractString, AbstractChar}, t::Union{AbstractString, AbstractChar}...) -> AbstractString
Concatenates strings and/or characters, producing a `String` or `AnnotatedString` (as appropriate). This is equivalent to calling the `string` or `annotatedstring` function on the arguments. Concatenation of built-in string types always produces a value of type `String`, but other string types may choose to return a string of a different type.
Examples
julia> "Hello " * "world"
"Hello world"
julia> 'j' * "ulia"
"julia"
#
Base.string
— Function
string(n::Integer; base::Integer = 10, pad::Integer = 1)
Converts an integer `n` to a string in the given `base`, optionally specifying the number of digits to pad to.
See also `digits`, `bitstring` and `count_zeros`.
Examples
julia> string(5, base = 13, pad = 4)
"0005"
julia> string(-13, base = 5, pad = 4)
"-0023"
string(xs...)
Creates a string from any values using the `print` function.
`string` should usually not be defined directly. Instead, define a method `print(io::IO, x::MyType)`. If `string(x)` for a certain type needs to be highly efficient, it may make sense to add a method to `string` and define `print(io::IO, x::MyType) = print(io, string(x))` so that the functions are consistent.
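The recommended pattern of defining `print` rather than `string` might look like the sketch below. The `Celsius` type is hypothetical and exists only for this illustration.

```julia
# Hypothetical type used only for illustration
struct Celsius
    degrees::Float64
end

# Define print; string(x) and string interpolation pick it up automatically
Base.print(io::IO, t::Celsius) = print(io, t.degrees, " °C")

label  = string("Temperature: ", Celsius(21.5))
interp = "Now: $(Celsius(-3.0))"

# The integer method with base and pad keywords
binary = string(42, base = 2, pad = 8)
```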
Examples
julia> string("a", 1, true)
"a1true"
#
Base.repr
— Method
repr(x; context=nothing)
Creates a string from any value using the `show` function. You should not add methods to `repr`; define a `show` method instead.
The optional keyword argument `context` can be set to a `:key=>value` pair, a tuple of `:key=>value` pairs, or an `IO` or `IOContext` object whose attributes are used for the I/O stream passed to `show`.
Note that `repr(x)` is usually similar to how the value of `x` would be entered in Julia. See also `repr(MIME("text/plain"), x)`, which returns a formatted version of `x` that is easier to read, equivalent to the REPL display of `x`.
Compatibility: Julia 1.7
Passing a tuple to the keyword argument `context` requires Julia 1.7 or later.
Examples
julia> repr(1)
"1"
julia> repr(zeros(3))
"[0.0, 0.0, 0.0]"
julia> repr(big(1/3))
"0.333333333333333314829616256247390992939472198486328125"
julia> repr(big(1/3), context=:compact => true)
"0.333333"
#
Core.String
— Method
String(s::AbstractString)
Creates a new `String` from an existing `AbstractString`.
#
Base.SubString
— Type
SubString(s::AbstractString, i::Integer, j::Integer=lastindex(s))
SubString(s::AbstractString, r::UnitRange{<:Integer})
Like `getindex`, but returns a view into the parent string `s` within the range `i:j` or `r` respectively, instead of making a copy.
The `@views` macro converts any string slices `s[i:j]` into substrings `SubString(s, i, j)` in a block of code.
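The contrast between copying slices and views can be sketched as follows (the string and variable names are illustrative):

```julia
s = "Hello, world"

sub = SubString(s, 1, 5)       # a view into s, no copy of the data
copy_slice = s[1:5]            # ordinary slicing allocates a new String

# Inside @views, slicing syntax produces SubString objects instead
view_slice = @views s[8:12]

# A SubString compares equal to a String with the same contents
same_text = sub == copy_slice
```

Views are usually worthwhile when many slices of a large string are taken and none of them needs to outlive the parent string.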
Examples
julia> SubString("abc", 1, 2)
"ab"
julia> SubString("abc", 1:2)
"ab"
julia> SubString("abc", 2)
"bc"
#
Base.LazyString
— Type
LazyString <: AbstractString
A lazy representation of string interpolation. This is useful when a string needs to be constructed in a context where performing the actual interpolation and string construction is unnecessary or undesirable (for example, in error paths of functions).
This type is designed to be cheap to construct at runtime, offloading as much work as possible to either the macro or later printing operations.
Examples
julia> n = 5; str = LazyString("n is ", n)
"n is 5"
See also `@lazy_str`.
Compatibility: Julia 1.8
`LazyString` requires Julia 1.8 or later.
Extended help
Safety properties for concurrent programs
A lazy string itself does not introduce any concurrency problems, even if it is printed in multiple Julia tasks. However, if the `print` methods on a captured value can have a concurrency issue when invoked without synchronization, printing the lazy string may cause an issue. Furthermore, the `print` methods on the captured values may be invoked multiple times, though only one result will be returned.
Compatibility: Julia 1.9
`LazyString` is safe in the above sense in Julia 1.9 and later.
#
Base.@lazy_str
— Macro
lazy"str"
Creates a `LazyString` using regular string interpolation syntax. Note that interpolations are evaluated at `LazyString` construction time, but printing is delayed until the first access to the string.
See the `LazyString` documentation for the safety properties for concurrent programs.
Examples
julia> n = 5; str = lazy"n is $n"
"n is 5"
julia> typeof(str)
LazyString
Compatibility: Julia 1.8
`lazy"str"` requires Julia 1.8 or later.
#
Base.transcode
— Function
transcode(T, src)
Converts string data between Unicode encodings. `src` is either a `String` or a `Vector{UIntXX}` of UTF-XX code units, where `XX` is 8, 16, or 32. `T` indicates the encoding of the return value: `String` to return a (UTF-8 encoded) `String`, or `UIntXX` to return a `Vector{UIntXX}` of UTF-XX data. (The alias `Cwchar_t` can also be used as the integer type, for converting `wchar_t*` strings used by external C libraries.)
The `transcode` function succeeds as long as the input data can be reasonably represented in the target encoding; it always succeeds for conversions between UTF-XX encodings, even for invalid Unicode data.
Only conversion to and from UTF-8 is currently supported.
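A short sketch of a round trip through UTF-16, including a character that needs a surrogate pair (variable names are illustrative):

```julia
str = "αβγ"

utf16 = transcode(UInt16, str)       # Vector{UInt16} of UTF-16 code units
utf8  = transcode(UInt8, utf16)      # back to a Vector{UInt8} of UTF-8 code units
back  = transcode(String, utf16)     # or directly to a String

# A character outside the Basic Multilingual Plane (U+1F600 here)
# is encoded as two UTF-16 code units, a surrogate pair
pair = transcode(UInt16, "😀")
```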
Examples
julia> str = "αβγ"
"αβγ"
julia> transcode(UInt16, str)
3-element Vector{UInt16}:
0x03b1
0x03b2
0x03b3
julia> transcode(String, transcode(UInt16, str))
"αβγ"
#
Base.unsafe_string
— Function
unsafe_string(p::Ptr{UInt8}, [length::Integer])
Copies a string from the address of a C-style (NUL-terminated) string encoded as UTF-8. (The pointer can be safely freed afterwards.) If `length` is specified (the length of the data in bytes), the string does not have to be NUL-terminated.
This function is labeled "unsafe" because it will crash if `p` is not a valid memory address to data of the requested length.
#
Base.ncodeunits
— Method
ncodeunits(s::AbstractString) -> Int
Returns the number of code units in a string. Indices that are in bounds for accessing this string must satisfy `1 ≤ i ≤ ncodeunits(s)`. Not all such indices are valid: an index may not be the start of a character, but `codeunit(s, i)` will still return a code unit value for it.
Examples
julia> ncodeunits("The Julia Language")
18
julia> ncodeunits("∫eˣ")
6
julia> ncodeunits('∫'), ncodeunits('e'), ncodeunits('ˣ')
(3, 1, 2)
See also `codeunit`, `checkbounds`, `sizeof`, `length`, `lastindex`.
#
Base.codeunit
— Function
codeunit(s::AbstractString) -> Type{<:Union{UInt8, UInt16, UInt32}}
Returns the code unit type of the given string object. For ASCII, Latin-1, or UTF-8 encoded strings, this is `UInt8`; for UCS-2 and UTF-16 it is `UInt16`; for UTF-32 it is `UInt32`. The code unit type need not be limited to these three types, but almost all widely used string encodings use one of them. `codeunit(s)` is the same as `typeof(codeunit(s, 1))` when `s` is a non-empty string.
See also `ncodeunits`.
codeunit(s::AbstractString, i::Integer) -> Union{UInt8, UInt16, UInt32}
Returns the code unit value in the string `s` at index `i`. Note that
codeunit(s, i) :: codeunit(s)
That is, the value returned by `codeunit(s, i)` is of the type returned by `codeunit(s)`.
Examples
julia> a = codeunit("Hello", 2)
0x65
julia> typeof(a)
UInt8
See also `ncodeunits` and `checkbounds`.
#
Base.codeunits
— Function
codeunits(s::AbstractString)
Returns a vector-like object containing the code units of a string. By default it returns a `CodeUnits` wrapper, but `codeunits` may optionally be defined for new string types if necessary.
Examples
julia> codeunits("Juλia")
6-element Base.CodeUnits{UInt8, String}:
0x4a
0x75
0xce
0xbb
0x69
0x61
#
Base.ascii
— Function
ascii(s::AbstractString)
Converts a string to the `String` type and checks that it contains only ASCII data; otherwise throws an `ArgumentError` indicating the position of the first non-ASCII byte.
See also the `isascii` predicate to filter or replace non-ASCII characters.
Examples
julia> ascii("abcdeγfgh")
ERROR: ArgumentError: invalid ASCII at index 6 in "abcdeγfgh"
Stacktrace:
[...]
julia> ascii("abcdefgh")
"abcdefgh"
#
Base.Regex
— Type
Regex(pattern[, flags]) <: AbstractPattern
A type representing a regular expression. `Regex` objects can be used to match strings with `match`.
`Regex` objects can be created using the `@r_str` string macro. The `Regex(pattern[, flags])` constructor is usually used if the `pattern` string needs to be interpolated. See the documentation of the string macro for details on flags.
To escape interpolated variables, use `\Q` and `\E` (for example, `Regex("\\Q$x\\E")`).
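The following sketch shows why quoting matters when a pattern is built by interpolation. It relies on the `\Q...\E` quoting syntax of the underlying PCRE library; the variable names and the sample text are illustrative.

```julia
word = "a.c"                          # contains a regex metacharacter

loose  = Regex("^$word\$")            # '.' matches any character
strict = Regex("^\\Q$word\\E\$")      # \Q...\E quotes the interpolated text
nocase = Regex("^\\Q$word\\E\$", "i") # flags are passed as a second string

loose_hit  = occursin(loose, "abc")
strict_hit = occursin(strict, "abc")
exact_hit  = occursin(strict, "a.c")
nocase_hit = occursin(nocase, "A.C")
```

Without quoting, user-supplied text containing characters such as `.` or `*` changes the meaning of the pattern.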
#
Base.@r_str
— Macro
@r_str -> Regex
Constructs a regex, such as `r"^[a-z]*$"`, without interpolation or unescaping (except for the quotation mark `"`, which still has to be escaped). The regex also accepts one or more flags, listed after the closing quote, that change its behaviour:
- `i` enables case-insensitive matching.
- `m` treats the `^` and `$` tokens as matching the start and end of individual lines, as opposed to the whole string.
- `s` allows the `.` modifier to match newlines.
- `x` enables "comment mode": whitespace between regex characters is ignored except when escaped with `\`, and `#` is treated as starting a comment (which is ignored to the end of the line).
- `a` enables ASCII mode (disables `UTF` and `UCP` modes). By default `\B`, `\b`, `\D`, `\d`, `\S`, `\s`, `\W` and `\w` match based on Unicode character properties. With this option, these sequences only match ASCII characters. This also includes `\u`, which emits the specified character value directly as a single byte rather than attempting to encode it into UTF-8. Importantly, this option allows matching against invalid UTF-8 strings by treating both matcher and target as simple bytes (as if they were ISO/IEC 8859-1 / Latin-1 bytes) instead of as character encodings. In this case, this option is often combined with `s`. This option can be further refined by starting the pattern with (*UCP) or (*UTF).
See `Regex` if interpolation is needed.
Examples
julia> match(r"a+.*b+.*?d$"ism, "Goodbye,\nOh, angry,\nBad world\n")
RegexMatch("angry,\nBad world")
The first three flags are activated for this regular expression.
#
Base.SubstitutionString
— Type
SubstitutionString(substr) <: AbstractString
Stores the given string `substr` as a `SubstitutionString`, for use in regular expression substitutions. Most commonly constructed using the `@s_str` macro.
Examples
julia> SubstitutionString("Hello \\g<name>, it's \\1")
s"Hello \g<name>, it's \1"
julia> subst = s"Hello \g<name>, it's \1"
s"Hello \g<name>, it's \1"
julia> typeof(subst)
SubstitutionString{String}
#
Base.@s_str
— Macro
@s_str -> SubstitutionString
Constructs a substitution string, used for regular expression substitutions. Within the string, sequences of the form `\N` refer to the Nth capture group in the regex, and `\g<groupname>` refers to a named capture group with the name `groupname`.
Examples
julia> msg = "#Hello# from Julia";
julia> replace(msg, r"#(.+)# from (?<from>\w+)" => s"FROM: \g<from>; MESSAGE: \1")
"FROM: Julia; MESSAGE: Hello"
#
Base.@raw_str
— Macro
@raw_str -> String
Creates a raw string without interpolation or escaping. The only exception is that quotation marks still have to be escaped. Backslashes escape both quotation marks and other backslashes, but only when a sequence of backslashes precedes the quotation mark character. Thus, 2n backslashes followed by a quotation mark encode n backslashes and the end of the literal, and 2n+1 backslashes followed by a quotation mark encode n backslashes with a quotation mark after them.
Examples
julia> println(raw"\ $x")
\ $x
julia> println(raw"\"")
"
julia> println(raw"\\\"")
\"
julia> println(raw"\\x \\\"")
\\x \"
#
Base.@b_str
— Macro
@b_str
Creates an immutable byte (`UInt8`) vector using string syntax.
Examples
julia> v = b"12\x01\x02"
4-element Base.CodeUnits{UInt8, String}:
0x31
0x32
0x01
0x02
julia> v[2]
0x32
#
Base.Docs.@html_str
— Macro
@html_str -> Docs.HTML
Creates an `HTML` object based on a literal string.
Examples
julia> html"Julia"
HTML{String}("Julia")
#
Base.Docs.@text_str
— Macro
@text_str -> Docs.Text
Creates a `Text` object based on a literal string.
Examples
julia> text"Julia"
Julia
#
Base.isvalid
— Method
isvalid(value) -> Bool
Returns `true` if the given value is valid for its type, which currently can be either `AbstractChar`, `String` or `SubString{String}`.
Examples
julia> isvalid(Char(0xd800))
false
julia> isvalid(SubString(String(UInt8[0xfe,0x80,0x80,0x80,0x80,0x80]),1,2))
false
julia> isvalid(Char(0xd799))
true
#
Base.isvalid
— Method
isvalid(T, value) -> Bool
Returns `true` if the given value is valid for that type. Types currently can be either `AbstractChar` or `String`. Values for `AbstractChar` can be of type `AbstractChar` or `UInt32`. Values for `String` can be of that type, `SubString{String}`, `Vector{UInt8}`, or a contiguous subarray thereof.
Examples
julia> isvalid(Char, 0xd800)
false
julia> isvalid(String, SubString("thisisvalid",1,5))
true
julia> isvalid(Char, 0xd799)
true
Compatibility: Julia 1.6
Support for subarray values was added in Julia 1.6.
#
Base.isvalid
— Method
isvalid(s::AbstractString, i::Integer) -> Bool
A predicate indicating whether the given index is the start of the encoding of a character in `s`. If `isvalid(s, i)` is true, then `s[i]` returns the character whose encoding starts at that index; if it is false, then `s[i]` raises an invalid index error or a bounds error, depending on whether `i` is in bounds. For `isvalid(s, i)` to be an O(1) function, the encoding of `s` must be self-synchronizing (https://en.wikipedia.org/wiki/Self-synchronizing_code). This is a basic assumption of Julia's generic string support.
Examples
julia> str = "αβγdef";
julia> isvalid(str, 1)
true
julia> str[1]
'α': Unicode U+03B1 (category Ll: Letter, lowercase)
julia> isvalid(str, 2)
false
julia> str[2]
ERROR: StringIndexError: invalid index [2], valid nearby indices [1]=>'α', [3]=>'β'
Stacktrace:
[...]
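A small sketch of how `isvalid(s, i)` can be used to enumerate or guard character indices (variable names are illustrative):

```julia
str = "αβγdef"

# Collect the indices at which a character starts
starts = filter(i -> isvalid(str, i), 1:ncodeunits(str))

# eachindex(str) yields the same indices directly
same = collect(eachindex(str)) == starts

# Guarding an access with isvalid avoids a StringIndexError;
# thisind falls back to the start of the enclosing character
ch = isvalid(str, 2) ? str[2] : str[thisind(str, 2)]
```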
#
Base.match
— Function
match(r::Regex, s::AbstractString[, idx::Integer[, addopts]])
Searches for the first match of the regex `r` in `s` and returns a `RegexMatch` object containing the match, or `nothing` if the match failed. The matching substring can be retrieved by accessing `m.match`, and the captured sequences can be retrieved by accessing `m.captures`. The optional `idx` argument specifies an index at which to start the search.
Examples
julia> rx = r"a(.)a"
r"a(.)a"
julia> m = match(rx, "cabac")
RegexMatch("aba", 1="b")
julia> m.captures
1-element Vector{Union{Nothing, SubString{String}}}:
"b"
julia> m.match
"aba"
julia> match(rx, "cabac", 3) === nothing
true
#
Base.eachmatch
— Function
eachmatch(r::Regex, s::AbstractString; overlap::Bool=false)
Searches for all matches of the regex `r` in `s` and returns an iterator over the matches. If `overlap` is `true`, the matching sequences are allowed to overlap indices in the original string; otherwise they must be from disjoint character ranges.
Examples
julia> rx = r"a.a"
r"a.a"
julia> m = eachmatch(rx, "a1a2a3a")
Base.RegexMatchIterator{String}(r"a.a", "a1a2a3a", false)
julia> collect(m)
2-element Vector{RegexMatch}:
RegexMatch("a1a")
RegexMatch("a3a")
julia> collect(eachmatch(rx, "a1a2a3a", overlap = true))
3-element Vector{RegexMatch}:
RegexMatch("a1a")
RegexMatch("a2a")
RegexMatch("a3a")
#
Base.RegexMatch
— Type
RegexMatch <: AbstractMatch
A type representing a single match to a `Regex` found in a string. Typically created by the `match` function.
The `match` field stores the substring of the entire matched string. The `captures` field stores the substrings for each capture group, indexed by number. To index by capture group name, the entire match object should be indexed instead, as shown in the examples. The location of the start of the match is stored in the `offset` field. The `offsets` field stores the locations of the start of each capture group, with 0 denoting a group that was not captured.
This type can be used as an iterator over the capture groups of the `Regex`, yielding the substrings captured in each group. Because of this, the captures of a match can be destructured. If a group was not captured, `nothing` is yielded instead of a substring.
Methods that accept a `RegexMatch` object are defined for `iterate`, `length`, `eltype`, `keys`, `haskey` and `getindex`, where keys are the names or numbers of a capture group. See `keys` for more information.
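The fields described above can be inspected as in the sketch below. The input text and variable names are illustrative; the pattern is the one used in the examples for this type.

```julia
m = match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "at 11:30 today")

whole         = m.match        # matched substring
match_start   = m.offset       # index in the input where the match begins
group_starts  = m.offsets      # start of each capture group; 0 if not captured
by_name       = m["minute"]    # index the match object by group name
by_number     = m[1]           # or by group number
missing_group = m[3]           # nothing: the optional group did not participate
```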
Examples
julia> m = match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "11:30 in the morning")
RegexMatch("11:30", hour="11", minute="30", 3=nothing)
julia> m.match
"11:30"
julia> m.captures
3-element Vector{Union{Nothing, SubString{String}}}:
"11"
"30"
nothing
julia> m["minute"]
"30"
julia> hr, min, ampm = m; # destructure capture groups by iteration
julia> hr
"11"
#
Base.keys
— Method
keys(m::RegexMatch) -> Vector
Returns a vector of keys for all capture groups of the underlying regex. A key is included even if the capture group fails to match. That is, `idx` will be in the return value even if `m[idx] == nothing`.
Unnamed capture groups will have integer keys corresponding to their index. Named capture groups will have string keys.
Compatibility: Julia 1.7
This method was added in Julia 1.7.
Examples
julia> keys(match(r"(?<hour>\d+):(?<minute>\d+)(am|pm)?", "11:30"))
3-element Vector{Any}:
"hour"
"minute"
3
#
Base.isless
— Method
isless(a::AbstractString, b::AbstractString) -> Bool
Tests whether string `a` comes before string `b` in alphabetical order (technically, in lexicographical order by Unicode code points).
Examples
julia> isless("a", "b")
true
julia> isless("β", "α")
false
julia> isless("a", "a")
false
#
Base.:==
— Method
==(a::AbstractString, b::AbstractString) -> Bool
Tests whether two strings are equal character by character (technically, Unicode code point by code point). If either string is an `AnnotatedString`, the string properties must match too.
Examples
julia> "abc" == "abc"
true
julia> "abc" == "αβγ"
false
#
Base.cmp
— Method
cmp(a::AbstractString, b::AbstractString) -> Int
Compares two strings. Returns `0` if both strings have the same length and the character at each index is the same in both strings. Returns `-1` if `a` is a prefix of `b`, or if `a` comes before `b` in alphabetical order. Returns `1` if `b` is a prefix of `a`, or if `b` comes before `a` in alphabetical order (technically, lexicographical order by Unicode code points).
Examples
julia> cmp("abc", "abc")
0
julia> cmp("ab", "abc")
-1
julia> cmp("abc", "ab")
1
julia> cmp("ab", "ac")
-1
julia> cmp("ac", "ab")
1
julia> cmp("α", "a")
1
julia> cmp("b", "β")
-1
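Since sorting relies on `isless`, the code point ordering is directly visible in sorted output. The sketch below also shows a three-way branch on `cmp`; the `describe` helper is hypothetical and defined only for this illustration.

```julia
words = ["b", "β", "abc", "ab", "α"]

# sort uses isless, i.e. lexicographic order by code point
sorted = sort(words)

# Hypothetical helper: branch on the three possible results of cmp
function describe(a, b)
    c = cmp(a, b)
    if c < 0
        return "$a sorts before $b"
    elseif c > 0
        return "$a sorts after $b"
    else
        return "$a equals $b"
    end
end

msg = describe("ab", "abc")
```

Note that this ordering is by code point and is not locale-aware collation.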
#
Base.lpad
— Function
lpad(s, n::Integer, p::Union{AbstractChar,AbstractString}=' ') -> String
Stringifies `s` and pads the resulting string on the left with `p` to make it `n` characters (in `textwidth`) long. If `s` is already `n` characters long, an equal string is returned. Pads with spaces by default.
Examples
julia> lpad("March", 10)
"     March"
Compatibility: Julia 1.7
In Julia 1.7, this function was changed to use `textwidth` rather than a raw character (code point) count.
#
Base.rpad
— Function
rpad(s, n::Integer, p::Union{AbstractChar,AbstractString}=' ') -> String
Stringifies `s` and pads the resulting string on the right with `p` to make it `n` characters (in `textwidth`) long. If `s` is already `n` characters long, an equal string is returned. Pads with spaces by default.
Examples
julia> rpad("March", 20)
"March               "
Compatibility: Julia 1.7
In Julia 1.7, this function was changed to use `textwidth` rather than a raw character (code point) count.
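Used together, `rpad` and `lpad` can align columns of text. The following is a minimal sketch with made-up data:

```julia
rows = [("apple", 3), ("kiwi", 12)]   # illustrative data

# Left-align names with dot leaders, right-align quantities;
# lpad stringifies its first argument, so an integer can be passed directly
table = [rpad(name, 8, '.') * lpad(qty, 4) for (name, qty) in rows]
```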
#
Base.findfirst
— Method
findfirst(pattern::AbstractString, string::AbstractString)
findfirst(pattern::AbstractPattern, string::String)
Finds the first occurrence of `pattern` in `string`. Equivalent to `findnext(pattern, string, firstindex(string))`.
Examples
julia> findfirst("z", "Hello to the world") # returns nothing, but it is not printed in the REPL
julia> findfirst("Julia", "JuliaLang")
1:5
#
Base.findnext
— Method
findnext(pattern::AbstractString, string::AbstractString, start::Integer)
findnext(pattern::AbstractPattern, string::String, start::Integer)
Finds the next occurrence of `pattern` in `string` starting at position `start`. `pattern` can be either a string or a regular expression, in which case `string` must be of type `String`.
The return value is a range of indices where the matching sequence is found, such that `s[findnext(x, s, i)] == x`:
`findnext("substring", string, i) == start:stop` such that `string[start:stop] == "substring"` and `i <= start`, or `nothing` if unmatched.
Examples
julia> findnext("z", "Hello to the world", 1) === nothing
true
julia> findnext("o", "Hello to the world", 6)
8:8
julia> findnext("Lang", "JuliaLang", 2)
6:9
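Repeated calls to `findnext` can be combined with `nextind` to walk through every occurrence. The `find_all` helper below is hypothetical and only sketches one way of doing this for non-overlapping matches of a non-empty pattern.

```julia
# Hypothetical helper: collect the index ranges of every non-overlapping
# occurrence of a non-empty pattern
function find_all(pattern::AbstractString, text::AbstractString)
    ranges = UnitRange{Int}[]
    i = firstindex(text)
    while true
        r = findnext(pattern, text, i)
        r === nothing && break
        push!(ranges, r)
        # Continue after the end of this match; nextind keeps the index valid
        i = nextind(text, last(r))
    end
    return ranges
end

hits = find_all("o", "Hello to the world")
```

Using `nextind` rather than `last(r) + 1` keeps the loop correct for strings containing multi-byte characters.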
#
Base.findnext
— Method
findnext(ch::AbstractChar, string::AbstractString, start::Integer)
Finds the next occurrence of the character `ch` in `string` starting at position `start`.
Compatibility: Julia 1.3
This method requires Julia 1.3 or later.
Examples
julia> findnext('z', "Hello to the world", 1) === nothing
true
julia> findnext('o', "Hello to the world", 6)
8
#
Base.findlast
— Method
findlast(pattern::AbstractString, string::AbstractString)
Finds the last occurrence of `pattern` in `string`. Equivalent to `findprev(pattern, string, lastindex(string))`.
Examples
julia> findlast("o", "Hello to the world")
15:15
julia> findfirst("Julia", "JuliaLang")
1:5
#
Base.findlast
— Method
findlast(ch::AbstractChar, string::AbstractString)
Finds the last occurrence of the character `ch` in `string`.
Compatibility: Julia 1.3
This method requires Julia 1.3 or later.
Examples
julia> findlast('p', "happy")
4
julia> findlast('z', "happy") === nothing
true
#
Base.findprev
— Method
findprev(pattern::AbstractString, string::AbstractString, start::Integer)
Finds the previous occurrence of `pattern` in `string` starting at position `start`.
The return value is a range of indices where the matching sequence is found, such that `s[findprev(x, s, i)] == x`:
`findprev("substring", string, i) == start:stop` such that `string[start:stop] == "substring"` and `stop <= i`, or `nothing` if unmatched.
Examples
julia> findprev("z", "Hello to the world", 18) === nothing
true
julia> findprev("o", "Hello to the world", 18)
15:15
julia> findprev("Julia", "JuliaLang", 6)
1:5
#
Base.occursin
— Function
occursin(needle::Union{AbstractString,AbstractPattern,AbstractChar}, haystack::AbstractString)
Determines whether the first argument is a substring of the second. If `needle` is a regular expression, checks whether `haystack` contains a match.
Examples
julia> occursin("Julia", "JuliaLang is pretty cool!")
true
julia> occursin('a', "JuliaLang is pretty cool!")
true
julia> occursin(r"a.a", "aba")
true
julia> occursin(r"a.a", "abba")
false
See also `contains`.
occursin(haystack)
Creates a function that checks whether its argument occurs in `haystack`, i.e. a function equivalent to `needle -> occursin(needle, haystack)`.
The returned function is of type `Base.Fix2{typeof(occursin)}`.
Compatibility: Julia 1.6
This method requires Julia 1.6 or later.
Examples
julia> search_f = occursin("JuliaLang is a programming language");
julia> search_f("JuliaLang")
true
julia> search_f("Python")
false
#
Base.reverse
— Method
reverse(s::AbstractString) -> AbstractString
Reverses a string. Technically, this function reverses the code points in a string, and its main utility is for reversed-order string processing, especially for reversed regular-expression searches. See also `reverseind` to convert indices in `s` to indices in `reverse(s)` and vice versa, and `graphemes` from the `Unicode` module to operate on user-visible "characters" (graphemes) rather than code points. See also `Iterators.reverse` for reverse-order iteration without making a copy. Custom string types must implement the `reverse` function themselves and should typically return a string with the same type and encoding. If they return a string with a different encoding, they must also override `reverseind` for that string type to satisfy `s[reverseind(s, i)] == reverse(s)[i]`.
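The relationship between `reverse` and `reverseind` can be checked directly, as in this sketch (the string and variable names are illustrative):

```julia
s = "Juλia"
r = reverse(s)

# reverseind maps an index in reverse(s) to the matching index in s
consistent = all(s[reverseind(s, i)] == r[i] for i in eachindex(r))

# Iterators.reverse walks the characters backwards without building a new string
chars = collect(Iterators.reverse(s))
```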
Examples
julia> reverse("JuliaLang")
"gnaLailuJ"
The examples below may be rendered differently on different systems. The comments indicate how they are supposed to be rendered.
Combining characters can lead to surprising results:
julia> reverse("ax̂e") # the hat is above x in the input and above e in the output
"êxa"
julia> using Unicode
julia> join(reverse(collect(graphemes("ax̂e")))) # меняет порядок следования графем на обратный; циркумфлекс находится над x как во входных, так и в выходных данных
"ex̂a"
#
Base.replace
— Method
replace([io::IO], s::AbstractString, pat=>r, [pat2=>r2, ...]; [count::Integer])
Searches for the given pattern pat
in s
, replacing each occurrence with r
. If count
is provided, at most count
occurrences are replaced. pat
can be a single character, a vector or a set of characters, a string, or a regular expression. If r
is a function, each occurrence is replaced with r(s)
, where s
is the matched substring (when pat
is an AbstractPattern
or AbstractString
) or character (when pat
is an AbstractChar
or a collection of AbstractChar
). If pat
is a regular expression and r
is a SubstitutionString
, capture group references in r
are replaced with the corresponding matched text. To remove instances of pat
from the string, set r
to the empty String
(""
).
The return value is a new string after the replacements. If the io::IO
argument is supplied, the transformed string is instead written to io
(and io
is returned). (For example, this can be used together with an IOBuffer to reuse a pre-allocated buffer array in place.)
Multiple patterns can be specified. They are applied left to right simultaneously, so only one pattern is applied to any given character, and the patterns are applied only to the input text, not to the replacements.
Compatibility: Julia 1.7
Support for multiple patterns requires Julia 1.7.
Compatibility: Julia 1.10
The |
Examples
julia> replace("Python is a programming language.", "Python" => "Julia")
"Julia is a programming language."
julia> replace("The quick foxes run quickly.", "quick" => "slow", count=1)
"The slow foxes run quickly."
julia> replace("The quick foxes run quickly.", "quick" => "", count=1)
"The foxes run quickly."
julia> replace("The quick foxes run quickly.", r"fox(es)?" => s"bus\1")
"The quick buses run quickly."
julia> replace("abcabc", "a" => "b", "b" => "c", r".+" => "a")
"bca"
#
Base.eachsplit
— Function
eachsplit(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
eachsplit(str::AbstractString; limit::Integer=0, keepempty::Bool=false)
Splits str
on occurrences of the delimiter(s) dlm
and returns an iterator over the substrings. dlm
can be any of the formats allowed by the first argument of findnext
(that is, a string, regular expression, or function), or a single character or a collection of characters.
If dlm
is omitted, it defaults to isspace
.
The optional keyword arguments are:
-
limit
: the maximum size of the result; limit=0
means no maximum (the default);
-
keepempty
: whether empty fields should be kept in the result. The default is false
if the dlm
argument is omitted, or true
if the dlm
argument is specified.
See also split
.
Compatibility: Julia 1.8
The |
Examples
julia> a = "Ma.rch"
"Ma.rch"
julia> b = eachsplit(a, ".")
Base.SplitIterator{String, String}("Ma.rch", ".", 0, true)
julia> collect(b)
2-element Vector{SubString{String}}:
"Ma"
"rch"
#
Base.eachrsplit
— Function
eachrsplit(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
eachrsplit(str::AbstractString; limit::Integer=0, keepempty::Bool=false)
Returns an iterator over the SubString
values of str
produced by splitting on the delimiter(s) dlm
, yielded in reverse order (from right to left). dlm
can be any of the formats allowed by the first argument of findprev
(that is, a string, a single character, or a function), or a collection of characters.
If dlm
is omitted, it defaults to isspace
, and keepempty
defaults to false
.
The optional keyword arguments are:
-
If
limit > 0
, the iterator splits at most limit - 1
times and returns the rest of the string unsplit. With limit < 1
(the default), the number of splits is unlimited.
-
keepempty
: whether empty fields should be returned during iteration. The default is false
if the dlm
argument is omitted, or true
if the dlm
argument is specified.
Note that unlike split
, rsplit
, and eachsplit
, this function iterates over the substrings from right to left as they occur in the input.
Compatibility: Julia 1.11
This function requires Julia 1.11 or later.
Examples
julia> a = "Ma.r.ch";
julia> collect(eachrsplit(a, ".")) == ["ch", "r", "Ma"]
true
julia> collect(eachrsplit(a, "."; limit=2)) == ["ch", "Ma.r"]
true
#
Base.split
— Function
split(str::AbstractString, dlm; limit::Integer=0, keepempty::Bool=true)
split(str::AbstractString; limit::Integer=0, keepempty::Bool=false)
Splits str
into an array of substrings on occurrences of the delimiter(s) dlm
. dlm
can be any of the formats allowed by the first argument of findnext
(that is, a string, regular expression, or function), or a single character or a collection of characters.
If dlm
is omitted, it defaults to isspace
.
The optional keyword arguments are:
-
limit
: the maximum size of the result; limit=0
means no maximum (the default);
-
keepempty
: whether empty fields should be kept in the result. The default is false
if the dlm
argument is omitted, or true
if the dlm
argument is specified.
Examples
julia> a = "Ma.rch"
"Ma.rch"
julia> split(a, ".")
2-element Vector{SubString{String}}:
"Ma"
"rch"
#
Base.rsplit
— Function
rsplit(s::AbstractString; limit::Integer=0, keepempty::Bool=false)
rsplit(s::AbstractString, chars; limit::Integer=0, keepempty::Bool=true)
Works like split
, but starts from the end of the string.
Examples
julia> a = "M.a.r.c.h"
"M.a.r.c.h"
julia> rsplit(a, ".")
5-element Vector{SubString{String}}:
"M"
"a"
"r"
"c"
"h"
julia> rsplit(a, "."; limit=1)
1-element Vector{SubString{String}}:
"M.a.r.c.h"
julia> rsplit(a, "."; limit=2)
2-element Vector{SubString{String}}:
"M.a.r.c"
"h"
#
Base.strip
— Function
strip([pred=isspace,] str::AbstractString) -> SubString
strip(str::AbstractString, chars) -> SubString
Removes from str
the leading and trailing characters that are specified by the chars
argument or for which the pred
function returns true
.
By default, leading and trailing whitespace and delimiters are removed; for details, see isspace
.
The optional chars
argument specifies which characters to remove: it can be a single character, or a vector or set of characters.
Compatibility: Julia 1.2
The method that accepts a predicate function requires Julia 1.2 or later.
Examples
julia> strip("{3, 5}\n", ['{', '}', '\n'])
"3, 5"
#
Base.lstrip
— Function
lstrip([pred=isspace,] str::AbstractString) -> SubString
lstrip(str::AbstractString, chars) -> SubString
Removes from str
the leading characters that are specified by the chars
argument or for which the pred
function returns true
.
By default, leading whitespace and delimiters are removed; for details, see isspace
.
The optional chars
argument specifies which characters to remove: it can be a single character, or a vector or set of characters.
Examples
julia> a = lpad("March", 20)
" March"
julia> lstrip(a)
"March"
#
Base.rstrip
— Function
rstrip([pred=isspace,] str::AbstractString) -> SubString
rstrip(str::AbstractString, chars) -> SubString
Removes from str
the trailing characters that are specified by the chars
argument or for which the pred
function returns true
.
By default, trailing whitespace and delimiters are removed; for details, see isspace
.
The optional chars
argument specifies which characters to remove: it can be a single character, or a vector or set of characters.
Examples
julia> a = rpad("March", 20)
"March "
julia> rstrip(a)
"March"
#
Base.startswith
— Function
startswith(s::AbstractString, prefix::Union{AbstractString,Base.Chars})
Returns true
if s
starts with prefix
, which can be a string, a character, or a tuple, vector, or set of characters. If prefix
is a tuple, vector, or set of characters, tests whether the first character of s
belongs to that set.
Examples
julia> startswith("JuliaLang", "Julia")
true
startswith(io::IO, prefix::Union{AbstractString,Base.Chars})
Checks whether an IO
object starts with a prefix, which can be a string, a character, or a tuple, vector, or set of characters. See also peek
.
startswith(prefix)
Creates a function that checks whether its argument starts with prefix, that is, a function equivalent to y -> startswith(y, prefix)
.
The returned function is of type Base.Fix2{typeof(startswith)}
and can be used to implement specialized methods.
Compatibility: Julia 1.5
To use the |
Examples
julia> startswith("Julia")("JuliaLang")
true
julia> startswith("Julia")("Ends with Julia")
false
startswith(s::AbstractString, prefix::Regex)
Returns true
if s
starts with the regular expression pattern prefix
.
'startswith` does not compile the binding into a regular expression, but passes it as |
Compatibility: Julia 1.2
This method requires at least Julia 1.2.
Examples
julia> startswith("JuliaLang", r"Julia|Romeo")
true
#
Base.endswith
— Function
endswith(s::AbstractString, suffix::Union{AbstractString,Base.Chars})
Returns true
if s
ends with suffix
, which can be a string, a character, or a tuple, vector, or set of characters. If suffix
is a tuple, vector, or set of characters, tests whether the last character of s
belongs to that set.
See also startswith
and contains
.
Examples
julia> endswith("Sunday", "day")
true
endswith(suffix)
Creates a function that checks whether its argument ends with suffix
, that is, a function equivalent to y -> endswith(y, suffix)
.
The returned function is of type Base.Fix2{typeof(endswith)}
and can be used to implement specialized methods.
Compatibility: Julia 1.5
To use the |
Examples
julia> endswith("Julia")("Ends with Julia")
true
julia> endswith("Julia")("JuliaLang")
false
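The curried forms of startswith and endswith are mostly useful as predicates passed to higher-order functions. A minimal sketch, assuming Julia 1.5 or later; the list of names is made up.

```julia
names = ["JuliaLang", "JavaScript", "Julia", "Python"]

# startswith(prefix) and endswith(suffix) each return a one-argument predicate.
julia_like = filter(startswith("Julia"), names)
lang_suffix = filter(endswith("Lang"), names)

# With a collection of characters, only the first character is tested.
j_or_p = startswith("Python", ['J', 'P'])

println((julia_like, lang_suffix, j_or_p))
```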
endswith(s::AbstractString, suffix::Regex)
Returns true
if s
ends with the regular expression pattern suffix
.
'endswith` does not compile the binding into a regular expression, but passes it as |
See also occursin
and startswith
.
Compatibility: Julia 1.2
This method requires at least Julia 1.2.
Examples
julia> endswith("JuliaLang", r"Lang|Roberts")
true
#
Base.contains
— Function
contains(haystack::AbstractString, needle)
Returns true
if haystack
contains needle
. This is the same as the occursin(needle, haystack)
call, but is provided for consistency with startswith(haystack, needle)
and endswith(haystack, needle)
.
Examples
julia> contains("JuliaLang is pretty cool!", "Julia")
true
julia> contains("JuliaLang is pretty cool!", 'a')
true
julia> contains("aba", r"a.a")
true
julia> contains("abba", r"a.a")
false
Compatibility: Julia 1.5
The contains function requires Julia 1.5 or later.
contains(needle)
Creates a function that checks whether its argument contains needle
, that is, a function equivalent to haystack -> contains(haystack, needle)
.
The returned function is of type Base.Fix2{typeof(contains)}
and can be used to implement specialized methods.
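The curried form of contains has no example above, so here is a short sketch of how it might be used, assuming Julia 1.5 or later. The word list is illustrative only.

```julia
fruits = ["banana", "cherry", "mango"]

# contains(needle) fixes the needle and takes the haystack as its argument.
with_an = filter(contains("an"), fruits)

# The needle can also be a regular expression.
with_rr = filter(contains(r"r{2}"), fruits)

println((with_an, with_rr))
```

This is the mirror image of occursin(haystack), which fixes the haystack and varies the needle.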
#
Base.first
— Method
first(s::AbstractString, n::Integer)
Returns a string consisting of the first n
characters of the string s
.
Examples
julia> first("∀ϵ≠0: ϵ²>0", 0)
""
julia> first("∀ϵ≠0: ϵ²>0", 1)
"∀"
julia> first("∀ϵ≠0: ϵ²>0", 3)
"∀ϵ≠"
#
Base.last
— Method
last(s::AbstractString, n::Integer)
Returns a string consisting of the last n
characters of the string s
.
Examples
julia> last("∀ϵ≠0: ϵ²>0", 0)
""
julia> last("∀ϵ≠0: ϵ²>0", 1)
"0"
julia> last("∀ϵ≠0: ϵ²>0", 3)
"²>0"
#
Base.Unicode.uppercase
— Function
uppercase(c::AbstractChar)
Converts c
to uppercase.
Examples
julia> uppercase('a')
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)
julia> uppercase('ê')
'Ê': Unicode U+00CA (category Lu: Letter, uppercase)
uppercase(s::AbstractString)
Returns the string s
with all characters converted to uppercase.
See also lowercase
, titlecase
, and uppercasefirst
.
Examples
julia> uppercase("Julia")
"JULIA"
#
Base.Unicode.lowercase
— Function
lowercase(c::AbstractChar)
Converts c
to lowercase.
Examples
julia> lowercase('A')
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> lowercase('Ö')
'ö': Unicode U+00F6 (category Ll: Letter, lowercase)
lowercase(s::AbstractString)
Returns the string s
with all characters converted to lowercase.
See also uppercase
, titlecase
, and lowercasefirst
.
Examples
julia> lowercase("STRINGS AND THINGS")
"strings and things"
#
Base.Unicode.titlecase
— Function
titlecase(c::AbstractChar)
Converts c
to title case. This may differ from uppercase for digraphs; compare the example below.
Examples
julia> titlecase('a')
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)
julia> titlecase('dž')
'Dž': Unicode U+01C5 (category Lt: Letter, titlecase)
julia> uppercase('dž')
'DŽ': Unicode U+01C4 (category Lu: Letter, uppercase)
titlecase(s::AbstractString; [wordsep::Function], strict::Bool=true) -> String
Capitalizes the first character of each word in s
. If strict
is true, every other character is converted to lowercase; otherwise they are left unchanged. By default, all non-letter characters that begin a new grapheme are considered word separators. A predicate can be passed as the wordsep
keyword argument to determine which characters are considered word separators. See also uppercasefirst
, which capitalizes only the first character in s
.
See also uppercase
, lowercase
, and uppercasefirst
.
Examples
julia> titlecase("the JULIA programming language")
"The Julia Programming Language"
julia> titlecase("ISS - international space station", strict=false)
"ISS - International Space Station"
julia> titlecase("a-a b-b", wordsep = c->c==' ')
"A-a B-b"
#
Base.Unicode.uppercasefirst
— Function
uppercasefirst(s::AbstractString) -> String
Returns s
with the first character converted to uppercase (technically, to title case for Unicode). See also titlecase
, which capitalizes the first character of every word in s
.
See also lowercasefirst
, uppercase
, lowercase
, and titlecase
.
Examples
julia> uppercasefirst("python")
"Python"
#
Base.Unicode.lowercasefirst
— Function
lowercasefirst(s::AbstractString)
Returns s
with the first character converted to lowercase.
See also uppercasefirst
, uppercase
, lowercase
, and titlecase
.
Examples
julia> lowercasefirst("Julia")
"julia"
#
Base.join
— Function
join([io::IO,] iterator [, delim [, last]])
Joins the items of iterator
into a single string, inserting the delimiter (if specified) between adjacent items. If last
is specified, it is used instead of delim
between the last two items. Each item of iterator
is converted to a string via print(io::IOBuffer, x)
. If io
is specified, the result is written to io
rather than returned as a string.
Examples
julia> join(["apples", "bananas", "pineapples"], ", ", " and ")
"apples, bananas and pineapples"
julia> join([1,2,3,4,5])
"12345"
#
Base.chop
— Function
chop(s::AbstractString; head::Integer = 0, tail::Integer = 1)
Removes the first head
and the last tail
characters from s
. The call chop(s)
removes the last character from s
. If more characters than length(s)
are requested to be removed, an empty string is returned.
See also chomp
, startswith
, and first
.
Examples
julia> a = "March"
"March"
julia> chop(a)
"Marc"
julia> chop(a, head = 1, tail = 2)
"ar"
julia> chop(a, head = 5, tail = 5)
""
#
Base.chopprefix
— Function
chopprefix(s::AbstractString, prefix::Union{AbstractString,Regex}) -> SubString
Removes the prefix prefix
from the string s
. If the string s
does not start with prefix
, a string equal to s
is returned.
See also chopsuffix
.
Compatibility: Julia 1.8
This function is available as of Julia 1.8.
Examples
julia> chopprefix("Hamburger", "Ham")
"burger"
julia> chopprefix("Hamburger", "hotdog")
"Hamburger"
#
Base.chopsuffix
— Function
chopsuffix(s::AbstractString, suffix::Union{AbstractString,Regex}) -> SubString
Removes the suffix suffix
from the string s
. If the string s
does not end with suffix
, a string equal to s
is returned.
See also chopprefix
.
Compatibility: Julia 1.8
This function is available as of Julia 1.8.
Examples
julia> chopsuffix("Hamburger", "er")
"Hamburg"
julia> chopsuffix("Hamburger", "hotdog")
"Hamburger"
#
Base.thisind
— Function
thisind(s::AbstractString, i::Integer) -> Int
If i
is in bounds in s
, returns the index of the start of the character that code unit i
belongs to. In other words, if i
is the start of a character, returns i
; if i
is not the start of a character, rewinds to the start of the character and returns that index. If i
is equal to 0 or ncodeunits(s)+1
, returns i
. In all other cases, throws BoundsError.
Examples
julia> thisind("α", 0)
0
julia> thisind("α", 1)
1
julia> thisind("α", 2)
1
julia> thisind("α", 3)
3
julia> thisind("α", 4)
ERROR: BoundsError: attempt to access 2-codeunit String at index [4]
[...]
julia> thisind("α", -1)
ERROR: BoundsError: attempt to access 2-codeunit String at index [-1]
[...]
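One use of thisind is cutting a string at a byte limit without splitting a multi-byte character. The helper below is a hypothetical sketch (truncate_bytes is not part of Base) and assumes a UTF-8 String.

```julia
# Hypothetical helper (not part of Base): keep at most `maxbytes` code units
# of `s` without cutting a multi-byte character in half.
function truncate_bytes(s::String, maxbytes::Integer)
    maxbytes >= ncodeunits(s) && return s
    maxbytes <= 0 && return ""
    # Start of the character that owns the first code unit past the limit.
    cut = thisind(s, maxbytes + 1)
    # Everything before that character fits within the limit.
    return s[1:prevind(s, cut)]
end

println(truncate_bytes("αβγ", 3))
```

Each character of "αβγ" takes two code units, so a limit of 3 keeps only the first character.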
#
Base.nextind
— Method
nextind(str::AbstractString, i::Integer, n::Integer=1) -> Int
-
If
n == 1
If i
is in bounds in str
, returns the index of the start of the character whose encoding starts after index i
. In other words, if i
is the start of a character, returns the start of the next character; if i
is not the start of a character, moves forward to the start of the next character and returns that index. If i
is equal to 0
, returns 1
. If i
is in bounds but greater than or equal to lastindex(str)
, returns ncodeunits(str)+1
. Otherwise, throws BoundsError
.
-
If
n > 1
Behaves like applying nextind
with n==1
n
times. The only difference is that if n
is so large that applying nextind
would reach ncodeunits(str)+1
, then each remaining iteration increases the returned value by 1
. This means that in this case nextind
can return a value greater than ncodeunits(str)+1
.
-
If
n == 0
Returns i
only if i
is a valid index in str
or is equal to 0
. Otherwise, StringIndexError
or BoundsError
is thrown.
Examples
julia> nextind("α", 0)
1
julia> nextind("α", 1)
3
julia> nextind("α", 3)
ERROR: BoundsError: attempt to access 2-codeunit String at index [3]
[...]
julia> nextind("α", 0, 2)
3
julia> nextind("α", 1, 2)
4
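A typical use of nextind is walking a string forward one character at a time. The loop below is a sketch on a sample string in which every character takes two code units.

```julia
s = "αβγ"
starts = Int[]

# Visit the start index of every character, left to right.
i = firstindex(s)
while i <= ncodeunits(s)
    push!(starts, i)
    global i = nextind(s, i)
end

# Starting from index 0, nextind(s, 0, k) is the index of the k-th character.
third = s[nextind(s, 0, 3)]

println((starts, i, third))
```

After the loop, i is ncodeunits(s)+1, which is the one out-of-bounds value that nextind returns for n == 1.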
#
Base.prevind
— Method
prevind(str::AbstractString, i::Integer, n::Integer=1) -> Int
-
If
n == 1
If i
is in bounds in str
, returns the index of the start of the character whose encoding starts before index i
. In other words, if i
is the start of a character, returns the start of the previous character; if i
is not the start of a character, rewinds to the start of the character and returns that index. If i
is equal to 1
, returns 0
. If i
is equal to ncodeunits(str)+1
, returns lastindex(str)
. Otherwise, throws BoundsError
.
-
If
n > 1
Behaves like applying prevind
with n==1
n
times. The only difference is that if n
is so large that applying prevind
would reach 0
, then each remaining iteration decreases the returned value by 1
. This means that in this case prevind
can return a negative value.
-
If
n == 0
Returns i
only if i
is a valid index in str
or is equal to ncodeunits(str)+1
. Otherwise, StringIndexError
or BoundsError
is thrown.
Examples
julia> prevind("α", 3)
1
julia> prevind("α", 1)
0
julia> prevind("α", 0)
ERROR: BoundsError: attempt to access 2-codeunit String at index [0]
[...]
julia> prevind("α", 2, 2)
0
julia> prevind("α", 2, 3)
-1
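The mirror image of a forward walk is a backward walk with prevind, which stops once the index reaches 0. A sketch on the same kind of two-code-unit sample string:

```julia
s = "αβγ"
chars = Char[]

# Visit every character from right to left.
i = lastindex(s)
while i >= firstindex(s)
    push!(chars, s[i])
    global i = prevind(s, i)
end

# One past the end is a valid starting point and leads to the last character.
from_end = prevind(s, ncodeunits(s) + 1)

println((chars, i, from_end))
```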
#
Base.Unicode.textwidth
— Function
textwidth(c)
Returns the number of columns needed to print a character.
Examples
julia> textwidth('α')
1
julia> textwidth('⛵')
2
textwidth(s::AbstractString)
Returns the number of columns needed to print a string.
Examples
julia> textwidth("March")
5
#
Base.isascii
— Function
isascii(c::Union{AbstractChar,AbstractString}) -> Bool
Checks whether a single character or all elements of a string belong to the ASCII encoding.
Examples
julia> isascii('a')
true
julia> isascii('α')
false
julia> isascii("abc")
true
julia> isascii("αβγ")
false
For example, isascii
can be used as a predicate function for filter
or replace
to remove or replace non-ASCII characters, respectively:
julia> filter(isascii, "abcdeγfgh") # discard non-ASCII characters
"abcdefgh"
julia> replace("abcdeγfgh", !isascii=>' ') # replace non-ASCII characters with spaces
"abcde fgh"
isascii(cu::AbstractVector{CU}) where {CU <: Integer} -> Bool
Checks whether all values in the vector belong to the ASCII character set (0x00 to 0x7f). This function is intended to be used by other string implementations that need a fast ASCII check.
#
Base.Unicode.iscntrl
— Function
iscntrl(c::AbstractChar) -> Bool
Checks whether the character is a control character. Control characters are non-printable characters of the Latin-1 Unicode subset.
Examples
julia> iscntrl('\x01')
true
julia> iscntrl('a')
false
#
Base.Unicode.isletter
— Function
isletter(c::AbstractChar) -> Bool
Checks whether the character is a letter. A character is considered a letter if it belongs to the general Unicode category "Letter", that is, its category code begins with "L".
See also isdigit
.
Examples
julia> isletter('❤')
false
julia> isletter('α')
true
julia> isletter('9')
false
#
Base.Unicode.islowercase
— Function
islowercase(c::AbstractChar) -> Bool
Checks whether the character is a lowercase letter (according to the Lowercase
property from the Unicode standard).
See also isuppercase
.
Examples
julia> islowercase('α')
true
julia> islowercase('Γ')
false
julia> islowercase('❤')
false
#
Base.Unicode.isnumeric
— Function
isnumeric(c::AbstractChar) -> Bool
Checks whether the character is numeric. A character is classified as numeric if it belongs to the Unicode general category "Number", that is, its category code begins with "N".
Keep in mind that this broad category includes characters such as ¾ and ௰. To check whether a character is a decimal digit from 0 to 9, use isdigit
.
Examples
julia> isnumeric('௰')
true
julia> isnumeric('9')
true
julia> isnumeric('α')
false
julia> isnumeric('❤')
false
#
Base.Unicode.isprint
— Function
isprint(c::AbstractChar) -> Bool
Checks whether a character is printable, including spaces, but not a control character.
Examples
julia> isprint('\x01')
false
julia> isprint('A')
true
#
Base.Unicode.ispunct
— Function
ispunct(c::AbstractChar) -> Bool
Checks whether a character belongs to the general Unicode category of "Punctuation", that is, its category code begins with "P".
Examples
julia> ispunct('α')
false
julia> ispunct('/')
true
julia> ispunct(';')
true
#
Base.Unicode.isspace
— Function
isspace(c::AbstractChar) -> Bool
Checks whether the character is a whitespace character. This includes the ASCII characters '\t', '\n', '\v', '\f', '\r', and ' ', the Latin-1 character U+0085, and characters in the Unicode category Zs.
Examples
julia> isspace('\n')
true
julia> isspace('\r')
true
julia> isspace(' ')
true
julia> isspace('\x20')
true
#
Base.Unicode.isuppercase
— Function
isuppercase(c::AbstractChar) -> Bool
Checks whether the character is an uppercase letter (according to the Uppercase
property from the Unicode standard).
See also islowercase
.
Examples
julia> isuppercase('γ')
false
julia> isuppercase('Γ')
true
julia> isuppercase('❤')
false
#
Base.Unicode.isxdigit
— Function
isxdigit(c::AbstractChar) -> Bool
Checks whether the character is a valid hexadecimal digit. Note that this does not include the character x
(as in the standard 0x prefix).
Examples
julia> isxdigit('a')
true
julia> isxdigit('x')
false
#
Base.escape_string
— Function
escape_string(str::AbstractString[, esc]; keep = ())::AbstractString
escape_string(io, str::AbstractString[, esc]; keep = ())::Nothing
General escaping of traditional C and Unicode escape sequences. The first form returns the escaped string; the second prints the result to io.
Backslashes (\
) are escaped with a double backslash ("\\"
). Non-printable characters are escaped with their standard C escape codes, with "\0"
for NUL (if unambiguous), with the Unicode code point ("\u"
prefix), or with the hexadecimal value ("\x"
prefix).
The optional esc
argument specifies additional characters that should also be escaped with a prepended backslash (in the first form, the character "
is also escaped by default).
The keep
argument specifies a collection of characters that are to be kept as they are. Note that esc
takes precedence here.
See also unescape_string
, which performs the opposite action.
Compatibility: Julia 1.7
The |
Examples
julia> escape_string("aaa\nbbb")
"aaa\\nbbb"
julia> escape_string("aaa\nbbb"; keep = '\n')
"aaa\nbbb"
julia> escape_string("\xfe\xff") # invalid UTF-8
"\\xfe\\xff"
julia> escape_string(string('\u2135','\0')) # unambiguous
"ℵ\\0"
julia> escape_string(string('\u2135','\0','0')) # \0 would be ambiguous
"ℵ\\x000"
#
Base.escape_raw_string
— Function
escape_raw_string(s::AbstractString, delim='"') -> AbstractString
escape_raw_string(io, s::AbstractString, delim='"')
Escapes a string in the manner used for parsing raw string literals. For each double-quote character ("
) in the input string s
(or delim
if specified), this function counts the number n of preceding backslash characters (\
), and then increases the number of backslashes from n to 2n+1 (even for n = 0). It also doubles a sequence of backslashes at the end of the string.
This escaping convention is used in raw strings and other non-standard string literals. (It is also the escaping convention expected by the Microsoft C/C++ compiler runtime when it parses a command-line string into the argv[] array.)
See also escape_string
.
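The n to 2n+1 rule and the doubling of trailing backslashes can be observed on a few inputs. This is a sketch with invented strings; it uses the default delimiter and should be read as an illustration of the rule rather than a full specification.

```julia
# n = 0 backslashes before the quote: one backslash is added.
plain_quote = escape_raw_string("say \"hi\"")

# n = 1 backslash before the quote: it becomes 2n+1 = 3 backslashes.
slash_quote = escape_raw_string("a\\\"")

# A backslash not followed by a quote is left alone, but trailing ones are doubled.
trailing = escape_raw_string("C:\\dir\\")

println(plain_quote)
println(slash_quote)
println(trailing)
```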
#
Base.unescape_string
— Function
unescape_string(str::AbstractString, keep = ())::AbstractString
unescape_string(io, s::AbstractString, keep = ())::Nothing
General unescaping of traditional C and Unicode escape sequences. The first form returns the unescaped string; the second prints the result to io
. The keep
argument specifies a collection of characters that (along with backslashes) are to be kept as they are.
The following escape sequences are recognized:
-
escaped backslash (
\\
);
-
escaped double quote (
\"
);
-
standard C escape sequences (
\a
, \b
, \t
, \n
, \v
, \f
, \r
, \e
);
-
Unicode BMP code points (
\u
followed by 1 to 4 hexadecimal digits);
-
all Unicode code points (
\U
followed by 1 to 8 hexadecimal digits; maximum value = 0010ffff);
-
hexadecimal bytes (
\x
followed by 1 to 2 hexadecimal digits);
-
octal bytes (
\
followed by 1 to 3 octal digits).
See also escape_string
.
Examples
julia> unescape_string("aaa\\nbbb") # C escape sequence
"aaa\nbbb"
julia> unescape_string("\\u03c0") # Unicode
"π"
julia> unescape_string("\\101") # octal
"A"
julia> unescape_string("aaa \\g \\n", ['g']) # using the `keep` argument
"aaa \\g \n"
AnnotatedString
objects
The API for AnnotatedString objects is considered experimental and is subject to change between Julia versions.
#
Base.AnnotatedString
— Type
AnnotatedString{S <: AbstractString} <: AbstractString
A string with metadata, in the form of annotated regions.
More specifically, this is a simple wrapper around any other AbstractString
that allows regions of the wrapped string to be annotated with labelled values.
                           C
                    ┌──────┸─────────┐
  "this is an example annotated string"
  └──┰────────┼─────┘         │
     A        └─────┰─────────┘
                    B
The diagram above represents an AnnotatedString
in which three regions have been annotated (labelled A
, B
, and C
). Each annotation holds a label (Symbol
) and a value (Any
). These three pieces of information (region, label, and value) are stored as a @NamedTuple{region::UnitRange{Int64}, label::Symbol, value}
.
Labels do not need to be unique: the same region can hold multiple annotations with the same label.
In general, code written for AnnotatedString
should preserve the following properties:
-
which characters an annotation applies to;
-
the order in which annotations are applied to each character.
Specific uses of AnnotatedString
may introduce additional semantics.
A consequence of these rules is that adjacent annotations with identical labels and values are equivalent to a single annotation covering the combined range.
See also AnnotatedChar
, annotatedstring
, annotations
, and annotate!
.
Constructors
AnnotatedString(s::S<:AbstractString) -> AnnotatedString{S}
AnnotatedString(s::S<:AbstractString, annotations::Vector{@NamedTuple{region::UnitRange{Int64}, label::Symbol, value}})
An AnnotatedString can also be created with annotatedstring
, which acts much like string
but preserves any annotations present in the arguments.
Examples
julia> AnnotatedString("this is an example annotated string",
[(1:18, :A => 1), (12:28, :B => 2), (18:35, :C => 3)])
"this is an example annotated string"
#
Base.AnnotatedChar
— Type
AnnotatedChar{S <: AbstractChar} <: AbstractChar
A Char with annotations.
More specifically, this is a simple wrapper around any other AbstractChar
, which holds a list of arbitrary labelled annotations (@NamedTuple{label::Symbol, value}
) alongside the wrapped character.
See also AnnotatedString
, annotatedstring
, annotations
, and annotate!
.
Constructors
AnnotatedChar(s::S) -> AnnotatedChar{S}
AnnotatedChar(s::S, annotations::Vector{@NamedTuple{label::Symbol, value}})
Examples
julia> AnnotatedChar('j', :label => 1)
'j': ASCII/Unicode U+006A (category Ll: Letter, lowercase)
#
Base.annotatedstring
— Function
annotatedstring(values...)
Creates an AnnotatedString
from any number of values
using their printed representation (print
).
This works like string
, but preserves any annotations present (in the form of AnnotatedString
or AnnotatedChar
values).
See also AnnotatedString
and AnnotatedChar
.
Examples
julia> annotatedstring("now a AnnotatedString")
"now a AnnotatedString"
julia> annotatedstring(AnnotatedString("annotated", [(1:9, :label => 1)]), ", and unannotated")
"annotated, and unannotated"
#
Base.annotations
— Function
annotations(str::Union{AnnotatedString, SubString{AnnotatedString}},
[position::Union{Integer, UnitRange}]) ->
Vector{@NamedTuple{region::UnitRange{Int64}, label::Symbol, value}}
Gets all annotations that apply to str
. If the position
argument is specified, only annotations that overlap with position
are returned.
Annotations are provided together with the regions they apply to, in the form of a vector of region-annotation tuples.
In accordance with the semantics described in AnnotatedString
, the order of the returned annotations matches the order in which they were applied.
See also annotate!
.
annotations(chr::AnnotatedChar) -> Vector{@NamedTuple{label::Symbol, value}}
Gets all annotations of chr
, in the form of a vector of annotation pairs.
#
Base.annotate!
— Function
annotate!(str::AnnotatedString, [range::UnitRange{Int}], label::Symbol, value)
annotate!(str::SubString{AnnotatedString}, [range::UnitRange{Int}], label::Symbol, value)
Annotates a range of str
(or the entire string) with a labelled value (label
=> value
). To remove existing label
annotations, use a value of nothing
.
The order in which annotations are applied to str
is semantically meaningful, as described in AnnotatedString
.
annotate!(char::AnnotatedChar, label::Symbol, value::Any)
Annotates char
with the pair label => value
.