Filtering ASCII control characters

Overview

This example demonstrates how to select the ASCII control characters from a range of characters using the built-in function filter and the predicate iscntrl.

Introduction

ASCII control characters are the characters with codes 0 to 31 and 127 in the ASCII table. They have no visible glyph; instead they perform special functions: controlling input/output devices, formatting text, and transmitting data. They are used to control the behavior of terminals, printers, and other devices. For example, the character \n (code 10) starts a new line, and the character \t (code 9) inserts a tab. In programming, it is often necessary to determine whether a character is a control character, or to filter such characters out of text. Julia provides built-in functions for this: iscntrl checks whether a character is a control character, and filter selects the items of a collection that satisfy a given condition.
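The two building blocks can be tried in isolation before combining them. The following is a minimal sketch; the variable names and the iseven example are illustrative and do not come from the original notebook.

```julia
# Character codes of the two control characters mentioned above
newline_code = Int('\n')   # line feed
tab_code     = Int('\t')   # horizontal tab

# iscntrl is a predicate: it takes one character and returns true or false
checks = [iscntrl('\n'), iscntrl('\t'), iscntrl('A'), iscntrl(' ')]

# filter keeps the elements for which the predicate returns true;
# here with a numeric predicate to show that it is not specific to characters
evens = filter(iseven, 1:10)
```

A space (code 32) is the first printable ASCII character, so iscntrl returns false for it.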

Main part

Preparing the character range

Create a range of all ASCII characters, with codes from 0 to 127:

Char(0) - the character with code 0 (\0)

Char(127) - the character with code 127 (\x7f)

In [ ]:
chars_range = Char(0):Char(127)
Out[0]:
'\0':1:'\x7f'

Here we create a range from the character with code 0 (the null character, '\0') to the character with code 127 ('\x7f'), covering the entire ASCII table. A range in Julia represents a sequence in which each element follows the previous one by a fixed step. Here the default step is 1, so the range contains all 128 ASCII characters.
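The properties of the range described above can be checked directly. This is a small sketch; the names n, first_code, last_code, and letters are introduced here for illustration only.

```julia
chars_range = Char(0):Char(127)

n          = length(chars_range)        # number of characters in the range
range_step = step(chars_range)          # distance between neighbouring code points
first_code = Int(first(chars_range))    # code of the first character
last_code  = Int(last(chars_range))     # code of the last character

# A range is lazy; collect turns it into an ordinary Vector{Char}
letters = collect('a':'e')
```

The range itself stores only its endpoints and step, so it stays cheap to create even when it covers many characters.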

Filtering control characters

We apply a filter to select only the control characters:

  • iscntrl - a function that checks whether a character is a control character

  • filter - a function that applies a predicate to each element of the range and returns an array of only those elements for which the predicate returned true

In [ ]:
control_chars = filter(iscntrl, chars_range)
Out[0]:
33-element Vector{Char}:
 '\0': ASCII/Unicode U+0000 (category Cc: Other, control)
 '\x01': ASCII/Unicode U+0001 (category Cc: Other, control)
 '\x02': ASCII/Unicode U+0002 (category Cc: Other, control)
 '\x03': ASCII/Unicode U+0003 (category Cc: Other, control)
 '\x04': ASCII/Unicode U+0004 (category Cc: Other, control)
 '\x05': ASCII/Unicode U+0005 (category Cc: Other, control)
 '\x06': ASCII/Unicode U+0006 (category Cc: Other, control)
 '\a': ASCII/Unicode U+0007 (category Cc: Other, control)
 '\b': ASCII/Unicode U+0008 (category Cc: Other, control)
 '\t': ASCII/Unicode U+0009 (category Cc: Other, control)
 '\n': ASCII/Unicode U+000A (category Cc: Other, control)
 '\v': ASCII/Unicode U+000B (category Cc: Other, control)
 '\f': ASCII/Unicode U+000C (category Cc: Other, control)
 ⋮
 '\x15': ASCII/Unicode U+0015 (category Cc: Other, control)
 '\x16': ASCII/Unicode U+0016 (category Cc: Other, control)
 '\x17': ASCII/Unicode U+0017 (category Cc: Other, control)
 '\x18': ASCII/Unicode U+0018 (category Cc: Other, control)
 '\x19': ASCII/Unicode U+0019 (category Cc: Other, control)
 '\x1a': ASCII/Unicode U+001A (category Cc: Other, control)
 '\e': ASCII/Unicode U+001B (category Cc: Other, control)
 '\x1c': ASCII/Unicode U+001C (category Cc: Other, control)
 '\x1d': ASCII/Unicode U+001D (category Cc: Other, control)
 '\x1e': ASCII/Unicode U+001E (category Cc: Other, control)
 '\x1f': ASCII/Unicode U+001F (category Cc: Other, control)
 '\x7f': ASCII/Unicode U+007F (category Cc: Other, control)

The call filter(iscntrl, chars_range) filters the range and returns only the control characters. The function iscntrl returns true if the character belongs to the Unicode category "Cc: Other, control". The result is stored in the variable control_chars, a vector of 33 elements: all ASCII control characters.
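For comparison, the same selection can be written as a comprehension, and count gives the number of matches without building a vector. This is a sketch of equivalent formulations, not part of the original example. The last line assumes that iscntrl also covers the C1 control characters (U+0080 to U+009F) of the Cc category, which lie outside ASCII.

```julia
chars_range   = Char(0):Char(127)
control_chars = filter(iscntrl, chars_range)

# The same selection as a comprehension with a condition
via_comprehension = [c for c in chars_range if iscntrl(c)]
same_result = control_chars == via_comprehension

# Count the matches without allocating the result vector
n_control = count(iscntrl, chars_range)

# Assumption: over the Latin-1 range the C1 controls are matched as well
n_control_latin1 = count(iscntrl, Char(0):Char(255))
```

Restricting the input range to Char(0):Char(127), as the example does, is what limits the result to ASCII.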

Result analysis

Code execution result:

  • The result is a vector of 33 elements
  • All elements are ASCII control characters
  • The result contains the characters with codes 0-31 and 127
  • Some characters are displayed as escape sequences (for example, \n, \t, \r), others as hexadecimal codes (\x01, \x1f)
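These observations can be confirmed programmatically. The sketch below converts the characters back to their codes and compares them with the expected list; the names codes, expected, and all_ascii are illustrative.

```julia
control_chars = filter(iscntrl, Char(0):Char(127))

# Convert every character to its integer code
codes = Int.(control_chars)

# Codes 0 to 31 followed by 127
expected = vcat(0:31, 127)
codes_match = codes == expected

# Every selected character lies within the ASCII range
all_ascii = all(isascii, control_chars)
```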

Conclusion

In this example, we used built-in Julia functions to filter ASCII control characters. We created a range of all ASCII characters and then applied filter with the predicate iscntrl to select only the control characters. This technique is useful in text processing when invisible control characters that may affect formatting or program behavior need to be identified or removed. Understanding these characters is important for developers working with text data, parsers, terminal I/O, or system programming.
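As an illustration of the removal case, the predicate can be negated and applied to a string. This is a minimal sketch with a made-up input string; whether newlines and tabs should be kept depends on the application.

```julia
# Hypothetical input containing a tab, a newline, a bell (\x07) and a delete (\x7f)
raw = "Hello\tworld\n\x07done\x7f"

# !iscntrl negates the predicate, so filter drops every control character
cleaned = filter(!iscntrl, raw)

# Variant that keeps newlines and tabs but removes the other control characters
keep_layout = filter(c -> !iscntrl(c) || c in ('\n', '\t'), raw)
```

When filter is applied to a string, the result is again a string rather than a vector of characters.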

The example was developed using materials from Rosetta Code