Basic functions of working with tables

This example shows how to work with tables using the DataFrames.jl library in the Engee environment.


Creating a table

Let's define a table and use it to demonstrate the functions.

One way to create a table is to build it from an existing matrix.

In [ ]:
x = rand(3,4) # Create a 3×4 matrix of random numbers
Out[0]:
3×4 Matrix{Float64}:
 0.926038   0.804963   0.534224    0.79081
 0.0455585  0.247521   0.00657984  0.944393
 0.560838   0.0451912  0.272253    0.225811
In [ ]:
# Load the library
using DataFrames

y = DataFrame( x, ["A", "B", "C", "D"] )
Out[0]:

3 rows × 4 columns

    A          B          C           D
    Float64    Float64    Float64     Float64
 1  0.926038   0.804963   0.534224    0.79081
 2  0.0455585  0.247521   0.00657984  0.944393
 3  0.560838   0.0451912  0.272253    0.225811

You can generate column names automatically by passing :auto as the second argument.

In [ ]:
y1 = DataFrame( x, :auto )
Out[0]:

3 rows × 4 columns

    x1         x2         x3          x4
    Float64    Float64    Float64     Float64
 1  0.926038   0.804963   0.534224    0.79081
 2  0.0455585  0.247521   0.00657984  0.944393
 3  0.560838   0.0451912  0.272253    0.225811
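
A table does not have to start from a matrix. As a minimal sketch, the keyword form of the DataFrame constructor builds a table directly from column vectors. The table name t, the column names, and the values below are hypothetical sample data, not part of the example above.

```julia
using DataFrames

# Hypothetical sample data: each keyword becomes a column name
t = DataFrame(name = ["a", "b", "c"], value = [1.5, 2.5, 3.5])

size(t)    # (rows, columns) of the table
names(t)   # column names as strings
```

This form is convenient when the columns have different element types, which a single numeric matrix cannot hold.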

Functions for working with rows and columns

To get, for example, the second column of the table, you can access it by name with dot syntax or index all rows of the second column. Both methods are shown in the code cells below.

In [ ]:
y.B # Accessing through the column name
Out[0]:
3-element Vector{Float64}:
 0.8049630878241182
 0.24752121054936527
 0.04519118103988562
In [ ]:
y[:,2] # Output all rows of the 2nd column
Out[0]:
3-element Vector{Float64}:
 0.8049630878241182
 0.24752121054936527
 0.04519118103988562
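
The two access styles return the same values but differ in one detail worth knowing: dot access returns the column stored in the table, while indexing with a colon returns a copy. The sketch below illustrates this on a small hypothetical table t (the names and values are assumptions made for the illustration).

```julia
using DataFrames

t = DataFrame(A = [1.0, 2.0, 3.0], B = [4.0, 5.0, 6.0])  # hypothetical data

v_ref  = t.B        # the stored column itself (same as t[!, :B])
v_copy = t[:, :B]   # an independent copy of the column

v_copy[1] = 0.0     # changes only the copy
v_ref[1]  = 40.0    # changes the table as well

t.B                 # first element is now 40.0
```

Use the copying form when the result will be modified and the table must stay unchanged.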

You can also output all columns but one, for example C.

In [ ]:
select(y, Not("C")) # Returns a new table without column C; y itself is not changed
Out[0]:

3 rows × 3 columns

    A          B          D
    Float64    Float64    Float64
 1  0.926038   0.804963   0.79081
 2  0.0455585  0.247521   0.944393
 3  0.560838   0.0451912  0.225811
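
The same function can keep a chosen set of columns or drop several at once. A minimal sketch on a hypothetical table t (its name and values are assumptions for the illustration):

```julia
using DataFrames

t = DataFrame(A = [1, 2], B = [3, 4], C = [5, 6], D = [7, 8])  # hypothetical data

keep = select(t, [:A, :D])        # keep only columns A and D
drop = select(t, Not([:B, :C]))   # drop columns B and C, keep the rest

names(keep)   # column names of the result
names(t)      # the original table still has all four columns
```

Both calls return a new table; select!() is the variant that modifies the table in place.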

To rename columns, use the function rename!(), which modifies the table in place (rename() returns a new table instead). The first argument of the function is the table, and the second is the new column names.

In [ ]:
rename!(y1, ["A", "B", "C", "D"])
Out[0]:

3 rows × 4 columns

    A          B          C           D
    Float64    Float64    Float64     Float64
 1  0.926038   0.804963   0.534224    0.79081
 2  0.0455585  0.247521   0.00657984  0.944393
 3  0.560838   0.0451912  0.272253    0.225811
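
When only some columns need new names, a pair of the form old => new can be passed instead of a full list. The sketch below uses a hypothetical table t to show this, together with the difference between the copying and the in-place variant.

```julia
using DataFrames

t = DataFrame(x1 = [1, 2], x2 = [3, 4])  # hypothetical data

r = rename(t, :x1 => :A)   # returns a new table; t keeps its names
rename!(t, :x2 => :B)      # renames the column in t itself

names(r)   # names of the renamed copy
names(t)   # names of the original after rename!
```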

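Column values can be sorted in place using the function sort!(). Note that sorting the vector y1.C reorders only column C; the other columns keep their original row order.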

In [ ]:
sort!(y1.C) # Sort the values of column C in place; the change is stored in the table y1
Out[0]:
3-element Vector{Float64}:
 0.006579837672216482
 0.2722532183791443
 0.534224047073746
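
Sorting a single column vector breaks the correspondence between values in the same row. If the goal is to reorder whole rows by the values of one column, sort() can be applied to the table itself. A minimal sketch on a hypothetical table t (names and values are assumptions for the illustration):

```julia
using DataFrames

t = DataFrame(id = [1, 2, 3], C = [0.5, 0.1, 0.3])  # hypothetical data

by_c      = sort(t, :C)               # rows ordered by ascending C
by_c_desc = sort(t, :C, rev = true)   # rows ordered by descending C

by_c.id        # ids follow their C values
by_c_desc.id
```

Here each id stays attached to its C value, and t is left unchanged; sort!(t, :C) would reorder t in place.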

You can add a column to a table by referring to it by name and assigning a value. If there is no column with that name in the table, a new one will be created.

In [ ]:
y1.F = 3 * (y1.B + y1.C) # Write to a new column the sum of the values of columns B and C, multiplied by 3
y1
Out[0]:

3 rows × 5 columns

    A          B          C           D         F
    Float64    Float64    Float64     Float64   Float64
 1  0.926038   0.804963   0.00657984  0.79081   2.43463
 2  0.0455585  0.247521   0.272253    0.944393  1.55932
 3  0.560838   0.0451912  0.534224    0.225811  1.73825
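
A computed column can also be added without modifying the source table by using transform(). The sketch below shows both approaches on a hypothetical table t; the column name G and the values are assumptions made for the illustration.

```julia
using DataFrames

t = DataFrame(B = [1.0, 2.0], C = [3.0, 4.0])  # hypothetical data

# Assignment adds column F to t itself (element-wise with broadcasting)
t.F = 3 .* (t.B .+ t.C)

# transform returns a new table with an extra column G; t is not changed
t2 = transform(t, [:B, :C] => ((b, c) -> b .- c) => :G)

names(t)    # columns of the original table
names(t2)   # columns of the new table
```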

Thus, we have reviewed the basic functions for working with tables using the DataFrames.jl library. For more information on working with tables, see the DataFrames.jl documentation.