Comment on page
The Query task is the most important task in Gaio. All data manipulation is carried out through it. When using it, Gaio is building an SQL command to send to the database and return a new table. If the table already exists, Gaio deletes and recreates it. However, there is the possibility of not deleting, but accumulating data in the table, by activating the Insert Mode button (19), which will use the SQL INSERT feature instead of deleting and creating a table.
When a Query task is run without Insert Mode (17) enabled, the following sequence of actions is performed:
- 1.If there is, deletion of the result table (DROP TABLE);
- 2.Execution of the SQL query generated by it;
- 3.Creates the table 'result table' (CREATE TABLE);
- 4.Insertion of data into the generated table (INSERT INTO).
- 1.Query name
- 2.Table name
- 3.Save button
- 4.Save and Run button
- 5.Preview button
- 6.SQL Button
- 7.Source table columns
- 8.Calculated Field
- 9.Aggregated Fields
- 10.New table columns
- 11.Filter: Columns that will be filtered
- 12.Grouping: Columns to be grouped
- 13.Sorting: Columns that will be sorted
- 14.Rows: Limit number of rows for the new table (LIMIT)
- 15.Limit by: Column that will be used as criteria for categorization when there is a limit on rows
- 16.Page (offset): Pagination
- 17.Relationship between tables (JOIN)
- 18.Column Properties
- 19.Insert Mode (INSERT)
- 20.close button
By default the name will be query, but you can define a name that helps identify what was done in the query. This name cannot start with a number or be less than 3 characters long.
Set a name that cannot start with a number or be less than 3 characters long. If the table name starts with tmp_ , it will be a temporary table and therefore it is easily deleted along with the other temporary tables. It is good practice to use temporary tables so as not to consume excess disk space.
This button saves the query, but does not execute it and therefore the table generated by it will not be recreated. It is useful when the task takes a longer time to execute, and you do not want to run it at the moment.
By clicking on this button, Gaio will save the query and execute it, regenerating the query's target table.
With this button, you can preview what the result of the query will be like before you even run it. Gaio sends the SQL to the database bringing only 10 rows. It is possible after the preview is displayed, to change the number of lines to be displayed.
In some situations, the result of the preview may differ from the final table. This is because the preview does not fetch the data from the source, sends a file with the result and imports the data into an internal Gaio table, as the execution of the query does. In the preview, you just run the query and the result is shown.
Gaio displays the SQL code that will be executed, however it cannot be edited. To build SQL manually, use the SQL task to query tables that are already in Gaio or the External SQL task to perform direct manual queries on the source databases (SQL Server, Oracle, PostGreSQL, MySQL, MariaDB and MemSQL).
The tables being used in this query are displayed here. Within them (when clicked), their respective columns are displayed and can be moved to the center of the window for various purposes such as filters and ordering.
Using this button, it is possible to create a new column using SQL functions. Such functions can range from simply extracting the year from a date to executing complex SQL queries to generate the column.
ANSI SQL can be used in the construction and functions from the database (DBMS) that is currently being consulted can also be used. When querying tables that are already in Gaio, functions from the ClickHouse DBMS must be used .
Pay attention to the type of column that this calculation will generate. If it is a text column, you will not be able to do calculations with it later. It can also take types such as date, numeric and others.
Convert date/time in text format to date format. In this case, there will be an automatic effort to identify the format of your text.
This item lists the columns that were aggregated with some function (sum, average, min, max, count, etc.). They are available there to be dragged to the options in the center of the screen.
Here the columns that will be in the table that will be generated from this query are defined. They can be columns that are exactly the same as the source tables, they can be calculated from the Calculated Field and they can also be aggregations (average, minimum, maximum, etc.). All items that are in that location will be in the generated table.
In this item, filters are applied to the data, for example, viewing takes data from only one state to the table. Here you can only use columns that will filter data in the source tables and/or in calculated fields that are not aggregating data.
When aggregating a column using, for example, average, sum, minimum, maximum and others, the Filter item appears in aggregated fields on the screen. This is where you can filter aggregated columns. This happens because this filter is applied after the data is generated, after the aggregation is consolidated and not before as in the previous filter option.
If a column was aggregated with, for example, a sum, in general some column will be used for the sums to be presented by it, such as the sum of sales by State. In this case, the State is a column to be grouped, as it is desired to see the sums of sales for it. In SQL practice, the columns that are not being aggregated (sum, min, max, etc.) must be in Grouping .
Keeping in mind the columns that will be in the output table, several columns can be inserted in this item, which can be sorted ascending or descending. Gaio will respect the sequential order in which the columns were inserted in this field for sorting.
This feature is very useful in two situations:
- 1.The query will return a lot of data and if you want to see a good set of data before the final execution, for example limit 1000 rows.
- 2.In conjunction with sorting, it is possible to create a TOP 20 sellers, limiting the query to 20 lines, for example.
Thinking about the previous item (13. Limit), it may be desired to return 100 lines, but starting from line 1,000. This item will allow such a data set to be made available in the generated table.
For SQL users, this is just an interface to build relationships between tables that will have their columns available to be used in formulas and be available in the table to be generated.
This button will open the window below that will allow you to join different tables, including those from different applications (as long as it is shared with the application in use).
At this point, it is necessary to inform the columns through which the two tables are related. If the relationship between the two tables is based on more than one column, it will be necessary to click the Join button on the previous screen more than once.
As many tables as necessary can be related by repeating this procedure.
The result of the relationship between the two tables can be in 4 ways.
- Inner join : only the lines that match in the two tables will be brought.
- Full join : all rows from both tables will be brought (it usually brings duplicates and should only be used when the user is fully aware of what will be returned, as it can lead to errors in data analysis.
- Left join : all rows from the left table and the data it finds in the right table will be brought based on the columns that join them.
- Right join : all rows from the right table and the data found in the left table will be brought.
Here it is possible to change the name that the column will have in the generated table, aggregate, change the column formatting, not its type. To change the type of a column it is necessary to use functions in the Calculated Field.
The standard procedure when completing activities in this Query task will be to execute. At this point, if the table already existed, it will be deleted and created again. However, if this option is checked (Insert Mode), Gaio will not delete the table, but will place the new data at the end of the table that already existed.
Name of the process that is creating the query: This location only displays the name of the process where this query is being created or edited.