Optimize your queries
Use the suggestions in this section for optimizing your SQL queries in Athena.
Use LIMIT with the ORDER BY clause
The ORDER BY clause returns data in a sorted order. This requires
                Athena to send all rows of data to a single worker node and then sort the rows. This
                type of query can run for a long time or even fail.
For greater efficiency in your queries, look at the top or bottom
                    N values, and then also use a LIMIT
                clause. This significantly reduces the cost of the sort by pushing both sorting and
                limiting to individual worker nodes rather than to a single worker.
Optimize JOIN clauses
When you join two tables, Athena distributes the table on the right to worker nodes, and then streams the table on the left to perform the join.
For this reason, specify the larger table on the left side of the join and the smaller table on the right side of the join. This way, Athena uses less memory and runs the query with lower latency.
Also note the following points:
- 
                    When you use multiple JOINcommands, specify tables from largest to smallest.
- 
                    Avoid cross joins unless they are required by the query. 
Optimize GROUP BY clauses
The GROUP BY operator distributes rows based on the GROUP
                    BY columns to the worker nodes. These columns are referenced in memory
                and the values are compared as the rows are ingested. The values are aggregated
                together when the GROUP BY column matches. In consideration of the way
                this process works, it is advisable to order the columns from the highest
                cardinality to the lowest.
Use numbers instead of strings
Because numbers require less memory and are faster to process compared to strings, use numbers instead of strings when possible.
Limit the number of columns
To reduce the total amount of memory required to store your data, limit the number
                of columns specified in your SELECT statement.
Use regular expressions instead of LIKE
Queries that include clauses such as LIKE '%string%' on large strings
                can be very computationally intensive. When you filter for multiple values on a
                string column, use the regexp_like()
Use the LIMIT clause
Instead of selecting all columns when you run a query, use the LIMIT
                clause to return only the columns that you require. This reduces the size of the
                dataset that is processed through the query execution pipeline. LIMIT
                clauses are more helpful when you query tables that have a large of number of
                columns that are string-based. LIMIT clauses are also helpful when you
                perform multiple joins or aggregations on any query.