Presto and Trino: Querying Everything From Anywhere
Learn how Presto and Trino enable federated SQL queries across data lakes, warehouses, databases, and object storage without moving data.
You have data in S3, in PostgreSQL, in MongoDB, in Kafka, in your data warehouse, and in three different MySQL instances from acquired companies. Your analysts need to join across all of it. What do you do?
Presto and its fork Trino answer this question with a federated query engine that connects to everything and queries it in place. No ETL required to consolidate before analysis. No separate data movement jobs. SQL spans your entire data estate.
The Problem: Data Silos
Most enterprises end up with data scattered across systems that do not talk to each other. Your BI tool connects to the warehouse. Your data scientists query the data lake directly. Your operational reports hit the OLTP database directly. Customer data lives in three different systems depending on which acquisition it came from.
Consolidating everything into one system is expensive, slow, and politically difficult. Different teams own different systems. Regulatory requirements mandate data residency. Some data changes too frequently to replicate.
Presto and Trino solve this with a virtual approach: instead of moving data to the query, send the query to the data.
Architecture: Coordinator and Workers
Presto and Trino use a classic distributed query architecture. The coordinator receives SQL, parses and plans the query, schedules work across workers, and returns results. Workers execute tasks in parallel, reading from their assigned data sources and exchanging intermediate results over the network.
-- Query across multiple sources in a single SQL statement
SELECT
c.customer_name,
c.region,
SUM(wh.order_amount) AS warehouse_revenue,
SUM(ec.cart_value) AS ecommerce_revenue,
SUM(wh.order_amount) + SUM(ec.cart_value) AS total_revenue
FROM mysql_customers.crm.customers c
JOIN redshift_sales.analytics.warehouse_orders wh ON c.customer_id = wh.customer_id
JOIN glue_catalog.ecommerce.transactions ec ON c.customer_id = ec.customer_id
WHERE c.region = 'EMEA'
GROUP BY c.customer_name, c.region;
The coordinator knows where each catalog’s data lives. It pushes down filter predicates to each source to minimize data transfer. Workers receive only the columns and rows they need for joins and aggregations.
Trino Architecture
flowchart TD
subgraph Client[Client]
BI[BI Tool / JDBC / CLI]
end
subgraph Coordinator[Coordinator Node]
SQL[SQL Parser]
PLAN[Query Optimizer]
SCHEDULE[Task Scheduler]
METRICS[Metrics]
end
subgraph Workers[Worker Nodes]
W1[Worker 1]
W2[Worker 2]
W3[Worker N]
end
subgraph Catalogs[Data Catalogs]
HIVE[Hive / Glue]
POSTGRES[PostgreSQL]
REDSHIFT[Redshift]
S3[S3 / Object Storage]
KAFKA[Kafka]
end
BI -->|SQL query| Coordinator
Coordinator -->|dispatch| W1
Coordinator -->|dispatch| W2
Coordinator -->|dispatch| W3
W1 <-->|exchange| W2
W2 <-->|exchange| W3
W1 -->|read| HIVE
W2 -->|read| POSTGRES
W3 -->|read| REDSHIFT
W1 -->|read| S3
W2 -->|read| KAFKA
The coordinator parses SQL and creates a distributed query plan. Workers exchange intermediate results with each other. Each worker reads from the connectors assigned to its tasks.
Presto Versus Trino: The Fork
Presto was originally developed at Facebook (now Meta) and open-sourced in 2013. For years, it was the only option for federated SQL queries. Facebook continued developing Presto internally, and the open-source version and the internal version gradually diverged.
In 2019, Presto's original creators, who had left Facebook, forked the project as PrestoSQL and co-founded Starburst to build an enterprise offering on top of it. The Presto trademark stayed with the original project (Facebook donated it to the Linux Foundation), so PrestoSQL was renamed Trino in late 2020.
Today, both projects continue. PrestoDB (under the Presto Foundation at the Linux Foundation) and Trino both actively develop, with slightly different priorities. Trino has historically moved faster on new features and has a larger ecosystem of connectors. PrestoDB has the backing of Meta and stability for companies deeply invested in the original codebase.
For most new deployments, Trino is the more active project with broader connector support. The SQL dialect is nearly identical. Performance characteristics are comparable. Starburst’s enterprise distribution adds security, observability, and support features for companies that want them.
Query Execution: How It Works
When you submit a query, the coordinator parses the SQL and creates a logical plan. The optimizer reorganizes joins, pushes down predicates, and determines which data to read from which connectors. The physical plan divides work into stages that run on workers.
EXPLAIN (TYPE DISTRIBUTED)
SELECT
date_trunc('week', o.order_date) AS week,
p.category,
COUNT(DISTINCT o.customer_id) AS unique_customers,
SUM(o.total_amount) AS revenue
FROM redshift.analytics.orders o
JOIN hive.default.products p ON o.product_id = p.id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY 1, 2
ORDER BY 4 DESC;
The EXPLAIN output shows the distributed plan: which stages run on which workers, how data is partitioned, and what pushes down to connectors. Understanding explain plans is essential for tuning Trino queries at scale.
Connectors: Talking to Everything
The connector system is what makes Trino powerful. Each connector understands how to read (and sometimes write) from a specific data source.
The Hive connector reads from data lakes using Hive metastore or AWS Glue as the catalog. It supports Parquet, ORC, JSON, and text files. It pushes filter predicates down to file level when possible.
-- Query a data lake table managed by Glue
SELECT *
FROM hive.production.clickstream_events
WHERE event_date = DATE '2025-03-27'
AND event_type = 'purchase'
LIMIT 100;
The PostgreSQL connector creates a virtual schema that maps to PostgreSQL tables. Queries run directly against PostgreSQL, with predicate and aggregation pushdown.
-- Join PostgreSQL dimension with Redshift facts
SELECT
p.product_name,
p.inventory_count,
r.revenue_30d
FROM postgresql.inventory.products p
LEFT JOIN (
SELECT product_id, SUM(amount) AS revenue_30d
FROM redshift.analytics.sales
WHERE sale_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY product_id
) r ON p.product_id = r.product_id
WHERE p.inventory_count < 10;
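Each catalog referenced in these queries (postgresql, redshift, hive) is configured by a properties file in etc/catalog/ on every node; the file name becomes the catalog name. A minimal sketch for a PostgreSQL catalog, with placeholder hostnames and credentials:

```
# etc/catalog/postgresql.properties
connector.name=postgresql
connection-url=jdbc:postgresql://inventory-db.internal:5432/inventory
connection-user=trino_reader
connection-password=<secret>
```

After a restart, tables become addressable as postgresql.<schema>.<table> from any query.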
Memory management is critical. Trino processes data in memory across workers, and queries that exceed memory limits fail. The query.max-memory and query.max-memory-per-node properties cap how much memory a single query may consume cluster-wide and on each worker. For large joins or aggregations, you may need to enable spill to disk or increase the memory allocation.
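Those limits live in config.properties. A sketch with illustrative values (not recommendations—size them against your workload and worker hardware):

```
# config.properties (illustrative values)
query.max-memory=40GB             # distributed limit per query, across all workers
query.max-memory-per-node=8GB     # per-worker limit per query
# Allow large joins/aggregations to spill rather than fail
spill-enabled=true
spiller-spill-path=/mnt/spill
```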
Performance Characteristics
Trino excels at filtering and aggregating large datasets across multiple sources. A query that filters a billion-row fact table in S3 to a small result set is fast because predicate pushdown means workers only read matching data.
Joins between large tables are expensive. If both sides of a join are massive, you need to broadcast one side to all workers or shuffle data by join key. Broadcasting works when one side fits in worker memory. Shuffling works for larger datasets but requires network transfer.
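The broadcast-versus-shuffle tradeoff can be reasoned about with back-of-envelope arithmetic. The sketch below is illustrative only—the threshold and cost formulas are assumptions, not Trino's actual cost-based optimizer:

```python
# Rough cost comparison of broadcast vs. partitioned (shuffle) joins.
# Illustrative heuristic -- Trino's optimizer uses table statistics instead.

def join_strategy(build_side_gb: float, probe_side_gb: float,
                  num_workers: int, worker_memory_gb: float) -> str:
    """Pick broadcast only if the build side fits comfortably in each worker."""
    broadcast_network_gb = build_side_gb * num_workers   # copy build side to every worker
    shuffle_network_gb = build_side_gb + probe_side_gb   # repartition both sides once
    fits_in_memory = build_side_gb < 0.25 * worker_memory_gb  # assumed safety margin
    if fits_in_memory and broadcast_network_gb < shuffle_network_gb:
        return "BROADCAST"
    return "PARTITIONED"

# Small dimension table joined to a large fact table: broadcast wins
print(join_strategy(build_side_gb=0.5, probe_side_gb=500,
                    num_workers=20, worker_memory_gb=16))  # BROADCAST
# Two large tables: must shuffle
print(join_strategy(build_side_gb=80, probe_side_gb=500,
                    num_workers=20, worker_memory_gb=16))  # PARTITIONED
```

In Trino itself the choice is made by the optimizer and can be overridden with the join_distribution_type session property (AUTOMATIC, BROADCAST, or PARTITIONED).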
Capacity Estimation
Trino workers process data in memory. Right-sizing the cluster prevents OOM failures and ensures good parallelism.
# Estimate worker memory needs (back-of-envelope)
import math

query_memory_gb = 10 # distributed memory for a typical analytical query
concurrent_queries = 5
memory_per_worker_gb = 16 # typical worker memory
spill_space_gb = memory_per_worker_gb * 0.5 # spill space = 50% of memory
# Workers needed for concurrent queries (round up -- partial workers don't exist)
workers_for_concurrency = math.ceil(concurrent_queries * query_memory_gb / memory_per_worker_gb)
# Network bandwidth: GB * 8 = gigabits, spread over the query duration
gb_per_query = 50 # data read per query
query_duration_sec = 30
network_gbps = (gb_per_query * 8) / query_duration_sec # ~13 Gbps for 50GB in 30s
print(f"Workers for concurrent queries: {workers_for_concurrency}")
print(f"Network bandwidth per query: {network_gbps:.0f} Gbps")
# Workers for concurrent queries: 4
# Network bandwidth per query: 13 Gbps
At 50GB per query and 30-second target duration, you need ~13 Gbps network throughput per query. For 10 concurrent queries, that's 130 Gbps total. If workers share a 10 Gbps network link, queries queue up. Profile actual network usage per query (the Trino Web UI reports physical input and output bytes for each query) before scaling.
Analytical queries—aggregations, window functions, complex joins across multiple filtered sources—play to Trino’s strengths. Transactional queries that return single rows or update small amounts of data are not what Trino is designed for.
Trino vs Presto vs Native Connectors
| Aspect | Trino | PrestoDB | Native Connector (Spark/Flink) |
|---|---|---|---|
| Federated queries | Yes (primary use case) | Yes | Limited |
| Connector ecosystem | Broad (30+ connectors) | Broad (25+ connectors) | Single source per engine |
| SQL dialect | Standard SQL (ANSI) | Standard SQL | Engine-specific |
| Active development | Very active | Moderate | Varies by engine |
| Memory management | Per-task memory limits | Per-task memory limits | JVM heap |
| Security | Built-in RBAC, LDAP | Built-in RBAC | Depends on engine |
| Best for | Data virtualization, multi-source joins | Meta-internal workloads | Single-source batch/streaming |
For multi-source federation, Trino or PrestoDB. For single-source workloads, use the native engine (Spark for batch ETL, Flink for streaming).
When to Use Trino
Trino is the right tool when you need to query data in place without moving it. Data virtualization, cross-source joins, ad-hoc exploration of data lake contents, and federated queries across multiple warehouses or databases are all natural use cases.
It’s also excellent for data lake query acceleration. Instead of relying on Spark for all data lake queries, you can serve interactive BI queries through Trino. Many teams use Trino as the query layer for their data lake, with Spark handling ETL and batch processing.
Trino is not an operational database. It does not handle high-frequency inserts or updates well. It does not provide ACID transactions across sources. It is a query engine, not a storage system.
Trino Production Failure Scenarios
Coordinator OOM on complex query
A user submits a query with a 10-way join across large tables. The coordinator holds the entire query plan and intermediate results in memory. The coordinator runs out of heap and crashes. All running queries fail, and the cluster becomes unavailable.
Mitigation: Set query.max-memory to limit each query's distributed memory and query.max-memory-per-node to limit its per-worker usage. Monitor coordinator heap usage and alert before it exceeds 80%. Use resource groups to cap concurrency for very large queries.
Memory spill causing query timeouts
A query tries to join two large tables that don’t fit in worker memory. Trino spills intermediate results to disk, but disk is slow. The query runs for 30 minutes and times out before completing.
Mitigation: Estimate memory requirements before running large joins. Use EXPLAIN to see whether a join will broadcast or shuffle. For very large joins, pre-aggregate one side or use approximate algorithms (HyperLogLog for count-distinct). Monitor spilled bytes in per-query stats to catch queries that fall back to disk.
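One concrete memory reducer from the mitigation above: Trino's approx_distinct aggregates with a fixed-size HyperLogLog sketch instead of a hash set, bounding memory regardless of cardinality (at the cost of a small standard error):

```
-- Exact distinct: memory grows with the number of distinct customers
SELECT COUNT(DISTINCT customer_id) FROM redshift.analytics.sales;

-- Approximate distinct: constant-size sketch per group
SELECT approx_distinct(customer_id) FROM redshift.analytics.sales;
```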
Predicate pushdown failure causing connector overload
A user submits a query with a date filter event_date = '2025-03-27'. The PostgreSQL connector fails to push down the predicate because the column type doesn’t match. The connector reads all rows from PostgreSQL and transfers them to workers, overwhelming the PostgreSQL instance.
Mitigation: Test connector pushdown with EXPLAIN. Verify column types match between the query filter and the source. For connectors with poor pushdown, filter as early as possible within the query itself. Compare rows read from the connector against rows output by the scan stage (visible in EXPLAIN ANALYZE) to detect pushdown failures.
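A common way pushdown silently breaks is a filter that forces an expression over the source column. Assuming a hypothetical postgresql.app.events table whose event_date column is a DATE, these two filters return the same rows but behave very differently:

```
-- Pushdown-friendly: literal typed to match the column,
-- so PostgreSQL evaluates the filter and returns few rows
SELECT count(*) FROM postgresql.app.events
WHERE event_date = DATE '2025-03-27';

-- Pushdown-hostile: casting the column typically defeats pushdown,
-- so every row streams to Trino workers before filtering
SELECT count(*) FROM postgresql.app.events
WHERE CAST(event_date AS varchar) = '2025-03-27';
```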
Catalog misconfiguration exposing wrong data
A new catalog is added to Trino for a development data source. The catalog name dev_postgres is similar to prod_postgres. A user accidentally queries the wrong catalog and gets stale development data in a production report.
Mitigation: Use consistent naming conventions (prefix catalogs with environment: prod_, dev_). Restrict catalog access by role. Add query comments that must specify the target environment. Audit query logs for cross-environment joins.
Conclusion
Presto and Trino changed the game for data platform architecture. Instead of ETL pipelines that copy data into a central warehouse, you can query across your entire data estate with SQL. The tradeoff is latency and resource usage—federated queries are slower than querying local data, and they consume resources from every system they touch.
Quick Recap
- Trino and PrestoDB query data in place—no ETL needed to consolidate before analysis.
- The coordinator parses and optimizes SQL; workers execute tasks and exchange intermediate results.
- Predicate pushdown is critical for performance: verify it works with EXPLAIN before running large queries.
- Memory limits prevent single queries from crashing the cluster—set query.max-memory-per-node.
- Use RBAC and catalog naming conventions to prevent cross-environment queries.
- For single-source workloads, use the native engine (Spark, Flink) instead of Trino.
For teams struggling with data silos, Trino provides a pragmatic middle ground. You do not have to move everything to one place to analyze it together. Connect the sources, configure the catalogs, and start querying across systems.
Trino Observability
Key metrics to monitor:
-- Recent failed queries from Trino's built-in system tables
SELECT
query_id,
state,
"user",
query,
created
FROM system.runtime.queries
WHERE state = 'FAILED'
ORDER BY created DESC
LIMIT 10;
-- Worker health
SELECT
node_id,
http_uri,
node_version,
coordinator,
state
FROM system.runtime.nodes;
Monitor per query: rows scanned, bytes scanned, execution time, peak memory. Alert on: coordinator heap > 80%, query failure rate > 5%, worker memory pressure. Track output bytes per query (reported in the Web UI and query completion events) to detect queries returning unexpectedly large results (potential data leak).
Trino Security Checklist
- Enable LDAP or OAuth authentication for all users
- Configure RBAC for catalog and schema access
- Restrict access to the system catalog (information_schema) by role
- Validate catalog names: use environment prefixes (prod_, dev_) to prevent cross-environment queries
- Enable query audit logging with user, query text, and source IP
- Set memory limits per query (query.max-memory-per-node) to prevent single-query OOM from affecting the cluster
- Encrypt connections between workers (internal communication) and to data sources
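The catalog-level RBAC items above can be sketched with Trino's file-based access control (enabled via access-control.name=file in etc/access-control.properties). Group names and the catalog regexes below are hypothetical; rules are evaluated top to bottom and the first match wins:

```
{
  "catalogs": [
    {"group": "analysts", "catalog": "prod_.*", "allow": "read-only"},
    {"group": "platform", "catalog": ".*", "allow": "all"},
    {"catalog": ".*", "allow": "none"}
  ]
}
```

With these rules, analysts can read production catalogs but cannot touch dev_ catalogs at all, which also addresses the cross-environment failure scenario described earlier.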
Trino Anti-Patterns
Using Trino as an operational database. Trino is not designed for high-frequency single-row lookups or write workloads. If you find yourself running millions of small queries through Trino, use the native database or add a caching layer (such as Redis) instead.
Large broadcast joins without memory limits. Forcing a broadcast join on a 100GB table overwhelms worker memory. The query fails, but not before causing GC pressure on all workers. Always check EXPLAIN for memory estimates before running large joins.
No catalog access control. If any user can query any catalog, a misconfigured BI tool could hit production systems. Always configure RBAC per catalog and validate in staging first.
For large-scale batch processing, see Apache Spark. For portable streaming and batch pipelines, read about Apache Beam.