Trino - Cross DB Reporting

Introduction

Managing multiple data sources was a big challenge I was facing on my NAS environment. I have various applications running on my NAS, and each application utilizes different databases to store the data. This diverse ecosystem of databases includes MySQL, PostgreSQL, SQLite, MS SQL, and others, each serving its specific purpose.

My main applications are:

Immich, a photo and video backup system
A WordPress blog
Some of my personal projects

Each of these applications has its own database, and I wanted to generate a daily report that requires data from multiple databases. Manually joining tables across these different databases was not only time-consuming but also prone to errors and inefficiencies. I was looking for a solution that could seamlessly integrate and query data from these disparate sources to generate accurate and timely reports.

This is where TrinoDB shines. It’s a powerful distributed SQL query engine that enables us to query data from multiple databases as if they were part of a single unified system. With TrinoDB, we can perform complex SQL queries, join tables across different databases, and generate daily reports without the hassle of manual data integration.

Setting Up Trino on NAS

There are many ways you can install Trino, but I prefer using Docker, and it’s pretty straightforward using Docker. Run the below Docker command to install:

docker run --name trino -d -p 8080:8080 trinodb/trino

I personally love DBeaver and used it to connect to my Trino instance. Since my plan was to always use it locally, I simplified the setup by ignoring the authentication part. However, if you want to enable authentication, you’ll need to set up the configuration properties.

Creating DB Connectors

Before running the above Docker command, set the environment variable:

CATALOG_MANAGEMENT=dynamic

With this, you can create the database connector dynamically using some SQL commands:

CREATE CATALOG mysql USING mysql
WITH (
  "connection-url" = 'jdbc:mysql://<ip-address>:<port>',
  "connection-user" = '<db-username>',
  "connection-password" = '<db-password>'
);

If you don’t set this environment variable, the catalog management will be marked as static, and you can create the catalog using catalog property files. However, whenever you add any new catalog, you have to make sure to restart the container.

You can read more about catalog types here.

To integrate my various applications, I created three connectors:

Immich Application: Connected to a PostgreSQL database.
WordPress Blog: Connected to a MySQL database.
Personal Projects: Connected to an MS SQL Server database.

These connectors allow me to seamlessly query and join data across different databases, making it easier to generate the comprehensive reports I need.

Conclusion

With these three connectors, I can seamlessly query any table and join tables across different databases. This integration has simplified my data management tasks and enabled me to generate comprehensive reports effortlessly.

Another outcomes of using Trino DB is the creation of a user dashboard. This dashboard aggregates and displays details across all my applications. Since the user ID is consistently the email address across most of my applications, the dashboard provides a unified view of user-related data, making it easier to track and manage user information.

Introduction#

Setting Up Trino on NAS#

Creating DB Connectors#

Conclusion#

Introduction

Setting Up Trino on NAS

Creating DB Connectors

Conclusion