Exercise 5.1: Matillion Hub Trial + First Job
Stand up a Matillion Hub trial, connect it to your Snowflake trial, and build a transformation that reads meter_reads and writes hourly_consumption_by_zip. First hands-on feel of push-down ELT.
Prereq: Snowflake trial (Ex 2.1), meter_reads table exists. Time: 45 min. Cost: ~$0.02 Snowflake credits. Matillion trial is free.
Step 1: Sign up for Matillion Hub
- Go to hub.matillion.com. Sign up with work/personal email.
- Create an organization. Pick the closest region to us-west-2.
- You'll be offered "Data Productivity Cloud"; pick it (this is the current product).
- Trial gives you 14 days of unrestricted access + ongoing free-tier credits after.
Step 2: Connect Matillion to Snowflake
- Matillion → Connectivity → Credentials → Add Snowflake.
- Provide: account URL (xy12345.us-west-2.aws.snowflakecomputing.com), user, password.
- Test connection. Should succeed.
- Create a Project: utility-de-learning. Pick the Snowflake credential, database LEARN_DB, schema UTILITY, warehouse LEARN_WH.
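If the credential test or Project setup fails, a quick check in a Snowflake worksheet confirms that the objects the Project references actually exist (a minimal sketch, assuming the Exercise 2.1 names):
-- These should each return a row; if not, revisit Exercise 2.1 before continuing
SHOW WAREHOUSES LIKE 'LEARN_WH';
SHOW SCHEMAS LIKE 'UTILITY' IN DATABASE LEARN_DB;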
Step 3: Build a transformation
Your goal: given the synthetic meter_reads from Exercise 2.1, produce an hourly_consumption_by_zip summary.
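Before building the graph, you can confirm the source columns it will reference from a Snowflake worksheet (column names as used in the steps below; adjust if your Exercise 2.1 table differs):
-- Inspect the source table and preview the three columns the job uses
DESCRIBE TABLE LEARN_DB.UTILITY.METER_READS;
SELECT premise_zip, read_ts, consumption_kwh
FROM LEARN_DB.UTILITY.METER_READS
LIMIT 5;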
- New Transformation Job: hourly_consumption_by_zip.
- Drag Table Input. Source: LEARN_DB.UTILITY.METER_READS. Select all columns.
- Drag Calculator after it. Add a column read_hour = DATE_TRUNC('hour', read_ts).
- Drag Aggregate. Group by premise_zip, read_hour. Aggregations: SUM(consumption_kwh) AS hourly_kwh, COUNT(*) AS reads_in_hour.
- Drag Rewrite Table. Target: LEARN_DB.UTILITY.HOURLY_CONSUMPTION_BY_ZIP.
- Wire them in order: Input → Calculator → Aggregate → Rewrite Table.
- Click Validate. Matillion should accept the graph.
- Click Run Job. Watch it execute.
Step 4: See the SQL Matillion generated
This is the magic moment. In the job detail panel, find the "SQL" or "Generated SQL" view. Matillion compiled your 4-node graph into a single statement like:
CREATE OR REPLACE TABLE LEARN_DB.UTILITY.HOURLY_CONSUMPTION_BY_ZIP AS
SELECT
premise_zip,
DATE_TRUNC('hour', read_ts) AS read_hour,
SUM(consumption_kwh) AS hourly_kwh,
COUNT(*) AS reads_in_hour
FROM LEARN_DB.UTILITY.METER_READS
GROUP BY premise_zip, DATE_TRUNC('hour', read_ts);
The entire transformation (4 visual nodes) becomes one SQL statement. Snowflake executes it. Matillion did zero compute.
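If you want to see the push-down from the Snowflake side, the query history shows the statement arriving from Matillion and running on LEARN_WH. A minimal check (assuming your role can read query history; the ILIKE filter is just an illustrative way to find the statement):
-- Recent statements touching the target table; the Matillion-generated CTAS should appear here
SELECT start_time, warehouse_name, total_elapsed_time, query_text
FROM TABLE(LEARN_DB.INFORMATION_SCHEMA.QUERY_HISTORY(RESULT_LIMIT => 100))
WHERE query_text ILIKE '%HOURLY_CONSUMPTION_BY_ZIP%'
ORDER BY start_time DESC;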
Step 5: Wrap in an Orchestration Job
- New Orchestration Job: nightly_consumption_refresh.
- Drag Run Transformation and select hourly_consumption_by_zip.
- Drag Send Email after it. Add your email on the Success path.
- Validate, Run.
That's the full Matillion pattern: Orchestration wraps Transformation wraps SQL.
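For intuition only (this is not part of the exercise, and not something Matillion creates for you): the nightly pattern boils down to re-running that one generated statement on a schedule. A hypothetical native-Snowflake analogue, just to make the "wraps SQL" point concrete:
-- Illustration only: the generated CTAS re-run on a schedule as a Snowflake task
CREATE OR REPLACE TASK NIGHTLY_CONSUMPTION_REFRESH_DEMO
  WAREHOUSE = LEARN_WH
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  CREATE OR REPLACE TABLE LEARN_DB.UTILITY.HOURLY_CONSUMPTION_BY_ZIP AS
  SELECT premise_zip,
         DATE_TRUNC('hour', read_ts) AS read_hour,
         SUM(consumption_kwh) AS hourly_kwh,
         COUNT(*) AS reads_in_hour
  FROM LEARN_DB.UTILITY.METER_READS
  GROUP BY premise_zip, DATE_TRUNC('hour', read_ts);
-- Tasks are created suspended; drop the demo task if you created it.
The value of the Matillion version is that the schedule, the email notification, and the success/failure paths live in the orchestration job rather than in warehouse-side objects.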
Step 6: Verify in Snowflake & cleanup
SELECT * FROM LEARN_DB.UTILITY.HOURLY_CONSUMPTION_BY_ZIP LIMIT 20;
SELECT COUNT(*) FROM LEARN_DB.UTILITY.HOURLY_CONSUMPTION_BY_ZIP;
Expect roughly one row per zip per hourly bucket: about 7,200 rows for each day of meter reads (300 zips × 24 hours). Drop when done:
DROP TABLE IF EXISTS LEARN_DB.UTILITY.HOURLY_CONSUMPTION_BY_ZIP;
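To sanity-check the ~$0.02 estimate from the prereq note, you can ask Snowflake what the trial warehouse actually consumed (names assumed from Exercise 2.1; metering figures can lag by a few minutes):
-- Credits burned by LEARN_WH over the last day
SELECT start_time, end_time, credits_used
FROM TABLE(LEARN_DB.INFORMATION_SCHEMA.WAREHOUSE_METERING_HISTORY(
  DATE_RANGE_START => DATEADD('day', -1, CURRENT_TIMESTAMP()),
  WAREHOUSE_NAME => 'LEARN_WH'))
ORDER BY start_time;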
Interview talking point (save to notepad)
"In Matillion I authored the transformation as a visual graph, but the generated SQL is a single CREATE OR REPLACE TABLE statement pushed down to Snowflake. So Matillion orchestrates and Snowflake computes, which means performance tuning happens at the warehouse-sizing level, not the Matillion level."