Credits & Further Reading
Where to go deeper on any module, authoritative sources for facts in this course, and the license notes for anything embedded.
Going deeper · by module
If a module isn't enough, or you want to push past what this course covers, these are the trustworthy places to go next.
Module 1 · DE Orientation
- Fundamentals of Data Engineering, by Joe Reis & Matt Housley. The canonical modern-DE textbook. Read chapters 1-3 for the mental model.
- Data Mesh (Zhamak Dehghani). Useful conceptual context even if you don't adopt it.
- Medallion architecture: Databricks glossary.
Module 2 · Snowflake
- Snowflake Docs: the authoritative source. Use the search bar liberally.
- Snowflake Quickstarts: short, free, hands-on tutorials. Great for interview-prep demos.
- Snowflake free on-demand training (Snowflake University). Start with "Data Warehousing Workshop" and "Data Applications Workshop."
- Warehouse considerations: internal doc, often cited in interviews.
- select.dev blog: practical Snowflake cost optimization posts.
Module 3 · Python for DE
- snowflake-connector-python docs.
- Snowpark for Python docs: dataframe-style in-warehouse transforms.
- boto3 official docs.
- Python for Data Analysis, by Wes McKinney (the pandas author). Worth reading if you want to build full pandas fluency.
Module 4 · AWS for DE
- AWS Glue docs.
- Step Functions docs.
- AWS Free Tier details.
- AWS Cloud Quest: Data Analytics (free on AWS Skill Builder with a free account). Gamified walkthrough.
Module 5 · Matillion
- Matillion docs.
- Matillion Academy: free, self-paced courses directly from the vendor. If the interview mentions Matillion specifically, do the "Data Productivity Cloud Essentials" track.
- Matillion YouTube channel.
Module 6 · Git + Jenkins for DE
- schemachange on GitHub.
- dbt docs: if you hit Jenkins pipelines for transformations, dbt is often involved even if not on the JD.
- Your existing expertise covers this module already. Read the schemachange README, skim the Jenkins plugin for Snowflake, done.
Module 7 · Governance
- Snowflake Access Control overview.
- Column-level security intro.
- Row Access Policies.
- Snowflake Security Best Practices (whitepaper).
Module 8 · Streaming
- Apache Kafka documentation.
- Confluent Developer: free, high-quality Kafka tutorials.
- Kinesis Data Streams dev guide.
- Snowpipe Streaming overview.
- Designing Data-Intensive Applications, by Martin Kleppmann. For the deepest possible understanding of log-based systems, read chapter 11.
Module 9 · Optimization + Utility
- Snowflake Query Profile guide.
- NARUC (National Association of Regulatory Utility Commissioners): industry regulator context.
- IEEE Smart Grid resources for reliability metrics (SAIDI/SAIFI/CAIDI).
- DOE intro to the electric grid.
- "How utilities use analytics": vendor-agnostic overview.
Cost transparency: quick reference
Repeats the cost block from the dashboard for easy reference while you set up accounts.
| Service | Free tier | What you pay for | Guardrail |
|---|---|---|---|
| Snowflake | 30-day trial, $400 credits | Warehouse compute (per-second), storage | AUTO_SUSPEND=60, XSMALL only |
| AWS S3 | 5GB/mo forever | Storage above 5GB, requests, transfer | Billing alarm at $1 |
| AWS Lambda | 1M invocations/mo forever | Above 1M, or long-running memory-heavy functions | 128MB memory for learning; 15-min max |
| AWS Glue | None (1M Catalog requests free) | $0.44/DPU-hour, billed per-second w/ 1-min min | Minimum DPUs; delete jobs after module |
| AWS Step Functions | 4k state transitions/mo forever (Standard) | Above 4k | Short state machines only |
| AWS Kinesis | None | $0.015/shard/hr + $0.014/M records | Create → test → delete within session |
| Matillion Hub | 14-day full trial + ongoing metered | Credits for compute-intensive runs | Complete Module 5 inside 14 days; free tier covers Module 5 labs |
| Kafka (local Docker) | Free | Your laptop CPU/RAM | n/a |
| Jenkins (local Docker) | Free | Your laptop CPU/RAM | n/a |
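To make the Glue and Kinesis rows concrete, the arithmetic can be sketched in a few lines of Python. This is a rough estimator, not an AWS API: the function and constant names are hypothetical, the rates are copied from the table above (they vary by region and change over time), and it assumes each Kinesis record counts as one billed unit and that shard time is billed proportionally.

```python
import math

# Rates copied from the table above; treat them as assumptions and
# check the current AWS pricing pages for your region.
GLUE_USD_PER_DPU_HOUR = 0.44
GLUE_MIN_BILLED_SECONDS = 60
KINESIS_USD_PER_SHARD_HOUR = 0.015
KINESIS_USD_PER_MILLION_RECORDS = 0.014


def glue_job_cost(dpus: int, runtime_seconds: float) -> float:
    """Estimate one Glue job run: per-second billing with a 1-minute minimum."""
    billed_seconds = max(math.ceil(runtime_seconds), GLUE_MIN_BILLED_SECONDS)
    return dpus * (billed_seconds / 3600) * GLUE_USD_PER_DPU_HOUR


def kinesis_session_cost(shards: int, hours: float, records: int) -> float:
    """Estimate a stream that exists for `hours` and ingests `records` records."""
    shard_cost = shards * hours * KINESIS_USD_PER_SHARD_HOUR
    record_cost = (records / 1_000_000) * KINESIS_USD_PER_MILLION_RECORDS
    return shard_cost + record_cost
```

For example, `glue_job_cost(2, 600)` (a 10-minute run on 2 DPUs) comes to roughly $0.15, which is why the guardrail focuses on minimum DPUs and deleting jobs rather than on any single run.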
Generalizable DE skills (beyond this JD)
If this role falls through and you want to aim at DE more broadly, these are the pieces we intentionally didn't cover because they weren't in the posting.
- dbt: the analytics-engineering framework. Transformation-as-code, tested, versioned. If the next DE interview mentions it, the 2-hour dbt fundamentals course on the dbt Labs site gets you conversant.
- Airflow: the most common non-Matillion orchestrator. Same job-as-DAG mental model.
- Spark: Glue ETL jobs run on Spark, but Spark itself (PySpark API, DataFrame ops, tuning) is worth its own study if you're going EMR/Databricks/Glue-ETL heavy.
- Data modeling: dimensional modeling (Kimball), data vault, normalization basics. Generally valuable, only lightly touched here.
- Delta Lake / Iceberg / Hudi: open table formats. Increasingly common, especially for Databricks/Snowflake-Iceberg interop.
- Observability for DE: Monte Carlo, DataDog for DE, OpenLineage. Not on the JD, but this is where DevOps senses pay off.
- GCP BigQuery / Azure Synapse: the other cloud warehouses. Concepts transfer.
Attribution
This course does not embed third-party images, audio, or video. External sites are linked, not embedded. All content was authored for this curriculum.