Module 8 · Real-Time Data Streaming
AMI (Advanced Metering Infrastructure) generates millions of meter reads per hour – a perfect streaming use case. The skill requirement names both Kinesis and Kafka, so we build both, then land the data in Snowflake.
Required skill · s6 (paraphrased)
Real-time data streaming: AWS Kinesis and Apache Kafka.
Your coverage: Gap
What we're assuming
- Log-based systems: you've used search/logging stacks (append-only-log + consumer mental model is familiar).
- Queue/broker-adjacent: you've worked with Kubernetes event systems and CI/CD webhook chains, so partitioned distributed logs won't feel foreign.
- Docker: you already have the tooling for Kafka's Docker-based local-dev path.
- No direct Kafka/Kinesis production experience.
What we modify for your background
- Skip: "what is a message queue," intro to distributed systems.
- Emphasize: the log-as-source-of-truth mental model, the partition/shard as the unit of parallelism, consumer-group coordination, and exactly-once vs. at-least-once delivery semantics. These are the interview-question drivers.
- Compare-and-contrast: Kinesis vs Kafka operationally (who manages what), since the skill list names both.
- Utility domain: we use meter-reads-per-second as the running example, which maps to the capstone in Module 9.
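The partition-as-parallelism-unit and consumer-group ideas above can be sketched as a toy in-memory model. This is pure Python with no broker; the names (`produce`, `assign`, the worker IDs) are illustrative, not any client library's API:

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # the parallelism ceiling: at most one group member per partition

def partition_for(key: str) -> int:
    # Toy stand-in for Kafka's murmur2 partitioner / Kinesis' hash-key routing:
    # the same key always lands on the same partition, preserving per-meter order.
    # (Python's hash() is only stable within one process; fine for a demo.)
    return hash(key) % NUM_PARTITIONS

# The "log": one append-only list per partition.
log = [[] for _ in range(NUM_PARTITIONS)]

def produce(meter_id: str, reading_kwh: float) -> None:
    log[partition_for(meter_id)].append((meter_id, reading_kwh))

def assign(consumers: list[str], num_partitions: int) -> dict[str, list[int]]:
    # Consumer-group coordination in miniature: partitions are divided among
    # group members, so each record is processed by exactly one member.
    assignment: dict[str, list[int]] = defaultdict(list)
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return dict(assignment)

for meter in ("meter-001", "meter-002", "meter-003"):
    for kwh in (1.2, 1.3):
        produce(meter, kwh)

workers = assign(["worker-a", "worker-b"], NUM_PARTITIONS)
for worker, partitions in workers.items():
    # Each worker drains only its own partitions; together they cover the log once.
    for p in partitions:
        for record in log[p]:
            pass  # process(record)
```

Note what the model makes obvious: adding a third worker helps only until the worker count reaches the partition count, which is why partition/shard count is a capacity-planning decision, not an afterthought.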
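The meter-reads-per-second example also gives us a concrete sizing exercise. Here is a back-of-the-envelope sketch for a provisioned Kinesis Data Stream, assuming a 2-million-meter fleet reporting every 15 minutes at ~500 bytes per read (illustrative numbers, not from the module); the per-shard write limits of 1,000 records/s and 1 MB/s are Kinesis' published quotas:

```python
import math

# Assumed fleet numbers (illustrative):
METERS = 2_000_000
INTERVAL_S = 15 * 60      # each meter reports every 15 minutes
RECORD_BYTES = 500        # one serialized meter read, roughly

# Per-shard write limits for provisioned Kinesis Data Streams:
SHARD_RECORDS_PER_S = 1_000
SHARD_BYTES_PER_S = 1_000_000  # 1 MB/s

records_per_s = METERS / INTERVAL_S          # ~2,222 records/s
bytes_per_s = records_per_s * RECORD_BYTES   # ~1.1 MB/s

shards = max(
    math.ceil(records_per_s / SHARD_RECORDS_PER_S),  # limit by record count
    math.ceil(bytes_per_s / SHARD_BYTES_PER_S),      # limit by bandwidth
)
print(shards)  # prints 3: record count, not bandwidth, is the binding limit here
```

The same arithmetic drives Kafka partition-count choices, except there the per-partition ceiling is set by your brokers and consumers rather than a published quota.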
Want more depth?
See Credits – Module 8. Designing Data-Intensive Applications (Kleppmann) Ch. 11 is the definitive conceptual foundation. Confluent Developer is the best free Kafka training.
Lessons
- Lesson 8.1 – The Streaming Mental Model (logs, partitions, consumer groups) crash deep
- Lesson 8.2 – Kafka Fundamentals (local Docker walkthrough) crash deep
- Lesson 8.3 – AWS Kinesis – Streams, Firehose, and Where Each Fits crash deep
- Lesson 8.4 – Ingesting into Snowflake – Snowpipe vs Snowpipe Streaming deep