Lesson 9.3: Utility Data 101
Your biggest differentiator opportunity. The JD specifically asks for utility-data-operations experience. You don't have it, but this lesson gives you enough vocabulary and mental model that you can talk intelligently about CIS, GIS, AMI, OMS, and why the industry is the way it is. 30 minutes to become dangerous.
The five acronyms you will hear constantly
| Acronym | Stands for | What it holds | Who owns it |
|---|---|---|---|
| CIS | Customer Information System | Accounts, service agreements, billing cycles, payment history | Usually an SAP, Oracle CC&B, or similar enterprise suite. Billing team owns. |
| GIS | Geographic Information System | Spatial system of record for assets: poles, transformers, substations, circuits, service points | Typically Esri ArcGIS. Asset Management team owns. |
| AMI | Advanced Metering Infrastructure | Smart meters + head-end system. Generates the bulk of a modern utility's data volume: 15-minute interval reads from millions of meters. | AMI vendor head-end (Itron, Landis+Gyr, Sensus) plus in-house MDM (Meter Data Management). |
| OMS | Outage Management System | Outage events: when, where, how long, affected customers, root cause | Typically Oracle NMS, GE PowerOn, or similar. Operations team owns. |
| DER | Distributed Energy Resources | Customer-side generation/storage: rooftop solar, batteries, EV chargers | Emerging; often a separate DERMS product. |
The meter-to-cash flow (THE flow to know)
If an interviewer asks "walk me through how a utility's data flows," this is the answer:
Physical meter                                              Customer bill
     │                                                            ▲
     ▼                                                            │
AMI head-end ───► MDM ──────► VEE ────────► Usage table ───► Rate engine
  (vendor)        (dedupe,    (validate,    (CIS joins,      (rules engine
                   convert     estimate,     rating)          in CIS)
                   format)     edit)
Supporting lookups:
GIS: service-point-to-meter-to-circuit-to-substation
CIS: account status, service agreement, rate plan
OMS: was this meter out during read window?
Key terms:
- Meter-to-cash: the end-to-end revenue pipeline. Every data decision touches it.
- VEE: Validation, Estimation, Editing. Regulatory-required processing of raw reads before they can be billed. Catches zero reads from dead meters, interpolates missing intervals, flags outliers for analyst review. Audit trails required by state PUCs.
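The VEE step can be sketched in a few lines of Python. This is a minimal illustration, not a regulatory-grade implementation: the `Read` record, the `vee` function, the linear-interpolation estimator, and the `max_kwh` threshold are all assumptions made for the example. Real MDM products apply rule sets that vary by utility and regulator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Read:
    """One 15-minute interval read. All field names are hypothetical."""
    interval: int            # index of the 15-minute interval within the day
    kwh: Optional[float]     # None means the head-end delivered no value
    flag: str = "RAW"        # RAW, ESTIMATED, or REVIEW


def vee(reads: list[Read], max_kwh: float = 10.0) -> list[Read]:
    """Return new Read records; the raw input list is never mutated."""
    out: list[Read] = []
    for i, r in enumerate(reads):
        if r.kwh is None:
            # Estimation: linear interpolation between the nearest known reads.
            prev = next((p for p in reversed(reads[:i]) if p.kwh is not None), None)
            nxt = next((n for n in reads[i + 1:] if n.kwh is not None), None)
            if prev is not None and nxt is not None:
                frac = (r.interval - prev.interval) / (nxt.interval - prev.interval)
                est = prev.kwh + frac * (nxt.kwh - prev.kwh)
                out.append(Read(r.interval, round(est, 3), "ESTIMATED"))
            else:
                # No neighbor on one side: leave the gap for an analyst.
                out.append(Read(r.interval, None, "REVIEW"))
        elif r.kwh < 0 or r.kwh > max_kwh:
            # Validation: implausible values are flagged, not silently fixed.
            out.append(Read(r.interval, r.kwh, "REVIEW"))
        else:
            out.append(r)
    return out


raw = [Read(0, 1.0), Read(1, None), Read(2, 2.0), Read(3, 99.0)]
cleaned = vee(raw)
```

Here the missing interval 1 is estimated halfway between its neighbors and the 99 kWh read is flagged for review. The raw list is left untouched, which mirrors the audit-trail requirement: the estimate sits alongside the original read rather than replacing it.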
The reliability trio: SAIDI / SAIFI / CAIDI
The three metrics every utility reports to its public utility commission. Expect to see dashboard queries for these:
| Metric | Formula | Plain English |
|---|---|---|
| SAIDI | Σ customer-minutes of interruption ÷ total customers | "How many minutes the average customer was out." |
| SAIFI | Σ customer interruptions ÷ total customers | "How many outages the average customer experienced." |
| CAIDI | SAIDI ÷ SAIFI | "Average length of an outage when it happens." |
If SAIDI is rising and SAIFI is steady, outages are lasting longer (CAIDI is up): restoration is slowing. If SAIFI is rising and SAIDI is steady, more but shorter outages are happening: prevention is slipping. DE teams maintain the pipelines that compute these; every exec dashboard touches them.
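The three formulas translate directly into code. The sketch below assumes a hypothetical `Outage` record with a customer count and a duration; real OMS exports carry far more fields, and regulators typically exclude momentary interruptions and major-event days, which this example ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outage:
    """One sustained interruption. Field names are hypothetical."""
    customers_affected: int
    duration_minutes: float


def reliability_indices(outages: list[Outage], total_customers: int) -> dict[str, float]:
    # Numerators from the table: customer-minutes and customer interruptions.
    customer_minutes = sum(o.customers_affected * o.duration_minutes for o in outages)
    customer_interruptions = sum(o.customers_affected for o in outages)
    saidi = customer_minutes / total_customers
    saifi = customer_interruptions / total_customers
    # CAIDI is undefined with zero interruptions; report 0.0 in that case.
    caidi = saidi / saifi if saifi else 0.0
    return {"SAIDI": saidi, "SAIFI": saifi, "CAIDI": caidi}


events = [Outage(500, 60), Outage(200, 30)]
indices = reliability_indices(events, total_customers=1000)
```

With these two made-up events and 1,000 customers served, SAIDI is 36 minutes, SAIFI is 0.7 interruptions, and CAIDI is about 51.4 minutes.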
Why utility data is unusual (the industry texture)
- Highly regulated. State PUCs require audit trails. You cannot overwrite meter reads. Append-only, with revision flags.
- Slow clocks. Utilities change slowly. A platform built 5 years ago is "new." Legacy pockets abound (mainframe CIS, on-prem GIS).
- Spiky volumes. AMI pushes bulk data 4 times per hour. Outage events push huge bursts when a storm hits. Design for the burst, not the average.
- Safety criticality. Bad data isn't just annoying; it's regulatory risk. A dropped outage record can mean a regulator fine. A mis-rated bill generates regulatory complaints.
- Merger/acquisition baggage. Utilities consolidate. Expect 3 to 4 CIS systems coexisting, each with its own ID space, and a migration project that's been "nearly done" for years.
- DER transition. The industry is shifting from "utilities sell power to customers" to "customers also produce power." Many DE problems are re-modeling around bidirectional flow.
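The "append-only, with revision flags" rule from the first bullet can be sketched as a small in-memory log. The `ReadVersion` and `ReadLog` names, the integer `revision` counter, and the sample timestamp are assumptions for illustration; in a warehouse the same idea usually appears as an insert-only table plus a "latest revision" view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadVersion:
    meter_id: str
    interval_start: str   # ISO timestamp kept as a string for brevity
    kwh: float
    revision: int         # 0 = original read, 1+ = later corrections
    reason: str


class ReadLog:
    """Append-only log: corrections add rows, nothing is updated in place."""

    def __init__(self) -> None:
        self._rows: list[ReadVersion] = []

    def _versions(self, meter_id: str, interval_start: str) -> list[ReadVersion]:
        return [r for r in self._rows
                if r.meter_id == meter_id and r.interval_start == interval_start]

    def append(self, meter_id: str, interval_start: str,
               kwh: float, reason: str) -> ReadVersion:
        # The next revision number is simply the count of earlier versions.
        revision = len(self._versions(meter_id, interval_start))
        row = ReadVersion(meter_id, interval_start, kwh, revision, reason)
        self._rows.append(row)
        return row

    def current(self, meter_id: str, interval_start: str) -> ReadVersion:
        """Latest revision, the value billing would use. Raises ValueError if none."""
        return max(self._versions(meter_id, interval_start), key=lambda r: r.revision)

    def history(self, meter_id: str, interval_start: str) -> list[ReadVersion]:
        """Full audit trail, oldest first."""
        return sorted(self._versions(meter_id, interval_start), key=lambda r: r.revision)


log = ReadLog()
log.append("M-1", "2024-04-18T00:00", 0.0, "raw read")
log.append("M-1", "2024-04-18T00:00", 1.2, "VEE estimate for dead-meter zero")
```

After the correction, `log.current(...)` returns the 1.2 kWh revision while `log.history(...)` still shows the original zero read, which is the property an auditor cares about.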
The connection model (simplified)
Substation ── Circuit ── Transformer ── Service Point ── Meter ── Premise ── Account ── Customer
    │            │            │               │            │                    │
  (GIS)        (GIS)        (GIS)           (GIS)        (AMI)                (CIS)
Most interesting DE problems involve joining this chain:
- "All customers on Transformer T-42, their average consumption last Tuesday" → AMI × GIS × CIS join.
- "How many customers lost power during the 4/18 outage?" → OMS × GIS × CIS join.
- "Where should we prioritize vegetation management?" → OMS × GIS (pole locations + tree risk).
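The first question above can be sketched with plain dictionaries standing in for the three systems. Every ID and table name here is made up for illustration; in practice this would be a SQL join across GIS, AMI, and CIS tables in the warehouse.

```python
from collections import defaultdict
from statistics import mean

# GIS: which transformer feeds each service point (hypothetical IDs).
gis_transformer_for_service_point = {"SP-1": "T-42", "SP-2": "T-42", "SP-3": "T-7"}
# GIS/AMI: the meter installed at each service point.
meter_at_service_point = {"SP-1": "M-1", "SP-2": "M-2", "SP-3": "M-3"}
# CIS: the account billed for each meter.
cis_account_for_meter = {"M-1": "A-100", "M-2": "A-200", "M-3": "A-300"}
# AMI: (meter_id, kwh) interval reads for the day in question.
ami_reads = [("M-1", 1.0), ("M-1", 3.0), ("M-2", 4.0), ("M-3", 9.0)]


def avg_kwh_by_account(transformer_id: str) -> dict[str, float]:
    # Step 1 (GIS): walk transformer -> service points -> meters.
    meters = {
        meter_at_service_point[sp]
        for sp, xfmr in gis_transformer_for_service_point.items()
        if xfmr == transformer_id
    }
    # Step 2 (AMI): collect interval reads for those meters only.
    per_meter: dict[str, list[float]] = defaultdict(list)
    for meter_id, kwh in ami_reads:
        if meter_id in meters:
            per_meter[meter_id].append(kwh)
    # Step 3 (CIS): report the averages by account rather than by meter.
    return {cis_account_for_meter[m]: mean(vals) for m, vals in per_meter.items()}


result = avg_kwh_by_account("T-42")
```

For transformer T-42 this yields an average of 2.0 kWh for account A-100 and 4.0 kWh for account A-200; meter M-3 is excluded because it sits on a different transformer.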
How a modern utility DE stack might actually look
[CIS] ──► [Kafka (CDC via Debezium)] ──► [S3 raw] ──► [Snowflake bronze]
                                                              │
[GIS] ──► [Nightly API pull (Python)] ───────────────────────►┤
                                                              │
[AMI head-end] ──► [Kinesis/Kafka] ───────────────► [Snowpipe Streaming]
                                                              │
[OMS] ──► [JDBC pull into Glue] ─────────────────────────────►┤
                                                              ▼
                                               [Matillion / dbt Silver + Gold]
                                                              │
                                                              ▼
                                             [Analyst BI, ML, regulator reports]
Notice: every box in this stack is something you've now touched in this course.
The domain-fluency answer (keep it ready)
"I understand the core data systems (CIS for customers and billing, GIS for assets, AMI for meter reads, OMS for outages) and the meter-to-cash flow with its VEE step before billing. The reliability KPIs utilities report (SAIDI, SAIFI, CAIDI) are what most exec dashboards are about. I'd design DE pipelines with the industry's regulatory rigor in mind: append-only raw, audit trails on transformations, and data-quality tests as CI checks."