AI Email Triage for Construction: What Actually Works

OpsForEnergy · 7 min read

On a typical Monday, a solar EPC project manager receives 80–120 emails. Roughly 60% are operational: permit updates, crew check-ins, subcontractor documents, vendor invoices, and inspection notices. Another 20% are internal coordination. The remaining 20% are noise — newsletters, cold outreach, and thread replies that no longer require action.

The PM's first job every morning is triage. She scans subject lines, opens the ones that look urgent, and mentally sorts them into buckets. This takes 30–45 minutes. And it is almost entirely pattern-matching — exactly the kind of task a language model can do.

I built an email triage classifier for construction inboxes and tested it on 500 real emails from three solar EPCs. Here is what worked, what did not, and how to build a taxonomy that an agent can actually use.

The taxonomy has six top-level categories, each with sub-types:

  • Permit: status update, revision request, approval, inspection scheduled, inspection result
  • Field: crew check-in, daily report, site photo, blocker report, safety observation
  • Document: closeout item, lien waiver, as-built, O&M manual, warranty
  • Vendor: invoice, PO confirmation, delivery update, equipment issue
  • Internal: scheduling, meeting note, general coordination
  • Noise: newsletter, spam, completed thread, unrelated
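The taxonomy above can be encoded as plain data, so an agent can render it into the prompt and validate classifier output against it. A minimal sketch in Python (names illustrative):

```python
# Encoding of the six-category taxonomy as plain data.
TAXONOMY = {
    "permit": ["status update", "revision request", "approval",
               "inspection scheduled", "inspection result"],
    "field": ["crew check-in", "daily report", "site photo",
              "blocker report", "safety observation"],
    "document": ["closeout item", "lien waiver", "as-built",
                 "O&M manual", "warranty"],
    "vendor": ["invoice", "PO confirmation", "delivery update",
               "equipment issue"],
    "internal": ["scheduling", "meeting note", "general coordination"],
    "noise": ["newsletter", "spam", "completed thread", "unrelated"],
}

def is_valid_label(category: str, sub_type: str) -> bool:
    """Check that a (category, sub-type) pair exists in the taxonomy."""
    return sub_type in TAXONOMY.get(category, [])

def taxonomy_text() -> str:
    """Render the taxonomy as a bulleted string for the prompt."""
    return "\n".join(f"- {cat}: {', '.join(subs)}"
                     for cat, subs in TAXONOMY.items())
```

Validating the model's `(category, sub_type)` pair against this table catches hallucinated labels before they reach the inbox.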

The classifier is a prompt-engineered Claude Sonnet 4 call. The prompt includes the email subject and body, the taxonomy with definitions, and instructions to return a JSON object with category, sub-type, urgency (low/medium/high), and a one-line summary.
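A minimal sketch of that call, assuming the Anthropic Python SDK. The prompt wording and the parsing helper below are illustrative, not the exact tested prompt:

```python
import json

# Illustrative prompt template (an assumption, not the author's exact prompt).
PROMPT_TEMPLATE = """You are triaging a construction project inbox.
Classify the email below using this taxonomy:

{taxonomy}

Return ONLY a JSON object with keys: "category", "sub_type",
"urgency" (one of "low", "medium", "high"), and "summary" (one line).

Subject: {subject}
Body:
{body}"""

def build_prompt(subject: str, body: str, taxonomy: str) -> str:
    return PROMPT_TEMPLATE.format(taxonomy=taxonomy, subject=subject, body=body)

def parse_classification(raw: str) -> dict:
    """Parse the model's JSON reply, tolerating markdown code fences."""
    text = raw.strip()
    if text.startswith("```"):
        text = text.removeprefix("```json").removeprefix("```")
        text = text.removesuffix("```")
    result = json.loads(text.strip())
    missing = {"category", "sub_type", "urgency", "summary"} - result.keys()
    if missing:
        raise ValueError(f"classifier reply missing keys: {missing}")
    return result

# The live call goes through the Anthropic SDK, roughly (model string is
# a placeholder):
#   resp = client.messages.create(model="claude-sonnet-4-...", max_tokens=300,
#                                 messages=[{"role": "user", "content": prompt}])
#   result = parse_classification(resp.content[0].text)
```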

The results: Across 500 emails, the classifier achieved 89% accuracy at the top level and 82% at the sub-type level. The most common errors were misclassifying vendor invoices as documents, and confusing internal scheduling emails with field crew check-ins when the subject line was vague.

What improved accuracy: Adding few-shot examples with edge cases raised sub-type accuracy to 91%. Specifically, examples of invoices with attachment-only bodies, and of check-ins sent from personal Gmail addresses, helped the model generalize.
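Those few-shot examples can be kept as structured data and rendered into the prompt. A sketch with hypothetical examples of the two edge cases (none of these are real project emails):

```python
import json

# Hypothetical few-shot examples covering the two edge cases described above.
FEW_SHOT = [
    {   # invoice with an attachment-only body
        "subject": "Invoice 10422",
        "body": "(attachment: INV-10422.pdf; no body text)",
        "label": {"category": "vendor", "sub_type": "invoice",
                  "urgency": "medium", "summary": "Vendor invoice attached."},
    },
    {   # crew check-in sent from a personal Gmail address
        "subject": "on site",
        "body": "crew of 4 here, starting racking on rows 3-5",
        "label": {"category": "field", "sub_type": "crew check-in",
                  "urgency": "low", "summary": "Crew of four starting racking."},
    },
]

def render_examples(examples: list[dict]) -> str:
    """Format few-shot examples as input/output pairs for the prompt."""
    return "\n\n".join(
        f"Subject: {ex['subject']}\nBody: {ex['body']}\n"
        f"Output: {json.dumps(ex['label'])}"
        for ex in examples
    )
```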

What did not help: Giving the model access to the full email thread history actually hurt accuracy by 4%. The extra context introduced noise. For triage, the most recent email is usually enough.

An honest limitation: This classifier works well for operational inboxes with predictable senders. It breaks when the inbox is flooded with cold sales emails that mimic operational language — for example, a subject line like "Urgent: Project Update" from a marketing automation tool. Those still require a human in the loop or a sender allowlist.
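One mitigation is a pre-filter ahead of the classifier. A sketch assuming a maintained sender allowlist; the domains and keyword list are placeholders:

```python
from email.utils import parseaddr

# Hypothetical allowlist of known operational sender domains (placeholders).
ALLOWED_DOMAINS = {"cityplanning.example.gov", "subco.example.com"}
# Urgent-sounding phrases that cold outreach tends to mimic.
URGENT_BAIT = ("urgent", "action required", "project update")

def route(from_header: str, subject: str) -> str:
    """Route allowlisted senders straight to the classifier; flag unknown
    senders that use urgent-sounding language for human review."""
    _, addr = parseaddr(from_header)
    domain = addr.rsplit("@", 1)[-1].lower() if "@" in addr else ""
    if domain in ALLOWED_DOMAINS:
        return "classify"
    if any(phrase in subject.lower() for phrase in URGENT_BAIT):
        return "human_review"
    return "classify"
```

The allowlist does not need to be complete: unknown senders are still classified, and only the ones mimicking urgency get routed to a human.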

The full taxonomy and prompt template are available below. Use it as a starting point for your own construction inbox agent.


Download

Email Classification Taxonomy — PDF with category definitions, prompt template, and few-shot examples.

Get the taxonomy →