Portfolio

Selected AI engineering work — production AI platform, customer-facing pursuit microsite pattern, token-efficient Claude Code corpus design, and supporting primitives.
Sanitized for publication: customer names, internal product names, employee names, and specific deal amounts removed. Patterns and methods preserved.

Public résumé: https://www.mpt.solutions/michael-tuszynski/ · Blog: https://www.mpt.solutions/


What I build

I build production AI systems that engineering and revenue teams actually use — not demos, not strategy decks. My current platform work codifies a Fortune-1000 professional services org's full pursuit lifecycle as an internal AI platform: a Claude Code plugin marketplace orchestrating Salesforce, Microsoft 365, OneDrive, Anthropic Claude, and AWS Bedrock behind a single operator surface.

What I'm hired for: turning hand-wavy AI initiatives into versioned, auditable platforms with real guardrails, real evaluations, and real adoption.


AI engineering platform (in production, 9+ months)

An AI-native operating system for enterprise presales pursuits — discovery, qualification, statement-of-work lifecycle, risk review, engagement handoff — running on a single context store with native integrations to the enterprise stack.

Scale at last cut:

  • 11 plugins, 60+ slash commands, 562 passing tests
  • ~51K LOC TypeScript
  • Multi-provider model routing (Anthropic Claude primary, AWS Bedrock fallback) with env-var toggle (sketched just below)
  • Prompt-contract framework with versioned input/output schemas
  • Evidence-citation guardrails: High/Medium/Low confidence + dated source URLs required on every claim
  • Session-start hooks that hot-load operational knowledge from a synced channel — institutional memory becomes ambient without a plugin release
  • Prompt caching for shared guardrails and knowledge base
  • Per-command rule contracts
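
A minimal sketch of that env-var provider toggle, in Python for illustration. The platform itself is TypeScript; the variable name and model IDs here are assumptions, not the production values:

```python
# Route a completion to Anthropic (primary) or AWS Bedrock (fallback)
# based on a single environment variable; no code change to switch.
import os

def complete(prompt: str) -> str:
    provider = os.environ.get("AI_PROVIDER", "anthropic")  # hypothetical toggle name
    if provider == "anthropic":
        import anthropic
        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
        msg = client.messages.create(
            model="claude-sonnet-4-5",  # illustrative model ID
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text
    if provider == "bedrock":
        import boto3
        client = boto3.client("bedrock-runtime")
        resp = client.converse(
            modelId="anthropic.claude-sonnet-4-20250514-v1:0",  # illustrative model ID
            messages=[{"role": "user", "content": [{"text": prompt}]}],
        )
        return resp["output"]["message"]["content"][0]["text"]
    raise ValueError(f"unknown provider: {provider}")
```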

Outcomes (anonymized):

  • Compressed hours of manual artifact production into command-driven workflows
  • Enabled $0.5M+ and $1M+ enterprise engagements to close on platform-authored statements of work, including iterative revisions across multiple direction changes
  • Landed partner co-funding on one of the same engagements through a partner-funded engagement program
  • Productized an AI Well-Architected workshop at a small-five-figure list price; first paying customer in flight at time of writing

The pattern: start as the engineer in the room shipping custom work for one customer, then pull the patterns into a framework the next ten field engineers can run without me.


Customer-facing pursuit microsite pattern

For an enterprise AI agent engagement in K-12 EdTech, I built a customer-facing pursuit microsite in days — same Claude-on-platform motion as the internal SOW workflow, applied to an external artifact instead of an internal deliverable. The microsite anchored follow-up conversations with the client's CEO and the hyperscaler co-sell partner; the pursuit moved from ideation workshop to a verbal Phase-1 commitment inside three weeks.

Pattern observations:

  • AI-listener-agent architecture on AWS Bedrock + ECS Fargate behind WebSocket API Gateway, with OpenSearch Serverless for RAG and Transcribe/Polly for voice — exactly the stack other K-12 vendors are converging on
  • A POC built live in the ideation workshop (single-file Flask + Anthropic, ~3,700 lines, eight views — teacher dashboard, adaptive student chat, district admin metrics, parent portal, classroom simulation) doubled as a conversation piece that pulled stakeholders into the architecture conversation faster than slides ever did; a minimal sketch of that shape follows this list
  • The microsite + POC + workshop combination collapses what used to be three serial deliverables into one parallel motion
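
For flavor, the smallest slice of that single-file Flask + Anthropic shape: a hedged sketch of one view plus one chat endpoint, assuming nothing from the actual POC beyond the stack named above:

```python
# The skeleton of a single-file Flask + Anthropic POC: one HTML view,
# one JSON chat route. Persona and copy are placeholders.
import anthropic
from flask import Flask, jsonify, request

app = Flask(__name__)
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

@app.get("/")
def index():
    return "<h1>Adaptive student chat (POC)</h1>"

@app.post("/chat")
def chat():
    prompt = request.get_json()["message"]
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # illustrative model ID
        max_tokens=512,
        system="You are an adaptive tutor for K-12 students.",  # assumed persona
        messages=[{"role": "user", "content": prompt}],
    )
    return jsonify({"reply": msg.content[0].text})

if __name__ == "__main__":
    app.run(debug=True)
```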

The artifact pattern is portable: an AI pursuit microsite is cheap to build, lands differently than a deck, and serves as the persistent reference point through Phase-1/Phase-2 sequencing.


Token-efficient Claude Code corpus design

A reproducible methodology for keeping a multi-plugin Claude Code corpus lean as it grows. Took a production corpus from 221K tokens → 179K tokens (a 19% reduction) without removing a single procedural step or contract term.

The numbers:

| Pool | Before (tokens) | After (tokens) | Saved | Δ |
|---|---:|---:|---:|---:|
| Commands (per invocation) | 146,105 | 111,140 | −34,965 | −23.9% |
| Rules (auto-load every session) | 18,375 | 14,478 | −3,897 | −21.2% |
| Skills (per invocation) | 56,620 | 53,815 | −2,805 | −5.0% |
| Total | 221,100 | 179,433 | −41,667 | −18.8% |

Token counts via tiktoken (cl100k_base) — not Anthropic's tokenizer, but a close enough proxy for relative comparison.
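
The counting itself is a few lines with the tiktoken library; a minimal sketch of the proxy measurement described above:

```python
# Token counting as used for the baselines above: tiktoken's cl100k_base
# encoding as a stand-in for Anthropic's unpublished tokenizer.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(enc.encode(text))
```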

Why it matters: with Claude Code's aggressive prompt caching, the dollar savings on a 50-user team are real but modest. The compounding wins are bigger:

  • Context-window headroom on every command — more room for the actual work
  • Faster first-token latency — less prompt to process up front
  • Cleaner prompts drive better instruction adherence — subjective but real, especially on long workflows
  • Linear scaling — when user count doubles, the savings double

The discipline transfers. Any team building agentic tooling on top of large models hits the same scaling problem.

Seven patterns that drove the savings:

  1. Rules dedup is the cheapest 4K tokens you'll find. Multiple plugins each carried byte-identical context files. Collapsing into a shared core/rules/ directory saved ~3.9K tokens that load in every session, every user, every time. Five-minute fix; permanent return. (A detection sketch follows this list.)
  2. The top decile of commands held 42% of the command corpus. Token distribution is heavily right-skewed. Sort by tokens descending and trim the top decile — beats trimming evenly across the corpus.
  3. Pipeline-banner repetition is invisible but expensive. Step counters like > **Step 3 of 5** repeated at the top of a command AND inside every subsection. Once at the top is enough. Saved ~600–1,200 tokens per affected command.
  4. Auto-resolve fallback boilerplate copy-pastes faster than it earns. Many commands had a six-bullet "if exactly one... if zero, stop and ask... if multiple, stop and ask..." block. Collapsing to one sentence preserved behavior and saved ~400 tokens per command.
  5. Multi-format output templates with 90% overlap should be one base + deltas. If your "JSON output," "markdown output," and "console output" all list the same 12 fields, write one structured template and note format-specific variants as deltas — not three near-duplicates.
  6. Persona reload reminders inside commands are redundant. Personas load once at session start. Reminders like "First, identify the user from ~/.config/<app>/persona.md..." inside individual commands are dead weight.
  7. Skills are different — be careful what you trim. Framework specs that say "follow this exactly, do not skip sections" have load-bearing prose. Skills with code (helper-class definitions, PPTX coordinates with EMU values) — the numbers are the contract. Skills accounted for only 5% of the savings because most of them shouldn't be touched.
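
Pattern 1 is mechanical enough to script. A minimal detection sketch, assuming the plugins/*/rules/*.md layout described in the methodology below; the glob and output format are illustrative:

```python
# Surface byte-identical rule files across plugins; candidates for
# collapsing into a shared core/rules/ directory (pattern 1).
import hashlib
from collections import defaultdict
from pathlib import Path

groups: dict[str, list[Path]] = defaultdict(list)
for path in Path("plugins").glob("*/rules/*.md"):
    digest = hashlib.md5(path.read_bytes()).hexdigest()
    groups[digest].append(path)

for digest, paths in groups.items():
    if len(paths) > 1:  # identical content shipped more than once
        print(f"duplicate rules ({digest[:8]}):")
        for p in sorted(paths):
            print(f"  {p}")
```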

Best practices for new commands:

  1. Default to terse. Most commands need 1.5K–2.5K tokens, not 5K+. If yours is heading past 4K, audit before merging. (A pre-merge guard sketch follows this list.)
  2. Don't restate guardrails. They live in core/rules/. Reference them; don't repeat them.
  3. Don't restate path resolution. "Resolve writes via resolve_output_path()" is enough. The contract is in core/rules/.
  4. Auto-resolve fallback is one sentence. The full if/zero/multiple pattern is implicit. State the rule, not the branches.
  5. Multi-format outputs go in one template with conditional sections.
  6. Banner repetitions: once at the top, never repeat.
  7. Procedural specifics, code, and contract terms stay verbatim. Banned-word lists, governance rules, allocation floors, API IDs, picklist values — never trim.
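
Best practice 1 can be enforced before merge. A sketch of a pre-merge guard using the same tiktoken proxy; the 4K threshold comes from the guidance above, and the glob is an assumption about layout:

```python
# Flag command files whose token count exceeds the audit threshold
# before they merge (best practice 1: default to terse).
import sys
from pathlib import Path
import tiktoken

BUDGET = 4_000  # audit threshold from the guidance above
enc = tiktoken.get_encoding("cl100k_base")

over = []
for path in Path("plugins").glob("*/commands/*.md"):
    tokens = len(enc.encode(path.read_text(encoding="utf-8")))
    if tokens > BUDGET:
        over.append((tokens, path))

for tokens, path in sorted(over, reverse=True):
    print(f"{tokens:>6}  {path}")
sys.exit(1 if over else 0)  # non-zero exit fails the CI check
```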

Methodology (reproducible):

  1. Baseline the corpus. A tools/token-baseline.py script walks plugins/*/{commands,rules,skills}/*.md, counts tokens with tiktoken, and emits CSV + XLSX with per-file detail, totals by type, totals by plugin, and a top-20 sheet. (A stripped-down sketch follows this list.)
  2. Identify the top offenders. Sort by tokens descending. Top decile concentrates the savings.
  3. Audit rules for byte-identical duplication. An md5 across plugin rule files surfaces collapse candidates.
  4. Trim each targeted file. Cut redundant pre-flight prose, repeated path-resolution boilerplate, restated guardrails, banner repetitions, verbose multi-format templates. Keep procedural specifics, code, contract terms.
  5. Re-baseline after each batch. Label snapshots (baseline, after-dedup, after-top7, final). Every snapshot is auditable.
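
A stripped-down sketch in the spirit of tools/token-baseline.py: CSV only, no XLSX or per-plugin rollups, with the file layout assumed from the glob in step 1:

```python
# Walk plugins/*/{commands,rules,skills}/*.md, count tokens per file,
# and emit a CSV snapshot that later runs can be diffed against.
import csv
from pathlib import Path
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
rows = []
for kind in ("commands", "rules", "skills"):
    for path in Path("plugins").glob(f"*/{kind}/*.md"):
        tokens = len(enc.encode(path.read_text(encoding="utf-8")))
        rows.append({"plugin": path.parts[1], "type": kind,
                     "file": str(path), "tokens": tokens})

rows.sort(key=lambda r: r["tokens"], reverse=True)  # top offenders first
with open("baseline.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["plugin", "type", "file", "tokens"])
    writer.writeheader()
    writer.writerows(rows)

print(f"{sum(r['tokens'] for r in rows):,} tokens across {len(rows)} files")
```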

Lean prompts compound. Cluttered prompts compound differently.

Run quarterly, or any time the corpus grows by >10% without functional additions.


Selected AI engineering primitives

  • Cross-platform OneDrive path resolver — disambiguates personal vs. organizational mount paths across macOS and Windows; survives the OneDrive rename/move dance that breaks most tooling
  • Prompt-contract framework — versioned input/output schemas per command, with a contract violation surfacing in the developer console rather than degrading silently (a minimal schema sketch follows this list)
  • Evidence-citation + confidence-disclosure guardrails — every claim in generated output must cite a dated source URL and a High/Medium/Low confidence rating; outputs that can't satisfy the contract refuse to generate
  • Session-start hook that hot-loads operational knowledge — institutional memory becomes ambient without requiring a plugin release; ops team can update guidance in a synced channel and the next session picks it up
  • Multi-provider AI infrastructure — Anthropic Claude (primary) with AWS Bedrock fallback for users without enterprise access; prompt caching for shared guardrails and knowledge base; per-command rule contracts; provider switching via env-var toggle
  • :council multi-agent advisory pattern — pressure-tests recommendations through a 5-advisor council before they reach the user; surfaces dissent that single-agent flows hide
  • Document generation pipeline — branded Microsoft Word output via AI-augmented markdown rendering; preserves enterprise template fidelity while generating from structured content
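
A minimal sketch of the prompt-contract shape, in Python with pydantic for illustration. The production framework is TypeScript; every schema name and field here is hypothetical:

```python
# Versioned output contract for a command: the model's response must parse
# against the schema or the violation is surfaced, never silently degraded.
from typing import Literal
from pydantic import BaseModel, ValidationError

class Claim(BaseModel):
    text: str
    confidence: Literal["High", "Medium", "Low"]  # confidence-disclosure guardrail
    source_url: str
    source_date: str                              # evidence must be dated

class QualificationOutputV2(BaseModel):           # hypothetical command contract
    contract_version: Literal["2"]
    summary: str
    claims: list[Claim]

def validate_output(raw_json: str) -> QualificationOutputV2:
    try:
        return QualificationOutputV2.model_validate_json(raw_json)
    except ValidationError as err:
        # Contract violation surfaces loudly in the developer console
        print(f"[prompt-contract] violation: {err}")
        raise
```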

Background

25 years in engineering leadership and platform work. Former CTO of a streaming platform (20+ person team, $4MM+ budget, 99.98% uptime, 100% YoY user growth, monolith-to-microservices migration). Six years as Senior Solutions Architect at AWS — co-authored the ECS Workshop (canonical hands-on container workshop, thousands of engineers trained), featured speaker on the AWS re:Think Podcast on cost optimization.

UC Berkeley Executive Education — Professional Certificate in Machine Learning and Artificial Intelligence (2025). Harvard Data Science Review — Agentic AI Intensive (December 2025). AWS Certified Solutions Architect Professional + DevOps Engineer Professional.


Personal AI lab — NEXUS

Multi-machine personal R&D workspace where I prototype the patterns that show up in client work. NAS + Mac-mini + laptop topology with ~7 production services (TypeScript/Node), 18+ LaunchAgents, custom Slack bot with auto-discovered slash commands, semantic search over conversation history (Ollama embeddings), self-hosted dashboards behind Cloudflare Tunnels, and integrations with Anthropic Claude, Plaid, Microsoft 365, Synology DSM, GitHub, and Ghost CMS. Provides the test bed for prompt-contract patterns, agent orchestration, hot-loaded knowledge, and evaluation harnesses I later bring to enterprise engagements.
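
The semantic-search piece is small enough to sketch: embed snippets locally through the ollama Python client and rank by cosine similarity. The model name, data shape, and snippets are assumptions:

```python
# Embed conversation snippets locally with Ollama, then rank by cosine
# similarity against a query embedding; no external service required.
import math
import ollama

def embed(text: str) -> list[float]:
    # "nomic-embed-text" is an illustrative choice of local embedding model
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

history = [  # hypothetical conversation snippets
    "debugged the OneDrive path resolver on Windows",
    "drafted the v2 prompt-contract schema for the SOW command",
]
index = [(text, embed(text)) for text in history]

query = embed("path resolution bug")
for text, vec in sorted(index, key=lambda pair: cosine(query, pair[1]), reverse=True):
    print(f"{cosine(query, vec):.3f}  {text}")
```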


Public writing

Essays and technical notes at The Cloud Codex: https://www.mpt.solutions/

Inquiries: mike@mpt.solutions · 415-793-4717 · Full résumé (PDF)
