Why I built this

From a private checklist to an open framework

A conversation between Aakash Gupta and Jaclyn Konzelmann, Google's Director of AI Product, on Aakash's podcast genuinely inspired me. Jaclyn's evaluation framework was so clear and rigorous that my first thought was simple: I need to measure myself against this.

In disaster relief, you don't map safe routes and hide them. You share them.

I started outlining her criteria as a self-assessment, then realized it shouldn't be a private checklist. So it became a framework with a purpose: give everyone, especially people from non-traditional backgrounds, an actionable guide to what to build, where to focus, and how to position themselves for AI PM roles.

If you have edits or ideas to improve it, please open an issue or pull request on GitHub.

The conversation that inspired this framework

Philosophy

Why traditional PM evaluation fails AI builders

Four principles that separate people who ship AI products from people who coordinate them.

Builders over coordinators

The era of pure project management is over. AI PMs in 2026 ship code, prototype in hours, and show technical depth through real projects.

Key shift: from "managed a team that built X" to "I built X in a weekend."

Velocity as a core competency

Speed is not optional; it is existential. AI capabilities evolve weekly, so PMs must prototype, test, and iterate faster than the technology changes.

Evidence required: a portfolio of 8 to 10 concurrent side projects, built in days, not months.

Deep AI intuition

Surface awareness is not enough. Real intuition comes from hands-on building: understanding model limitations, prompt engineering, and architectural tradeoffs.

Non-negotiable: personal AI projects that apply LLMs, agents, or workflows in creative ways.

Building in public

The best AI PMs share their learning journey openly, through blogs, repositories, demos, and thought leadership that helps other people build.

Signal: an active GitHub, a technical blog, or regular AI experimentation shared in the open.

The 2026 paradigm shift: we are not hiring people to manage AI product development. We are hiring people who can build AI products themselves, then scale that capability through teams.

The assessment

The six pillars

A complete evaluation across the full spectrum of AI PM competencies.

Technical skills & hands-on building

Can they actually build things, or only talk about building?

Recent hands-on coding (GitHub activity, personal projects)
Personal AI tools, agents, or workflows built and shipped
Ability to prototype ideas in hours, not weeks
Comfort with modern development tools and workflows
Technical curiosity shown through experimentation

Red flag: no repositories, no personal projects, last code written five or more years ago.

Product thinking & 0-to-1 leadership

Have they taken something from idea to shipped product?

Clear examples of 0-to-1 product launches
User-centric problem definition and validation
Comfort navigating ambiguity and incomplete information
Evidence of product taste and design sensibility
Metrics-driven decision making

Strong signal: multiple products launched from scratch with measurable user impact.

AI/ML knowledge & deep intuition

Do they understand AI from building, not just from reading?

Personal AI projects that demonstrate model understanding
Knowledge of current capabilities and limitations
Experience with prompt engineering, fine-tuning, or agents
Creative applications of AI to real problems
Stays current through active experimentation, not passive reading

Critical test: can they explain why they chose GPT-4 over Claude or Gemini for a specific use case?

Communication & building in public

Do they share their journey and help others build?

Active technical blog, Substack, or public documentation
Repositories with clear READMEs and demos
Thought leadership that advances the field
Compelling storytelling and narrative structure
Ability to explain complex technical concepts simply

Differentiator: 5,000+ engaged followers, with AI building insights shared regularly.

Strategic thinking & second-order vision

Do they build platforms that enable others, or only first-order features?

Platform thinking: tools that get better as AI improves
Understanding of ecosystem dynamics and network effects
Long-term vision balanced with rapid iteration
Ability to identify leverage points and force multipliers
Systems thinking applied to product architecture

Question: are they building for today's AI, or tomorrow's?

Execution & rapid shipping

Do they treat ideas as cheap and execution as everything?

A portfolio of projects shipped in days, not months
8 to 10 concurrent side projects that show breadth
Bias toward action over analysis paralysis
Comfortable with imperfect v1s and rapid iteration
The language of building: "I shipped," not "I managed"

Litmus test: can they build and ship a working demo in a weekend?

The decision

The decision framework

How candidates move through the 2026 standard, from minimum thresholds to a final score.

Minimum thresholds (pass / fail)

Personal AI projects: at least one visible AI project with code or a demo.

Building in public: evidence of sharing work (GitHub, blog, demos).

Resume creativity: product taste beyond a standard LinkedIn template.

Fail any threshold → No Screen

Red flags (disqualifiers)

Job hopping without a clear narrative, inflated titles, vague responsibilities, no concrete metrics, plagiarism, or misrepresentation.

Any red flag → No Screen

Must-have signals (5 of 5 required)

Evidence of continuous learning and staying current with AI
At least one personal AI project with evidence
Experience shipping products, not just planning them
A compelling narrative explaining their journey
Clear alignment with the AI/ML product space

Missing any → Maybe, at best

Differentiation signals (3 or more for a strong screen)

8 to 10 concurrent side projects
Built something in hours or days, not months
An active technical blog or significant following
Open-source contributions or community leadership
Platform or framework thinking in past work
Conference speaking or thought leadership
A unique background or unconventional path
Evidence of rapid prototyping velocity

3 or more signals → Strong Screen
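
The gates above can be read as a short decision procedure. The following is a minimal Python sketch, not the analyzer's actual code: the `Candidate` class, its field names, and the `gate_outcome` function are hypothetical, and so is the order in which the gates are checked. The framework does not say what happens when a candidate has all five must-haves but fewer than three differentiation signals, so the plain "Screen" result here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for the counts a reviewer records."""
    thresholds_passed: int  # out of the 3 minimum thresholds
    red_flags: int          # number of disqualifying red flags found
    must_haves_met: int     # out of the 5 must-have signals
    differentiators: int    # out of the 8 differentiation signals


def gate_outcome(c: Candidate) -> str:
    # Failing any minimum threshold, or showing any red flag, ends the review.
    if c.thresholds_passed < 3 or c.red_flags > 0:
        return "No Screen"
    # All 5 must-have signals are required; missing any caps the result.
    if c.must_haves_met < 5:
        return "Maybe, at best"
    # 3 or more differentiation signals make a strong screen.
    if c.differentiators >= 3:
        return "Strong Screen"
    # Assumption: the framework does not name this case explicitly.
    return "Screen"


print(gate_outcome(Candidate(thresholds_passed=3, red_flags=0,
                             must_haves_met=5, differentiators=4)))
```

Checking the pass/fail gates first keeps reviewers from spending time on scoring a candidate who has already been screened out.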

Scoring system

Pillar	Weight	Max score	Evaluation focus
Technical skills	10 points	10 / 10	GitHub activity, personal projects, code quality
Product thinking	10 points	10 / 10	0-to-1 launches, user impact, metrics
AI/ML knowledge	10 points	10 / 10	Personal AI projects, hands-on evidence
Communication	10 points	10 / 10	Public building, blog, thought leadership
Strategic thinking	10 points	10 / 10	Platform thinking, second-order effects
Execution	10 points	10 / 10	Shipping velocity, portfolio breadth
Total	60 points	60 / 60	Aggregate across all pillars

Below 25: No Screen

25–34: Maybe

35–44: Screen

45+: Strong Screen

Aggregate score across all six pillars, out of a maximum of 60 points.
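
As a worked example of the scoring table and bands, here is a minimal Python sketch. The pillar keys and the function names `total_score` and `score_band` are hypothetical, chosen for illustration. Only the six 0-to-10 pillar scores, the 60-point maximum, and the four band cutoffs come from the framework.

```python
PILLARS = (
    "technical_skills",
    "product_thinking",
    "ai_ml_knowledge",
    "communication",
    "strategic_thinking",
    "execution",
)


def total_score(pillar_scores: dict[str, int]) -> int:
    """Sum the six pillar scores, each on a 0 to 10 scale."""
    for pillar in PILLARS:
        score = pillar_scores[pillar]
        if not 0 <= score <= 10:
            raise ValueError(f"{pillar} must be between 0 and 10, got {score}")
    return sum(pillar_scores[pillar] for pillar in PILLARS)


def score_band(total: int) -> str:
    """Map an aggregate score (0 to 60) to the framework's band."""
    if not 0 <= total <= 60:
        raise ValueError(f"total must be between 0 and 60, got {total}")
    if total < 25:
        return "No Screen"
    if total < 35:
        return "Maybe"
    if total < 45:
        return "Screen"
    return "Strong Screen"


example = {
    "technical_skills": 8,
    "product_thinking": 7,
    "ai_ml_knowledge": 9,
    "communication": 6,
    "strategic_thinking": 5,
    "execution": 8,
}
total = total_score(example)
print(total, score_band(total))  # prints: 43 Screen
```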

In practice

How to use this framework

Integrate the evaluation system into your existing hiring process.

Resume screening

Use the automated analyzer for AI-powered analysis from multiple providers (GPT-5, Claude Sonnet 4.5, Gemini 2.5 Pro):

bin/analyze --deep-analysis resume.pdf

Generates HTML reports with consensus scoring and detailed pillar breakdowns.

Interview panels

Share the framework with all interviewers beforehand. Use pillar-specific questions to probe each area:

"Walk me through your GitHub repositories."
"What did you build last weekend?"
"Show me your AI experiments."

Calibration

Run multiple candidates through the framework and compare scores. Calibrate your team's shared sense of what a "Strong Screen" looks like in practice.
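
One way to make calibration concrete is to flag candidates whose totals differ widely between reviewers, then discuss those cases as a team. This is a minimal Python sketch under stated assumptions: the framework does not prescribe a calibration method, so the `needs_discussion` function, the nested dictionary layout, and the 10-point gap are all hypothetical.

```python
def needs_discussion(
    totals: dict[str, dict[str, int]], max_gap: int = 10
) -> list[str]:
    """Return candidates whose reviewer totals (0 to 60) disagree widely.

    `totals` maps candidate name -> {reviewer name -> aggregate score}.
    `max_gap` is an assumed tolerance, not a value from the framework.
    """
    flagged = []
    for candidate, by_reviewer in totals.items():
        scores = list(by_reviewer.values())
        # A large spread suggests reviewers read the rubric differently.
        if max(scores) - min(scores) > max_gap:
            flagged.append(candidate)
    return flagged


sample = {
    "candidate_a": {"reviewer_1": 46, "reviewer_2": 31},
    "candidate_b": {"reviewer_1": 38, "reviewer_2": 40},
}
print(needs_discussion(sample))  # prints: ['candidate_a']
```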

Customization

Fork the repository and adapt the framework: adjust weights, add custom criteria, or modify the scoring rubric.
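
If you adjust weights, one option is to keep the total on the same 60-point scale so the existing score bands still apply. This is a minimal Python sketch of that idea, assuming a weighted mean of the six pillar scores. The `weighted_total` function, the pillar keys, and the weight values are hypothetical and are not taken from the repository.

```python
PILLAR_NAMES = (
    "technical_skills",
    "product_thinking",
    "ai_ml_knowledge",
    "communication",
    "strategic_thinking",
    "execution",
)

# Equal weights reproduce the default scoring: a plain sum out of 60.
EQUAL_WEIGHTS = {pillar: 1.0 for pillar in PILLAR_NAMES}


def weighted_total(
    scores: dict[str, int], weights: dict[str, float] = EQUAL_WEIGHTS
) -> float:
    """Weighted aggregate of 0 to 10 pillar scores, rescaled to 0 to 60."""
    weight_sum = sum(weights[p] for p in PILLAR_NAMES)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[p] * weights[p] for p in PILLAR_NAMES)
    # Multiply by the pillar count so equal weights give the plain sum.
    return len(PILLAR_NAMES) * weighted_sum / weight_sum


scores = {
    "technical_skills": 8,
    "product_thinking": 7,
    "ai_ml_knowledge": 9,
    "communication": 6,
    "strategic_thinking": 5,
    "execution": 8,
}
# Hypothetical customization: count technical skills twice as heavily.
technical_heavy = {**EQUAL_WEIGHTS, "technical_skills": 2.0}

print(weighted_total(scores))
print(round(weighted_total(scores, technical_heavy), 1))
```

Rescaling by the sum of the weights means a fork can emphasize one pillar without having to redefine the No Screen, Maybe, Screen, and Strong Screen cutoffs.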

Feedback and improvements are welcome. Open an issue or pull request to help make it better for everyone.

The 2026 AI PM Evaluation Framework