A practical guide for the cross-functional AI committees now forming across the industry — where to focus, where to walk away, and the questions to ask every vendor before you sign.
Key takeaways
- The committee is the right instinct; the framing is usually wrong. Evaluate problems, not functions. “AI for marketing” or “AI for procurement” is how pilots end up in the ~95% that show no measurable return.
- Use one test for every tool: the decision must be repeatable, the inputs unstructured, and the outcome measurable. Miss one and the model has nothing to learn from.
- Focus first on document-heavy, repetitive work: supplier communication, QA document and expiration tracking, invoice-to-PO matching, defect detection.
- Walk away from whole-function transformation, subjective judgment calls, and any tool whose outcome you can’t measure in 90 days.
- Buying beats building for messy, high-stakes supplier workflows — by roughly 2-to-1 in the MIT data.
Across food and beverage, a new body is showing up on the org chart: the AI committee. It pulls together the heads of procurement, QA, R&D, operations, IT, and finance to decide — together — when and how the company should adopt AI. The instinct is right. The way most committees evaluate tools is not.
This guide is for those committees. It lays out a single test for separating AI that creates value from AI that drains budget, the green-light and red-light zones specific to food and beverage, and a vendor scorecard you can use in the room.
The committee moment
What is an AI committee? A cross-functional group — usually procurement, QA, R&D, operations, IT, and finance — that decides together when and how a company adopts AI. It sets the evaluation standard and data rules, vets security before anything is signed, and shares what works across departments so teams don’t go off on their own.
The committee is no longer an early-adopter habit. In a 2025 Gartner poll of more than 1,800 executive leaders, 55% of organizations reported having an AI board or dedicated oversight committee in place. The National Association of Corporate Directors found that 62% of boards now discuss AI regularly, though only 27% have formalized that oversight into committee charters — a sign most groups are still in the education-and-risk phase rather than steering real decisions.
That phase is healthy, and it’s exactly how the strongest committees describe their own origin. A senior procurement director at a century-old food company told us this week that her CEO stood up the committee about a year ago because he “really didn’t want individual departments sort of going off on their own.” They brought every key department head into one room. At first it was education:
“The committee was really just talking about what is AI, just building a knowledge base… a broad breadth education and sort of a risk rewards assessment.” — Senior Director of Procurement, century-old food & beverage company
Now any application that touches AI “needs to go back to the committee” — both to share what works across departments and to keep IT involved on data integrity. As she put it, what they wanted to avoid was departments signing agreements and “uploading data” on their own. The governance instinct is sound.
Why most AI committees evaluate tools the wrong way
The trap is what happens next. Committees tend to evaluate AI by function — “what’s our AI-for-marketing play?”, “should we do AI in operations?” — and by vendor promise. That framing is precisely what fails.
- ~95% of enterprise generative-AI pilots showed no measurable P&L impact (MIT, State of AI in Business 2025)
- $30–40B spent across those pilots — only ~5% captured real value
- Lowest ROI went to sales & marketing, which drew the largest budgets
MIT’s study examined 300 deployments and concluded the gap wasn’t model quality, regulation, or talent — it was that most tools were never wired into a real workflow and never learned from one. Your committee’s job is to land in the 5%. That starts with changing the question.
The one test that matters: evaluate problems, not functions
We laid out the underlying pattern in Why AI winners in CPG are built around repeatable problems, not vague functions. The short version: AI doesn’t fail in CPG because the models are weak. It fails because the problem is poorly defined.
The applications that work share three characteristics. Use them as your committee’s first-pass filter on any tool, in any function:
- Repeatable. The decision recurs hundreds or thousands of times a year — not a once-a-quarter judgment call.
- Unstructured inputs. The raw material is messy: emails, PDFs, attachments, free text, inconsistent supplier replies — the stuff that today gets re-keyed by hand.
- Measurable outcome. There’s a clear result with a tight feedback loop, so the tool — and your committee — can tell whether it actually worked.
Strip any one of these out and the model has nothing to learn from, and you’re back in the 95%.
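For committees that want to apply the filter consistently, the three characteristics can be written down as a simple checklist. The following is a minimal sketch, not a prescribed method: the `CandidateProblem` fields, the `failed_checks` helper, the 100-decisions-per-year threshold, and the example figures are all illustrative assumptions rather than anything taken from the MIT study.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: "hundreds or thousands of times a year" is read as at least 100.
MIN_DECISIONS_PER_YEAR = 100


@dataclass
class CandidateProblem:
    name: str
    decisions_per_year: int        # how often the decision recurs
    inputs_unstructured: bool      # emails, PDFs, free text rather than clean tables
    outcome_metric: Optional[str]  # the number you would report; None if undefined


def failed_checks(problem: CandidateProblem) -> List[str]:
    """Return the characteristics the problem is missing (empty list = passes)."""
    failed = []
    if problem.decisions_per_year < MIN_DECISIONS_PER_YEAR:
        failed.append("repeatable")
    if not problem.inputs_unstructured:
        failed.append("unstructured inputs")
    if not problem.outcome_metric:
        failed.append("measurable outcome")
    return failed


invoice_matching = CandidateProblem(
    name="Invoice-to-PO matching",
    decisions_per_year=5000,  # illustrative figure, not from the article
    inputs_unstructured=True,
    outcome_metric="overcharges caught per month",
)
ai_for_marketing = CandidateProblem(
    name="AI for marketing",
    decisions_per_year=4,     # illustrative figure, not from the article
    inputs_unstructured=True,
    outcome_metric=None,
)

print(failed_checks(invoice_matching))
print(failed_checks(ai_for_marketing))
```

With these example inputs, invoice matching returns an empty list, while the function-level pitch fails on repeatability and on the missing metric. The value is less in the code than in forcing someone to fill in each field before a vendor demo.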
A useful reframe: a role is almost never the right unit of evaluation, because a role is a bundle of very different tasks. A procurement manager’s job is “a collection of a lot of different tasks. There are some that are very specific and repetitive, and then there’s others that are more ambiguous.” Good AI is a “force multiplier on some specific, repetitive tasks” — it will not replace the bundle. That same procurement director reached the conclusion on her own: the goal is “the productivity gains,” and her committee is confident job replacement “is really not what’s going to come out of this.” That mindset isn’t just culturally safer — it’s the mindset that picks tools that actually work.
Where should food & beverage companies focus AI? (Green-light zones)
The repeatable-unstructured-measurable test points to a consistent set of high-yield decisions — mostly in the back and middle office, where work is document-heavy and the same decision happens over and over.
✓ Focus here
- Supplier communication & document gathering — RFPs, price quotes, COA collection
- QA document lifecycle — parsing certificates from messy attachments, flagging expirations before a line stops
- Invoice-to-PO matching — surfacing overcharges; high frequency, fully measurable
- Quality & defect inspection — vision on a line, where data is already structured
✕ Not yet
- “AI for marketing” — subjective, non-repeating, lowest ROI in the data
- “AI for [an entire department]” — a function is a mixed bundle, not one decision
- Low-frequency strategic calls — nothing for the model to learn from
- Static “supplier database” tools — they rot when manual updates stop
Even teams with formal e-sourcing platforms hit the wall. A procurement lead at a baking company told us that for formal RFPs they use an e-sourcing tool, but “there’s still certainly a lot of offline gathering of information, trying to standardize the format,” and that an AI tool that could do it “would be really interesting.”
The market data backs the focus. Octave’s Pulse of Quality in Manufacturing 2026 survey — fielded in Q1 2026 across 2,263 managers in mid-to-large manufacturers, food and beverage included — found that 47% now use AI in quality processes, up from 33% a year earlier, and that the top use cases are document automation (48%), training (46%), and defect detection (44%). Document-centric, repetitive work is where adoption is actually landing. Mordor Intelligence’s January 2026 analysis put the AI-in-F&B market at roughly $13.4B in 2025 rising to $18.3B in 2026, with early adopters reporting 8–12% gains in overall equipment effectiveness and 10–15% cuts in inventory spoilage — and noted the center of gravity shifting from isolated pilots to enterprise-wide rollouts.
The common thread: every green-light zone is a specific decision, not a department.
Where should they not focus? (Red-light zones)
Just as important is where your committee should say no — or at least “not yet.”
- “AI for marketing.” Marketing involves subjective calls where even strong teams disagree on the right answer, and the tasks rarely repeat in the same shape. It’s the most-funded and lowest-ROI area in MIT’s data for a reason.
- “AI for [an entire department].” Any pitch framed around transforming a whole function fails the test by definition — a function mixes repeatable and non-repeatable tasks together.
- Subjective, low-frequency judgment calls. Decisions made a handful of times a year give the model nothing to learn from.
- Anything where you can’t define the outcome. If your committee can’t write down how you’ll know it worked, neither can the vendor — and you won’t be able to defend the spend later.
- Static “supplier database” tools. Tools that depend on humans manually uploading and maintaining data tend to rot, because the updates stop. Favor tools that pull live from real inputs over directories that go stale.
A clean heuristic from a sourcing leader at a better-for-you snack brand: he drew a line between a “general business tool” like the office suite everyone already runs, and “a specific tool” built for one job — in his case, gathering pricing information. The specific tool is where vertical AI earns its keep; the general tool is already commoditized.
Should you buy or build?
When a green-light problem is identified, the next reflex, especially in IT-led committees, is “could we build this ourselves?” The MIT study is unusually direct here: purchased solutions from specialized vendors succeeded about 67% of the time, while internal builds succeeded only about a third of the time, a gap of roughly 2-to-1. The reason maps to the three-characteristic test: building a tool that reliably extracts structure from chaotic, high-stakes supplier documents is far harder than building a reporting dashboard on already-clean data. (More in Build vs. buy: why you can’t vibe code your way to procurement intelligence.)
The AI tool evaluation scorecard
Bring this to the room. Score any tool 0–2 on each line; anything that can’t clear the first three is not ready, regardless of the demo.
| Criterion | What “good” looks like | Score (0–2) |
|---|---|---|
| Repeatable decision | Targets one decision made hundreds/thousands of times a year | |
| Unstructured inputs | Works on the messy emails / PDFs / attachments you have today | |
| Measurable outcome | You can state the metric and feedback loop before you buy | |
| Time to value | Delivers value in days, not a multi-month upload/integration project | |
| Learns from your workflow | Adapts to your process and outcomes; doesn’t reset each session | |
| Data handling | Clear, narrow data scope; passes IT/security review on integrity & access | |
| Works with reluctant users | Captures value without requiring 100% manual adoption | |
| Outcome-based pricing | Commercial model aligns to value delivered, not just seats | |
| Vertical fit | Built for food & beverage specifics, not a horizontal tool repointed | |
A tool scoring 14 or more out of a possible 18, with full marks on the first three criteria, is a strong candidate. A polished demo that stalls on criteria 1–3 belongs in the 95%.
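The scoring rule can be tallied the same way for every vendor. What follows is a rough sketch under stated assumptions: the criterion keys and the `evaluate` function are hypothetical names, the gate is read as requiring a 2 on each of the first three criteria, and the "needs review" label for tools that pass the gate but total under 14 is an added assumption, since the scorecard does not name that middle band.

```python
# Keys mirror the nine scorecard rows; the first three are the gating criteria.
CRITERIA = [
    "repeatable_decision",
    "unstructured_inputs",
    "measurable_outcome",
    "time_to_value",
    "learns_from_workflow",
    "data_handling",
    "works_with_reluctant_users",
    "outcome_based_pricing",
    "vertical_fit",
]
GATING = CRITERIA[:3]
STRONG_TOTAL = 14  # threshold from the scorecard, out of a possible 18


def evaluate(scores):
    """Total a 0-2 scorecard and apply the first-three gate."""
    for name in CRITERIA:
        if name not in scores:
            raise ValueError(f"missing score for {name}")
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be scored 0, 1, or 2")

    total = sum(scores[name] for name in CRITERIA)
    # Assumption: "clearing" the first three means full marks on each.
    gate_passed = all(scores[name] == 2 for name in GATING)

    if not gate_passed:
        verdict = "not ready"
    elif total >= STRONG_TOTAL:
        verdict = "strong candidate"
    else:
        verdict = "needs review"  # assumed label, not from the scorecard
    return {"total": total, "gate_passed": gate_passed, "verdict": verdict}


# Example: strong everywhere except a vague answer on the measurable outcome.
polished_demo = {name: 2 for name in CRITERIA}
polished_demo["measurable_outcome"] = 1
print(evaluate(polished_demo))
```

In this example the tool totals 17 of 18 and is still marked not ready, which is the point of gating on the first three criteria: a high total cannot buy back a missing metric.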
Five questions every AI committee should ask a vendor
- “What is the one repeatable decision this owns?” If the answer is a function or “lots of things,” stop.
- “How does this learn from our workflow over time?” MIT identified the inability to learn and adapt as the core reason pilots stall. Generic tools that don’t improve on your data plateau fast.
- “What exactly do you read, store, and not touch?” Your IT lead will ask anyway — the strongest committees make IT a co-chair specifically to vet data integrity before signing. Good vendors constrain scope tightly (e.g., supplier email only, never internal mail).
- “How fast do you ship, and how do updates reach us?” A procurement manager put it well this week: “It changes fast. A lot of new things come on board quickly. So what do updates look like?” Ask for cadence and a recent example.
- “How will we measure impact in 90 days?” Make the vendor commit to the metric. If they can’t, you can’t defend the renewal.
How to keep the committee from becoming the bottleneck
One caution as committees mature. PwC’s 2025 Responsible AI research found that committees are essential for early alignment on scope, risk, and approvals — but they become a chokepoint if every AI system has to route through them for review. The fix isn’t to dissolve the committee; it’s to split responsibilities as you scale: the committee sets the test, the guardrails, and the data rules, and empowers line leaders to evaluate specific tools against that standard. Keep the committee for alignment and risk; push tool-level decisions to the people who own the workflow.
The bottom line
Don’t approve “AI for procurement” or “AI for quality.” Identify one specific decision your team makes over and over, using information you already have but can’t act on — then evaluate tools against whether they own that decision, learn from your workflow, and produce a number you can measure. Master one repeatable decision, and the structured data and momentum to expand follow. That’s the difference between the 5% and everyone else.
The committee is the right governance instinct. Pointing it at problems instead of functions is what turns it into a competitive advantage.
Sources: MIT NANDA, The GenAI Divide: State of AI in Business 2025 · Gartner 2025 executive poll on AI oversight (n=1,800+) · National Association of Corporate Directors, 2025 board practices survey · Octave, Pulse of Quality in Manufacturing 2026 (Censuswide, Q1 2026, n=2,263) · Mordor Intelligence, AI in Food & Beverages market, January 2026 · PwC, 2025 US Responsible AI Survey. Prospect quotes are drawn from recorded Waystation sales conversations and anonymized.