Executive Summary
Artificial Intelligence is becoming table stakes for maintaining competitiveness, even for organizations that do not consider themselves "tech companies." Mid-market firms, with their leaner teams and tighter margins, can capture outsized value from AI by automating knowledge-heavy processes and unlocking new insights—provided they adopt a deliberate, risk-aware approach.
This report distills the lessons we have learned from dozens of conversations with CEOs and functional leaders who know they need AI but are unsure where to start. We first build a shared vocabulary (Foundations) and a clear-eyed view of what current large language models can—and cannot—do (Model Strengths & Limitations). We then introduce a scoring framework that helps executives identify the highest-impact, lowest-friction use cases hiding inside their own operations (Identifying Opportunities).
The playbook that follows turns strategy into action: configure the Big Three (prompt, context, tools), execute tasks with agentic systems, and implement an evaluation loop that keeps outputs reliable as models evolve. Throughout, we surface practical architectures, governance checkpoints, and cost-control tactics so teams can move fast without sacrificing safety or ROI.
Finally, we spotlight the "lethal trifecta" of AI security—private data access, exposure to untrusted input, and outbound communication—and outline engineering patterns that break at least one side of that triangle at all times.
Key Outcomes
- Prioritize projects that deliver measurable value within one to two quarters
- Deploy AI responsibly across core functions
- Build internal momentum for a longer-term roadmap of increasingly sophisticated capabilities
Foundations
The current wave of AI tools, products, and workflows is evolving rapidly. This is exciting and opens up many opportunities to create value, but it is difficult to keep up. Let's begin by defining the key words, terms, products, and companies that will form the foundation of our framework.
Essential Terms
Large language model (LLM)
A software system, built by analyzing a massive body of diverse text (books, websites, reference materials, online discussions, and professional documents), that estimates the most likely helpful response based on patterns in human communication.
Artificial intelligence (AI)
A computer system that connects a powerful model (like an LLM) with other systems or components to create versatile computational systems that exhibit intelligent behavior by analyzing data, understanding context, and generating appropriate responses or actions.
Context
Additional task-specific information (text, images, files) that helps an LLM understand the setting for your prompt and grounds its responses. Context augments the LLM's extensive, generic world knowledge with task-relevant information.
Prompt
A written set of instructions given to an LLM. This often includes the description of a persona or role ("you are a digital marketing specialist" or "you are a senior software engineer with expertise in Python"), relevant context, task instructions, output format expectations, and ideally a validation or testing strategy the LLM can use to verify its output.
Tools
Connections, integrations, and extensions that an AI system can invoke to extend its capabilities beyond generating text. Examples include reading or writing files, sending emails, accessing a database, searching the web, and interacting with external APIs.
Agent
An LLM with access to tools running in a loop. This means the AI can work independently on complex tasks by thinking through problems step-by-step, using different tools as needed, and adjusting its approach based on results—similar to how a skilled assistant would complete a project from start to finish.
Additional Terms
Embeddings
A numerical summary of the semantic meaning of a piece of data. The actual numbers are not meaningful, but they have the useful property that similar data will have similar numbers.
Vector Store
A storage system or database for storing embeddings. Key properties include fast comparison of embeddings for different data, storage of original data, and storage of metadata like original file name/url/page number/etc.
Retrieval Augmented Generation (RAG)
An agentic system in which (1) embeddings are generated and stored for domain-specific data and documents, (2) user queries (prompts) are embedded, (3) similar or relevant document chunks are retrieved and added to the LLM's context, and (4) the LLM responds to the user query.
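The retrieval step can be sketched with a toy bag-of-words embedding. Real systems use a learned embedding model and a dedicated vector store; the documents and query below are illustrative only.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector over lowercase tokens.
    Production systems use a learned embedding model instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# (1) Embed and store domain documents (a stand-in for the vector store).
documents = [
    "Refund requests must be filed within 30 days of purchase.",
    "Our enterprise plan includes priority support and SSO.",
    "Quarterly invoices are emailed on the first business day.",
]
store = [(doc, embed(doc)) for doc in documents]

# (2) Embed the user query, then (3) retrieve the most similar chunk.
query = "How long do customers have to request a refund?"
q_vec = embed(query)
best_doc, _ = max(store, key=lambda item: cosine(q_vec, item[1]))

# (4) The retrieved chunk is prepended to the LLM's context before answering.
prompt_context = f"Context: {best_doc}\n\nQuestion: {query}"
print(best_doc)
```

In production, the cosine comparison and storage are handled by the vector store, and documents are split into chunks before embedding.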
Reasoning or Thinking Models
Advanced AI models that show their step-by-step problem-solving process before providing an answer, similar to how a human might walk through their analysis. These models excel at complex tasks requiring logic, mathematics, coding, and strategic planning by breaking problems into smaller components.
Model Context Protocol (MCP) Server
An open standard developed by Anthropic that enables AI assistants to securely connect with external data sources and tools through a unified protocol. MCP servers act as bridges between AI systems and your organization's databases, APIs, and internal tools, allowing AI to access real-time information while maintaining security boundaries.
Token
The basic unit of text that an LLM processes—roughly equivalent to a word or part of a word. Understanding tokens helps estimate costs and context limits, as most AI services charge per token processed.
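A back-of-the-envelope estimate makes token-based pricing concrete. The tokens-per-word ratio and the per-token prices below are illustrative assumptions, not quoted rates from any provider.

```python
# Rough cost estimate for a single LLM call.
# ~1.3 tokens per English word is a common rule of thumb; the prices
# below are placeholders, not real vendor pricing.
TOKENS_PER_WORD = 1.3
PRICE_PER_1K_INPUT = 0.003   # assumed $ per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.015  # assumed $ per 1K output tokens

def estimate_cost(input_words, output_words):
    input_tokens = input_words * TOKENS_PER_WORD
    output_tokens = output_words * TOKENS_PER_WORD
    return ((input_tokens / 1000) * PRICE_PER_1K_INPUT
            + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT)

# Example: a 2,000-word report summarized into 300 words.
print(f"${estimate_cost(2000, 300):.4f}")
```

Even rough estimates like this help compare candidate use cases before committing to a pilot, since costs scale linearly with the volume of text processed.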
Fine-tuning
The process of customizing a pre-trained AI model with your organization's specific data, terminology, and use cases—like training a new consultant on your company's unique methodologies and client base.
API (Application Programming Interface)
The technical bridge that allows your business systems to communicate with AI services programmatically, enabling automation and integration into existing workflows.
Hallucination
When an AI generates plausible-sounding but factually incorrect information. Understanding this limitation is crucial for risk management and establishing appropriate verification processes.
Multimodal AI
Systems that can process and generate multiple types of content—text, images, audio, and video—enabling more comprehensive analysis and communication capabilities.
OpenAI
The company behind ChatGPT and GPT models, a major player in commercial AI services offering both consumer and enterprise solutions.
Anthropic
Creator of Claude AI, focused on building helpful, harmless, and honest AI systems with strong emphasis on safety and reliability for enterprise use.
Google Gemini
Google's flagship AI model family that powers various Google products and services. Available in multiple versions (Ultra, Pro, Flash) for different performance needs, Gemini integrates deeply with Google Workspace tools like Docs, Sheets, and Gmail, making it particularly relevant for organizations already using Google's enterprise ecosystem.
Microsoft Copilot
Microsoft's AI assistant integrated across Office 365 and enterprise tools, designed to enhance productivity within familiar business applications.
Perplexity
An AI-powered search engine that provides sourced, conversational answers to queries—useful for research and competitive intelligence.
Workflow Automation
Using AI to connect and orchestrate multiple business processes, reducing manual work and improving consistency across operations.
Guardrails
Technical and procedural controls that ensure AI systems operate within defined parameters, maintaining compliance, brand standards, and risk thresholds.
Inference
The process of an AI model generating responses to inputs in real-time—the operational phase where business value is created from trained models.
Model Governance
The framework of policies, procedures, and oversight mechanisms ensuring responsible AI deployment, including version control, access management, and performance monitoring.
Evaluations (Evals)
Systematic testing and measurement processes to assess AI model performance, accuracy, and reliability for specific business use cases. Like quality assurance in traditional software, evaluations help organizations validate that AI outputs meet required standards, identify edge cases where models may fail, and track performance over time.
Model Strengths & Limitations
LLMs are incredibly powerful, but they are not good at everything. Understanding these boundaries is crucial for successful implementation.
Strengths
- Summarization: Excellent at understanding, synthesizing, and summarizing language across documents
- Exploration: Can quickly iterate and explore diverse options when properly configured
- Web Research: Can search hundreds of websites and produce comprehensive reports
- Coding: World-class coding assistants that significantly increase productivity
Limitations
- Multi-step Reasoning: Performance degrades with complex logic requiring multiple reasoning steps
- Causal Reasoning: Cannot truly understand cause-and-effect relationships
- Mathematical Reasoning: Struggles with multi-step equations and complex calculations
- Hallucination: May generate plausible but false information
Identifying Opportunities for AI
A key question for any business decision maker is how to identify where AI can be applied effectively in their organization.
In the table below we present the most important project or task properties to consider when trying to identify where AI can add value in your organization. Review each property and think about your current business processes—tasks that align closely with these properties are strong candidates for AI-driven improvement. This framework is designed to help leadership quickly spot high-impact opportunities and avoid common pitfalls when evaluating where to start with AI.
| Property | Weight (/5) | Explanation | Variable | Example Tasks |
|---|---|---|---|---|
| Repetitive & Rule-Based | 5 | The work follows a predictable pattern or a set of defined, logical rules that can be learned and consistently applied. | R | Invoice processing, data entry, form categorization, basic report generation. |
| Data-Intensive | 5 | Success depends on processing, synthesizing, or retrieving information from large volumes of text, code, or other data formats. | D | Market research analysis, legal e-discovery, summarizing scientific literature, log analysis. |
| Pattern Recognition Dependent | 5 | The core activity involves identifying trends, anomalies, classifications, or clusters within data that may not be obvious to humans at scale. | P | Fraud detection, sentiment analysis, customer churn prediction, medical image screening. |
| Generative in Nature | 5 | The primary output involves creating new content, code, or structured data based on a prompt or existing information. | G | Writing email drafts, generating marketing copy, creating code snippets, summarizing meetings. |
| Experiment-Based | 4 | The task requires rapid iteration or the generation of multiple variations to test hypotheses or explore creative options. | E | A/B testing ad copy, brainstorming product names, simulating customer dialogues, experimenting with different user interface concepts, generating test data. |
| Labor-Intensive | 4 | The task requires a significant number of human hours to complete, making automation a high-value proposition for cost and time savings. | L | Document review, audio transcription, moderating user-generated content, tagging images. |
| Objective & Verifiable | 4 | The quality of the output can be measured against clear, objective criteria, making it possible to validate the AI's performance. | O | Answering factual questions, checking code for syntax errors, data validation, comparing documents. |
| Prone to Human Error | 3 | The task is tedious or requires such high attention to detail that humans are likely to make mistakes due to fatigue or oversight. | H | Data migration and cleanup, proofreading for basic errors, reconciling large financial statements. |
| Low Requirement for Emotional Intelligence | 3 | The task does not depend on deep empathy, complex negotiation, or nuanced interpersonal skills to be completed successfully. | I | Tier-1 technical support, scheduling logistics, routing customer inquiries, data classification. |
How to Calculate a Weighted Score for AI Opportunity Assessment
To systematically evaluate and prioritize AI opportunities, use this weighted scoring formula:
AI Opportunity Score = (R×5) + (D×5) + (P×5) + (G×5) + (E×4) + (L×4) + (O×4) + (H×3) + (I×3)
Score each task from 0-5 for each property using the variables from the table above.
Score interpretation:
- 140-190: Excellent AI candidate - high priority for implementation
- 100-139: Good AI candidate - strong potential for automation
- 60-99: Moderate AI candidate - consider after higher priorities
- Below 60: Poor AI candidate - likely not worth pursuing with current technology
Maximum possible score: 190 points
Spreadsheet Implementation:
1. Create columns for each variable (R, D, P, G, E, L, O, H, I)
2. Score each task from 0-5 for each property
3. Use formula: =(R*5)+(D*5)+(P*5)+(G*5)+(E*4)+(L*4)+(O*4)+(H*3)+(I*3)
4. Sort tasks by final score to prioritize implementation order
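The same calculation can be sketched in Python for teams that prefer code to spreadsheets. The sample ratings below are hypothetical, not taken from any real assessment.

```python
# Weighted AI-opportunity score, mirroring the spreadsheet formula above.
# R, D, P, G carry weight 5; E, L, O weight 4; H, I weight 3.
WEIGHTS = {"R": 5, "D": 5, "P": 5, "G": 5,
           "E": 4, "L": 4, "O": 4, "H": 3, "I": 3}

def ai_opportunity_score(ratings):
    """ratings: dict mapping each variable (R, D, ...) to a 0-5 score."""
    for var in WEIGHTS:
        if not 0 <= ratings[var] <= 5:
            raise ValueError(f"{var} must be between 0 and 5")
    return sum(WEIGHTS[var] * ratings[var] for var in WEIGHTS)

# Hypothetical task rated against each property:
sample_task = {"R": 5, "D": 4, "P": 3, "G": 2,
               "E": 1, "L": 5, "O": 4, "H": 3, "I": 2}
print(ai_opportunity_score(sample_task))  # 125
```

Sorting a list of such dicts by `ai_opportunity_score` reproduces step 4 of the spreadsheet workflow.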
How to Use This Framework
Start by generating a list of tasks or processes within your organization. For each task, review the properties in the table and assign a score based on how closely the task matches each property.
Tasks that score highly—especially those that are repetitive, data-intensive, pattern recognition dependent, or generative—are prime candidates for AI automation or augmentation.
This approach helps leadership quickly identify where AI can deliver the greatest impact, prioritize projects, and build a roadmap for implementation. Focus first on high-value, low-risk opportunities to build momentum and confidence before tackling more complex or sensitive areas.
Example Scoring Grid
| Task/Process | R | D | P | G | E | L | O | H | I | Score | Priority |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Monthly Financial Report Generation | 5 | 5 | 3 | 4 | 2 | 5 | 5 | 4 | 5 | 160 | Excellent |
| Customer Support Email Triage | 4 | 4 | 5 | 3 | 1 | 5 | 3 | 3 | 4 | 137 | Good |
| Contract Review & Analysis | 3 | 5 | 4 | 2 | 1 | 5 | 3 | 3 | 3 | 124 | Good |
| Social Media Content Creation | 3 | 2 | 3 | 5 | 5 | 3 | 2 | 2 | 3 | 120 | Good |
| Employee Performance Reviews | 2 | 3 | 2 | 2 | 1 | 3 | 1 | 2 | 1 | 74 | Moderate |
| Executive Strategy Sessions | 1 | 2 | 1 | 1 | 2 | 2 | 0 | 1 | 0 | 44 | Poor |
Implementation Playbook
Identifying the opportunity is the first and most critical step in AI success. The subsequent steps are: (1) framing the problem for success, (2) executing the task with an agentic system, and (3) reviewing the output.
Frame the Problem (The Big Three)
Prompt
What instructions would you give to a teammate to complete the task without further explanation?
Context
What reports, datasets, or examples would you refer to if solving this problem yourself?
Tools
What systems would you access, or what calculations would you perform, as part of your workflow?
Effective Prompt Structure
# PROMPT TASK
Role: You are a <role_description>
# Task Overview
<client> works in <industry> doing <value_add>
<client> is trying to optimize...
This task is important because <reason>
Results will be used by <team_member>
# Desired output
Successfully completing the task will require you to <output_description>
The <output> must be <quality1>, <quality2>, and <quality3>
# Verification Strategy
To ensure proper completion <testing_strategy>
# Tasks
1. <step1>
2. <step2>
3. ...
Execute the Task
- Choose your AI client (ChatGPT, Claude Desktop, Gemini, etc.)
- Attach relevant files for context
- Set up necessary tools
- Copy/paste your prompt and execute
Review the Output
Start by checking for completeness and accuracy. Compare results against known benchmarks. For objective tasks, use automated checks. For subjective outputs, gather stakeholder feedback. Document issues to improve future prompts and configurations.
Advanced Strategies & Architectures
Evaluation Framework
The LLM landscape changes rapidly. Something you build today may not work the same tomorrow. An essential step for reliability is setting up an evaluation framework.
Basic Evaluation Requirements
1. Prompt – the exact question or task you give the AI
2. Background material/tools – documents, data, or tools the AI can use
3. Good answer description – desired result characteristics
4. Example answer – human-written reference response
5. Scoring method – simple rating system (0-5 or pass/fail)
6. Target score – minimum acceptable performance
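These six requirements can be wired into a minimal harness. In the sketch below, `run_model` is a stub standing in for your actual LLM call, and the scoring method is a deliberately simplistic pass/fail keyword check; real evaluations often use human raters or an LLM judge comparing against the reference answer.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str      # 1. the exact question or task given to the AI
    context: str     # 2. background material the AI can use
    rubric: str      # 3. description of a good answer
    reference: str   # 4. human-written example answer

def run_model(prompt, context):
    # Stub: replace with a real LLM call in your own harness.
    return "Refunds are accepted within 30 days of purchase."

def score(output, case):
    # 5. Scoring method: pass/fail keyword check (simplistic on purpose;
    # real evals would compare against case.reference and case.rubric).
    return 1 if "30 days" in output else 0

TARGET = 0.8  # 6. minimum acceptable pass rate

cases = [EvalCase(
    prompt="What is our refund window?",
    context="Policy: refunds within 30 days of purchase.",
    rubric="Must state the 30-day window accurately.",
    reference="Customers may request a refund within 30 days.",
)]

pass_rate = sum(score(run_model(c.prompt, c.context), c) for c in cases) / len(cases)
print(f"pass rate: {pass_rate:.0%}, target met: {pass_rate >= TARGET}")
```

Running this suite on every model or prompt change turns "the output looks fine" into a measurable, repeatable check.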
Security Considerations: The Lethal Trifecta
Critical Security Triangle
- 1. Access to your private data — one of the most common purposes of tools
- 2. Exposure to untrusted content — any mechanism for malicious text to reach your LLM
- 3. Ability to externally communicate — pathways that could exfiltrate data
When these three capabilities intersect, they create a direct pipeline for attackers: untrusted content can instruct the model to read your most sensitive files and immediately transmit them outside your perimeter. Because the entire sequence happens inside the agent's own reasoning loop, conventional security layers often have no visibility or control.
Mitigation Strategy
The only reliable mitigation is to ensure at least one side of the triangle is disabled at all times. If an agent must handle confidential information, restrict both its exposure to arbitrary inputs and its ability to call outbound services. If it must process untrusted input, run it in a tightly controlled sandbox with synthetic data and no external network access.
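This rule can also be enforced mechanically before an agent is launched. A minimal sketch, with illustrative capability names not tied to any specific framework:

```python
# Guardrail: refuse to launch an agent whose configuration enables all
# three sides of the lethal trifecta at once.
TRIFECTA = {"private_data_access", "untrusted_input", "outbound_communication"}

def check_agent_config(capabilities):
    """Return the risky capabilities that are enabled; raise if all
    three sides of the triangle are active simultaneously."""
    enabled = TRIFECTA & set(capabilities)
    if enabled == TRIFECTA:
        raise ValueError(
            "Unsafe configuration: disable at least one of "
            + ", ".join(sorted(TRIFECTA))
        )
    return enabled

# A document-summarization agent that reads private files but runs in a
# sandbox with no network access keeps only one side of the triangle active.
check_agent_config(["private_data_access"])
# A web-research agent that touches untrusted pages and can call external
# services must be denied access to confidential data.
check_agent_config(["untrusted_input", "outbound_communication"])
```

A check like this belongs in the deployment pipeline, not just in policy documents, so that no configuration change can quietly re-enable the third side of the triangle.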