Understanding AI Costs

How much does it cost to use AI in QuickContract?

How AI pricing works

Cloud-based AI providers charge by the token, the unit of text that models process. As a rule of thumb, one token is about 4 characters of English text. Generating a typical contract produces 2,000–5,000 output tokens, on top of the input tokens from your prompts, meeting transcripts, and business profile data.
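To get a feel for the rule of thumb above, here is a minimal sketch that estimates a token count from a piece of text. The function name and constant are illustrative assumptions, not part of QuickContract, and real tokenizers vary by model and language, so treat the result as a ballpark figure only.

```python
import math

# Rule of thumb: about 4 characters of English text per token.
# This is an approximation, not a real tokenizer.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for English text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

# Example: estimate the size of a single clause.
clause = "The Contractor shall deliver the work within thirty days."
print(estimate_tokens(clause))
```

By the same estimate, a contract of 8,000 to 20,000 characters corresponds to the 2,000–5,000 token range.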

You pay for both input tokens (what you send to the model) and output tokens (what the model generates). Output tokens are typically 2–5x more expensive than input tokens. QuickContract does not add any markup — you pay only what your AI provider charges.
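The arithmetic behind a single request can be sketched as follows. The per-million-token prices here are hypothetical placeholders chosen only to illustrate the calculation (with output priced at 5x input); check your provider's pricing page for real figures.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Return the estimated cost in dollars for one AI request."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Hypothetical prices, for illustration only:
# $3 per million input tokens, $15 per million output tokens.
cost = estimate_cost(input_tokens=3_000, output_tokens=4_000,
                     input_price_per_million=3.0,
                     output_price_per_million=15.0)
print(f"${cost:.3f}")  # prints $0.069
```

In this example the output tokens account for most of the cost ($0.060 of $0.069), which is why longer contracts cost noticeably more than short edits.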

Cost per contract

A typical contract generation costs between $0.02 and $0.10 depending on the model and contract length. Here is a rough breakdown by activity:

Activity	Estimated cost
Generate a standard contract	$0.02 – $0.10
QuickEdit a section	$0.01 – $0.03
Assistant chat message	$0.005 – $0.02
Contract comparison	$0.03 – $0.08
Workflow run	$0.05 – $0.15
Generate clause variant	$0.01 – $0.03

These estimates assume mid-tier models such as Claude Sonnet or the GPT mini tier. Premium models (Claude Opus, top-end GPT) cost more; budget models (Llama, Gemini Flash) cost less.

$5 goes a long way

A $5 balance on OpenRouter typically lasts for weeks of normal use. That is enough to generate 50–200 contracts, run dozens of QuickEdit operations, and have many Assistant conversations. Most users top up once and do not think about costs for a long time.
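To see where a figure like that comes from, the sketch below divides a $5 budget by the low and high ends of the $0.02 to $0.10 per-contract range quoted earlier. The function name is illustrative, and amounts are in whole cents to avoid floating-point rounding surprises. It assumes the entire budget goes to contract generation, so the upper bound lands a little above the range quoted above, which also leaves room for QuickEdit and Assistant use.

```python
def contracts_per_budget(budget_cents: int, low_cents: int,
                         high_cents: int) -> tuple[int, int]:
    """Return (fewest, most) contracts a budget covers for a per-contract cost range."""
    fewest = budget_cents // high_cents  # every contract costs the high estimate
    most = budget_cents // low_cents     # every contract costs the low estimate
    return fewest, most

# $5.00 budget, contracts costing between $0.02 and $0.10 each.
fewest, most = contracts_per_budget(500, 2, 10)
print(f"$5 covers roughly {fewest} to {most} contracts")
# prints: $5 covers roughly 50 to 250 contracts
```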

Tracking your spending

When OpenRouter is your active provider, QuickContract displays your remaining balance in the sidebar. This updates automatically after each AI operation so you always know where you stand.

For Anthropic, OpenAI, and Gemini, check your usage on each provider's dashboard. These providers bill to your payment method at the end of each billing cycle or when you hit a spending threshold.

Reducing costs

Use a mid-tier model — Claude Sonnet and the GPT mini tier offer excellent quality at a fraction of the cost of premium models
Use templates — generating from a template uses fewer tokens because the AI has a clear structure to follow
Use Ollama for drafts — run a free local model for initial drafts, then switch to a cloud model for final polishing
Be specific in prompts — clear, detailed input leads to better output on the first try, avoiding costly regeneration

Free option: Ollama

If AI costs are a concern, Ollama lets you run AI models on your own hardware at no charge. There are no per-token fees and no accounts to manage. The trade-off is slower generation and somewhat lower quality compared to the best cloud models.
