v3 · enterprise observability | SOC 2 in progress

Operational intelligence for AI infrastructure.

CostLynx is the cost observability and governance layer for teams running LLMs in production — spend, tokens, attribution, anomalies, and savings, normalized across every provider you ship on.

Start tracking spend → · See the platform

API-first ingestion · No prompts or responses stored · 14-day evaluation, no card required

Spend tracked / 24h

$2,487,512

+3.8% vs 7-day avg

Inference events / sec

14,204

live · 6 providers

Avg savings detected

31.4%

across customer cohort

Anomalies / week

128 caught

$412k flagged · paused early

Anomaly detected · gpt-4o spike · staging · 23m ago

Pricing provenance · billing → 0.0024 / 1k · resolved

acme-corp / production / overview


AI spend · last 24 hours

acme-corp / production · normalized to USD · refreshes every 30s


Total spend (24h)

$48,210.44

▲ 3.8% vs 7d avg

Tokens

412.8M

▲ 1.2% · input 311M

$ / 1k tokens

$0.117

▼ 4.6% · blended across providers

Active models

14 / 18

6 providers

Spend by provider

Provider breakdown · share · spend · Δ7d

Provider · Spend · Δ 7d
OpenAI · $22,180 · +4.1%
Anthropic · $13,940 · +6.8%
Google · $7,210 · −2.4%
Azure · $4,880 · +1.2%

Operations feed · last 12h

Anomaly · staging-eu · gpt-4o spend +218% over 1h baseline. Auto-paused budget burn for review.
23m ago · finops-ops

Saving applied · switching 38% of `summarize-doc` from gpt-4o → claude-haiku, est. $1,840/mo.
2h ago · platform-team

Budget threshold · search-team · 92% of $12k cap consumed at day 18 of 30.
5h ago · alerts

Pricing override resolved · OpenAI org-rate applied, $0.0024 / 1k input.
today · contracts

01 · The problem

AI spend is the fastest-growing
line item nobody can explain.

Token bills arrive monthly. Models, providers, and prompts change weekly. The result: finance can’t attribute, engineering can’t optimize, and leadership flies blind. CostLynx closes that loop.

01 / Visibility

87%

of engineering orgs running LLMs in production cannot attribute spend to a single feature, model, or team within the current billing cycle.

Internal benchmark · n=312 platform leads · Q1 2026

02 / Drift

3.4×

median AI bill overshoot vs forecast in the quarter after launching a new agent or retrieval pipeline. Without anomaly detection, drift is a monthly invoice surprise.

CostLynx customer cohort · trailing 180 days

03 / Opportunity

31%

average modeled savings available from capability-aware model optimization and pricing-provenance corrections — left on the table when teams lack a control plane.

Savings engine simulations · 9-month window

02 · The intelligence layer

One operational layer
between your apps and
your AI providers.

CostLynx sits beside your inference path, not in it, and normalizes every event into a single usage schema. From that schema we run attribution, anomaly detection, budget enforcement, and capability-aware savings recommendations. By default, your prompts and responses never leave your stack.

Ingest

Capture every inference event.

API-first ingestion or SDK instrumentation. Idempotent dedupe. Provider sync for OpenAI billing.

POST /v1/events · 14.2k / s

sdk · python, ts, go · 3 langs

openai · billing sync · hourly
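
Idempotent dedupe can be pictured roughly as follows. This is a minimal sketch, assuming each event carries a client-generated `event_id`; that field name, the `EventStore` class, and the in-memory storage are hypothetical illustrations, not the actual CostLynx API or schema.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """Toy in-memory store that accepts each event id at most once."""
    seen: set = field(default_factory=set)
    events: list = field(default_factory=list)

    def ingest(self, event: dict) -> bool:
        """Return True if the event was stored, False if it was a duplicate."""
        key = event["event_id"]  # hypothetical dedupe key
        if key in self.seen:
            return False
        self.seen.add(key)
        self.events.append(event)
        return True


store = EventStore()
evt = {"event_id": "evt-001", "model": "gpt-4o", "input_tokens": 1200}
first = store.ingest(evt)
retry = store.ingest(evt)  # e.g. a client retry after a network timeout
```

Because the second delivery is ignored, a client can retry safely without double-counting spend.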

Normalize

One schema across every provider.

Tokens, costs, latencies, model versions, attribution tags — reconciled into a single columnar store you can query.

schema · usage.v3 · 42 fields

pricing provenance · 4 tiers

org / project / env · hierarchy
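
To make the normalization step concrete, here is a small sketch that maps two provider response shapes onto one record. The `UsageRecord` type is a hypothetical four-field stand-in, not the 42-field `usage.v3` schema; the provider field names (`prompt_tokens` / `completion_tokens` for OpenAI chat completions, `input_tokens` / `output_tokens` for Anthropic messages) follow those providers' response formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical normalized record; the real schema has many more fields."""
    provider: str
    model: str
    input_tokens: int
    output_tokens: int


def normalize(provider: str, payload: dict) -> UsageRecord:
    """Map a provider response payload onto the shared record shape."""
    usage = payload["usage"]
    if provider == "openai":
        return UsageRecord(provider, payload["model"],
                           usage["prompt_tokens"], usage["completion_tokens"])
    if provider == "anthropic":
        return UsageRecord(provider, payload["model"],
                           usage["input_tokens"], usage["output_tokens"])
    raise ValueError(f"unsupported provider: {provider}")


# Made-up sample payloads, trimmed to the fields used above.
openai_rec = normalize("openai", {
    "model": "gpt-4o",
    "usage": {"prompt_tokens": 120, "completion_tokens": 45},
})
anthropic_rec = normalize("anthropic", {
    "model": "claude-haiku",
    "usage": {"input_tokens": 200, "output_tokens": 80},
})
```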

Operate

Govern, attribute, save.

Dashboards, budgets, anomaly rules, attribution, and capability-aware savings recommendations — all reviewed before any production change.

attribution & showback · live

budgets · slack webhook · sla 5m

savings engine · v3

03 · Capabilities

What teams use day to day.

A control plane your engineers will actually open every morning — and your CFO will reference at the next board review.

Spend timeline

Every token, every model, every workload.

A single, normalized timeline of AI spend across every provider you ship on. Slice by team, environment, feature, model, or prompt template.

Anomaly engine

Catch spend drift before the invoice arrives.

Z-score detection against your own historical baseline. Delivers to Slack via webhook.
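
A z-score check against a historical baseline can be sketched like this. The threshold of 3.0, the one-sided (upward-only) test, and the sample numbers are assumptions for illustration; the production rules may differ.

```python
from statistics import mean, stdev


def z_score(baseline: list[float], observed: float) -> float:
    """Number of sample standard deviations between observed and the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline points
    if sigma == 0:
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sigma


def is_anomaly(baseline: list[float], observed: float, threshold: float = 3.0) -> bool:
    """Flag only upward drift: spend well above the baseline."""
    return z_score(baseline, observed) > threshold


# Made-up hourly spend in USD for one workload.
hourly_spend = [100.0, 104.0, 98.0, 101.0, 97.0, 103.0, 99.0, 102.0]
spike_flagged = is_anomaly(hourly_spend, 318.0)   # roughly +218% over baseline
normal_flagged = is_anomaly(hourly_spend, 102.0)  # within normal variation
```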

Attribution

Attribution that holds up in board review.

Org → project → env hierarchy with consistent tagging fields.
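
The hierarchy lends itself to a simple roll-up: each event's cost is added at the org, project, and env level. The sketch below assumes hypothetical tag names (`org`, `project`, `env`, `cost_usd`) and made-up amounts.

```python
from collections import defaultdict


def rollup(events: list[dict]) -> dict:
    """Sum cost at every level of the org -> project -> env hierarchy."""
    totals = defaultdict(float)
    for e in events:
        org, project, env = e["org"], e["project"], e["env"]
        cost = e["cost_usd"]
        totals[(org,)] += cost
        totals[(org, project)] += cost
        totals[(org, project, env)] += cost
    return dict(totals)


events = [
    {"org": "acme-corp", "project": "search", "env": "production", "cost_usd": 10.0},
    {"org": "acme-corp", "project": "search", "env": "staging", "cost_usd": 2.5},
    {"org": "acme-corp", "project": "checkout", "env": "production", "cost_usd": 5.0},
]
totals = rollup(events)
```

Keying on tuples keeps every level queryable with the same lookup, for example `totals[("acme-corp", "search")]`.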

Savings engine

Capability-aware recommendations.

Same-provider and cross-provider alternatives, with pricing provenance you can audit. Never auto-applied.

Budgets

Burn-down by team, project, or env.

Hard caps, soft thresholds, and forecasted exhaustion dates — all delivered to the right owner.
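
A forecasted exhaustion date can be approximated with a straight-line burn rate. This sketch assumes spend continues at the period's average daily rate, which is a simplification and not necessarily the forecasting model CostLynx uses; the function name is hypothetical.

```python
from typing import Optional


def forecast_exhaustion_day(cap_usd: float, spent_usd: float,
                            day_of_period: int) -> Optional[float]:
    """Project the day of the period on which the cap is reached."""
    if spent_usd <= 0 or day_of_period <= 0:
        return None  # no burn observed yet, nothing to project
    daily_burn = spent_usd / day_of_period
    return cap_usd / daily_burn


# 92% of a $12k cap consumed at day 18 of a 30-day period.
exhaustion_day = forecast_exhaustion_day(cap_usd=12_000.0,
                                         spent_usd=11_040.0,
                                         day_of_period=18)
```

With these inputs the cap is projected to run out between day 19 and day 20, well before the period ends on day 30.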

Multi-provider

OpenAI · Anthropic · Google · Azure · Bedrock · Mistral.

Apples-to-apples comparison on the same operational dataset.

Audit log

Every threshold change, every override.

SOC 2-aligned event log for procurement and security.

Exports

API-first data access.

Query spend, usage, and attribution data programmatically via the v1 API.

04 · Savings engine

Cut spend without
compromising performance.

CostLynx models capability-aware alternatives against your real traffic — then shows you what you would have spent, what you would have saved, and where evaluation says quality holds. Nothing changes in production unless you ship it.

Replay against your actual traffic.

No synthetic prompts. We simulate alternative allocations on a representative slice of your last 30 days of usage.

Capability-aware, not blind cheap.

Recommendations are gated by capability checks: tool-use, structured output, context length, latency p95.

Pricing provenance, every time.

Estimates resolve from organisation override → billing import → public list → unavailable, in that order.
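
The resolution order above amounts to a fallback chain. In this sketch the tier keys, the `resolve_price` function, and the model names and prices are illustrative assumptions; only the ordering comes from the description above.

```python
from typing import Optional

# Tiers are consulted in this order; "unavailable" is the final fallback.
PROVENANCE_ORDER = ("org_override", "billing_import", "public_list")


def resolve_price(model: str, sources: dict) -> tuple[Optional[float], str]:
    """Return (price per 1k tokens, tier that supplied it)."""
    for tier in PROVENANCE_ORDER:
        price = sources.get(tier, {}).get(model)
        if price is not None:
            return price, tier
    return None, "unavailable"


# Made-up prices for hypothetical models.
sources = {
    "org_override": {"model-a": 0.0024},
    "public_list": {"model-a": 0.0030, "model-b": 0.0010},
}
price_a = resolve_price("model-a", sources)
price_b = resolve_price("model-b", sources)
price_c = resolve_price("model-c", sources)
```

Returning the tier alongside the price is what makes an estimate auditable: the reader can see which source the number came from.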

summarize-doc · production · last 30d

simulated · n=82,440 events

Current model mix

$28,140 / mo

gpt-4o · 100% · $0.118 / 1k

median latency · 812 ms

quality (judge) · 0.92

Every provider, one source of truth.

No more reconciling six dashboards into a spreadsheet at month-end. Normalized cost, usage, latency, and pricing — across every model you ship on.

Provider · Ingestion · Pricing source · P95 latency
OpenAI · API + billing sync · org override → billing · 612 ms
Anthropic · API ingestion · org override → public · 704 ms
Google · Vertex · API ingestion · org override → public · 588 ms
Azure OpenAI · API + deployment sync · deployment rates · 642 ms
AWS Bedrock · API ingestion · CUR reconcile · daily · 724 ms
Mistral · self-host · SDK metering · unit-cost formula · 428 ms

06 · Observability map

Watch your AI spend flow
across every provider — in real time.

Every inference event in your platform — from the application that fired it to the provider that served it — is captured by CostLynx's observability mesh and normalized into spend, attribution, and anomaly signals. What you see below is the same data your on-call team watches at 3am.


Application layer

checkout-agent · 12.4k events / s · $22,180
summarize-doc · anomaly · z = 4.8 σ · $3,180
search-rag · 8.1k events / s · $13,940

CostLynx layer

cost analytics · spend · attribution · forecasts · live
anomaly detection · z-score · slack · webhook · active
budget governance · hard caps · burn-down alerts · active

Providers

OpenAI · gpt-4o · 38 models tracked · 46%
Anthropic · claude-3.5 · 12 models tracked · 29%
Google · gemini-1.5 · 22 models tracked · 15%
Azure · gpt-4o-mini · 16 models tracked
Mistral · self-host · 7 models tracked

Live events / sec · 25,084 · ▲ 1.8% · last 1h

Tracked workloads · 38 / 42 · 4 near budget threshold

p95 ingest latency · 38 ms · ▼ 6% · 7d

Saved · trailing 30d · $214.8K · across 14 active models

07 · Built for both sides of the table

Engineering ships. Finance forecasts.
One operational source.

For engineering & platform

Cost as a first-class signal next to latency and error rate.

Stop instrumenting bespoke spend metrics into Datadog. Stop building monthly “why did our OpenAI bill triple” postmortems. CostLynx gives you the same SRE-grade workflow for spend.

Token-level attribution by feature
Per-workload p95 + $ / 1k tokens
Anomaly rules via Slack webhook
Capability-aware savings simulations
Idempotent SDK ingestion
API-first — no UI lock-in

For finance & FinOps

Track AI spend with the same rigor as cloud.

Move AI off the “miscellaneous SaaS” line and into an attributable, auditable category. Showback and attribution, board-ready cost-per-feature.

Cost attribution by org / project / env
Burn-down vs monthly budget
Monthly burn-rate forecast
Savings opportunity tracking
API export
Procurement-ready audit log

08 · Governance & trust

Procurement-ready on day one.

Designed alongside the platform, security, and finance teams that have to sign off on you, not bolted on at Series B.

Security posture

Metadata-only by design.

Tokens, costs, model versions, attribution tags. No prompts or responses cross the boundary unless you opt in.

TLS 1.3 · AES-256 at rest · EU residency

Identity & access

SSO, SAML, MFA, RBAC.

Granular roles for engineering, finance, and security. Audit log of every threshold change, override, and key rotation.

SAML 2.0 · RBAC

Compliance

SOC 2 Type II in progress.

Sub-processor list, DPA, vendor security questionnaire, and pen-test summary on request.

SOC 2 II · IP · GDPR

Operations

99.95% target uptime.

Hosted on Vercel's global edge network with a status page updated within 5 minutes of any incident.

SLA · Enterprise · Audit log

09 · Pricing

Priced for the team you have today —
built for the one you’ll have next year.

Per-workspace pricing. Usage scales with events ingested, not seats. No charge for read-only stakeholders. Growth includes a 14-day free trial.

Starter

$79 / month

For teams getting their first LLM workload into production.

Start free trial

3 projects · 2 environments
500k events / month included
Overview & usage dashboards
Slack webhook anomaly alerts
Community support · 1 business day

Growth · most popular

$599 / month

For teams scaling AI usage across products and customers.

Start 14-day trial →

Unlimited projects & environments
10M events / month · overage at $0.40/M
Savings engine v3 & pricing provenance
Budgets, burn-down, anomaly detection
Slack webhook · anomaly alerts
Multi-provider connectors
Priority support · 4-hour SLA

Enterprise

Custom

For platform organizations with custom governance, SLAs, and data-residency needs.

Contact sales →

SSO (SAML), MFA, audit log
Regional data residency · EU / US
Prompt-sampling controls & strict mode
Dedicated success manager & SLA
Custom data export & ingestion lanes
Security review · DPA on request

10 · FAQ

Questions we get from
platform and finance teams.

Does CostLynx sit in my inference path?

No. CostLynx is a metering and analytics layer, not a proxy. Your application calls providers directly; CostLynx ingests usage events asynchronously. Latency to your end users is unchanged.

Do you store our prompts or responses?

By default, no. We ingest metadata only: tokens, costs, model versions, attribution tags, and timing. Optional prompt sampling is available on the Enterprise plan, with strict-mode redaction and per-project opt-in.

How accurate are the savings estimates?

Savings are modeled against a representative slice of your last 30 days of traffic and gated by capability checks. We surface a confidence band and the pricing provenance for every estimate. Nothing ships to production without your review.
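
The core arithmetic behind a modeled saving can be sketched as follows. This is a deliberately simplified illustration that assumes a flat per-1k-token price and moves a fixed share of tokens to a candidate model; the function name and the candidate price are made up, and the confidence band and capability gating are out of scope here.

```python
def modeled_savings(tokens: float, current_price: float,
                    candidate_price: float, share: float) -> tuple[float, float, float]:
    """Return (baseline cost, simulated cost, saving) in USD.

    Prices are USD per 1k tokens; share is the fraction of traffic moved.
    """
    units = tokens / 1000
    baseline = units * current_price
    simulated = units * (1 - share) * current_price + units * share * candidate_price
    return baseline, simulated, baseline - simulated


# 1M tokens at $0.118 / 1k, moving 38% to a hypothetical $0.05 / 1k model.
baseline, simulated, saved = modeled_savings(1_000_000, 0.118, 0.05, 0.38)
```

Here the baseline is about $118.00, the simulated mix about $92.16, and the modeled saving about $25.84.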

How does ingestion scale?

The ingestion API is idempotent and dedup-keyed. We process at sustained rates above 50k events/sec per workspace, with burst tolerance on top. SDKs handle local batching and back-pressure automatically.
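
Local batching with back-pressure is commonly built on a bounded buffer. The sketch below is one generic way to do it with the standard library; the `Batcher` class and its parameters are hypothetical and do not describe the actual SDK internals.

```python
import queue
from typing import Optional


class Batcher:
    """Bounded buffer that groups events into batches for upload."""

    def __init__(self, max_pending: int = 1000, batch_size: int = 100):
        self._pending = queue.Queue(maxsize=max_pending)
        self._batch_size = batch_size

    def submit(self, event: dict, timeout: Optional[float] = None) -> None:
        """Block while the buffer is full; raise queue.Full after timeout."""
        self._pending.put(event, timeout=timeout)

    def next_batch(self) -> list:
        """Drain up to batch_size events without blocking."""
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._pending.get_nowait())
            except queue.Empty:
                break
        return batch


batcher = Batcher(max_pending=3, batch_size=2)
for i in range(3):
    batcher.submit({"event_id": f"evt-{i}"})

try:
    batcher.submit({"event_id": "evt-3"}, timeout=0.01)
    rejected = False
except queue.Full:
    rejected = True  # buffer full: the producer has to slow down or drop

first_batch = batcher.next_batch()
second_batch = batcher.next_batch()
```

The bounded queue is the back-pressure mechanism: when uploads fall behind, producers wait instead of growing memory without limit.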

What does "capability-aware" actually mean?

A savings recommendation is only surfaced when the candidate model can plausibly serve the workload — tool-use, structured output, context window, and observed latency p95 are all checked against the current workload's requirements.
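
The four checks can be expressed as a single predicate. The type and field names below are hypothetical, and the sample values are made up; the sketch only shows how the gates named above combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """What a candidate model offers (hypothetical fields)."""
    tool_use: bool
    structured_output: bool
    context_window: int
    latency_p95_ms: float


@dataclass(frozen=True)
class Requirements:
    """What the current workload needs (hypothetical fields)."""
    needs_tool_use: bool
    needs_structured_output: bool
    max_context_tokens: int
    latency_p95_budget_ms: float


def can_serve(c: Capabilities, r: Requirements) -> bool:
    """A candidate qualifies only if every gate passes."""
    return (
        (c.tool_use or not r.needs_tool_use)
        and (c.structured_output or not r.needs_structured_output)
        and c.context_window >= r.max_context_tokens
        and c.latency_p95_ms <= r.latency_p95_budget_ms
    )


workload = Requirements(needs_tool_use=False, needs_structured_output=True,
                        max_context_tokens=32_000, latency_p95_budget_ms=900.0)
small_model = Capabilities(tool_use=False, structured_output=True,
                           context_window=128_000, latency_p95_ms=640.0)
short_context_model = Capabilities(tool_use=True, structured_output=True,
                                   context_window=16_000, latency_p95_ms=400.0)

small_ok = can_serve(small_model, workload)
short_ok = can_serve(short_context_model, workload)
```

The second candidate is faster but is rejected because its context window is smaller than the workload needs.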

Can we self-host?

Self-hosted CostLynx is available on the Enterprise plan via private deployment in your cloud (AWS, GCP, Azure). The control plane and data plane both run in your VPC; we provide upgrade tooling and a managed support tier.

See your AI spend
by tomorrow morning. Literally.

A working dashboard in under an hour. No rip-and-replace. No proxy. No prompts stored.

Start tracking spend → · Book a 20-min walkthrough
