Cost Tracking

Proxle automatically calculates the cost of every LLM request, giving you visibility into where your money goes.

How It Works

1. After each request, Proxle extracts token counts from the response.
2. The token counts are matched against the provider pricing table for the model used.
3. Cost is calculated as: (input_tokens * input_price + output_tokens * output_price) / 1,000,000, where each price is expressed per 1,000,000 tokens.
4. The cost and token counts are stored alongside the request.
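
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not Proxle's implementation: the function name `estimate_cost` is hypothetical, the prices in the usage lines are placeholders chosen to make the arithmetic easy to follow, and prices are assumed to be per 1,000,000 tokens, as the divisor implies.

```python
from decimal import Decimal

# The formula divides by 1,000,000, so prices are assumed to be per million tokens.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the cost of one request (hypothetical helper, not a Proxle API).

    input_price and output_price are prices per 1,000,000 tokens.
    Decimal is used so that small per-request amounts do not pick up
    binary floating-point rounding noise when they are summed later.
    """
    input_price = Decimal(str(input_price))
    output_price = Decimal(str(output_price))
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


# Placeholder prices, not real provider pricing.
cost = estimate_cost(input_tokens=1200, output_tokens=300,
                     input_price="2.50", output_price="10.00")
print(cost)
```

With these placeholder numbers, the input side contributes 1200 * 2.50 = 3000 and the output side 300 * 10.00 = 3000, so the request costs 6000 / 1,000,000 of the price unit.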

Confidence Levels

| Level | Description |
|-------|-------------|
| Exact | Token counts came from the provider's response (e.g., OpenAI usage field) |
| Approximate | Token counts were estimated using tiktoken or character-based approximation |

Most OpenAI and Anthropic requests have exact confidence. Cohere and Gemini may use approximations depending on the response format.
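
To make the two levels concrete, here is a minimal sketch of how a token count could be labeled. It is an assumption-laden illustration rather than Proxle's actual logic: the function name `count_with_confidence` is hypothetical, and the fallback uses a rough rule of thumb of about 4 characters per token, which is only one possible character-based approximation.

```python
import math

# Assumed heuristic for the fallback path: roughly 4 characters per token.
# Real tokenizers vary by model and language, so this is only an estimate.
CHARS_PER_TOKEN = 4


def count_with_confidence(reported_tokens, text):
    """Return (token_count, confidence_level) for a piece of text.

    reported_tokens is the count taken from the provider's response,
    or None when the response did not include one.
    """
    if reported_tokens is not None:
        # The provider told us the count, so nothing is estimated.
        return reported_tokens, "exact"
    # No provider count available: fall back to a character-based estimate.
    estimated = math.ceil(len(text) / CHARS_PER_TOKEN)
    return estimated, "approximate"


print(count_with_confidence(42, "ignored when a count is reported"))
print(count_with_confidence(None, "Hello, world!"))
```

The second call has no reported count, so the 13-character string is estimated at 4 tokens and labeled approximate.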

Cost Attribution by Feature

Use the metadata.feature field in your SDK calls to attribute costs to features:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    metadata={"feature": "chat_assistant"}
)

Then view the breakdown in Dashboard > Analytics > Cost by Feature.

Cost Attribution by User

Use the metadata.user_id field to track per-user costs:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    metadata={"user_id": "user_123"}
)

View per-user costs in Dashboard > Analytics > Cost by User.
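
Conceptually, both breakdowns amount to grouping request costs by a metadata key. The sketch below shows that grouping on a handful of in-memory records. Everything in it is hypothetical: the record shape, the `cost_by` helper, the `summarizer` feature name, and the idea that `feature` and `user_id` can appear in the same metadata dictionary are assumptions made for illustration, not a description of Proxle's storage or API.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical stored requests: a cost plus the metadata sent with the call.
records = [
    {"cost": Decimal("0.006"),
     "metadata": {"feature": "chat_assistant", "user_id": "user_123"}},
    {"cost": Decimal("0.002"),
     "metadata": {"feature": "chat_assistant"}},
    {"cost": Decimal("0.010"),
     "metadata": {"feature": "summarizer", "user_id": "user_123"}},
]


def cost_by(records, key):
    """Sum request costs per value of one metadata key."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in records:
        # Requests sent without the key are grouped under a placeholder label.
        label = record.get("metadata", {}).get(key, "(untagged)")
        totals[label] += record["cost"]
    return dict(totals)


print(cost_by(records, "feature"))
print(cost_by(records, "user_id"))
```

Requests sent without the relevant key end up in the `(untagged)` bucket here, which is one reason to set the metadata consistently on every call.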

Analytics Dashboard

The analytics page shows:

Total cost over a time period (7, 14, or 30 days)
Cost by provider - Which providers cost the most
Cost by model - Which models cost the most
Cost by feature - Which features are most expensive
Cost by user - Which users generate the most cost
Cost over time - Daily cost trends
Cache savings - Money saved through cache hits

Supported Models and Pricing

Proxle maintains a pricing table for all supported models:

OpenAI

GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4, GPT-3.5-turbo
o1, o1-mini, o3-mini

Anthropic

Claude Opus 4, Claude Sonnet 4, Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus, Claude 3 Haiku

Cohere

Command R+, Command R, Command, Command Light

Google Gemini

Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash, Gemini 1.0 Pro

Azure OpenAI

GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4, GPT-3.5-turbo

Pricing is updated regularly to reflect provider price changes.
