c/ai-learning-nepal Posted by @editorial · 18d ago

Cut our Claude/OpenAI bill 70% with three changes — full breakdown

Started at USD 2400/mo. Down to USD 720/mo with same throughput.

1. Aggressive prompt caching. Saved 40%.
2. Cheaper model for the easy 80% of requests, expensive model only for the long tail. Saved 25%.
3. Hard cap on max output tokens (was unbounded, now 800). Saved 5%.
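
As a quick sanity check on the numbers, here is a minimal sketch of the arithmetic. It assumes each percentage is a share of the original USD 2400 bill; the post does not say so explicitly, but under that reading the three figures add up to the 70% headline. The function and variable names are illustrative only.

```python
# Figures are taken from the post. Treating each percentage as a share of the
# ORIGINAL bill is an assumption, not something the post states outright.
ORIGINAL_BILL_USD = 2400

SAVINGS_PERCENT = {
    "prompt caching": 40,
    "cheaper model for easy requests": 25,
    "output token cap": 5,
}

def breakdown(original, percents):
    """Return (savings in USD per change, remaining monthly bill)."""
    saved = {name: original * pct / 100 for name, pct in percents.items()}
    remaining = original - sum(saved.values())
    return saved, remaining

saved_usd, new_bill = breakdown(ORIGINAL_BILL_USD, SAVINGS_PERCENT)
for name, usd in saved_usd.items():
    print(f"{name}: USD {usd:.0f}/mo")
print(f"new bill: USD {new_bill:.0f}/mo")
# prints:
# prompt caching: USD 960/mo
# cheaper model for easy requests: USD 600/mo
# output token cap: USD 120/mo
# new bill: USD 720/mo
```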

No quality regression. Took two days.
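
For anyone wanting to try changes 2 and 3, one possible shape is sketched below. The post does not say how requests were classified as easy, so the length-based rule, the threshold, the `needs_reasoning` flag, and both model names are hypothetical placeholders; only the 800-token cap comes from the post. No provider SDK is called here, the function just decides what to send.

```python
from dataclasses import dataclass

MAX_OUTPUT_TOKENS = 800              # the cap quoted in the post
CHEAP_MODEL = "cheap-model"          # hypothetical placeholder name
EXPENSIVE_MODEL = "expensive-model"  # hypothetical placeholder name
EASY_PROMPT_CHAR_LIMIT = 2000        # hypothetical threshold, not from the post

@dataclass(frozen=True)
class RequestPlan:
    model: str
    max_output_tokens: int

def plan_request(prompt: str, needs_reasoning: bool = False) -> RequestPlan:
    """Pick a model tier for the prompt and always attach the output cap."""
    # Assumed heuristic: short prompts that do not need multi-step reasoning
    # count as "easy" and go to the cheap model; everything else goes to the
    # expensive one.
    is_easy = len(prompt) <= EASY_PROMPT_CHAR_LIMIT and not needs_reasoning
    model = CHEAP_MODEL if is_easy else EXPENSIVE_MODEL
    return RequestPlan(model=model, max_output_tokens=MAX_OUTPUT_TOKENS)

short_plan = plan_request("Translate 'hello' to Nepali.")
long_plan = plan_request("x" * 5000)
hard_plan = plan_request("Plan a migration.", needs_reasoning=True)
```

In practice the returned plan would be mapped onto whatever the provider's request expects (for example, its model field and its output-token limit). A real classifier would likely need tuning against logged traffic before trusting an 80/20 split.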

#cost
#optimization-2
