Google Gemini 2.0 Flash - solid API performance, great quality, and cheaper than gpt-4o-mini. The new workhorse?

Published: February 23, 2025


LLM usage hierarchy for my day-to-day live use cases

Point to note: gpt-4o and claude-3.5-sonnet API pricing is approx. 15x that of gpt-4o-mini.

And now: gemini-2.0-flash is giving excellent results in early testing for my use cases. Earlier Gemini models had higher API call failure rates and issues with structured output, but that seems to have changed. Stress testing now. If Gemini's API reliability and structured output consistency hold up, I may move to gemini-2.0-flash as my primary model and start migrating clients.
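The structured-output stress test can be sketched as a small harness that repeatedly calls the model, parses the response as JSON, and counts how often it conforms to the expected schema. This is a minimal sketch with a hypothetical stub standing in for the real Gemini API call (in practice that would be a wrapper around the live endpoint with a JSON response type requested); the prompt, schema keys, and run count are illustrative assumptions, not the author's actual test setup.

```python
import json

# Hypothetical stand-in for a real Gemini API call. A live version would
# invoke the gemini-2.0-flash endpoint and return the raw text response.
def call_model_stub(prompt: str) -> str:
    return '{"ticker": "ABC", "sentiment": "positive"}'

# Keys the structured output must contain (assumed schema for illustration).
REQUIRED_KEYS = {"ticker", "sentiment"}

def stress_test(call_fn, prompt: str, runs: int = 20) -> float:
    """Return the fraction of runs that produced valid structured output."""
    ok = 0
    for _ in range(runs):
        try:
            data = json.loads(call_fn(prompt))
            if REQUIRED_KEYS.issubset(data):
                ok += 1
        except (json.JSONDecodeError, TypeError):
            pass  # malformed or non-JSON response counts as a failure
    return ok / runs

rate = stress_test(call_model_stub, "Classify this headline; reply as JSON.")
print(f"structured-output success rate: {rate:.0%}")
```

Swapping `call_model_stub` for a real client function makes this a quick consistency check before committing a client workload to a new model.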

Typical use cases

Requiring LLM API calls or agent setups:

As an aside, coding is separate:

What's new with gemini-2.0-flash?

Production release: first week of Feb. Based on initial live testing with my typical use cases:

API pricing

| Model | USD per 1M input tokens | USD per 1M output tokens | Free tier |
|---|---|---|---|
| gpt-4o-mini | 0.15 | 0.60 | none |
| gpt-4o | 2.50 | 10.00 | none |
| claude-3.5-sonnet | 3.00 | 15.00 | none |
| gemini-2.0-flash | 0.10 | 0.40 | 15 req/min, 1M tokens/min, 1,500 req/day |

gemini-2.0-flash is now priced below gpt-4o-mini.
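The pricing table translates directly into a per-call cost comparison. A minimal sketch, using the published per-1M-token prices from the table above; the 2,000-input / 500-output token counts are illustrative assumptions, not measured workloads.

```python
# Published prices from the table, USD per 1M tokens: (input, output).
PRICES = {
    "gpt-4o-mini":       (0.15, 0.60),
    "gpt-4o":            (2.50, 10.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "gemini-2.0-flash":  (0.10, 0.40),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call at the listed per-1M-token rates."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Example: a call with 2,000 input tokens and 500 output tokens.
for model in PRICES:
    print(f"{model:18s} ${cost_usd(model, 2000, 500):.6f} per call")
```

At these rates a 2,000-in / 500-out call costs $0.0004 on gemini-2.0-flash versus $0.0006 on gpt-4o-mini, and $0.01 on gpt-4o, which is where the roughly 15x multiple over mini comes from.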

Want to test yourself?

Try my open-source AI analytics apps, no API key needed, live on rex.tigzig.com. Compare results across LLMs. Source code is available on the site in the Help and Build sections.

1. Multi (sequential) agentic app for advanced analytics

Compare reasoning between o3-mini, flash-2.0, and deepseek-r1. Sample files available on the site. Temporary Postgres databases created on the fly, no login required. Tigzig Analyzer

2. AI-enabled mutual fund portfolio processor

Compare file schema identification between gpt-4o and other LLMs with live runs. Use your choice of mutual fund monthly file or a sample file from Google Drive (link in Help section). Tigzig MF Portfolio Processor

Blog Migration Notice: Some links or images in earlier posts may be broken. View the original post on the old blog site.