
Gemini 3 Pro Added to Database AI Suite. Tested Against Claude Sonnet 4.5 and GPT-5.1.

Results: Claude still leads. GPT-5.1 is solid. Gemini 3 Pro lands third.

Updated: 2d



▪️ Performance Scores

Multi-step database analysis workflows:

Claude Sonnet 4.5: 115

GPT-5.1: 100

Gemini 3 Pro: 90


90-tier: Gemini 2.5 Pro, GPT-4.1, KIMI 2 Thinking

85-tier: Gemini 2.5 Flash, Qwen 3 Max, GLM 4.6, DeepSeek R1

80-tier: Gemini 2.0 Flash


▪️ Model Findings

- Claude Sonnet 4.5: Creates multiple segmentation variables beyond the prompt. Quality leader.

- GPT-5.1: Strong. Token bloat gone vs GPT-5. Clear second.

- Gemini 3 Pro: Similar to Gemini 2.5 Pro. Better explanations, same output quality. Doesn't match top two.



▪️ What Was Tested

Credit card analysis: 1M customer + 10M transaction tables (AWS RDS MySQL). Multi-step workflow - plan, summarize, create derived variables, merge, segment profiles.

Weighted ranking of Indian credit card issuers from RBI data (Postgres).
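To make the multi-step credit card workflow concrete, here is a minimal sketch of the summarize, derive, merge, segment sequence. It runs on an in-memory SQLite database rather than AWS RDS MySQL, and the table names, columns, toy rows and the 1000 spend threshold are all assumptions for illustration, not the schema or logic used in the actual test.

```python
import sqlite3

# Hypothetical schema and toy rows; the real test used 1M customers and 10M transactions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE transactions (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 25), (2, 40), (3, 61);
INSERT INTO transactions VALUES
    (1, 120.0), (1, 80.0),
    (2, 1500.0), (2, 2500.0), (2, 2000.0),
    (3, 40.0);
""")

# Summarize: collapse transactions to one row per customer.
conn.execute("""
CREATE TABLE txn_summary AS
SELECT customer_id,
       COUNT(*)    AS txn_count,
       SUM(amount) AS total_spend
FROM transactions
GROUP BY customer_id
""")

# Merge + derive: join the summary to customers and add derived variables.
# The 1000 cut-off for spend_segment is an arbitrary illustrative threshold.
conn.execute("""
CREATE TABLE customer_profile AS
SELECT c.customer_id,
       c.age,
       s.txn_count,
       s.total_spend,
       s.total_spend / s.txn_count AS avg_ticket,
       CASE WHEN s.total_spend >= 1000 THEN 'high' ELSE 'low' END AS spend_segment
FROM customers c
JOIN txn_summary s ON s.customer_id = c.customer_id
""")

# Segment profiles: one summary row per segment.
segment_profile = conn.execute("""
SELECT spend_segment,
       COUNT(*)        AS customers,
       AVG(avg_ticket) AS avg_ticket
FROM customer_profile
GROUP BY spend_segment
ORDER BY spend_segment
""").fetchall()

for row in segment_profile:
    print(row)
```

Each step is a separate query against an intermediate table, which is roughly why an execution agent ends up issuing several queries per question in a workflow like this.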



▪️ Practical Choices

High quality → Claude Sonnet 4.5

Balance → GPT-4.1

Low cost, great value → Gemini 2.5 Flash / Gemini 2.0 Flash



▪️ Cost Breakdown

-- Planning vs Execution

Single iteration: the reasoning LLM runs once (20% of cost), and the execution agent runs 7-10 queries with debugging (80% of cost).


-- Reasoning / Planning Cost (Per 100 Questions)

High: Claude, GPT-5.1, Gemini 3 Pro (~$7.00)

Mid: GPT-4.1, KIMI 2, Qwen, GLM, DeepSeek (~$2.50)

Budget: Gemini 2.5 Flash (~$1.50), Gemini 2.0 Flash ($0.30)

Avoid: Gemini 2.5 Pro ($10) and GPT-5 ($15) - token bloat.


-- Execution Cost (Per 100 Questions)

- Advanced analysis: ~$12.50 (GPT-4.1, consistent across reasoning models)

- Single-step: $0.40 (GPT-4.1-mini) to ~$3.50 (GPT-5.1)
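As a rough worked example, the planning and execution figures above can be combined into a total per 100 multi-step questions. This sketch assumes the total is simply planning cost plus the ~$12.50 advanced-analysis execution cost; the dictionary keys and function names are made up for illustration, and the dollar values are the approximate ones quoted in this post.

```python
# Approximate planning cost per 100 questions (USD), copied from the tiers above.
PLANNING_COST = {
    "high (Claude, GPT-5.1, Gemini 3 Pro)": 7.00,
    "mid (GPT-4.1, KIMI 2, Qwen, GLM, DeepSeek)": 2.50,
    "budget (Gemini 2.5 Flash)": 1.50,
    "budget (Gemini 2.0 Flash)": 0.30,
}

# Approximate execution cost per 100 advanced-analysis questions (GPT-4.1 agent).
EXECUTION_COST_ADVANCED = 12.50


def total_cost(planning: float, execution: float = EXECUTION_COST_ADVANCED) -> float:
    """Assumed total: planning cost plus execution cost, per 100 questions."""
    return planning + execution


def planning_share(planning: float, execution: float = EXECUTION_COST_ADVANCED) -> float:
    """Fraction of the assumed total that goes to the reasoning/planning model."""
    return planning / total_cost(planning, execution)


for tier, cost in PLANNING_COST.items():
    print(f"{tier}: total ${total_cost(cost):.2f}, "
          f"planning share {planning_share(cost):.0%}")
```

Under these assumptions the planning share moves with the reasoning model chosen, while execution stays the larger part of the bill in every tier.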


I use OpenAI for SQL execution - more reliable. Multi-step workflows multiply costs fast. Use only when needed.


These costs reflect my typical usage. Your numbers will vary based on context, architecture & output volume. Always test with your use case. Always check actual billing, not token-based estimates.



▪️ DATS-4: Database AI Suite - v4

Remote database AI app. Postgres or MySQL. Single-step queries or multi-step analysis. Python charts, table upload, PDF reports. Open source, live, free.


-- Try It

Use Sample button. Data loads to temp Postgres. Or connect your database.

Public app routes through my backend - sandbox only. Deploy on your servers for live work.


▪️ Resources

Database AI Field Guide - usage, architecture, process flows

Previous post: GPT-5.1 + KIMI 2 evaluation with short video

Live apps: app.tigzig.com

Posts & guides - AI for Analytics: tigzig.com



▪️ Direct access to the main database AI apps:

→ Remote Database (Postgres/MySQL): app.tigzig.com/analyzer

→ In-Browser (DuckDB, local processing): https://app.tigzig.com/sql-rooms

→ ChatGPT + Your Database (connect any Postgres/MySQL): ChatGPT here

→ ChatGPT + Fixed Data (Simultaneous connection across 3 Databases): ChatGPT here.


▪️ Related resources - Database AI

→ Implementing Database AI: Field Guide

49-page PDF - architecture, agent setups, cost analysis

Read Here

→ In-Browser Database AI

1.5 GB files: process locally with DuckDB

Read here

→ 10 Options for Talking to your databases

10 micro-apps across 5 categories - Remote, In-Browser, ChatGPT, Voice, Rapid Deploy

Read here


 
 
