Chat, Query, and Transform Multi-GB Files - In Natural Language, Right in Your Browser with DuckDB.
Published: October 17, 2025
I analyzed a 1.6 GB, 11-million-row CSV file on a custom build of SQL Rooms + DuckDB. The entire analysis, both querying and charts, ran 100% locally in the browser. No backend. No remote DB. And fast.
Built on the open-source SQL Rooms (sqlrooms.org) framework from Foursquare, customized for my analytics workflows.
What you can do
- Process hundreds of MBs, millions of rows - fully local
- Merge, append, transform, analyze, and visualize with AI
- Chat with your data in natural language
- No remote database or upload
- File types: CSV, TSV, pipe-delimited, Parquet
- Export tables
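For readers curious what loading and exporting these file types looks like at the SQL level, here is a minimal sketch that builds DuckDB statements as strings. It relies on DuckDB's `read_csv` and `read_parquet` table functions and its `COPY ... TO` statement; the helper names `load_sql` and `export_sql`, and all table and file names, are hypothetical. The app itself may load files quite differently.

```python
from pathlib import PurePosixPath
from typing import Optional


def _quote(text: str) -> str:
    """Escape single quotes for use inside a SQL string literal."""
    return text.replace("'", "''")


def load_sql(table: str, filename: str, delim: Optional[str] = None) -> str:
    """Build a DuckDB CREATE TABLE statement for one file (hypothetical helper).

    Parquet files go through read_parquet; everything else is treated as
    delimited text. With delim=None, DuckDB's CSV reader is left to detect
    the delimiter itself; pass delim to set it explicitly.
    """
    path = _quote(filename)
    if PurePosixPath(filename).suffix.lower() == ".parquet":
        source = f"read_parquet('{path}')"
    elif delim is None:
        source = f"read_csv('{path}')"
    else:
        source = f"read_csv('{path}', delim='{_quote(delim)}')"
    return f'CREATE TABLE "{table}" AS SELECT * FROM {source};'


def export_sql(table: str, filename: str) -> str:
    """Build a DuckDB COPY statement that writes a table out as Parquet."""
    return f"COPY \"{table}\" TO '{_quote(filename)}' (FORMAT parquet);"
```

For example, `load_sql("raw", "raw.txt", delim="|")` yields a statement that reads a pipe-delimited file, and `export_sql("raw", "raw.parquet")` yields the matching export.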
Live App
Part of the broader suite of Database AI tools - 10 open source micro-apps for querying, transforming, and analyzing databases in natural language. From remote databases to voice AI.
To Try It
- Get a free Gemini API key from aistudio.google.com (or use OpenAI / Claude). Keys stay fully local in your browser.
- Explore with the default Tour de France dataset & starter prompts
- Or try Cricket Data (100 MB - 1.6 GB) on my drive
- Upload your own files, run queries, and visualize directly in the browser
Is it 100% Local? What goes out?
- File data remains in the browser's DuckDB instance.
- The LLM receives the schema, sample rows, and your query results.
- API calls go directly from the browser to the LLM provider (except OpenAI, which is routed via a proxy for CORS).
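To make that data flow concrete, here is a rough sketch of how a prompt context could be assembled from a table's schema and a few sample rows, while the full table stays behind in the browser. Everything in it is illustrative: `build_llm_context`, its parameters, and the payload shape are assumptions, not SQL Rooms' actual implementation.

```python
import json


def build_llm_context(table, schema, rows, sample_size=3):
    """Return the JSON text an app might send to the LLM for one table.

    schema: list of (column_name, column_type) pairs
    rows:   the full table as a list of dicts; only the first
            `sample_size` rows are copied into the payload
    """
    payload = {
        "table": table,
        "schema": [{"name": name, "type": col_type} for name, col_type in schema],
        "sample_rows": rows[:sample_size],
    }
    return json.dumps(payload, indent=2)


# Made-up data for illustration: five rows in the table, three in the payload.
example_rows = [{"striker": f"player_{i}", "runs_off_bat": i} for i in range(5)]
example_context = build_llm_context(
    "balls",
    [("striker", "VARCHAR"), ("runs_off_bat", "INTEGER")],
    example_rows,
)
```

The point of the design is that the payload size depends on `sample_size` and the column count, not on how many rows the file has.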
Can it be 100% local?
Yes - the original SQL Rooms repo supports Ollama for fully offline use.
Memory & Limits
Handles multiple files of a few hundred MBs each with ease. For larger volumes, performance depends on your device memory. DuckDB-WASM defaults can be tuned if your hardware supports it.
Code & Docs
Docs: on the app site, hit "Docs" at the top
Original Project (Credit)
Architected by Ilya Boyandin
For those testing or customizing
- App Architecture file for AI Coders (humans welcome): Included in the GitHub repo - explains architecture, modifications, and deployment details for quick reuse.
- Sample Datasets:
  - Consolidated Cricket Data (2003–2025, Test + ODI + T20 + Others): 1.6 GB, 11 million rows.
  - Individual match files: 100–300 MB each, for testing multi-file merges.
- Example Prompt:
  - This is ball-by-ball data i.e. one row = one ball. For ODI and T20, find Top 10 strikers by total runs off the bat, with total number of matches, total number of overs, and run rates per match/over, and share in table format. Separate tables for T20 and ODI.
  - Share two charts, one for each match type – dual axis – with run rate per over and run rate per match – one as a bar and the other as a line.
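As a rough idea of the kind of SQL an LLM might generate for the first prompt, here is a small runnable sketch. It uses Python's built-in `sqlite3` as a stand-in for DuckDB so it runs without any installs. The table name `balls` and the columns `match_id`, `match_type`, `striker`, and `runs_off_bat` are assumed, as is the simplification that every row counts as a legal ball (six per over). The sample rows are made up purely for illustration.

```python
import sqlite3

# Assumed schema: one row per ball. All names here are hypothetical.
QUERY = """
SELECT striker,
       SUM(runs_off_bat)                                  AS total_runs,
       COUNT(DISTINCT match_id)                           AS matches,
       COUNT(*) / 6.0                                     AS overs,
       SUM(runs_off_bat) * 1.0 / COUNT(DISTINCT match_id) AS runs_per_match,
       SUM(runs_off_bat) * 6.0 / COUNT(*)                 AS runs_per_over
FROM balls
WHERE match_type = ?
GROUP BY striker
ORDER BY total_runs DESC
LIMIT 10
"""


def top_strikers(conn, match_type):
    """Return up to 10 strikers for one match type, highest total runs first."""
    return conn.execute(QUERY, (match_type,)).fetchall()


def demo_connection():
    """Create an in-memory table with a few made-up balls."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE balls (match_id INTEGER, match_type TEXT, "
        "striker TEXT, runs_off_bat INTEGER)"
    )
    rows = [
        (1, "ODI", "A", 4), (1, "ODI", "A", 2), (1, "ODI", "B", 1),
        (2, "ODI", "A", 6), (2, "ODI", "B", 0), (2, "ODI", "B", 3),
        (3, "T20", "C", 6), (3, "T20", "C", 1),
    ]
    conn.executemany("INSERT INTO balls VALUES (?, ?, ?, ?)", rows)
    return conn
```

Calling `top_strikers(demo_connection(), "ODI")` and then again with `"T20"` gives the two separate tables the prompt asks for. The same SELECT should carry over to DuckDB with little or no change, though a real dataset would likely need extras such as wides and no-balls excluded from the ball count.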