
Chat, Query, and Transform Multi-GB Files - In Natural Language, Right in Your Browser with DuckDB.

Updated: 5 days ago


I analyzed a 1.6 GB, 11-million-row CSV file on a custom build of SQL Rooms + DuckDB. The entire analysis, queries and charts included, ran 100% locally in the browser. No backend. No remote DB. And fast.

Built on the open-source SQL Rooms (sqlrooms.org) framework from Foursquare, customized for my analytics workflows.

📌 What you can do

▸ Process hundreds of MBs, millions of rows - fully local

▸ Merge, append, transform, analyze, and visualize with AI

▸ Chat with your data in natural language

▸ No remote database or upload

▸ File types: CSV, TSV, pipe-delimited, Parquet

▸ Export tables
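
The file types above map onto DuckDB's built-in `read_csv` and `read_parquet` table functions. Here is a minimal sketch of how a load statement could be assembled for each type. It assumes the file is already registered with the browser's DuckDB instance and has a header row. `buildLoadSql`, the helper names, and the table/file names are hypothetical and not taken from the SQL Rooms code.

```typescript
// Hypothetical helper: builds a DuckDB statement that loads one file into a table.
// read_csv and read_parquet are DuckDB table functions; all other names are illustrative.
type FileKind = "csv" | "tsv" | "pipe" | "parquet";

// Delimiter per text format (Parquet needs none).
// "\t" is a real tab character, embedded as-is in the SQL string literal.
const DELIMITERS: Record<Exclude<FileKind, "parquet">, string> = {
  csv: ",",
  tsv: "\t",
  pipe: "|",
};

// Quote a SQL string literal, doubling embedded single quotes.
function quoteLiteral(s: string): string {
  return "'" + s.replace(/'/g, "''") + "'";
}

// Quote a SQL identifier, doubling embedded double quotes.
function quoteIdent(s: string): string {
  return '"' + s.replace(/"/g, '""') + '"';
}

function buildLoadSql(table: string, file: string, kind: FileKind): string {
  const source =
    kind === "parquet"
      ? "read_parquet(" + quoteLiteral(file) + ")"
      : "read_csv(" + quoteLiteral(file) +
        ", delim = " + quoteLiteral(DELIMITERS[kind]) +
        ", header = true)"; // assumption: the file has a header row
  return "CREATE OR REPLACE TABLE " + quoteIdent(table) +
    " AS SELECT * FROM " + source + ";";
}
```

For example, `buildLoadSql("balls", "odi.psv", "pipe")` yields a `CREATE OR REPLACE TABLE "balls" AS SELECT * FROM read_csv('odi.psv', delim = '|', header = true);` statement that can be passed to the DuckDB connection.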

📌 Live App at


Part of the broader suite of Database AI tools - 10 open-source micro-apps for querying, transforming, and analyzing databases in natural language. From remote databases to voice AI.

📌 To Try it:

1. Get a free Gemini API key from aistudio.google.com (or use OpenAI / Claude). Keys stay fully local in your browser.

2. Explore with the default Tour de France dataset & starter prompts

3. Or try Cricket Data (100 MB - 1.5 GB) on my drive

4. Upload your own files, run queries and visualize directly in browser

📌 Is it 100% Local? What goes out?

▸ File data remains in the browser's DuckDB instance.

▸ The LLM receives the schema, sample rows, and your query results.

▸ API calls go directly from browser → LLM (except OpenAI, routed via proxy for CORS).
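
To make the split above concrete, here is a rough sketch of the kind of context that leaves the browser: column names and types plus a handful of sample rows, never the full table. The function name, the text layout, and the field names are assumptions for illustration. This is not the actual prompt format used by SQL Rooms.

```typescript
// Hypothetical shapes for a table's schema and rows.
interface Column {
  name: string;
  type: string;
}
type Row = Record<string, string | number | null>;

// Builds the text context an LLM call might carry: schema plus a few sample rows.
// `sampleRows` is assumed to come from a small query (e.g. a LIMIT-ed SELECT),
// so the full table never has to be serialized.
function buildLlmContext(
  table: string,
  columns: Column[],
  sampleRows: Row[],
  totalRows: number
): string {
  const schemaLines = columns.map((c) => "  " + c.name + " " + c.type);
  const sampleLines = sampleRows.map((r) => JSON.stringify(r));
  return ["Table: " + table, "Columns:"]
    .concat(schemaLines)
    .concat(["Sample rows (" + sampleRows.length + " of " + totalRows + "):"])
    .concat(sampleLines)
    .join("\n");
}
```

With an 11-million-row table and three sample rows, the resulting context is only a few lines long. The rest of the data stays in the browser's DuckDB instance.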

📌 Can it be 100% local?

Yes - the original SQL Rooms repo supports Ollama for fully offline use.

📌 Memory & Limits

Handles multiple files of a few hundred MBs each with ease. For larger volumes, performance depends on your device memory. DuckDB-WASM defaults can be tuned if your hardware supports it.

📌 Code & Docs

🔗 Docs: on app site → Hit “Docs” on top

📌 Original Project (Credit)

▸ SQL Rooms AI - the one I customized: sqlrooms-ai.netlify.app

▸ SQL Rooms Example Apps & GitHub: sqlrooms.org/examples.html

▸ Release Post from Vikram Gundeti: https://lnkd.in/gfFyD-gK

▸ Architected by Ilya Boyandin


📌 For those testing or customizing:


▸ App Architecture file for AI Coders (humans welcome): Included in the GitHub repo - explains architecture, modifications, and deployment details for quick reuse.


▸ Sample Datasets:

1. Consolidated Cricket Data (2003–2025, Test + ODI + T20 + Others): 1.6 GB, 11 million rows.

2. Individual match files: 100–300 MB each, for testing multi-file merges.


▸ Example Prompt:

- This is ball-by-ball data i.e. one row = one ball. For ODI and T20, find Top 10 strikers by total runs off the bat, with total number of matches, total number of overs, and run rates per match/over, and share in table format. Separate tables for T20 and ODI.

- Share two charts, one for each match type – dual axis – with run rate per over and run rate per match – one as a bar and the other as a line.
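
The arithmetic behind that prompt can be sketched as follows. In the app the assistant would generate SQL and run it in DuckDB. Plain TypeScript over in-memory rows is used here only to spell out the aggregation. The field names are hypothetical, and overs are taken as balls faced / 6, which ignores wides and no-balls.

```typescript
// Hypothetical ball-by-ball row: one row = one ball.
interface Ball {
  matchId: string;
  matchType: string; // e.g. "ODI" or "T20"
  striker: string;
  runsOffBat: number;
}

interface StrikerStats {
  striker: string;
  runs: number;
  matches: number;
  overs: number; // assumption: balls faced / 6
  runsPerMatch: number;
  runsPerOver: number;
}

// Top `limit` strikers by total runs off the bat for one match type.
function topStrikers(balls: Ball[], matchType: string, limit: number): StrikerStats[] {
  const acc = new Map<string, { runs: number; balls: number; matches: Set<string> }>();
  for (const b of balls) {
    if (b.matchType !== matchType) continue;
    let entry = acc.get(b.striker);
    if (!entry) {
      entry = { runs: 0, balls: 0, matches: new Set<string>() };
      acc.set(b.striker, entry);
    }
    entry.runs += b.runsOffBat;
    entry.balls += 1;
    entry.matches.add(b.matchId); // distinct matches the striker batted in
  }

  const stats: StrikerStats[] = [];
  acc.forEach((e, striker) => {
    const matches = e.matches.size;
    const overs = e.balls / 6;
    stats.push({
      striker,
      runs: e.runs,
      matches,
      overs,
      runsPerMatch: e.runs / matches,
      runsPerOver: e.runs / overs,
    });
  });

  // Highest run totals first, then keep the requested number of rows.
  return stats.sort((a, b) => b.runs - a.runs).slice(0, limit);
}
```

Calling `topStrikers(rows, "T20", 10)` and `topStrikers(rows, "ODI", 10)` gives the two separate tables the prompt asks for. The `runsPerOver` and `runsPerMatch` columns are the two series for the dual-axis charts.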



 
 