Go from a 200MB flat file with 1.5M records to analysis in minutes with my open-source AI-SQL App
Published: September 15, 2025
20 years of ODI cricket stats - I'm providing the data and tools. Go run it.
1. Get the Data
- Data: 25 years of ODI cricket data from cricsheet.org processed with Tigzig tools.
- Format: Pipe-delimited raw TXT file, approx. 200MB, 1.5M records.
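If you want to sanity-check the download before loading it, a minimal sketch along these lines works; the file name is a placeholder for whatever you saved the export as.

```python
from pathlib import Path

DATA_FILE = Path("odi_ball_by_ball.txt")  # placeholder name for the downloaded file

with DATA_FILE.open(encoding="utf-8") as f:
    # Peek at the pipe-delimited header to see what columns you have.
    header = f.readline().rstrip("\n").split("|")
    print(f"{len(header)} columns: {header}")
    # Count the remaining lines without loading 200MB into memory.
    rows = sum(1 for _ in f)
    print(f"{rows:,} data rows")
```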
2. Get the Free Database
- Platform: neon.com
- Action: Go to their site and get a free, live Postgres database in seconds. No credit card required. Copy the connection credentials. This is your temporary analysis sandbox.
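Once you have the credentials, it's worth confirming connectivity before moving on. A minimal sketch, assuming the psycopg2 driver and a placeholder Neon connection string (paste your own from the Neon console):

```python
import psycopg2

# Placeholder DSN: replace with the connection string copied from the Neon console.
DSN = "postgresql://user:password@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require"

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])  # confirms the sandbox database is reachable
```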
3. Load the Data
- Platform: app.tigzig.com → Database Apps → DATS-4
- Action: Connect to your Neon DB, then upload the 200MB text file. The app handles the rest and takes approx. 2 minutes (a manual-load sketch follows this list).
- Menu → Connect to Database
- Menu → Choose File
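DATS-4 does this upload for you, but for readers who prefer to see the moving parts, a pipe-delimited load into Postgres boils down to a COPY. The table name, column list, and file name below are illustrative assumptions, not the app's actual schema:

```python
import psycopg2

DSN = "postgresql://user:password@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require"
DATA_FILE = "odi_ball_by_ball.txt"  # placeholder file name

# Illustrative schema only: the real file has many more columns.
DDL = """
CREATE TABLE IF NOT EXISTS odi_ball_by_ball (
    match_id     TEXT,
    batsman      TEXT,
    bowler       TEXT,
    runs_off_bat INTEGER,
    extras       INTEGER
);
"""

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute(DDL)
    with open(DATA_FILE, encoding="utf-8") as f:
        # Stream the pipe-delimited file straight into Postgres via COPY.
        cur.copy_expert(
            "COPY odi_ball_by_ball FROM STDIN WITH (FORMAT csv, DELIMITER '|', HEADER true)",
            f,
        )
    conn.commit()
```

Streaming through COPY is what keeps a 1.5M-row load in the couple-of-minutes range; row-by-row INSERTs would be far slower.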
4. Query with Natural Language
- Action: Just type or dictate your question.
- Example 1: "Show top 10 batsmen by runs off the bat with chart"
- Example 2: "For these, show runs, matches, overs, run rate per match & per over, with chart"
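For a sense of what the text-to-SQL step produces, Example 1 corresponds roughly to a query like the one below. The table and column names follow the hypothetical schema from the manual-load sketch above, not the app's actual output:

```python
import psycopg2

DSN = "postgresql://user:password@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require"

# Hypothetical translation of: "Show top 10 batsmen by runs off the bat"
SQL = """
SELECT batsman, SUM(runs_off_bat) AS total_runs
FROM odi_ball_by_ball
GROUP BY batsman
ORDER BY total_runs DESC
LIMIT 10;
"""

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute(SQL)
    for batsman, total_runs in cur.fetchall():
        print(f"{batsman:<25} {total_runs:>8,}")
```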
DATS-4
My open-source SQL multi-agent app. It handles Text-to-SQL, Python charting, stats, instant Postgres creation, and PDF outputs, and offers 9 reasoning models (Gemini, Claude, DeepSeek, and more).
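The charting side is ordinary matplotlib-style output. This is not the app's own code, just an illustration of the kind of bar chart a "with chart" request produces, using placeholder rows in place of real query results:

```python
import matplotlib.pyplot as plt

# Placeholder rows standing in for the (batsman, total_runs) pairs from the query above.
rows = [("Batsman A", 14000), ("Batsman B", 13500), ("Batsman C", 12800)]

names = [name for name, _ in rows]
runs = [total for _, total in rows]

plt.figure(figsize=(10, 5))
plt.barh(names[::-1], runs[::-1])  # reverse so the highest total sits at the top
plt.xlabel("Runs off the bat")
plt.title("Top batsmen by runs off the bat (ODI)")
plt.tight_layout()
plt.savefig("top_batsmen.png")
```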
Practitioner's Warning
This is a public-facing app. All credentials and API calls run through my backend server.
Rule: Use this public version for sandbox testing ONLY, with temporary databases and non-sensitive data.
For Live Use: The full source code is shared. Deploy it behind your own VPN. The current setup is low-security for open testing; live use must tighten authentication and access controls. A basic OAuth module with Auth0 is included in the source.
Where it gets messy
This example uses a file I pre-processed for rapid analysis.
Reality: it's not click-click and a report appears. It's more like bang-head, bang-head, and then a drop appears.
The Work: it needs data cleaning, semantic layers, pre-computed metrics, marts, and summary tables. AI is a powerful tool, but it doesn't replace solid data engineering - even though I use AI for data engineering too, including the pre-processing of this data.
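One concrete example of that pre-computation: a small summary mart keyed by batsman, so repeated questions like Example 2 don't rescan 1.5M rows every time. Table and column names are assumptions carried over from the earlier sketches:

```python
import psycopg2

DSN = "postgresql://user:password@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require"

# Assumed names; 6 balls per over is used as a rough overs-faced approximation.
MART_SQL = """
CREATE TABLE IF NOT EXISTS batsman_summary AS
SELECT
    batsman,
    COUNT(DISTINCT match_id)                      AS matches,
    SUM(runs_off_bat)                             AS total_runs,
    ROUND(COUNT(*) / 6.0, 1)                      AS overs_faced,
    ROUND(SUM(runs_off_bat) * 6.0 / COUNT(*), 2)  AS runs_per_over
FROM odi_ball_by_ball
GROUP BY batsman;
"""

with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
    cur.execute(MART_SQL)
    conn.commit()
```

Pointing the text-to-SQL layer at a mart like this is usually cheaper and more reliable than letting it aggregate the raw ball-by-ball table on every question.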