Intelligent AI Web Scraper in Excel with Python (xlwings Lite)

Published: November 3, 2025

Tell it what to extract - fields, filters, rules - all in plain English.

Live Tool. Paste a URL list. The app extracts the fields you asked for and applies your transforms, filters, and any data-cleaning instructions. It then writes the structured data to Excel, along with URL status, error logs, and a 30-KPI scraping dashboard.

What Intelligence does

You can share instructions in conversational English.

How it works

Jina.ai fetches the text from each URL. Gemini extracts and transforms it. xlwings Lite runs it all. Control LLM intelligence by tweaking LLM parameters: topP, temperature, max tokens, and thinking budget. Configure scrape parameters such as request delays, max retries, and timeout settings. Extraction is non-deterministic by default, because an LLM does the work. For stricter pulls, add a Python parser layer or use a hybrid of the two. Select LLMs by use case: cheap and fast for volume, slower and stronger for precision.
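To make the moving parts concrete, here is a minimal sketch of the three knobs described above: the Jina Reader URL, the Gemini generation settings, and the retry/delay loop. It is not the tool's actual code. The function names (`reader_url`, `build_gemini_payload`, `with_retries`) and all default values are hypothetical. The sketch assumes the Jina Reader convention of prefixing the target URL with `https://r.jina.ai/` and the Gemini REST `generationConfig` field names. Nothing here touches the network; the fetch function is passed in.

```python
import time

JINA_READER_PREFIX = "https://r.jina.ai/"  # Jina Reader: prefix + target URL


def reader_url(target_url: str) -> str:
    """Build the Jina Reader URL that returns a page as LLM-friendly text."""
    return JINA_READER_PREFIX + target_url


def build_gemini_payload(page_text, instructions, temperature=0.2, top_p=0.9,
                         max_tokens=2048, thinking_budget=0):
    """Assemble a generateContent request body (defaults are illustrative)."""
    return {
        "contents": [{"parts": [{"text": f"{instructions}\n\n---\n{page_text}"}]}],
        "generationConfig": {
            "temperature": temperature,        # lower = more repeatable output
            "topP": top_p,
            "maxOutputTokens": max_tokens,
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


def with_retries(fetch, url, max_retries=3, delay_s=1.0, sleep=time.sleep):
    """Call fetch(url) up to max_retries times, pausing delay_s between tries.

    Returns a status record suitable for a URL-status or error-log row.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            data = fetch(url)
            return {"url": url, "status": "ok", "attempts": attempt, "data": data}
        except Exception as exc:  # hypothetical: log every failure type
            last_error = str(exc)
            if attempt < max_retries:
                sleep(delay_s)
    return {"url": url, "status": "failed", "attempts": max_retries,
            "error": last_error}
```

Returning a status record instead of raising keeps one bad URL from stopping the whole batch, and gives you a row to write to the status and error sheets.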

How to customize

I use it as-is for common client scrapes and customize it for tougher cases. To customize, hand the code to your AI Coder with change requests. Examples: switch to OpenAI or Anthropic, capture images, follow child URLs, add a Python parser step. The foundation is xlwings Lite. I've documented my process and examples in the xlwings Practice Lab: xlwings-lite.tigzig.com. Refer to my blog posts (links below) on choosing an AI Coder.
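As an illustration of what a Python parser step could look like, here is a rough sketch that takes the LLM's JSON output and applies deterministic checks before anything is written to Excel. The field names (`title`, `price`), the helper names, and the price rule are all hypothetical. Swap in whatever your scrape actually needs.

```python
import json
import re

# Matches the first number in a string, allowing thousands separators.
PRICE_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)")


def parse_price(raw):
    """Return the first number in raw as a float, or None if there is none."""
    if raw is None:
        return None
    match = PRICE_RE.search(str(raw))
    if not match:
        return None
    return float(match.group(1).replace(",", ""))


def clean_records(llm_json, required=("title", "price")):
    """Validate LLM output; return (clean_rows, error_messages)."""
    try:
        records = json.loads(llm_json)
    except json.JSONDecodeError as exc:
        return [], [f"invalid JSON: {exc.msg}"]
    if not isinstance(records, list):
        return [], ["expected a JSON list of records"]

    rows, errors = [], []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            errors.append(f"record {i}: not an object")
            continue
        missing = [f for f in required if not rec.get(f)]
        if missing:
            errors.append(f"record {i}: missing {', '.join(missing)}")
            continue
        price = parse_price(rec["price"])
        if price is None:
            errors.append(f"record {i}: unparseable price {rec['price']!r}")
            continue
        rows.append({**rec, "price": price})
    return rows, errors
```

The LLM still does the flexible extraction. The parser only enforces the rules that must hold every time, which is where the hybrid approach earns its keep.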

Live Tool, Source Code & Docs

Built with xlwings Lite by Felix Zumstein (lite.xlwings.org)

Resources

AI Coder Instruction file for xlwings Lite

xlwings Lite official site: lite.xlwings.org
