TIGZIG - Co-Analyst

xlwings Lite: Coding Guidelines (Compressed) for AI Coders (Humans Welcome)

For Analytics, Automation & Machine Learning Workflows

For any questions, drop a note at [email protected]

# SYSTEM CONSTRAINTS (MUST FOLLOW)

These are the absolute highest priority rules. You must verify your code against these before outputting.

1. NO INDEXES: Always use .options(index=False) when writing DataFrames. sheet.value = df is FORBIDDEN.
2. ROBUST SHEET DELETION: Never assume a sheet doesn't exist. Always check and delete before creating: if s.name == 'MySheet': s.delete().
3. NO AUTO-FIT: Never call .autofit(). It is not supported.
4. NO RGB TUPLES: Always use hex strings (e.g., '#F0F0F0') for colors.

Last updated: 30 November 2025

1. Introduction

This document provides coding guidelines for xlwings Lite - a browser-based Python environment running inside Excel.

2. Execution Context (Where Your Code Runs)

YOUR CODE RUNS IN:
- File: main.py inside the xlwings Lite add-in task pane (right side of Excel)
- Environment: Pyodide (Python compiled to WebAssembly, runs in a browser sandbox)
- Trigger: User clicks the play button or presses F5 to execute @script-decorated functions
- Access: Your code has direct read/write access to the active workbook via book: xw.Book
- Output: print() statements appear in the console log below the code editor
- Dependencies: Managed via the requirements.txt tab in the same task pane

3. AI Coder Checklist: Core Directives for All Scripts

Golden Rules: These 21 directives are non-negotiable and MUST be applied in every script.

  1. ALWAYS use the find_table_in_workbook() helper to locate tables.
  2. ALWAYS use .options(index=False) when writing DataFrames.
  3. ALWAYS use hex strings (e.g., '#F0F0F0') for colors, NEVER RGB tuples which will raise ValueError.
  4. ALWAYS make custom functions (@func) robust. Use typing.Any for arguments from cell references and handle data conversions safely inside the function. NEVER rely on specific type hints like float or str for cell inputs, as this will cause #VALUE! errors. See section 6.7 for the mandatory pattern.
  5. CRITICAL: NEVER use .expand() on newly written data. It runs too fast and WILL fail with an IndexError. ALWAYS define ranges explicitly using .resize(df.shape[0] + 1, df.shape[1]). This is the most common point of failure.
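Example (a minimal sketch of the explicit-range pattern; the `sheet` calls are shown as comments because they assume a live xw.Sheet inside a running @script):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Explicit target size: one header row plus the data rows, same column count.
n_rows, n_cols = df.shape[0] + 1, df.shape[1]
print(n_rows, n_cols)  # 4 2

# Inside a @script, these dimensions anchor both the write and the table range:
# sheet["A1"].options(index=False).value = df
# table_range = sheet["A1"].resize(n_rows, n_cols)
# sheet.tables.add(source=table_range)
```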
  6. ALWAYS wrap fallible operations like sheet.tables.add() in try...except.
  7. CRITICAL: NEVER access a table's parent sheet with .sheet (e.g., my_table.sheet). It will fail. The helper from Rule #1 MUST return both the sheet and the table (sheet, table = find_table...). See Section 3.1 for the mandatory pattern.
  8. CRITICAL FOR SUMMARY TABLES: When creating new tables from summary data, ensure the main descriptor column (e.g., 'Value') contains only strings. A column mixing text ('< 10k') and numbers (10000) WILL cause a silent GeneralException when .tables.add() is called. ALWAYS explicitly convert all values in such columns to strings (e.g., str(value)) before writing the data. See Section 3.1 for detailed examples of preventing mixed-type column failures.
  9. CRITICAL FOR PIVOTED SUMMARIES: When a summary requires reshaping (e.g., turning rows into columns with .unstack()), you MUST use the robust iterative 'list-of-dicts' pattern. A direct .groupby().agg() is only safe for simple, non-reshaped summaries.
  10. ALWAYS USE @script FOR RUNNABLES: Any Python function that users may execute via the xlwings Lite task pane MUST be decorated with @script, even if it's called elsewhere. Treat each user-facing functionality as a standalone runnable. NEVER omit @script on functions meant for direct execution.
  11. CRITICAL: NEVER write a 2D Python list to a single starting cell (e.g., sheet["A10"].value = [["Label", value]]). This operation is unstable in the xlwings Lite API and is the known cause of silent InvalidArgument errors that Python cannot detect. ALWAYS write small, non-DataFrame data atomically (one cell at a time). See Section 3.4 for the mandatory pattern.
  12. CRITICAL FOR PERSISTENT UI ERRORS: When a script consistently fails with a generic InvalidArgument error and produces partial or no output (e.g., only the first of several tables appears), especially if the user reports "it was working fine and suddenly stopped," you MUST stop suggesting code changes. Your immediate next step is to ask the user a targeted diagnostic question: "Have you recently manually renamed an output sheet that was created by a previous run of this same script?" This is a known cause of uncatchable API errors, and the only solution is user action. See Section 7.5.1 for the full explanation and user guidance.
  13. CRITICAL FOR SILENT UI ERRORS: If you see a generic error in the Excel UI (e.g., InvalidArgument or Invalid parameter out of range) but no specific Python error in the logs, it's almost always a data type issue during the write operation. To prevent this, you MUST explicitly convert any DataFrame column that contains mixed data types (e.g., numbers and text) to a single, consistent type (e.g., .astype(str)) BEFORE writing it to Excel. See Section 7.5.2 for the full explanation.
  14. CRITICAL: PREVENT InvalidArgument ERRORS BY AVOIDING MIXED DATA TYPES. Never mix data types within a DataFrame column. The most common cause of a silent InvalidArgument error is a column containing both numbers and empty strings (''). ALWAYS use None for missing numbers, not ''. When a column must contain text and numbers (e.g., '<10k', 10000), you MUST convert the entire column to string via .astype(str) before writing. See Section 7.5.1 for the full troubleshooting guide on this error.

    Example:

    # INCORRECT (WILL FAIL with InvalidArgument): Mixes float and string
    df['P-Value'] = [0.05, '', 0.05, '']  # -> dtype: object

    # CORRECT (Robust): Uses None, keeps column numeric
    df['P-Value'] = [0.05, None, 0.05, None]  # -> dtype: float64

    # ALSO CORRECT (Robust): Explicitly converts to string for display
    df['Value'] = ['<600', 600, 700]
    df['Value'] = df['Value'].astype(str)  # -> All values are strings
  15. CRITICAL: NEVER USE sheet.autofit() - IT'S NOT SUPPORTED. Will raise NotImplementedError. Use sheet.range().column_width = value or rely on Excel's default formatting.

  16. CRITICAL FOR SAMPLING: ALWAYS use sklearn.model_selection.train_test_split with the stratify parameter for creating balanced samples (e.g., Test/Control). This is the mandatory gold standard for ensuring representative samples in data science workflows.
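Example (a minimal sketch of the stratified split; the column names and the 70/30 mix are illustrative):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "CUST_ID": range(100),
    "SEGMENT": ["A"] * 70 + ["B"] * 30,  # imbalanced strata
})

# stratify preserves the 70/30 segment mix in both halves exactly.
test_df, control_df = train_test_split(
    df, test_size=0.5, stratify=df["SEGMENT"], random_state=42
)
print(sorted(test_df["SEGMENT"].value_counts().to_dict().items()))  # [('A', 35), ('B', 15)]
```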

  17. CRITICAL TO PREVENT InvalidArgument ERRORS: The sheet.tables.add(name=...) API call is unstable when used repeatedly in a single script run. While naming one table is often safe, creating multiple named tables in a loop or in quick succession on the same sheet WILL lead to silent crashes. To balance functionality with stability, follow this logic:
     - WHEN TO NAME (The Exception): If the user provides a specific name for a table (e.g., "create a table named LEADS_CAT"), you SHOULD use the name parameter. This is considered a low-risk, single-table operation and is often essential for the workflow.
     - WHEN TO OMIT (The Default): If the user does not provide a specific name, OR if you are creating multiple tables on a single report sheet (e.g., a numeric and categorical profile on one audit sheet), you MUST OMIT the name parameter. This is the safest default behavior. Let Excel assign the names (Table1, Table2).
     - Brittle (Will Fail in Loops/Multi-Table Reports): sheet.tables.add(source=my_range, name=f"DynamicName_{i}")
     - Robust (Always Works): sheet.tables.add(source=my_range)

  18. ALWAYS activate() the last output sheet created or modified, where applicable, to enhance user experience. This ensures users immediately see the results of the script execution instead of remaining on the original data sheet.

  19. CRITICAL: ROBUST NUMERIC CONVERSION AND IMPUTATION FOR STATISTICAL CONSISTENCY. When converting columns to numeric types and subsequently calculating statistics (like median, mean, standard deviation) for purposes such as imputation or stratification, ensure the pd.to_numeric(errors='coerce') operation is completed and materialized before calculating the statistic. Chaining .fillna(df['col'].median()) directly after pd.to_numeric() in a single line can lead to the .median() method (or other statistical functions) operating on an intermediate object dtype Series that has not yet had all non-numeric values reliably converted to np.nan. This can result in subtly incorrect imputed values or stratification bins, leading to inconsistent analytical results. Always convert, then calculate statistics, then impute.

Example:

# INCORRECT (Potential Pitfall: Median calculation may be unreliable due to chaining):
# df['MyNumericColumn'] = pd.to_numeric(df['MyNumericColumn'], errors='coerce').fillna(df['MyNumericColumn'].median())

# CORRECT (Robust and Mandatory for reliable imputation/data preparation):
df['MyNumericColumn'] = pd.to_numeric(df['MyNumericColumn'], errors='coerce')  # Step 1: Ensure column is fully numeric with NaNs

# Apply imputation logic based on requirement (e.g., conditional median, fixed value, or mean)
if df['MyNumericColumn'].isnull().any():  # Optional: only calculate/fill if NaNs are present
    impute_value = df['MyNumericColumn'].median()  # Step 2: Calculate median (or mean) on the now-clean, numeric column
    df['MyNumericColumn'].fillna(impute_value, inplace=True)  # Step 3: Fill NaNs
# For a fixed fill value (e.g., 0 for inquiry):
# df['InquiryColumn'].fillna(0, inplace=True)
  20. CRITICAL: SEPARATE DATA CALCULATION FROM EXCEL I/O. For complex scripts involving multi-step data processing and reporting, ALWAYS separate the calculation and transformation logic (e.g., creating summary DataFrames, running statistical tests) from the Excel interaction logic (e.g., writing DataFrames to ranges, adding tables, formatting cells).
     - Calculation Functions: Should take DataFrames/parameters and RETURN transformed DataFrames or Python data structures. They should NOT directly interact with xw.Sheet or xw.Range objects.
     - I/O Functions (or blocks): Should take the calculated DataFrames/structures and an xw.Sheet/xw.Range object, then perform the writing and formatting operations.
     - This separation enhances testability, modularity, and readability, and simplifies debugging by clearly isolating data processing errors from Excel API interaction errors.

  21. ALWAYS ENSURE RE-RUNNABILITY: Before creating a new sheet with book.sheets.add('Name'), you MUST explicitly check if a sheet with that name already exists and delete it. Use a robust loop-based check (if s.name == name: s.delete()) rather than if name in book.sheets, which can be flaky.
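The loop-based delete check can be sketched as follows. This is a minimal illustration using stand-in objects in place of a live workbook, so the pattern itself is testable; in a real script you would iterate book.sheets directly.

```python
def delete_if_exists(sheets, name):
    """Re-runnability pattern: loop-based check-and-delete before re-creating."""
    for s in list(sheets):  # iterate a copy, since we may delete while looping
        if s.name == name:
            s.delete()

# Minimal stand-ins for xw.Sheet / book.sheets, for illustration only.
class FakeSheet:
    def __init__(self, name, registry):
        self.name, self._registry = name, registry
    def delete(self):
        self._registry.remove(self)

sheets = []
sheets.extend(FakeSheet(n, sheets) for n in ["Data", "RESULTS"])

delete_if_exists(sheets, "RESULTS")
print([s.name for s in sheets])  # ['Data']

# In xlwings Lite the same pattern is:
# delete_if_exists(book.sheets, 'RESULTS')
# results_sheet = book.sheets.add('RESULTS')
```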

3.1 Script Robustness & Reliability

Accessing a table's parent sheet via the .sheet property is not supported in xlwings Lite and fails silently (see Rule #7). Therefore, to access a table and its parent sheet, you MUST include and use the following helper function. This function's signature, returning both the sheet and the table, is mandatory.

# THIS HELPER FUNCTION IS MANDATORY FOR ALL SCRIPTS ACCESSING TABLES.
# The import and full type hints make this snippet robust and self-contained.
# Use it as-is, copy-paste.

from typing import Tuple, Optional, Any

def find_table_in_workbook(book: xw.Book, table_name: str) -> Tuple[Optional[xw.Sheet], Any]:
    """
    Searches all sheets for a table and returns both the sheet and table objects.
    Returns: (xw.Sheet, xw.Table) or (None, None) if not found.
    """
    for sheet in book.sheets:
        if table_name in sheet.tables:
            return sheet, sheet.tables[table_name]
    return None, None

Mandatory Usage Pattern:
- Incorrect (WILL FAIL): table = find_table(...) followed by table.sheet
- Correct (Robust): source_sheet, my_table = find_table_in_workbook(book, 'MyTable')

A. The Direct Method (.groupby().agg()): For Simple, Non-Pivoted Summaries

This method is efficient and safe only when the output of the aggregation does not need to be reshaped.

Use this when: You need a simple summary where the grouping variable remains as the index (or a column after .reset_index()). Example: Calculating average balance per lead category.

# SAFE AND CORRECT for this use case.
# The output structure is simple and predictable.
summary_df = df.groupby('LEADS_CAT').agg(
    Count=('CUST_ID', 'count'),
    Avg_Balance=('CURR_BAL', 'mean')
).reset_index()

# This result is stable and can be written directly to Excel.

B. The Robust Iterative Method (list-of-dicts): Mandatory for Pivoted/Reshaped Summaries

This method MUST be used whenever the final report structure requires reshaping, such as turning unique row values into columns (pivoting). This is common in comparison reports (e.g., Test vs. Control).

Use this when: The final table's columns are derived from the values of a column in the source data (e.g., 'TEST' and 'CONTROL' columns derived from the 'GROUP' column). Example: Creating the Test vs. Control numeric profile.

# BRITTLE - DO NOT USE: The .unstack() call is fragile and will fail if a group is missing.
# failed_df = df.groupby(['GROUP', 'Variable']).agg(...).unstack()

# ROBUST AND MANDATORY for this use case.
report_rows = []
for var in numeric_vars:
    # Safely get stats for each group
    test_stats = df[df['GROUP'] == 'TEST'][var].describe()
    control_stats = df[df['GROUP'] == 'CONTROL'][var].describe()

    # Build a dictionary row-by-row, which is predictable and safe.
    report_rows.append({
        'Variable': var,
        'Metric': 'mean',
        'Test_Group_Value': test_stats.get('mean', 0),
        'Control_Group_Value': control_stats.get('mean', 0)
    })

# The final DataFrame is built from a simple, stable list structure.
final_df = pd.DataFrame(report_rows)

Incorrect (WILL CRASH): The Value key has mixed types.

report_rows = [
    {'Metric': 'BScore', 'Value': '< 600', 'Count': 10},
    {'Metric': 'BScore', 'Value': 600, 'Count': 5},  # <-- This number will cause a crash
]
df = pd.DataFrame(report_rows)
sheet["A1"].value = df
# This next line will likely cause a GeneralException
sheet.tables.add(source=sheet["A1"].expand())

Correct (Robust): All values are explicitly converted to strings.

report_rows = [
    {'Metric': 'BScore', 'Value': '< 600', 'Count': 10},
    {'Metric': 'BScore', 'Value': str(600), 'Count': 5},  # <-- Safely converted to string
]
df = pd.DataFrame(report_rows)
sheet["A1"].options(index=False).value = df
# This will now work reliably
sheet.tables.add(source=sheet["A1"].resize(df.shape[0] + 1, df.shape[1]))

A separate script-halting error comes from pd.qcut: it raises a ValueError ("Bin edges must be unique") when the data has too few unique values to form the requested number of bins. To prevent this error, you MUST use the duplicates='drop' parameter. This tells pandas to gracefully create fewer bins if necessary, rather than crashing.

# INCORRECT (Brittle): This will crash if df['SCORE'] has few unique values.
# This line WILL FAIL on certain datasets.
df['score_tertile'] = pd.qcut(df['SCORE'], 3, labels=False)

# CORRECT (Robust): This is the mandatory, professional pattern.
# By adding duplicates='drop', the script becomes robust to any data distribution.
df['score_tertile'] = pd.qcut(df['SCORE'], 3, labels=False, duplicates='drop')

Incorrect (Brittle): The P-Value key gets mixed types.

# This will create a column with mixed floats and strings, causing a crash.
report_rows = [
    {'Metric': 'mean', 'Value': 105.3, 'P-Value': 0.04},
    {'Metric': 'std', 'Value': 15.1, 'P-Value': ''},  # <-- Fails here!
]
df = pd.DataFrame(report_rows)
# This write operation will fail with 'invalid argument'.
sheet["A1"].value = df

Correct (Robust): Use None for missing numeric values.

# This creates a clean numeric column with NaN for missing values.
report_rows = [
    {'Metric': 'mean', 'Value': 105.3, 'P-Value': 0.04},
    {'Metric': 'std', 'Value': 15.1, 'P-Value': None},  # <-- Safe!
]
df = pd.DataFrame(report_rows)
# This write will succeed.
sheet["A1"].options(index=False).value = df

3.2 Formatting & Readability

# ALWAYS use hex strings for both background and font colors.
# RGB tuples for .color are not supported and WILL cause a ValueError.
header_range.color = '#F0F0F0'         # Light gray background
header_range.font.color = '#000000'    # Black text

3.3 Known Limitations to Acknowledge

3.4 Robust Data Writing: Preventing Silent InvalidArgument API Errors

The Problem: "Ghost" Errors After Successful Logs

A particularly difficult bug in xlwings Lite occurs when the Python console log shows that a script has completed successfully (✅ SUCCESS…), but a generic InvalidArgument error still appears in the Excel task pane UI. This happens when Python successfully sends a command, but the underlying Excel JavaScript API fails to execute it.

This error is most commonly triggered when writing small, non-DataFrame data structures, like Chi-Square statistics or summary values.

The Root Cause: Unstable vs. Stable Write Operations

The core issue lies in the distinction between two types of write operations: writing a full pd.DataFrame in a single operation is stable and handled reliably by the converter, while writing a raw 2D Python list to a single starting cell is unstable and can fail silently inside the Excel JavaScript API.

Mandatory Pattern: Use the Right Tool for the Write

To avoid this entire class of bugs, all write operations MUST adhere to the following patterns.

For Tabular Data (any data that is or can be a pd.DataFrame):
- ALWAYS write the entire DataFrame object in a single, efficient operation.

# CORRECT AND PROFESSIONAL (Fast, Reliable)
# This pattern is used for writing the main numeric and categorical profile tables.
summary_df = pd.DataFrame(...)
sheet["A1"].options(index=False).value = summary_df

For Small, Non-DataFrame Data (e.g., summary stats, key-value pairs):
- NEVER use the unstable 2D list write.
- ALWAYS write this data one cell at a time. This is the only guaranteed-reliable method for this specific use case.

# INCORRECT (Unstable, known to cause silent API errors)
# This was the exact cause of the bug in the Chi-Square stats write.
# stats_list = [["Chi-Square Statistic:", 1.6072], ["P-Value:", 0.8075]]
# sheet["A20"].value = stats_list

# CORRECT AND PROFESSIONAL (Always Reliable)
# This pattern is now the mandatory way to write non-DataFrame data.
chi2_value = 1.6072
p_value = 0.8075
sheet["A20"].value = "Chi-Square Statistic:"
sheet["B20"].value = chi2_value
sheet["A21"].value = "P-Value:"
sheet["B21"].value = p_value

By strictly distinguishing between these two data structures and using the correct, stable write method for each, we can ensure our scripts are robust and professional.

3.5 Best Practice: Separate Calculation from I/O

For complex scripts, adopt a two-phase structure to improve robustness and simplify debugging:

Phase 1: Calculation

Perform all data loading, cleaning, analysis, and DataFrame creation in memory. Use print() statements to log progress. At the end of this phase, you should have all your final DataFrames ready.

Phase 2: Writing

In a single, final block of code, write all the prepared DataFrames and values to Excel.

This separation prevents a failure during an early write operation from leaving the workbook in a partially updated, corrupted state. It also makes it easier to identify whether an error is in your Python logic or in the interaction with Excel.

Example:

@script
def analyze_portfolio(book: xw.Book):
    # PHASE 1: CALCULATION
    print("📊 Loading and analyzing data...")

    # Load data
    sheet, table = find_table_in_workbook(book, 'Portfolio')
    df = table.range.options(pd.DataFrame, index=False).value

    # Perform all calculations
    summary_stats = calculate_summary_stats(df)
    risk_metrics = calculate_risk_metrics(df)
    allocation_df = calculate_allocations(df)

    print("✅ All calculations complete.")

    # PHASE 2: WRITING
    print("πŸ“ Writing results to Excel...")
    try:
        # Create results sheet (ensure re-runnability: delete it if it already exists)
        for s in book.sheets:
            if s.name == 'Portfolio_Analysis':
                s.delete()
        results_sheet = book.sheets.add('Portfolio_Analysis')

        # Write all results in sequence
        results_sheet["A1"].value = "Portfolio Analysis Results"
        results_sheet["A3"].options(index=False).value = summary_stats
        results_sheet["A10"].options(index=False).value = risk_metrics
        results_sheet["A20"].options(index=False).value = allocation_df

        # Bring the user to the results (Rule 18)
        results_sheet.activate()

        print("✅ All results written successfully.")

    except Exception as e:
        print(f"❌ Error writing results: {e}")

4. Compatibility

Supported: Windows/macOS (Microsoft 365, Office 2021+), Excel on the Web (any modern browser).

6. Custom Functions

6.1 Basic Syntax

from xlwings import func

@func
def hello(name):
    return f"Hello {name}!"

Call in Excel with: =HELLO("World") or =HELLO(A1)

6.2 Working with DataFrames

import pandas as pd
from xlwings import func, arg, ret

@func
@arg("df", pd.DataFrame)
@ret(index=False, header=False)
def correl2(df):
    return df.corr()

6.3 Type Hints Support

from xlwings import func
import pandas as pd

@func
def myfunction(df: pd.DataFrame) -> pd.DataFrame:
    return df

6.4 Variable Arguments

import pandas as pd
from xlwings import func, arg

@func
@arg("*args", pd.DataFrame, index=False)
def concat(*args):
    return pd.concat(args)

6.5 Documentation

from xlwings import func, arg

@func
@arg("name", doc='A name such as "World"')
def hello(name):
    """This is a classic Hello World example"""
    return f"Hello {name}!"

6.6 Date/Time Handling

import datetime as dt
from xlwings import func

@func
@arg("date", dt.datetime)
def process_date(date):
    return date

# For multiple dates in a range
import xlwings as xw
@func
def process_dates(dates):
    return [xw.to_datetime(d) for d in dates]

# For DataFrames with dates
@func
@arg("df", pd.DataFrame, parse_dates=[0])
def timeseries_start(df):
    return df.index.min()

6.7 Robustness and Debugging for Custom Functions

CRITICAL FOR AI CODERS: The most common point of failure for custom functions (@func) is the #VALUE! error. This is almost always caused by a data type mismatch between the Excel cell and the function's argument type hint. This "pre-emptive type conversion" by the xlwings engine occurs before your Python code's try...except block can catch the error.

6.7.1 The Mandatory Pattern for Robust Custom Functions

To prevent these failures, all functions that accept arguments from Excel cell references MUST follow this pattern:

  1. Use typing.Any as the type hint. This instructs the xlwings engine to pass the value as-is, without attempting a risky pre-conversion.
  2. Perform data type conversions safely inside the function using a robust helper function.

Incorrect (Brittle) Approach: Fails if value is blank or text.

@func
def my_function(value: float):
    # This code is never reached if the pre-conversion fails.
    return value * 2

Correct (Robust) Approach: Handles any input gracefully.

from typing import Any

def _to_float(value: Any, default_if_error: float) -> float:
    """Safely converts any value (int, str, None) to a float."""
    if value is None:
        return default_if_error
    try:
        return float(value)
    except (ValueError, TypeError):
        return default_if_error

@func
def my_function(value: Any):
    # The conversion is now handled safely inside our code.
    numeric_value = _to_float(value, 0.0)
    return numeric_value * 2

6.7.2 Troubleshooting #VALUE! in Custom Functions

Follow this exact debugging sequence:

  1. Check for Syntax Errors: First, confirm the function is recognized by Excel's autocomplete. If not, there is likely a syntax error in main.py (e.g., a misplaced import) preventing the file from loading. Test by replacing the entire file with a minimal function.

  2. Use the Minimal Debug Function: To diagnose data type issues, use this universal test function. It bypasses all conversion issues and reports exactly what Python receives.
from typing import Any

@func
@arg("CELL_VALUE", doc="A single cell to test.")
def final_debug_test(CELL_VALUE: Any) -> str:
    """Receives any value and reports its type and string representation."""
    try:
        value_type = type(CELL_VALUE).__name__
        str_value = str(CELL_VALUE)
        return f"Success! Type is '{value_type}', Value is '{str_value}'"
    except Exception as e:
        return f"Error: {e}"
  3. Analyze and Implement: Apply the =FINAL_DEBUG_TEST(A1) formula. The output (Success! Type is 'int', …) will reveal the data types. Refactor the failing function using the mandatory robust pattern from section 6.7.1.

6.8 Why Custom Scripts (@script) Avoid This Issue

This data type issue primarily affects custom functions (@func) and not scripts (@script) due to their fundamentally different data processing models:

Aspect | Custom Script (@script) | Custom Function (@func)
--- | --- | ---
Data Flow | Bulk Operation (entire tables/ranges) | Scalar Operation (one cell per argument)
Conversion Engine | pandas DataFrame converter | Direct "Argument Marshalling" Bridge
Robustness | High. Pandas is designed to handle messy, mixed-type data and infer column dtypes without crashing. | Low (by default). Prone to failure if a cell's type doesn't match the argument's type hint.
Solution | Use pd.to_numeric after loading data into a DataFrame. | Use the typing.Any pattern to handle conversion manually.
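To illustrate the @script-side solution from the table above, here is a minimal sketch. The column name and values are illustrative; in a real script the DataFrame would come from a table or range read.

```python
import pandas as pd

# Messy cell values as they might arrive from a worksheet range.
df = pd.DataFrame({"BALANCE": [100.0, "n/a", "250", None]})

# Bulk, column-level coercion: anything non-numeric becomes NaN,
# leaving a clean float column that is safe to aggregate or write back.
df["BALANCE"] = pd.to_numeric(df["BALANCE"], errors="coerce")
print(df["BALANCE"].dtype)  # float64
```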

7. Custom Scripts

7.1 Basic Syntax

Custom Scripts in xlwings Lite are Python functions that run at the click of a button and have access to the Excel object model. They are equivalent to VBA Subs or Office Scripts.

import xlwings as xw
from xlwings import script

@script
def hello_world(book: xw.Book):
    sheet = book.sheets[0]
    sheet["A1"].value = "Hello xlwings!"

7.2 Running Scripts

7.3 Sheet Buttons

@script(button="[xlwings_button]Sheet1!B4", show_taskpane=True)
def hello_world(book: xw.Book):
    # your code here
    pass

Note: Button clicks change cell selection, so don't use for scripts that depend on selected cells.

7.4 Configuration Options

@script(
    include=["Sheet1", "Sheet2"],  # Only include these sheets' content
    exclude=["BigData"],  # Exclude these sheets' content
    button="[mybutton]Sheet1!A1",  # Sheet button configuration
    show_taskpane=True  # Show taskpane when button clicked
)
def configured_script(book: xw.Book):
    ...

7.5 Tips and Troubleshooting

7.5.1 InvalidArgument Troubleshooting (Diagnose in Order)

Cause 1: Mixed Data Types (Most Common)
→ See Section 3.1 for examples. Use None not '', or .astype(str) for mixed columns.

Cause 2: Sheet Renaming Contamination
Symptom: Script was working, now fails after the user manually renamed an output sheet (e.g., RESULTS → RESULTS_v1).
- Fix: Ask the user to delete the renamed sheet and re-run
- Prevention: Use "Move or Copy → Create a copy" instead of renaming

Cause 3: API Instability (Last Resort)
Symptom: Script logs success, but the Excel UI shows a generic error OR object of type 'int' has no len(). This specific error indicates the Excel JS API has returned an invalid handle.

If Causes 1 & 2 don't apply, force an API sync before tables.add() or .number_format:

sheet["Z1"].value = "SYNC"  # Force write
_ = sheet["Z1"].value       # Force read roundtrip
sheet.tables.add(source=my_range)  # Now this works
sheet["Z1"].value = ""      # Cleanup

8. Comprehensive Guide to Limitations & Unsupported Features

This section provides a consolidated overview of all known limitations in xlwings Lite as of June 2025. Understanding these constraints is crucial for effective development.

8.1 Pyodide and Environment Constraints

xlwings Lite runs on Pyodide, which imposes several environment-level restrictions:
- Python Version: The Python version is fixed by the specific Pyodide distribution used in the add-in.
- Memory Limit: There is a 2GB memory limit for the Python environment.
- Debugging: There is no debugger support. Use print() statements to the Output Pane for debugging.
- Concurrency: multiprocessing and threading are not supported.
- Package Availability: Only packages that are pure Python or have been specifically compiled for the Pyodide environment can be used. Check the official Pyodide packages list for availability.
- Network Connections: Direct TCP/IP sockets are not available. This means:
  - No direct connections to databases like PostgreSQL, MySQL, etc. (must use a web API layer).
  - All HTTP requests are subject to browser CORS (Cross-Origin Resource Sharing) policies.

8.2 Detailed Unsupported API Features (Anti-Hallucination List)

The following standard xlwings methods are NOT implemented in Lite. Do not attempt to use them.

Object | Unsupported Methods/Properties (Common)
--- | ---
App | quit(), screen_updating, calculation, display_alerts
Book | save(), to_pdf(), close()
Sheet | copy(), autofit(), used_range (use the find_table helper instead), select()
Range | copy(), paste(), merge(), unmerge(), autofill(), formula_array, to_png()
Chart | set_source_data() (must recreate the chart to update), to_png/pdf
Shape | delete(), duplicate()
Table | table.sheet (fails silently - use the find_table_in_workbook() helper)
Font | Can SET (bold, color, size), but CANNOT GET/READ any font properties
General | .api property (no direct access to the Excel JavaScript API)

General Rule: If you get NotImplementedError, the feature isn't available in Lite. Check the xlwings Lite docs for current status.

8.3 Planned Future Enhancements

The following features are on the development roadmap but are not yet available as of June 2025.

Note: When users request unavailable features, guide them to use available workarounds, consider alternative approaches, and watch for updates in newer versions.

9. Connecting to External Data & APIs

This section details how xlwings Lite interacts with external data sources, including web APIs and databases. Due to its browser-based environment (Pyodide), direct database connections are not supported; all interactions must occur via web APIs.

9.1 Working with Web APIs

xlwings Lite supports common Python HTTP libraries and Pyodide's native pyfetch for making web requests.

  1. Supported Libraries:

    • requests: For synchronous HTTP requests.
    • httpx, aiohttp: For asynchronous HTTP requests (requires async/await syntax).
    • pyfetch: Pyodide's native asynchronous JavaScript fetch wrapper.

    Synchronous with requests:

        import requests
        response = requests.get("https://api.example.com/data")

    Async with aiohttp:

        import aiohttp

        async with aiohttp.ClientSession() as session:
            async with session.get("https://api.example.com/data") as response:
                data = await response.json()

  2. Handling API Responses:

    • For FastAPI servers returning file responses, use await response.text() to extract content.
    • Pipe-delimited data (common in FastAPI file responses) can be parsed by splitting lines with .split("\n") and columns with .split("|").
    • When working with RexDB server responses, process them as plain text rather than attempting to parse as JSON.
  3. Best Practices for Web API Requests:

    • Always use HTTPS for API requests.
    • Handle errors gracefully with try...except blocks.
    • Log detailed error information for debugging.
    • Consider implementing request retries for reliability.
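The pipe-delimited parsing described under "Handling API Responses" can be sketched as follows (the payload below is an illustrative stand-in for a real server response):

```python
# Illustrative pipe-delimited payload, as a FastAPI file response might return it.
text = "CUST_ID|CURR_BAL\n101|2500.50\n102|910.00"

# Split lines on "\n", then columns on "|".
rows = [line.split("|") for line in text.strip().split("\n")]
header, data = rows[0], rows[1:]
print(header)  # ['CUST_ID', 'CURR_BAL']
print(data)    # [['101', '2500.50'], ['102', '910.00']]
```

From here, `pd.DataFrame(data, columns=header)` yields a DataFrame ready to write to a sheet.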

9.2 Connecting to Databases via an API Layer

Direct SQL database connections are not supported in xlwings Lite (no TCP sockets in browser). All database interactions must go through a web API layer.

Options:
- Custom API: Build with FastAPI/Flask for full control
- Ready-to-Use REST APIs: PostgREST, Supabase (PostgreSQL), Oracle ORDS, MySQL REST Service, NocoDB
- SQLite: Can download a .db file and process it locally with sqlite3 (add to requirements.txt)
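A minimal sketch of the local SQLite option. An in-memory database stands in for a downloaded .db file, and the table/column names are illustrative; in practice you would first fetch the file over HTTPS and save it, then connect to it by path.

```python
import sqlite3

# After fetching and saving the .db file, connect with the stdlib sqlite3 module.
# ":memory:" stands in here for the downloaded file's path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (cust_id INTEGER, score REAL)")
conn.executemany("INSERT INTO leads VALUES (?, ?)", [(101, 0.9), (102, 0.4)])

# Query locally; the result rows can then be written to a sheet.
high = conn.execute("SELECT cust_id FROM leads WHERE score > 0.5").fetchall()
print(high)  # [(101,)]
conn.close()
```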

10. Security Best Practices

Security is paramount when working with xlwings Lite, especially given its browser-based execution environment. This section outlines best practices for managing sensitive information and securing your API interactions.

10.1 Environment Variables for Secrets

xlwings Lite runs in a secure browser sandbox and cannot directly access local system environment variables. It provides two ways to set environment variables:

Setting Environment Variables:
1. In the xlwings Lite add-in, navigate to the Environment Variables settings.
2. Provide the Name (e.g., OPENAI_API_KEY), Value (e.g., your-key), and select the desired Scope (Add-in or Workbook).
3. Click Save.
4. Restart xlwings Lite for changes to take effect.

Using Environment Variables in Code:

import os
import xlwings as xw
from xlwings import func, script

@script
def sample_script(book: xw.Book):
    key = os.getenv("OPENAI_API_KEY")
    if key is not None:
        print(key)
    else:
        raise Exception("Store your OPENAI_API_KEY key under Environment Variables!")

Important Notes:
- Add-in scope overrides Workbook scope variables if names conflict.
- Always back up important add-in scope variables.
- Restart xlwings Lite after setting new variables to ensure they are loaded.

10.2 Cross-Origin Resource Sharing (CORS)

Since xlwings Lite runs in the browser, all HTTP requests are subject to CORS policies.

Key Points:
- Requests originate from https://addin.xlwings.org (include this origin in your server's Access-Control-Allow-Origin header)
- Requests go directly from the user's browser to your API server (not through xlwings.org)
- For IP whitelisting, whitelist the client's IP, not xlwings.org servers
- Include the Authorization header in pyfetch for token-based auth
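For a custom FastAPI backend, the policy above can be expressed with FastAPI's standard CORSMiddleware. This is a minimal server-side configuration sketch; adjust the allowed methods and headers to your deployment:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow requests from the xlwings Lite add-in origin
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://addin.xlwings.org"],
    allow_methods=["GET", "POST"],
    allow_headers=["Authorization", "Content-Type"],
)
```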

11. Python Dependencies Management

Packages are managed via requirements.txt tab in the editor. Packages must be Pyodide-compatible (check list).

Version Pinning Rules (IMPORTANT):

# Pure Python packages - PIN versions
xlwings==0.33.14
requests==2.31.0

# Pyodide-provided packages - DON'T PIN (use Pyodide's version)
pandas
numpy

Notes: Restart xlwings Lite after changing packages. Private packages can use direct wheel URLs.

12. Latest Features (as of June 2025)

Recent updates have added several important capabilities:

  1. Self-Hosting Support (June 2025):

    • Build custom Docker images
    • Include additional packages
    • Self-host the add-in

  2. Sheet Button Support (May 2025):

    • Create clickable buttons on sheets
    • Configure with the button parameter
    • Requires xlwings 0.33.14+

  3. Performance Optimizations (May 2025):

    • include/exclude configuration for scripts
    • Control workbook data transfer
    • Optimize for large workbooks

  4. Font Formatting (April 2025): Can now set font properties:

    • bold, italic, color
    • size, name
    • Note: Cannot read font properties
  5. Polars Support (April 2025):

    • Native converter for Polars DataFrame and Series
    • Use import polars as pl (standard Pyodide package)
    • sheet["A1"].value = pl_df works directly without converting to pandas
    • Can offer performance gains over pandas in the browser environment

  6. Bug Fixes and Improvements:

    • Better error tracebacks in the output pane
    • Fixed Range.expand()

13. Example Scripts (Quick Reference Patterns)

Note: These examples focus on xlwings Lite-specific patterns only. For full working scripts, refer to your separate reference files.

13.1 Core Script Pattern

@script
def my_script(book: xw.Book):
    # 1. Get data from table (ALWAYS use find_table_in_workbook helper)
    sheet, table = find_table_in_workbook(book, 'MyTable')
    df = table.range.options(pd.DataFrame, index=False).value

    # 2. Process data (pure Python/pandas)
    result_df = df.groupby('Category').agg({'Value': 'sum'}).reset_index()

    # 3. Write results - delete existing sheet first for re-runnability
    for s in book.sheets:
        if s.name == 'Results': s.delete()

    new_sheet = book.sheets.add(name='Results')
    new_sheet["A1"].options(index=False).value = result_df

    # 4. Create table with EXPLICIT range sizing (NEVER use .expand() on new data)
    table_range = new_sheet["A1"].resize(result_df.shape[0] + 1, result_df.shape[1])
    try:
        new_sheet.tables.add(source=table_range)
    except Exception as e:
        print(f"Table creation failed: {e}")

    new_sheet.activate()

13.2 Formatting (Hex Colors Only)

header_range.color = '#4472C4'        # CORRECT: hex string
header_range.font.color = '#FFFFFF'
header_range.font.bold = True
# WRONG: header_range.color = (68, 114, 196)  # RGB tuples raise ValueError

13.3 Async API Pattern (pyfetch for External Data)

from pyodide.http import pyfetch

@script
async def fetch_external_data(book: xw.Book):
    response = await pyfetch(
        "https://api.example.com/data",
        method="GET",
        headers={"Accept": "application/json"}
    )
    if response.ok:
        data = await response.json()
        # Process and write to Excel...
    else:
        print(f"API Error: {response.status}")

13.4 Writing Non-DataFrame Data (Cell-by-Cell)

# WRONG: 2D list writes are unstable in Lite
# sheet["A1"].value = [["Label", value], ["Label2", value2]]

# CORRECT: Write atomically, one cell at a time
sheet["A1"].value = "Chi-Square:"
sheet["B1"].value = chi2_value
sheet["A2"].value = "P-Value:"
sheet["B2"].value = p_value

13.5 Mixed Data Type Prevention

# WRONG: Mixed types cause InvalidArgument
report_rows = [
    {'Metric': 'Score', 'Value': '< 600'},
    {'Metric': 'Score', 'Value': 600},      # Number mixed with string!
]

# CORRECT: Convert entire column to string before writing
df = pd.DataFrame(report_rows)
df['Value'] = df['Value'].astype(str)
sheet["A1"].options(index=False).value = df

13.6 Matplotlib Charts to Excel via pictures.add()

import matplotlib.pyplot as plt
import tempfile
import os

# 1. Create chart and save to temp file
fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(df['DATE'], df['CLOSE'])
temp_dir = tempfile.gettempdir()
chart_path = os.path.join(temp_dir, "my_chart.png")
fig.savefig(chart_path, dpi=150, bbox_inches='tight')
plt.close(fig)

# 2. Add to Excel sheet
charts_sheet.pictures.add(chart_path, name="MyChart", update=True, anchor=charts_sheet["A1"])
# - update=True: Replaces existing picture with same name (re-runnable)
# - anchor: Cell reference for top-left corner positioning

13.7 Reading Variable-Length Data with .expand('down')

# .expand() is SAFE for READING existing data (problematic only when WRITING new data)
names_range = sheet.range("A3").expand('down')  # Expands down to last non-empty cell
values = names_range.value  # Returns list if multiple values, single value if one cell

# Always handle both cases
if not isinstance(values, list):
    values = [values]  # Normalize to list

13.8 Robust API Data Parsing Pattern (Pipe-Delimited)

# Use this helper when handling text/pipe-delimited responses from Web APIs
# Common when using FastAPI intermediaries to avoid JSON overhead or CORS issues
def _parse_pipe_delimited(text_content):
    """Parses pipe-delimited text response into a DataFrame. Robust to empty lines."""
    lines = text_content.strip().split("\n")
    if not lines or not lines[0]:
        return pd.DataFrame()

    headers = [h.strip() for h in lines[0].split("|")]
    data_rows = [
        [cell.strip() for cell in line.split("|")]
        for line in lines[1:] if line.strip()
    ]

    df = pd.DataFrame(data_rows, columns=headers)
    # Auto-convert numeric columns (to_numeric's errors='ignore' is deprecated
    # in pandas 2.x, so convert column-by-column and skip non-numeric ones)
    for col in df.columns:
        try:
            df[col] = pd.to_numeric(df[col])
        except (ValueError, TypeError):
            pass
    return df

# Usage (inside an async @script function, since await is used):
response = await pyfetch("https://myapi.com/data")
text_content = await response.text()
df = _parse_pipe_delimited(text_content)