This is all super far away, work in progress, and very high level. However, I wanted to have a place where I'll put down my vision and let everybody comment and provide their inputs.
Modernize
- Add type hints
- Enums: use enums instead of simple strings. Should be mostly backward compatible.
- Smaller syntax changes to be more in line with Office Scripts/Google Apps Script rather than VBA
- Exception handling: sort of not really existing right now, ideally will raise the same exceptions on all platforms
- Speed: some things like value cleaning/expand() should be able to be implemented in a more efficient way with reasonable effort
Breaking Changes
-
Refactor converters as they don't play nicely with type hints, probably something like this (current behavior could remain supported for backward compatibility:
myrange.value # list of lists
myrange.options(numbers=int).value # list of list with options
myrange.to_pd() # pandas DataFrame
myrange.to_pd(index_col=0) # pandas DataFrame with options
myrange.to_pd(fmt="series") # pandas Series (not sure on the API yet)
myrange.to_pl() # polars DataFrame
# etc.
-
Require pip install "xlwings[standard]" to have the default dependencies (pywin32/appscript+psutils) installed on Windows/macOS. This will make it easier to install xlwings w/o deps in edit mode (previously done via python setup.py develop)
-
Replace Book.caller() with something like def myfunc(book: xw.Book):. This is how it's already done in xlwings Server.
-
Align DataFrame reading with pandas, i.e., set index=False, or rather, replace it with index_col, tbd if the current default behavior for column headers should remain.
-
Decide and document 0 vs 1 based indexing. Might be a breaking change. Office Scripts and Google Apps Script have taken different decisions:
// Office Scripts
mybook.getWorksheets()[0].getPosition() // 0
// Google Apps Script
mybook.getSheets()[0].getIndex() // 1
Currently, mysheet.index etc. returns a 1-based index. Probably a good idea (non-breaking change) idea would be to follow Jinja2 with .index and index0.
-
Return all integers as ints instead of floats. All engines except pywin32 and appscript do this automatically.
-
PiE returns vectors as vertical ranges while xlwings returns them horizontally. Maybe something to consider or maybe not as horizontal seem more intuitive to me:
# 2d corresponds to the Excel layout
[[1, 2],
[3, 4]]
# Taking out the first row still corresponds to the Excel layout with xlwings
[1, 2]
-
Fix constants/enums to drop the leading xl and probably make it snake_case
-
Make adding help text to args of custom functions easier than @arg(doc="..."), possibly by parsing the function's docstring using a specific convention (Google or NumPy style, or maybe support all docstring styles).
-
Switch ndim from the current behavior to ndim="natural", which leaves columns in 2d notation. The previous behavior can be activated via ndim="squeeze".
Add Features
- Add support for multi-ranges
sheet.range("A1", "C1"). This is helpful for applying single-cell properties efficiently, such as applying a background color to different areas in one go.
- Add support for reading/writing files by wrapping OpenPyXL etc. xlwings Reader is already available, only the writer is missing.
- Add support for manipulating Google Sheets via Sheets API and Excel via Graph API
- Maybe make a minimal xlwings Server a drop-in replacement for current approaches (autostart on localhost). This will enable:
- UDFs on macOS and Google Sheets
- Handling UDFs with Errors (#N/A) properly
- No more OneDrive issues as the connection doesn't depend on the path
- No more COM issues (zombie processes etc.)
- No pywin32 dependency for RunPython/UDFs (pywin32 will only be used for from-Python-to-Excel automation)
- Can be used locally or via server
- RunPython may hit limits though with size
- Better configuration in general, including support for env vars and function args in addition to current possibilities. Use a standard config file format (maybe via Pydantic).
- Add possibility to select a script to run in the add-in, similar to how it's done in Office Scripts (Automate tab).
- Decide whether the current quickstart behavior (creates a Python file with the same name in the same directory) is the best default approach.
- Make add-in play nicer with anti-virus by offering versions that don't have the dll/shell calls but instead work via xlwings Server
- Document
None vs "" for empty cells: None has been chosen so there's no additional transformation needed when turning number data into a NumPy array/DataFrame.
- Decide and document whether the current behavior of returning scalars/ 1-dim lists is the way to go or if ndim=2 should be the default for everything.
- Allow to access 'legacy' sheet types like chart sheets -- they are currently ignored in
mybook.sheets (shows only worksheets). Maybe via mybook.all_sheets.
Declutter
- Remove support for other office apps (Access etc.)
- Remove REST API
- Remove round parenthesis indexing/slicing, e.g.,
mysheet(1) in favor of mysheet[0] (There should be one-- and preferably only one --obvious way to do it)
- Remove support for legacy dynamic arrays on Windows UDFs
Refactor
- Split
main.py/ cli.py into modules
- Refactor unit tests (use pytest and make them fast)
- replace
rst with md
- refactor
setup.py to pyproject.toml (requires dll to be stored somewhere else, buth they should not have to be next to the python.exe anymore)
This is all super far away, work in progress, and very high level. However, I wanted to have a place where I'll put down my vision and let everybody comment and provide their inputs.
Modernize
Breaking Changes
Refactor converters as they don't play nicely with type hints, probably something like this (current behavior could remain supported for backward compatibility:
Require
pip install "xlwings[standard]"to have the default dependencies (pywin32/appscript+psutils) installed on Windows/macOS. This will make it easier to install xlwings w/o deps in edit mode (previously done viapython setup.py develop)Replace
Book.caller()with something likedef myfunc(book: xw.Book):. This is how it's already done in xlwings Server.Align DataFrame reading with pandas, i.e., set
index=False, or rather, replace it withindex_col, tbd if the current default behavior for column headers should remain.Decide and document 0 vs 1 based indexing. Might be a breaking change. Office Scripts and Google Apps Script have taken different decisions:
Currently,
mysheet.indexetc. returns a 1-based index. Probably a good idea (non-breaking change) idea would be to follow Jinja2 with.indexandindex0.Return all integers as ints instead of floats. All engines except pywin32 and appscript do this automatically.
PiE returns vectors as vertical ranges while xlwings returns them horizontally. Maybe something to consider or maybe not as horizontal seem more intuitive to me:
Fix constants/enums to drop the leading
xland probably make it snake_caseMake adding help text to args of custom functions easier than
@arg(doc="..."), possibly by parsing the function's docstring using a specific convention (Google or NumPy style, or maybe support all docstring styles).Switch
ndimfrom the current behavior tondim="natural", which leaves columns in 2d notation. The previous behavior can be activated viandim="squeeze".Add Features
sheet.range("A1", "C1"). This is helpful for applying single-cell properties efficiently, such as applying a background color to different areas in one go.Nonevs""for empty cells:Nonehas been chosen so there's no additional transformation needed when turning number data into a NumPy array/DataFrame.mybook.sheets(shows only worksheets). Maybe viamybook.all_sheets.Declutter
mysheet(1)in favor ofmysheet[0](There should be one-- and preferably only one --obvious way to do it)Refactor
main.py/cli.pyinto modulesrstwithmdsetup.pytopyproject.toml(requires dll to be stored somewhere else, buth they should not have to be next to the python.exe anymore)