Round 13 comparison found five shape rendering gaps:
- a:xfrm rot on standalone shapes was only applied when the shape lived
inside a wpg:wgp group; inline shapes rendered upright regardless.
Rotation now applies in both code paths.
- wps:bodyPr anchor=ctr/b vertical text alignment only worked for group
members; standalone shapes ignored it. Now applied in both paths.
- prstGeom prst=ellipse/oval rendered as a solid rectangle. Emit
border-radius:50% so the shape reads as an oval; prst=roundRect gets
a 12px radius approximation.
- a:gradFill (solid gradient) was dropped — shape appeared with no
background. Now emit CSS linear-gradient from gsLst stops (pos in
1/1000-percent) with angle converted from OOXML 60000ths to CSS deg.
Deferred: exotic prstGeom (line, arrow, callout) need SVG authoring,
documented in KNOWN_ISSUES.md as a future pass.
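
The gradient conversion above can be sketched roughly as follows. This is an illustration in Python, not the project's code: the function names are hypothetical, and the +90 offset is an assumption about how the OOXML angle convention (0 = left to right) lines up with the CSS one (0deg = bottom to top).

```python
def ooxml_angle_to_css_deg(ang: int) -> float:
    """Convert an OOXML angle (60000ths of a degree) to a CSS gradient angle."""
    # Assumption: OOXML 0 points left-to-right while CSS 0deg points upward,
    # so this sketch adds 90 degrees to align the two conventions.
    return (ang / 60000 + 90) % 360


def gradient_css(stops, ang):
    """Build a CSS linear-gradient from (pos, hex) stops.

    pos is in 1/1000 percent, as in a:gs/@pos; ang is in 60000ths of a degree.
    """
    parts = [f"#{color} {pos / 1000:g}%" for pos, color in sorted(stops)]
    return f"linear-gradient({ooxml_angle_to_css_deg(ang):g}deg, {', '.join(parts)})"
```

For example, two stops at 0 and 100000 with ang=5400000 (90 degrees, top to bottom) would come out as a 180deg CSS gradient under this assumption.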
Round 12 comparison found four picture-level visual effects that were
silently dropped in the HTML preview:
- a:xfrm rot (rotation in 60000ths of a degree) — now emits CSS
transform:rotate(Xdeg) on the <img>
- a:xfrm flipH/flipV — now emits transform:scaleX(-1) / scaleY(-1),
combined with rotate when both present
- a:ln (picture outline) — now emits CSS border with width converted
from EMU to px and srgbClr mapped to a hex color
- a:effectLst a:outerShdw — now emits box-shadow with offset/blur
computed from dir (degrees) and dist/blurRad (EMU)
Existing crop (a:srcRect) handling is preserved and effects are
composed through both the cropped and uncropped image render paths.
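
A rough sketch of how the four effects could compose into one style string, written in Python for illustration only. The function name, the shadow tuple shape, and the fixed shadow color are assumptions; the unit conversions (60000ths of a degree, 9525 EMU per CSS px) are standard.

```python
import math

EMU_PER_PX = 9525  # 914400 EMU per inch / 96 CSS px per inch


def picture_css(rot=0, flip_h=False, flip_v=False, ln_w=0, ln_color=None,
                shadow=None):
    """Return CSS declarations for rotation, flips, outline and outer shadow.

    rot is in 60000ths of a degree; ln_w is in EMU; shadow is a hypothetical
    (dir_degrees, dist_emu, blur_emu) tuple.
    """
    transforms = []
    if rot:
        transforms.append(f"rotate({rot / 60000:g}deg)")
    if flip_h:
        transforms.append("scaleX(-1)")
    if flip_v:
        transforms.append("scaleY(-1)")
    css = []
    if transforms:
        css.append("transform:" + " ".join(transforms))
    if ln_w and ln_color:
        css.append(f"border:{ln_w / EMU_PER_PX:g}px solid #{ln_color}")
    if shadow:
        direction, dist, blur = shadow
        dx = dist * math.cos(math.radians(direction)) / EMU_PER_PX
        dy = dist * math.sin(math.radians(direction)) / EMU_PER_PX
        # The shadow color is a placeholder; the real renderer may read it from the XML.
        css.append(f"box-shadow:{dx:.1f}px {dy:.1f}px {blur / EMU_PER_PX:.1f}px rgba(0,0,0,0.4)")
    return ";".join(css)
```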
Two more render gaps caught by comparison testing:
- <w:fldChar><w:ffData><w:ffCheckBox> form field checkboxes were
dropped entirely in the preview. Now emit ☑ (checked) or ☐ (unchecked)
based on w:default or w:checked state, matching Word's native glyph
in read-only previews.
- <w:w val="N"/> character horizontal scale (narrower/wider glyph
rendering) was ignored. Emit CSS transform:scaleX(N/100) with
display:inline-block so the scaled width is actually reserved.
- MergeRunProperties also merges CharacterScale now, matching the
pattern already used for Spacing, so style-inherited scale reaches
the renderer.
Deferred (complex, need dedicated work): numFmt variants beyond
decimal/lowerLetter/lowerRoman; header/footer titlePg+evenOdd;
right-aligned tab with non-dot leader; contextualSpacing boundary.
Render-comparison testing found several cell/revision rendering gaps:
- Tracked insertions (<w:ins>) previously rendered as plain text, losing
the author annotation. Now wrap in a .track-ins span with underline +
green color, with the author name in a tooltip.
- Tracked deletions (<w:del>) were dropped entirely, leaving the
reviewer unable to see what was removed. Now render the deleted text
inside a .track-del span with strikethrough + red color.
- Cell <w:textDirection> btLr/tbRl was ignored — text stayed horizontal
where Word rotates 90°. Emit CSS writing-mode:vertical-rl; btLr adds
a 180° rotation to flip the reading direction.
- Cell <w:noWrap/> was dropped — now emits white-space:nowrap so cell
content doesn't wrap.
Section <w:cols w:num="N"/> was previously ignored in the HTML preview —
all content rendered single-column regardless of the declared column
count. Now emit CSS on .page-body:
- column-count:N for num > 1
- column-rule:1px solid for w:sep="true"
- column-gap:Xpt from w:space (twips → pt)
Line-numbering (w:lnNumType) still TODO — requires per-line markers.
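
The mapping from w:cols attributes to CSS can be sketched like this (Python for illustration; the function and parameter names are hypothetical):

```python
def columns_css(num=1, sep=False, space_twips=None):
    """Build the .page-body column declarations from w:cols attributes."""
    css = []
    if num > 1:
        css.append(f"column-count:{num}")
    if sep:
        css.append("column-rule:1px solid")
    if space_twips is not None:
        # 20 twips per point
        css.append(f"column-gap:{space_twips / 20:g}pt")
    return ";".join(css)
```

With num=2, sep=True and w:space=720 this yields a two-column body with a 36pt gap.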
w:tab chars previously all rendered as a single em-space regardless of
paragraph tab stops, making 'Left\tCenter\tRight' visually collapse to
three adjacent words. Now:
- Track per-paragraph tab index in render context
- For each tab, look up the Nth declared tab stop and emit an
inline-block span with width equal to the distance from the previous
stop position
- Honor dot/hyphen/underscore leaders on positional stops via CSS
border-bottom patterns
- Fallback to 36pt (0.5in) when no stops are defined
TOC-style right-aligned dot-leader tabs still flow through the existing
dot-leader class path.
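
A simplified sketch of the width computation, in Python for illustration. It assumes stop positions are already in points and ignores the width of the text between tabs; applying the 36pt fallback to tabs beyond the last declared stop is also an assumption.

```python
DEFAULT_TAB_PT = 36.0  # 0.5in fallback when no stop is available


def tab_widths(stops_pt, tab_count):
    """Return the inline-block width in pt for each tab in a paragraph."""
    widths, prev = [], 0.0
    for i in range(tab_count):
        if i < len(stops_pt):
            # Distance from the previous stop position to the Nth stop.
            widths.append(stops_pt[i] - prev)
            prev = stops_pt[i]
        else:
            widths.append(DEFAULT_TAB_PT)
    return widths
```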
Render-comparison testing against native Word found several run-level
properties silently dropped or collapsed in HTML preview:
- Double strikethrough rendered identically to single (both as
text-decoration:line-through). Now adds text-decoration-style:double.
- Underline style variants (double/wave/dotted/dash/thick/*Heavy) all
collapsed to plain single underline. Mapped each to CSS
text-decoration-style and text-decoration-thickness.
- w:spacing (character spacing) was ignored. Emit letter-spacing in pt.
- Paragraph-add shortcut silently dropped outline/shadow/emboss/imprint/
vanish/rtl/noproof — only the run-add path honored them. Mirrored
the 7 missing handlers in the paragraph branch.
- MergeRunProperties never merged Spacing or the 6 effect props, so
even when written to XML they were dropped during effective-props
resolution and never reached the HTML renderer.
The 'Created: ... (resident started)' message now suggests running
officecli close when done, so agents/users can release the file lock
immediately instead of waiting for the 60s idle timeout.
Tables with no explicit <w:tblW> were rendered as width:100%, filling the
full page even when the <w:tblGrid> specified narrower column widths.
Native Word auto-fits such tables to content — compute width from
gridCol sum instead. Use max-width for auto layout (allows shrink),
width for fixed layout. Also handles tblW type=pct (percentage).
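
The width decision can be sketched as below (Python, illustrative only). The tbl_w pair and the function name are hypothetical; the sketch assumes gridCol widths in twips and the common encoding of pct values as fiftieths of a percent.

```python
def table_width_css(grid_cols_twips, layout="auto", tbl_w=None):
    """Pick a CSS width declaration for a table.

    tbl_w is a hypothetical (type, value) pair mirroring w:tblW, or None
    when the table has no explicit width.
    """
    if tbl_w and tbl_w[0] == "pct":
        return f"width:{tbl_w[1] / 50:g}%"   # 5000 means 100%
    if tbl_w and tbl_w[0] == "dxa":
        return f"width:{tbl_w[1] / 20:g}pt"  # twips to points
    total_pt = sum(grid_cols_twips) / 20
    # max-width lets an auto-layout table shrink; fixed layout gets a hard width.
    prop = "width" if layout == "fixed" else "max-width"
    return f"{prop}:{total_pt:g}pt"
```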
- FontMetricsReader: include hhea lineGap in ratio for accurate line height
- @font-face: add ascent-override/descent-override/line-gap-override
- Heading line-height uses font metrics ratio instead of "normal"
- Paragraph spacing collapse: subtract prev spaceAfter from spaceBefore
- contextualSpacing: suppress spacing between same-style adjacent paragraphs
- docGrid type=lines: snap line-height to linePitch multiples
- Support contextualSpacing property in set handler (paragraph + style)
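
Two of the rules above, sketched in Python for illustration. The function names are hypothetical, and both the clamp at zero and the choice to round up to the next linePitch multiple are assumptions rather than confirmed behavior.

```python
import math


def effective_space_before(space_before, prev_space_after,
                           same_style=False, contextual=False):
    """Space to emit above a paragraph, given the previous paragraph's spaceAfter."""
    if contextual and same_style:
        return 0  # contextualSpacing suppresses spacing between same-style neighbors
    # Collapse: the previous spaceAfter is already rendered, so subtract it.
    return max(0, space_before - prev_space_after)


def snap_to_grid(line_height, line_pitch):
    """Snap a line height to a multiple of docGrid linePitch (rounding up)."""
    return math.ceil(line_height / line_pitch) * line_pitch
```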
Add mergefield as a first-class field type. Usage:
officecli add doc.docx "/body/p[1]" --type mergefield --prop fieldName=CustomerName
Placeholder text defaults to «fieldName» format (e.g. «CustomerName»).
R12-2 (fuzzer, MEDIUM): sheet-level sort dispatch early-returned when
rows.Count == 0, so `sort=XFE asc` / `sort=AAAA asc` on an empty sheet
silently returned "Updated" instead of rejecting the invalid column.
Move the empty-sheet no-op inside SortRangeRows so column validation
runs first, and tighten the XFD-overflow check to fire on any length
(was >= 4), catching 3-letter overflows like XFE/ZZZ.
R12-3 (fuzzer, LOW): `sort=asc` (column letter forgotten) produced a
misleading "Sort column ASC is outside the range A:B". Reject ASC/DESC
as column tokens up-front with a targeted "direction keyword, not a
column letter" error.
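
The column-token checks can be sketched as follows (Python for illustration; the function name and all error wording except the direction-keyword phrase are hypothetical). Note that ASC would otherwise parse as a legitimate column (index 1173), which is why it has to be rejected before the letter arithmetic.

```python
MAX_COLUMN = 16384  # index of XFD, the last Excel column


def column_index(token: str) -> int:
    """Return the 1-based column index, or raise ValueError with a targeted message."""
    t = token.strip().upper()
    if t in ("ASC", "DESC"):
        raise ValueError(f"'{token}' is a direction keyword, not a column letter")
    if not (t.isascii() and t.isalpha()):
        raise ValueError(f"'{token}' is not a column letter")
    index = 0
    for ch in t:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    if index > MAX_COLUMN:
        # Checked at any length, so 3-letter overflows like XFE and ZZZ are caught.
        raise ValueError(f"Column '{token}' is past XFD")
    return index
```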
When a user passes a header name like 'Score' as a sort column, the prior
error wording ('Sort column SCORE is outside the range A:B') misled AI
agents into guessing column letters rather than recognizing that header
names are unsupported. Detect column tokens that parse past XFD with
length >= 4 and return a targeted 'Column names are not supported; use
column letters (A, B, AA, up to XFD)' message. Genuine out-of-range
letters (e.g. Z in A:B) still return the original range-error wording.
Add OOXML-compliant dual representation for SVG images:
- main a:blip/@r:embed → PNG fallback part (auto or user-supplied)
- a:blip/a:extLst asvg:svgBlip → SVG image part
Modern Office (2016+) renders the SVG; older viewers see the raster
fallback. Introduces Core/SvgImageHelper + SVG dimension parsing in
ImageSource so width/height auto-sizing matches PNG/JPEG behavior.
Supports 'fallback=<path>' prop on Add to override the 1x1 transparent
PNG default. Set (path/src) symmetrically strips/attaches the extension
and deletes the orphaned SVG part when replacing across formats.
- Sheet-level sort case now calls DeleteCalcChainIfPresent, matching the
range-level sort path. Without this, a stale calc chain could survive
the reorder and expose Excel to a mid-state repair risk on open.
- Swap ws.Elements<SortState>() -> ws.Descendants<SortState>() in the
three sort-rewrite/clear sites so malformed files that nest <sortState>
under <sheetData> are also cleaned on rewrite, instead of leaving the
nested one behind and ending with two sortStates.
R7-1: physical sort comparer switched from Ordinal to OrdinalIgnoreCase so
mixed-case keys ("Apple"/"apple") land in an order consistent with the
sortState@caseSensitive=false metadata default and with Excel's own default.
R7-2: RewriteSidecarRefsAfterSort now also rewrites ProtectedRange sqref
(7th sidecar, same cell-anchored scoping as dataValidations / CF). Single-
cell tokens inside the sort rectangle follow row movement; range tokens
and out-of-rect tokens are preserved.
R7-3: all three SortState removal sites (sheet-level clear, range-level
clear, WriteSortState) iterate Elements<SortState>().ToList() instead of
GetFirstChild, so malformed files carrying duplicate sortState children
are fully collapsed to a single (or zero) element.
R7-4 (sortHeader default) rejected again with a CONSISTENCY(sort-header-default)
comment block at the dispatch site documenting the decision history and the
preferred future path (project-wide default flip, not a per-call heuristic
warning).
RewriteSidecarRefsAfterSort handled hyperlinks, comments, threaded
comments, dataValidations, and conditionalFormatting but ignored
DrawingsPart. Pictures, shapes, and charts anchored at a row inside
the sort range stayed pinned to the original 0-indexed RowId after
the data under them was reordered, leaving them visually attached
to the wrong content row.
Now TwoCellAnchor and OneCellAnchor FromMarker/ToMarker RowIds are
remapped through oldToNewRow alongside the other sidecars. Follows
the same partial-rect scoping as dataValidations / conditional
formatting: a TwoCellAnchor is remapped only when both From and To
rows fall inside the sort rectangle; if the anchor straddles the
boundary it is preserved verbatim. OneCellAnchor has only From, so
it moves whenever From is inside. Columns are never rewritten
because sort only permutes rows.
Limitation: anchors straddling the sort rect boundary remain
authored-as-is, consistent with how multi-cell dataValidation and
CF range tokens are handled.
The sort example was previously shown as A:asc,B:desc, which fails at parse
time. The actual sort spec is space-separated column+direction, with commas
separating multiple keys (e.g. 'Salary desc' or 'Dept asc, Salary desc'). AI
agents following the wrong example hit errors on every sort call.
Sorting A1:A1 or a range whose data region collapses to zero/one
rows is a logical no-op — there is nothing to reorder. The previous
code still wrote sortState in those branches, which made Excel UI
show a sort indicator on a range that was never actually sorted.
Skip WriteSortState in the two no-op paths so the UI stays honest.
ConditionalFormatting rules anchored on single cells (e.g. a highlight
rule on A2:A2) were left pointing at the pre-sort cell after sort,
so the rule followed a row that no longer existed there. Extend the
post-sort sidecar rewrite with a ConditionalFormatting branch that
mirrors the dataValidations handling: tokenize sqref, skip multi-cell
range tokens (same partial-rect scope limitation), and remap each
single-cell token inside the sort rectangle via oldToNewRow.
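
A minimal sketch of the sqref rewrite described above, in Python for illustration. The names are hypothetical; old_to_new_row stands in for the oldToNewRow map and col_in_range for the column check against the sort rectangle.

```python
import re

SINGLE_CELL = re.compile(r"^([A-Z]+)(\d+)$")


def remap_sqref(sqref, old_to_new_row, col_in_range):
    """Remap single-cell tokens inside the sort rectangle; keep everything else."""
    out = []
    for token in sqref.split():
        m = SINGLE_CELL.match(token)
        # Range tokens such as A5:A9 do not match and are preserved verbatim.
        if m and col_in_range(m.group(1)) and int(m.group(2)) in old_to_new_row:
            token = f"{m.group(1)}{old_to_new_row[int(m.group(2))]}"
        out.append(token)
    return " ".join(out)
```

For a sort over columns A:B that swaps rows 2 and 3, "A2 A5:A9 D3" would become "A3 A5:A9 D3": the single cell follows its row, while the range token and the out-of-rect cell stay put.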
Row sort rewrote cell CellReference values in <sheetData>, but left
sidecar metadata untouched. Hyperlinks, comments, and single-cell
dataValidation sqref tokens continued to point at the old row positions —
so after sort the hyperlink/comment/validation appeared attached to a
different row of data.
Capture the old->new row mapping before mutating row indices, then rewrite
hyperlink ref, comment ref, and each single-cell token in dataValidation
sqref that falls inside the sort rectangle. Refs outside the rectangle
and multi-cell range tokens (e.g. A2:A10) that cross the sort boundary
are intentionally left untouched — splitting partial ranges would require
a more invasive rewrite.
Also rename the internal CellInRange helper to CellColumnInSortRange.
The name now accurately reflects that the check is column-only; row
containment is enforced by the caller iterating rowsInRange.
When a caller passes both merge=true and sort=... to set /Sheet1/A1:B3,
merge was applied first and wrote MergeCells into the XML, then
SortRangeRows rejected the merged region and threw, leaving the file in a
half-written state with an unwanted merge persisted.
Detect the combo at SetRange entry and throw before any write. Users who
need both must split the call. Consistent with the existing
'fail-before-write' precedent (merged-cell reject, formula reject).
The sheet-properties help block listed 'sort' but not the companion
'sortHeader' flag, even though the Set handler has consumed it since
sort landed. Add a one-line description next to sort.
double.TryParse("NaN") returns true, producing rank=0 (number), while
double.TryParse("1e999") overflows to +Infinity — also rank=0. The
resulting sort order mixed non-finite doubles with finite numbers in
ways Excel never does; Excel treats NaN / Infinity / -Infinity as
literal strings.
Classify those tokens (and any non-finite parse result) as rank=1
(string) so number/text ordering stays consistent with Excel.
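
Python's float() has the same pitfall (it accepts "NaN" and turns "1e999" into infinity), so the classification can be illustrated with a small sketch. The function name and the tuple-shaped key are hypothetical.

```python
import math


def sort_rank(raw: str):
    """Return (rank, key): rank 0 for finite numbers, rank 1 for strings."""
    try:
        value = float(raw)
    except ValueError:
        return (1, raw)
    if not math.isfinite(value):
        # NaN / Infinity / overflowed literals are treated as text.
        return (1, raw)
    return (0, value)
```

Sorting `["NaN", "10", "2", "1e999", "apple"]` with this key places the finite numbers first, followed by the three string-ranked tokens.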
ParseCellReference previously used int.Parse on the row portion of a
cell reference, which threw OverflowException on malformed inputs like
"A4294967295" (uint.MaxValue). The overflow bubbled all the way up as
an unhandled numeric exception with no document context.
Switch to long.TryParse and fold the range check into the same branch
so any row outside 1..1048576 — whether out-of-int-range or merely
out-of-Excel-range — produces a consistent ArgumentException with the
offending reference included.
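
The combined parse-and-range check, sketched in Python for illustration. Python integers do not overflow, so the long.TryParse step has no direct equivalent here; the point of the sketch is the single branch that rejects any row outside 1..1048576 with the reference in the message. Names and message wording are hypothetical.

```python
import re

MAX_ROW = 1_048_576
CELL_REF = re.compile(r"^([A-Za-z]+)(\d+)$")


def parse_cell_reference(ref: str):
    """Split a reference like 'B12' into ('B', 12), validating the row range."""
    m = CELL_REF.match(ref)
    if not m:
        raise ValueError(f"Invalid cell reference: {ref}")
    row = int(m.group(2))
    if not 1 <= row <= MAX_ROW:
        raise ValueError(f"Row out of range (1..{MAX_ROW}) in reference: {ref}")
    return m.group(1).upper(), row
```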
Previously sort refused the operation when any formula lived anywhere on
a row that overlapped the sort range, and when any row in the whole
sheet had a duplicate RowIndex. Both checks were over-broad:
- Formulas in columns outside the sort column range are unaffected by
sort (the formula text and its refs stay intact even if the row moves).
- Duplicate RowIndex rows outside the sort row range cannot cause the
sort step to lose or misplace data.
Narrow both checks to cells/rows that actually intersect the sort
range. Missing RowIndex is still always rejected because such a row
cannot be located in any range and risks silent drop by the sort scan.
Replace per-cell inset box-shadow with a single absolutely-positioned
overlay div sized to the union rect of the selected cells. The previous
approach drew the selection frame via inset box-shadow, which rendered
visibly offset from the cell's visual edge in border-collapse tables
because adjacent cells share a 1px border and shadow positioning is
relative to the padding box, not the shared border edge.
The overlay anchors inside the table so it scrolls with content
automatically; a scroll/resize listener handles edge cases.
Repro: a sheet with sheetProtection@sheet="true" could still be sorted
silently, mutating a worksheet the author explicitly protected. The
sort also leaves the protection in place, so the user's next interactive
sort in Excel will be blocked — masking the breach.
Fix: at SortRangeRows entry, check the sheet's SheetProtection. If
@sheet is true and @sort is absent/true (OOXML default: sort IS
protected), throw InvalidOperationException. Honor the escape hatch
@sort="false", which per spec means "sort is excluded from the
protected operation set" — allow the sort in that case so we do not
regress legitimate workflows.
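
The decision reduces to a small truth table, sketched here in Python for illustration. The function name is hypothetical, and the arguments are the raw attribute strings (None when absent).

```python
def sort_blocked(sheet_attr, sort_attr):
    """True when sheetProtection should make the sort throw."""
    def is_true(value):
        return value in ("1", "true")

    if not is_true(sheet_attr):
        return False  # sheet is not protected at all
    # @sort absent or true means sorting is part of the protected set.
    return sort_attr is None or is_true(sort_attr)
```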
Repro: two related malformed inputs that silently corrupted sort output.
(a) A <row> element without a RowIndex attribute — the range filter
`r.RowIndex?.Value >= dataStartRow` silently dropped that row, so
the row's data survived sort but was reassigned a stale RowIndex,
losing the user's data alignment.
(b) Two <row r="N"> entries with the same RowIndex — sort wrote two
rows into the same target slot, silently dropping one.
Fix: at SortRangeRows entry, scan SheetData once and throw
InvalidOperationException on either condition. Sorting a corrupted
layout should surface the corruption, not silently paper over it.
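
A sketch of the one-pass scan, in Python for illustration. It takes the row indices in document order, with None standing in for a missing RowIndex attribute; the function name and messages are hypothetical.

```python
def validate_rows(row_indices):
    """Raise ValueError on a missing or duplicate RowIndex."""
    seen = set()
    for position, index in enumerate(row_indices):
        if index is None:
            raise ValueError(f"row at position {position} has no RowIndex")
        if index in seen:
            raise ValueError(f"duplicate RowIndex {index}")
        seen.add(index)
```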
Repro: a sheet containing mergeCells but no autoFilter. After a sort,
WriteSortState fell back to sheetData.InsertAfterSelf(ss), producing
a child sequence of sheetData → sortState → mergeCells which violates
CT_Worksheet (per ECMA-376, sortState sits AFTER autoFilter and
BEFORE mergeCells). Strict
validators reject the document; Excel silently ignores the sortState.
Fix: thread sortState into its correct schema slot by walking the
predecessors in reverse (autoFilter, scenarios, protectedRanges,
sheetProtection, sheetCalcPr, sheetData) and InsertAfterSelf the
nearest present anchor. This places sortState between its nearest
predecessor and any successor (mergeCells, conditionalFormatting,
hyperlinks, etc.).
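
The predecessor walk can be sketched over a plain list of child element names (Python, for illustration; the function name is hypothetical and real code would operate on OpenXml elements):

```python
# Nearest-first: the first name found in the worksheet becomes the anchor.
PREDECESSORS = ["autoFilter", "scenarios", "protectedRanges",
                "sheetProtection", "sheetCalcPr", "sheetData"]


def insert_sort_state(children):
    """Return a new child list with sortState placed after its nearest predecessor."""
    children = [c for c in children if c != "sortState"]  # drop any stale copy
    for name in PREDECESSORS:
        if name in children:
            i = children.index(name)
            return children[:i + 1] + ["sortState"] + children[i + 1:]
    raise ValueError("worksheet has no sheetData")
```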
Repro: a sheet where SheetData lists rows out of RowIndex order
(e.g. <row r="3">, <row r="1">, <row r="2">) — legitimate output from
some writers or malformed edits. SortRangeRows built originalIndices
from List position (document order), so the sorted data was mapped
onto scrambled target row numbers, producing a wrong arrangement.
Fix: OrderBy(v => v) on originalIndices so sorted slots are always
assigned in ascending row order regardless of SheetData layout.
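
The effect of ordering the target slots can be shown with a small sketch (Python for illustration; the function name and the (row_index, value) pair shape are hypothetical):

```python
def assign_sorted_rows(rows, key=lambda v: v):
    """Map old row index to new row index.

    rows is a list of (row_index, value) pairs in document order, which may
    differ from RowIndex order.
    """
    # Target slots are always ascending, regardless of document order.
    slots = sorted(index for index, _ in rows)
    ordered = sorted(rows, key=lambda r: key(r[1]))
    return {old: new for (old, _), new in zip(ordered, slots)}
```

With rows listed as r=3, r=1, r=2 holding 10, 30, 20, the smallest value (in old row 3) is mapped to row 1 rather than to whatever row happened to come first in the file.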
Repro:
set A1=30 A2=10 B1==A1+1000 B2==A2+1000
set A1:B2 sort='A asc'
-> B1 silently became =A2+1000 (stale ref to old row)
Cause: SortRangeRows rewrites Cell.CellReference to the new row index
but leaves CellFormula.Text encoding the *old* relative addresses, so
Excel recalculates against wrong refs and silently produces wrong
values. Data-corruption class.
Fix: extend the existing shared-formula reject to cover any cell with
a CellFormula in the data rows (CONSISTENCY(sort-rejects-formulas)).
A full ref-rewrite (handling A1/$A$1/A:B/Sheet!A1/named ranges) is
high risk for partial-correctness regressions and deferred to v2.
Known limitation: does not catch formulas *outside* the sort range
that reference cells *inside* it; same scope as the shared-formula
check.
Before: WriteSortState always did sheetData.InsertAfterSelf(sortState).
When an autoFilter was present on the sheet, this produced the child
order [sheetData, sortState, autoFilter], which violates the
CT_Worksheet schema (autoFilter must precede sortState).
After: if an autoFilter child exists, insert sortState immediately
after it; otherwise keep the existing 'after sheetData' placement.
Repro: set xxx.xlsx '/Sheet1/A1:B10' --prop 'sort=A asc B'
Before: parsed as (A, asc) and silently dropped the trailing 'B'.
After: throws ArgumentException 'Invalid sort key ...: too many tokens.
Expected <col> [asc|desc]'.
Applies per comma-separated key entry, so 'A asc, B desc extra' also
fails loudly on the second key.
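
A sketch of the per-entry token check, in Python for illustration. The function name is hypothetical, and defaulting to ascending when the direction is omitted is an assumption based on the "<col> [asc|desc]" wording.

```python
def parse_sort_spec(spec: str):
    """Parse 'A asc, B desc' into [(column, descending), ...]."""
    keys = []
    for entry in spec.split(","):
        tokens = entry.split()
        if not tokens:
            raise ValueError("sort value cannot be empty")
        if len(tokens) > 2:
            raise ValueError(
                f"Invalid sort key '{entry.strip()}': too many tokens. "
                "Expected <col> [asc|desc]")
        direction = tokens[1].lower() if len(tokens) == 2 else "asc"
        if direction not in ("asc", "desc"):
            raise ValueError(f"Invalid sort direction '{tokens[1]}'")
        keys.append((tokens[0].upper(), direction == "desc"))
    return keys
```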
Repro: set xxx.xlsx '/Sheet1/A3:A1' --prop sort='A asc'
Before: row1>row2 scan produced an empty rowsInRange, data was not
reordered, and sortState@ref was written as the literal reversed 'A3:A1'.
After: SortRangeRows swaps col1/col2 and row1/row2 when they arrive in
max:min order. Rows reorder correctly and sortState@ref is well-formed.
Repro: set xxx.xlsx '/Sheet1/A1:B10' --prop sort='C asc'
Before: silently succeeded with no reorder and wrote a malformed
sortCondition ref pointing outside sortState@ref.
After: throws ArgumentException 'Sort column C is outside the range A:B'.
The check runs after each key is parsed in SortRangeRows, using the
normalized col1/col2 bounds.
Repro: set xxx.xlsx '/Sheet1/A1:A3' --prop sort=''
Before: silently succeeded, no sort applied, no error.
After: throws ArgumentException 'sort value cannot be empty'.
Sheet-level sort='' keeps its clear-sortState semantics (handled by the
sheet-level dispatcher before reaching SortRangeRows); the throw fires only
when the empty value arrives via a range path.
Refactor Excel row sort to follow the region-action convention used by
merge: the sort range is now encoded in the path (/Sheet1/A1:C100) rather
than a separate --prop range=... Sheet-level path auto-detects the used
range and delegates to the same SortRangeRows helper.
Correctness fixes folded into the rewrite:
- precise column-letter match (old StartsWith('A') misfired on AA)
- raw CellValue comparison (not display-formatted text) so numeric keys
compare as doubles even when the cell has a format code
- first sort key uses OrderBy (was ThenBy on a no-op identity)
- per-row sort-key materialization (was O(rows × keys × cells/row))
- reject ranges intersecting merged cells (was silent corruption)
- reject ranges with shared-formula groups (was broken ref anchors)
- sortState placed after SheetData; sortCondition@ref scoped to the key
column within the sort range
New sheet-level contract:
set data.xlsx /Sheet1/A1:C100 --prop sort='A asc, B desc' --prop sortHeader=true
set data.xlsx /Sheet1 --prop sort='A asc' (auto-detect)
The previous fix (ae51cbc) only removed RedirectStandardOutput, which
was insufficient: .NET's UseShellExecute=false always passes
bInheritHandles=TRUE to CreateProcess regardless of redirect settings,
leaking the caller's pipe handles into the resident child.
When the caller's stdout is a pipe ($(), | cat, CI, SDK wrappers),
the pipe never gets EOF until the resident exits (~60s idle timeout),
blocking the caller for the entire duration.
Fix: temporarily clear HANDLE_FLAG_INHERIT on stdin/stdout/stderr
before Process.Start, then restore immediately after. This prevents
the shell's pipe handles from being duplicated into the resident
while preserving .NET's internal handle plumbing.
- Add P/Invoke for GetStdHandle and SetHandleInformation (kernel32)
- Guard with RuntimeInformation.IsOSPlatform(Windows) — no-op on
Mac/Linux where fork+exec uses close-on-exec by default
- Keep RedirectStandardError for startup failure diagnostics
Before: time result=$(officecli create x.docx) → 61s
After: time result=$(officecli create x.docx) → 2s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>