The proposed standard was designed from the start to be capable of faithfully representing the pre-existing corpus of word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft Corporation. However, the proposed standard does not fully describe these binary formats. The referencing of unexplained backward compatibility modes poses problems for third-party implementers.
Describe the binary formats or move the references to the informative parts of the standard.
Part 4
te
Proposed Disposition of DIS 29500 Comment FI-0005 (Modified: 2008-01-13) Documenting the Microsoft Office “binary” file formats (i.e., .doc, .xls, and .ppt) (the “Binary Formats”) is not the intention or in the scope of DIS 29500. However, Ecma International discussed this subject with Microsoft Corporation. Microsoft indicated that the documentation of the Binary Formats has been available royalty-free under RAND-Z to anyone who requests it by sending an email to officeff@microsoft.com, as described at http://support.microsoft.com/kb/840817/en-us. Microsoft indicated that many companies and public institutions have asked for and received the Binary Formats since Microsoft started providing access to this documentation. Nevertheless, in response to requests for even easier access to the Binary Formats, Microsoft has agreed to remove any intermediate steps necessary to get the documentation, and will post it and make it directly available for a direct download on the Microsoft web site. Microsoft will also make the Binary Formats subject to its Open Specification Promise (see www.microsoft.com/interop/osp) by February 15, 2008. Similar Comments: NZ-0013

Does not require fixing
Firstly, the binary format specifications are freely available to everyone from Microsoft.
However, that is not the nature of the problem, although the comments suggest it is.
Even with a fully described format specification, it does not mean that you can render a document faithfully. Rendering is not a part of the specifications. To render documents identically, you need to measure how they were rendered before and copy that. That is not something written in the format specifications, which describe the syntax of the format.
Office Open XML allows for a faithful syntax. Legacy compatibility can be syntactically complete by correctly describing the legacy compatibility syntax. Describing the legacy rendering does not fall under that, as rendering is not described for any other part of the current specification either.
As the legacy compatibility syntax is normative, you cannot move it to an informative part. You could, however, move all explanations of legacy compatibility items to an informative part.
“Even with fully described format specification it does not mean that you can render document faithfully. Rendering is not a part of the specifications.”
Maybe this should be added?
@Andre
Rendering does not really have to be interchangeable as much as the information contained in the document.
Describing the exact rendering in an open standard format specification might make the spec 10 times as big.
Given that existing .doc files render differently when different print drivers/paper sizes are used, it's not clear how critical 100% perfect rendering is, or whether it is achievable at all.
Steve,
very good point. Exact, faithful, rendering is the objective of things like PDF and whatever that PDF clone that Microsoft are going to push next. I think Microsoft have tried far too hard to just wrap their legacy format with XML and thereby made a brand new legacy format.
if faithful rendering is not the goal of the specification, then why are said references to unexplained backward compatibility modes in the spec in the first place?
Regardless of rendering capability, the formats are necessary for the standard to appear complete. “Availability from Microsoft” is not the same as a standards document. I agree that rendering specifications need not be present, although this is independent of the specifications of the content to be included. Binary formats listed elsewhere, if allowed to stand, could be extended to any section in any standard, driving them towards futility.
Also, the content of such sections should be documented at least for the sake of machine-aided difference functions, which cascade into workflow concepts such as version control, security, forensics, etc.
@Hylke
You said: “if faithful rendering is not the goal of the specification, then why are said references to unexplained backward compatibility modes in the spec in the first place?”
We use it mainly to detect which files might render differently after conversion.
Even if you are not using MS Office, which might try to duplicate the past rendering for its customers, the implementation you use can easily detect the compatibility tag and inform the reader that the document is likely to look slightly different than it did in its original WordPerfect or Word 95 state.
This is similar to OpenOffice using the ODF office-setting “UseFormerLineSpacing”, which can also only be rendered by OpenOffice but can be identified by other implementations and used to inform the user.
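To illustrate what detecting the compatibility tag could look like, here is a minimal sketch in Python. It assumes the settings part of a WordprocessingML document (typically word/settings.xml inside the package) has already been read into a string, and that compatibility flags appear as children of a w:compat element in the transitional namespace. The function names find_compat_flags and compat_warning are hypothetical, and a real implementation would need to handle more cases than this.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (assumed: the transitional schema).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Attribute values that switch an on/off flag off.
OFF_VALUES = {"0", "false", "off"}


def find_compat_flags(settings_xml):
    """Return the names of compatibility flags that are switched on.

    Hypothetical helper: it only looks at direct children of w:compat
    and treats a missing w:val attribute as "on".
    """
    root = ET.fromstring(settings_xml)
    compat = root.find("{%s}compat" % W_NS)
    if compat is None:
        return []
    flags = []
    for child in compat:
        # ElementTree tags look like "{namespace}localName".
        local_name = child.tag.split("}", 1)[-1]
        value = child.get("{%s}val" % W_NS)
        if value is None or value.lower() not in OFF_VALUES:
            flags.append(local_name)
    return flags


def compat_warning(flags):
    """Build a message for the reader, or an empty string if no flags are set."""
    if not flags:
        return ""
    return (
        "This document may look slightly different than it did in its "
        "original application (compatibility settings: %s)." % ", ".join(flags)
    )


# Illustrative input, not taken from a real document.
SAMPLE = (
    '<w:settings xmlns:w="%s">'
    "<w:compat>"
    "<w:autoSpaceLikeWord95/>"
    '<w:useWord97LineBreakRules w:val="false"/>'
    "</w:compat>"
    "</w:settings>" % W_NS
)

if __name__ == "__main__":
    print(compat_warning(find_compat_flags(SAMPLE)))
```

The sketch does not try to reproduce the legacy behaviour. It only reports that a flag is present, which is the use described above: an implementation that cannot honour a setting can still warn the user about it.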
It is possible and easy to add the BIFF spec as an annex to the ISO NOOOXML spec. It is no big deal to do so and it helps us to get rid of one Brazilian comment. So please do, Microsoft.
[…] careful attention to this insightful comment about Office binaries getting encapsulated in an XML-shaped wrapper. That pretty much sums up the purpose and essence of OOXML. It’s a gown for legacy formats […]