Saturday, February 16, 2008

Microsoft Office binary format to Open XML

As promised by Brian Jones, the project to convert the Microsoft Office binary formats (doc, xls, ppt) to the Open XML format (used since Office 2007), named b2xtranslator, has finally been started on SourceForge under a liberal BSD-like license. As I wrote before (on Google Code Project Hosting vs SourceForge), it would have been fantastic if Microsoft had placed the project on Google Code instead of SourceForge (but it is Microsoft we are talking about). As of now, no download is available, as the project is still at a very early stage.

Many office suite developers are interested in the Microsoft Office file formats because they are so widely used. The description of the formats has been available from Microsoft for quite some time; however, only with b2xtranslator will we start to see an implementation that is "blessed" by Microsoft.

Technically, there are two slightly disappointing points about the b2xtranslator project. First, it is not the implementation Microsoft itself uses in its office suite but an independent one. Second, it requires the .NET Framework (exactly as I predicted). The former means playing the same cat-and-mouse game again, since there is no guarantee that the translation is fully compatible with the code in the real Microsoft Office. As for the latter, well, I leave it to you, the readers, to reach your own conclusions. Of course there is Mono (which, by the way, is used to integrate odf-converter in Novell edition of, but you know well where the discussion would lead.

As usual, let's wait and see.


Anonymous said...

What a waste of time. All that's happening here is that Microsoft is simply creating another avenue to convert one problematic format into another problematic format, increasing the work for everyone who wants to actually make sense of these formats and create independent, working implementations.

In effect, it's a project to achieve what they want, and to keep the merry-go-round turning.

Anonymous said...

Let's just convert things to ODF, a format that doesn't make a mockery of the words "open" and "standard".

Ariya Hidayat said...

@segedunum: You bet. I wonder how useful b2xtranslator would be for commercial purposes.

Ariya Hidayat said...

@Chani: Well, isn't that exactly my point? If it had been the same code used in the Microsoft product, then reusing/reading/understanding it would make it easier for the developers of open-source ODF-friendly office suites (OO, KOffice, etc.) to improve their import filters.

Ethan Anderson said...

I remember when Microsoft pulled this on the web.

We still haven't recovered. Keep them out of the standard-setting seat until they learn to abide by the ones already set. I will not use OOXML, just as I do not use .doc.

Anonymous said...

It uses .NET. OK, actually that's the obvious choice for MS.
So one virtual machine (Mono) would be required for converting the documents, and another one (Java) to have all functionality from OOo available.
Feels a bit like bloat...


Anonymous said...

Until MS's "Covenant not to sue" is clarified, I remain doubtful about the patent issues around items covered by their OSP (including both OOXML and the document formats).