Handling Microsoft Word file format
Latest revision as of 06:54, 31 March 2015
This page discusses the available options for working with Microsoft Word documents in your Qt application. Please also read the general considerations outlined on the Handling Document Formats page.
Note that this information is collaboratively collected by the community, with no promise of completeness or correctness. In particular, use your own research and judgment when evaluating third-party libraries or tools!
Word uses two different file formats, and this page deals with both of them:
| | Legacy "Word Document" format | "Office Open XML Document" format |
|---|---|---|
| classification: | binary | XML-based |
| main filename extension: | .doc | .docx |
| main internet media type: | application/msword | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| default format of Word: | until Word 2003 | since Word 2007 |
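When a filename extension cannot be trusted, the two formats can usually be told apart by their leading bytes. The following is a minimal sketch in Python rather than Qt code. It assumes the well-known container signatures (OLE2 compound file for the legacy binary format, ZIP for the XML-based format), and the function names are hypothetical. A match only identifies the container, because other Office formats use the same two containers.

```python
# Leading bytes of the two container formats (assumption: the file is not
# truncated and not wrapped in some other container).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # legacy binary container (.doc)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP container (.docx)


def sniff_word_container(header: bytes) -> str:
    """Classify a file by its first bytes: 'binary', 'xml-based' or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return "binary"
    if header.startswith(ZIP_MAGIC):
        return "xml-based"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_word_container(handle.read(8))
```

In a C++ Qt application the same check could be made on the first bytes returned by QFile::read().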
Reading / Writing
Using Word itself
If you are exclusively targeting the Windows platform and Microsoft Word will be installed on all target machines, you can use Qt’s ActiveX framework to access Word’s .doc and .docx processing functionality through OLE automation. For an introductory code example (and a way to list the API provided by Word's COM object), consult this how-to (it focuses on Microsoft Excel, but the approach works the same way for Word).
| | DLL file name | COM object name | platforms | license |
|---|---|---|---|---|
| Microsoft Word | ? | Word.Application | Windows | commercial |
Using independent parser/writer libraries
| | API | .doc | .docx | reading | writing | platforms | license |
|---|---|---|---|---|---|---|---|
| … | … | … | … | … | … | … | … |
| wv | C | yes | no | yes | no | Win, Mac, Linux | GPL [strong copyleft] |
Using manual XML processing
Files using the XML-based (.docx) format can be processed using Qt's XML handling classes (see Handling Document Formats).
TODO: Expand this section.
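As a rough illustration of what manual processing involves: a .docx file is a ZIP archive, and the document body is normally stored in the member word/document.xml, with the text held in `<w:t>` elements inside `<w:p>` paragraphs. The sketch below uses Python's standard library instead of Qt's classes to keep it short. It assumes the conventional member name and the transitional WordprocessingML namespace that Word writes by default, and it ignores formatting, headers, footnotes and nesting (paragraphs inside tables are returned like any other). The function name `docx_paragraphs` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the plain text of each paragraph in a .docx file.

    `source` is a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        # Assumption: the main document part is at its conventional location.
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; each run keeps it in <w:t>.
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Calling `docx_paragraphs("report.docx")` (a placeholder file name) returns one string per paragraph. In a C++ Qt application the XML step maps onto QXmlStreamReader, while unpacking the archive needs a separate ZIP library.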
Using batch conversion tools
If all else fails, you can use an existing tool to automatically convert between Microsoft Word files and a more manageable format, and let your Qt application deal with that format instead. The conversion tool could be bundled with your application or specified as a prerequisite, and controlled via QProcess. Some possibilities are:
| | .doc to | .docx to | … to .doc | … to .docx | platforms |
|---|---|---|---|---|---|
| AbiWord | .txt .rtf .html .dbk .odt .docx … | .txt .rtf .html .dbk .odt … | - | .txt .rtf .html .dbk .odt .doc … | Win, Mac, Linux, … |
| wvWare | .txt .rtf .html .dbk … | - | - | - | Win, Mac, Linux, … |
| … | … | … | … | … | … |
Notes:
AbiWord can be used like this for batch conversion:
abiword --to=outputfile.rtf inputfile.doc
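Driving the converter from an application amounts to starting that command as a child process and checking its exit status. The sketch below shows the idea with Python's subprocess module. In a C++ Qt application the same argument list would be passed to QProcess instead. The helper names are hypothetical, and treating exit status 0 as success is an assumption about AbiWord's behaviour.

```python
import shutil
import subprocess


def abiword_command(input_path, output_path, program="abiword"):
    """Build the argument list for the batch conversion shown above."""
    return [program, f"--to={output_path}", input_path]


def convert_with_abiword(input_path, output_path):
    """Run AbiWord as a child process; return True if it exited with status 0."""
    if shutil.which("abiword") is None:
        raise FileNotFoundError("abiword was not found on PATH")
    result = subprocess.run(
        abiword_command(input_path, output_path),
        capture_output=True,  # keep the tool's output out of the console
        text=True,
    )
    return result.returncode == 0
```

Passing the arguments as a list rather than as one shell string avoids quoting problems with file names that contain spaces.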
Displaying / User-Interacting
Using Word itself
TODO: If you know whether Word provides a "viewer" ActiveX control that can be embedded in a Qt application through ActiveQt, please fill out this section (include links to relevant resources!)
Manual solution
TODO: Tips for implementing a custom Microsoft Word viewer widget, using Qt and the Microsoft Word parsing libraries mentioned above.
See Also
- Handling Document Formats
  - other Microsoft Office formats:
    - Microsoft PowerPoint
    - Microsoft Excel
  - other "Text Document" formats:
    - HTML