> MSWord can save things into various formats. OpenOffice can open
> MSWord files and convert them into several more formats. There is a
> Java toolkit to read MSWord files, though it is still at a very early
> stage:
>
> http://poi.apache.org/hwpf/index.html.
>
> But in general, AFAIK, nothing will read MSWord format directly and
> do something useful with it. It is just too complex.
I wanted to mention that OpenOffice supports Python as one of its scripting languages. That makes it possible to write Python programs that run as macros inside OpenOffice and do a great deal of processing on any document it can open. See
http://udk.openoffice.org/python/scriptingframework/index.html
udk: OOo scripting framework and Python
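To give a rough idea of what such a Python macro looks like, here is a minimal sketch that appends a word count to the end of the current text document. The function names word_count and append_word_count are made up for illustration. XSCRIPTCONTEXT and g_exportedScripts are, as far as I know, the conventions the scripting framework uses, and the sketch assumes the macro is invoked from a Writer text document.

```python
# Sketch of a Python macro for OpenOffice Writer.
# Assumption: the scripting framework injects the global XSCRIPTCONTEXT
# when this file runs as a macro; it does not exist in plain Python.


def word_count(text):
    """Count whitespace-separated words in a string (plain Python, no UNO)."""
    return len(text.split())


def append_word_count(*args):
    """Macro entry point: append a word count to the current text document."""
    doc = XSCRIPTCONTEXT.getDocument()  # the document the macro was started from
    text = doc.getText()                # assumes a Writer text document
    count = word_count(text.getString())
    # Insert at the end of the document; False means nothing is overwritten.
    text.insertString(text.getEnd(), "\nWords: %d" % count, False)
    return count


# Functions listed here are the ones OpenOffice offers as runnable macros.
g_exportedScripts = (append_word_count,)
```

If I remember correctly, a file like this goes in the Scripts/python directory of the OpenOffice user profile, after which the macro can be run from the Tools > Macros menu. Since OpenOffice can open MSWord files, the same kind of macro could be applied to a document that started out in Word.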
It is also possible to use Java, JavaScript, and BeanShell as scripting languages to control OpenOffice. See
http://framework.openoffice.org/scripting/scriptingf1/developer-guide.html
framework: Writing Scripts in BeanShell, JavaScript, and Java
John Sowa