Re: Converting .doc files to .xml



How about using MS Word 2003? The "File->Save As" dialog box allows
you to select XML as a file type. This saves the document as WordML,
which includes the document properties (author, last printed time and
so on) that you are looking for.
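
Once a document has been saved that way, the properties can be pulled
out with any XML parser. Here is a minimal sketch in Python, assuming
the file carries an o:DocumentProperties block in the
urn:schemas-microsoft-com:office:office namespace, as Word 2003 output
normally does. The SAMPLE string is a hand-written, cut-down stand-in
for a real file, and document_properties is just an illustrative
helper name.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for the document properties block in WordML (Word 2003).
OFFICE_NS = "urn:schemas-microsoft-com:office:office"

# Hand-written, cut-down stand-in for a file saved via "Save As -> XML".
SAMPLE = """<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:o="urn:schemas-microsoft-com:office:office">
  <o:DocumentProperties>
    <o:Author>Example Author</o:Author>
    <o:LastPrinted>2004-01-15T09:30:00Z</o:LastPrinted>
  </o:DocumentProperties>
  <w:body/>
</w:wordDocument>"""


def document_properties(root):
    """Return the document properties as a {name: text} dictionary."""
    props = root.find(f"{{{OFFICE_NS}}}DocumentProperties")
    if props is None:
        return {}
    # Strip the "{namespace}" prefix that ElementTree puts on each tag.
    return {child.tag.split("}", 1)[-1]: (child.text or "") for child in props}


print(document_properties(ET.fromstring(SAMPLE)))
```

For a real file, ET.parse("some-file.xml").getroot() would take the
place of ET.fromstring(SAMPLE).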

If you need this to run from the command line, then your best bet is
probably to write some VBA or C# code that fires up Word in the background.
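
A rough sketch of that idea follows, written in Python with the pywin32
package rather than VBA or C#. It is a swapped-in language, not a
tested tool. It assumes Windows with Word 2003 or later installed, and
that 11 is the value of wdFormatXML in Word's WdSaveFormat
enumeration. The script name doc2xml.py and the helper names are made
up for illustration.

```python
import glob
import sys
from pathlib import Path

# Assumed value of WdSaveFormat.wdFormatXML in the Word object model.
WD_FORMAT_XML = 11


def xml_target(doc_path):
    """Return the absolute .xml path that sits next to the given .doc file."""
    return Path(doc_path).resolve().with_suffix(".xml")


def convert_all(doc_paths):
    """Open each document in a hidden Word instance and save it as WordML."""
    # pywin32 is imported here so the rest of the script loads without it.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        for doc_path in doc_paths:
            source = Path(doc_path).resolve()
            doc = word.Documents.Open(str(source))
            try:
                doc.SaveAs(str(xml_target(source)), FileFormat=WD_FORMAT_XML)
            finally:
                doc.Close(False)  # close without saving changes to the .doc
    finally:
        word.Quit()


def main(argv):
    # cmd.exe does not expand wildcards, so expand them here.
    files = [name for pattern in argv for name in glob.glob(pattern)
             if name.lower().endswith(".doc")]
    if not files:
        print("usage: doc2xml.py FILE.doc [FILE.doc ...]")
        return 1
    convert_all(files)
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

Run as "python doc2xml.py *.doc", this would write one .xml file next
to each .doc file. Word needs absolute paths, which is why every path
is resolved before it is handed over.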

HTH,
Steve Ball
Explain
http://www.explain.com.au/

Jordi Cuenca wrote:
> Hi,
>
> I am looking for an already developed tool that could convert Word .doc
> files (I do not mind the version) to .xml format.
>
> That tool should be a command line tool and I should be able to convert
> several files from .doc to .xml and, very important, I should be able to
> get the non-text data of the file (I mean, for example, author, last
> printing time and so on).
>
> The idea I have of it is something like:
>
> c:\> doc2xml [list of parameters] *.doc *.xml
>
> I've been trying with antiword but I did not succeed because it is
> asking me for a DTD file, and I do not know where to get it.
>
> Thank you.
>
> Jordi.
> despertaferro@xxxxxxxxxxxxxxxx
