Re: Word + win32ole - how to find formatting of a word?




-------- Original Message --------
Date: Sun, 26 Oct 2008 22:14:53 +0900
From: Mohit Sindhwani <mo_mail@xxxxxxxxx>
To: ruby-talk@xxxxxxxxxxxxx
Subject: Re: Word + win32ole - how to find formatting of a word?

Mohit Sindhwani wrote:
Axel Etzold wrote:
Hi! I'm trying to use Ruby and win32ole to parse a Word document.
So far, I'm able to extract the style and text of each paragraph.
That works great for converting each paragraph into an individual div
(in the HTML/CSS sense).

Now, inside the paragraphs, there are certain words that have
special formatting (e.g. the name of a command, which is in
monospace) - I'm trying to find out how to extract those special cases.
Does anyone know how to achieve that?


Dear Mohit,
you could save the Word file as HTML and then extract the
relevant information...
I did that using OpenOffice and got a file containing the font
information in the following form.


Hi Axel

Thanks for replying! Converting to HTML and working with that is my
last option actually. In a well-written document, I found that using
Word to return style information about the paragraph is a lot less
work and relatively easy to work with. I guess it's time to consider
your suggestion!

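Actually, after digging around, I found that this gets me part of the way there:
  index = 0       # running count of the words visited
  ftHash = {}     # font name => 1, i.e. the set of fonts seen
  words = doc.Words
  words.each {|w|
    index += 1
    ft = w.Font.Name   # font of this word's range
    ftHash[ft] = 1
  }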
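
The loop above only records which fonts occur. A minimal sketch of one way to go a step further and recover where the specially formatted words sit: collect a [text, font name] pair for each word, then merge neighbouring words that share a font into runs. The helper name font_runs and the sample data are hypothetical, and the win32ole lines appear only as comments because they need Word installed. The sketch also assumes each word's Text carries its own trailing space, so the pieces are joined without a separator.

```ruby
# Group consecutive words that share a font into runs.
# `pairs` is an array of [text, font_name] pairs, in document order.
# Returns an array of [font_name, joined_text] pairs.
def font_runs(pairs)
  pairs
    .chunk_while { |a, b| a[1] == b[1] }   # split wherever the font changes
    .map { |chunk| [chunk.first[1], chunk.map(&:first).join] }
end

# With Word available, the pairs could be collected like this
# (assumption: `doc` is the already opened document from the loop above):
#   pairs = []
#   doc.Words.each { |w| pairs << [w.Text, w.Font.Name] }

# Hypothetical sample data standing in for a real document.
sample = [
  ["Run ",  "Times New Roman"],
  ["ls ",   "Courier New"],
  ["-l ",   "Courier New"],
  ["now.",  "Times New Roman"]
]

font_runs(sample).each do |font, text|
  puts "#{font}: #{text.inspect}"
end
```

Runs whose font differs from the paragraph's main font are then candidates for inline markup, such as a monospace span inside the paragraph's div.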

Thanks for your help!

Cheers,
Mohit.
10/26/2008 | 9:14 PM.



Dear Mohit,

you're welcome :)
It's always nice to be able to answer one's own questions, isn't it? Thanks for the info!

Best regards,

Axel



