Win32 Scripting

Hi,

As someone who’s barely touched a Windows system in 4 years, I’m in
some need of help designing a “simple” script: we have some 100+
documents saved in MS Word format. We need them converted to RTF format
(the Township I work for has a new policy, thanks to my boss and me,
mandating that all publicly available documents must be in portable and,
preferably, open standard formats).

Anyways… is there a way using the OLE thingy in Ruby (see my grasp of
Windows terminology) to take a list of files, open and save them as RTF
in Word, without having to do it manually? I think there’s a whole
’nother batch of crud that needs conversion besides just the web-site
stuff I’m working on now.

Thanks,
Sean Etc.

Sorry I can’t comment on the technique you want… but I
thought I’d comment on one thing just in case it was an
issue.

My understanding is probably flawed, but I thought that
RTF (though touted as an open standard) was really a
Microsoftism that was intimately tied to the “latest”
implementation of MS Word.

Microsoft (perhaps even worse than IBM) thinks that
everything they do is a standard.

Of course, every standard originates somewhere. PDF comes
from Adobe, though it may arguably be more open.

There’s always XHTML; what could be more open than that?
But it may not be what you need.

Now pardon me while I go back to my paperwork for
patenting the ASCII character set.

Hal

···

----- Original Message -----
From: “Sean Middleditch” elanthis@awesomeplay.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Thursday, August 22, 2002 12:36 PM
Subject: Win32 Scripting


Hey,

Here are some hints:

  • use win32ole, browse some examples
  • you’ll probably instruct Word to open a document, then save it as
    RTF. Record a macro in Word that does this, then look at it in the macro
    editor. It will contain the calls that open and close documents :) These
    are the same functions you will use in Ruby, with the occasional change in
    punctuation. Let Ruby read the filenames and instruct Word to convert them.

Danny
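
One way the hints above might translate into Ruby is sketched below. It assumes Word is installed on a Windows machine and reachable through the standard win32ole library. The helper names (`rtf_path_for`, `convert_to_rtf`) and the command-line usage are made up for illustration, and the numeric values stand in for Word's `wdFormatRTF` and `wdDoNotSaveChanges` constants, which a recorded macro would show by name.

```ruby
# Values of Word's VBA constants, which a recorded macro shows by name.
WD_FORMAT_RTF = 6           # wdFormatRTF
WD_DO_NOT_SAVE_CHANGES = 0  # wdDoNotSaveChanges

# Hypothetical helper: "report.doc" -> "report.rtf" in the same folder.
def rtf_path_for(doc_path)
  doc_path.sub(/\.doc\z/i, '') + '.rtf'
end

# Hypothetical helper: opens each file in Word and saves an RTF copy next to it.
# `word` is expected to behave like WIN32OLE.new('Word.Application').
def convert_to_rtf(word, doc_paths)
  doc_paths.map do |src|
    dest = rtf_path_for(src)
    doc = word.Documents.Open(src)
    begin
      doc.SaveAs(dest, WD_FORMAT_RTF)
    ensure
      # Close the document even if the save failed.
      doc.Close(WD_DO_NOT_SAVE_CHANGES)
    end
    dest
  end
end

# Only touches Word when run directly on Windows with file names given.
if $PROGRAM_NAME == __FILE__ && RUBY_PLATFORM =~ /mswin|mingw/ && !ARGV.empty?
  require 'win32ole'
  word = WIN32OLE.new('Word.Application')
  begin
    # Pass full paths with backslashes rather than relying on Word's current folder.
    paths = ARGV.map { |f| File.expand_path(f).tr('/', '\\') }
    convert_to_rtf(word, paths).each { |rtf| puts rtf }
  ensure
    word.Quit
  end
end
```

Saved as, say, `doc2rtf.rb` (a hypothetical name), this would be run as `ruby doc2rtf.rb first.doc second.doc`. Keeping the Word object as a parameter means the conversion loop can be tried out with a stand-in object before pointing it at real documents.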

···

On Fri, 23 Aug 2002, Sean Middleditch wrote:


RTF isn’t really an open standard. It isn’t fully specified, and exporting
RTF from different versions of MS Word often leads to incompatible RTF
files which can’t be read by MS Word itself, to say nothing of other
applications. As a result, your documents’ ‘portability’ may vary.

There is a WIN32::OLE module available in the RAA. It allows transparent and
complete control of OLE components using Ruby and, with dRuby, not
only from Windows boxes.

···

On Fri, Aug 23, 2002 at 02:36:48AM +0900, Sean Middleditch wrote:


/ Alexander Bokovoy

O’Reilly’s Law of the Kitchen:
Cleanliness is next to impossible

Check out the win32ole bit in the pickaxe book – it has an Excel example.
From there, crack open the Word docs (or Google msdn.microsoft.com) and I’d
bet there’s a pretty easy way to do it. Sorry can’t poke around more for
you.

Chris


Um, dunno what to say; RTF works perfectly, and is quite well documented. I
can find more complete RTF readers/editors than I can PDF ones (I can’t
name a single free PDF reader that handles all PDF documents perfectly,
especially when you get into embedded forms and the like).

The problem with XHTML/HTML is simply lack of support across the board.
Sure, a lot of Open Source editors handle it well, but that’s not true
for the popular suites in many cases. On the other hand, if I can get a
script that does what I want, I may just as well put up the documents in
multiple formats (RTF, PDF, XHTML, blah, and so on), which would be nice
as well.

···

On Thu, 2002-08-22 at 14:29, Hal E. Fulton wrote:

Hal


Hmm, I’ve been under a totally different impression… perhaps I’d be
better off limiting it to PDF after all. (On the other hand, I’ve yet
to encounter problems with RTF myself, and it seems at least).


Is there any documentation on the interfaces Word provides? Again,
I’ve barely touched Windows itself, much less the Windows APIs…

···

On Thu, 2002-08-22 at 14:45, Alexander Bokovoy wrote:

On Fri, Aug 23, 2002 at 02:36:48AM +0900, Sean Middleditch wrote:


/ Alexander Bokovoy

O’Reilly’s Law of the Kitchen:
Cleanliness is next to impossible


Ah, OK. Will have to go find that book again (buried somewhere…).
Thanks!

···

On Thu, 2002-08-22 at 15:45, Chris Morris wrote:

Chris


Recording a macro is a really good idea. You may want to let Ruby call the
macro, rather than recording the macro and then porting it to Ruby: VBA
code will let you use named parameters, something Win32OLE (I’m pretty sure)
won’t allow. Just make sure the macro really does what you want (e.g., that
it doesn’t save every file under the same name as the one used to record the
macro).

Passing parameters to macros from external scripts can be, as I recall,
tricky. You can set custom document properties and have the macro use those
values, though.
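
Calling a recorded macro from Ruby could look roughly like the sketch below. Word's Application object has a `Run` method that takes a macro name. The macro name `ConvertToRtf` and the helper `run_macro_on` are hypothetical, and the macro is assumed to act on the active document, to do its own saving, and to live somewhere Word can find it (for example in the Normal template).

```ruby
WD_DO_NOT_SAVE_CHANGES = 0  # wdDoNotSaveChanges

# Hypothetical helper: opens one document, runs a named macro against it,
# then closes the document without saving (the macro is assumed to save).
# `word` is expected to behave like WIN32OLE.new('Word.Application').
def run_macro_on(word, doc_path, macro_name)
  doc = word.Documents.Open(doc_path)
  begin
    word.Run(macro_name)
  ensure
    # Close the document even if the macro raised an error.
    doc.Close(WD_DO_NOT_SAVE_CHANGES)
  end
end

# On Windows this might be used as:
#   require 'win32ole'
#   word = WIN32OLE.new('Word.Application')
#   run_macro_on(word, 'C:\\docs\\minutes.doc', 'ConvertToRtf')
#   word.Quit
```

This only covers a macro that takes no arguments, which sidesteps the parameter-passing trouble mentioned above.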

Here are some more hints:

Make sure Word is configured to not prompt for saving changes or otherwise
ask for input before executing commands. Scripts can get hung if Word throws
up an unseen dialog box.

Plan well for exceptions. If Word barfs on something and does not exit
correctly you may end up with zombie ‘winword.exe’ processes.

Be careful of default paths. If at all possible try to use fully-qualified
file paths.

James
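
The hints above might be folded into a small wrapper like the following sketch. It assumes the win32ole library on Windows. The helper names (`full_windows_path`, `with_quiet_word`) and the `launcher` argument are invented for illustration, and 0 stands in for Word's `wdAlertsNone` and `wdDoNotSaveChanges` constants.

```ruby
WD_ALERTS_NONE = 0          # wdAlertsNone
WD_DO_NOT_SAVE_CHANGES = 0  # wdDoNotSaveChanges

# Hypothetical helper: turns a possibly relative name into a fully
# qualified path with backslashes, so Word's default folder never matters.
def full_windows_path(path, base_dir = Dir.pwd)
  File.expand_path(path, base_dir).tr('/', '\\')
end

# Hypothetical helper: starts Word hidden with alerts switched off, yields it,
# and always asks it to quit afterwards so a failed run is less likely to
# leave a stray winword.exe behind. `launcher` is any callable that returns
# the Word application object.
def with_quiet_word(launcher)
  word = launcher.call
  word.Visible = false
  word.DisplayAlerts = WD_ALERTS_NONE
  yield word
ensure
  word.Quit(WD_DO_NOT_SAVE_CHANGES) if word
end

# On Windows this might be used as:
#   launch = -> { require 'win32ole'; WIN32OLE.new('Word.Application') }
#   with_quiet_word(launch) do |word|
#     doc = word.Documents.Open(full_windows_path('minutes.doc'))
#     # ... save as RTF, close ...
#   end
```

Turning alerts off will not necessarily suppress every dialog Word can raise, so it is still worth test-running the script on a copy of the files first.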


It’s also buried at rubycentral.com :)

Chris

I was resisting the temptation of using Windows for doing all my essays and
papers while at university. So I ran Linux, producing my papers and essays in
OpenOffice (StarOffice back then). I would save my work as RTF then email
them to my school account. I always had to reformat them at school; mainly
things like tables, and even simple things like bullet points, wouldn’t work
out the same. I hate to imagine what a moderately complex formatted document
would look like when going from OpenOffice to MS Word. Ultimately I find that
RTF is like Java (sort of) – you can write it once, but chances are high
that you will need to do extra work to get the intended effect :-\


Signed,
Holden Glova

···

On Fri, 23 Aug 2002 06:57, Sean Middleditch wrote:

On Thu, 2002-08-22 at 14:45, Alexander Bokovoy wrote:

On Fri, Aug 23, 2002 at 02:36:48AM +0900, Sean Middleditch wrote:


Sean Middleditch elanthis@awesomeplay.com wrote in
news:1030041679.23861.36.camel@smiddle.civic.twp.ypsilanti.mi.us:

···

On Thu, 2002-08-22 at 14:29, Hal E. Fulton wrote:

Sorry I can’t comment on the technique you want… but I
thought I’d comment on one thing just in case it was an
issue.

My understanding is probably flawed, but I thought that
RTF (though touted as an open standard) was really a
Microsoftism that was intimately tied to the “latest”
implementation of MS Word.

Microsoft (perhaps even worse than IBM) thinks that
everything they do is a standard.

Of course, every standard originates somewhere. PDF comes
from Adobe, though it may arguably be more open.

There’s always XHTML; what could be more open than that?
But it may not be what you need.

Now pardon me while I go back to my paperwork for
patenting the ASCII character set.

Um, dunno what to say; RTF works perfectly, and is quite documented.
I can find more complete RTF readers/editors than I can PDF ones (I
can’t name a single free PDF reader that handles all PDF documents
perfectly, especially when you get into embedded forms and the like.)

The problem with XHTML/HTML is simply lack of support across the
board. Sure, a lot of Open Source editors handle it well, but that’s
not true for the popular suites in many cases. On the other hand, if
I can get a script that does what I want, I may just as well put up
the documents in multiple formats (RTF, PDF, XHTML, blah, and so on),
which would be nice as well.

One thing to watch out for is that any graphics inside an RTF file can
bump the size of the doc from a few tens of kilobytes to a megabyte very easily.


Robert Cowham

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Well, on the other hand, this is true of any format. If we continued
using .DOC files, it would be just as bad, if not worse. PDF files are
generally too large, and tend not to render right in many viewers,
including Adobe’s official ones (that is, the PDF files generated by MS
products, at least). XHTML is bad as well, given that the popular
browser doesn’t necessarily render it perfectly all the time…

In the end, more people can read RTF files with less trouble than
Office2000 Word files, so RTF it will be. Some of the larger documents
in need of more powerful formatting will likely end up as PDF, tho.

···

On Fri, 2002-08-23 at 04:51, Holden Glova wrote:


Signed,
Holden Glova
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE9Zfb/0X8w8X71zPcRAnHWAJ9r0lijXezW47G1ekL+vIsbdnNP6ACfRD80
SfY59VdVryF1f3KuXZnQ7Pc=
=JXy+
-----END PGP SIGNATURE-----