I was annoyed that I couldn't paste a .htm document saved from Word into
an EPF rich text editor without getting rather useless results, so I wrote
a small Groovy script that greatly improves the process.
Usage guide:
Save the Word document as a filtered .htm page.
Word will create a folder holding the external resources the document uses,
such as images, numbered image001, image002 and so on. Those generic names
really suck for importing into EPF.
Run WordHtmForEPF.groovy in the directory where the .htm page was saved,
optionally with a file pattern argument to select which .htm files to
process, e.g.: WordHtmForEPF.groovy Guide*.htm
The script creates a new .htm file with an EPF_ prefix for each .htm file
matching the file pattern. In these new .htm files the image src attributes
are redirected to images in a /resources folder.
The images from the resource folders Word saved are renamed more usefully
(with the document name as a prefix) and copied to the new /resources folder.
Now if you open an EPF_xxx.htm file in a browser, you can copy and paste it
into an EPF RTE and the result will be just beautiful. All images are
transferred and stored in the EPF /resources folder for that type of method
element, and each image is named to indicate the method element it is
part of :)
It even handles special characters (only Danish characters as of now...)
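To give an idea of the src redirection described above, here is a minimal sketch. It is not the code from WordHtmForEPF.groovy: the helper name redirectImageSources is made up for illustration, and it assumes that Word wrote the images to a "<document name>_files" folder (localized Word versions may use a different suffix), that the src attributes are double quoted, and that the target folder is called "resources".

```groovy
// Hypothetical helper, for illustration only (not taken from WordHtmForEPF.groovy).
// Assumptions: images live in "<docName>_files/", src values are double quoted,
// and the renamed copies end up in "resources/" with the document name as prefix.
String redirectImageSources(String html, String docName) {
    // Capture the image file name that follows the "..._files/" folder.
    def pattern = ~/src="[^"]*_files\/([^"]+)"/
    html.replaceAll(pattern) { fullMatch, imageName ->
        "src=\"resources/${docName}_${imageName}\""
    }
}

def sample = '<p><img width=100 src="Guide1_files/image001.png"></p>'
println redirectImageSources(sample, 'Guide1')
```

The image files themselves would still have to be copied and renamed to match; that part is left out here.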
/**
* @author kristian@xxxxxxxxxx
*
*/
//import java.net commons
public class WordHtmForEPF {