If you've worked with one of the XML vocabularies that Oxygen supports out of the box,
such as DITA, DocBook, TEI, or XHTML, you've probably already
used Oxygen's support for converting content pasted into the application from external
applications like Microsoft Word, Excel, or any web browser. This is
a very useful feature for converting various types of content to XML because it preserves and
converts styling, links, lists, tables, and image references.
The feature relies on the fact that when you copy content in the applications mentioned above,
they place the HTML equivalent of the copied content on the clipboard. So all Oxygen has to do
is clean up that HTML, make it well-formed XHTML, and apply conversion XSLT stylesheets to
it.
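
The last step of that pipeline (applying a conversion stylesheet to well-formed XHTML) can be illustrated with a minimal, standalone sketch that uses only the JDK's built-in XSLT processor. The stylesheet below is a hypothetical, heavily simplified stand-in and not the one shipped with Oxygen; the class name `SmartPasteSketch` and the sample input are likewise assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SmartPasteSketch {

    // Hypothetical, simplified XHTML-to-DITA stylesheet (NOT Oxygen's real one).
    // It maps a handful of XHTML elements to DITA-like elements.
    static final String XHTML_TO_DITA_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:h='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='h'>"
      + "<xsl:template match='h:html|h:body'><xsl:apply-templates/></xsl:template>"
      + "<xsl:template match='h:p'><p><xsl:apply-templates/></p></xsl:template>"
      + "<xsl:template match='h:strong|h:b'><b><xsl:apply-templates/></b></xsl:template>"
      + "<xsl:template match='h:a'>"
      + "<xref href='{@href}' format='html' scope='external'><xsl:apply-templates/></xref>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an already well-formed XHTML string.
    static String convert(String xhtml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XHTML_TO_DITA_XSL)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for cleaned-up clipboard HTML.
        String clipboardXhtml =
            "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
          + "<p>See <strong>this</strong> <a href='http://example.com'>link</a>.</p>"
          + "</body></html>";
        System.out.println(convert(clipboardXhtml));
    }
}
```

In this sketch the bold text and the link survive the conversion as `b` and `xref` elements, which is the kind of styling and link preservation described above. The HTML cleanup step itself is not shown.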
This support is not hardcoded: anybody who is developing an Oxygen framework customization for a certain XML vocabulary can
provide conversion stylesheets for HTML content pasted from external applications.
I will describe how this works for the
DITA framework, and you can do the same for
yours. You can also use this information to modify the way in which smart paste works for the
bundled framework configurations.
- In the Preferences->Document Type Association page you can choose
to edit (or extend) the DITA document type association.
- In the Extensions tab the Extensions bundle implementation is set to
DITAExtensionsBundle which resides in the DITA Java extensions archive
dita.jar.
- The DITAExtensionsBundle is an extension of the ExtensionsBundle API and it provides its
own external object insertion
handler:
    /**
     * @see ro.sync.ecss.extensions.api.ExtensionsBundle#createExternalObjectInsertionHandler()
     */
    @Override
    public AuthorExternalObjectInsertionHandler createExternalObjectInsertionHandler() {
        return new DITAExternalObjectInsertionHandler();
    }
- The DITAExternalObjectInsertionHandler extends the base class AuthorExternalObjectInsertionHandler and
provides a reference to its specific conversion
stylesheet:
    /**
     * @see ro.sync.ecss.extensions.api.AuthorExternalObjectInsertionHandler#getImporterStylesheetFileName(ro.sync.ecss.extensions.api.AuthorAccess)
     */
    @Override
    protected String getImporterStylesheetFileName(AuthorAccess authorAccess) {
        return "xhtml2ditaDriver.xsl";
    }
Note: The
Extensions tab also allows you to specify the external object insertion handler
as a separate extension.
- In the same Document Type edit dialog, in the Classpath tab, you will see
a reference to a framework-specific resources folder,
like: ${framework}/resources/
- If you look on disk in the DITA framework resources folder,
"OXYGEN_INSTALL_DIR\frameworks\dita\resources", you will find the
xhtml2ditaDriver.xsl stylesheet there. The stylesheet imports various other
stylesheets, which you could probably reuse in full and which apply various cleanups to HTML
produced with MS Word. It also handles the conversion from the pasted HTML content to
DITA, so it is a good starting point: you can copy the entire set of XSLT stylesheets to
your framework and adapt them to your vocabulary.
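
The idea of reusing imported stylesheets and changing only what your vocabulary needs can be sketched with XSLT import precedence: a driver stylesheet imports a base stylesheet and overrides a single template. The example below is a standalone illustration that runs on the JDK's built-in XSLT processor. Both stylesheets, the `base.xsl` name, the `emphasis` target element, and the class name `DriverOverrideSketch` are hypothetical and are not taken from the Oxygen DITA framework; namespaces are omitted for brevity.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class DriverOverrideSketch {

    // Hypothetical stand-in for a reused conversion stylesheet.
    static final String BASE_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='p'><p><xsl:apply-templates/></p></xsl:template>"
      + "<xsl:template match='i'><i><xsl:apply-templates/></i></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical driver for a custom framework: it imports the base
    // stylesheet and overrides only the template for <i>.
    static final String DRIVER_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:import href='base.xsl'/>"
      + "<xsl:template match='i'><emphasis><xsl:apply-templates/></emphasis></xsl:template>"
      + "</xsl:stylesheet>";

    static String convert(String input) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Serve the imported stylesheet from memory instead of from disk.
        factory.setURIResolver((href, base) ->
            "base.xsl".equals(href)
                ? new StreamSource(new StringReader(BASE_XSL), href)
                : null);
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(DRIVER_XSL)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(input)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(convert("<p>An <i>important</i> note.</p>"));
    }
}
```

Because templates in the importing stylesheet take precedence over imported ones, the `p` rule comes from the base stylesheet while `i` is handled by the driver's override. In a real framework the stylesheets would live on disk in the framework's resources folder rather than in string constants.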