Bug 357294 - [api] provide API for converting content to wiki markup
Summary: [api] provide API for converting content to wiki markup
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Mylyn
Version: unspecified
Hardware: All All
Importance: P3 enhancement
Target Milestone: 1.6.0
Assignee: David Green CLA
QA Contact: David Green CLA
URL:
Whiteboard:
Keywords: api, noteworthy, plan
Depends on: 368066
Blocks: 348298
Reported: 2011-09-09 19:29 EDT by David Green CLA
Modified: 2012-02-28 11:26 EST
CC List: 2 users

See Also:


Attachments
mylyn/context/zip (7.30 KB, application/octet-stream)
2011-09-09 20:02 EDT, David Green CLA
no flags

Description David Green CLA 2011-09-09 19:29:32 EDT
API is needed to convert content to wiki markup.  Content source format could be HTML, other markup languages, or some arbitrary format.  

DocumentBuilder provides a great API and abstraction for driving content generation in the target output format.  By using this common API, 3rd parties would be able to support new source formats simply by implementing a parser driving the document builder API.  

The current use case is to target Textile markup; however, targeting other markup languages could also be needed.  Each target output format would need its own implementation of DocumentBuilder, in much the same way as we have subclasses for all of the currently supported output formats (such as HTML, Dita, DocBook, XSL-FO).
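
The parser-drives-builder arrangement described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and for illustration only: `SimpleDocumentBuilder` is a cut-down stand-in and not the WikiText `DocumentBuilder` API, `TextileEmitter` is not the eventual Textile builder, and the `[b]...[/b]` source format is invented. The point is only that the parser knows nothing about the output format, so supporting a new target would mean adding one more builder implementation.

```java
/** Sketch of a parser driving a builder; all names here are hypothetical. */
class BuilderSketch {

    /** Cut-down stand-in for a document-builder API (assumption, not WikiText's). */
    interface SimpleDocumentBuilder {
        void beginParagraph();
        void endParagraph();
        void beginBold();
        void endBold();
        void characters(String text);
    }

    /** Emits Textile: paragraphs separated by a blank line, bold as *text*. */
    static final class TextileEmitter implements SimpleDocumentBuilder {
        private final StringBuilder out = new StringBuilder();

        public void beginParagraph() {
            if (out.length() > 0) {
                out.append("\n\n"); // blank line between paragraphs
            }
        }
        public void endParagraph() { }
        public void beginBold() { out.append('*'); }
        public void endBold() { out.append('*'); }
        public void characters(String text) { out.append(text); }

        String result() { return out.toString(); }
    }

    /** Toy source parser: each non-empty line is a paragraph, [b]...[/b] is bold. */
    static void parse(String source, SimpleDocumentBuilder builder) {
        for (String line : source.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            builder.beginParagraph();
            int pos = 0;
            while (pos < line.length()) {
                int open = line.indexOf("[b]", pos);
                int close = open < 0 ? -1 : line.indexOf("[/b]", open + 3);
                if (open < 0 || close < 0) {
                    // No complete bold span left: emit the rest as plain text.
                    builder.characters(line.substring(pos));
                    break;
                }
                if (open > pos) {
                    builder.characters(line.substring(pos, open));
                }
                builder.beginBold();
                builder.characters(line.substring(open + 3, close));
                builder.endBold();
                pos = close + 4; // skip past "[/b]"
            }
            builder.endParagraph();
        }
    }

    /** Convenience wrapper: parse the toy format and return Textile text. */
    static String toTextile(String source) {
        TextileEmitter emitter = new TextileEmitter();
        parse(source, emitter);
        return emitter.result();
    }

    public static void main(String[] args) {
        System.out.println(toTextile("Hello [b]world[/b]\nSecond paragraph"));
    }
}
```

Swapping `TextileEmitter` for another implementation of the same interface would change the output format without touching `parse`.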
Comment 1 David Green CLA 2011-09-09 19:37:20 EDT
pushed commit e600cfbb5cc970a65b5dca509027a5f82411760d

new API introduced:
* @org.eclipse.mylyn.wikitext.core.parser.HtmlParser@ - parses XHTML and drives the DocumentBuilder API
* @org.eclipse.mylyn.wikitext.textile.core.TextileDocumentBuilder@ - emits content as Textile markup
Comment 2 David Green CLA 2011-09-09 20:02:05 EDT
Created attachment 203097
mylyn/context/zip
Comment 3 Miles Parker CLA 2011-09-09 20:03:58 EDT
David, not sure what you're planning to use for parser, but JSoup is really nice to use, especially for wonky html.
Comment 4 Steffen Pingel CLA 2011-09-11 08:40:18 EDT
The changes broke the build on 3.5. I think you need to depend on org.junit4 (3.5 ships version 4.5.0) if you want to use org.junit.Assert instead of junit.framework.Assert.

[ERROR] Failed to execute goal org.eclipse.tycho:tycho-compiler-plugin:0.12.0:compile (default-compile) on project org.eclipse.mylyn.wikitext.tests: Compilation failure: Compilation failure:
[ERROR] /opt/users/hudsonbuild/workspace/mylyn-integration-e3.5/org.eclipse.mylyn.docs/org.eclipse.mylyn.wikitext.tests/src/org/eclipse/mylyn/wikitext/textile/core/TextileDocumentBuilderTest.java (at line 23):[-1,-1]
[ERROR] import org.junit.Assert;
[ERROR] ^^^^^^^^^
[ERROR] The import org.junit cannot be resolved
[ERROR]
Comment 5 David Green CLA 2011-09-12 15:47:33 EDT
Thanks Miles, JSoup looks great!  The current implementation in WikiText is using SAX, which has some obvious shortcomings (such as requiring that the source input be well-formed XML).  We're looking for a more reliable parsing solution. 
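
The well-formedness limitation can be reproduced with the JDK's own SAX parser. The following is a minimal sketch, not the WikiText code; the class and method names are made up for illustration. It shows that HTML with an unclosed `<br>` is rejected outright, while the XHTML form `<br/>` parses.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: SAX only accepts well-formed XML; names here are hypothetical. */
class SaxWellFormedCheck {

    /** Returns true if SAX parses the markup to the end, false if it rejects it. */
    static boolean parsesWithSax(String markup) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(markup)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Raised for input that is not well-formed, e.g. an unclosed element.
            return false;
        } catch (Exception e) {
            // ParserConfigurationException or IOException: not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesWithSax("<p>one<br/>two</p>")); // well-formed XHTML
        System.out.println(parsesWithSax("<p>one<br>two</p>"));  // common HTML, unclosed <br>
    }
}
```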

JSoup related links:
* http://jsoup.org/
* "MIT License":http://jsoup.org/license
Comment 6 Miles Parker CLA 2011-09-12 17:56:31 EDT
Another advantage is that the selector interface and other API features makes parsing much less nicer than using the IMO very fiddly DOM API. http://jsoup.org/cookbook/extracting-data/selector-syntax
Comment 7 Miles Parker CLA 2011-09-12 17:57:19 EDT
(In reply to comment #6)
> Another advantage is that the selector interface and other API features makes
> parsing much less nicer than using the IMO very fiddly DOM API.
> http://jsoup.org/cookbook/extracting-data/selector-syntax

"much less nicer" -> "much nicer" :)
Comment 8 David Green CLA 2011-09-14 16:55:43 EDT
Filed "CQ 5559":https://dev.eclipse.org/ipzilla/show_bug.cgi?id=5559
Comment 9 David Green CLA 2012-01-06 12:44:43 EST
CQ has been approved.  The next step is to get jsoup into Orbit.
Comment 10 Steffen Pingel CLA 2012-01-24 12:00:48 EST
I updated the poms to use the latest Orbit S-build and added jsoup to the target definitions. You should be able to consume it in WikiText now.
Comment 11 David Green CLA 2012-01-24 13:39:53 EST
Thanks Steffen.
Comment 12 David Green CLA 2012-01-25 00:42:24 EST
implemented.
Comment 13 Steffen Pingel CLA 2012-01-25 09:42:06 EST
In case you didn't notice, a few tests are now failing on the build server: https://hudson.eclipse.org/hudson/job/mylyn-docs-nightly/.
Comment 14 David Green CLA 2012-01-25 13:52:58 EST
These are now fixed.
Comment 15 Steffen Pingel CLA 2012-01-25 13:59:24 EST
I added the jsoup bundle to the WikiText SDK to ensure that it gets published to the update site.
Comment 16 David Green CLA 2012-01-25 21:19:34 EST
Great, thanks