Summary: | Add an extension point and support for Contributing Parsers for specific Partitions | ||
---|---|---|---|
Product: | [WebTools] WTP Source Editing | Reporter: | David Carver <d_a_carver> |
Component: | wst.sse | Assignee: | wst.sse <wst.sse-inbox> |
Status: | NEW --- | QA Contact: | Nick Sandonato <nsand.dev> |
Severity: | enhancement | ||
Priority: | P3 | CC: | david_williams, gregory.amerson, jin.phd, raghunathan.srinivasan, thatnitind, zulus |
Version: | 3.0 | Keywords: | helpwanted, investigate |
Target Milestone: | Future | ||
Hardware: | PC | ||
OS: | All | ||
Whiteboard: |
Description
David Carver
2008-04-07 12:44:30 EDT
Actually, it works the other way around for us. The tokenizers parse through the source exactly once, and everything else is built on top of their output. The partitions created by the partitioner aren't part of an in-memory model that stays updated; they're instead created on the fly from the text region information. (This causes the results from StructuredTextPartitioner*.getPartition(int) to not fully comply with the contract, a problem for which we don't yet have a low-cost solution.)

A partitioner along these lines would still need information about what constitutes an edge between two different partition types, and would need an appropriate state table to transition between them at the right times, essentially duplicating what we have already done with the source parser. And then you'd have to resolve the problems of incrementally reparsing that document as it's edited.

For script tags, the HTML partitioner is smart enough to recognize the script tag, read its type and language values, and generate the partition's type based on those values. You can see the XML partitioner doing something similar in StructuredTextPartitionerForXML.getPartitionType(ITextRegion, int), so that the partition type of a Processing Instruction's content varies with the specified target.

I suppose we could try something like allowing the region factory to call another tokenizer (or whatever) on the text contained by regions of specific contexts and return different implementation classes when needed. This skips over the more complicated issues with optimizing the reparsing. An example would be taking the text of an attribute and running it through another (generated) parser to detect XPath expressions, and when one is found, returning a subclass of AttributeValueRegion that encapsulates more information than normal (even if it's just a boolean saying "hey! there's an expression here!"). A partitioner could then make use of this information.
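The region-factory idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: none of the types below are the real WTP/SSE classes. AttributeValueRegion is a stand-in with the same name as the one mentioned in the comment, and SecondaryDetector, RegionFactorySketch, PartitionerSketch, the partition type strings, and the crude regular expression standing in for a generated XPath parser are all hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for an SSE region; NOT the real WTP class.
class AttributeValueRegion {
    final int start;
    final String text;

    AttributeValueRegion(int start, String text) {
        this.start = start;
        this.text = text;
    }

    // The plain region carries no extra information.
    boolean hasExpression() {
        return false;
    }
}

// Subclass that only adds the "there's an expression here" flag.
class ExpressionAttributeValueRegion extends AttributeValueRegion {
    ExpressionAttributeValueRegion(int start, String text) {
        super(start, text);
    }

    @Override
    boolean hasExpression() {
        return true;
    }
}

// Hypothetical secondary tokenizer/parser, run over the text of one region only.
interface SecondaryDetector {
    boolean matches(String regionText);
}

// Hypothetical region factory: the main tokenizer hands it each attribute value once.
class RegionFactorySketch {
    private final SecondaryDetector detector;

    RegionFactorySketch(SecondaryDetector detector) {
        this.detector = detector;
    }

    AttributeValueRegion createAttributeValueRegion(int start, String text) {
        if (detector.matches(text)) {
            return new ExpressionAttributeValueRegion(start, text);
        }
        return new AttributeValueRegion(start, text);
    }

    // Crude placeholder for a generated XPath parser (assumption): flags text
    // containing '/', '@', '[' or '::'. A real detector would actually parse.
    static SecondaryDetector crudeXPathDetector() {
        Pattern hint = Pattern.compile("[/@\\[]|::");
        return text -> hint.matcher(text).find();
    }
}

// Hypothetical partitioner hook that uses the flag; the type names are made up.
class PartitionerSketch {
    static String partitionTypeFor(AttributeValueRegion region) {
        return region.hasExpression() ? "example.XPATH" : "example.DEFAULT_XML";
    }
}

class RegionFactoryDemo {
    public static void main(String[] args) {
        RegionFactorySketch factory =
                new RegionFactorySketch(RegionFactorySketch.crudeXPathDetector());
        AttributeValueRegion region =
                factory.createAttributeValueRegion(12, "/catalog/item[@id]");
        System.out.println(PartitionerSketch.partitionTypeFor(region));
    }
}
```

The sketch leaves the reparsing question untouched, as the comment does: the detector sees only the text of a single region, so it would be rerun whenever that region is recreated.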
The interaction between the tokenizer and region factory amounts to one line of code, so it would be one place to start. You know, in theory.

Yeah...unfortunately, I learned this weekend that it was parsing...regions...partitions... and that the parsing was hard coded to a particular editor. It gets even trickier with XML parsing, in that you may have multiple namespaces, each of which may need a parser that handles specific content beyond what the base XML parser provides. XSL's XPath is one example; XInclude and XPointer are another.

For XSL, there are only three types of attributes I want the XSL partition to appear in, but at the point where the partitions are set, I haven't found a way to get the namespace that the particular XML tag or attribute resides in.

Consider that for XSL it might also be nice to have CSS and script editing support included based on the content type for a particular region. If it could just reuse portions of the existing editors through some sort of extension point, you could get a very powerful and feature-rich editor with little extra work by the adopter.

Just for those stumbling across this bug...

(In reply to comment #2)
> Yeah...unfortunately, I learned this weekend that it was
> parsing...regions...partitions... and that the parsing was hard coded to a
> particular editor.

It's actually tied to an org.eclipse.core.contenttype.contentTypes extension, as driven by the model loader that's associated with it.
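For readers trying to picture the namespace request made earlier in this thread, one possible shape is a lookup of contributed parsers keyed by namespace URI. This is a sketch only: ContributedParser, ContributedParserRegistry, and the partition type strings are hypothetical and not part of WTP. It also assumes the caller can already supply the namespace URI, which is exactly the piece reported as missing above. The thread does not name the three XSL attributes, so select, test, and match are placeholders chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical contract for a parser contributed for one namespace.
interface ContributedParser {
    // Returns a partition type, or null to decline and leave the default in place.
    String partitionTypeFor(String attributeName, String attributeValue);
}

// Hypothetical registry; a real one would presumably be filled from an extension point.
class ContributedParserRegistry {
    private final Map<String, ContributedParser> byNamespace = new HashMap<>();

    void register(String namespaceUri, ContributedParser parser) {
        byNamespace.put(namespaceUri, parser);
    }

    // Falls back to defaultType when nothing is registered or the parser declines.
    String partitionTypeFor(String namespaceUri, String attributeName,
                            String attributeValue, String defaultType) {
        ContributedParser parser = byNamespace.get(namespaceUri);
        if (parser == null) {
            return defaultType;
        }
        String type = parser.partitionTypeFor(attributeName, attributeValue);
        return type != null ? type : defaultType;
    }
}

class XslContributionDemo {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    // Placeholder attribute names (assumption); the thread does not list them.
    static final Set<String> XPATH_ATTRIBUTES =
            new HashSet<>(Arrays.asList("select", "test", "match"));

    static ContributedParserRegistry sampleRegistry() {
        ContributedParserRegistry registry = new ContributedParserRegistry();
        registry.register(XSL_NS, (name, value) ->
                XPATH_ATTRIBUTES.contains(name) ? "example.XSL_XPATH" : null);
        return registry;
    }

    public static void main(String[] args) {
        ContributedParserRegistry registry = sampleRegistry();
        System.out.println(registry.partitionTypeFor(
                XSL_NS, "select", "//item", "example.DEFAULT_XML"));
        System.out.println(registry.partitionTypeFor(
                XSL_NS, "name", "row", "example.DEFAULT_XML"));
    }
}
```

Keying on the namespace URI rather than on the prefix keeps the lookup independent of how a given document happens to bind its prefixes.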