3.0M6 Eclipse 3.4M6
As far as I can tell, the way the Structured Source Editor (SSE) is currently set up, the tokenizers need to know the complete language in order to handle and edit the document and to provide the necessary region identifiers. It would be nice if we could generalize this through an extension point for SSE that indicates, by content type, which partitions get which parsers, so that the regions can be contributed by the specific parsers.
Let's take an example editor, like the HTML editor. Unless I'm totally missing how this works (and that could be the case), the HTML editor must understand the HTML, CSS, and script partitions and have a parser internally defined to handle these partitions and generate the appropriate region/node information. However, this ties the implementation directly to that particular editor. What if a new microformat that has no grammar needed special handling, including for particular regions within that microformat? It's not uncommon nowadays for HTML to have a mixture of HTML, microformats, CSS, scripts, etc. all in one file.
The XML editor is another example. We have grammar content assistance contributed through the XML catalogs and DTDs, but for XML grammars like XInclude, the grammar only covers a portion of the functionality. Specific functionality and parsing for XPointer or XPath needs to be provided as well, which means the XPath expression has to be parsed with its own special handling, much like the content of an HTML script tag that contains JavaScript.
The idea here would be to have parsing take place based on content type or a user-specified class. This would allow an editor for a language like XQuery, which mixes XQuery syntax with XML, to have functionality for both, and an adopter could potentially add support for parser-specific functionality and content assistance as well. If this functionality is already there, it doesn't seem to be documented clearly enough.
Maybe an article needs to be written for Eclipse Corner that shows how to do it with the existing API and extension points? I don't expect this for 3.0M7 or 3.0, but it would make my life much simpler in XSL Tooling.
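To make the request a bit more concrete, here is a minimal sketch of the kind of content-type-to-parser lookup such an extension point might drive. Nothing in it is existing SSE API: `SketchRegion`, `SketchRegionParser`, `SketchParserRegistry`, and the content type ids passed to it are hypothetical names made up for illustration, and a plain map stands in for the extension registry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical: a typed slice of the document produced by a contributed parser. */
final class SketchRegion {
    final String type;
    final int offset;
    final int length;

    SketchRegion(String type, int offset, int length) {
        this.type = type;
        this.offset = offset;
        this.length = length;
    }
}

/** Hypothetical contract that a contributed parser would implement. */
interface SketchRegionParser {
    /** Splits the text into regions; offsets are relative to the whole document. */
    List<SketchRegion> parse(String text, int baseOffset);
}

/** Hypothetical registry standing in for the proposed extension point. */
final class SketchParserRegistry {
    private final Map<String, SketchRegionParser> parsersByContentType = new HashMap<>();

    /** Associates a parser with a content type id (assumed to be a plain string key). */
    void register(String contentTypeId, SketchRegionParser parser) {
        parsersByContentType.put(contentTypeId, parser);
    }

    /** Delegates to the contributed parser, or returns one opaque region if none is registered. */
    List<SketchRegion> parse(String contentTypeId, String text, int baseOffset) {
        SketchRegionParser parser = parsersByContentType.get(contentTypeId);
        if (parser == null) {
            List<SketchRegion> fallback = new ArrayList<>();
            fallback.add(new SketchRegion("UNKNOWN", baseOffset, text.length()));
            return fallback;
        }
        return parser.parse(text, baseOffset);
    }
}
```

An adopter would register one parser per embedded language, and the host editor would only need to know which content type applies to a given stretch of text. The sketch ignores incremental reparsing entirely.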
Actually, it works the other way around for us. The tokenizers parse through the source exactly once, and everything else is built on top of their output. The partitions created by the partitioner aren't part of an in-memory model that stays updated; they're created on the fly from the text region information (this means the results from StructuredTextPartitioner*.getPartition(int) don't fully comply with the contract, and we don't yet have a low-cost solution for that). A partitioner along the lines you describe would still need information about what constitutes an edge between two different partition types, and would need an appropriate state table to transition between them at the right times, essentially duplicating what we have already done with the source parser. And then you'd have to resolve the problems of incrementally reparsing that document as it's edited.
For script tags, the HTML partitioner is smart enough to recognize the script tag, read its type and language values, and generate the partition's type based on those values. You can see the XML partitioner doing something similar in StructuredTextPartitionerForXML.getPartitionType(ITextRegion, int), so that the partition type of a Processing Instruction's content varies with the specified target.
I suppose we could try something like allowing the region factory to call another tokenizer (or whatever) on the text contained by regions of specific contexts and return different implementation classes when needed. This skips over the more complicated issues with optimizing the reparsing. An example would be taking the text of an attribute and running it through another (generated) parser to detect XPath expressions, and when one is found, returning a subclass of AttributeValueRegion that encapsulates more information than normal (even if it's just a boolean saying "hey! there's an expression here!"). A partitioner could then make use of this information.
The interaction between the tokenizer and region factory amounts to one line of code, so it would be one place to start. You know, in theory.
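Purely as an illustration of that idea, and not of how the real region factory is written: the sketch below assumes a factory that looks up a secondary scanner by region context and returns a subclass when the scanner finds something. Every name in it (`SketchAttributeValueRegion`, `SketchExpressionValueRegion`, `SketchRegionFactory`, `SketchPartitions`, the context and partition type strings) is hypothetical, and a simple predicate stands in for a generated XPath parser.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical stand-in for an attribute value region; not the real SSE class. */
class SketchAttributeValueRegion {
    final int start;
    final int length;

    SketchAttributeValueRegion(int start, int length) {
        this.start = start;
        this.length = length;
    }

    /** The plain region never claims to hold an expression. */
    boolean hasExpression() {
        return false;
    }
}

/** Hypothetical subclass carrying the extra "there's an expression here" flag. */
final class SketchExpressionValueRegion extends SketchAttributeValueRegion {
    SketchExpressionValueRegion(int start, int length) {
        super(start, length);
    }

    @Override
    boolean hasExpression() {
        return true;
    }
}

/** Hypothetical factory that consults a secondary scanner for selected contexts. */
final class SketchRegionFactory {
    private final Map<String, Predicate<String>> scannersByContext = new HashMap<>();

    /** Registers a secondary scanner (a stand-in for another tokenizer) for one region context. */
    void addScanner(String context, Predicate<String> scanner) {
        scannersByContext.put(context, scanner);
    }

    /** Returns the richer subclass only when a scanner for this context matches the text. */
    SketchAttributeValueRegion createRegion(String context, int start, String text) {
        Predicate<String> scanner = scannersByContext.get(context);
        if (scanner != null && scanner.test(text)) {
            return new SketchExpressionValueRegion(start, text.length());
        }
        return new SketchAttributeValueRegion(start, text.length());
    }
}

/** Hypothetical partitioner helper that uses the flag to pick a partition type. */
final class SketchPartitions {
    static String typeFor(SketchAttributeValueRegion region) {
        return region.hasExpression() ? "sketch.partition.xpath" : "sketch.partition.default";
    }
}
```

The tokenizer itself stays untouched here; only the point where regions are created gains a lookup. As noted above, this does nothing for the harder problem of keeping reparsing cheap.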
Yeah...unfortunately, I learned this weekend that it was parsing..regions...partitions... and that the parsing was hard coded to a particular editor. It gets even trickier with XML parsing, in that you may have multiple namespaces, each of which may need a parser that handles content beyond what the base XML parser provides. XSL's XPath is one example; XInclude and XPointer are another. For XSL, there are only three types of attributes I want the XSL partition to appear in, but at the point where the partitions are set, I haven't found a way to get the namespace that the particular XML tag or attribute resides in. Consider that for XSL it might also be nice to have CSS and script editing support included based on the content type for a particular region...if it could just reuse portions of the existing editors through some sort of extension point, you could get a very powerful and feature-rich editor with little extra work by the adopter.
Just for those stumbling across this bug...
(In reply to comment #2)
> Yeah...unfortunately, I learned this weekend that it was
> parsing..regions...partitions... and that the parsing was hard coded to a
> particular editor.
It's actually tied to an org.eclipse.core.contenttype.contentTypes extension, as driven by the model loader that's associated with it.