[ptp-dev] Re: [photran] Telecon


On May 7, 2005, at 6:30 AM, Ralph Johnson wrote:

On 5/3/05 10:58 AM, "Craig Rasmussen" <crasmussen@xxxxxxxx> wrote:

3. Discuss integration of outside tools (parsing primarily) and their
relationship to the existing Photran parser.  We have been discussing
tools for static analysis to do performance monitoring with the
University of Oregon (and University of Munich in two weeks). These tools
require commercial quality compilers to munge LANL codes.  We have
been thinking that outside parsers could be used with Photran
as an option, in addition to the normal Photran parser. The University
of Munich uses NAG for parsing and we have been trying to get the IBM
Eclipse folk to encourage their Fortran compiler group to output their
IR in XML format.

I assume you really mean "integrate a Fortran front end" and not really
"Fortran parser". You can't really integrate just a parser unless it is
written in the same language.  A parser doesn't do much.  We need a
representation of Fortran programs, so at the very least we need an
abstract syntax tree representation. It would be possible to have a
front end that produces a program representation in XML (or something
more compact). But it would take a lot of work to hack a front end to
produce this.

See comments below on EDG, Cleanspace, and NAG.  We have also
implemented a rudimentary tool based on the --dump-parse-tree output
of gfortran (but we want to redo this and output the AST directly).

I was talking with Bjarne Stroustrup this week about a project of his to
make a standard intermediate format for C++ and a set of tools that
manipulate it.  He has been working at this project off and on for the
better part of a decade. He is working with two compiler groups to hack
their systems so that they produce the necessary information. It is
clearly the right long-term strategy for C++. It should have been done
years ago. But it is very hard. It takes good people many years to do
it. Fortran is not as hard to compile as C++, but it is harder than C or
Java.

It would be cool if Bjarne succeeds.  We are trying to work our
contacts at IBM to do this for the XLF Fortran compiler and I've talked with
other Fortran vendors at J3 meetings.

We looked at various Fortran compilers when the project started. We are
on our third parser right now. We got the grammar from someone else (the
hard part) and are generating the parser automatically. In that sense,
we are using someone else's parser, but it was still a lot of work and
doesn't deal with the lexical issues that are so important to Fortran. I
suppose you have better contacts than we do and could get source to
front-ends that we could not get. However, they won't be written in Java
and so integrating them with Eclipse tools will take a lot of work.

Do you know how hard it would be to input an XML representation
of a Fortran AST (produced by an external tool) into Java?
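
As a rough illustration of the question above, the sketch below reads a small XML tree into Java with the standard JAXP DOM parser and walks it. The element and attribute names (program, assignment, variable, literal, line) are invented for the example and do not come from any real compiler's output; an actual front end would define its own schema, and a large code base might call for a streaming parser instead of DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XmlAstSketch {
    // Hypothetical XML AST for the statement "x = 1"; all names are assumptions.
    static final String SAMPLE =
          "<program name=\"demo\">"
        + "<assignment line=\"2\">"
        + "<variable name=\"x\"/>"
        + "<literal value=\"1\"/>"
        + "</assignment>"
        + "</program>";

    // Parses the XML text and returns the root element of the tree.
    static Element parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML AST", e);
        }
    }

    // Counts element nodes in the subtree rooted at the given node.
    static int countElements(Node node) {
        int count = node.getNodeType() == Node.ELEMENT_NODE ? 1 : 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += countElements(c);
        }
        return count;
    }

    // Prints each element with its attributes, indented by tree depth.
    static void print(Node node, int depth) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(node.getNodeName());
        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            line.append(' ').append(a.getNodeName()).append('=').append(a.getNodeValue());
        }
        System.out.println(line);
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            print(c, depth + 1);
        }
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE);
        print(root, 0);
        System.out.println(countElements(root) + " AST nodes");
    }
}
```

Reading the XML is the easy part; the real effort would presumably go into agreeing on the schema and mapping the generic DOM nodes onto typed AST classes that Eclipse tools can use.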

If compiler vendors will work with us, we could have them emit an
intermediate representation that all the other tools would use. First,
we would have to define that intermediate representation. Different
projects would have different requirements. For example, our refactoring
project requires that we know exactly where each token came from. In
other words, each token needs to have its offset in the original file.
This is so we can pretty print without messing up the original
formatting. Most projects do not require this information, so compiler
vendors are unlikely to produce it unless we ask for it.
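
To make the token-offset requirement concrete, here is a minimal sketch of why offsets matter for refactoring. It is not Photran's implementation: the whitespace-splitting tokenizer is a naive stand-in for a real Fortran lexer, and the class and method names are hypothetical. The point is only that when each token records where it started, a rename can copy everything between tokens from the original file untouched.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetTokenSketch {
    // A token together with the offset of its first character in the source.
    static final class Token {
        final String text;
        final int offset;

        Token(String text, int offset) {
            this.text = text;
            this.offset = offset;
        }
    }

    // Naive tokenizer (assumption): tokens are maximal runs of non-whitespace.
    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            if (Character.isWhitespace(source.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < source.length() && !Character.isWhitespace(source.charAt(i))) {
                i++;
            }
            tokens.add(new Token(source.substring(start, i), start));
        }
        return tokens;
    }

    // Renames matching tokens; text between tokens is copied from the original,
    // so spacing and line breaks survive the edit.
    static String rename(String source, String oldName, String newName) {
        StringBuilder out = new StringBuilder();
        int copied = 0; // end of the region of source already emitted
        for (Token t : tokenize(source)) {
            out.append(source, copied, t.offset);
            out.append(t.text.equals(oldName) ? newName : t.text);
            copied = t.offset + t.text.length();
        }
        out.append(source, copied, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "      total   =  total + 1\n";
        System.out.print(rename(line, "total", "sum"));
    }
}
```

A real tool would also have to compare names case-insensitively and cope with fixed-form column rules, which this sketch ignores.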

Apparently there is a standard XML representation for an AST (but I
don't know much about it). I'll find out more information next week. The
token offset (at least the line number) is also needed by the current
Oregon and TUM (Technical University of Munich) tools.

What are the projects at Oregon and Munich?  Do they work with existing
compilers, or do they have their own parsers? What kind of information do
they need?

The University of Oregon has a project, Program Database Toolkit
(PDT, http://www.cs.uoregon.edu/research/paracomp/proj/pdtoolkit/)
that uses a common and fairly terse IR for both C++ and Fortran, called
the Program Database (PDB).  They use the EDG and Cleanspace
compiler front ends.

Munich (TUM) is interested in similar things (performance monitoring
tools). They use NAG's Fortran front end to create a standard XML IR
format.

At LANL we have large legacy codes in F77 (and earlier) and
some codes in modern Fortran.  Whatever tools we use must be
able to work with LANL codes.  In particular, they must be able
to parse fixed format files (and other F77 madness), see attached file.
Our worry is that custom parser tools won't work with LANL codes.

Regards,
Craig

Attachment: junk.f
Description: Binary data

