14223 – Log file format must be changed

Summary: Log file format must be changed

Status:	RESOLVED FIXED

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	Resources
Version:	2.0
Hardware:	PC All

Importance:	P3 normal
Target Milestone:	2.0 M6
Assignee:	DJ Houghton
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2002-04-19 10:41 EDT by DJ Houghton
Modified:	2002-06-03 11:17 EDT
CC List:	2 users

See Also:

Description DJ Houghton

2002-04-19 10:41:54 EDT

The log file format was changed to be XML. This has proved to be problematic 
and the format should be changed.

Problems mainly occur when there is a crash. On startup we write the meta tag 
and the beginning <log> tag but if we crash we don't have the opportunity to 
write the closing </log> element, thus making the document invalid XML and 
unable to be parsed.
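
The failure mode can be reproduced with a minimal sketch. It assumes a log
whose root element is <log>, as described above; the <entry> element and the
class name TruncatedXmlLog are hypothetical and used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TruncatedXmlLog {
    /** Returns true if the text parses as a well-formed XML document. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Raised, for example, when the root element is never closed.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** What the file looks like after startup plus one entry (hypothetical content). */
    static String openLog() {
        return "<?xml version=\"1.0\"?>\n<log>\n<entry>disk full</entry>\n";
    }
}
```

With these definitions, isWellFormed(openLog() + "</log>") returns true, while
isWellFormed(openLog()) returns false, which is the state a crash leaves behind.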

This bug will be used to track info on suggested format changes.

Comment 1 Dejan Glozic

2002-04-19 11:04:53 EDT

A suggestion from the PDE team that will probably be echoed by other interested 
parties:

Plug-ins that are interested in Log file content ultimately care about IStatus 
objects that are persisted there, not the format. If you open up the 
LogFileReader class (or something along those lines), you will free yourselves 
up to change the format at will. In other words, make LogFileReader the API, 
not the Log file format.

PDE needs this content in order to present IStatus entries that existed in the 
log file before Log View was created (including startup entries). PDE only 
cares about IStatus objects and the ability to use LogFileReader to fetch the 
array of entries is the only thing needed. How you persist and subsequently 
parse the Log file automatically becomes your implementation detail.

Comment 2 DJ Houghton

2002-04-19 15:33:50 EDT

We also need to consider not deleting the log file when we start up. I don't 
like the idea of an "ever growing" log file, but perhaps one log file per 
session? We have API for users to get the name of the log file, so we should 
be ok. Dejan, can you see any problems with this? I am thinking something 
along the lines of <date and time>.log:

20020418-124300.log
20020418-153021.log
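
A naming scheme like this could be sketched as follows. This is only an
illustration: the class and method names are hypothetical, and it uses the
java.time API, which postdates this discussion.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class SessionLogName {
    // yyyyMMdd-HHmmss matches the examples above, e.g. 20020418-124300
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    /** Builds the log file name for a session that started at the given time. */
    static String forSessionStart(LocalDateTime start) {
        return STAMP.format(start) + ".log";
    }
}
```

Because the fields run from year down to second with fixed widths, sorting the
file names alphabetically also sorts the sessions chronologically.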

Comment 3 Dejan Glozic

2002-04-19 16:02:14 EDT

Having more than one log file would open up a potential feature for the Log 
view: an 'Open File' action. I don't have a problem with this, as long as you 
provide API that gives me the name of the Log that belongs to the currently 
active session (the one the Log View is running in). That way, people will 
always see entries of the active session, and can request to load another 
(inactive) log by choosing it from the list.

Comment 4 DJ Houghton

2002-04-22 11:01:57 EDT

Summary of proposed changes:

1). New API (perhaps on Platform?) which takes the file name of the log file 
as an argument and returns an array of IStatus.

2). Platform.getLogFileLocation() returns the file name of the log file for 
the current session.

3). Log files will exist on a per session basis. They will no longer be named 
simply ".log" but will have some sort of unique identifier in their name. 
(date-and-time.log is recommended as it would easily allow users to associate 
a particular log file with a session)

4). Log files will not be deleted by the platform.

5). Log files will only be created when there is information to write to them. 
(e.g. if there are no warnings or errors then there is no file)

Comment 5 Dejan Glozic

2002-04-22 11:10:31 EDT

Looks good, but did you consider pollution? There may be hundreds of sessions 
with log files, so you may want to introduce a rolling window (e.g. last ten 
logs) and delete anything older. UI can expose the number as an option. 
Otherwise, you may litter the metadata with dozens, even hundreds of files.

Comment 6 John Wiegand

2002-04-22 12:06:30 EDT

Having multiple log files is an interesting idea.
Auto-cleaning up log files is a bad idea.  It is important that these logs are 
cleaned up explicitly.  We don't want to lose information.
If there is a concern about too many log files, you could put them in a .log 
directory under .metadata.  That would make them easier to clean up.

Comment 7 Dejan Glozic

2002-04-22 13:02:51 EDT

Should PDE LogView allow cleanup via view actions?

Comment 8 Jim des Rivieres

2002-04-22 14:35:14 EDT

There are actually 2 different problems here:
- we need a robust log mechanism that does not lose info between sessions
- PDE needs a way to parse the log to find out what happened while it
was inactive

The single "ever growing" log file is simpler than multiple log
files (this is what we had in VA/ST). In theory, the log file should be empty 
unless the product is experiencing internal failures. If it gets too big,
the user can delete it manually.

As for format, I propose something very simple and not XML based.
Each log entry looks like:

!ENTRY <date> <plug-in id> <severity> <code>
<text>

The first log entry written each session is preceded by the line:

!SESSION

The <text> in a log entry could be one or more lines, up to end of file
or the next line that begins with either "!ENTRY" or "!SESSION". The
text includes the IStatus message, exception backtraces, and some
representation of child statuses.

The session delimiter makes it possible to separate log entries coming
from separate sessions.

The log file is not kept open; it is opened only to append an entry,
and then closed, to minimize chances that the log is lost.
The log is only appended to; the platform does not delete it.
And the log is not written to unless there is an error (DJ's points 4 and 5).

No new API is required. PDE should read the log by opening ".log" in the
platform metadata area and parsing it itself (and closing it immediately).

I believe this mechanism is simple, robust, and addresses both requirements.
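
The append-only scheme proposed in this comment might look roughly like the
sketch below. The class and method names are hypothetical, the date is passed
in as an already formatted string because the proposal does not fix a date
format, and error handling is reduced to rethrowing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class AppendOnlyLog {
    private final Path file;
    private boolean loggedThisSession = false;

    AppendOnlyLog(Path file) {
        this.file = file;
    }

    /** Formats one entry; the first entry of a session is preceded by !SESSION. */
    static String render(boolean firstInSession, String date, String pluginId,
                         int severity, int code, String text) {
        StringBuilder sb = new StringBuilder();
        if (firstInSession) {
            sb.append("!SESSION\n");
        }
        sb.append("!ENTRY ").append(date).append(' ').append(pluginId)
          .append(' ').append(severity).append(' ').append(code).append('\n');
        sb.append(text).append('\n');
        return sb.toString();
    }

    /** Opens the file, appends one entry, and closes it again. */
    synchronized void log(String date, String pluginId, int severity, int code,
                          String text) {
        String block = render(!loggedThisSession, date, pluginId, severity, code, text);
        try {
            // Files.write opens, writes and closes in one call, so the file is
            // never held open between entries and is only created on first use.
            Files.write(file, block.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        loggedThisSession = true;
    }
}
```

Nothing is written until the first call to log, so a session without errors
leaves no trace in the file, and a crash can at worst truncate the last entry.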

Comment 9 Dejan Glozic

2002-04-22 14:48:28 EDT

It feels unnatural for interested tools (like PDE) to parse something that is 
written in an internal format. There is no way to avoid making an API here; it 
is simply a decision whether we make the format itself an API or add a method 
for getting an array of IStatus objects created by parsing the log.

I am still of the opinion that the platform should make the log file format 
internal by providing the API for getting the log status objects. This way, it 
can change the format in the future without breaking tools like PDE. In 
addition, platform can benefit from being the only one that both reads and 
writes the log ('IStatus [] getLogEntries()' API can be implemented in such a 
way to prevent conflicts when an attempt is made to read the log while it is 
written to).

Comment 10 DJ Houghton

2002-04-22 16:41:10 EDT

The problem is that we cannot create IStatus from the log file after the file 
has been written. A simple example is to consider the (optional and potentially 
nested) exception which is inside the status object. We can't create the 
original exception to include in a Status object.

What information are you trying to present to the user? Do you have examples 
of why you need status objects?

Comment 11 Dejan Glozic

2002-04-22 16:54:46 EDT

I don't strictly need IStatus objects. PDE Log View is currently registered as 
a log listener and any log entry added after it has been created will appear in 
the view. The view uses TableTree widget, so nested status objects can be 
shown. Therefore, the easiest way to expand the view's reach beyond its 
lifecycle is to somehow get the contents of the log file in the form of IStatus 
objects. 

Your observation is correct in that nested status objects will need to be 
supported (the view supports them already, so the issue is how to persist and 
recreate them using the log file). In this regard, XML was more appealing 
because it provided for hierarchy, but the need to close the XML file with 
the matching </root> tag makes it problematic in the face of platform crashes. 

I don't have an easy answer here. You will have to persist IStatus objects 
(including nested ones) in some way. If you do not restore status objects back, 
PDE will have to, so it is just a matter of who will do the parsing. My claim
is that PDE is currently the only plug-in that uses the log file content in
an organized way (by turning its content into something structured and 
presentable in the UI).  If there will be no public utilities to parse the
content of the log file beyond pointing plug-ins at its location, this whole
bug becomes an internal Core discussion on the log file format.

Comment 12 DJ Houghton

2002-04-24 21:04:41 EDT

The log file is no longer deleted each session; it is ever-growing. Will 
investigate in the future the potential for splitting the file into per-session 
files.

Changed format to be:

!SESSION

!ENTRY plugin_id severity code date
!MESSAGE text
!STACK text

!SUBENTRY depth plugin_id severity code date
!MESSAGE text
!STACK text

Core will not provide a log file parser and will leave that up to the clients.
Code complete and awaiting review before release.
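
Since parsing is left to clients, a client-side reader for this layout could
be sketched as below. This is not Core or PDE code. It assumes that the date is
the last field on the !ENTRY and !SUBENTRY lines and may contain spaces, and
that lines without a leading keyword continue the previous !MESSAGE or !STACK
field; neither assumption is stated in this bug. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LogParser {
    static final class Entry {
        final int session;      // 1-based count of !SESSION markers seen so far
        final int depth;        // 0 for !ENTRY, the declared depth for !SUBENTRY
        final String pluginId;
        final int severity;
        final int code;
        final String date;
        final StringBuilder message = new StringBuilder();
        final StringBuilder stack = new StringBuilder();

        Entry(int session, int depth, String pluginId, int severity, int code, String date) {
            this.session = session;
            this.depth = depth;
            this.pluginId = pluginId;
            this.severity = severity;
            this.code = code;
            this.date = date;
        }
    }

    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        int session = 0;
        Entry current = null;
        StringBuilder field = null; // the field that continuation lines extend
        for (String line : lines) {
            if (line.startsWith("!SESSION")) {
                session++;
                current = null;
                field = null;
            } else if (line.startsWith("!ENTRY ") || line.startsWith("!SUBENTRY ")) {
                boolean sub = line.startsWith("!SUBENTRY ");
                String[] p = line.split(" ", sub ? 6 : 5);
                current = null;
                field = null;
                if (p.length < (sub ? 6 : 5)) {
                    continue; // header cut short, e.g. by a crash: skip it
                }
                int i = sub ? 2 : 1; // index of plugin_id
                try {
                    current = new Entry(session, sub ? Integer.parseInt(p[1]) : 0, p[i],
                        Integer.parseInt(p[i + 1]), Integer.parseInt(p[i + 2]), p[i + 3]);
                    entries.add(current);
                } catch (NumberFormatException e) {
                    // malformed header: ignore this entry and its fields
                }
            } else if (current != null && line.startsWith("!MESSAGE")) {
                field = current.message;
                field.append(line.substring("!MESSAGE".length()).trim());
            } else if (current != null && line.startsWith("!STACK")) {
                field = current.stack;
                field.append(line.substring("!STACK".length()).trim());
            } else if (field != null && !line.isEmpty()) {
                if (field.length() > 0) {
                    field.append('\n');
                }
                field.append(line);
            }
        }
        return entries;
    }
}
```

The result is a flat list; a viewer that wants a tree can rebuild the nesting
from the depth values, attaching each sub-entry to the nearest preceding entry
of smaller depth.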

Comment 13 DJ Houghton

2002-04-25 09:39:25 EDT

Released.
Available in integration builds > 2002-04-24.

Comment 14 Dejan Glozic

2002-04-25 11:02:03 EDT

PDE LogView may have an interesting problem with the 'ever-growing' nature of 
the log file. The original design of the view assumed that the view's content 
is showing entries for the current session. Since the new log file will contain 
entries for all the sessions, PDE will have a problem figuring out how to 
ignore the entries that do not belong to the currently active session. Using 
the entries after the last !SESSION entry is not reliable because the current 
session may not have any entries because there were no errors in it. Do you 
have a tip on how PDE can figure out if the !SESSION entry belongs to the 
currently active session?

Comment 15 DJ Houghton

2002-04-25 11:21:43 EDT

Not really...I don't think that there's anything we can do. Even if we 
timestamp the !SESSION tag we still don't know which session (current or 
previous) it happened in.

Perhaps this is one reason why we should look at moving to one log file per 
session in the future.

Comment 16 John Wiegand

2002-04-25 11:26:46 EDT

Core must have already solved this problem: What algorithm do you use to 
decide when to add the !SESSION tag?

However you decide to do this, you can make API so someone can know:
anyLoggingThisSession() [poor name]

Comment 17 DJ Houghton

2002-04-25 13:03:49 EDT

Sorry, forgot that we do already have a means to determine whether or not the 
log file is dirty for the current session.

Can add API to return a boolean indicating whether or not we have logged 
anything yet in the current session.

Comment 18 Dejan Glozic

2002-04-25 13:42:27 EDT

That will probably be sufficient. I could test the API and if true, I would 
assume that the last !SESSION entry is current.
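
The check described here can be sketched as follows. The boolean parameter
stands in for the proposed API, whose actual name is not given in this bug;
the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CurrentSessionFilter {
    /**
     * Returns the log lines that belong to the running session.
     * loggedThisSession is the value the proposed API would report.
     */
    static List<String> currentSessionLines(List<String> allLines,
                                            boolean loggedThisSession) {
        List<String> result = new ArrayList<>();
        if (!loggedThisSession) {
            // Nothing logged yet, so the last !SESSION block is from an older session.
            return result;
        }
        int last = -1;
        for (int i = 0; i < allLines.size(); i++) {
            if (allLines.get(i).startsWith("!SESSION")) {
                last = i;
            }
        }
        if (last >= 0) {
            result.addAll(allLines.subList(last + 1, allLines.size()));
        }
        return result;
    }
}
```

Without the flag, a reader cannot tell an error-free current session from a
previous session that happened to be the last one to log anything.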

Comment 19 DJ Houghton

2002-04-25 14:11:44 EDT

Ok, see Bug 14649 to track the progress on this new API.

Comment 20 Paul Slauenwhite

2002-06-03 11:17:45 EDT

An alternative solution is to remove the root tag(s) from the .log.xml file, 
rename the .log.xml file back to .log (a file of XML fragments) and create an 
XML wrapper file (i.e. .log_Wrapper.xml) in the same directory for 
parsing/viewing:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE document [<!ENTITY log SYSTEM "./.log">]>
<document>
&log;
</document>
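
As a rough check of the wrapper idea, the sketch below writes a fragment file
and the wrapper above into a temporary directory and parses the wrapper with
the JDK's DOM parser. It assumes the parser is left at its default settings,
which resolve external entities; hardened configurations that disable them
would reject the wrapper. The <entry> element and the class name are
hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class WrapperDemo {
    static final String WRAPPER =
        "<?xml version=\"1.0\" standalone=\"yes\"?>\n"
        + "<!DOCTYPE document [<!ENTITY log SYSTEM \"./.log\">]>\n"
        + "<document>\n"
        + "&log;\n"
        + "</document>\n";

    /** Writes the fragments and the wrapper side by side, then counts <entry> elements. */
    static int countEntries(String fragments) {
        try {
            Path dir = Files.createTempDirectory("logdemo");
            Files.write(dir.resolve(".log"), fragments.getBytes(StandardCharsets.UTF_8));
            Path wrapper = dir.resolve(".log_Wrapper.xml");
            Files.write(wrapper, WRAPPER.getBytes(StandardCharsets.UTF_8));
            // Parsing from a File gives the parser a base URI, so "./.log"
            // resolves to the fragment file in the same directory.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(wrapper.toFile());
            return doc.getElementsByTagName("entry").getLength();
        } catch (IOException | SAXException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

This removes the need for a closing root tag in the log itself, but each
fragment still has to be complete: an entry cut off in the middle by a crash
would make the expanded document malformed again.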