Bug 195402 - [api] RSE should support gzipped tar archives (*.tgz, *.tar.gz)
Summary: [api] RSE should support gzipped tar archives (*.tgz, *.tar.gz)
Status: RESOLVED FIXED
Alias: None
Product: Target Management
Classification: Tools
Component: RSE (show other bugs)
Version: 2.0   Edit
Hardware: All All
: P4 enhancement (vote)
Target Milestone: 3.0 M6   Edit
Assignee: Martin Oberhuber CLA
QA Contact: Martin Oberhuber CLA
URL:
Whiteboard:
Keywords: api, bugday, contributed, helpwanted
Depends on:
Blocks: 219686
  Show dependency tree
 
Reported: 2007-07-04 11:19 EDT by Xuan Chen CLA
Modified: 2011-05-25 07:26 EDT (History)
2 users (show)

See Also:


Attachments
Patch for Bug 195402 (15.79 KB, patch)
2008-02-14 23:35 EST, Johnson Ma CLA
no flags Details | Diff
updated patch for bug 195402 (104.83 KB, patch)
2008-03-05 10:23 EST, Johnson Ma CLA
mober.at+eclipse: iplog+
Details | Diff
Updated patch with minor stylistic changes (123.67 KB, patch)
2008-03-12 09:53 EDT, Martin Oberhuber CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Xuan Chen CLA 2007-07-04 11:19:00 EDT
If I created a tar file using command (in Linux):

tar cvf openRSE3.tar eclipse

I was able to expand it using RSE.

But if I use this command:
tar cvzf openRSE3.tar eclipse
to create a jar file, I could not expand it.  I only got an empty node.

-z option is to create a compressed tar file, and .tgz suffix is a convetion for gzipped tar file.  So if we want to adding this support, we need to make sure the .tgz file extension will be recognized as well.  Right now, RSE does nto recognize it as a tar file.
Comment 1 Martin Oberhuber CLA 2007-07-04 11:37:35 EDT
The tar -z option is a nonstandard GNU extension and equivalent to 
  tar cfv - options | gzip2 > archive.tar.gz

IMHO, gzipped tarfiles are a different kind of archive and could be implemented by any extender by contributing the new kind of archive handler. Same for bzip2'd archives (tar -cfj --> creating archive.tar.bz2)

Extensions that should be recognized by the new kind of archive handler are 
  *.tar.gz
  *.tgz
  *.tar.bz2

As an other alternative, we could support these in the RSE default archive handler as well, but it's an implementation detail. At any rate, this sounds like a nice little job that contributors could help out with.
Comment 2 Martin Oberhuber CLA 2007-07-04 15:06:45 EDT
Also note that there is the JArchive project, which focuses on creating archive handlers -- we should not work on stuff that they already provide:
http://sourceforge.net/projects/jarchive/
"provide a standalone archive manipulation tool for Eclipse. At first we focus on the folowing archives: zip, war, jar, ear, t64, d64, rar, tar."

It is read-only for now; they seem to be working against EFS APIs:
http://wiki.eclipse.org/EFS#JArchive
at some point we might consider a bridge between our archive handler API and JArchive or other EFS-based archive handlers as back-end. Another EFS based archive access plugin exists in Platform UI:
http://wiki.eclipse.org/EFS#UI_Examples:_Zip_and_Memory_file_systems
Comment 3 Martin Oberhuber CLA 2007-12-14 05:23:04 EST
Adding gzip support should be really really simple because the JRE already has GZipInputStream / GZipOutputStream. Just create SystemTgzHandler as a subclass of SystemTarHandler, and register it via the archivehandlers extension point. For reference on using the GZipStreams see
   http://www.particle.kth.se/~lindsey/JavaCourse/Book/Code/P1/Supplements/Chapter09/File/ZipGZip/ZipGzipper.java

I guess I'd like to see that for 3.0...
Comment 4 Johnson Ma CLA 2008-02-14 23:35:52 EST
Created attachment 89809 [details]
Patch for  Bug 195402
Comment 5 Johnson Ma CLA 2008-02-14 23:43:08 EST
I have created a patch for this bug to support .tar.gar and .tgz files
Just like Martin pointed out, I use GzipInputStream/GZipOutputStream to wrap the streams used in SystemTarHandler
I also changed the way to check file extensions in ArchiveHandlerManager class, since the old code doesn't support file name like fool.tar.gz



Please review my patch and apply it if applicable.

(BTW, most test cases in FileServiceArchiveTest failed even before my patch.
Does anyone know the reason, or do i need some special setup to run those tests?)

Thanks

Johnson Ma
Comment 6 Xuan Chen CLA 2008-02-15 00:45:54 EST
The failing of tar file handling JUnit testcases has been fixed.  Please refer to this bug for more details:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=218491
Comment 7 Martin Oberhuber CLA 2008-02-15 04:38:26 EST
Thanks for the Patch - Xuan please consider for M6
Comment 8 Martin Oberhuber CLA 2008-02-28 16:54:04 EST
Patch looks good to me, just a few minor things:

1.) In the Copyright Headers of the files you're changing, please add a line
    listing your name and the change you are making. E.g.

    * Johnson Ma (Wind River) - [195402] Add tar.gz archive support

2.) As part of our due diligence as committers, we must ensure that all
    contributions are rightfully made under the EPL. That means, you need
    to verify that you have the right to contribute under the EPL and that
    you did not reference any 3rd party sources other than EPL sources.

    In the TM Project, we do this by asking contributors attach a corresponding
    note here on bugzilla, so could you please go ahead and add a comment here
    with the legal notice according to
  
http://www.eclipse.org/dsdp/tm/development/committer_howto.php#external_contrib

3.) I added an [api] marker to the bug since API is being added to the
    ArchiveHandlerManager class, which is public API, as well as the
    SystemTarHandler class, which is also public API. In such cases I'd
    like to see a summary of the API Changes Made in the form of a comment
    here on bugzilla, such that I can reference it in the build nots and
    release notes eventually.
    Also, for newly added API methods, please add an @since tag in their
    Javadoc.

4.) Your new classes "TgzFile", "SystemTgzHandler" have Copyright Headers
    saying "Copyright (c) 2008 Eclipse.org"
    I don't think that this is true. Since you are the Author of that class,
    the copyright owner is either yourself or your employer (in this case
    Wind River), but not the Eclipse Foundation. Please use the Copyright
    Header from file SftpFileService.java, for instance.

5.) I'm not sure that SystemTgzHandler should be public API, I think I'd
    prefer seeing it as an "internal" class

6.) Your change to UniversalFileSystemMiner seems to be wrong:
        setRegisteredHandler("tar.gz", SystemJarHandler.class)
    shouldn't the SystemTgzHandler be registered?

7.) I'd love to see some JUnit Tests for your new Tgz handler.
Comment 9 Johnson Ma CLA 2008-03-05 10:23:51 EST
Created attachment 91652 [details]
updated patch for bug 195402
Comment 10 Johnson Ma CLA 2008-03-05 10:26:32 EST
I, Johnson Ma, declare that I developed attached code from scratch, without referencing any 3rd party materials except material licensed under the EPL. {I am authorized by my employer to make this contribution under the EPL.}
Comment 11 Johnson Ma CLA 2008-03-05 10:32:05 EST
Hi Martin

I updated the patch according to your feedback as below:
*updated copyright and file headers.
*added @since @deprecated for those API changes
*fixed typo in UniversalFileSystemMiner
*added unit tests for the tar.gz support based on existing archive test cases
*fixed two test failures on linux in FileServiceArchiveTest 

Please take a look

Thanks

Johnson
Comment 12 Johnson Ma CLA 2008-03-05 10:40:47 EST
API changes for this bugfix are:
--ArchiveHandlerManager.java
*getExtension(File file) is deprecated, should use getRegisteredExtension(File file) instead.
*getExtension(String file) is deprecated, should use getRegisteredExtension(String file) instead. 

This is because define extension as any letters in the filename after the last "." doesn't work for file name like foo.tar.gz

--SystemTarHandler.java
*getTarOutputStream(File outputFile)
added this method to get output stream. So the subclass may return compressed output stream if needed

--TarFile.java
*getInputStream()
added this method the get the input stream. So the subclass may return compressed input stream if needed
Comment 13 Martin Oberhuber CLA 2008-03-12 09:53:51 EDT
Created attachment 92312 [details]
Updated patch with minor stylistic changes

Thanks for this excellent contribution.
I made the following additional changes:

  * Update copyright year 2008 everywhere
  * Moved SystemTgzHandler into an API package because otherwise it could
    not be used in UniversalFileSystemMiner
  * Fixed a bug in getRegisteredExtension() -- it would not work for uppercase
    file names on windows although it should be case insensitive IMHO (the
    now deprecated getExtension() was case insensitive)
  * Updated the test cases accordingly
  * Added constructor(String) to all test cases
  * Changed the Copyright Header of FileServiceArchiveBaseTest because that
    one was not invented by Wind River but really extracted from an IBM file.
    --> IBM should be main copyright owner, "Extracted from" comment with the
        Contributors.
Comment 14 Martin Oberhuber CLA 2008-03-12 09:56:48 EDT
Patch committed:

[195402] Add tar.gz / tgz archive support

See comment #12 for new API additions.
Comment 15 Martin Oberhuber CLA 2011-05-25 07:26:15 EDT
Comment on attachment 91652 [details]
updated patch for bug 195402

Patch from Johnson was eventually used.