Bug 490824 - PDF renderer does not include 4-byte UTF-8 characters.
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: BIRT
Version: 4.5.1
Hardware: PC Windows 7
Importance: P3 major
Target Milestone: ---
Assignee: Birt-ReportEngine-inbox@eclipse.org
Reported: 2016-03-31 15:21 EDT by Brent Kilgore
Modified: 2016-03-31 15:21 EDT

Description Brent Kilgore 2016-03-31 15:21:30 EDT
We are seeing an issue where 4-byte UTF-8 characters are not displayed in PDF output. They show up fine in the web viewer and in HTML output. One- through three-byte sequences work correctly.
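
For anyone trying to reproduce this, the following is a minimal sketch of how to tell which byte-length class a given character falls into. The sample characters and the class name Utf8ByteLengths are only illustrative assumptions; the report does not say which characters were actually affected.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of BIRT: reports how many bytes a string
// needs when it is encoded as UTF-8.
final class Utf8ByteLengths {

    /** Returns the number of bytes the given text occupies in UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Illustrative samples only, one per UTF-8 length class.
        String[] samples = {
            "A",             // U+0041, 1 byte
            "\u00E9",        // U+00E9, 2 bytes
            "\u20AC",        // U+20AC, 3 bytes
            "\uD83D\uDE00"   // U+1F600, 4 bytes (a surrogate pair in Java)
        };
        for (String s : samples) {
            System.out.printf("U+%04X -> %d UTF-8 byte(s)%n",
                    s.codePointAt(0), utf8Length(s));
        }
    }
}
```

Only the last sample belongs to the class of characters that goes missing in the PDF output described here.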

The field in question is supplied by a Java formatter class. Debugging this class shows that the value leaves the formatter as a proper UTF-16 sequence. The problem also occurs when the character is added directly in the report generator.
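
A character that takes four bytes in UTF-8 lies above U+FFFF, so in a Java String it is stored as a high/low surrogate pair. A rough sketch of the kind of check that confirms a value is well-formed UTF-16 is shown below. The class and method names are hypothetical, and this is not the actual formatter code from the report.

```java
// Hypothetical diagnostic helper, not part of BIRT or of the formatter
// class mentioned in this report.
final class SupplementaryCheck {

    /** True if the text contains at least one code point above U+FFFF. */
    static boolean hasSupplementary(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    /** True if every surrogate in the text belongs to a valid high/low pair. */
    static boolean isWellFormedUtf16(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed directly by a low surrogate.
                if (i + 1 >= text.length()
                        || !Character.isLowSurrogate(text.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low surrogate that completes the pair
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is invalid.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String value = "abc \uD83D\uDE00 def"; // illustrative value with U+1F600
        System.out.println("has supplementary: " + hasSupplementary(value));
        System.out.println("well-formed UTF-16: " + isWellFormedUtf16(value));
    }
}
```

If both checks pass on the formatter's output, the string is intact before rendering, which is consistent with the observation above.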

The symptom is odd, though, compared to other mangling I've dealt with. When a 4-byte sequence is encountered, it is completely omitted from the report. There are no garbled characters, "?"s, or whitespace where the symbol should occur.

Similar issues have occurred on AIX, but that was because the shell's locale was not set to UTF-8. Unfortunately, this is Windows, so there is no UTF-8 locale to set.
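
To compare the Windows and AIX environments, it may help to print the encoding the JVM itself has picked up. The snippet below is only a diagnostic sketch with a hypothetical class name. Whether the default charset has any bearing on the PDF output here has not been established.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic, not part of BIRT: shows which default encoding
// the running JVM is using.
final class DefaultCharsetProbe {

    /** Describes the JVM default charset and the file.encoding property. */
    static String describe() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this in the same JVM that hosts the report engine would show whether the platform default differs between the two systems.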