{{tag>category:"OpenInsight" author:"Jim Vaughan" author:"Mike Ruane" author:"j Vaughan" author:"Steve Epstein" author:"Oystein Reigem"}}

[[https://www.revelation.com/the-works|Join The Works program to have access to the most current content, and to be able to ask questions and get answers from Revelation staff and the Revelation community]]

==== Japanese char set (OpenInsight) ====

=== At 17 OCT 2001 09:01:28PM Jim Vaughan wrote: ===

Does the new 32-bit stuff have any support for the Japanese character set? If so, would it support this character set in the menus, forms and data?

----

=== At 18 OCT 2001 07:28AM Mike Ruane wrote: ===

Jim-

We're looking into it- as well as Chinese. One of the problems is that we don't speak Chinese or Japanese and expect trouble installing those versions of Windows.

Mike

----

=== At 18 OCT 2001 01:32PM j Vaughan wrote: ===

What kind of time frame are we looking at?

----

=== At 19 OCT 2001 10:47PM Jim Vaughan wrote: ===

I know it's hard to guess how long something like this might take, but ... I need to know. We have a customer in Japan that would like to buy but needs the Japanese char set. Give me a best case worst case. If you think it can be done it will take from.... to....

Thanks.

----

=== At 22 OCT 2001 07:18AM Mike Ruane wrote: ===

Jim-

I have a new machine I can test it on, and someone who can help me get it installed. I should have some more details by next week.

Mike

----

=== At 22 OCT 2001 04:22PM j Vaughan wrote: ===

You guys are great. I look forward to hearing how it goes.

----

=== At 29 OCT 2001 01:05PM Jim Vaughan wrote: ===

I just heard from my customer; they are meeting next week. Would it be possible to know if this is going to be available by then?

----

=== At 29 OCT 2001 02:26PM Mike Ruane wrote: ===

Jim-

We're formatting the machine today.

Mike

----

=== At 29 OCT 2001 03:29PM Jim Vaughan wrote: ===

Great, keep me updated.

----

=== At 29 OCT 2001 04:25PM Steve Epstein wrote: ===

Dear Jim and Mike,

I have asked the same question. I actually have a Japanese WIN2000 machine from our clients in Japan. Any testing I can do would be appreciated. I have the fonts, et al.

Steve

----

=== At 29 OCT 2001 05:24PM Mike Ruane wrote: ===

Guys-

Thanks- First blush seems to be a no, as we need Unicode, which would destroy our data since we make heavy use of ASCII 251 to 255 as our system delimiters.

Mike

----

=== At 30 OCT 2001 10:27AM Jim Vaughan wrote: ===

So what does that mean? Do you have any other avenues to pursue?

----

=== At 30 OCT 2001 06:25PM Oystein Reigem wrote: ===

That must be the next big project. After the 32-bit version. To rid OI of those troublesome delimiters.

Just trying to make myself popular.

- Oystein -

----

=== At 04 NOV 2001 03:57PM j Vaughan wrote: ===

So this is no, for now? Or no forever? If it's no for now, when in the future might it be available? I just need to give my customer an answer, even if it's one they don't like.

----

=== At 05 NOV 2001 06:42AM Oystein Reigem wrote: ===

Mike,

It would be nice if Unicode could be implemented in OpenInsight and kill dead the international-characters-versus-delimiters problem. But there are many questions on the way. I assume you've looked at some of them already.

There are many different Unicode encoding formats. Some of them are fixed-length (1, 2, 3 or 4 bytes per character), some variable (characters with a mix of different lengths).

I believe there are two basic alternatives if one wants to implement a multi-byte character encoding system in a database system like OpenInsight, where special characters or byte values are used to delimit various units of data during storage and computing.

One is to use a fixed-length character encoding format and let the delimiters be multi-byte too. This means, among other things, that the file system must be rewritten to handle multi-byte characters instead of single-byte characters.
I don't expect that can be done overnight.

The other is, as much as possible, to handle multi-byte encoded text as any other byte sequence, and keep the old single-byte delimiters. But then one must choose an encoding format that avoids collisions with the delimiters. E.g., with a 2-byte encoding format, neither of the 2 bytes must ever be in the range 250-255.

But is the latter possible? Is there a Unicode encoding format (e.g., one that can be used for Japanese) where no byte is in the range 250-255? I believe no. But there //are// formats where certain //other// byte values never occur. E.g., the UTF-8 2-, 3- and 4-byte encodings always have byte values with the highest bit set to 1 (to distinguish them from the single-byte UTF-8 encoding, which is plain old 7-bit ASCII). So perhaps by using that old trick with the bi-directional CHARMAP it's possible after all? E.g., shunt 250-255 down by 128.

Next question is how comparisons and sorting can be done on multi-byte data.

- Oystein -

PS. I don't know //that// much about Unicode. But I have colleagues who know a bit more. And there's the Unicode website.

[[https://www.revelation.com/revweb/oecgi4p.php/O4W_HANDOFF?DESTN=O4W_RUN_FORM&INQID=WORKS_READ&SUMMARY=1&KEY=32A6C7ADD37E9BBB85256AE90005A0B9|View this thread on the Works forum...]]
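
The byte-collision question raised in the last post can be checked directly. The following is only a rough sketch in Python 3, not anything taken from OpenInsight: the sample string, the choice of byte 254 as a field mark, and all function names are assumptions made for illustration. It relies on UTF-8 as it is defined today (code points up to U+10FFFF), and compares it with a 2-byte encoding (UTF-16) for byte values in the range 250-255.

```python
# Hypothetical illustration: byte values 250-255 stand in for the
# single-byte system delimiters discussed in the thread.
DELIMITER_BYTES = set(range(250, 256))
FIELD_MARK = bytes([254])  # assumed single-byte field delimiter


def delimiter_collisions(text, encoding):
    """Return the delimiter byte values that occur in the encoded text."""
    return DELIMITER_BYTES.intersection(text.encode(encoding))


def max_utf8_byte():
    """Encode every Unicode code point and return the largest byte seen."""
    largest = 0
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:  # surrogates cannot be encoded as UTF-8
            continue
        largest = max(largest, max(chr(cp).encode("utf-8")))
    return largest


def join_record(fields):
    """Build a record: UTF-8 fields separated by the single-byte mark."""
    return FIELD_MARK.join(field.encode("utf-8") for field in fields)


def split_record(record):
    """Split a record on the single-byte mark and decode each field."""
    return [part.decode("utf-8") for part in record.split(FIELD_MARK)]


if __name__ == "__main__":
    sample = "コンピューター"  # "computer" in katakana; contains U+30FC
    print(sorted(delimiter_collisions(sample, "utf-16-le")))
    print(sorted(delimiter_collisions(sample, "utf-8")))
    print(split_record(join_record(["Tokyo", sample])))
```

Under these assumptions, the 2-byte encoding collides with the delimiter range (the long-vowel mark U+30FC contributes byte 252), while the UTF-8 bytes stay below 250, so a record joined with a single-byte mark can be split again without damaging the text. `max_utf8_byte()` scans the whole code space and takes a moment to run; it only confirms that no byte in valid UTF-8 reaches the 250-255 range. Comparison and sorting of such data, the last question in the post, are not addressed by this sketch.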