ANSI - ASCII or what? (OpenInsight 32-bit Specific) [Revelation On-Line Wiki]


At 23 AUG 2003 04:36:43PM Wilhelm Schmitt wrote:

OpenInsight 32-bit Specific

I have a problem with foreign characters, especially the accented "u" (ú).

1. In the system editor I create the following lines of code, using the keyboard sequence ALT-163 to produce the letter 'ú'

function u12

u1='ú'

u2=char(163)

debug

return ''

2. After compiling (F9), the system editor still shows the accented letter "u", as typed in with ALT-163.

3. Inspecting variables in the debugger, the accented "u" is identified as hex FA and the variable u2 shows a pound sign.

4. Closing the system editor and opening again, the source code no longer shows the accented "u" but a delimiter (the editor doesn't reveal which one), while the debugger continues to show the accented "u".

Although data input and retrieval work fine with accented characters, the headache starts each time accented characters are evaluated and the code is inspected in the debugger, because the two editors use different character sets.

Also, several programs (like OR_VIEW, when listing reports defined by report builder) interpret accented "u" as a system delimiter.

How can I avoid this confusion?

Regards

Wilhelm

At 23 AUG 2003 07:42PM Pat McNerthney wrote:

Wilhelm,

It is very important in this area to specify what version of OI32 you are using. Much work has been done in this area and it has been a constant work in progress as the various issues have been discovered and addressed.

Accented lower case u is hex 0xfa and the decimal character 163 is the pound sign. Also, hex 0xfa is the @STM system delimiter.

The key sequence Alt-163 does nothing for me under Windows. To get characters that I cannot type on my keyboard, I use the Character Map application.

In the later versions of the System Editor there is a menu option View->Minimum Delimiter where you can set the minimum delimiter value recognized by the editor.

Pat

At 25 AUG 2003 09:32AM Oystein Reigem wrote:

Wilhelm,

The title of your posting says it all. You're mixing ANSI and ASCII. Or to be more precise you're mixing the 8-bit character sets used for Windows apps and MS-DOS apps.

For Windows apps in the Western world the character set mostly used is WinLatin-1. Or ANSI if you like.

For MS-DOS apps it's the DOS Codepage CP850 (DOSLatin1). ASCII is strictly a 7-bit system. But most people will understand what you mean if you say ASCII.

Character 163 (hex A3) in CP850 is "ú". But with OI you're in Windows, not in DOS. So you must use 250, because "ú" is character 250 (hex FA) in WinLatin-1.
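The mapping above can be checked outside OpenInsight. What follows is only a minimal sketch in Python (not BASIC+), assuming Python's built-in cp850 and cp1252 codecs are fair stand-ins for the DOS and Windows character sets discussed here. The helper name describe_byte is made up for illustration.

```python
def describe_byte(value: int) -> dict:
    """Decode a single byte value under the DOS and Windows code pages."""
    raw = bytes([value])
    return {
        "cp850": raw.decode("cp850"),    # DOS Latin-1 (assumed stand-in for "ASCII")
        "cp1252": raw.decode("cp1252"),  # WinLatin-1 (assumed stand-in for "ANSI")
    }

# Byte 163 (hex A3): accented u under DOS, pound sign under Windows.
print(describe_byte(163))

# Byte 250 (hex FA): accented u under Windows.
print(describe_byte(250)["cp1252"])

# Going the other way: the same letter encodes to different byte values.
u_acute = "\u00fa"  # the letter ú
print(u_acute.encode("cp850").hex(), u_acute.encode("cp1252").hex())
```

The same letter therefore has two different integer values depending on which code page is in play, which is why char(163) shows a pound sign in the debugger.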

You already know about the Alt+Numeric keyboard technique to key in characters not on the keyboard. But do you know there are two variations? If you know the ANSI integer value, say 250 for "ú", you key in the number with a leading zero: Alt+0250. If you know the DOS integer value, say 163 for "ú", you key in the number straight: Alt+163.

This stuff even works across the Win/DOS divide. So if the DOS value is all you know you can still key it in in a Windows app, as long as you use the DOS syntax. So you can key in Alt+163 in the System Editor and get "ú".

For character tables and more info see

- Oystein -

At 25 AUG 2003 10:48AM Wilhelm Schmitt wrote:

Pat,

we use OI413.

In the later versions of the System Editor there is a menu option View->Minimum Delimiter where you can set the minimum delimiter value recognized by the editor.

In the system editor I put @TM as the minimum value. The accented characters show correctly in the editor now. But when executing a report made with report builder, 0xfa is still interpreted as a system delimiter, mixing up the columns.

So how can I have the report builder display a predefined report where a column contains a word with the letter 0xfa?

I guess the same problem exists in other OI tools.

Regards

Wilhelm

At 25 AUG 2003 11:23AM Wilhelm Schmitt wrote:

Oystein,

didn't know about the second ALT-key technique. Thanks.

Entering data (with accented characters like 0xfa) is not so much a problem, the keyboard config allows direct typing. The confusion starts within OI tools that see 0xfa only as a system delimiter.
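To make the symptom concrete, here is a toy model in Python of a tool that treats every byte at or above some minimum value as a delimiter. It is not how OpenInsight is implemented internally. The value 0xFA for @STM comes from this thread; the value 0xFB for @TM and the function name split_on_delimiters are assumptions made for the sketch.

```python
STM = 0xFA  # @STM, as stated earlier in the thread
TM = 0xFB   # @TM, assumed here to be the next delimiter value up

def split_on_delimiters(data: bytes, minimum_delimiter: int) -> list:
    """Split data at every byte whose value is >= minimum_delimiter."""
    fields = []
    current = bytearray()
    for byte in data:
        if byte >= minimum_delimiter:
            # Delimiter found: close the current field and start a new one.
            fields.append(bytes(current))
            current = bytearray()
        else:
            current.append(byte)
    fields.append(bytes(current))
    return fields

# A column value containing the accented u, stored as a single ANSI byte (0xFA).
column = "Per\u00fa".encode("cp1252")

print(split_on_delimiters(column, STM))  # the word is cut where the 0xFA byte sits
print(split_on_delimiters(column, TM))   # the word survives as one field
```

Raising the minimum delimiter helps only in tools that honour the setting. Any tool that still treats 0xFA as @STM behaves like the first call.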

Regards

Wilhelm

At 25 AUG 2003 12:20PM Oystein Reigem wrote:

Wilhelm,

The confusion starts within OI tools that see 0xfa only as a system delimiter.

I think you have to go Unicode to avoid that. In UTF-8 "ú" is stored as two bytes that are not delimiters.

¹⁾

- Oystein -

At 25 AUG 2003 12:29PM Pat McNerthney wrote:

Wilhelm,

Another option you have in OI 4.13 is to enable UTF8 character mode. This is enabled via a new check box in the Application Properties dialog.

UTF8 character mode will store all Unicode characters above the 7-bit range as multi-byte character sequences, which do not conflict with the system delimiters.
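The claim that the multi-byte sequences avoid the delimiters can be illustrated with plain UTF-8 encoding. The sketch below is in Python and assumes the system delimiters occupy byte values 0xFA through 0xFF. Only 0xFA is confirmed in this thread, and the helper names are hypothetical.

```python
DELIMITER_BYTES = range(0xFA, 0x100)  # assumed delimiter range; only 0xFA is confirmed above

def utf8_bytes(text: str) -> list:
    """Return the UTF-8 byte values of text as a list of integers."""
    return list(text.encode("utf-8"))

def contains_delimiter_byte(data) -> bool:
    """True if any byte value falls inside the assumed delimiter range."""
    return any(byte in DELIMITER_BYTES for byte in data)

u_acute = "\u00fa"  # the letter ú

# In UTF-8 the letter becomes two bytes, neither of them a delimiter.
print([hex(b) for b in utf8_bytes(u_acute)])         # ['0xc3', '0xba']
print(contains_delimiter_byte(utf8_bytes(u_acute)))  # False

# Stored as a single ANSI byte, the same letter collides with 0xFA.
print(contains_delimiter_byte(u_acute.encode("cp1252")))  # True
```

More generally, valid UTF-8 never contains bytes in the range 0xF5 to 0xFF, so no encoded character can be mistaken for one of those values.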

Pat

At 25 AUG 2003 05:57PM Wilhelm Schmitt wrote:

Pat,

If I choose UTF8 is there any consideration I have to take into account in the program code, or is this being handled transparently in the background?

Does JOI also have to be configured for UTF8?

Regards

Wilhelm

At 25 AUG 2003 09:01PM Pat McNerthney wrote:

Wilhelm,

If I choose UTF8 is there any consideration I have to take into account in the program code, or is this being handled transparently in the background?

The system code handles this transparently quite nicely.

Currently in 4.13 you have to be aware in your own code that any non-ASCII character (anything greater than char(127)) is stored in the string using up to three bytes. This means that if you do a "len(char(128))", the result will be 2. This is only a problem if you are trying to perform operations on these high character values as individual characters.
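The byte counts can be reproduced with any UTF-8 encoder. The following Python sketch is only an illustration of the arithmetic, not of OpenInsight's len function. Python's own len counts characters, so the hypothetical helper byte_length encodes first to mimic a byte-counting len.

```python
def byte_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))

samples = [
    "A",        # U+0041, plain ASCII
    "\u0080",   # U+0080, the first code point above 127
    "\u00fa",   # U+00FA, the letter ú
    "\u20ac",   # U+20AC, the euro sign
]

for ch in samples:
    # One character each, but a varying number of stored bytes.
    print(f"U+{ord(ch):04X}: {len(ch)} character, {byte_length(ch)} byte(s)")
```

Code that walks a string position by position, or that assumes one byte per character when slicing, is where this difference would show up.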

OI 7.0 has been enhanced to deal with these multi-byte sequences as individual characters.

Does JOI also have to be configured for UTF8?

I have no idea how JOI goes about doing its thing.

Pat

At 25 AUG 2003 09:56PM Wilhelm Schmitt wrote:

Pat,

thanks for the information. We will try it out.

About UTF8 handling in JOI, I'll post it in the JOI forum directly.

Regards

Wilhelm


¹⁾

Or perhaps the bi-directional CHARMAP stuff still works. With this solution you can have "ú" stored in the database with a different integer value.