
At 01 APR 2008 10:09:06PM James wrote:

Hi

I have a random sequence of 8 hex codes and would like to iterate through each of them. Using len() plus the [] syntax seems to work in around 95% of cases, but not always… what am I doing wrong?

e.g.

test1=\C192C192C192C192\ ;* len(test1)=4!?

test2=\1234567890ABCDEF\ ;* len(test2)=8, yay

Thanks.


At 02 APR 2008 04:56AM The Sprezzatura Group (http://www.sprezzatura.com) wrote:

James,

Are you running in UTF8 mode? If so you need to use the GetByteSize() function, not Len().

Len() returns the number of *characters* in the string which may not be the same as the number of *bytes* in a UTF8 string. Under ANSI mode these should be the same.
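As a minimal sketch of that difference, using the first test string from the original question: the call shape GetByteSize(string) is an assumption here, since the thread names the function but does not show it being called, and the variable names charCount and byteCount are purely illustrative.

```basic
Declare function GetByteSize

* Eight bytes, written as hex in a backslash-delimited literal
test1 = \C192C192C192C192\

charCount = Len(test1)           ;* characters: reported as 4 in UTF8 mode
byteCount = GetByteSize(test1)   ;* bytes: expected to be 8 in either mode
```

In ANSI mode both values should agree, so a mismatch between them is a quick sign that the application is running in UTF8 mode.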

The Sprezzatura Group

World leaders in all things RevSoft


At 02 APR 2008 05:23AM Stefano Cavaglieri wrote:

James,

In addition to what Sprezzatura pointed out, please note that the specific sequence of characters in your first example may fall in the ISO-6429 sequence of control codes defined for ISO-8859 and Unicode encoded texts. Typically \C000\ to \C01F\ and \C180\ to \C19F\ could lead to the same problem.

Best,

Stefano


At 02 APR 2008 08:24PM James wrote:

Hi and thanks for the tips, they were spot on.

Yep, UTF8 is understandably causing the issue when examining the data as a string. GetByteSize is always correct - great!

How do I iterate through each of these 8 bytes within OI, as I need to examine each one independently?

p.s. the last serious development I did in OI was a very looong time ago, e.g. 32bit was "impossible" :). I don't remember even discussing UTF8/16 at that time… I'm really impressed with the changes + current direction.


At 02 APR 2008 11:52PM Matthew Crozier wrote:

I'd temporarily flick over to ANSI mode and then process as normal:

Declare function isUTF8()

    utf8mode=isUTF8()
    Call SetUTF8(0)
    For i=1 to len(byteString)   ;* len() counts bytes in ANSI mode
        byte = byteString[i, 1]
        * Do byte stuff
    Next i
    Call SetUTF8(utf8mode)

HTH - M@



At 03 APR 2008 01:06AM The Sprezzatura Group (http://www.sprezzatura.com) wrote:

Sorry Matt, you've hit a particularly sensitive personal spot :). Static values should always be resolved outside of the loop, so resolve the len() first and then use that value as the loop limit…
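A hedged sketch of Matt's loop with the length resolved once before the loop starts. It uses only the calls already shown in this thread (isUTF8, SetUTF8, len and the [] substring operator); byteCount is an illustrative name, and the Declare line is written in the plain name-list form.

```basic
Declare function isUTF8

utf8mode = isUTF8()                ;* remember the current mode
Call SetUTF8(0)                    ;* switch to ANSI so len() counts bytes

byteCount = len(byteString)        ;* resolved once, outside the loop
For i = 1 to byteCount
    byte = byteString[i, 1]        ;* one byte per pass in ANSI mode
    * Do byte stuff
Next i

Call SetUTF8(utf8mode)             ;* restore the original mode
```

Restoring the saved mode at the end matters as much as the hoisted length: any exit from the loop that skips the final SetUTF8 call would leave the session in ANSI mode.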

The Sprezzatura Group

World leaders in all things RevSoft


At 03 APR 2008 03:19PM Matthew Crozier wrote:

Fair comment! I do do that - I was just being lazy.

Cheers M@



At 03 APR 2008 05:14PM James wrote:

Thanks for the tip Matt, that will work. Swapping char modes at runtime is quite a cool feature… excellent!

  • Last modified: 2024/01/04 20:57