"Start byte" of OSBREAD statement (AREV32) [Revelation On-Line Wiki]

AREV32, Robert Heard, Dave Sigafoos, Barry Stevens, Andrew McAuley

At 12 MAR 2023 08:38:30PM Robert Heard wrote:

Hello AREV32 fans,

I have been using the OSBREAD statement in a program to read a large DOS file.

I had planned to read large blocks of data (32000 bytes) per read. After reading the first block, I thought I would back up a little and read from that point. But this strategy has produced strange results.

So I did a simple test whereby I read each block, and each time I advance the Start Byte by 32000 (the length of the block that I am reading). This test seems to work perfectly.

I'm hoping someone can explain the intricate details of this OSBREAD statement.

Robert Heard

At 12 MAR 2023 08:52PM Dave Sigafoos wrote:

Hi Robert … not sure what the question is as you state you did a test and it worked perfectly … yeah!

Basically that is the way to go: loop through until you get an EOF, chomping off x bytes each time and adding x to the starting point. Then read the next set of x bytes …

Dsig

At 12 MAR 2023 09:04PM Dave Sigafoos wrote:

By the way … by EOF I meant that there was no more data.

DSig

At 12 MAR 2023 11:53PM Robert Heard wrote:

Hi Dave,

Yes, I was a bit thin on detail…

Here are the "Start byte" and "Record length" used on the successive OSBREAD statements:

1) 0, 32000

2) 32000-15, 32000

3) 64000-15, 32000

…

The results I was seeing for the first and last FM-delimited lines of the buffer ("record") were nonsense. It made me think that it really had no idea where it was.
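As a side note on the arithmetic only, here is a small Python sketch (not AREV32 code) that tabulates which bytes each read would cover. It assumes the start byte is a zero-based offset and that each read returns exactly the requested length. The helper names `byte_ranges` and `overlaps` are made up for this illustration, and the "stepped" offsets are a hypothetical alternative, not something from the post. It does not by itself explain the nonsense results; it only shows how much each read overlaps the previous one.

```python
def byte_ranges(starts, length):
    """Return the inclusive (first, last) byte offsets each read would cover."""
    return [(start, start + length - 1) for start in starts]


def overlaps(ranges):
    """Number of bytes each read shares with the read before it."""
    return [max(0, prev_last - first + 1)
            for (_, prev_last), (first, _) in zip(ranges, ranges[1:])]


LENGTH = 32000

# Start bytes exactly as listed in the post.
posted = byte_ranges([0, 32000 - 15, 64000 - 15], LENGTH)

# Hypothetical alternative: back up 15 bytes from where the previous read ended.
stepped = byte_ranges([0, 31985, 63970], LENGTH)

print(posted, overlaps(posted))
print(stepped, overlaps(stepped))
```

With the posted offsets, the second read overlaps the first by 15 bytes, but the third read starts at byte 63985, immediately after the second read ends at 63984, so those two do not overlap at all.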

Then I did a test run with the following parameters to the OSBREAD statement:

1) 0, 32000

2) 32000, 32000

3) 64000, 32000

…

to EOF.

This test gave all perfect results, matching the source file byte-for-byte.

This makes me think the Start Byte must be a multiple of the record length (or buffer size).

I will do another test to see if I can re-read an earlier buffer, when I go back a multiple of the buffer size.

Robert.

At 13 MAR 2023 05:09AM Barry Stevens wrote:

I am sure the help example explains it. Modified for your use:
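equ RECSIZE$ to 32000

* inputFileHandle is assumed to have been opened already with OSOpen,
* and NULL$ is assumed to be equated to "" elsewhere
readOffset = 0
Loop
   * Read the next block of up to RECSIZE$ bytes, starting at readOffset
   OSBRead data From inputFileHandle At readOffset length RECSIZE$
   error = status()
Until data = NULL$
   * Advance by the block size
   readOffset += RECSIZE$  ;* or  readOffset += len(Data)
Repeat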

At 13 MAR 2023 05:13AM Barry Stevens wrote:

I am also curious why you initially were doing the -15 to the offset?

At 13 MAR 2023 05:17AM Andrew McAuley wrote:

I'd guess that this was an attempt to deal with splitting records over 32K boundaries. The normal technique is to read the 32K, find the last field mark, and carry forward that remainder to prepend to the next read.
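The carry-forward technique described above can be sketched roughly as follows. This is an illustration in Python, not AREV32 code. It assumes the field mark is the single byte 254 and that the file is read sequentially in fixed-size chunks. The function name `read_records` and the sample data are invented for the example.

```python
import io

FM = b"\xfe"  # field mark, assumed here to be the single byte 254


def read_records(stream, chunk_size=32000):
    """Yield complete FM-delimited records from a binary stream, chunk by chunk."""
    carry = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Prepend whatever was left over from the previous read.
        buffer = carry + chunk
        last_fm = buffer.rfind(FM)
        if last_fm == -1:
            # No field mark yet: keep accumulating until one turns up.
            carry = buffer
            continue
        # Everything before the last field mark is made of whole records.
        for record in buffer[:last_fm].split(FM):
            yield record
        # The remainder after the last field mark is carried forward.
        carry = buffer[last_fm + 1:]
    if carry:
        # Final record with no trailing field mark.
        yield carry


data = b"alpha\xfebravo\xfecharlie"
print(list(read_records(io.BytesIO(data), chunk_size=4)))
```

The deliberately tiny `chunk_size=4` forces every record to straddle a chunk boundary, which is the case the remainder handling is there for.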

NB AREV32 isn't restricted to 64K data limits, so reading much bigger chunks will be faster. Just try not to OSREAD multi Gigabyte files….

The Sprezzatura Group

At 13 MAR 2023 05:22AM Barry Stevens wrote:

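If you are extracting @fm-delimited data rows, maybe you should do it this way so that you are not truncating records.
equ RECSIZE$ to 32000

* Assumes every record, including its @fm, fits within RECSIZE$ bytes
readOffset = 0
AllMyFmRecords = ''
Loop
   OSBRead data From inputFileHandle At readOffset length RECSIZE$
   error = status()
Until data = NULL$
   * Take everything up to (not including) the first @fm in the block
   MyFmRecord = data[1, "F":@fm]
   AllMyFmRecords := MyFmRecord:@fm
   * Restart the next read just past that record and its @fm
   readOffset += len(MyFmRecord) + 1
Repeat

* Strip the trailing @fm
AllMyFmRecords[-1,1] = ''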

At 13 MAR 2023 11:17PM Robert Heard wrote:

Hello all,

Thanks to everyone for your feedback and help.

I did not know that OSREAD could handle large DOS file reads.

The reason I was going back a bit on subsequent reads is that the routine is SEARCHING for data. If the string I am searching for spans two buffers, it won't report a match.

I think the best approach would be to prefix the final partial field (the one missing its @FM) to the next buffer. That seems the logical solution.
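For a plain string search, a different technique from carrying the partial field is to keep only the last len(pattern) - 1 bytes of each buffer and prepend them to the next one. A rough Python sketch of that overlap idea follows. It is an illustration, not AREV32 code, and the function name `find_all` and the sample data are hypothetical.

```python
import io


def find_all(stream, pattern, chunk_size=32000):
    """Return the absolute byte offset of every occurrence of pattern in stream."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    hits = []
    tail = b""
    base = 0  # absolute file offset of buffer[0]
    keep = len(pattern) - 1
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer = tail + chunk
        pos = buffer.find(pattern)
        while pos != -1:
            hits.append(base + pos)
            pos = buffer.find(pattern, pos + 1)
        # Keep just enough bytes to complete a match that starts near the end.
        # The tail is shorter than the pattern, so no match is counted twice.
        tail = buffer[-keep:] if keep else b""
        base += len(buffer) - len(tail)
    return hits


data = b"xxNEEDLExxxxNEEDLE"
print(find_all(io.BytesIO(data), b"NEEDLE", chunk_size=5))
```

Because the retained tail is one byte shorter than the pattern, it can hold the start of a match but never a whole one, which is why a fixed back-up distance needs to be tied to the length of the search string.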

Thanks again, much appreciated.

Robert.
