Database Graphics Toolkit - Blackhawk Data Corporation by Mark Hirst
RTP Series - RTP57
RTP Series - RTP57A
LINEAR HASH FILE STRUCTURES - Part 1
RTP Series - RTP57
RTP Series - RTP57A
LINEAR HASH FILE STRUCTURES - Part 1
QTIPS - DOS File Names
DOS Interfacing (Part II)
VERBatim - V116
@ATTACK - @Pri.File
@ATTACK - @Rollout.File
File Variables
How Indexes Are Updated
Index Record Layouts
QTIPS - File Variable of File In SELECT Statement
QTIPS - Amending non-Attached Files
LINEAR HASH FILE STRUCTURES - Part 1
Index Flush
QTIPS - File Handle Structure
RTP Series - RTP9
RTP Series - RTP50
VERBatim - V25
@ATTACK - @Files
Utility Diskette # 3 - Part I
VERBatim - V86
@ATTACK - @Help.Level
@ATTACK - @Window.Level
Viewer
QTIPS - Using INIT.VIEW with Printers
REVMEDIA Revisited
QTIPS - Sub-Headings in RLIST (Revisited)
Redefining Keys
RTP Series - RTP53B
Prompt Help
VERBatim - V124
Popups
@ATTACK - @Environ.Set
@ATTACK - @Edit.Keys
@ATTACK - @Int.Const
@ATTACK - @HW
@ATTACK - @Modal
@ATTACK - @Move.Keys
@ATTACK - @Priority.Int
@ATTACK - @Macro.Mode
Utility Diskette # 3 - Part I
Utility Diskette # 3 - Part II
Utility Diskette # 4
QTIPS - FOR/NEXT variables
File Variables
LINEAR HASH FILE STRUCTURES - Part 1
QTIPS - File Handle Structure
LINEAR HASH FILE STRUCTURES - Part 1
Caching in on the Frames Array - Mike Pope
VERBatim - V126
Esc.To.Exit
Uncommon Knowledge - WC_WST_CHAR%
DOS Interfacing (Part II)
Utility Diskette # 4
@ATTACK - @Attrbt.Ptr
@ATTACK - @Query.Table
REVMEDIA Revisited
Uncommon Knowledge - WC_Table_Exit_Mode%
QTIPS - New Catalyst Option
Version 3 Technical Highlights - Deleting Tables Programmatically
Version 3 Technical Highlights - Aliasing Tables Programmatically
Version 3 TCL Subroutines - Creating Tables
Version 3 TCL Subroutines - Deleting Tables
Version 3 TCL Subroutines - Aliasing Tables
Symbol Table Structure
@ATTACK - @CPU.Type
@ATTACK - @HW
Reader's Clinic - Volume Pointer Record
QTIPS - Skipping Prompts
SecureUser
VERBatim - V25
@ATTACK - @Files.System
Advanced Revelation Initialisation Sequence (Overview) by Mike Pope
REVMEDIA Revisted
QTIPS - File Handle Structure
@ATTACK - @Return.Value
Utility Diskette # 4
Version 3 Technical Highlights - Creating New Accounts Programmatically
Version 3 Technical Highlights - Securing Accounts
Version 3 Technical Highlights - Deleting Accounts
Version 3 TCL Subroutines - Creating New Accounts
Version 3 TCL Subroutines - Deleting Accounts
Reader's Clinic - Different Id Same Record
RTP Series - RTP25
QTIPS - String Space
Reader's Letters - Jim Owen
QTIPS - Finding/Replacing Spaces With The Editor
Networked %SK%
RTP Series - RTP25
A RevTechie Replies - And Miscellaneous Jottings - Mike Pope - Revelation Technologies (UK) Ltd
Gas Bar
VERBatim - V9
@ATTACK - @Rec.Count
QTIPS - Replacing GAS.BAR routine during PERFORM "SELECT"
A RevTI Techie Replies - Mike Pope - Revelation Technologies (UK) Ltd
Base Conversions
QTIPS - @Date.Format
@ATTACK - @Date.Format
QTIPS - Short Cut Implicit Formatting
Utility Diskette # 4
Capture Playback and Convert.Keystrokes
Playing with Scan Codes
QTIPS - Compiling Protection Code
QTIPS - Invalid Code and Command
QTIPS - Code/Command Help
Compiling 64K on a Shoestring by Blaise Wrenn (LexStat Systems Ltd)

RevMedia FKB

DocumentV3I1A1
TitleLINEAR HASH FILE STRUCTURES - Part 1
KeywordsLINEAR
HASH
FILE
STRUCTURE
LK
OV
FRAMES
TextDave Harmacek of Harmacek Database SYSTEMS had a recent problem where his
client corrupted his Linear Hash File To recover the remaining information
it was necessary to deal WITH the files at the DOS level The file structure
was not self evident so we had to undertake some research The starting
point was Technical Bulletin # 29 by Ron Phillips In this he sets out most
of the salient facts relating to the way in which Linear Hashing works
Unfortunately it was not written WITH a view to providing all the
information required to produce an LH fixing utility so some elements of it
lack detail whilst others are a little misleading Over the next two issues
the physical structure of Linear Hash files will be addressed In this the
first article attention will be given primarily to the LK portion of the
file and the structure of pointers The second article will consider the OV
portion and how frames are linked

Header Information
Every LH file is split physically into two files the LK file and the OV
file The LK file contains the primary data frames and the OV contains the
overflow Each of these files is split up into a number of "frames"
Normally these frames are 1024 bytes in length although this size can be
changed when the file is created or recreated Each frame is made up of some
header information followed by the actual data in the frame The header
information can be one of four types

LK frame First Group Header
LK frame Subsequent Group Headers
OV frame Free Frames Group Header
OV frame Subsequent Group Headers

A frame can be identified by looking at the first byte in the frame which
will have one of the following values

ASCII(26) First LK Group Header
ASCII(13) Subsequent LK Group Headers
ASCII(7) Free Frames OV Header
ASCII(14) Subsequent OV Group Headers

Header Structure
The structure of the LK header is described in the following table (Where
(x) is length)

ÚÄÄÄÄÄÂÄÄÄÄÂÄÄÄÄÄÄÄÂÄÄÄÄÂÄÄÄÄÄÄÂÄÄÄÄÄÂÄÄÄÄÄÄÂÄÄÄÄÄÄÄÄÄÂÄÄÄÄÄÂÄÄÄÄÄÄÄÄÄ¿
³Byte ³ 1 ³ 2 5 ³6 9 ³10 13 ³14 15³16 19 ³ 20 ³21 22³23 26 ³
ÃÄÄÄÄÄÅÄÄÄÄÅÄÄÄÄÄÄÄÅÄÄÄÄÅÄÄÄÄÄÄÅÄÄÄÄÄÅÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄ´
³Group³Type³Forward³Skip³Modulo³Frame³In Use³Threshold³Size ³Rec Count³
³1 LK ³ (1)³ (4) ³(4) ³ (4) ³Size ³ (4) ³ (1) ³Lock ³ (4) ³
³ ³ ³ ³ ³ ³(2) ³ ³ ³ (2) ³ ³
ÃÄÄÄÄÄÅÄÄÄÄÅÄÄÄÄÄÄÄÅÄÄÄÄÅÄÄÄÄÄÄÅÄÄÄÄÄÅÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄ´
³LK ³Type³Forward³Skip³Modulo³ ³ ³ ³ ³ ³
³Rest ³ ³ ³ ³ ³ ³ ³ ³ ³ ³
ÀÄÄÄÄÄÁÄÄÄÄÁÄÄÄÄÄÄÄÁÄÄÄÄÁÄÄÄÄÄÄÁÄÄÄÄÄÁÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÙ

Type Is a single byte identifying the frame type as above
Forward Is a pointer to the next OVERFLOW frame
Skip Is a pointer to the OVERFLOW frame where the next record actually
begins Hence if the CURRENT record is very large the system can
skip out the intervening OVERFLOW frames and go straight to the
OVERFLOW frame containing the next record
Modulo The modulo (number of primary frames) in the file Note that this
value may change between Group 1 and other Groups Group 1 is
always the most CURRENT modulo but other groups have a modulo
which reflects the modulo of the file when the particular group
was relocated (as the file grows or shrinks)
Framesize The size of each frame This must be the same for each frame in
the file but can change FROM file to file All utilities written
to access LH files at the DOS level ought to take account of this
fact and not assume a standard 1024 byte frame size In practice
frame sizes seem to have little effect on PERFORMANCE
In Use Records how much space is currently taken up by the file This
figure is used to work out when the file ought to be resized
Threshold This figure is the THRESHOLD figure at which resizing will
commence If the percentage utilisation drops below this figure
the file will have its modulo decreased if it goes above this
figure the file will have its modulo increased
Size Lock Set by system processes to indicate whether the file may be
permitted to expand/contract
Rec Count Number of records in the file


Converting FROM Base 256 to Base 10
The information is stored in a Base 256 format WITH the least significant
byte first Thus to convert the information FROM Base 256 to Base 10 one
could use the following code segment


CONVERT:
RET_VALUE = seq(VALUE[1 1])
RET_VALUE += 256 * seq(VALUE[2 1])
RET_VALUE += 65536 * seq(VALUE[3 1])
RET_VALUE += 16777216 * seq(VALUE[4 1])
VALUE = RET_VALUE
RETURN


The above code segment works for all lengths of number up to 4 digits
However remembering that V118 is used for exactly these SORT of
conversions but only to THREE figures it is actually approximately three
times quicker to use the following code segment (on a machine without an
80x87) (for four digit numbers)


CONVERT:
CALL V118(VALUE[1 3] LOW_BITS)
HIGH_BIT = 16777216 * seq(VALUE[4 1])
RET_VALUE = HIGH_BIT + LOW_BITS
RETURN

Note that if the number being converted is less than three digits in length
(as WITH frame size) and size lock the value to convert must be padded to 3
digits WITH CHAR(0)s (i e REPLACE the V118 call with
V118(FMT(VALUE[1 3] "L(" : CHAR(0) : ")#3" LOW_BITS))

Records Within The Frame
After the header block within the frame the data records begin These are
stored sequentially and at the end of each record is a record marker
(CHAR(255) At the end of all the records within a group is an "End of
Group" marker a CHAR(128) To enable the system to MOVE FROM record to
record rapidly within a frame (and ALSO to be able to detect frame format
errors) each record is preceded by a STRING of between 2 and 6 bytes
combining the length of the record WITH the length of the record id

The way in which these are stored is a little confusing The maximum record
size is 64K and the maximum record id size is theoretically the same 64K
can be represented in Base 256 by two bytes It might therefore seem
reasonable to assume that each record would be preceded by four bytes two
for the record length and two for the id length This is not the CASE In
storing the record and ID lengths AREV tries to optimise the information
and only stores as much information as it needs to If a record id or a
record length is short only one byte is needed to represent it so AREV
only stores one byte

To ensure that the system knows that only one byte is being used (or two or
three) AREV FLAGS the last byte in the length chain To do this the MSB
(Most Significant Bit the eighth bit of the number byte) is set to 1 If
the MSB is not used the largest size the byte can represent is CHAR(127)
Thus if a byte is greater than CHAR(127) the MSB must be set and this must
be the last byte in the length chain Finding out the Record length and Id
length is then achieved by starting at the first byte in the data frame and
reading one byte at a TIME until the byte is greater than CHAR(127) The
bytes READ up to this point represent the record length The Id length is
found by continuing to take a byte at a time until the next byte greater
than CHAR(127) is found The bytes read up to this point represent the Id
length It is traditional to TEST for bits being set by masking WITH other
bytes Thus to check for the eighth bit being set one would BITAND with
128 This will only return true if the eighth bit is set

The approach of only storing the minimum required has one problem Since
the MSB is used as a flag one bit per byte cannot be used to store
information There are thus only seven data bits per byte meaning that each
byte can only store up to 127 The maximum that can be stored in two bytes
is therefore 16 256 (127 + (127 * 127)) To store a record length over
16 256 then three bytes are required giving
127 + (127 * 127) + (127 * 127 * 127) = 2 064 639
It is thus possible that in some circumstances this approach may use more
bytes per record than two eight bit bytes Normally this will not be the
case

Unlike the information stored in the header block this information is
stored WITH the most significant byte first When the bytes making up the
lengths have been identified they can be converted into DECIMAL USING the
following code fragment

CONVERT_7_BITS:
BEGIN CASE
CASE LEN(V) = 3
V = (SEQ(V[1 1]*16384) + (SEQ(V[2 1]*128)
V += SEQ(V[3 1]) 128 ; * Strip flag bit
CASE LEN(V) = 2
V = SEQ(V[1 1]*128 + SEQ(V[2 1] 128
CASE LEN(V) = 1
V = SEQ(V) 128
END CASE
RETURN


(Volume 3 Issue 1 Pages 4 7)
[revmedia/copyrigh.htm]

Page last modified: 08/02/03