
                                 Z

       z - a fast pager for ascii files on IBM-compatible PCs
       ------------------------------------------------------

Purpose
-------

z is intended to allow rapid examination of large or small ascii files
on fully IBM-compatible PCs.  The command set is based on the Unix
utility 'less', though the code is completely different.  Extra features
have been added to support PC-specific functions, and to make the pager
more convenient for reading large ascii texts such as whole books, where
the file may be read in more than one session.

Command line options
--------------------

Assuming 'z' is the name of the binary, the pager may be invoked in either 
of the following forms:

	z <opts> <path> ...

		to read one or more files, for example  c:> z file1.txt text*.txt
		Wildcards * and ? may be used in filenames, in the standard
		DOS form.

	z <opts>

		to read from standard input, usually a pipe, as for
		example in a command line like  c:> dir c:\ | z.
		Other ways to make 'z' read standard input are to use '-' 
		or 'stdin' as the path. 

	<opts> can be:
		-h		to get a summary of command-line options
		-a<n>           to select display attribute 0..255
				(a decimal number, 7 is normal, 112 is
				reverse video, others produce various
				colour schemes - experiment!).
		-b              use this option to select BIOS calls
				rather than direct memory access. This
				option will hardly 'whizz', but will work
				on any display.
		-m              normally, display memory is at 0b8000h. For
				Hercules and possibly other obscure hardware
				it is at 0b0000h.  If it is, use this option.
		-i<n>		use n=0 to select case-sensitive expression
				matching, default is case-insensitive.

Commands
--------

Commands may be entered at any time the status line shows the file name and
position within the file.  All commands are a single key-press.  A numeric
argument may be entered before a command, and may influence the operation
of that command - this is represented by <n>.  <a> represents a single
letter entered at a prompt after a command.  <text>\r represents a text
string terminated by the ENTER/CR key, entered at a prompt after a command.

The commands are:

	CR CurDn e j		Move down <n> more line(s), default is 1 line.

	CurUp y k		Move up <n> line(s), default is 1 line.

	SP PagDn f		Move down <n> more screen(s),
				default is 1 screen.

	PagUp b			Move up <n> screen(s), default is 1 screen.

	G			Go to line <n>,  default is last line.

	g			Go to line <n>,  default is first line.

	d			Move down <n> lines - if <n> is absent
				or zero, moves the same as the last d
				or u command - default 10 lines.

	u			Move up <n> lines - if <n> is absent or
				zero, moves the same as the last d or
				u command - default 10 lines.

	h F1			Display memo of commands.

	q			Finish reading the current file.  If there
				are more files on the command line, read
				the next one.

	ESC 			Exit pager, abandon all further files on
				the command line.

	t HOME			Go to the start of the file.  Can also use
				'g' with no argument.

	END			Go to end of file (display last screenful).
				Can also use 'G' with no argument.

	m<a>			Set a mark at the current displayed line
				(top of screen).  <a> should be a letter,
				either case - there are 26 marks.

	'<a>			Go to a mark previously set by 'm'.

	p %			Go to the line <n> percent into the file
				(based on line number).

	a			Change display colour.  If <n> is zero or
				not given, will cycle through some possible
				colours, otherwise sets the display 
				attribute to <n>.  7 is normal, 112 is
				reverse video.

	i			Change 'ignore-case' flag. If <n> is odd,
				the flag will be set, else it will be
				cleared.

	s			Save index.  This creates a file with the
				same name as the file being read, with
				extension '.ind', containing sufficient
				information to make access faster if the
				file is re-read later.  The positions of
				the marks are also saved.

	r			The buffering system can get upset, especially
				if corrupt indexes exist. If this occurs, 'r' 
				will clear down the buffer allocation system,
				and remove confusion.

	/<text>\r ^S		Search for text, forwards from the line
				after the current line.  If <n> is given,
				searches for the <n>th occurrence.  Text
				must be entirely on one line.  Matching
				uses simple regular expressions.

	n			Repeat the last search, in the forward
				direction.

	?<text>\r ^R		As for '/', but searching backwards from
				the line above the current line.

	N ,			Repeat the last search, in the backward
				direction.

	x			Switch to hex mode.


Hex mode
--------

Hex mode is a separate function within the pager, with its own menu of
commands.  These are:

	CR CurDn		Down one line.

	SP PagDn		Down one page.

	y CurUp			Up one line.

	u PagUp			Up one page.

	<n>g			Go to a (hex) address.

	t HOME			Go to top.

	z END			Go to end.

	h F1			Help.

	q ESC			Exit back to main text mode.

Hex mode starts up with the address set to the address of the first
character on the first line of the previous text display.
 
	
Screen format
-------------

The screen area is divided into two regions: the bottom line appears in
reverse video and contains various status messages, and the rest of the
screen is used for the text of the file.

When the pager is ready to accept a command, the status line shows one
of two formats, depending on whether the end of file has been reached
yet.  Both formats show the name of the pager project ('z'), the
name of the file being read, and the current line number.  If the end of
the file has not been seen, the file length in bytes is displayed;
otherwise the number of the last line and the percentage (based on
lines) into the file appear.  Next, in parentheses, is the current state
of the "ignore-case" flag. Finally, a number in square brackets shows
the current value of the argument <n>, which may be entered before any
command.

The screen is always 80 columns, but the number of rows may be changed. 
The pager will always default to 25 rows, and will not auto-detect
different screen formats, but it can be made to use any number of rows
between 10 and 100 by setting an environment variable ROWS.
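
As an illustration only, the ROWS handling described above might look
like the following Python sketch.  The name rows_from_env is
hypothetical, and falling back to 25 rows for a missing, non-numeric or
out-of-range value is an assumption; the text above gives only the
default and the permitted range.

```python
import os

DEFAULT_ROWS = 25              # the pager's stated default
MIN_ROWS, MAX_ROWS = 10, 100   # the stated permitted range

def rows_from_env(env=None):
    """Return the screen height taken from ROWS, or the default."""
    env = os.environ if env is None else env
    try:
        rows = int(env.get("ROWS", ""))
    except ValueError:
        return DEFAULT_ROWS    # unset or not a number (assumed behaviour)
    if MIN_ROWS <= rows <= MAX_ROWS:
        return rows
    return DEFAULT_ROWS        # out of range (assumed behaviour)
```

Under DOS the variable would be set before starting the pager, for
example  c:> set ROWS=43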


Line numbering
--------------

Lines are numbered from 1 for the first line.  If the file has a newline
character as the last character, then the pager assumes a further blank
line; on such files the number of the last line will appear one greater
than the normal way of counting lines.  The current line number in the
status line is the number of the line at the top of the screen.
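
The counting rule can be stated as a one-line Python sketch.  The
function name is hypothetical, and the result for a completely empty
file is simply what this sketch gives, not something stated above.

```python
def last_line_number(data: bytes) -> int:
    """Number of the last line under the rule described above."""
    # Each newline ends one line and starts the next, so a file whose
    # last character is a newline gains one final empty line.
    return data.count(b"\n") + 1
```

So a two-line file ending in a newline reports 3 as its last line,
while the same text without the final newline reports 2.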


Searches
--------

Search arguments for the forward and reverse search commands are
simple Unix-type regular expressions. The following values are allowed:

	.			will match any single character.

	+			will match one or more of the
				character preceding the '+', eg
				'a+' will match 'a' or 'aaaa'.

	*			will match zero or more of the
				character preceding the '*', eg
				'a*' will match 'a' or 'aaa' or even
				''.

	^			the argument following must appear as
				the first text on a line, eg '^the' will
				match on any line that starts 'the'.

	$			the argument preceding must appear as
				the final text on a line, eg 'and$' will
				match on any line that ends with 'and'.

	[]			will match any character that appears within
				the brackets. The characters may
				be any mixture of lists and ranges
				constructed with '-', eg [a-z23] will match
				any lower case letter or '2' or '3'. '^'
				as the first character has the special
				meaning of inverting the logic, so [^a-z,]
				will match anything except a lower case
				letter or comma.

	\			allows any of the special characters already
				listed, including '\', to be taken as
				literals to be matched, eg to search for
				a line containing '.' use '\.'.

	anything else		will be matched literally.

Matches will be found only within a screen line (not a file line),
and after tabs have been expanded to spaces.  Sorry - better pattern
matching may be fitted later.
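
The operators listed above have the same meaning in Python's re
module, so patterns can be tried out there before use.  This is only an
illustration: re supports many more operators than the pager does, and
the pager's matcher is its own code.  The names EXAMPLES and matches
are hypothetical.

```python
import re

# (pattern, a line that should match, a line that should not)
EXAMPLES = [
    (r"c.t",     "cut",      "ct"),
    (r"a+",      "banana",   "bcd"),
    (r"^the",    "the end",  "at the end"),
    (r"and$",    "this and", "and then"),
    (r"[a-z23]", "x2",       "456"),
    (r"[^a-z,]", "abc!",     "abc,"),
    (r"\.",      "end.",     "end"),
]

def matches(pattern, line, ignore_case=True):
    """True if pattern is found anywhere in line.

    ignore_case defaults to True, like the pager's ignore-case flag.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, line, flags) is not None
```

Note that 'a*' matches every line, since it is satisfied by zero
occurrences of 'a'.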

If a search succeeds, then the line found will be at the top of the
screen.  If a search fails, the displayed text will not change.  If a
multiple search finds the end of file before the <n>th occurrence of the
search text, it will end with the last match at the top of the screen.
If an illegal regular expression is given, there will be a beep and no
searching.

In multiple searches, the <n>th line containing the search string will
be found, not the <n>th occurrence of the search string.
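
The search rules above can be summarised in a short Python sketch.  It
uses Python's re module as a stand-in for the pager's matcher, and the
function name and arguments are hypothetical.

```python
import re

def search_forward(lines, pattern, top, n=1, ignore_case=True):
    """Return the index of the line to place at the top of the screen.

    lines -- the screen lines of the file (tabs already expanded)
    top   -- index of the line currently at the top of the screen
    Returns top unchanged if nothing matches or the pattern is invalid.
    """
    try:
        rx = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    except re.error:
        return top                 # the pager beeps and does not search
    found = top
    remaining = max(n, 1)
    for i in range(top + 1, len(lines)):   # start on the following line
        if rx.search(lines[i]):    # one hit per line, however many matches
            found = i
            remaining -= 1
            if remaining == 0:
                break
    return found                   # last match seen if the file ran out
```

A line holding the text three times still counts once, so a search
with <n> of 2 moves on to the next matching line.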

 
Marks
-----

Marks are defined by a single letter, case independent.  At the moment,
only the lower five bits of the 'letter' are used, so any character will
define one of 32 marks; for example, '1' and 'q' will be the same mark.
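
The mapping from key to mark can be checked with a small Python sketch
(the function name is hypothetical):

```python
def mark_slot(ch: str) -> int:
    """Map a key to one of 32 mark slots using the low five bits."""
    return ord(ch) & 0x1F
```

'a' and 'A' both give slot 1, which is why marks are case independent,
and '1' (31h) and 'q' (71h) both give slot 17.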


Standard input
--------------

There is currently a bug with standard input, in that whereas the pager
opens all files in binary mode (so that binaries can be searched for
text strings, for example), standard input works in a half-and-half mode
- although CRs are passed through, the stream still terminates on a
control-Z.  So don't pipe in binaries. Also, the pager makes use of the
fact that DOS doesn't have real pipes, and assumes it can seek in pipes.
This will cause some oddities under various PC Unixes.
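
The effect of the control-Z problem on piped data can be pictured with
this Python sketch.  It is a model of the behaviour described above,
not the pager's code, and the function name is hypothetical.

```python
CTRL_Z = b"\x1a"   # the DOS text-mode end-of-file character

def as_seen_from_stdin(data: bytes) -> bytes:
    """Return the part of data that would survive being piped in."""
    # CRs are passed through untouched; everything from the first
    # control-Z onwards is lost.
    return data.split(CTRL_Z, 1)[0]
```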


Colours
-------

To change the screen colours, use the -a<n> switch in the command line. 
<n> is a decimal number, from 0 to 255.  7 is normal, 112 reverse video,
others will result in other colours, and even flashing text. 
Alternatively, the command <n>a within the pager will have the same effect, with
the additional feature that if <n> is zero or absent the colours will
cycle with each command.  
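
To choose a value for <n> without experimenting, it helps to know how
the attribute byte is laid out.  The Python sketch below decodes it
using the standard PC text-mode layout; that layout is general PC
knowledge rather than something taken from the pager's source, and the
names used are hypothetical.

```python
COLOURS = ["black", "blue", "green", "cyan",
           "red", "magenta", "brown", "light grey"]

def describe_attribute(attr):
    """Split a text-mode attribute byte (0..255) into its fields."""
    return {
        "foreground": COLOURS[attr & 0x07],         # bits 0-2
        "bright":     bool(attr & 0x08),            # bit 3
        "background": COLOURS[(attr >> 4) & 0x07],  # bits 4-6
        "blink":      bool(attr & 0x80),            # bit 7, normally blink
    }
```

So 7 is light grey on black, 112 (70h) is black on light grey, and
adding 128 to a value normally gives flashing text.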

If no '-a' switch or 'a' command is given, the pager will take the
colour of the top left-hand character at the time it is started, and use
this for the whole screen.

Screen colours will not change if the BIOS option is selected from the
command line.


Display modes
-------------

The pager always operates in video mode 3, 80 by (default) 25 rows,
colour text.  If the PC is in some other mode at the time the pager is
invoked, it will change mode to mode 3; the original mode will be
restored at exit, but the screen will have been cleared.  If the PC is
in mode 3 already, as it normally would be, then the pager will save
the screen data and attributes (colours) present before it starts, and
restore them on exit.

To achieve its high-speed display update, the pager uses direct access
to the hardware of the PC screen.  By default, it will access display
memory at 0b8000h, which works for practically every IBM-compatible text
screen.  Hercules graphics cards in text mode are an exception, having
memory at 0b0000h; to use one of these add the -m option on the command
line.  The -b command line option forces the pager to use BIOS calls
for screen update, which makes it very slow.  I know how to make it
auto-detect screen type on the machines I've tried, and a bullet-proof
method may be added later.
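
For reference, the position of a screen cell in display memory can be
worked out as in the Python sketch below.  It assumes the standard
80-column text layout of two bytes per cell (character, then
attribute); the names are hypothetical and this is not the pager's own
code.

```python
COLUMNS = 80
COLOUR_BASE = 0xB8000   # default display memory
MONO_BASE = 0xB0000     # Hercules and similar, selected with -m

def cell_address(row, col, mono=False):
    """Address of the character byte for a cell (row, col from 0)."""
    base = MONO_BASE if mono else COLOUR_BASE
    # Two bytes per cell; the attribute byte is at the next address.
    return base + (row * COLUMNS + col) * 2
```

The bottom right-hand cell of a 25-row screen is therefore at 0b8f9eh.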


Buffering technique
-------------------

At startup, the pager grabs memory for various internal buffers,
totalling around 100k bytes.  If it can't get this, it won't start.  If
there isn't that much memory, it's probably not worth using this pager.
The rest of available memory is then allocated to further text buffers
of 32k bytes each.  The text is held in these buffers as required; they
are allocated on a simple least-recently-used system.  The pager will
not pre-fill buffers on the off chance, though it can be forced to fill
them by a move to the end of the file.  If the whole file will fit into
the available buffers, access will be fast to all parts of the text;
otherwise the allocation system will occasionally have to re-use an old
buffer, and there will be a short delay while it is refilled from the
file on disk.
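
The least-recently-used scheme can be sketched in Python as follows.
This is a minimal model under stated assumptions, not the pager's
implementation: the class and attribute names are hypothetical, and
the real buffers hold more bookkeeping than is shown here.

```python
import io
from collections import OrderedDict

BLOCK_SIZE = 32 * 1024   # one text buffer

class BlockCache:
    """Hold at most max_blocks buffers, re-using the oldest when full."""

    def __init__(self, fileobj, max_blocks):
        self.f = fileobj
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()   # block number -> bytes, oldest first
        self.disk_reads = 0

    def get(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)   # now the most recent
            return self.blocks[block_no]
        if len(self.blocks) >= self.max_blocks:
            self.blocks.popitem(last=False)     # drop least recently used
        self.f.seek(block_no * BLOCK_SIZE)      # the short delay is here
        data = self.f.read(BLOCK_SIZE)
        self.disk_reads += 1
        self.blocks[block_no] = data
        return data

# A three-block file read through two buffers.
demo = BlockCache(io.BytesIO(bytes(3 * BLOCK_SIZE)), max_blocks=2)
demo.get(0)
demo.get(1)
demo.get(0)   # block 0 becomes the most recently used
demo.get(2)   # no free buffer, so block 1 is dropped
```

After this sequence the cache holds blocks 0 and 2, and going back to
block 1 would need another read from disk.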

The 's' command will save an index containing part of the buffer
allocation tables, enough to enable the pager when run again on the
same file to find any line already read without having to search from
the beginning of the file.  The index file for file 'foo.xxx' is named
'foo.ind', and will be in the same directory, so files with the same
name but different extensions will map to the same index file, and the
later one will overwrite the earlier. Index files are unique to the file
that they refer to, and the pager will not confuse them.
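
The naming rule for index files can be expressed as a short Python
sketch.  The function name is hypothetical; ntpath is the standard
Python module for DOS-style paths.

```python
import ntpath   # DOS/Windows path rules, usable on any platform

def index_name(path: str) -> str:
    """Return the index file name for the file at path."""
    root, _ext = ntpath.splitext(path)   # drop the existing extension
    return root + ".ind"
```

Both 'foo.txt' and 'foo.doc' give 'foo.ind', which is the collision
described above.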

The command 'r' will reset all of the buffer allocation system. It was
useful when index files could be ambiguous, but it shouldn't be needed
now unless an index file is corrupt.


Type ahead and abort
--------------------

When the pager is performing an operation that might take some time, it
reads the keyboard input looking for a control-C abort.  When it does
this, it will purge any characters typed ahead (up to any control-C), to
ensure that any abort is seen.  On control-C, the pager will reset to
line 1, clear down all of its tables, and reload the current file's
index if it exists.

At present, abort is recognised only when the pager is filling a text
buffer from the file.


Copyright
---------

Apart from parts explicitly labelled in the source code with other
copyright notices, I claim authorship of all of the source code for this
utility.  The source and object code are available freely for
non-commercial use, including modification.  My copyright notice should
appear alongside your own in any modified version.  No warranties of any
kind are provided.  


Author
------

Nigel Bromley - +44 7010 700642   (also at nigel.bromley@ffei.co.uk)


      .
     _|_
    | |_'   1998-08-02
