Multilingual Batch Programs

To many of you, the idea of using a batch program to manage programs that are in the form of scripts for other language interpreters may seem a new and advanced concept. It's older than the PC itself - in fact, this idea was in the forefront during the earliest days of DOS batch programming: we had the batch language, the built-in PC-DOS utilities, BASICA, DEBUG, a few rather limited applications, and little else unless we put out a small fortune for a C or FORTRAN compiler.

In those "Old Days", it was not uncommon to write parts of a program in BASICA and let a batch file manage the setup and cleanup functions before and after it invoked the BASICA interpreter and passed it the script. DEBUG scripts were also common, mostly to generate simple assembly programs that accessed some DOS interrupt or poked at some memory location (for example, to twiddle the bit that tells the system that a print screen is in progress in order to turn the PrintScrn key on and off). In those days, most of us didn't have access to the Internet (what Internet?) and its archives full of utilities - we hand-copied scripts from magazines or wrote them ourselves.

As batch language became somewhat more powerful and utilities more plentiful and easily obtained, the technique pretty much died out except with a few die-hards. I never quite gave it up, but it did become far less important to me as I acquired the means to write and compile utilities in high level languages. These days, it is gaining importance because of the drastic differences among the three major languages called "batch" and the nearly complete brain death of the Win9x batch language. Unfortunately, there is no universally available scripting language to replace BASICA (or GWBASIC in MS-DOS) - Microsoft makes QBasic available for all the platforms, but it is not installed by default in NT. Only DEBUG can reasonably be assumed to be present.

Since you can't know that people who use your programs will have any specific language interpreter, you need to include one with your packages if you distribute your programs, or at least specify one that your users can obtain for free and with reasonable assurance of freedom from viruses. This limits the choices pretty much to DEBUG and freeware languages from major archives. Fortunately, there are several excellent candidates: Perl and Microsoft's Windows Scripting language (but only in Win32 versions), various AWK clones, and SED come to mind. SED is actually a programmable stream editor rather than a language, but for many tasks, that's exactly what is needed. Since AWK can do it all, and is available for nearly every reasonable platform, I will stick mostly with AWK clones and DEBUG here.

Since DEBUG will be available unless someone has intentionally removed it, I see no need to say anything about obtaining it. On the other hand, there are several good AWK clones available: MAWK and GAWK are the most popular. Both can be obtained from a somewhat hidden part of the Simtelnet archive: the Gnuish section. The GNU project is an effort to clone the entire Unix environment as freeware, complete with source code. Much of the modern Linux environment comes from there. The whole Gnu thing is completely mind-boggling in its scope and functionality, and in the completeness and unapproachability of its documentation: it's mostly written to be displayed on a terminal screen or printed on a PostScript printer.

While GAWK is a more complete package than MAWK and comes with over a megabyte of documentation, the MAWK package is smaller and somewhat more approachable. MAWK is a fairly straightforward port of the original AWK language as described in The AWK Book: The AWK Programming Language, by Aho, Kernighan, and Weinberger (the 'A', 'K', and 'W' in "AWK"), Addison-Wesley Publishing Company, ISBN 0-201-07981-X (for my copy). GAWK is closer to the newer versions, with many enhancements. Perl is a descendant of AWK. Needless to say, considering that Brian Kernighan is the 'K' in "K&R C", AWK is very C-like, but is much easier to learn and use. Read the book (or the GAWK documentation - you'll need GhostScript and GhostView to read it) - the languages themselves are beyond the scope of this work dealing with batch file techniques.

An additional point about the documentation mentioned above: most of that stuff is formatted for printing, not for viewing on the screen or in an editor, and specifically for printing on dot matrix or daisy wheel printers. The on-line versions (or local copies you save yourself, though that may not be allowed) are more easily dealt with on the screen, and the latest versions have documentation in Windows help format.

If you want to read the GAWK documentation easily, it's available on-line at <http://www.cco.caltech.edu/cco/texinfo/gawk/gawk_1.html> and <http://www.rtr.com/win95pak/gawk_toc.htm> at the time of writing (8 August 1999), or you can get GhostScript and GhostView (GSview) from Simtel.net in the print directory for the Windows sections or the postscrp directory of the DOS section, and read the documentation on your own system.

Eric Pement pointed out some ways to ease the GAWK documentation problem: use LIST (shareware) or Less (free), which provide a good presentation of the man pages. Still, the Web pages are probably the easiest to use - see the above links. See also The GNU Awk User's Guide (current version of "GAWK: Effective AWK Programming: A User's Guide for GNU Awk") (read "@command{awk}" as "command line awk programs"). You can find all the Gnuish stuff at <ftp://ftp.simtel.net/pub/simtelnet/gnu/gnuish/>. Read the General Public License that applies to the Gnu stuff before you distribute any of the Gnuish programs - there are restrictions designed to make sure that the programs remain forever free.

The specific files you will want some or all of are:
GAWK
gawk306d.zip (documentation in printing format)
gawk306x.zip (the executables)
gawk306h.zip (documentation in Windows help format)
MAWK
mawk122x.zip

Since later versions are very likely to be fully compatible with the older ones, use the latest versions available. The specific versions of MAWK and GAWK used here are those listed above.

Win32 users (Win95, Win98, ME, NT, Win2000, XP) should check the GAWK entry at http://sourceforge.net/project/showfiles.php?group_id=23617 for the GNU-WIN port at SourceForge. Try not to be distracted by all the other goodies on that page, except maybe LESS - you probably want it as a replacement for MORE, but use the version at http://www.greenwoodsoftware.com/less/ because it doesn't require finding and getting a missing DLL.


This essay is divided into several streams. In general, simple text processing does not need anything beyond the basic original AWK functionality, and other functions need either GAWK or assembly. Most text processing examples work with MAWK and are in that section, though all of those require generation of a script file, while GAWK accepts short scripts as command line arguments (MAWK is supposed to, but doesn't do so in at least some operating environments). The MAWK scripts mostly work without modification in GAWK as well, and many are simple enough to be used as command line scripts.

One of the driving forces behind these essays is to provide operating-system-independent solutions for those FAQs (Frequently Asked Questions) that require extremely difficult code to solve in pure batch language, and that often have only solutions specific to one operating system and/or one human language. The ones that lend themselves to solutions with scripts written or managed by batch programs are only a subset of the total and mostly involve Each major section of each essay has (or someday will have) a list of FAQs for which it provides tools. This is the overview list with links to the section that discusses the technique - the specific questions with example answers generally follow the main discussion because answers without explanations can't readily be adapted to even slightly different tasks.

Date and time
DEBUG scripts (keyboard access, disk information, etc.)
User Input
Text processing (Web Pages)
Text processing (files containing lists)
Disk and file utilities


One point that needs clarification: secondary scripts and utilities can read environment variables that were created before they were invoked, but if an executable program is involved, either as the product of the script or as the interpreter that runs it, that program cannot modify the environment of the program that invoked it. There are exceptions: it is possible to write programs that explore memory, find the parent environment, and modify it, but that is a can of worms that will not be opened here - in general, executable programs get a copy of the environment belonging to their parent process. Keep in mind that batch files are scripts for a program already running (the command processor, usually COMMAND.COM or CMD.EXE, but possibly 4DOS or some other program), so they can modify that program's environment. When the batch program invokes another executable, even another instance of the command processor, the new program is passed a copy of the batch program's environment as it is at the instant the secondary program is launched.
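The copy-on-launch behavior is not peculiar to DOS, and it is easy to see from any language that can start a child process. The following is only a sketch in Python, not batch: the function name and the variable name DEMO_VAR are made up for the illustration. The child inherits the variable, overwrites its own copy, and the parent's value is unchanged afterwards.

```python
import os
import subprocess
import sys


def child_sees_and_changes(name, parent_value, child_value):
    """Start a child interpreter that reads NAME, overwrites it, and reports both.

    Returns a tuple: (value the child inherited, value the child set,
    value the parent still holds after the child has exited).
    """
    os.environ[name] = parent_value
    # The child is a second copy of the Python interpreter; it plays the
    # role of the "secondary program" launched by the batch file.
    child_code = (
        "import os, sys\n"
        "name, new = sys.argv[1], sys.argv[2]\n"
        "print(os.environ.get(name, ''))\n"
        "os.environ[name] = new\n"
        "print(os.environ[name])\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code, name, child_value],
        capture_output=True, text=True, check=True,
    )
    inherited, changed = result.stdout.splitlines()
    return inherited, changed, os.environ[name]


if __name__ == "__main__":
    print(child_sees_and_changes("DEMO_VAR", "before", "after"))
```

The third element of the result is the one that matters: whatever the child did to its copy, the parent still sees its own original value.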

Having said that, here's how to modify the environment of the invoking batch program with (that's "with", not "from") a secondary script: have the script write a batch file that the parent batch program then CALLs. Since that batch file runs in the same environment and under the same instance of the command processor as the parent program, as a subroutine or subprogram, it can do anything the parent batch program can do.
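The same hand-off can be sketched outside batch. This Python analogue is an illustration only, under loud assumptions: the file name handoff.txt, the NAME=VALUE line format, and the function name are all invented here, and reading the file back stands in for CALLing a generated batch file. The point is simply that the child never touches the parent's environment; it leaves instructions behind, and the parent carries them out itself.

```python
import os
import subprocess
import sys
import tempfile


def run_child_and_apply(name, value):
    """Let a child process hand a variable back through a generated file."""
    with tempfile.TemporaryDirectory() as workdir:
        handoff = os.path.join(workdir, "handoff.txt")  # hypothetical name
        # The child only writes a NAME=VALUE line to the hand-off file.
        child_code = (
            "import sys\n"
            "path, name, value = sys.argv[1:4]\n"
            "with open(path, 'w') as out:\n"
            "    out.write(name + '=' + value + '\\n')\n"
        )
        subprocess.run(
            [sys.executable, "-c", child_code, handoff, name, value],
            check=True,
        )
        # The parent reads the file and changes its OWN environment,
        # much as a batch program CALLs the file its script wrote.
        with open(handoff) as lines:
            for line in lines:
                key, _, val = line.rstrip("\n").partition("=")
                os.environ[key] = val
    return os.environ[name]


if __name__ == "__main__":
    print(run_child_and_apply("DDATE", "19990808"))
```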

This example causes the environment variable DDATE to take on the value of the current date in yyyymmdd format (suitable for sorting by date) for use in naming files and directories. It's a bit more complex than is usually needed because it deals with all three operating environments. This code assumes the American English order of the fields (easily changed), that the date is the last field on the first line of the DATE command's report, and that it is delimited by something that is not a numerical character or space.

SETEVNMT.AWK
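BEGIN{
    Dflag = 0
}

{
    # Only the first line of the DATE report carries the date
    if( !Dflag ) {
        Dflag = 1
        # The date is the last field on that line
        ThisDate = $NF
        # Turn every non-digit delimiter into a plus sign, then split on it
        gsub( /[^0-9]/, "+", ThisDate )
        Fields = split( ThisDate, Array, "+" )
        # Reorder mm dd yyyy as yyyymmdd
        ThisDate = Array[ Fields ] Array[ Fields - 2 ] Array[ Fields - 1 ]
        print "@set DDATE=" ThisDate
    }
}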
That script doesn't contain any magic characters (except one '^' which is magic in NT), so it can be created on the fly by this batch program which writes a script which writes a batch program:

MLTL0070.BAT
 @echo off
 echo BEGIN{ Dflag = 0 }> }{.awk
 echo {if(!Dflag){Dflag=1;ThisDate=$NF >>}{.awk
 if %OS%!==Windows_NT! echo gsub(/[^^0-9]/," ",ThisDate)>> }{.awk
 if not %OS%!==Windows_NT! echo gsub(/[^0-9]/," ",ThisDate)>> }{.awk
 echo Fields=split(ThisDate,Array," ")>> }{.awk
 echo ThisDate=Array[ Fields ]Array[(Fields-2)]Array[(Fields-1)]>> }{.awk
 echo print"@set DDATE="ThisDate}}>> }{.awk
 echo. | date | awk -f}{.awk > }{.bat
 call }{.bat
 echo %DDATE%
 del }{.*
Note that I have removed all unnecessary spaces and newlines from the script. It could be all one line - or maybe two - but there are line length limits here and in the batch program. Note also that separate lines with IF tests of the OS environment variable are used to provide a line with '^' escaped ("^^") if the OS is NT and in plain form if it is not (the OS variable doesn't exist except in NT, so its value is an empty string in the other environments).
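If you want to check the field handling without a DOS box at hand, the transformation the generated script performs can be mirrored in a few lines of Python. This is a sketch, not part of the batch program: the function name is invented, and the two sample report lines are assumed examples of what DATE prints in the Win9x and NT styles, not captured output. Like the batch-generated script, it ignores empty pieces when splitting.

```python
import re


def ddate_from_date_output(report):
    """Mirror the AWK logic on the text of a DATE report.

    Takes the last whitespace-delimited field of the first line, splits it
    on anything that is not a digit, and reorders mm dd yyyy as yyyymmdd.
    """
    first_line = report.splitlines()[0]
    last_field = first_line.split()[-1]
    # Drop empty pieces, as AWK's split on a single space does.
    parts = [piece for piece in re.split(r"[^0-9]", last_field) if piece]
    month, day, year = parts[-3], parts[-2], parts[-1]
    return year + month + day


if __name__ == "__main__":
    # Assumed sample reports; real wording varies by system and language.
    print(ddate_from_date_output("Current date is Sun 08-08-1999\nEnter new date (mm-dd-yy):"))
    print(ddate_from_date_output("The current date is: Sun 08/08/1999\nEnter the new date: (mm-dd-yy)"))
```

Both sample lines come out as 19990808, which is the value the batch program leaves in DDATE for that date.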

In the AWK scripts, I sometimes use AWK as the command to process the script - this means that it should work the same under both MAWK and GAWK - and sometimes I use GAWK - this means that the program is GAWK specific. You may find an occasional leftover reference to MAWK - read that as if it were AWK. Everything here should be considered as work in progress and as pieces get moved around and rewritten, not everything gets updated to properly fit the new context - sorry about that, but that's the way it is with this free help stuff: time is in extremely short supply (mostly this stuff gets written on rainy (or winter) weekends). If you notice something that's so far out of context that it really has to be fixed, please let me know.

In addition to the above material written especially for these essays, I have some Real World Batch Programs that are batch-based programs I actually use in my work.


********* more later ********




This stuff has been only partially tested at the time of its initial release, but it is known that the version of MAWK used here does work in Real DOS, Win9x, and NT4. The complete programs have not all been tested under Real DOS.




  ** Copyright 1995, 1996, 1997, 1998, 1999, 2000, 2001 Ted Davis - see License, included by reference. ** 

Input and feedback from readers are welcome. NOTE: the subject of the message must contain the word "batch" for the message to get past the spam filter.
