BASIC interpreter

A BASIC interpreter is an interpreter that enables users to enter and run programs in the BASIC language and was, for the first part of the microcomputer era, the default application that computers would launch. Users were expected to use the BASIC interpreter to type in programs or to load programs from storage (initially cassette tapes, then floppy disks).

BASIC interpreters are of historical importance. Microsoft's first product for sale was a BASIC interpreter (Altair BASIC), which paved the way for the company's success. Before Altair BASIC, microcomputers were sold as kits that needed to be programmed in machine code (for instance, the Apple I). During the Altair period, BASIC interpreters were sold separately, becoming the first software sold to individuals rather than to organizations; Apple BASIC was Apple's first software product. After the MITS Altair 8800, microcomputers were expected to ship bundled with BASIC interpreters of their own (e.g., the Apple II, which had multiple implementations of BASIC). A backlash against the price of Microsoft's Altair BASIC also led to early collaborative software development, for Tiny BASIC implementations in general and Palo Alto Tiny BASIC specifically.

BASIC interpreters fell from use as computers grew in power and their associated programs grew too long for typing them in to be a reasonable distribution format. Software increasingly came pre-compiled and was distributed on floppy disk or via bulletin board systems, making source listings less important. Additionally, increasingly sophisticated command shells like MS-DOS and the Mac GUI became the primary user interface, and the need for BASIC to act as the shell disappeared. The use of BASIC interpreters as the primary language and interface to systems had largely disappeared by the mid-1980s.
History

BASIC helped jumpstart the time-sharing era, became mainstream in the microcomputer era, then faded to become just another application in the DOS and GUI era, and today survives in a few niches related to game development, retrocomputing, and teaching.

Time-sharing era

First implemented as a compile-and-go system rather than an interpreter, BASIC emerged as part of a wider movement towards time-sharing systems. General Electric, having worked on the Dartmouth Time-Sharing System and its associated Dartmouth BASIC, wrote their own underlying operating system and launched an online time-sharing system known as Mark I featuring a BASIC compiler (not an interpreter) as one of its primary selling points. Other companies in the emerging field quickly followed suit. By the early 1970s, BASIC was largely universal on general-purpose mainframe computers.[1]

BASIC, as a streamlined language designed with integrated line editing in mind, was naturally suited to porting to the minicomputer market, which was emerging at the same time as the time-sharing services. These machines had very small main memory, perhaps as little as 4 KB in modern terminology, and lacked the high-performance storage, such as hard drives, that makes compilers practical. In contrast, an interpreter would take fewer computing resources, at the expense of performance. In 1968, Hewlett Packard introduced the HP 2000, a system that was based around its HP Time-Shared BASIC interpreter.[2] In 1969, Dan Paymar and Ira Baxter wrote another early BASIC interpreter for the Data General Nova.[3]

One holdout was Digital Equipment Corporation (DEC), the leading minicomputer vendor. They had released a new language known as FOCAL, based on the earlier JOSS developed on a DEC machine at the Stanford Research Institute in the early 1960s.
JOSS was similar to BASIC in many respects, and FOCAL was a version designed to run in very small memory systems, notably the PDP-8, which often shipped with 4 KB of main memory. By the late 1960s, DEC salesmen, especially in the educational sales department, found that their potential customers were not interested in FOCAL and were looking elsewhere for their systems. This prompted David H. Ahl to hire a programmer to produce a BASIC for the PDP-8 and other DEC machines. Within the year, all interest in alternatives like JOSS and FOCAL had disappeared.[4]

Microcomputer era

The introduction of the first microcomputers in the mid-1970s continued the explosive growth of BASIC, which had the advantage that it was fairly well known to the young designers and computer hobbyists who took an interest in microcomputers, many of whom had seen BASIC on minis or mainframes. BASIC was one of the few languages that was both high-level enough to be usable by those without training and small enough to fit into the microcomputers of the day.

In 1972, HP introduced the HP 9830A programmable desktop calculator with a BASIC Plus interpreter in read-only memory (ROM).[5] In June 1974, Alfred Weaver, Michael Tindall, and Ronald Danielson of the University of Illinois at Urbana-Champaign proved it was possible to produce "A BASIC Language Interpreter for the Intel 8008 Microprocessor," in their paper of the same name, though their application was deployed to an 8008 simulator for the IBM 360/75 and required 16 KB.[6]

In January 1975, the Altair 8800 was announced and sparked the microcomputer revolution. One of the first microcomputer versions of BASIC was co-written by Gates, Allen, and Monte Davidoff for their newly formed company, Micro-Soft. This was released by MITS in punch tape format for the Altair 8800 shortly after the machine itself,[7] showcasing BASIC as the primary language for early microcomputers.
In March 1975, Steve Wozniak attended the first meeting of the Homebrew Computer Club and began formulating the design of his own computer. Club members were excited by Altair BASIC.[8] Wozniak concluded that his machine would have to have a BASIC of its own. At the time he was working at Hewlett Packard and used their TS-BASIC minicomputer dialect as the basis for his own version. Integer BASIC was released on cassette for the Apple I, and was supplied in ROM when the Apple II shipped in the summer of 1977.[9]

Other members of the Homebrew Computer Club began circulating copies of Altair BASIC on paper tape, causing Gates to write his Open Letter to Hobbyists, complaining about this early example of software piracy. Partially in response to Gates's letter, and partially to make an even smaller BASIC that would run usefully on 4 KB machines,[a] Bob Albrecht urged Dennis Allison to write their own variation of the language. Allison covered how to design and implement a stripped-down version of an interpreter for the BASIC language in articles in the first three quarterly issues of the People's Computer Company newsletter published in 1975, and implementations with source code were published in Dr. Dobb's Journal of Tiny BASIC Calisthenics & Orthodontia: Running Light Without Overbyte. This led to a wide variety of Tiny BASICs with added features or other improvements, with well-known versions from Tom Pittman and Li-Chen Wang, both members of the Homebrew Computer Club.[10] Tiny BASIC was published openly and Wang coined the term "copyleft" to encourage others to copy his source code. Hobbyists and professionals created their own implementations, making Tiny BASIC an example of a free software project that existed before the free software movement.

Many firms developed BASIC interpreters.
In 1976, SCELBI introduced SCELBAL for the 8008[11] and the University of Idaho and Lawrence Livermore Laboratory announced that they would be publishing to the public domain LLL BASIC, which included floating-point support.[12] In 1977, the Apple II and TRS-80 Model I each had two versions of BASIC, a smaller version introduced with the initial releases of the machines and a licensed Microsoft version introduced later as interest in the platforms increased.

Microsoft ported its interpreter to the MOS 6502, which quickly became one of the most popular microprocessors of the 8-bit era. When new microcomputers began to appear, such as the Commodore PET, their manufacturers licensed a Microsoft BASIC, customized to the hardware capabilities. By 1978, MS BASIC was a de facto standard and practically every home computer of the 1980s included it in ROM. In 1980, as part of a larger licensing deal that included other languages and PC DOS, IBM rejected an overture from Atari and instead licensed MS-BASIC over its own implementation, eventually releasing four versions of IBM BASIC, each much larger than prior interpreters (for instance, Cartridge BASIC took 40 KB).[13] Don Estridge, leader of the IBM PC team, said, "IBM has an excellent BASIC--it's well received, runs fast on mainframe computers, and it's a lot more functional than micro-computer BASICs... But [its] number of users were infinitesimal compared to the number of Microsoft BASIC users. Microsoft BASIC had hundreds of thousands of users around the world. How are you going to argue with that?"[14] (See Microsoft BASIC for the subsequent history of these different implementations.)

Many vendors did "argue with that" and used other firms or wrote their own interpreters.
In September 1978, Shepardson Microsystems was finishing Cromemco 16K Structured BASIC for the Z80-based Cromemco S-100 bus machines.[15][16] Paul Laughton and Kathleen O'Brien then created Atari BASIC[17] as essentially a pared-down version of Cromemco BASIC ported to the 6502.[18] In 1979, Warren Robinett developed the BASIC Programming cartridge for Atari, Inc., even though it only supported programs with 9 lines of code (64 characters in total). Also in 1979, Texas Instruments released TI BASIC with its TI-99/4, which would sell nearly 3 million systems when revamped as the TI-99/4A. Sinclair BASIC was developed for the ZX80 by John Grant and Steve Vickers of Nine Tiles. In 1980, Sophie Wilson of Acorn Computers developed Atom BASIC, which she later evolved into BBC BASIC, one of the first interpreters to offer structured BASIC programming, with named procedures and functions.

In 1978, David Lien published the first edition of The BASIC Handbook: An Encyclopedia of the BASIC Computer Language, documenting keywords across over 78 different computers. By 1981, the second edition documented keywords from over 250 different computers, showcasing the explosive growth of the microcomputer era.[23]

Interpreters as applications

With the rise of disk operating systems and later graphical user interfaces, BASIC interpreters became just one application among many, rather than providing the first prompt a user might see when turning on a computer.

In 1983, the TRS-80 Model 100 portable computer debuted, with its Microsoft BASIC implementation noteworthy for two reasons. First, programs were edited using the simple text editor, TEXT, rather than typed in line by line (but line numbers were still required).[24] Second, this was the last Microsoft product that Bill Gates developed personally.[25][26]

Also in 1983, Microsoft began bundling GW-BASIC with DOS.
Functionally identical to IBM BASICA, its BASIC interpreter was a fully self-contained executable and did not need the Cassette BASIC ROM found in the original IBM PC. According to Mark Jones Lorenzo, given the scope of the language, "GW-BASIC is arguably the ne plus ultra of Microsoft's family of line-numbered BASICs stretching back to the Altair--and perhaps even of line-numbered BASIC in general."[27] With the release of MS-DOS 5.0, GW-BASIC's place was taken by QBasic.

MacBASIC featured a fully interactive development environment for the original Macintosh computer and was developed by Donn Denman,[28] Marianne Hsiung, Larry Kenyon, and Bryan Stearns.[29] MacBASIC was released as beta software in 1985 and was adopted for use in places such as the Dartmouth College computer science department, for use in an introductory programming course. It was doomed to be the second Apple-developed BASIC killed in favor of a Microsoft BASIC. In November 1985, Apple abruptly ended the project as part of a deal with Microsoft to extend the license for BASIC on the Apple II.[30][31]

BASIC interpreters were not just an American/British development. In 1984, Hudson Soft released Family BASIC in the Japanese market for Nintendo's Family Computer video game console, an integer-only implementation designed for game programming, based on Hudson Soft BASIC for the Sharp MZ80 (with English keywords).[32] Turbo-Basic XL is a compatible superset of Atari BASIC, developed by Frank Ostrowski and published in the December 1985 issue of German computer magazine Happy Computer, making it one of the last interpreters published as a type-in program. The language included a compiler in addition to the interpreter and featured structured programming commands. Several modified versions working with different DOS systems were released by other authors.
In France, François Lionet and Constantin Sotiropoulos developed two BASIC interpreters with a focus on multimedia: STOS BASIC for the Atari ST, in 1988,[33] and AMOS BASIC for the Amiga, in 1990.

In May 1991, Microsoft released Visual Basic, a third-generation event-driven programming language known for its Component Object Model (COM) programming model.[34] Visual Basic supported the rapid application development (RAD) of graphical user interface (GUI) applications, access to databases using Data Access Objects, Remote Data Objects, or ActiveX Data Objects, and creation of ActiveX controls and objects. Visual Basic was used to develop proprietary in-house applications as well as published applications.

Niche BASICs

In 1993, Microsoft released Visual Basic for Applications, a scripting language for Microsoft Office applications, which superseded and expanded on the abilities of earlier application-specific macro programming languages such as Word's WordBASIC (which had been introduced in 1989).

In 1996, Microsoft released VBScript as an alternative to JavaScript for adding interactive client-side functionality to web pages viewed with Internet Explorer.[35]

In 1999, Benoît Minisini released Gambas as an alternative for Visual Basic developers who had decided to migrate to Linux.[36]

In 2000, Lee Bamber and Richard Vanner released DarkBASIC, a game creation system for Microsoft Windows, with accompanying IDE and development tools.[37]

In 2001, SmallBASIC was released for the Palm PDA.[38] Another BASIC interpreter for Palm was HotPaw BASIC, an offshoot of Chipmunk Basic.
In 2002, Emmanuel Chailloux, Pascal Manoury and Bruno Pagano published a Tiny BASIC as an example of developing applications with Objective Caml.[39]

In 2011, Microsoft released Small Basic (distinct from SmallBASIC), together with a teaching curriculum[40] and an introductory guide,[41] designed to help students who have learnt visual programming languages such as Scratch learn text-based programming.[42] The associated IDE provides a simplified programming environment with functionality such as syntax highlighting, intelligent code completion, and in-editor documentation access.[43] The language has only 14 keywords.[44] In 2019, Microsoft announced Small Basic Online (SBO), allowing students to run programs from a web browser.[45][46]

In 2014, Robin H. Edwards released Arduino BASIC for the Arduino, now a widely forked implementation.[47] Another implementation using the same name was adapted from Palo Alto Tiny BASIC in 1984 by Gordon Brandly for his 68000 Tiny BASIC, later ported to C by Mike Field.[48]

Many BASIC interpreters are now available for smartphones and tablets via the Apple App Store, or Google Play store for Android.

Today, coding BASIC interpreters has become part of the retrocomputing hobby. Higher-level programming languages on systems with extensive RAM have simplified implementing BASIC interpreters. For instance, line management is simple if the implementation language supports sparse matrices, variable management is simple with associative arrays, and program execution is easy with eval functions. As examples, see the open-source project Vintage BASIC, written in Haskell,[49] or the OCaml Tiny BASIC.
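The point about modern languages can be illustrated with a minimal sketch in Python. It is not taken from Vintage BASIC or from any historical interpreter: the function name `run`, the toy statement subset (LET, PRINT, GOTO, IF ... THEN GOTO, END), and the use of Python's own expression syntax for arithmetic and comparisons are all assumptions made for illustration.

```python
# Toy sketch only: a dict serves as the sparse line store, another dict as the
# variable table, and Python's eval() evaluates expressions. The statement
# subset and Python-style comparisons (e.g. "<=") are illustrative assumptions.

def run(program):
    """Run a toy program given as {line_number: statement_text}."""
    variables = {}              # associative array: variable name -> value
    output = []
    order = sorted(program)     # line numbers may have been entered in any order
    pc = 0                      # index into the sorted line numbers
    while pc < len(order):
        keyword, _, rest = program[order[pc]].strip().partition(" ")
        pc += 1
        if keyword == "LET":
            name, _, expr = rest.partition("=")
            variables[name.strip()] = eval(expr.strip(), {"__builtins__": {}}, variables)
        elif keyword == "PRINT":
            output.append(str(eval(rest, {"__builtins__": {}}, variables)))
        elif keyword == "GOTO":
            pc = order.index(int(rest))
        elif keyword == "IF":
            condition, _, target = rest.partition(" THEN GOTO ")
            if eval(condition, {"__builtins__": {}}, variables):
                pc = order.index(int(target))
        elif keyword == "END":
            break
        else:
            raise SyntaxError("unknown statement: " + keyword)
    return output


program = {}
program[30] = "LET I = I + 1"   # lines can be stored out of order
program[10] = "LET I = 1"
program[20] = "PRINT I * I"
program[40] = "IF I <= 3 THEN GOTO 20"
program[50] = "END"
print(run(program))             # ['1', '4', '9']
```

Deleting a line is `del program[30]` and replacing one is a plain assignment, which is the work that early interpreters had to do by shifting bytes in memory. Calling `eval` on untrusted text is unsafe even with builtins removed, so this shortcut suits a hobby project rather than anything exposed to other users.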
Sales and distribution

Initially, interpreters were either bundled with computer hardware or developed as a custom service, before an industry producing independently packaged software for organizations came about in the late 1960s.[50] BASIC interpreters were first sold separately from microcomputers, then built-in, before becoming sold as applications again in the DOS era.
As the market shifted to ROMs, ROM size came to dominate decisions about how large a BASIC interpreter could be. Because RAM was sold as 4 KB chips, Altair BASIC was initially packaged in separate editions for 4K, 8K, and 12K; this carried over to ROM chips, as manufacturers would decide how many ROM chips they could fit in their design, given price goals and other constraints.

Compilers vs. interpreters
The first implementation of BASIC, Dartmouth BASIC, was a compiler. Generally, compilers examine the entire program in a multi-step process and produce a second file that is directly executable in the host computer's underlying machine language without reference to the source code. This code is often made up of calls to pre-written routines in the language's runtime system. The executable will normally be smaller than the source code that created it. The main disadvantage of compilers, at least in the historical context, is that they require large amounts of temporary memory. As the compiler works, it is producing an ever-growing output file that is being held in memory along with the original source code. Additional memory for temporary lookups, notably line numbers in the case of BASIC, adds to the memory requirement. Computers of the era had very small amounts of memory; in modern terms a typical mainframe might have on the order of 64 KB. On a timesharing system, the case for most 1960s BASICs, that memory was shared among many users. In order to make a compiler work, the systems had to have some form of high-performance secondary storage, typically a hard drive. Program editing took place in a dedicated environment that wrote the user's source code to a temporary file. When the user ran the program, the editor exited and ran the compiler, which read that file and produced the executable code, and then finally the compiler would exit and run the resulting program. Splitting the task up in this fashion reduced the amount of memory needed by any one of the parts of the overall BASIC system; at any given time, only the editor, compiler, or runtime had to be loaded, the rest was on storage. While mainframes had small amounts of memory, minicomputers had even smaller amounts: 4 and 8 KB systems were typical in the 1960s. 
But far more importantly, minicomputers tended to lack any form of high-performance storage; most early designs used punch tape as a primary storage system, and magnetic tape systems were for the high end of the market. In this environment, a system that wrote out the source, compiled it, and then ran the result would have taken minutes. Because of these constraints, interpreters proliferated.

Interpreters ultimately perform the same basic tasks as compilers, reading the source code and converting that into executable instructions calling runtime functions. The primary difference is when they perform the various tasks. In the case of a compiler, the entire source code is converted during what appears to the user as a single operation, whereas an interpreter converts and runs the source one statement at a time. The resulting machine code is executed, rather than output, and then that code is discarded and the process repeats with the next statement. This dispenses with the need for some form of secondary storage while an executable is being built. The primary disadvantage is that the different parts of the overall process can no longer be split apart: the code needed to convert the source into machine operations has to be loaded into memory along with the runtime needed to perform it, and in most cases, the source code editor as well.

Producing a language with all of these components that can fit into a small amount of memory and still has room for the user's source code is a major challenge, but it eliminates the need for secondary storage and was the only practical solution for early minicomputers and most of the history of the home computer revolution.

Development

Language design

Language design for the first interpreters often simply involved referencing other implementations. For instance, Wozniak's references for BASIC were an HP BASIC manual and a copy of 101 BASIC Computer Games.
Based on these sources, Wozniak began sketching out a syntax chart for the language.[51] He did not know that HP's BASIC was very different from the DEC BASIC variety used in 101 Games. The two languages differed principally in terms of string handling and control structures.[52] Data General Business Basic, an integer-only implementation, was the inspiration for Atari BASIC.[53]

In contrast, Dennis Allison, a member of the Computer Science faculty at Stanford University, wrote a specification for a simple version of the language.[54] Allison was urged to create the standard by Bob Albrecht of the Homebrew Computer Club, who had seen BASIC on minicomputers and felt it would be the perfect match for new machines like the Altair. Allison's proposed design only used integer arithmetic and did not support arrays or string manipulation. The goal was for the program to fit in 2 to 3 kilobytes of memory. The overall design for Tiny BASIC was published in the September 1975 issue of the People's Computer Company (PCC) newsletter.

The grammar is listed below in Backus–Naur form.[55] In the listing, an asterisk ("*") denotes zero or more of the object to its left, except for the first asterisk in the definition of "term", which is the multiplication operator; parentheses group objects; and an epsilon ("ε") signifies the empty set.

line ::= number statement CR | statement CR
statement ::= PRINT expr-list
              IF expression relop expression THEN statement
              GOTO expression
              INPUT var-list
              LET var = expression
              GOSUB expression
              RETURN
              CLEAR
              LIST
              RUN
              END
expr-list ::= (string|expression) (, (string|expression) )*
var-list ::= var (, var)*
expression ::= (+|-|ε) term ((+|-) term)*
term ::= factor ((*|/) factor)*
factor ::= var | number | (expression)
var ::= A | B | C ... | Y | Z
number ::= digit digit*
digit ::= 0 | 1 | 2 | 3 | ... | 8 | 9
relop ::= < (>|=|ε) | > (<|=|ε) | =
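The expression, term, and factor rules in the grammar map directly onto a recursive-descent evaluator, with one method per rule. The following Python sketch is an illustration under stated assumptions rather than a reconstruction of any published Tiny BASIC: the names `tokenize`, `Parser`, `trunc_div`, and `evaluate` are hypothetical, unset variables are assumed to read as 0, and division is assumed to truncate toward zero.

```python
import re

# One token per match: a number, a single-letter variable, or an operator.
TOKEN = re.compile(r"\s*(\d+|[A-Z]|[-+*/()])")

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def trunc_div(a, b):
    # Integer division truncating toward zero (an assumption of this sketch).
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient

class Parser:
    def __init__(self, tokens, variables):
        self.tokens, self.pos, self.variables = tokens, 0, variables

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        # expression ::= (+|-|ε) term ((+|-) term)*
        sign = 1
        if self.peek() in ("+", "-"):
            sign = -1 if self.take() == "-" else 1
        value = sign * self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term ::= factor ((*|/) factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value = trunc_div(value, self.factor())
        return value

    def factor(self):
        # factor ::= var | number | (expression)
        token = self.take()
        if token is None:
            raise SyntaxError("unexpected end of expression")
        if token.isdigit():
            return int(token)
        if token.isalpha():
            return self.variables.get(token, 0)
        if token == "(":
            value = self.expression()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError("unexpected token " + token)

def evaluate(text, variables=None):
    parser = Parser(tokenize(text), variables or {})
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token " + parser.peek())
    return value

print(evaluate("-2 + 3 * (A - 1)", {"A": 5}))   # 10
```

Operator precedence falls out of the grammar's layering: `expression` only ever sees whole terms, so multiplication and division bind more tightly than addition and subtraction without any precedence table.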
This syntax, as simple as it was, added one innovation: GOTO and GOSUB could take an expression rather than just a line number.

Sinclair BASIC used as its language definition the 1978 American National Standards Institute (ANSI) Minimal BASIC standard, but was itself an incomplete implementation with integer arithmetic only.[57] The ANSI standard was published after the design of the first generation of interpreters for microcomputers.

Architecture

Common components of a BASIC interpreter:[58]
Coding

Early microcomputers lacked development tools, and programmers either developed their code on minicomputers or by hand. For instance, Dick Whipple and John Arnold wrote Tiny BASIC Extended directly in machine code, using octal.[59] Robert Uiterwyk handwrote MICRO BASIC for the SWTPC (a 6800 system) on a legal pad.[60] Steve Wozniak wrote the code to Integer BASIC by hand, translating the assembler code instructions into their machine code equivalents and then uploading the result to his computer.[61] (Because of this, the program was very hard to change, and Wozniak was not able to modify it quickly enough for Steve Jobs, who subsequently licensed BASIC from Microsoft.[62])

Gates and Allen did not have an Altair system on which to develop and test their interpreter. However, Allen had written an Intel 8008 emulator for their previous venture, Traf-O-Data, that ran on a PDP-10 time-sharing computer. Allen adapted this emulator based on the Altair programmer guide, and they developed and tested the interpreter on Harvard's PDP-10.[63] When Harvard stopped their use of this system, Gates and Allen bought computer time from a timesharing service in Boston to complete their BASIC program debugging. Gates claimed, in his Open Letter to Hobbyists in 1976, that the value of the computer time for the first year of software development was $40,000.[64]

Not that Allen couldn't hand-code in machine language. While on final approach into the Albuquerque airport on a trip to demonstrate the interpreter, Allen realized he had forgotten to write a bootstrap program to read the tape into memory. Writing in 8080 machine language, Allen finished the program before the plane landed. Only when he loaded the program onto an Altair and saw a prompt asking for the system's memory size did he know that the interpreter worked on the Altair hardware.[65][66]

One of the most popular of the many versions of Tiny BASIC was Palo Alto Tiny BASIC, or PATB for short.
PATB first appeared in the May 1976 edition of Dr. Dobb's, written in a custom assembler language with non-standard mnemonics. Li-Chen Wang had coded his interpreter on a time-share system with a generic assembler.

One exception to the use of assembly was the use of ALGOL 60 for the Paisley XBASIC interpreter for Burroughs large systems.[67] Another exception, and type-in program, was Classic BASIC, written by Lennart Benschop in Forth and published in the Dutch Forth magazine Vijgeblad (issue #42, 1993).[68]

The source code of interpreters was often open source (as with Tiny BASIC) or published later by the authors. The complete annotated source code and design specifications of Atari BASIC were published as The Atari BASIC Source Book in 1983.[69]

Virtual machines

Some BASIC interpreters were coded in the intermediate representation of a virtual machine to add a layer of abstraction and conciseness above native machine language.
While virtual machines had been used in compile and go systems such as BASIC-PLUS, these were only for executing BASIC code, not parsing it.[70] Tiny BASIC, in contrast, was designed to be implemented as a virtual machine that parsed and executed (interpreted) BASIC statements; in such an implementation, the Tiny BASIC interpreter is itself run on a virtual machine interpreter.[71] The length of the whole interpreter program was only 120 virtual machine operations, consisting of 32 commands.[72] Thus the choice of a virtual machine approach economized on memory space and implementation effort, although the BASIC programs run thereon were executed somewhat slowly. (See Tiny BASIC: Implementation in a virtual machine for an excerpt and sample commands.) While the design intent was for Tiny BASIC to use a virtual machine, not every implementation did so; those that did included Tiny BASIC Extended, 6800 Tiny BASIC,[73] and NIBL. For its TI-99/4 and TI-99/4A computers, Texas Instruments designed a virtual machine with a language called GPL, for "Graphic Programming Language".[74] (Although widely blamed for the slow performance of TI-BASIC, part of the problem was that the virtual machine was stored in graphics ROM, which had a slow 8-bit interface.)[75] A misunderstanding of the Apple II ROMs led some to believe that Integer BASIC used a virtual machine, a custom assembler language contained in the Apple ROMs and known as SWEET16. 
SWEET16 is based on bytecodes that run within a simple 16-bit virtual machine, so memory could be addressed via indirect 16-bit pointers and 16-bit math functions calculated without the need to translate those to the underlying multi-instruction 8-bit 6502 code.[76] However, SWEET16 was not used by the core BASIC code, although it was later used to implement several utilities, such as a line renumbering routine.[77]

Program editing and storage

Program editing

Most BASIC implementations of the era acted as both the language interpreter and the line editor. When BASIC was running, a command prompt was displayed where the user could enter statements. Statements that were entered with leading numbers were entered into the program storage for "deferred execution",[79] either as new lines or replacing any that might have had the same number previously.[80] Statements that were entered without a line number were referred to as commands, and ran immediately. Line numbers without statements (i.e., followed by a carriage return) deleted a previously stored line.

Different implementations offered other program-editing capabilities.

Tokenizing and encoding lines

To save RAM, and speed execution, all BASIC interpreters would encode some ASCII characters of lines into other representations. For instance, line numbers were converted into integers stored as bytes or words, and keywords might be assigned single-byte tokens.
Abbreviations

As an alternative to tokenization, to save RAM, early Tiny BASIC implementations like Extended Tiny BASIC,[82] Denver Tiny BASIC[83] and MINOL[84] truncated keywords. In contrast, Palo Alto Tiny BASIC accepted traditional keywords but allowed any keyword to be abbreviated to its minimal unique string, with a trailing period.

To expand an abbreviation, the Atari BASIC tokenizer searches through its list of reserved words to find the first that matches the portion supplied. More commonly used commands occur first in the list of reserved words.

Tokenization

Most BASIC interpreters perform at least some conversion from the original text form into various platform-specific formats. Tiny BASIC was on the simple end: it only converted the line number from its decimal format into binary. For instance, the line number "100" became a single byte value, $64, making it smaller to store in memory as well as easier to look up in machine code (a few designs of Tiny BASIC permitted line numbers from only 1 to 254 or 255, although most used double byte values and line numbers of at least 1 to 999). The rest of the line was left in its original text format.[86] In fact, Dennis Allison argued that, given memory constraints, tokenization would take more code to implement than it would save.[87]

MS-BASICs went slightly further, converting the line number into a two-byte value and also converting keywords into single-byte tokens. For instance, the line

10 FOR I=1 TO 10

would be tokenized as:

$64$81 I$B211$A410

In contrast, Integer BASIC would convert the line entirely into tokens.

Carrying this even further, Atari BASIC's tokenizer parses the entire line when it is entered or modified. Numeric constants are parsed into their 48-bit internal form and then placed in the line in that format, while strings are left in their original format, but prefixed with a byte describing their length.
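The MS-BASIC style of keyword tokenization described above can be sketched in a few lines of Python. This is a hedged illustration, not Microsoft's code: the function names `tokenize_line` and `detokenize` are hypothetical, the token table holds only the three values that appear in the example above ($81, $A4, and $B2, read here as FOR, TO, and "="), and the little-endian byte order of the two-byte line number is an assumption.

```python
# Token values taken from the example in the text; a real interpreter's table
# is far larger. All tokens are >= $80, so they cannot collide with ASCII text.
TOKENS = {"FOR": 0x81, "TO": 0xA4, "=": 0xB2}

def match_keyword(body, index):
    """Return (word, token) if a keyword starts at body[index], else None."""
    for word, token in TOKENS.items():
        if body.startswith(word, index):
            return word, token
    return None

def tokenize_line(text):
    """Encode '10 FOR I=1 TO 10' as a two-byte line number plus tokenized body."""
    number_text, _, body = text.strip().partition(" ")
    out = bytearray(int(number_text).to_bytes(2, "little"))  # byte order assumed
    index, in_string = 0, False
    while index < len(body):
        char = body[index]
        if char == '"':
            in_string = not in_string        # leave string literals untouched
        hit = None if in_string else match_keyword(body, index)
        if hit:
            word, token = hit
            out.append(token)                # one byte replaces the whole keyword
            index += len(word)
        else:
            out.append(ord(char))            # everything else is copied as-is
            index += 1
    return bytes(out)

def detokenize(data):
    """Rebuild the listing text, as a LIST command would."""
    names = {token: word for word, token in TOKENS.items()}
    line_number = int.from_bytes(data[:2], "little")
    body = "".join(names.get(byte, chr(byte)) for byte in data[2:])
    return "%d %s" % (line_number, body)

encoded = tokenize_line("10 FOR I=1 TO 10")
print(encoded.hex())        # 0a00812049b23120a4203130
print(detokenize(encoded))  # 10 FOR I=1 TO 10
```

The sixteen characters of source text shrink to twelve bytes, and spaces, variable names, and numeric constants pass through unchanged. Because the matcher looks for keywords at every position outside a string, a name such as TOTAL would also have its first two letters tokenized; this sketch makes no attempt to prevent that.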
Variables have storage set aside as they are encountered, instead of at runtime, and their name is replaced with a pointer to their storage location in memory. Shepardson referred to this early-tokenizing concept as a "pre-compiling interpreter"; statements with syntax errors could not actually be stored, and the user was immediately prompted to correct them.[92]

Tokenization at the keyboard

Some interpreters, such as the Sinclair systems, basically had the user do the tokenization by providing special keystrokes to enter reserved words. The most common commands need one keystroke only; for example, pressing only P at the start of a line on a Spectrum produces a full command. Many "pocket computers" similarly use one keystroke (sometimes preceded by various kinds of shift keys) to produce one byte (the keyword token) that represents an entire BASIC keyword, such as EXP, SQR, IF, or PEEK, as in the Sharp pocket computer character sets and TI-BASIC. The BASIC expansion for the Bally Astrocade uses this as well.

Line management
Valid line numbers varied from implementation to implementation, but were typically from 1 to 32767. Most of the memory used by BASIC interpreters was to store the program listing itself. Numbered statements were stored in sequential order in a sparse array implemented as a linear collection (technically not a list as no line number could occur more than once). Many Tiny BASIC implementations stored lines as follows:
Microsoft BASIC, starting with Altair BASIC, stored lines as follows:[94]
LLL BASIC:[95]
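As a rough illustration of the trade-off between these layouts, the following Python sketch stores the same program in two ways: a Tiny BASIC-style layout (binary line number, text, carriage return) and a layout with a next-line pointer in the style of Altair BASIC. The exact byte layouts, the two-byte little-endian fields, the NUL terminator, and all function names are assumptions made for the example; they are not taken from any particular interpreter's source.

```python
import struct

CR = 0x0D  # carriage return used as the line terminator in the first layout


def tiny_store(lines):
    """Pack (number, text) pairs as: 2-byte line number, ASCII text, CR."""
    out = bytearray()
    for number, text in sorted(lines):
        out += struct.pack("<H", number) + text.encode("ascii") + bytes([CR])
    return bytes(out)


def tiny_find(memory, target):
    """Locate a line by stepping over every byte until each CR is reached."""
    pos, steps = 0, 0
    while pos < len(memory):
        (number,) = struct.unpack_from("<H", memory, pos)
        if number == target:
            return pos, steps
        pos += 2                      # skip the line number itself
        while memory[pos] != CR:      # byte-by-byte scan for the terminator
            pos += 1
            steps += 1
        pos += 1                      # step past the CR
        steps += 1
    return None, steps


def linked_store(lines):
    """Pack lines as: 2-byte offset of next line, 2-byte number, text, NUL."""
    out = bytearray()
    for number, text in sorted(lines):
        body = text.encode("ascii") + b"\x00"
        next_offset = len(out) + 4 + len(body)
        out += struct.pack("<HH", next_offset, number) + body
    return bytes(out)


def linked_find(memory, target):
    """Locate a line by following the stored next-line offsets."""
    pos, steps = 0, 0
    while pos < len(memory):
        next_offset, number = struct.unpack_from("<HH", memory, pos)
        if number == target:
            return pos, steps
        pos = next_offset             # one jump per line, regardless of length
        steps += 1
    return None, steps


program = [(10, 'PRINT "HI"'), (20, "GOTO 10"), (30, "END")]
print(tiny_find(tiny_store(program), 30))
print(linked_find(linked_store(program), 30))
```

The `steps` counter makes the cost visible: the first search grows with the number of characters in the program, the second only with the number of lines, at the price of two extra bytes per line.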
The maximum length of a line varied: 64 characters in Palo Alto Tiny BASIC, including the decimal representation of the line number; 120 characters in Atari BASIC; 128 characters in Integer BASIC;[96] and 255 characters in MS-BASIC (not including the line number).
When a line was entered, interpreters would search the program a line at a time, comparing each stored line number with the new one. When the search reached a higher line number, that line and all later lines would be moved in memory to make room for the new line. If it found the same line number and the new text was not exactly the same length, subsequent lines would need to be moved forward or backward.[97] (Because sequential order was always maintained in memory, these were not linked lists.)
In Tiny BASIC, these searches required checking every byte in a line: the pointer would be incremented again and again until a carriage return was encountered, to find the byte before the next line. In Altair BASIC and LLL BASIC, on the other hand, the pointer would instead be set to the start of the next sequential line; this was much faster, but required two bytes per line. Given that Tiny BASIC programs were presumed to be 4 KB or less in size, the slower scan was in keeping with Tiny BASIC's general design philosophy of trading off performance in favor of minimizing memory usage.
As developers added structured programming constructs to BASIC, they often removed the need for line numbers altogether and added text editors and, later, integrated development environments.
Variables and data types
Variable names
Dartmouth BASIC and HP-BASIC limited variable names to at most two characters (either a single letter or a letter followed by one digit; e.g., A to Z9).
MS-BASIC allowed variable names of a letter followed by an optional letter or digit (e.g., A to ZZ) but ignored subsequent characters: thus it was possible to inadvertently write a program with variables "LOSS" and "LOAN", which would be treated as being the same; assigning a value to "LOAN" would silently overwrite the value intended as "LOSS". Integer BASIC was unusual in supporting variable names of any length (e.g., SUM, GAMEPOINTS, PLAYER2), provided they did not contain a reserved word.[98] Keywords could not be used in variables in many early BASICs; "SCORE" would be interpreted as "SC" OR "E", where OR was a keyword.
In many microcomputer dialects of BASIC, string variables are distinguished by having $ suffixed to their name, and values are identified as strings by being delimited by "double quotation marks". Later implementations would use other punctuation to specify the type of a variable: A% for integer, A! for single precision, and A# for double precision.
With the exception of arrays and (in some implementations) strings, and unlike Pascal and other more structured programming languages, BASIC does not require a variable to be declared before it is referenced. Values will typically default to 0 (of the appropriate precision) or the null string.
Symbol table
Because Tiny BASIC only used 26 single-letter variables, variables could be stored as an array without storing their corresponding names, using a formula based on the ASCII value of the letter as the index. Palo Alto Tiny BASIC took this a step further: the variables' two-byte values were located in RAM within the program, from bytes 130 (ASCII 65, 'A', times two) to 181 (ASCII 90, 'Z', times two, plus one for the second byte).[85]
Most BASICs provided for the ability to have far more than 26 variables and so needed symbol tables, which would set aside storage capacity for only those variables used.
In LLL BASIC, each entry in the symbol table was stored as follows:[99]
Unlike most BASIC interpreters, UIUC BASIC had a hash function, hashing by the letter of the variable/function/array name, then conducting a linear search from there. In UIUC BASIC, a symbol table entry was:[58]
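The lookup strategy described for UIUC BASIC, hashing on the first letter of a name and then searching linearly, can be sketched in Python as follows. This is only an illustration of the general technique: the table size, the slot contents, the spacing of the starting slots, and the `SymbolTable` class itself are hypothetical and do not reproduce UIUC BASIC's actual entry format.

```python
class SymbolTable:
    """Open-addressed table: hash on the first letter, then probe linearly."""

    def __init__(self, size=52):
        self.size = size
        self.slots = [None] * size    # each slot holds (name, value) or None

    def _start(self, name):
        # Hypothetical hash: spread 'A'..'Z' evenly over the table.
        letter = ord(name[0].upper()) - ord("A")
        return (letter * self.size // 26) % self.size

    def _probe(self, name):
        """Return the slot holding name, or the first free slot after its start."""
        index = self._start(name)
        for _ in range(self.size):
            entry = self.slots[index]
            if entry is None or entry[0] == name:
                return index
            index = (index + 1) % self.size   # linear search from the hash slot
        return None                           # table full and name absent

    def set(self, name, value):
        index = self._probe(name)
        if index is None:
            raise MemoryError("symbol table full")
        self.slots[index] = (name, value)

    def get(self, name, default=0):
        index = self._probe(name)
        if index is None or self.slots[index] is None:
            return default                    # unset variables read as 0
        return self.slots[index][1]


table = SymbolTable()
table.set("A", 1)
table.set("A1", 2)    # same first letter: lands in the next free slot
table.set("B", 3)
print(table.get("A1"), table.get("Z9"))
```

Names that share a first letter collide by construction, so the linear search does most of the work in programs that favour a few letters; the hash only shortens the distance to the right neighbourhood.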
In Atari BASIC, a set of pointers (addresses) indicated various data: variable names were stored in the variable name table (pointed to at VNTP, $82 and $83) and their values were stored in the variable value table (pointed to at VVTP, $86 and $87). By indirecting the variable names in this way, a reference to a variable needed only one byte to address its entry in the appropriate table. String variables had their own area.
One BBC BASIC performance optimization was to use multiple linked lists for variable lookup rather than a single long list, as in Microsoft BASIC.
Memory management
Because of the small RAM capacity of most systems originally used to run BASIC interpreters, clever memory management techniques had to be employed. Altair BASIC let users reclaim the space for trigonometry functions if those weren't being used during a session. PATB placed the start of the most common subroutines at the front of the program so that they could be reached with the 1-byte RST opcode of the 8080 instead of the 3-byte CALL opcode.
Video was often memory addressable, and certain esoteric functions were available by manipulating values at specific memory addresses. For instance, addresses 32 to 35 contained the dimensions of the text window (as opposed to the graphics window) in Applesoft BASIC.
Some implementations of the Microsoft interpreter, for example those running on the TRS-80 Models I/III, required the user to specify the amount of memory to be used by the interpreter. This was to permit a region of memory to be reserved for the installation of machine language subroutines that could be called by the interpreted program, for greater speed of execution. When the Models I/III are powered up, the user is greeted with the prompt "Memory size?" for this purpose.
Mathematics
Integer BASIC, as its name implies, uses integers as the basis for its math package. These were stored internally as a 16-bit number, little-endian (as is the 6502). This limited the value of any calculation to the range −32767 to 32767.
Calculations that resulted in values outside that range produced an error.[103]
Most Tiny BASIC interpreters (as well as Sinclair BASIC 4K) supported mathematics using integers only, lacking floating-point support. Using integers allowed numbers to be stored in a much more compact 16-bit format that could be more rapidly read and processed than the 32- or 40-bit floating-point formats found in most BASICs of the era. However, this limited its applicability as a general-purpose language.
Business BASIC implementations, such as Data General Business Basic, were also integer-only, but typically at a higher precision: "double precision", i.e. 32-bit (plus or minus 2,147,483,648) and "triple precision" (plus or minus 1.4x10^14).
Other computer number formats were sometimes used. For instance, the MINOL Tiny BASIC supported only unsigned bytes,[84] and the MICRO-BASIC Tiny BASIC used Binary Coded Decimal.[104] But floating point would come to predominate.
Floating point
One story encapsulates why floating point was considered so important. The original prototype of the TRS-80 Model I ran Li-Chen Wang's public domain version of Tiny BASIC. This required only 2 KB of memory for the interpreter, leaving an average of another 2 KB free for user programs in common 4 KB memory layouts of early machines. During a demonstration to executives, Tandy Corporation's then-President Charles Tandy tried to enter his salary but was unable to do so. This was because Tiny BASIC used 2-byte signed integers with a maximum value of 32,767.
The result was a request for floating-point math for the production version.[105] This led to the replacement of the existing 16-bit integer code with a version using 32-bit single-precision floating-point numbers by Tandy employee Steve Leininger.[106]
SCELBAL used floating-point routines published by Wadsworth in 1975 in Machine Language Programming for the 8008, based on a 32-bit (four byte) format for numeric calculations, with a 23-bit mantissa, 1-bit sign for the mantissa, a 7-bit exponent, and 1-bit sign for the exponent. These were organized in reverse order, with the least significant byte of the mantissa in the first byte, followed by the middle and then most significant byte with the sign in the high bit. The exponent came last, again with the sign in the high bit.[107] The manual provides well-documented assembly code for the entire math package, including entry points and usage notes.[108]
Consultants were typically brought in to handle floating-point arithmetic, a specialist domain well studied and developed for the scientific and commercial applications that had characterized mainframes. When Allen and Gates were developing Altair BASIC, fellow Harvard student Monte Davidoff convinced them to switch from integer arithmetic. They hired Davidoff to write a floating-point package that could still fit within the 4 KB memory limits. Steve Wozniak turned to Roy Rankin of Stanford University for implementing the transcendental functions LOG, LOG10, and EXP;[109] however, Wozniak never finished adding floating-point support to Integer BASIC. LLL BASIC, developed at the University of Idaho by John Dickenson, Jerry Barber, and John Teeter, turned to David Mead, Hal Brand, and Frank Olken for their floating-point support.[110] For UIUC BASIC, a Datapoint 2200 floating-point package was licensed.[111]
In contrast, time-shared systems had often relied on hardware.
For instance, the GE-235 was chosen for implementing the first version of Dartmouth BASIC specifically because it featured an "Auxiliary Arithmetic Unit" for floating point and double-precision calculations.[112][113]
Early interpreters used 32-bit formats, similar to the IEEE 754 single-precision binary floating-point format, which specifies a sign bit (1 bit), an exponent width of 8 bits, and a significand precision of 24 bits (23 explicitly stored).
Here is the value 0.15625 as stored in this format: sign bit 0, exponent 01111100, fraction 01000000000000000000000 (hexadecimal 3E200000).
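The encoding of 0.15625 can be checked with a short Python sketch that packs the value as an IEEE 754 single-precision number and splits it into its three fields. Note that this decodes the IEEE 754 layout itself, which the early 32-bit BASIC formats only resembled; the function name `decode_single` is an illustrative choice, not part of any interpreter.

```python
import struct


def decode_single(value):
    """Split a number into the sign (1 bit), exponent (8 bits) and
    fraction (23 bits) fields of IEEE 754 single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return bits, sign, exponent, fraction


bits, sign, exponent, fraction = decode_single(0.15625)
print(f"{bits:032b}  (hex {bits:08X})")
print(f"sign={sign} exponent={exponent:08b} fraction={fraction:023b}")

# Rebuild the value: (-1)^sign * (1 + fraction / 2^23) * 2^(exponent - 127)
rebuilt = (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
print(rebuilt)
```

The stored exponent is biased by 127, so the field value 124 stands for 2 to the power −3, and the fraction 0.25 is added to the implicit leading 1, giving 1.25 × 2⁻³ = 0.15625.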
While 32-bit formats were common in this era, later versions of BASIC, starting with Microsoft BASIC for the MOS 6502, generally adopted a 40-bit (five byte) format for added precision.[114]
Operators and functions
Infix operators typically included + (addition), - (subtraction), * (multiplication), / (division), and an exponentiation operator such as ^.
Dartmouth BASIC's initial edition included the following functions: ABS, ATN, COS, EXP, INT, LOG, RND, SIN, SQR, and TAN.
Arrays
The second version of Dartmouth BASIC supported matrices and matrix operations, useful for the solution of sets of simultaneous linear algebraic equations. In contrast, Tiny BASIC as initially designed didn't even have any arrays, due to the limited main memory available on early microcomputers, often 4 KB, which had to include both the interpreter and the BASIC program. Palo Alto Tiny BASIC added a single variable-length array of integers, the size of which did not have to be dimensioned but used RAM not used by the interpreter or the program listing. SCELBAL supported multiple arrays, but taken together these arrays could have no more than 64 items. Integer BASIC supported arrays of a single dimension, limited in size only by the available memory.[118] Tiny BASIC Extended supported two-dimensional arrays of up to 255 by 255. Altair BASIC 4K supported only one-dimensional arrays, while the 8K version supported matrices of up to 34 dimensions.[119]
Many implementations supported the Dartmouth BASIC practice of not requiring an array to be dimensioned, in which case it was assumed to have 11 elements (0 to 10).
The dope vector of arrays varied from implementation to implementation. For instance, the dope vector of an Altair BASIC 4K array:[94]
Then the array values themselves:
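To make the idea of a dope vector concrete, the following Python sketch keeps one upper bound per dimension and uses those bounds to turn a list of subscripts into an offset within a flat block of values. It is a minimal illustration under stated assumptions: the `BasicArray` class, the row-major ordering, and the zero default are choices made for the example and do not describe the byte layout of Altair BASIC or any other interpreter.

```python
from dataclasses import dataclass, field


@dataclass
class BasicArray:
    upper_bounds: tuple               # the "dope vector": one bound per dimension
    values: list = field(init=False)  # flat storage for all elements

    def __post_init__(self):
        count = 1
        for bound in self.upper_bounds:
            count *= bound + 1        # subscripts run from 0 to bound inclusive
        self.values = [0] * count

    def offset(self, *subscripts):
        """Map subscripts to a position in the flat storage (row-major)."""
        if len(subscripts) != len(self.upper_bounds):
            raise IndexError("wrong number of subscripts")
        flat = 0
        for subscript, bound in zip(subscripts, self.upper_bounds):
            if not 0 <= subscript <= bound:
                raise IndexError("subscript out of range")
            flat = flat * (bound + 1) + subscript
        return flat


default = BasicArray((10,))           # an undimensioned array: elements 0 to 10
matrix = BasicArray((2, 3))           # comparable to DIM M(2,3)
matrix.values[matrix.offset(1, 2)] = 42
print(len(default.values), len(matrix.values), matrix.offset(1, 2))
```

With a single data type the bounds are all that is needed to find an element; an interpreter with several numeric types would also have to record the element size or type alongside them.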
Implementations that supported matrices had to record the number of dimensions and the upper bound of each dimension. In interpreters with only one data type (either floating point or integer), that was all the dope vector needed to record; interpreters with multiple data types also had to record the data type of the array.
Even though Microsoft and other BASICs did support matrices, matrix operations were not built in but had to be programmed explicitly on array elements.
Strings
The original Dartmouth BASIC, some of its immediate descendants, and Tiny BASIC implementations lacked string handling. Two competing schools of string handling evolved, pioneered by HP and DEC, although other approaches came later. These required different strategies for implementation.
The simplest string handling copied HP Time-Shared BASIC and defined string variables as arrays of characters that had to be dimensioned prior to use. Substrings within strings are accessed using a "slicing" notation that gives the first and last character positions in parentheses after the variable name. This is in sharp contrast to BASICs following the DEC pattern that use functions such as LEFT$, MID$, and RIGHT$ to access substrings.
Later versions of Dartmouth BASIC did include string variables. However, they did not use the LEFT$/MID$/RIGHT$ functions for manipulating strings.
Integer BASIC, North Star BASIC[121] and Atari BASIC[122] mimicked HP's approach, which again contrasted with the style found in BASICs derived from DEC, including Microsoft BASIC, where strings are an intrinsic variable-length type.[123]
Some of the Tiny BASIC implementations supported one or more predefined integer arrays, which could be used to store character codes, provided the language had functionality to input and output character codes.
Garbage collection
Having strings use a fixed amount of memory regardless of the number of characters used within them, up to a maximum of 255 characters, may have wasted memory[124] but had the advantage of avoiding the need for implementing garbage collection of the heap, a form of automatic memory management used to reclaim memory occupied by strings that are no longer in use. Short strings that were released might be stored in the middle of other strings, preventing that memory from being used when a longer string was needed.
On early microcomputers, with their limited memory and slow processors, BASIC garbage collection could often cause apparently random, inexplicable pauses in the midst of program operation. Some BASIC interpreters, such as Applesoft BASIC on the Apple II family, repeatedly scanned the string descriptors for the string having the highest address in order to compact it toward high memory, resulting in O(n²) performance, which could introduce minutes-long pauses in the execution of string-intensive programs.
Garbage collection was notoriously slow or even broken in other versions of Microsoft BASIC.[125] Some operating systems that supported interrupt-driven background tasks, such as TRSDOS/LS-DOS 6.x on the TRS-80 Model 4, exploited periods of user inactivity (such as the milliseconds-long periods between keystrokes and periods following video screen refresh) to process garbage collection during BASIC program runs.
Other functionality
Graphics and sound
BASIC interpreters differed widely in graphics and sound support, which varied dramatically from microcomputer to microcomputer. Altair BASIC lacked any graphics or sound commands, as did the Tiny BASIC implementations, while Integer BASIC provided a rich set.
Level I BASIC for the TRS-80 had as minimal a set as possible. In contrast, Integer BASIC supported color graphics, simple sound, and game controllers. Graphics mode was turned on with the GR statement.
Hardware manufacturers often included proprietary support for semigraphics, simple shapes and icons treated as special characters. Examples included the block graphics of the ZX-81, and the card symbols of ♠, ♣, ♥ and ♦ in the Commodore International PETSCII character set. BASIC could generate these symbols using the CHR$ function.
Microsoft added many graphics commands to IBM BASIC.
Input/output
Another area where implementations diverged was in keywords for dealing with media (cassettes and floppy disks), keyboard input, and game controllers (if any).
Since ROM-based BASIC interpreters often functioned as shells for loading in other applications, implementations added commands related to cassette tapes (e.g., CLOAD and CSAVE).
Dartmouth BASIC lacked a command for getting input from the keyboard without pausing the program. To support videogames, BASICs added proprietary commands for doing so.
Palo Alto Tiny BASIC lacked strings but would allow users to enter mathematical expressions as the answer to INPUT statements.
Some systems supported game controllers.
Astro BASIC was one of them.
Integer BASIC lacked any custom input/output commands, and also lacked the DATA statement and the associated READ.
Structured programming
While structured programming, through the examples of ALGOL 58 and ALGOL 60, was known to Kemeny and Kurtz when they designed BASIC, they adopted only the for-loop, ignoring the else-statement, while-loop, repeat loop, named procedures, parameter passing, and local variables. As a result, subsequent dialects often differed dramatically in the wording used for structured techniques.
Of the Tiny BASIC implementations, only National Industrial Basic Language (NIBL) offered a loop command of any sort.
BBC BASIC was one of the first microcomputer interpreters to offer structured BASIC programming, with named procedures and functions.
The following example is in Microsoft QBASIC, Microsoft's third implementation of a structured BASIC (following Macintosh BASIC in 1984 and Amiga BASIC in 1985).[138]
REM QBASIC example
REM Forward declaration - allows the main code to call a
REM subroutine that is defined later in the source code
DECLARE SUB PrintSomeStars (StarCount!)
REM Main program follows
DO
INPUT "How many stars do you want? (0 to quit) ", NumStars
CALL PrintSomeStars(NumStars)
LOOP WHILE NumStars>0
END
REM subroutine definition
SUB PrintSomeStars (StarCount)
REM This procedure uses a local variable called Stars$
Stars$ = STRING$(StarCount, "*")
PRINT Stars$
END SUB
Object oriented
Initial support for object-oriented programming provided only the re-use of objects created with other languages, such as how Visual Basic and PowerBASIC supported the Windows Component Object Model. As BASIC interpreters continued to evolve, they added support for object-oriented features such as methods, constructors, dynamic memory allocation, properties and temporary allocation.
Included assembler
The Integer BASIC ROMs also included a machine code monitor, "mini-assembler", and disassembler to create and debug assembly language programs.[90][139][140]
One of the unique features of BBC BASIC was the inline assembler, allowing users to write assembly language programs for the 6502 and, later, the Zilog Z80, NS32016 and ARM. The assembler was fully integrated into the BASIC interpreter and shared variables with it; assembly code could be included between the [ and ] characters, saved and loaded via *SAVE and *LOAD, and called via the CALL or USR commands. This allowed developers to write not just assembly language code, but also BASIC code to emit assembly language, making it possible to use code-generation techniques and even write simple compilers in BASIC.
Execution
Debugging
As in most BASICs, programs were started with the RUN command. Some interpreters also offered step-by-step execution, and some implementations, such as the Microsoft interpreters for the various TRS-80 models, included additional commands.
Unlike most BASICs, Atari BASIC scanned the just-entered program line and reported syntax errors immediately. If an error was found, the editor re-displayed the line, highlighting the text near the error in inverse video.
In many interpreters, including Atari BASIC, errors are displayed as numeric codes, with the descriptions printed in the manual.[145] Many MS-BASIC versions used two-character abbreviations (e.g., SN for SYNTAX ERROR). Palo Alto Tiny BASIC and Level I BASIC used three words for error messages: "WHAT?" for syntax errors, "HOW?"
for run-time errors like GOTOs to a line that didn't exist or numeric overflows, and "SORRY" for out-of-memory problems.
Parsing
While the BASIC language has a simple syntax, mathematical expressions do not: they involve parentheses and operators with differing precedence. Supporting such expressions requires implementing a recursive descent parser.[146] This parser can be implemented in a number of ways:
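As an illustration of the recursive-descent approach, the following Python sketch evaluates integer expressions with parentheses, unary minus, and the usual precedence of multiplication and division over addition and subtraction. It assumes integer-only arithmetic in the spirit of Tiny BASIC; the grammar, the `Parser` class, and the `evaluate` helper are hypothetical and are not drawn from any specific interpreter. The "WHAT?" message simply borrows the syntax-error wording mentioned above.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split an expression into numbers, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError("WHAT?")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each rule calls the next-tighter one."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):      # expression := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):            # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value //= self.factor()   # Python floor division; real interpreters varied
        return value

    def factor(self):          # factor := number | '(' expression ')' | '-' factor
        token = self.take()
        if token == "(":
            value = self.expression()
            if self.take() != ")":
                raise SyntaxError("WHAT?")
            return value
        if token == "-":
            return -self.factor()
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError("WHAT?")


def evaluate(text):
    parser = Parser(text)
    value = parser.expression()
    if parser.peek() is not None:         # leftover tokens mean a malformed expression
        raise SyntaxError("WHAT?")
    return value


print(evaluate("1 + 2 * 3"), evaluate("(1 + 2) * 3"))
```

Precedence falls out of the call structure: `expression` only ever sees complete terms, and `term` only ever sees complete factors, so no explicit precedence table is needed.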
Performance
The range of design decisions that went into programming a BASIC interpreter was often revealed through performance differences.
Line-management implementations often affected performance and typically used linear search. In Tiny BASIC, and others, in order to find a line the system had to read each byte of the source code looking for a CR, which indicated the next item was a line number. This meant the system had to read the entire program until it found its target for a GOTO or GOSUB, imposing a large performance penalty. MS-BASIC and its many derivatives stored a pointer to the next line, so if one line number was seen not to match, the search could immediately move to the next line number. Atari BASIC used an 8-bit line length instead of a pointer, saving a byte. Many implementations would always search for a line number to branch to from the start of the program; MS-BASIC would search from the current line if the destination line number was greater. Pittman added a patch to his 6800 Tiny BASIC to use a binary search.[148]
Working solely with integer math provides another major boost in speed. As many computer benchmarks of the era were small and often performed simple math that did not require floating-point, Integer BASIC trounced most other BASICs.[e] On one of the earliest known microcomputer benchmarks, the Rugg/Feldman benchmarks, Integer BASIC was well over twice as fast as Applesoft BASIC on the same machine.[150] In the Byte Sieve, where math was less important but array access and looping performance dominated, Integer BASIC took 166 seconds while Applesoft took 200.[151] It did not appear in the Creative Computing Benchmark, which was first published in 1983, by which time Integer BASIC was no longer supplied by default.[152]
The following test series, taken from both of the original Rugg/Feldman articles,[150][149] shows Integer BASIC's performance relative to the MS-derived BASIC on the same platform.
MS-BASIC only converted line numbers and a set of keywords into tokens, a process they referred to as "crunching". This meant that things like line numbers in GOTO statements were left in ASCII format, and had to be re-converted every time they were encountered. In contrast, a number of interpreters would convert everything into the runtime format, examples including Atari BASIC and Sinclair BASIC. In theory this would greatly speed performance over MS-style interpreters, as many bits of the program that would be bare ASCII did not have to be parsed at runtime. In general, however, these dialects often ran much slower for a variety of reasons. For instance, on two widely used benchmarks of the era, Byte magazine's Sieve of Eratosthenes and the Creative Computing benchmark test written by David H. Ahl, Atari finished near the end of the list in terms of performance, and was much slower than the contemporary Apple II or Commodore PET.[153]
See also
Notes
References
Bibliography
Further reading
Source code and design documents, in chronological order of the release of the BASIC implementations: