QBJC Project

Programmer: Toshi Horie

Date Started: December 27, 1999

Goal: Write a working BASIC->Java/asm compiler that emulates the SCREEN 13
(VGA mode 13h) video architecture down to the port level, so that graphics
demos written in QB can come to life on the web as Java applets, and to spur
development of network games by QB programmers by adding TCP/IP
extensions to this version of QB.

Compiler Structure:
Source: a subset of QuickBASIC 4.5
Compiler code: Java 1.1 source
Target: Java 1.1 JVM source code, x86 assembly

* The Long-Awaited Source Files *****
I will make more of the sources available as this project progresses.
All of the source and specification files are copyrighted by Toshihiro Horie and UC Regents.
They may be modified or used only with the written consent of the author.
However, the final project is planned to be released under the terms of the GNU General Public License.


        QBJLEX.TXT - JLex lexer specification file.
        QBJCUP.TXT - JavaCUP parser specification file.
        Java CUP - required to compile qbjcup.txt.
        JLex - required to compile qbjlex.txt. To download it, right-click the link and use "Save Target As...".
        JDK 1.2 - required; provides the Java compiler (to compile the Java sources) and the JVM runtime.
        JVIEW - (optional) Microsoft's Java virtual machine, used to "run" the compiled .class files.
                You may have it already, because it comes with Internet Explorer and Win98.

* Installation directions *****
  1. Install JDK 1.2 into C:\JAVA (not the default directory).
  2. Copy Main.java (from the JLex link) to C:\JAVA\CLASSES\JLex\Main.java
  3. C:
  4. CD \JAVA\CLASSES\JLex
  5. javac Main.java
  6. Unzip java_cup_v10j.zip, preserving its directories, into C:\JAVA\CLASSES\
  7. SET CLASSPATH=.;C:\JAVA\CLASSES
  8. SET PATH=%PATH%;C:\JAVA\BIN
  9. Copy QBJLEX.TXT to C:\QB45\COMPILER\QB1.LEX
  10. Copy QBJCUP.TXT to C:\QB45\COMPILER\QB1.CUP, then CD \QB45\COMPILER
  11. java java_cup.Main -symbols Sym < QB1.CUP
  12. javac Sym.java
  13. java JLex.Main QB1.LEX
  14. ren QB1.LEX.java Yylex.java
  15. javac Yylex.java
  16. If you don't get any errors doing the above, congratulations!

* General Idea for SCREEN 13 emulation ***
Basically, any POKE directed at segments 9001h-AFFFh can potentially
plot a pixel on SCREEN 13, provided the mode is set. Of course PSET can
too, but that's the easy case. To map the POKE address to our emulated
video buffer (a 65536-entry array feeding a java.awt.image.MemoryImageSource),
linearize the segmented address. The logic looks something like:

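    // DEFSEG = current DEF SEG segment, POKEOFS/POKEVALUE = POKE arguments
    if (DEFSEG > 0x9000 && DEFSEG < 0xB000)
    {
       // linear address relative to A000:0000, the start of video memory
       arrayoffset = (DEFSEG - 0xA000) * 16 + POKEOFS;
       if (arrayoffset >= 0 && arrayoffset < 65536)
           videobuf[arrayoffset] = POKEVALUE;
    }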
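
As a minimal sketch of the mapping above in compilable Java: the class and
method names (PokeMapper, toBufferIndex, poke) are hypothetical and not part
of the QBJC sources. It assumes a byte[] buffer, since MemoryImageSource
accepts byte[] or int[] pixel arrays, and it masks the offset to 16 bits,
which is my assumption about how out-of-range offsets should behave.

```java
class PokeMapper {
    static final int VIDEO_SEGMENT = 0xA000; // start of VGA graphics memory
    static final int BUFFER_SIZE = 65536;    // emulated 64K window

    /** Returns the index into the video buffer, or -1 if the POKE misses it. */
    static int toBufferIndex(int defSeg, int pokeOfs) {
        if (defSeg <= 0x9000 || defSeg >= 0xB000) {
            return -1; // segment cannot reach the video window
        }
        // Linearize segment:offset, relative to A000:0000.
        int index = (defSeg - VIDEO_SEGMENT) * 16 + (pokeOfs & 0xFFFF);
        return (index >= 0 && index < BUFFER_SIZE) ? index : -1;
    }

    /** Emulates POKE pokeOfs, value under DEF SEG = defSeg. */
    static void poke(byte[] videoBuf, int defSeg, int pokeOfs, int value) {
        int index = toBufferIndex(defSeg, pokeOfs);
        if (index >= 0) {
            videoBuf[index] = (byte) value;
        }
    }

    public static void main(String[] args) {
        byte[] videoBuf = new byte[BUFFER_SIZE];
        poke(videoBuf, 0xA000, 2 * 320 + 10, 15);        // pixel x=10, y=2
        System.out.println(toBufferIndex(0xA000, 650));  // 650
        System.out.println(toBufferIndex(0xA014, 330));  // 650 (same byte)
        System.out.println(toBufferIndex(0x9FFF, 16));   // 0
        System.out.println(toBufferIndex(0xB800, 0));    // -1 (text-mode segment)
    }
}
```

The second and third calls show why the segment range is wider than just
A000h: different segment:offset pairs can alias the same video byte.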

I'm not going to worry about CALL ABSOLUTE, because it's not my intent
to emulate the x86 CPU in Java.

The buffer will be periodically flushed (emulating vsync) by a
rendering thread, and the virtual port &H3DA will be updated
accordingly, so that WAITing for vsync should work too.
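
One way the virtual port could be sketched in Java is shown here. This is not
the project's code: VirtualPorts, setRetrace, inp and waitPort are
hypothetical names. It relies on two facts about the real hardware and
language: bit 3 (value 8) of VGA port 3DAh is set during vertical retrace,
and QB's WAIT port, andmask, xormask blocks until
(INP(port) XOR xormask) AND andmask is nonzero. The 1 ms / 13 ms timing is
only a rough stand-in for a 70 Hz refresh.

```java
class VirtualPorts {
    static final int STATUS_PORT = 0x3DA;  // VGA Input Status #1
    static final int VRETRACE_BIT = 0x08;  // bit 3: vertical retrace active

    private int status; // guarded by 'this'

    /** Called by the rendering thread around each buffer flush. */
    synchronized void setRetrace(boolean active) {
        status = active ? (status | VRETRACE_BIT) : (status & ~VRETRACE_BIT);
        notifyAll(); // wake any emulated WAIT statements
    }

    /** Emulates INP(port); ports other than 3DAh read as 0 in this sketch. */
    synchronized int inp(int port) {
        return port == STATUS_PORT ? status : 0;
    }

    /** Emulates WAIT port, andMask, xorMask. */
    synchronized void waitPort(int port, int andMask, int xorMask)
            throws InterruptedException {
        while (((inp(port) ^ xorMask) & andMask) == 0) {
            wait();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final VirtualPorts ports = new VirtualPorts();
        Thread renderer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        ports.setRetrace(true);  // the buffer flush would go here
                        Thread.sleep(1);
                        ports.setRetrace(false);
                        Thread.sleep(13);
                    }
                } catch (InterruptedException e) {
                    // renderer stops
                }
            }
        });
        renderer.setDaemon(true);
        renderer.start();
        for (int frame = 0; frame < 3; frame++) {
            ports.waitPort(STATUS_PORT, 8, 8); // WAIT &H3DA, 8, 8
            ports.waitPort(STATUS_PORT, 8, 0); // WAIT &H3DA, 8
            System.out.println("synced frame " + frame);
        }
    }
}
```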

Palette emulation should be easy, although I need to investigate this a
little more. I think an 8-bit MemoryImageSource will work just like
SCREEN 13, so reinitializing the MemoryImageSource with the new colormap
should correctly reflect palette changes.
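
A rough sketch of the palette idea, assuming the standard
java.awt.image.IndexColorModel and MemoryImageSource classes. PaletteSketch
and fromDac are hypothetical names, and the grayscale ramp is a placeholder,
not the real default VGA palette. The VGA DAC holds 6 bits per channel
(0..63), so values are scaled to 8 bits. Instead of rebuilding the image
source, this variant hands the new color model to newPixels(), which is one
possible alternative to reinitializing.

```java
import java.awt.image.IndexColorModel;
import java.awt.image.MemoryImageSource;

class PaletteSketch {
    static final int WIDTH = 320, HEIGHT = 200;

    /** Builds an 8-bit indexed color model from 256 six-bit DAC entries. */
    static IndexColorModel fromDac(int[] r6, int[] g6, int[] b6) {
        byte[] r = new byte[256], g = new byte[256], b = new byte[256];
        for (int i = 0; i < 256; i++) {
            r[i] = (byte) (r6[i] * 255 / 63); // scale 0..63 to 0..255
            g[i] = (byte) (g6[i] * 255 / 63);
            b[i] = (byte) (b6[i] * 255 / 63);
        }
        return new IndexColorModel(8, 256, r, g, b);
    }

    public static void main(String[] args) {
        byte[] videoBuf = new byte[65536];
        int[] r6 = new int[256], g6 = new int[256], b6 = new int[256];
        for (int i = 0; i < 256; i++) {
            r6[i] = g6[i] = b6[i] = i / 4; // placeholder grayscale ramp
        }
        MemoryImageSource source = new MemoryImageSource(
                WIDTH, HEIGHT, fromDac(r6, g6, b6), videoBuf, 0, WIDTH);
        source.setAnimated(true); // allow later pixel/palette updates

        // Emulated palette change: color 1 becomes bright red.
        r6[1] = 63; g6[1] = 0; b6[1] = 0;
        IndexColorModel changed = fromDac(r6, g6, b6);
        source.newPixels(videoBuf, changed, 0, WIDTH);
        System.out.println(Integer.toHexString(changed.getRGB(1))); // ffff0000
    }
}
```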


* Lexer Generator Implementation ***
A wordlist and a generator program written in QB create the JLex input file.
JVIEW.EXE or a similar Java Virtual Machine is required to run JLex, but
since Microsoft supports JVIEW (as of 1999), pretty much everyone can
recreate my results using the generator program.

* Parser Generator Implementation ***
Most of the terminal declarations were generated by a QB program, using
the same wordlists as the Lexer generator. The nonterminal declarations
were done by hand. Fortunately, I had a copy of my JO99 parser, so I used
that as a base for the Java code for this parser. I was able to write
more human-readable grammar rules compared to my JO99 parser project by
using more "dummy" nonterminals.
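
To illustrate the shared-wordlist idea, here is a small sketch in Java (the
real generators are QB programs, which are not shown here). SpecGenerator,
cupTerminals and jlexRule are hypothetical names, and the exact shape of the
generated rules is an assumption: it uses CUP's "terminal A, B;" declaration
syntax, JLex's quoted-string patterns, and the Sym class name from the
-symbols Sym option in the installation steps. It also assumes every word is
a valid Java identifier, which is not true of keywords such as MID$.

```java
class SpecGenerator {
    /** One CUP declaration that names every keyword as a terminal. */
    static String cupTerminals(String[] words) {
        StringBuilder sb = new StringBuilder("terminal ");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(words[i]);
        }
        return sb.append(";").toString();
    }

    /** One JLex rule that returns the matching Sym constant for a keyword. */
    static String jlexRule(String word) {
        return "\"" + word + "\" { return new Symbol(Sym." + word + "); }";
    }

    public static void main(String[] args) {
        String[] words = { "DIM", "PRINT", "PSET" }; // sample wordlist
        System.out.println(cupTerminals(words));
        for (int i = 0; i < words.length; i++) {
            System.out.println(jlexRule(words[i]));
        }
    }
}
```

Driving both outputs from one list keeps the lexer's token names and the
parser's terminal names from drifting apart.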

8086 opcode generation
actual generator

* Development History ***
= December 29, 1999 ==========================================
Lexer Status:   70% complete.  QB operator and keyword list generated.
                               Lexer generated (624 states), but untested.
Parser Status:   5% complete.  wordlist ready to use for terminal definitions.
                               rough draft of parser rules in the works.
Semant Status:   0% complete.
CodeGen Status:  1% complete.  Ideas for implementing SCREEN 13 emulation done.
Testing Status:  0% complete.
Overall Status:  8% complete.

= January 6, 2000 ==========================================
Lexer Status:  100% complete.  Lexer is at NFA=1574, optimal DFA=664 states 
                               or so, and works flawlessly on all files 
                               tested except for choking on some
                               high-bit characters in one string.
Parser Status:  35% complete.  Parser is proving to be *very* time-consuming.  Things
                               went more smoothly after I discovered how to use the
                               {: RESULT=some_nonterminal; :} action.  However,
                               I'm having shift/reduce conflicts between formal arguments
                               and MethodCall statements.  They can look exactly
                               the same in some cases, making the grammar ambiguous.
                               CUP reports 0 errors and 143 warnings
                               235 terminals, 32 non-terminals,
                               and 165 productions declared,
                               producing 462 unique parse states.
Semant Status:   0% complete.
CodeGen Status:  1% complete.  Ideas for implementing SCREEN 13 emulation done.
Testing Status:  0% complete.  Got a few test cases from EFnet #quickbasic
Overall Status: 30% complete.

= January 7, 2000 ==========================================
Lexer Status:  100% complete.  1591 NFA states -> 651 DFA states.
                               lexer generation now takes 8+ mins.
Parser Status:  40% complete.  SUB calls without the CALL keyword
                               override my array subscripting, so
                               I take it out for now.  Now it gets
                               handled by DerefExp+SubscriptVar.
                               Got rid of RangeList and used a 
                               souped-up ExpList (uses wexp) instead.
                               This will kill parser bug #2b, but
                               will complicate semantCheck.
                               Now has 536 unique parse states.
                               Minor improvement in string handling.
                               parser generation now takes 8+ mins.
Notes:
* Do not add a {NEWLINE} to the end of the definition of commentline!
 This is because some comments can start after a statement, and those
 statements depend on seeing the newline, which won't exist if it gets 
 consumed by the comment itself.
    e.g. CALL warp(9)  'make the Enterprise go as fast as possible
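
The note above can be illustrated with a toy hand-written scanner in Java.
This is not the JLex-generated lexer; CommentNewlineSketch and scan are
hypothetical names, and the token names are made up for the example. The
comment rule stops just before the newline, so the statement in front of the
comment is still followed by a NEWLINE token. Apostrophes inside string
literals are not handled in this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class CommentNewlineSketch {
    /** Splits source text into word, COMMENT and NEWLINE tokens. */
    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<String>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\n') {
                tokens.add("NEWLINE");
                i++;
            } else if (c == '\'') {
                // Consume the comment up to, but NOT including, the newline.
                while (i < src.length() && src.charAt(i) != '\n') {
                    i++;
                }
                tokens.add("COMMENT");
            } else if (c == ' ') {
                i++; // skip blanks
            } else {
                int start = i;
                while (i < src.length() && " '\n".indexOf(src.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(src.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("CALL warp(9)  'full speed\nPRINT 1\n"));
        // [CALL, warp(9), COMMENT, NEWLINE, PRINT, 1, NEWLINE]
    }
}
```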


= January 10, 2000 ===================================
Lexer Status:  100% complete.  Lexer has NFA=1761, DFA=696 states. 
                               I added support for Metacommands like 
                               '$INCLUDE: 'filename' and '$STATIC.
                               Hopefully this doesn't break the lexer.
Parser Status:  45% complete.  242 terminals, 38 non-terminals, 
                               and 211 productions declared,
                               producing 600 unique parse states.
                               119 terminals declared but not used.
                               I tested dim statements extensively,
                               and it seems to parse them correctly,
                               (before the metacommand change).
                               READ statements should work now.

= January 12, 2000 ===================================
Lexer Status:  100% complete.  Lexer has NFA=1746, DFA=709 states. 
Parser Status:  60% complete.  245 terminals, 41 non-terminals, 
                               and 237 productions declared,
                               producing 670 unique parse states.
                               111 terminals declared but not used.
                               Allow full-line comments within TYPE decls.
                               Have problems parsing several forms of
                               subdeclstmt and funcdeclstmt.

Contact Toshi Horie if you're interested in joining or contributing to the project.