MIDP Terminal Emulation, Part 2: Advanced Terminal Emulation
 

by Michael Powers
April 2004

Read MIDP Terminal Emulation, Part 1: A Simple Emulator MIDlet
Download the project code

In the first article in this series we built a simple terminal emulator that runs on any MIDP device that supports TCP/IP sockets. It boasts a Connection that implements the Telnet protocol and a Canvas customized to display terminal content. In this second article I'll assume you've read the first one and are familiar with these components.

Now we'll take the application a bit further. First we'll add some sophistication by emulating a more advanced terminal type. Then we'll add support for user input -- keeping in mind the limitations of most mobile devices. When we're done, you'll be able to use the application to connect to remote servers over Telnet and run many kinds of programs, all from your MIDP device.

About ANSI Terminals

In the preceding article we implemented a "dumb" terminal. It displayed little more than a character stream, wrapping to the next line when the incoming characters reached the edge of the screen, and skipping to the next line when it encountered a linefeed character. While this degree of interactivity was sufficient for an entire generation of command-line applications, more sophisticated software addresses the screen as a whole, writing and erasing characters at specific locations, to provide a better user experience.

There's an interesting bit of history here. In the late 1970s, a wide range of vendors manufactured video terminals with varying, proprietary capabilities for screen manipulation, implemented in incompatible ways. Writing screen-oriented software that could take advantage of more than a few of these devices was difficult.

Enter the American National Standards Institute (ANSI). Adhering to its mandate to facilitate commerce through interoperability, ANSI stepped in and published standard X3.64: Additional Controls for Use with American National Standard Code for Information Interchange. This document defined what is now known as the ANSI terminal type. It established standard command sequences for moving the cursor to specific locations on the screen, and for inserting and deleting characters at the cursor's location.

Most important was the definition of the command sequence itself, because even terminals that didn't implement all of the commands could at least recognize unsupported commands as commands, and safely ignore them. This development allowed software developers to write to a common standard, confident that their applications would run at least in some minimal capacity across many vendors' devices. The story of ANSI is one instance of a recurring motif in the history of the software industry: adoption of a standard greatly expanding interoperability.

The ANSI terminal standard is a protocol much like the Telnet protocol. It defines special sequences of characters that allow an application to distinguish between commands to be interpreted and data to be written to the screen.

The ANSI terminal is the terminal type we'll emulate. TelnetConnection already filters out the Telnet handshaking and negotiation before it gets to the screen; now we'll add an additional filter to strip out and interpret the ANSI commands. Because these commands are instructions to our terminal, the right place to implement this logic is in the TelnetCanvas class itself.

How Do Terminal Escape Sequences Work?

Where Telnet uses the byte value 255 to signal a command sequence, ANSI uses the ASCII escape character, whose value is 27. ANSI works like this:

We read a byte from the input. If its value isn't 27 (ESC), then it's not a command: we pass it through to the application and read on.

We read the next byte. If its value isn't 91 ([), then it's not a command; we pass 27 followed by this byte through to the application and read on.

We keep reading bytes until we reach a byte greater than 63. These bytes form a string containing the parameters to the command. The final byte (which is 64 or greater) is the command code. We process (or ignore) this command and read on.

Many commands are officially parts of the standard. While perfect emulation is a worthy goal, we'll focus on getting enough capability to allow a lot of software to run acceptably. We'll implement the following commands:

Cursor Control Sequences:

A    Move cursor up n lines
B    Move cursor down n lines
C    Move cursor forward n spaces
D    Move cursor backward n spaces
G    Move cursor to column x
H    Move cursor to column x, row y
d    Move cursor to row y
s    Save current cursor position
u    Return to saved cursor position

Erase Sequences:

@    Insert n blank spaces
J    Erase display: after cursor (n=0), before cursor (n=1), or entirely (n=2)
K    Erase line: after cursor (n=0), before cursor (n=1), or entirely (n=2)
L    Insert n new blank lines
M    Delete n lines from cursor
P    Delete n characters from cursor

Most of these commands expect parameters in the form of a semicolon-delimited string. Parameters appear after the sequence ESC [ and before the command byte. For example, a command to move the cursor to the tenth row and tenth column would look like this: ESC [ 1 0 ; 1 0 H (the row comes first). Many commands take only one parameter; for instance, ESC [ 2 J will clear the display. Omitted parameters for the cursor-movement commands default to 1, so ESC [ H will return the cursor to coordinates (1,1), and ESC [ B will move the cursor down a single line. For the erase commands J and K, an omitted parameter is treated as 0.
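
To make the wire format concrete, here is a minimal sketch that builds the byte sequence for the H command. The class and method names (AnsiSequences, cursorTo) are hypothetical and are not part of the MIDTerm project code; the sketch only assumes the ESC [ row ; column H layout described above.

```java
// Hypothetical helper for illustration; not part of the MIDTerm project code.
class AnsiSequences
{
    static final byte ESC = 27;

    /** Builds ESC [ row ; column H using one-based coordinates. */
    static byte[] cursorTo( int row, int column )
    {
        String params = row + ";" + column;
        // ESC and '[' in front, the command byte 'H' at the end
        byte[] out = new byte[ params.length() + 3 ];
        out[0] = ESC;
        out[1] = (byte) '[';
        for ( int i = 0; i < params.length(); i++ )
        {
            out[ i + 2 ] = (byte) params.charAt( i );
        }
        out[ out.length - 1 ] = (byte) 'H';
        return out;
    }
}
```

Calling AnsiSequences.cursorTo( 10, 10 ) yields eight bytes: 27, '[', the five parameter characters 10;10, and the command byte 'H'.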

While there are lots of things to get right, the protocol is fairly simple.

Enhancing the Telnet Canvas Class

We need to update TelnetCanvas in two areas. It needs to interpret the ANSI commands received from the remote host and to send ANSI commands in response to user input. For input, we'll allow the user to move the cursor using the keypad on the device.

Implementing ANSI

While we must modify the internals of the TelnetCanvas class, it isn't necessary to change the public interface. All data is still received by the receive() method; we just change its implementation to watch for escape sequences:

/**
* Appends the specified ascii byte to the output.
*/
public void receive( byte b )
{
    // ignore nulls
    if ( b == 0 ) return; 
    if ( state == PARAMS_STATE )
    {
        // if still receiving parameters
        if ( b < 64 )
        {
            argbuf[0]++;
            
            // grow if needed
            if ( argbuf[0] == argbuf.length )
            {
                char[] tmp = new char[ argbuf.length * 2 ];
                System.arraycopy( 
                    argbuf, 0, tmp, 0, argbuf.length );
                argbuf = tmp; 
            }
            
            argbuf[ argbuf[0] ] = (char) b;
        }
        else // final byte: process the command
        {
            processCommand( b );
            // reset for next command
            argbuf[0] = 0;
            state = NORMAL_STATE;
        }
    }
    else
    if ( state == ESCAPE_STATE )
    {
        // if a valid escape sequence
        if ( b == '[' )
        {
            state = PARAMS_STATE;
        }
        else // not an escape sequence
        { 
            // allow escape to pass through
            state = NORMAL_STATE;
            processData( (byte) 27 );
            processData( b );
        }
    }
    else // NORMAL_STATE
    {
        if ( b == 27 )
        {
            state = ESCAPE_STATE;
        }
        else
        {
            processData( b );
        }
    }
}


This method implements a simple state machine with three states. NORMAL_STATE watches for any ESC bytes and sends everything else to processData(). When an ESC byte arrives, ESCAPE_STATE takes over to verify that the next byte is 91 ([). If so, we move to PARAMS_STATE, accumulating the parameter string until we encounter the command byte. When we do, we call processCommand(), then shift back to NORMAL_STATE.

The code that reads command parameters is worth examining. To avoid the overhead of creating and keeping a StringBuffer, we use a character array called argbuf. To avoid constant reallocation, we make it larger than needed and keep it around, extending it as necessary. Finally, to track where the next character should go, we borrow a trick from Pascal and store the length of the parameter string in the first element of the array. The getArgument() and getArgumentCount() methods handle the parsing and extraction of the individual parameters from this array.
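
The article does not list getArgument() and getArgumentCount(), so the following is only a sketch of how such methods could work over a length-prefixed array. It assumes the layout described above (length in element 0, characters from element 1 onward); the class name ArgBuffer, the static signatures, and the pack() helper are hypothetical, and the project's own methods may differ.

```java
// Hypothetical sketch; the project's own getArgument() may differ.
class ArgBuffer
{
    /** Counts the semicolon-delimited parameters; an empty buffer has none. */
    static int getArgumentCount( char[] argbuf )
    {
        int length = argbuf[0];
        if ( length == 0 ) return 0;
        int count = 1;
        for ( int i = 1; i <= length; i++ )
        {
            if ( argbuf[i] == ';' ) count++;
        }
        return count;
    }

    /** Parses the parameter at the given zero-based index as a decimal number. */
    static int getArgument( char[] argbuf, int index )
    {
        int length = argbuf[0];
        int field = 0;
        int value = 0;
        boolean seenDigit = false;
        for ( int i = 1; i <= length; i++ )
        {
            char c = argbuf[i];
            if ( c == ';' )
            {
                if ( field == index ) break; // finished the wanted field
                field++;
            }
            else if ( field == index )
            {
                if ( c < '0' || c > '9' )
                {
                    throw new NumberFormatException( "not a digit: " + c );
                }
                value = value * 10 + ( c - '0' );
                seenDigit = true;
            }
        }
        if ( field != index || !seenDigit )
        {
            throw new NumberFormatException( "missing argument " + index );
        }
        return value;
    }

    /** Illustration helper: packs a parameter string into the length-prefixed layout. */
    static char[] pack( String params )
    {
        char[] buf = new char[ params.length() + 1 ];
        buf[0] = (char) params.length();
        params.getChars( 0, params.length(), buf, 1 );
        return buf;
    }
}
```

Parsing digits by hand avoids creating a String for every parameter, which is in keeping with the motive for using a character array in the first place.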

We now shift the code that had been in the receive() method to the processData() method. The logic is the same, placing the incoming byte at the current cursor position, and the code is unchanged except for these two lines near the end of the method body:

/**
* Appends the specified byte to the display buffer.
*/
protected void processData( byte b )
{
    ...

    // increment bound if necessary
    while ( cursor > bound ) bound += columns;
    
    ...
}


With the simple stream-oriented approach of the earlier version of MIDTerm, the cursor marked not only the insertion point for incoming data but also the outer bound of how much of the data buffer should be visible on screen. Because the cursor can now move up and backward into the data buffer, an additional variable is needed to track the outer bound of the data, so we can determine where the bottom of the screen should be. This variable, called bound, must therefore track the cursor and stay ahead of it as it moves forward.
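
The relationship between cursor and bound can be sketched in isolation. The field names follow the article, but the class BoundDemo, its moveCursor() method, and the choice to start with one visible row are assumptions made for illustration.

```java
// Hypothetical sketch of the cursor/bound relationship described above.
class BoundDemo
{
    int columns;
    int cursor;
    int bound;

    BoundDemo( int columns )
    {
        this.columns = columns;
        this.cursor = 0;
        this.bound = columns; // assumption: start with one row of data
    }

    /** Moves the cursor; bound grows a row at a time if the cursor passed it. */
    void moveCursor( int newCursor )
    {
        cursor = newCursor;
        while ( cursor > bound ) bound += columns;
    }
}
```

With ten columns, moving the cursor to index 25 pushes bound to 30. Moving the cursor back to index 5 leaves bound at 30, so the bottom of the screen does not jump upward when the cursor moves backward.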

While processData() handles most bytes, processCommand() is called whenever a valid terminal command is received. This method is the heart of our ANSI implementation.

/**
* Executes the specified ANSI command, obtaining arguments
* as needed from the getArgument() and getArgumentCount() 
* methods.
*/
protected void processCommand( byte command )
{
    try 
    {
        switch ( command )
        {
            ... // other commands go here

            case 'd': // cursor to row y 
                if ( argbuf[0] > 0 ) 
                {
                    cursor = bound 
                     - ((rows-getArgument( 0 )+1)*columns) 
                     + ( cursor % columns );
                }
                break;
            
            case 'G': // cursor to column x 
                if ( argbuf[0] > 0 ) 
                {
                    cursor = cursor 
                     - ( cursor % columns ) 
                     + getArgument( 0 );
                }
                break;
            
            ... // other commands go here
                
            default:
                System.err.println( "unsupported command: " 
                    + (char) command 
                    + " : " 
                    + new String( argbuf, 1, argbuf[0] ) );
                
        }
    }
    catch ( Throwable t ) 
    {
        // probably parse exception or wrong number of args
        System.err.println( "Error in processCommand: " );
        t.printStackTrace();
    }
}


processCommand() is essentially a big switch statement with a case for each command that's supported. Unsupported commands fall into the default case, where they are logged to System.err and otherwise ignored. The two cases listed here exemplify the logic for handling the rest of the commands.

Here you can get a flavor of the kind of array arithmetic needed to manipulate the cursor. Remember that our two-dimensional screen is represented by a one-dimensional array of bytes, and we don't discard data that scrolls off the top of the screen until we run out of memory. We preserve as much as we can so users can scroll backward to see anything they might have missed. For this reason, the origin of the screen must be calculated relative to the end of the array rather than the beginning. Coordinates are one-based, so the origin is at row 1, column 1.

If we didn't support the scrollback feature, calculating the array index for a pair of coordinates would be as simple as y * columns + x (for zero-based x and y). Instead, we have to do a little more work and calculate the coordinates relative to the outer bound of our display buffer. If it makes our users happy, though, it's worth the extra effort.
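
The two calculations can be compared side by side. This is a sketch only: the class and method names are hypothetical, the row arithmetic mirrors the 'd' case shown earlier, and the column is assumed to be stored as a zero-based offset within its row, which may not match the project code exactly.

```java
// Hypothetical sketch; names and the zero-based column offset are assumptions.
class ScreenIndex
{
    /** Without scrollback: one-based (row, column) to a zero-based array index. */
    static int simpleIndex( int row, int column, int columns )
    {
        return ( row - 1 ) * columns + ( column - 1 );
    }

    /** With scrollback: the visible screen is the last `rows` rows before bound. */
    static int scrollbackIndex( int row, int column,
        int rows, int columns, int bound )
    {
        int screenStart = bound - rows * columns;
        return screenStart + simpleIndex( row, column, columns );
    }
}
```

For a 24-by-80 screen whose buffer currently holds 100 rows (bound = 8000), the origin (1,1) maps to index 6080 and the bottom-right corner (24,80) maps to index 7999. Everything below index 6080 is scrollback.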

Making It Interactive

What will make users even happier -- and won't require much more work -- is to allow them to interact with the remote host using the keys on their devices.

Earlier our application interacted with the remote host only by executing a scripted set of commands. The user could only wait for the operations to complete, then use the arrow keys to scroll the data buffer up and down to see what came back.

In the next version of MIDTerm we send the user's keystrokes directly to the remote host. To send anything at all we need an output stream, so TelnetCanvas now has a setOutputStream() method for this purpose. User input is handled by modifying the keyPressed() method as follows:

...
private byte[] move = new byte[] { 27, (byte) '[', 0 };
...

public void keyPressed( int keyCode )
{
    switch ( getGameAction( keyCode ) )
    {
        case LEFT:
            // move cursor left one column
            move[2] = 'D';
            send( move );
            break;
        case RIGHT:
            // move cursor right one column
            move[2] = 'C';
            send( move );
            break;
        case DOWN:
            if ( isScrolling() )
            {
                // scroll down one row
                scrollY++;
                if ( scrollY > calcLastVisibleScreen() )
                {
                    scrollY = calcLastVisibleScreen();
                }
                repaint();
            }
            else 
            {
                // move cursor down one row
                move[2] = 'B';
                send( move );
            }
            break;
        case UP:
            if ( isScrolling() )
            {
                // scroll up one row
                scrollY--;
                if ( scrollY < 0 ) scrollY = 0;
                repaint();
            }
            else 
            {
                // move cursor up one row
                move[2] = 'A';
                send( move );
            }
            break;
        case FIRE:
            // send a line feed:
            send( (byte) '\n' );
            break;
        default:
            // send code directly
            send( (byte) keyCode );
    }
}

The first thing to note is that we use getGameAction() to convert the key code to a game action before we decide which key was pressed. MIDP devices' keypad layouts differ: some have both arrow keys and number pads, some have only number pads that act like arrow keys. The getGameAction() method hides those complications.

If the key pressed was UP, DOWN, LEFT, or RIGHT, we generate the corresponding ANSI command and send it to the remote host. Note that UP and DOWN have two modes: one for moving the cursor and one for scrolling the display. The current scroll mode is discovered and controlled by two new methods, isScrolling() and setScrolling(). If scrolling is turned on, UP and DOWN scroll the output instead of moving the cursor. Note that we could save a little overhead by testing the scrolling variable directly instead of calling isScrolling().

The FIRE key sends a linefeed, which is analogous to the Enter or Return key on a conventional keyboard. This feature is great for menu-based applications where the arrow keys highlight an option and the Enter key selects the highlighted option; the Lynx web browser is a good example. Such applications are completely functional using only the keypad on the user's handset.

If the key code does not map to any of these game actions, it is sent directly to the remote host. The MIDP specification says devices with more keys than a standard phone handset should send equivalent ASCII characters as their key codes. Sending these key codes directly allows the application to take full advantage of devices that have full keyboards. Users with these devices can simply start typing; their keystrokes are sent over the connection as they would expect.
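
The send() methods called from keyPressed() are not listed in this article. A minimal sketch of what they might look like follows, assuming the stream supplied through setOutputStream() is a plain java.io.OutputStream. The class name Sender and the error handling are hypothetical.

```java
// Hypothetical sketch of send(); the project's implementation may differ.
class Sender
{
    private java.io.OutputStream output;

    void setOutputStream( java.io.OutputStream output )
    {
        this.output = output;
    }

    /** Writes the bytes and flushes so the keystroke leaves immediately. */
    void send( byte[] data )
    {
        if ( output == null ) return; // not connected yet
        try
        {
            output.write( data );
            output.flush();
        }
        catch ( java.io.IOException e )
        {
            // a real application would report this and close the connection
            e.printStackTrace();
        }
    }

    /** Convenience overload for a single byte, such as a key code. */
    void send( byte b )
    {
        send( new byte[] { b } );
    }
}
```

Flushing after every write matters for interactive use: without it, a buffered stream could hold a single keystroke until more data arrives.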

Finally, because the TelnetCanvas component hides all the terminal emulation logic from the rest of the application, we need the new methods getRows(), getColumns(), and getTerminalType() to advertise the screen dimensions and the kind of terminal emulation the implementation supports.

Updating the MIDlet

As before, the MIDlet class itself, MIDTerm, ties together TelnetConnection and TelnetCanvas. The setup of the connection changes only slightly:

...
connection = new TelnetConnection( 
    (StreamConnection) Connector.open( 
        connectString, Connector.READ_WRITE, true ),
    canvas.getColumns(), 
    canvas.getRows(), 
    canvas.getTerminalType() );
input = connection.openInputStream();
output = connection.openOutputStream();
canvas.setOutputStream( output );
...

But MIDTerm needs to do more. While TelnetCanvas handles basic user input, devices with numeric-only keypads need specialized means of entering text, such as repeated tapping on a key to choose each letter, or predictive text entry methods. The only way to take advantage of these native capabilities is to use MIDP's TextField and TextBox components. Because these components cannot be on the same screen as a Canvas, text entry requires a separate screen. While we're at it, the application should also provide a separate screen to set up the connection, with fields for the host name and port on which to connect. MIDTerm will act as the central broker to tie these three screens together.

The Form class is very flexible, so we don't need a separate subclass for each screen -- and that's welcome news. Most MIDP devices have limited space for application storage and some arbitrarily limit application size, usually to 32 KB or 64 KB, so size matters. Each class in the application adds at least half a kilobyte to the size of the JAR file, even after obfuscation, so avoid creating subclasses when you can.

The input form is going to be used often so it's created in MIDTerm's constructor and kept around for the duration of the application. This approach eliminates any perceptible delay in creating the form and avoids the memory churn of repeatedly allocating and garbage-collecting the form. The setup of the form is very straightforward:

...
inputForm = new Form( "Input" );

inputField = new TextField( null, "", 255, TextField.ANY );
inputForm.append( inputField );

inputOptions = new ChoiceGroup( null, Choice.MULTIPLE );
inputOptions.append( INPUT_ENTER, null );
inputOptions.append( INPUT_CTRL, null );
inputOptions.append( INPUT_ESC, null );
inputOptions.setSelectedIndex( 0, true ); // default true
inputForm.append( inputOptions );

scrollOptions = new ChoiceGroup( null, Choice.MULTIPLE );
scrollOptions.append( INPUT_SCROLL, null );
inputForm.append( scrollOptions ); 

inputForm.addCommand( okCommand );
inputForm.addCommand( cancelCommand );
inputForm.setCommandListener( this );
...

TelnetCanvas invokes this form any time the user wants to send some text. The first field is the TextField, and on most platforms it should receive the default focus. With a minimum of effort, the user can quickly invoke the form, enter some text, then select the OK command to send the data.

Because many terminal-based applications assume a terminal-style keyboard, the user needs a way to send special keystrokes like Return, Escape, and Control-key combinations. We use a ChoiceGroup for this purpose, set to MULTIPLE mode so we get checkboxes and not mutually exclusive radio buttons. Depending on two of the checkbox settings, the user's text is sent following an Escape character, or followed by a linefeed character. If the Send as CTRL option is selected, each of the characters is sent as if the Control key were pressed. (Control-key codes are calculated by converting a character to its upper-case equivalent and subtracting 64 from its value.) The Append ENTER option defaults to true because a modicum of usability testing shows that most text input is followed by a linefeed.
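
The effect of the three checkboxes on the outgoing bytes can be sketched as follows. The class InputEncoder and its method names are hypothetical; only the rules themselves (Escape first, linefeed last, Control codes computed as upper-case value minus 64) come from the description above.

```java
// Hypothetical sketch of how the input options shape the outgoing bytes.
class InputEncoder
{
    /** Control code: upper-case the character and subtract 64, so 'c' becomes 3. */
    static byte toControl( char c )
    {
        return (byte) ( Character.toUpperCase( c ) - 64 );
    }

    /** Applies the Escape, CTRL, and ENTER options to the user's text. */
    static byte[] encode( String text,
        boolean appendEnter, boolean asCtrl, boolean prefixEsc )
    {
        int length = text.length()
            + ( prefixEsc ? 1 : 0 ) + ( appendEnter ? 1 : 0 );
        byte[] out = new byte[ length ];
        int pos = 0;
        if ( prefixEsc ) out[ pos++ ] = 27;
        for ( int i = 0; i < text.length(); i++ )
        {
            char c = text.charAt( i );
            out[ pos++ ] = asCtrl ? toControl( c ) : (byte) c;
        }
        if ( appendEnter ) out[ pos++ ] = (byte) '\n';
        return out;
    }
}
```

For example, entering c with Send as CTRL checked and Append ENTER unchecked produces the single byte 3, the code a terminal keyboard sends for Control-C.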

Finally, this form is an appropriate place to allow the user to set the scroll mode of TelnetCanvas. This feature is the functional equivalent of the Scroll Lock key on a terminal keyboard, and the checkbox is labeled as such. For visual separation, this option goes in a separate ChoiceGroup.

The login form is rarely used more than twice in a typical application life-cycle, so we create it only on demand and dispose of it when it's dismissed. The creation and layout happen in onShowLogin():

public void onShowLogin()
{
    // create and populate login form
    loginForm = new Form( "Connect to" );
    hostField = 
        new TextField( "Host", host, 50, TextField.URL );
    loginForm.append( hostField );
    portField = 
        new TextField( "Port", port, 6, TextField.NUMERIC );
    loginForm.append( portField );
    loginForm.addCommand( exitCommand );
    loginForm.addCommand( openCommand );
    loginForm.setCommandListener( this );
    
    // show form
    display.setCurrent( loginForm );
}

While this is a simple form, usability is still our foremost concern. Some implementations can take advantage of the "hint" on a TextField (URL or NUMERIC) and customize the user interface appropriately. Because the user will typically enter the domain name of a server in Host, it makes sense to indicate that the user input expected is like a URL. Some devices may have a special screen layout optimized for letters and symbols rather than numbers. Likewise, the Port field should be limited to numeric input; some devices may allow the user to enter the numbers on their keypad quickly, bypassing any key-repeat text entry system.

The MIDlet is the command listener for the login form as well as for the input form and the Telnet canvas. Think of it as the conductor of the application's workflow.

On startup, the login form is presented, with options to Open or Exit. Once opened, the Telnet canvas appears, with options to Close or Input. The Input option shows the input form, with an OK option that hides the form and sends the text, and a Cancel option that just hides the form. The Close option hides the Telnet canvas and shows the login form again. Last, the Exit option calls notifyDestroyed() and quits the application. All of these commands are handled by MIDTerm's commandAction() method.
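
The workflow just described amounts to a small transition table. The sketch below models it in plain Java with hypothetical constants and command labels passed as strings; the real commandAction() method receives Command and Displayable objects and also performs the side effects (opening the connection, sending text), which are omitted here.

```java
// Hypothetical model of the screen workflow; not the project's commandAction().
class Workflow
{
    static final int LOGIN = 0, TERMINAL = 1, INPUT = 2, EXITED = 3;

    /** Returns the screen shown after a command is chosen on the current screen. */
    static int next( int screen, String command )
    {
        switch ( screen )
        {
            case LOGIN:
                if ( command.equals( "Open" ) ) return TERMINAL;
                if ( command.equals( "Exit" ) ) return EXITED;
                break;
            case TERMINAL:
                if ( command.equals( "Input" ) ) return INPUT;
                if ( command.equals( "Close" ) ) return LOGIN;
                break;
            case INPUT:
                // OK also sends the text; Cancel only hides the form
                if ( command.equals( "OK" ) || command.equals( "Cancel" ) )
                {
                    return TERMINAL;
                }
                break;
        }
        return screen; // unknown command: stay on the current screen
    }
}
```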

Let's Put It To Work

Now that all the pieces are in place, we're ready to put our terminal emulator to work. Terminal applications are widely used in corporate and educational computing environments for enterprise applications, software development, system administration -- even gaming, believe it or not. MIDTerm makes all these kinds of applications and resources accessible from your mobile device.

Software Development: emacs

System Administration: top

Gaming: starcross

Going further, any of these applications can be scraped: you can write a mobile application that interacts with a resource-intensive program on a remote server, extracting the output and rendering it in a user-friendly graphical interface. Enterprise developers commonly use this technique to craft modern user interfaces on legacy systems.
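
As a rough illustration of scraping, the sketch below pulls one visible row out of a display buffer as text. It assumes the buffer layout described earlier (a one-dimensional byte array with the visible screen occupying the last rows before bound) and treats zero bytes as blanks. The class ScreenScraper and its rowText() method are hypothetical, and TelnetCanvas as presented here does not expose its buffer this way.

```java
// Hypothetical sketch; assumes direct access to a display buffer like the one described.
class ScreenScraper
{
    /** Returns the text of a one-based visible row, with surrounding blanks trimmed. */
    static String rowText( byte[] buffer, int row,
        int rows, int columns, int bound )
    {
        int start = bound - ( rows - row + 1 ) * columns;
        StringBuffer sb = new StringBuffer();
        for ( int i = 0; i < columns; i++ )
        {
            byte b = buffer[ start + i ];
            sb.append( b == 0 ? ' ' : (char) b ); // unwritten cells read as spaces
        }
        return sb.toString().trim();
    }
}
```

A scraping application would call a method like this after each screen update, then match the returned strings against the prompts and labels it expects from the remote program.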

Working Telnet and ANSI terminal implementations on the MIDP platform remove significant barriers to these kinds of software projects.

Summary

We've updated the first article's rudimentary terminal display to emulate an ANSI terminal, by implementing support for ANSI's terminal escape sequences. The application also takes better advantage of MIDP's user interface capabilities, with support for keyboards and customized forms for user input. You can mix and match these software components to provide the foundation for a new class of network-aware mobile applications.
