What Does Java Provide? (Java Distributed Computing)

The original design motivations behind Java and its predecessor, Oak, were concerned mainly with reliability, simplicity, and architecture neutrality. Subsequently, as the potential for Java as an "Internet programming language" was seen by its developers at Sun Microsystems, support for networking, security, and multithreaded operations was incorporated or improved. All of these features of the Java language and environment also make for a very powerful distributed application development environment. This is, of course, no accident. The requirements for developing an Internet-based application overlap to a great extent with those of distributed application development.

In this section, we review some of the features of Java that are of particular interest in distributed applications, and how they help to address some of the issues described in the previous section.

1.3.1. Object-Oriented Environment

Java is a "pure" object-oriented language, in the sense that the smallest programmatic building block is a class. A data structure or function cannot exist or be accessed at runtime except as an element of a class definition. This results in a well-defined, structured programming environment in which all domain concepts and operations are mapped into class representations and transactions between them. This is advantageous for systems development in general, but also has benefits specifically for you as the distributed system developer. An object, as an instance of a class, can be thought of as a computing agent. Its level of sophistication as an autonomous agent is determined by the complexity of its methods and data representations, as well as its role within the object model of the system, and the runtime object community defining the distributed system. Distributing a system implemented in Java, therefore, can be thought of as simply distributing its objects in a reasonable way, and establishing networked communication links between them using Java's built-in network support. If you have the luxury of designing a distributed system from the ground up, then your object model and class hierarchy can be specified with distribution issues incorporated.

1.3.2. Abstract Interfaces

Java's support for abstract object interfaces is another valuable tool for developing distributed systems. An interface describes the operations, messages, and queries a class of objects is capable of servicing, without providing any information about how these abilities are implemented. If a class is declared as implementing a specified interface, then the class has to implement the methods specified in the interface. The advantage of implementation-neutral interfaces is that other agents in the system can be implemented to talk to the specified interface without knowing how the interface is actually implemented in a class. By insulating the class implementation from those using the interface, we can change the implementation as needed. If a class needs to be moved to a remote host, then the local implementation of the interface can act as a surrogate or stub, forwarding calls to the interface over the network to the remote class.
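
To make the surrogate idea concrete, here is a minimal sketch. The names Greeter, LocalGreeter, GreeterStub, and Transport are hypothetical and are not part of the Java API or of the examples in this book; the Transport interface merely stands in for a real network link, which a production stub would implement with sockets or RMI.

```java
// Hypothetical interface: callers depend only on this contract.
interface Greeter {
    String greet(String name);
}

// A local implementation that does the work in-process.
class LocalGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Assumed stand-in for a network link; a real stub would write to a socket.
interface Transport {
    String send(String request);
}

// A surrogate that implements the same interface but forwards each call.
class GreeterStub implements Greeter {
    private final Transport link;

    GreeterStub(Transport link) { this.link = link; }

    public String greet(String name) {
        return link.send("GREET " + name);
    }
}

class InterfaceDemo {
    // Client code is written once, against the interface only.
    static String welcome(Greeter g) {
        return g.greet("world");
    }

    // Simulates the "remote" side: parses the request, calls the real object.
    static Transport loopback(final Greeter target) {
        return new Transport() {
            public String send(String request) {
                return target.greet(request.substring("GREET ".length()));
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(welcome(new LocalGreeter()));
        System.out.println(welcome(new GreeterStub(loopback(new LocalGreeter()))));
    }
}
```

The welcome() method cannot tell which implementation it was handed, which is exactly the property that lets an object move to another host without changes to its callers.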

Abstract interfaces are a powerful part of the Java language and are used to implement critical elements of the Java API. The platform independence of the Abstract Windowing Toolkit (AWT) is accomplished using abstract component interfaces that are implemented on each platform using the native windowing system (the X Window System, Macintosh, Windows, etc.). Certain key packages in the core Java API, such as the java.security package, also make use of interfaces to allow for specialized implementations by third-party vendors. The Java Remote Method Invocation (RMI) package uses abstract interfaces to define local stubs for remote objects. The concept of abstract interfaces is also common in other distributed object systems such as CORBA, in which interfaces defined in Interface Definition Language (IDL) define the object model of a CORBA system.[2] The Inter-Language Unification system (ILU), developed at Xerox PARC, also depends upon an implementation-neutral interface language called Interface Specification Language (ISL).[3]

1.3.3. Platform Independence

Code written in Java can be compiled into platform-independent bytecodes using Sun's Java compiler, or any of the many third-party Java compilers now on the market. These bytecodes run on the Java Virtual Machine, a virtual hardware architecture which is implemented in software running on a "real" machine and its operating system. Java bytecodes can be run on any platform with a Java Virtual Machine. At the time of this writing, a Java VM is available for most major Unix variants, OS/2, Windows95 and NT, MacOS, and a variety of other operating systems.

This is a major boon for you, since it allows virtually any available PC or workstation to be home to an agent in a distributed system. Once the elements of the system have been specified using Java classes and compiled into Java bytecodes, they can migrate without recompilation to any of the hosts available. This makes for easy data- and load-balancing across the network. There is even support in the Java API for downloading a class definition (its bytecodes) through a network connection, creating an instance of the class, and incorporating the new object into the running process. This is possible because Java bytecodes are runnable on the Java Virtual Machine, which is guaranteed to be underneath any Java application or applet.

1.3.4. Fault Tolerance Through Exception Handling

Java supports throwing and catching errors and exceptions, both system-defined and application-defined. Any method can throw an exception; it is the calling method's responsibility to handle the exception, or propagate the exception up the calling chain. Handling an exception is a matter of wrapping any potential exception-causing code with a try/catch/finally statement, where each catch clause handles a particular type of exception. If a method chooses to ignore particular exceptions, then it must declare that it throws the exceptions it is ignoring. When a called method generates an exception, it will be propagated up the calling chain to be handled by a catch clause in a calling method, or, if not, to result in a stack dump and exit from the Java process. After all is said and done, whether the try block runs to completion without a problem, or an exception gets thrown, the code in the finally block is always called. So you can use the finally block to clean up any resources you created in the try block, for example, and be sure that the cleanup will take place whether an exception is thrown or not.
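
The control flow described above can be traced with a small sketch. The FinallyDemo class and its run() method are illustrative names only; the method records which blocks execute so that the two paths (normal completion and a thrown exception) can be compared.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDemo {
    // Records the order in which the blocks run, so the flow can be inspected.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<String>();
        try {
            log.add("try");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            log.add("completed");
        } catch (IllegalStateException e) {
            // Handles only this particular type of exception
            log.add("catch: " + e.getMessage());
        } finally {
            // Runs on both paths; this is where resources would be released
            log.add("finally");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [try, completed, finally]
        System.out.println(run(true));  // [try, catch: simulated failure, finally]
    }
}
```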

An agent can be written to handle the exceptions that can be thrown by each method it calls. Additionally, since any subclass of java.lang.Throwable can be declared in a method's throws clause, an application can define its own types of exceptions to indicate specific abnormalities. Since an exception is represented as an object in the Java environment, these application-specific exceptions can carry with them data and methods that can be used to characterize, diagnose, and potentially recover from them.
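
As a sketch of an application-defined exception that carries diagnostic data, consider the following. AgentUnavailableException, its fields, and the contact() method are hypothetical names invented for this illustration; the retry interval of 30 seconds is an arbitrary value.

```java
// Hypothetical application-defined exception that carries diagnostic data.
class AgentUnavailableException extends Exception {
    private final String host;
    private final int retryAfterSeconds;

    AgentUnavailableException(String host, int retryAfterSeconds) {
        super("Agent on " + host + " is unavailable");
        this.host = host;
        this.retryAfterSeconds = retryAfterSeconds;
    }

    String getHost() { return host; }
    int getRetryAfterSeconds() { return retryAfterSeconds; }
}

class ExceptionDemo {
    // Declares the exception in its throws clause instead of handling it.
    static String contact(String host, boolean up)
            throws AgentUnavailableException {
        if (!up) {
            throw new AgentUnavailableException(host, 30);
        }
        return "connected to " + host;
    }

    // The caller uses the data on the exception to decide how to recover.
    static String contactOrReport(String host, boolean up) {
        try {
            return contact(host, up);
        } catch (AgentUnavailableException e) {
            return e.getHost() + ": retry in " + e.getRetryAfterSeconds() + "s";
        }
    }

    public static void main(String[] args) {
        System.out.println(contactOrReport("hostA", true));
        System.out.println(contactOrReport("hostA", false));
    }
}
```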

1.3.5. Network Support

The Java API includes multilevel support for network communications. Low-level sockets can be established between agents, and data communication protocols can be layered on top of the socket connection. The java.io package contains several stream classes intended for filtering and preprocessing input and output data streams. APIs built on top of the basic networking support in Java provide higher-level networking capabilities, such as distributed objects, remote connections to database servers, directory services, etc.
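
The layering of stream classes can be seen without any network at all. In the sketch below, an in-memory byte array stands in for the socket (an assumption made to keep the example self-contained), and StreamLayeringDemo is an illustrative name. The same DataOutputStream and DataInputStream wrappers could be placed over the streams returned by a Socket.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class StreamLayeringDemo {
    // Writes a command and a count through layered streams, then reads them back.
    static String roundTrip(String command, int count) {
        try {
            // Bottom layer: raw bytes (a socket's OutputStream in a real system)
            ByteArrayOutputStream raw = new ByteArrayOutputStream();
            // Middle layer adds buffering; top layer adds typed writes
            DataOutputStream out =
                new DataOutputStream(new BufferedOutputStream(raw));
            out.writeUTF(command);
            out.writeInt(count);
            out.flush(); // push buffered bytes down to the underlying stream

            // The reading side mirrors the same layers in the same order
            DataInputStream in = new DataInputStream(
                new BufferedInputStream(
                    new ByteArrayInputStream(raw.toByteArray())));
            return in.readUTF() + " x" + in.readInt();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("GET", 3)); // GET x3
    }
}
```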

While the majority of this book will be concerned with the use of distributed object schemes like RMI, along with other higher-level networking APIs, it's also important to get a feeling for the basic networking capabilities included in the core Java API. Figure 1-1 shows a simple network application involving a client and a server; the client sends commands to the server, the server executes the commands and sends responses back to the client. To demonstrate the network support in Java and how it can be exploited for distributed applications, Examples 1-1 through 1-4 show an implementation of this simple client-server system using sockets and input/output streams. The implementation includes the following elements:

Figure 1-1. A simple client/server system

The client connects to the server over a socket, then sends commands to the server over the socket. The server uses the specialized DataInputStream to read the commands from the socket. The input stream automatically creates the right command object based on the type of message from the client (e.g., a "GET" message will be converted into a GetCmd object). The server then executes the command and sends the result to the client over the socket.

Example 1-1 shows a set of classes that represent the commands a client can send to our server. The SimpleCmd class simply holds a single String argument and has an abstract Do() method that subclasses will implement to do the right thing for the particular command they represent. Our protocol consists of three basic commands: "GET," "HEAD," and "POST,"[4] along with a command to close the connection, "DONE." The GetCmd, HeadCmd, PostCmd, and DoneCmd classes derived from SimpleCmd represent these commands.

Example 1-1. Commands for the Client-Server System

package dcj.examples;

import java.lang.*;

abstract class SimpleCmd
{
  protected String arg;

  public SimpleCmd(String inArg) {
    arg = inArg;
  }

  public abstract String Do();
}

class GetCmd extends SimpleCmd
{
  public GetCmd(String s) { super(s); }

  public String Do() {
    String result = arg + " Gotten\n";
    return result;
  }
}

class HeadCmd extends SimpleCmd
{
  public HeadCmd(String s) { super(s); }
  public String Do() {
    String result = "Head \"" + arg + "\" processed.\n";
    return result;
  }
}

class PostCmd extends SimpleCmd
{
  public PostCmd(String s) { super(s); }

  public String Do() {
    String result = arg + " Posted\n";
    return result;
  }
}

class DoneCmd extends SimpleCmd
{
  public DoneCmd() { super(""); }
  public String Do() {
    String result = "All done.\n";
    return result;
  }
}

The classes in Example 1-1 represent the communication protocol for our client-server application, and the SimpleCmdInputStream class in Example 1-2 acts as the communication link that understands this protocol. The SimpleCmdInputStream is a subclass of java.io.DataInputStream that adds a readCommand() method to its interface. This method parses the data coming in over the stream, determines which command is being sent, and constructs the corresponding command class from Example 1-1.

Example 1-2. A Specialized DataInputStream

package dcj.examples;

import java.lang.*;
import java.io.*;
import java.net.*;

public class SimpleCmdInputStream extends DataInputStream
{
  public SimpleCmdInputStream(InputStream in) {
    super(in);
  }

  public String readString() throws IOException {
    StringBuffer strBuf = new StringBuffer();
    boolean hitSpace = false;
    while (!hitSpace) {
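      // The client writes its commands as single-byte ASCII text, so read
      // one byte at a time; readChar() would consume two bytes per character
      char c = (char) readUnsignedByte();
      hitSpace = Character.isWhitespace(c);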
      if (!hitSpace)
        strBuf.append(c);
    }

    String str = new String(strBuf);
    return str;
  }

  public SimpleCmd readCommand() throws IOException {
    SimpleCmd cmd;
    String commStr = readString();
    if (commStr.compareTo("HEAD") == 0)
      cmd = new HeadCmd(readString());
    else if (commStr.compareTo("GET") == 0)
      cmd = new GetCmd(readString());
    else if (commStr.compareTo("POST") == 0)
      cmd = new PostCmd(readString());
    else if (commStr.compareTo("DONE") == 0)
      cmd = new DoneCmd();
    else
      throw new IOException("Unknown command.");

    return cmd;
  }
}

Finally, the SimpleClient in Example 1-3 and SimpleServer in Example 1-4 serve as the client and server agents in our distributed system. Our SimpleClient is very simple indeed. In its constructor, it opens a socket to a server on a given host and port number. Its main() method makes a SimpleClient object using command-line arguments that specify the host and port to connect to, then calls the sendCommands() method on the client. This method just sends a few commands in the right format to the server over the OutputStream from the socket connection.

Notice that the client's socket is closed in its finalize() method. This method will only get called after all references to the client are gone and the garbage collector has determined that the object is finalizable; the runtime makes no promise about when, or even whether, that happens before the process exits. If it's important that the socket be closed immediately after the client is done with it, you may want to close the socket explicitly at the end of the sendCommands() method.
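
A minimal sketch of the explicit alternative follows. ExplicitCloseDemo is an illustrative name, and the socket is deliberately left unconnected so the example needs no server; the client in Example 1-3 would use new Socket(host, port) instead. Closing in a finally block means the socket is released whether or not the exchange throws an exception.

```java
import java.io.IOException;
import java.net.Socket;

class ExplicitCloseDemo {
    // Returns true if the socket was closed by the time the method returns.
    static boolean useAndClose() {
        // Assumption: an unconnected socket keeps this sketch free of
        // network setup; a real client would connect to a host and port.
        Socket conn = new Socket();
        try {
            // ... send commands and read replies here ...
        } finally {
            // Close deterministically instead of waiting for finalize()
            try { conn.close(); }
            catch (IOException e) {
                System.out.println("Error closing socket: " + e);
            }
        }
        return conn.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("Socket closed: " + useAndClose());
    }
}
```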

Example 1-3. A Simple Client

package dcj.examples;

import java.lang.*;
import java.net.*;
import java.io.*;

public class SimpleClient
{
  // Our socket connection to the server
  protected Socket serverConn;

  // The input command stream from the server
  protected SimpleCmdInputStream inStream;

  public SimpleClient(String host, int port)
      throws IllegalArgumentException {
    try {
      System.out.println("Trying to connect to " + host + " " + port);
      serverConn = new Socket(host, port);
    }
    catch (UnknownHostException e) {
      throw new IllegalArgumentException("Bad host name given.");
    }
    catch (IOException e) {
      System.out.println("SimpleClient: " + e);
      System.exit(1);
    }

    System.out.println("Made server connection.");
  }

  public static void main(String argv[]) {
    if (argv.length < 2) {
      System.out.println("Usage: java SimpleClient [host] [port]");
      System.exit(1);
    }
    
    String host = argv[0];
    int port = 3000;
    try {
      port = Integer.parseInt(argv[1]);
    }
    catch (NumberFormatException e) {}
    
    SimpleClient client = new SimpleClient(host, port);
    client.sendCommands();
  }
  
  public void sendCommands() {
    try {
      OutputStreamWriter wout =
        new OutputStreamWriter(serverConn.getOutputStream());
      BufferedReader rin = new BufferedReader(
        new InputStreamReader(serverConn.getInputStream()));

      // Send a GET command...
      wout.write("GET goodies ");
      // ...flush it, since the writer buffers its output...
      wout.flush();
      // ...and receive the results
      String result = rin.readLine();
      System.out.println("Server says: \"" + result + "\"");

      // Now try a POST command
      wout.write("POST goodies ");
      wout.flush();
      // ...and receive the results
      result = rin.readLine();
      System.out.println("Server says: \"" + result + "\"");

      // All done, tell the server so
      wout.write("DONE ");
      wout.flush();
      result = rin.readLine();
      System.out.println("Server says: \"" + result + "\"");
    }
    catch (IOException e) {
      System.out.println("SimpleClient: " + e); 
      System.exit(1);
    }
  }

  public synchronized void finalize() {
    System.out.println("Closing down SimpleClient...");
    try { serverConn.close(); }
    catch (IOException e) {
      System.out.println("SimpleClient: " + e);
      System.exit(1);
    }
  }
}

The SimpleServer class has a constructor that binds itself to a given port, and a listen() method that continually checks that port for client connections. Its main() method creates a SimpleServer for a port specified with command-line arguments, then calls the server's listen() method. The listen() method loops continuously, waiting for a client to connect to its port. When a client connects, the server creates a Socket to the client, then calls its serviceClient() method to parse the client's commands and act on them. The serviceClient() method takes the InputStream from the client socket, and wraps our SimpleCmdInputStream around it. Then the method loops, calling the readCommand() method on the stream to get the client's commands. Each command is read from the stream, its Do() method is called, and the string returned from the Do() call is sent back to the client over the OutputStream from the client socket. Once the client sends a DONE command and its response has been sent, the loop stops and the method returns.

Example 1-4. A Simple Server

package dcj.examples;

import java.net.*;
import java.io.*;
import java.lang.*;

// A generic server that listens on a port and connects to any clients it
// finds. It services one client at a time; to service several ports at
// once, an application could run each SimpleServer in its own thread.

public class SimpleServer
{
  protected int portNo = 3000; // Port to listen to for clients
  protected ServerSocket clientConnect;

  public SimpleServer(int port) throws IllegalArgumentException {
    if (port <= 0)
      throw new IllegalArgumentException(
                  "Bad port number given to SimpleServer constructor.");

    // Try making a ServerSocket to the given port
    System.out.println("Connecting server socket to port...");
    try { clientConnect = new ServerSocket(port); }
    catch (IOException e) {
      System.out.println("Failed to connect to port " + port);
      System.exit(1);
    }

    // Made the connection, so set the local port number
    this.portNo = port;
  }

  public static void main(String argv[]) {
    int port = 3000;
    if (argv.length > 0) {
      int tmp = port;
      try {
        tmp = Integer.parseInt(argv[0]);
      }
      catch (NumberFormatException e) {}

      port = tmp;
    }
    
    SimpleServer server = new SimpleServer(port);
    System.out.println("SimpleServer running on port " + port + "...");
    server.listen();
  }

  public void listen() {
    // Listen to port for client connection requests.
    try {
      System.out.println("Waiting for clients...");
      while (true) {
        Socket clientReq = clientConnect.accept();
        System.out.println("Got a client...");
        serviceClient(clientReq);
      }
    }
    catch (IOException e) {
      System.out.println("IO exception while listening for clients.");
      System.exit(1);
    }
  }

  public void serviceClient(Socket clientConn) {
    SimpleCmdInputStream inStream = null;
    DataOutputStream outStream = null;
    try {
      inStream = new SimpleCmdInputStream(clientConn.getInputStream());
      outStream = new DataOutputStream(clientConn.getOutputStream());
    }
    catch (IOException e) {
      System.out.println("SimpleServer: Error getting I/O streams.");
      // Without the streams there is nothing to service
      return;
    }
    
    SimpleCmd cmd = null;
    System.out.println("Attempting to read commands...");
    while (cmd == null ||
           !(cmd instanceof DoneCmd)) {
      try { cmd = inStream.readCommand(); }
      catch (IOException e) {
        System.out.println("SimpleServer: " + e);
        System.exit(1);
      }

      if (cmd != null) {
        String result = cmd.Do();
        try { outStream.writeBytes(result); }
        catch (IOException e) {
          System.out.println("SimpleServer: " + e);
          System.exit(1);
        }
      }
    }
  }
  
  public synchronized void finalize() {
    System.out.println("Shutting down SimpleServer running on port "
                       + portNo);
  }
}

We could easily adapt this simple communication scheme to other applications with different protocols. We would just need to define new subclasses of SimpleCmd, and update our SimpleCmdInputStream to parse them correctly. If we wanted to get exotic, we could expand our communication scheme to implement a "meta-protocol" between agents in the system. The first piece of information passed between two agents when they establish a socket connection would be the protocol they want to use to communicate with each other. Using the class download capabilities mentioned in the previous section, we could actually load a subclass of java.io.InputStream over the newly created socket, create an instance of the class, and attach it to the socket itself. We won't indulge ourselves in this exotic exercise in this chapter, however.

What all of this demonstrates is that Java's network support provides a quick way to develop the communication elements of a basic distributed system. Java's other core features, such as platform-independent bytecodes, facilitate the development of more complex network transactions, such as agents dynamically building a protocol for talking to each other by exchanging class definitions. The core Java API also includes built-in support for sharing Java objects between remote agents, with its RMI package. Objects that implement the java.io.Serializable interface can be converted to byte streams and transmitted over a network connection to a remote Java process, where they can be "reconstituted" into copies of the original objects. Other packages are available for using CORBA to distribute objects within a Java distributed application. We'll discuss both methods for distributed Java objects in later chapters.
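
The conversion of a Serializable object to bytes and back can be sketched as follows. Position and SerializationDemo are hypothetical names, and a byte array stands in for the network connection; in a distributed system the ObjectOutputStream and ObjectInputStream would wrap the streams of a socket instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical data object; implementing Serializable is all that is required.
class Position implements Serializable {
    private static final long serialVersionUID = 1L;
    final String agent;
    final int x;
    final int y;

    Position(String agent, int x, int y) {
        this.agent = agent;
        this.x = x;
        this.y = y;
    }
}

class SerializationDemo {
    // Converts the object to the byte stream that would cross the network.
    static byte[] toBytes(Serializable obj) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(buf);
            out.writeObject(obj);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // "Reconstitutes" a copy of the original object on the receiving side.
    static Object fromBytes(byte[] data) {
        try {
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(data));
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Position original = new Position("agent-1", 3, 4);
        Position copy = (Position) fromBytes(toBytes(original));
        System.out.println(copy.agent + " at " + copy.x + "," + copy.y);
        System.out.println("Same object? " + (copy == original));
    }
}
```

Note that the receiver ends up with a copy, not a reference to the sender's object; changes made to one are not seen by the other.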

1.3.6. Security

Java provides two dimensions of security for distributed systems: a secure local runtime environment, and the ability to engage in secure remote transactions.

1.3.6.1. Runtime environment

At the same time that Java facilitates the distribution of system elements across the network, it makes it easy for the recipient of these system elements to verify that they can't compromise the security of the local environment. If Java code is run in the context of an applet, then the Java Virtual Machine places rather severe restrictions on its operation and capabilities. It's allowed virtually no access to the local file system, very restricted network access (e.g., it can only open a network connection back to the server it was loaded from), no access to local code or libraries outside of the Java environment, and restricted thread manipulation capabilities, among other things. In addition, any class definitions loaded over the network, whether from a Java applet or a Java application, are subjected to a stringent bytecode verification process, in which the syntax and operations of the bytecodes are checked for incorrect or potentially malicious behavior.

1.3.6.2. Secure remote transactions

In Section 1.3.5, "Network Support", we demonstrated how Java simplifies the creation, manipulation, and extension of network communications sockets. This capability of the environment makes it easy to add user authentication and data encryption to establish secure network links, assuming that the basic encryption and authentication algorithms already exist. Suppose, for example, that we wanted to use public key encryption to establish secure, authenticated connections to named agents on remote machines. We can extend the BufferedInputStream and BufferedOutputStream classes in java.io to authenticate and decrypt incoming data, and to sign and encrypt outgoing data. Example 1-5 displays the encrypted input stream.

Example 1-5. Encrypted Input Stream

import java.io.*;

public abstract class EncryptedInputStream extends BufferedInputStream
{
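    // ID of the entity expected at the other end of the stream, or null
    // if the key ID and signature will be embedded in the incoming data
    protected String id;

    // Assumes the key ID and signature will be embedded
    // in the incoming data
    public EncryptedInputStream(InputStream in) {
        this(in, null);
    }

    // Will only allow communication once identified
    // entity is authenticated with a public key
    public EncryptedInputStream(InputStream in, String id) {
        super(in);
        this.id = id;
    }

    // Protected methods, left for a concrete subclass to implement
    protected abstract int decrypt(int b) throws SecurityException;
    protected abstract int decrypt(byte[] b, int off, int len) 
        throws SecurityException;
    protected int decrypt(byte[] b) throws SecurityException
    {
        return decrypt(b, 0, b.length);
    }
    
    // Public methods
    public int read() throws IOException, SecurityException
    {
        return decrypt(super.read());
    }

    public int read(byte[] b) throws IOException, SecurityException
    {
        // The inherited read(byte[]) already calls read(b, 0, b.length),
        // so delegate to it directly to avoid decrypting the data twice
        return read(b, 0, b.length);
    }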

    public int read(byte[] b, int off, int len)
        throws IOException, SecurityException
    {
        super.read(b, off, len);
        return decrypt(b, off, len);
    }
}

Of course, the example is greatly simplified by the fact that we haven't actually implemented the EncryptedInputStream.decrypt() methods, which are at the heart of the matter, since they actually detect key IDs and signatures, look up public keys on some key list in memory or on disk, and decrypt the incoming data stream once the agent at the other end has been authenticated. We've also avoided the issues of data expansion or compression caused by encryption. When an EncryptedInputStream is asked to read n bytes of data, the intention is typically to read n decrypted bytes. Any change in data size would have to be made opaque by the decrypt() methods.

Once we establish an encrypted communications stream with the remote agent, we can layer any kind of data protocol we like on top of it. For example, our simple GET/HEAD/POST messaging scheme from an earlier example could be carried out securely by simply putting an encrypted input/output stream pair underneath the command stream.
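
To illustrate the layering only, the sketch below puts a toy stream pair underneath a text protocol. XorInputStream and XorOutputStream are hypothetical classes invented for this illustration, not the EncryptedInputStream of Example 1-5, and XOR with a fixed byte is emphatically not real encryption; it simply makes the transformation visible. A byte array stands in for the socket.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

// Toy "cipher" output stream: XORs each byte with a fixed key. NOT secure.
class XorOutputStream extends FilterOutputStream {
    private final int key;
    XorOutputStream(OutputStream out, int key) { super(out); this.key = key & 0xFF; }

    // FilterOutputStream routes array writes through this method
    public void write(int b) throws IOException {
        out.write((b ^ key) & 0xFF);
    }
}

// Matching input stream: XOR with the same key restores the original bytes.
class XorInputStream extends FilterInputStream {
    private final int key;
    XorInputStream(InputStream in, int key) { super(in); this.key = key & 0xFF; }

    public int read() throws IOException {
        int b = in.read();
        return (b < 0) ? b : (b ^ key); // pass end-of-stream through untouched
    }

    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        for (int i = 0; i < n; i++) buf[off + i] ^= key;
        return n;
    }
}

class SecureLayerDemo {
    // Sender: the text protocol sits on top, the transforming stream underneath.
    static byte[] send(String line, int key) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            Writer w = new OutputStreamWriter(
                new XorOutputStream(wire, key), StandardCharsets.US_ASCII);
            w.write(line + "\n");
            w.flush();
            return wire.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Receiver: the same layers in reverse recover the command line.
    static String receive(byte[] wire, int key) {
        try {
            BufferedReader r = new BufferedReader(new InputStreamReader(
                new XorInputStream(new ByteArrayInputStream(wire), key),
                StandardCharsets.US_ASCII));
            return r.readLine();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = send("GET goodies", 0x5A);
        System.out.println(receive(wire, 0x5A)); // GET goodies
    }
}
```

The code that writes and reads the command text does not change at all; only the stream handed to it does.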

Of course, this assumes that the agent at the other end of the socket has been suitably augmented to encrypt and decrypt streams.

These examples have simply alluded to how the basic network capabilities of Java could be extended to support secure network communications. The java.security package provides a framework for implementing the authentication and encryption algorithms needed to complete our secure input stream example. The authentication process could be implemented using the KeyPair and Signature classes, for example. We'll discuss the java.security API in more detail in a later chapter.
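
As a rough sketch of how KeyPair and Signature fit together, the following signs a message with a private key and verifies it with the matching public key. SignatureDemo is an illustrative name, and the choice of RSA with a 2048-bit key and the "SHA256withRSA" algorithm is an assumption made for the example, not something prescribed by the text.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

class SignatureDemo {
    // Generates a fresh public/private key pair (algorithm choice is assumed).
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // The sender signs outgoing data with its private key.
    static byte[] sign(PrivateKey key, byte[] message) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(message);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // The receiver checks the signature against the sender's public key.
    static boolean verify(PublicKey key, byte[] message, byte[] signature) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(key);
            s.update(message);
            return s.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newKeyPair();
        byte[] msg = "GET goodies".getBytes(StandardCharsets.US_ASCII);
        byte[] sig = sign(pair.getPrivate(), msg);
        System.out.println("Valid: " + verify(pair.getPublic(), msg, sig));
    }
}
```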

1.3.7. Multithreading Support

The ability to generate multithreaded agents is a fundamental feature of Java. Any class that you create can extend the java.lang.Thread class by providing its own implementation of a run() method. When the thread is started, this run() method will be called and your class can do its work within a separate thread of control. This is one way to delegate tasks to threads in a given agent; another is to have your workhorse classes implement java.lang.Runnable, and allocate them as needed to threads or thread groups. Any class implementing the Runnable interface (which essentially means providing a run() method that represents the body of work to be done in the thread) can be wrapped with a thread by simply creating a new Thread with the Runnable object as the argument.
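
The two approaches look like this side by side. CounterThread, CounterTask, and ThreadDemo are illustrative names; each unit of work simply sums the integers from 1 to 10 so the result is easy to check.

```java
// Approach 1: extend Thread and override run().
class CounterThread extends Thread {
    int total;

    public void run() {
        for (int i = 1; i <= 10; i++) total += i;
    }
}

// Approach 2: implement Runnable and hand the object to a Thread.
class CounterTask implements Runnable {
    int total;

    public void run() {
        for (int i = 1; i <= 10; i++) total += i;
    }
}

class ThreadDemo {
    // Runs both kinds of worker and returns the sum of their results.
    static int runBoth() {
        CounterThread t1 = new CounterThread();
        CounterTask task = new CounterTask();
        Thread t2 = new Thread(task); // wrap the Runnable with a thread

        t1.start(); // start() arranges for run() to execute in a new thread
        t2.start();
        try {
            // join() waits for each thread to finish, after which the
            // totals it computed are safely visible to this thread
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t1.total + task.total;
    }

    public static void main(String[] args) {
        System.out.println(runBoth()); // 110
    }
}
```

Implementing Runnable leaves the class free to extend something other than Thread, which is often the deciding factor between the two.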

Java also lets you tweak the performance of a given agent through control and manipulation of its threads. Threads are assigned priorities that are publicly pollable and settable, giving you (or even some intelligent agent) the ability to suggest how processing time is allocated to threads by the Virtual Machine. Threads can also be made to yield to other threads, to sleep for some period of time, to suspend indefinitely, or to go away altogether. These kinds of operations become important, for example, in asynchronous systems, in which a thread is tasked with client polling and spawns new threads to service client requests.
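
A minimal sketch of that polling pattern follows. PollingDemo and its serve() method are hypothetical, and an array of strings stands in for incoming client requests. The priorities set here are only hints to the Virtual Machine's scheduler, so the order in which the workers finish is not guaranteed; the results are sorted before being returned for that reason.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PollingDemo {
    // A polling thread spawns one worker thread per "request".
    static List<String> serve(final String[] requests) {
        final List<String> handled =
            Collections.synchronizedList(new ArrayList<String>());

        Thread poller = new Thread(new Runnable() {
            public void run() {
                List<Thread> workers = new ArrayList<Thread>();
                try {
                    for (final String req : requests) {
                        Thread worker = new Thread(new Runnable() {
                            public void run() { handled.add("served " + req); }
                        });
                        // A hint to the scheduler, not a guarantee
                        worker.setPriority(Thread.MIN_PRIORITY);
                        worker.start();
                        workers.add(worker);
                        Thread.sleep(5); // pause between polls
                    }
                    for (Thread w : workers) w.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        poller.setPriority(Thread.MAX_PRIORITY);
        poller.start();
        try { poller.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }

        // Sort so the result does not depend on thread scheduling
        List<String> result = new ArrayList<String>(handled);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(serve(new String[] { "client-2", "client-1" }));
    }
}
```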