pjain@dre.vanderbilt.edu and schmidt@dre.vanderbilt.edu
Department of Computer Science
Washington University, St Louis
This work is supported in part by a grant from Siemens Medical Engineering, Erlangen, Germany. The article appeared in the January 1997 C++ Report magazine. Java has improved quite a bit over the years and many of the problems identified in this article have been fixed in later versions of Java.
This article presents our experiences applying the Java language and its run-time features to convert a large C++ communication software framework to Java. We describe the key benefits and limitations of Java we found while converting our C++ framework to Java. Some of Java's limitations arise from differences between language constructs in C++ and Java. Other limitations are due to weaknesses with Java itself. The article explains various techniques we adopted to work around Java's limitations and to mask the differences between Java and C++. If you're converting programs written in C++ to Java (or more likely, if you're converting programmers trained in C++ to Java), you'll invariably face the same issues that we did. Additional insights on this topic are available from Taligent.
The Java team at Sun Microsystems intentionally designed the syntax of Java to resemble C++. Their goal was to make it easy for C++ developers to learn Java. Despite some minor differences (e.g., new keywords for inheritance and RTTI), C++ programmers will be able to ``parse'' Java syntax easily. It took us about a day to feel comfortable reading and writing simple Java examples.
Understanding the syntactic constructs of the language, however, is just one aspect of learning Java. To write non-trivial programs, C++ developers must understand Java's semantics (e.g., copy-by-reference assignment of non-primitive types and type-safe exception handling), language features (e.g., multi-threading, interfaces, and garbage collection), and standard library components (e.g., the AWT and sockets). Moreover, to truly master Java requires a thorough understanding of its idioms and patterns [Lea:96], many of which are different from those found in C++ [Coplien:92].
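As a small illustration of the copy-by-reference semantics mentioned above, the following sketch contrasts assignment of an object with assignment of a primitive. The Counter and ReferenceSemanticsDemo classes are hypothetical names introduced only for this example; they are not part of ACE.

```java
// Hypothetical class used only to demonstrate reference semantics.
class Counter {
    int value;
}

class ReferenceSemanticsDemo {
    // Assigning an object copies the reference, so both names
    // refer to the same Counter.
    static int aliasDemo() {
        Counter a = new Counter();
        Counter b = a;      // b now refers to the same object as a
        b.value = 42;       // the change is visible through a as well
        return a.value;
    }

    // Primitive types are still copied by value.
    static int primitiveDemo() {
        int x = 1;
        int y = x;          // y is an independent copy of x
        y = 42;
        return x;
    }

    public static void main(String[] args) {
        System.out.println(aliasDemo());      // prints 42
        System.out.println(primitiveDemo());  // prints 1
    }
}
```

A C++ programmer would get the first behavior only by using pointers or references explicitly; in Java it is the default for every non-primitive type.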
Figure 1. The C++ ACE Architecture
The ACE source code release contains over 85,000 lines of C++. ACE has been ported to Win32, most versions of UNIX, VxWorks, and MVS OpenEdition. Approximately 9,000 lines of code (i.e., about 10% of the total toolkit) are devoted to the OS Adaptation Layer, which shields the higher layers of ACE from OS platform dependencies.
As part of a project with the Siemens Medical Engineering group in Erlangen, Germany, we've converted most of ACE from C++ to Java. The primary motivation for converting ACE to Java was to enhance Java's built-in support for networking and concurrent programming functionality. Many ACE components (such as Task, Timer Queue, Thread Manager, and Service Configurator), along with traditional concurrency primitives (such as Semaphore, Mutex, and Barrier) were converted from C++ to the Java version of ACE. These ACE components provide a rich set of constructs layered atop Java's existing libraries. In addition, the Java version of ACE includes implementations of several distributed services (such as a Time Service, Naming Service, and Logging Service).
The following figure illustrates the architecture of the Java version of ACE:
Figure 2. The Java ACE Architecture
The Java version of ACE shown in Figure 2 is smaller than the C++ version of ACE shown in Figure 1. There are several reasons for the reduction in size and scope:
select, poll, and WaitForMultipleObjects) commonly found on UNIX and Win32.
In addition, since Java doesn't support shared memory or memory-mapped files, the memory management wrappers (such as Mem_Map and Shared_Malloc) were omitted from Java ACE.
Over the past decade, C++ has been our language of choice for building high-performance communication frameworks. A major strength of C++ is its efficient run-time performance and multi-paradigm support for object-oriented programming and generic programming. However, after both teaching and applying C++ extensively in practice, we recognize that it can be a complex language to learn and use. In our experience, a major source of C++'s complexity arises from explicit memory management and lack of portability across compilers and OS platforms. Not surprisingly, therefore, we found some of the most important benefits of using Java were its garbage collection and standard libraries. These features enabled us to develop portable Java applications rapidly without expending considerable effort tracking down memory management errors and wrestling with broken compilers and incompatible development environments.
While the syntax of Java resembles C++, certain features, semantics, and patterns are quite different. For instance, Java lacks many features commonly used in C++ programs. Chief among these are templates, pointers-to-methods, explicit parameter passing modes, enumerated types, and operator overloading.
While the omission of these features simplifies the Java language, it also affects the way that developers fluent in C++ can design their programs. Converting C++ code to Java, while keeping the same functionality, can be tedious due to the absence of some C++ language constructs. In particular, we found that using Java often significantly increased the number of classes that we wrote compared to C++, where we would have employed templates to factor out common code.
We recognize that omitting C++ constructs doesn't necessarily make Java a less powerful language. In fact, we found that rethinking our original design of ACE using Java often solved the problem as effectively as by using C++. Many of the limitations we describe occurred because we were converting a large framework designed to exploit C++ features. Had we designed ACE as a Java framework from scratch, some of these limitations wouldn't have been problematic.
We expect that many other C++ developers are, or will soon be, programming in Java. We wrote this article to help capture and document how we used Java effectively for systems-level programming. Therefore, in addition to outlining Java's limitations, this article describes the techniques we used to work around the lack of C++ features like templates and pointers-to-methods.
Learning Java and using it to convert ACE from C++ to Java was a relatively smooth process for us. It only took a few days to convert major portions of ACE from C++ to Java. Since many Java constructs are similar to C++, converting some of ACE to Java was mostly a matter of mapping the C++ constructs to their Java counterparts. Having done this, it was relatively easy to build the higher-layer ACE communication services and example applications using Java.
For example, it took us less than half a day to reimplement the ACE distributed Time Service using the Java version of ACE. Of course, one reason for our productivity was that we'd already figured out how to design and implement the distributed Time Service in C++. However, the simplicity of Java also contributed significantly to this rapid development cycle.
In contrast, Java offers the promise of platform portability. This, of course, is due to the fact that Java is more than just a language -- it defines a Virtual Machine (the JVM) on which programs implemented in the language will execute. Therefore, unlike C++ (which generally treats platform issues outside the scope of the language definition), the Java programming language can make certain assumptions about the environment in which it runs. This is one of the key factors that changes the idioms and patterns used by Java programmers.
Java programs now run on many OS platforms (such as Solaris, Windows NT, Windows '95, and OS/2) without requiring major modifications. However, as new Java development environments are released by different vendors (who add their own bugs and extensions), it may be hard to maintain such a high degree of portability. In addition, since Java doesn't support preprocessors or conditional compilation it's unclear how to encapsulate the inevitable platform differences that may arise in practice.
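One possible way to encapsulate platform differences without a preprocessor is to select an implementation object at run time instead of at compile time. The sketch below is only an illustration of that idea, not a technique taken from Java ACE: the PathPolicy interface, its two implementations, and the Platform class are hypothetical, and path joining is merely a stand-in for whatever behavior differs between platforms. The only library facts relied on are the standard "os.name" system property and String.regionMatches.

```java
// Hypothetical strategy interface that hides one platform difference.
interface PathPolicy {
    String join(String dir, String file);
}

class UnixPathPolicy implements PathPolicy {
    public String join(String dir, String file) { return dir + "/" + file; }
}

class WindowsPathPolicy implements PathPolicy {
    public String join(String dir, String file) { return dir + "\\" + file; }
}

class Platform {
    // Choose a policy from an OS name, e.g. the value of the
    // "os.name" system property. Anything that does not start
    // with "windows" (ignoring case) is treated as UNIX-like.
    static PathPolicy policyFor(String osName) {
        if (osName != null && osName.regionMatches(true, 0, "windows", 0, 7)) {
            return new WindowsPathPolicy();
        }
        return new UnixPathPolicy();
    }

    // Policy for the platform the program is currently running on.
    static PathPolicy current() {
        return policyFor(System.getProperty("os.name"));
    }
}
```

The rest of the program would call Platform.current() once and use the returned object, so the platform test appears in one place rather than being scattered the way #ifdef blocks often are in C++.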
The java.net package and java.lang.Thread class provide useful classes for creating concurrent and distributed client/server applications. Using these classes simplifies client/server application development because the Java wrappers shield programmers from tedious and error-prone threading and socket-level details. For example, the following code shows how a simple ``thread-per-connection'' server application can be written in Java:
// A simple concurrent server that accepts connections
// from clients and creates a ServiceHandler to do the
// required processing for each connection.
// Note that each ServiceHandler runs in its own
// thread of control. The code for the ServiceHandler
// class has been left out for brevity.
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
public class Server
{
  // Port used when none is given on the command line
  // (the value here is only a placeholder).
  private static final int DEFAULT_PORT = 10000;
  // Main entry point.
  public static void main (String args[]) {
    int port = DEFAULT_PORT;
    if (args.length == 1) {
      try {
        port = Integer.parseInt (args[0]);
      } catch (NumberFormatException e) {
        System.err.println (e);
        System.exit (1);
      }
    }
    new Server (port);
  }
  public Server (int port) {
    // Initialized to null so the compiler can see that the
    // variable is definitely assigned before the event loop.
    ServerSocket acceptorSocket = null;
    try {
      // Create a new server listen socket.
      acceptorSocket = new ServerSocket (port);
    } catch (IOException e) {
      System.err.println (e);
      System.exit (1);
    }
    System.out.println ("Server listening on port " + port);
    try {
      // Event loop for accepting client connections.
      for (;;) {
        Socket s = acceptorSocket.accept ();
        // Create a handler for every accepted connection
        // and then have the handler run in its own thread
        // of control. This assumes that the ServiceHandler
        // class implements the Java Runnable interface.
        new Thread (new ServiceHandler (s)).start ();
      }
    } catch (IOException e) {
      System.err.println (e);
      System.exit (1);
    }
  }
}
Note the relatively few lines of code required to write a simple concurrent server. Moreover, note how easily the server can be multi-threaded by starting each ServiceHandler in its own thread of control. Although the C++ version of ACE provides equivalent functionality and parsimony, it's harder to use features like exception handling, sockets, and threading portably across multiple OS platforms.
The following code illustrates how a Java client application can be written to communicate with the server:
// A simple client that connects to the server.
import java.io.IOException;
import java.net.Socket;
public class Client
{
  // Port used when none is given on the command line
  // (the value here is only a placeholder).
  private static final int DEFAULT_PORT = 10000;
  public static void main (String args []) {
    String serverName = null;
    int port = DEFAULT_PORT;
    try {
      switch (args.length) {
      case 2:
        port = Integer.parseInt (args[1]);
        // fall through
      case 1:
        serverName = args[0];
        break;
      default:
        System.err.println ("Incorrect parameters");
        System.exit (1);
      }
    } catch (NumberFormatException e) {
      System.err.println (e);
      System.exit (1);
    }
    new Client (serverName, port);
  }
  public Client (String serverName, int port) {
    // Declared outside the try block so that it remains
    // in scope for the communication code below.
    Socket s = null;
    try {
      // Create a socket to communicate with the server.
      s = new Socket (serverName, port);
    } catch (IOException e) {
      System.err.println (e);
      System.exit (1);
    }
    // Client can now send and receive data to/from the
    // server using the underlying data streams of the socket.
    // ...
  }
}
The Java wrappers for sockets play a role similar to the C++ socket
wrappers in ACE. They both provide a uniform interface that
simplifies communication software implementations. In our conversion
of ACE from C++ to Java, it was trivial to use the Java socket
wrappers to provide a communication interface equivalent to the C++
version of ACE.
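The server listing above omits the ServiceHandler class. As a rough idea of what such a handler might look like, here is a minimal sketch that simply echoes back whatever the client sends. This is an assumption made for illustration, not the handler used in the article; the echo helper method is a hypothetical addition that keeps the copying logic independent of the socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical echo handler; the article's actual ServiceHandler is not shown.
class ServiceHandler implements Runnable {
    private final Socket socket;

    ServiceHandler(Socket socket) { this.socket = socket; }

    // Copy everything read from 'in' to 'out' until end of stream.
    static void echo(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        out.flush();
    }

    // Runs in the thread that the server creates for this connection.
    public void run() {
        try {
            echo(socket.getInputStream(), socket.getOutputStream());
        } catch (IOException e) {
            System.err.println(e);
        } finally {
            try { socket.close(); } catch (IOException e) { /* ignore */ }
        }
    }
}
```

Because the handler implements Runnable, it can be passed directly to the Thread constructor, which is what the thread-per-connection server assumes.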
java.util package and include Enumeration, Vector, Stack, Hashtable, BitSet, and Dictionary. Providing generic collections as part of the standard development environment simplifies application programming. For example, in the C++ version of ACE, we've implemented reusable components (such as the Map_Manager and the Unbounded_Set) to simplify the development of higher-level ACE components.
In contrast to Java, the standard components we used in the C++ version of ACE were developed from scratch since they didn't exist in all our C++ development environments. Although the ANSI/ISO draft standard is nearing completion, most C++ compilers still don't provide standard libraries that are portable across platforms. Therefore, programmers must develop these libraries, port them from public domain libraries (such as GNU libg++ and HP's STL), or purchase them separately from vendors like Rogue Wave and Object Space.
For instance, to build a portable application in C++ that uses dynamic arrays, developers must either buy, borrow, implement, and/or port a dynamic array class (such as the STL vector). In the case of Java, developers can simply use the Vector class provided in the development environment without concern for portability. The ubiquity of Java libraries is particularly important for WWW applets because the standard Java components can be pre-configured into browsers to reduce network traffic. In addition, the unified strategies provided by the Java standard libraries (such as iteration, streaming, and externalization) provide idioms that Java programmers can easily recognize, utilize, and extend.
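As a brief illustration of the Vector class and the iteration idiom mentioned above, the following sketch fills a dynamically growing Vector and walks it with an Enumeration. The VectorDemo class is a hypothetical name, and the raw (non-parameterized) collection types are used deliberately to match the Object-based libraries the article discusses.

```java
import java.util.Enumeration;
import java.util.Vector;

class VectorDemo {
    // Sum the Integer elements of a Vector using the Enumeration idiom.
    static int sum(Vector v) {
        int total = 0;
        for (Enumeration e = v.elements(); e.hasMoreElements(); ) {
            // Elements are stored as Object, so a cast is required.
            total += ((Integer) e.nextElement()).intValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Vector v = new Vector();          // grows automatically as needed
        for (int i = 1; i <= 4; i++) {
            v.addElement(Integer.valueOf(i));
        }
        System.out.println(sum(v));       // prints 10
    }
}
```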
There are tools (such as Purify, Bounds Checker, and Great Circle) that reduce the effort of writing memory-safe C++ code. However, it's been our experience that even though these tools exist, C/C++ programmers typically expend considerable effort avoiding memory leaks and other forms of memory corruption.
We have consciously avoided the use of exceptions in the C++ version of ACE due to portability problems with C++ compilers. In contrast, when converting the ACE framework from C++ to Java, we were able to make extensive use of Java's exception handling mechanisms. The ability to use exception handling significantly improved the clarity of the Java ACE error-handling logic, relative to the C++ version of ACE.
Although Java exceptions are elegant, they also exact a performance penalty. Depending upon how heavily exception handling is used, the impact on performance can vary significantly. As mentioned earlier, performance measurements of the Java version of ACE will be covered in a subsequent article.
java.lang.ClassLoader mechanism.
The java.lang.ClassLoader is an abstract class that defines the necessary hooks for Java to load classes over the network or from other sources such as the file system. This class made it easy to implement the Service Configurator framework in ACE. The ACE Service Configurator framework provides a flexible architecture that simplifies the development, configuration, and reconfiguration of communication services. It helps to decouple the behavior of these communication services from the point in time at which these services are configured into an application.
Implementing the Service Configurator framework in the C++ version of ACE is challenging because C++ doesn't define a standard mechanism for dynamic linking/loading of classes and objects. Implementing this functionality across platforms, therefore, requires various non-portable mechanisms (such as OS support for explicit dynamic linking). Moreover, since the draft ISO/ANSI C++ standard doesn't address dynamic linking, C++ compilers and run-time systems are not required to support dynamic linking. For instance, many operating systems will not call the constructors of static C++ objects linked in dynamically from shared libraries.
The fact that Java provides standard mechanisms for dynamic linking/loading of classes significantly simplified the implementation of the ACE Service Configurator framework. This reiterates the fact that Java is more than just a programming language. It defines a run-time environment, and can therefore make certain assumptions about the environment in which it runs. Thus, unlike other languages such as C and C++, the Java run-time environment can portably support important run-time features such as explicit dynamic linking/loading.
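To give a feel for how little code dynamic loading requires in Java, here is a minimal sketch that loads a class by name and instantiates it. It is not the Java ACE Service Configurator: the Service interface, TimeServiceStub, and ServiceLoaderSketch are hypothetical names, and the class name that a real configurator would read from a configuration file is obtained here from the stub itself so the example stays self-contained.

```java
// Hypothetical service interface; not the actual Java ACE API.
interface Service {
    String name();
}

// Hypothetical service implementation that is loaded by name below.
class TimeServiceStub implements Service {
    public String name() { return "time"; }
}

class ServiceLoaderSketch {
    // Load a class by name at run time and create an instance of it
    // using its no-argument constructor.
    static Service load(String className) throws Exception {
        Class<?> cls = Class.forName(className);
        Object obj = cls.getDeclaredConstructor().newInstance();
        return (Service) obj;
    }

    public static void main(String[] args) throws Exception {
        // A real configurator would read this name from a file.
        Service s = load(TimeServiceStub.class.getName());
        System.out.println(s.name());   // prints "time"
    }
}
```

The cast to Service fails with a ClassCastException if the named class does not implement the expected interface, which gives the loader a simple run-time conformance check.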
JavaDoc tool generates API documentation in HTML format for the specified package or for individual Java source files specified on the command line. Using JavaDoc is straightforward. The syntax /** documentation */ indicates a documentation comment (a.k.a., a ``doc comment'') and is used by the JavaDoc tool to automatically generate HTML documentation. In addition, doc comments may contain special tags that begin with the @ character. These tags are used by JavaDoc for additional formatting when generating documentation. For example, the tag @param can be used to specify a parameter of a method. JavaDoc extracts all such entries containing the @param tag specified inside a doc comment of a method and generates HTML documentation specifying the complete parameter list of that method.
The C++ version of ACE also provides automatic generation of documentation using a modified version of the freely available OSE tools. The ACE documentation tools produce UNIX-style man pages (in nroff format), as well as JavaDoc-style documentation (in HTML format).
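The doc comment syntax described above can be illustrated with a short example. The BoundedCounter class below is hypothetical and exists only to carry the comments; it is left package-private to keep the sketch self-contained, whereas a documented API class would normally be public.

```java
/**
 * A hypothetical bounded counter used only to illustrate doc comments.
 */
class BoundedCounter {
    private int count;
    private final int limit;

    /**
     * Creates a counter that saturates at the given limit.
     *
     * @param limit the largest value the counter may reach
     */
    BoundedCounter(int limit) { this.limit = limit; }

    /**
     * Adds to the counter, never exceeding the limit.
     *
     * @param delta the amount to add
     * @return the counter value after the addition
     */
    int add(int delta) {
        count = Math.min(limit, count + delta);
        return count;
    }
}
```

Each @param tag names one parameter and describes it, and @return describes the result; JavaDoc collects these into the parameter list and return description of the generated page.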
ACE_Map_Manager and ACE_Malloc classes.
One workaround for Java's lack of templates is to use Object-based containers like the Smalltalk-style collections available from Doug Lea or the Java Generic Library (which is a conversion of the Standard Template Library from C++ to Java) available from Object Space. [Editor's note: Graham Glass' STL in Action column in this issue describes the design of the Java Generic Library.]
These solutions are not entirely ideal, however, since they require application programmers to insert casts into their code. Although Java casts are strongly-typed (which eliminates a common source of errors in C and C++ programs that use traditional untyped casts), it is hard to optimize away the overhead of run-time type checking.
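The following sketch shows why the casts matter: with an Object-based container, a type mismatch is not reported by the compiler and only surfaces as a ClassCastException when the cast executes. The CastDemo class and its methods are hypothetical names chosen for this illustration.

```java
import java.util.Hashtable;

class CastDemo {
    // Look up a value that the caller expects to be a String.
    static String lookupName(Hashtable table, Object key) {
        // The cast is checked at run time, not at compile time.
        return (String) table.get(key);
    }

    // Returns true if the lookup fails with a ClassCastException.
    static boolean lookupFails(Hashtable table, Object key) {
        try {
            lookupName(table, key);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Hashtable table = new Hashtable();
        table.put("host", "localhost");
        table.put("port", Integer.valueOf(80));   // not a String
        System.out.println(lookupName(table, "host"));   // prints localhost
        System.out.println(lookupFails(table, "port"));  // prints true
    }
}
```

With a C++ template container the second insertion would be rejected at compile time, which is the safety and efficiency advantage the text attributes to templates.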
ACE_Acceptor and ACE_Connector are parameterized with a class that must conform to the ACE_Svc_Handler interface. An ACE_Svc_Handler is created and initialized when a connection is established either actively or passively. It is an abstract class that applications can subclass to provide a concrete service handler implementation. To parameterize the ACE_Acceptor with an ACE_Svc_Handler (e.g., HTTP_Svc_Handler), we would do the following in C++ ACE:
class HTTP_Svc_Handler :
public ACE_Svc_Handler<ACE_SOCK_Stream>
{ /* ... */ };
class HTTP_Acceptor :
public ACE_Acceptor<HTTP_Svc_Handler,
ACE_SOCK_Acceptor>
{ /* ... */ };
HTTP_Acceptor *acceptor =
new HTTP_Acceptor (addr);
The C++ code parameterizes the ACE_Acceptor statically with the HTTP_Svc_Handler. The advantage of using templates is that there's no run-time function call overhead. Another advantage is the ability to parameterize locking mechanisms. For instance, the C++ version of ACE uses templates to select an appropriate synchronization strategy (e.g., mutexes vs. readers/writer locks), as well as to remove all locking overhead when there is no concurrency.
Although Java lacks templates, it contains some interesting features that enabled us to support signature-based type conformance by using a pattern based on its meta-class facilities. For instance, here's how the Java ACE Acceptor is defined:
package ACE.Connection;
class Acceptor
{
  public Acceptor (Class svcHandlerFactory, int port) {
    // Cache the Class factory used to create instances
    // of the handlers using the newInstance() method.
    svcHandlerFactory_ = svcHandlerFactory;
    // Cache the port to listen for connections on.
    port_ = port;
  }
  // Perform the Acceptor pattern. The exceptions listed
  // are the checked exceptions thrown by newInstance();
  // any thrown by the socket wrappers are omitted here.
  public void accept ()
    throws InstantiationException, IllegalAccessException {
    // Create a SvcHandler.
    SvcHandler sh =
      (SvcHandler) svcHandlerFactory_.newInstance ();
    // Accept connection into the SvcHandler.
    SOCKStream sockStream = sockAcceptor_.accept ();
    sh.setHandle (sockStream);
    // Activate the SvcHandler.
    sh.open ();
  }
  // Class object used as a factory for SvcHandlers.
  private Class svcHandlerFactory_;
  // Port to listen for connections on.
  private int port_;
  // Factory that accepts client connections.
  private SOCKAcceptor sockAcceptor_ = new SOCKAcceptor ();
}
// ...
To achieve the C++ ACE behavior in Java ACE, a Class object can be created using the class name ``HTTPSvcHandler'' and this can then be passed to the constructor of Acceptor, as follows:
class HTTPSvcHandler extends SvcHandler
{ /* ... */ }
Acceptor acceptor =
new Acceptor (Class.forName ("HTTPSvcHandler"),
DEFAULT_PORT);
// ...
acceptor.accept ();
Once the acceptor object is initialized to listen on a well-known port, the accept method will accept connections and create HTTPSvcHandler objects to communicate with clients.
The Java code uses the Class object created using the string corresponding to the name of the SvcHandler factory as a parameter to the constructor of Acceptor. The Java ACE Acceptor uses this to create a new instance of the SvcHandler when needed. As long as HTTPSvcHandler is a subclass of SvcHandler, a new Class object can be created and passed to the Acceptor. Therefore, the signature will match that expected by the Acceptor factory.
In Java, there are no enumerated types. This typically isn't a problem when developing new code, or when writing in an orthodox ``object-oriented'' style, because subclasses and the Java instanceof operator (a type-safe run-time type test) can be used in place of enumerals (as illustrated below).
However, there were several situations where lack of enumerated types was a problem when converting ACE from C++ to Java:
int) instead. For example, translating the following C++ ACE code:
// C++ code.
class ACE_Naming_Msg_Block
{
public:
enum Naming_Msg_Type {
BIND, // Request for bind
REBIND, // Request for rebind
RESOLVE, // Request for resolve/find
UNBIND, // Request for unbind
// ... rest omitted...
MAX_ENUM // maximum enumeration
};
ACE_Naming_Msg_Block (Naming_Msg_Type mt,
const char *data);
into Java code is tedious and error-prone. The result looks like this:
// Java code
package ACE.ASX;
class NamingMsgBlock
{
// Request for bind.
public static final int BIND = 0;
// Request for rebind.
public static final int REBIND = 1;
// Request for resolve/find
public static final int RESOLVE = 2;
// Request for unbind
public static final int UNBIND = 3;
// ... rest omitted...
// Maximum value.
public static final int MAX_ENUM = 11;
NamingMsgBlock (int mt, String data) {
// ...
}
}
Not only is this less concise, but it is also more error-prone because any value of type int can be accidentally passed to the NamingMsgBlock constructor. Moreover, enumeral values can be duplicated accidentally. In contrast, the C++ type-system ensures that only Naming_Msg_Type parameters are passed as arguments to the constructor of ACE_Naming_Msg_Block.
One workaround in Java for the lack of enumerated types is to use subclassing. In this approach, a base class called NamingMsgType is defined and a subclass of NamingMsgType is created for each message type. The following code illustrates this common Java pattern:
// Defines the base type for all message types.
public abstract class NamingMsgType {}
public class BIND extends NamingMsgType {}
public class REBIND extends NamingMsgType {}
public class RESOLVE extends NamingMsgType {}
// ...
Now we can ensure that an argument of type NamingMsgType is passed to the constructor of NamingMsgBlock. Here's how we can do a ``switch'' to determine the type of the message:
public NamingMsgBlock (NamingMsgType mt,
                       String data)
{
  // Use instanceof operator to determine
  // what mt is an instance of...
  if (mt instanceof BIND) { /* ... */ }
  else if (mt instanceof REBIND) { /* ... */ }
  else if (mt instanceof RESOLVE) { /* ... */ }
  // ...
}
This solves the problem at the expense of creating a large number of subclasses and forcing iterative search using instanceof.
ACE_Name_Handler implementation dispatches the appropriate method by using the message type to index into a table of pointers to C++ methods. The following code demonstrates how a table of pointers to methods can be created to dispatch efficiently based on enum Naming_Msg_Type literals (to simplify the example, some ACE C++ class names have been changed):
class ACE_Name_Handler
{
  // Specify an array of pointers to member functions
  typedef int (ACE_Name_Handler::*OPERATION) (void);
  OPERATION op_table_[ACE_Naming_Msg_Block::MAX_ENUM];
  ACE_Name_Handler (void) {
    // Set up the array of pointers to member functions
    op_table_[ACE_Naming_Msg_Block::BIND] =
      &ACE_Name_Handler::bind;
    op_table_[ACE_Naming_Msg_Block::REBIND] =
      &ACE_Name_Handler::rebind;
    op_table_[ACE_Naming_Msg_Block::RESOLVE] =
      &ACE_Name_Handler::resolve;
    ...
Our dispatch routine is straightforward:
int ACE_Name_Handler::dispatch
  (ACE_Naming_Msg_Block::Naming_Msg_Type msg_type)
{
  // Dispatch based on the index of the message type.
  // A pointer to member function must be invoked
  // on an object, here the handler itself.
  return (this->*op_table_[msg_type]) ();
}
Building this type of efficient dispatch mechanism in Java requires the use of a primitive type like int for the message type, which incurs the drawbacks described above. A workaround for this problem is presented in Section 3.3, along with a workaround for Java's lack of pointers to methods.
A common workaround for Java's lack of pointers to methods is to build callback objects via subclassing [Lea:96]. This can be tedious, however, as shown by the following Java rewrite of the C++ ACE_Name_Handler class presented earlier. The example below also illustrates another solution to Java's lack of enumerated types:
// Specify an interface that can be implemented
// by classes to build callback objects.
public interface Operation
{
public int invoke ();
}
// Define callback object to handle BIND messages.
public class BINDHandler implements Operation
{
public int invoke () { /* Handle BIND messages. */ return 0; }
}
// Define callback object to handle REBIND messages.
public class REBINDHandler implements Operation
{
public int invoke () { /* Handle REBIND messages. */ return 0; }
}
// Define callback object to handle RESOLVE messages.
public class RESOLVEHandler implements Operation
{
public int invoke () { /* Handle RESOLVE messages. */ return 0; }
}
// ... rest omitted...
Here is how we can define the NameHandler class. This example first uses the class NamingMsgType to emulate the functionality of enums, as well as to provide abstraction and type-safety.
final class NamingMsgType
{
// Request for bind.
public static final NamingMsgType BIND =
new NamingMsgType ("BIND", 0);
// Request for rebind.
public static final NamingMsgType REBIND =
new NamingMsgType ("REBIND", 1);
// Request for resolve/find.
public static final NamingMsgType RESOLVE =
new NamingMsgType ("RESOLVE", 2);
// Request for unbind.
public static final NamingMsgType UNBIND =
new NamingMsgType ("UNBIND", 3);
// ... rest omitted...
public String toString () { return name_; }
public int val () { return value_; }
private NamingMsgType (String name, int value) {
name_ = name;
value_ = value;
}
// The int-value corresponding to the enum.
private int value_;
// The string associated with the enum.
private String name_;
}
The following NameHandler class uses a Vector to keep instances of callback objects and uses the dispatch routine to extract the right instance of the callback object:
import java.util.Vector;
public class NameHandler
{
  public NameHandler () {
    // Initialize a dynamically resizing vector.
    opTable_ = new Vector();
    // Insert the elements, starting at location 0.
    opTable_.addElement (new BINDHandler ());
    opTable_.addElement (new REBINDHandler ());
    opTable_.addElement (new RESOLVEHandler ());
    // ... other entries omitted ...
  }
  // Specify the dispatching routine. The
  // message types MUST be NamingMsgType.
  public int dispatch (NamingMsgType msgType) {
    // Dispatch based on the value of the message type.
    int index = msgType.val ();
    Operation op =
      (Operation) opTable_.elementAt (index);
    return op.invoke ();
  }
  // Vector of callback objects, indexed by message type value.
  private Vector opTable_;
}
Note that if the values of the NamingMsgType ``enumeration'' weren't contiguous, we could use the Java Hashtable rather than a Vector.
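A minimal sketch of the Hashtable-based alternative for non-contiguous values might look as follows. The Callback interface and SparseDispatcher class are hypothetical stand-ins for the Operation interface and NameHandler class; the choice to return -1 for an unregistered message type is likewise an assumption made for this example.

```java
import java.util.Hashtable;

// Callback interface, analogous to the Operation interface above.
interface Callback {
    int invoke();
}

class SparseDispatcher {
    // Maps message-type values (as Integer keys) to callback objects.
    private final Hashtable opTable = new Hashtable();

    // Associate a callback with a (possibly sparse) message value.
    void register(int msgValue, Callback op) {
        opTable.put(Integer.valueOf(msgValue), op);
    }

    // Returns the callback's result, or -1 if none is registered.
    int dispatch(int msgValue) {
        Callback op = (Callback) opTable.get(Integer.valueOf(msgValue));
        return (op == null) ? -1 : op.invoke();
    }
}
```

Lookup cost is a hash computation rather than a direct index, but the table no longer needs one slot for every value between the smallest and largest message type.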
java.lang.Thread class contains methods for creating, controlling, and synchronizing Java threads. The main synchronization mechanism in Java is based on Dijkstra-style Monitors. Monitors are a relatively simple and expressive model that allows threads to (1) implicitly serialize their execution at method-call interfaces and (2) coordinate their activities via explicit wait, notify, and notifyAll operations.
While Java's simplicity is often beneficial, we encountered subtle traps and pitfalls with its concurrency model [Cargill:96]. In particular, Java's Monitor-based concurrency model can be non-intuitive and error-prone for programmers accustomed to developing applications using threading models (such as POSIX Pthreads, Solaris threads, or Win32 threads) supported by modern operating systems. These threading models provide a lower-level, yet often more flexible and efficient, set of concurrency primitives such as mutexes, semaphores, condition variables, readers/writer locks, and barriers.
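To show how one of the lower-level primitives mentioned above can be layered on Java's monitor operations, here is a minimal sketch of a counting semaphore. It is an illustration under our own assumptions, not the Semaphore class from Java ACE; the CountingSemaphore name and its methods are hypothetical.

```java
// Hypothetical counting semaphore built on Java's monitor operations.
class CountingSemaphore {
    private int permits;

    CountingSemaphore(int initialPermits) { permits = initialPermits; }

    // Block until a permit is available, then take it.
    synchronized void acquire() throws InterruptedException {
        // Re-check the condition in a loop after every wakeup.
        while (permits == 0) {
            wait();          // releases the monitor lock while waiting
        }
        permits--;
    }

    // Return a permit and wake one waiting thread, if any.
    synchronized void release() {
        permits++;
        notify();
    }

    // Number of permits currently available.
    synchronized int available() { return permits; }
}
```

The synchronized methods provide the implicit serialization described above, while wait and notify provide the explicit coordination. The while loop around wait is needed because a woken thread must re-acquire the monitor lock and may find that another thread has already taken the permit.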
In general, Java presents a different paradigm for multi-threaded programming. We found that this paradigm was initially non-intuitive since we were accustomed to conventional lower-level multi-threaded programming mechanisms such as mutexes, condition variables, and semaphores. As a result, we had to rethink many of the concurrency control and optimization patterns used in C++ ACE. We found that converting C++ ACE code that used these conventional synchronization mechanisms required careful analysis and often changed the implementation of C++ ACE components. This was due to differences in Java concurrency mechanisms, compared with conventional POSIX Pthreads-like threading mechanisms used in C++ ACE. The following discussion explores some threading challenges we faced when converting C++ ACE to Java:
wait method -- To implement Monitors, Java defines the wait, notify, and notifyAll methods in class Object. Thus, all classes implicitly inherit these methods. The wait method allows a thread to wait for a condition to occur on an object, whereas the notify and notifyAll methods allow a thread to signal other waiting threads when the condition associated with an object becomes true.
There are three forms of wait:
public final void wait ()
throws InterruptedException;
public final void wait (long millisecTimeout)
throws InterruptedException;
public final void wait (long millisecTimeout,
int nanosecTimeout)
throws InterruptedException;
Using the first form of wait, a thread performs a blocking wait on an object until it is notified by another thread. Using the other two forms of wait, a thread blocks until it is notified or the timeout expires. Timeouts are very useful when writing robust protocols that won't block indefinitely if peers misbehave.
Unfortunately, Java doesn't directly inform applications whether the wait call returned because the object was notified or because a timeout occurred. This differs from the C++ version of ACE, which uses the condition variable mechanism in the underlying OS threading packages. In C++ ACE, the ACE_Condition::wait method returns a value that explicitly disambiguates between timeouts and notifications.
Java's lack of an explicit return value or exception to differentiate between notification and timeout increases the responsibilities of application programmers. For example, when converting the thread-safe ACE_Message_Queue to Java, we first tried to implement the method enqueue as follows:
package ACE.ASX;
class MessageQueue
{
synchronized void enqueue (Message msg,
long msecTimeout) {
while (isFull ()) {
try {
// Wait until we are notified that the
// queue is no longer full or we time out.
wait (msecTimeout);
}
// Syntax error! The following isn't possible
// in Java because there is no way for wait()
// to distinguish a timeout from a normal return...
catch (TimeoutException) {
// Indicate that a timeout occurred.
}
// Check condition
if (!isFull ())
break;
else
// Condition is still false so loop
// back again to wait() once more.
// ...
}
// Enqueue the msg.
}
// ...
The solution above won't work in Java since the Java wait
method doesn't distinguish between returns due to timeouts
vs. notifications. In particular, the wait call returns
normally both when it is awakened by a notify or notifyAll
and when it times out. Only an interrupt is reported separately,
via InterruptedException.
Our first solution to the ambiguity inherent in Java's timed
wait used an algorithm presented by Doug Lea in [Lea:96]. This algorithm involves explicitly
determining if the timeout has occurred, as follows:
package ACE.ASX;
class MessageQueue
{
synchronized int enqueue (Message msg,
long msecTimeout)
throws InterruptedException {
if (isFull ()) {
long start = System.currentTimeMillis ();
long waitTime = msecTimeout;
for (;;) {
// Wait until we are notified that the
// queue is no longer full or we time out.
wait (waitTime);
if (isFull ()) {
// Condition is still false so now check
// if we still have enough waiting time.
long now = System.currentTimeMillis ();
long timeSoFar = now - start;
// Timed out!
if (timeSoFar >= msecTimeout)
return -1;
else
// We still have some time left to
// wait, so adjust the waitTime.
waitTime = msecTimeout - timeSoFar;
}
else
break; // Condition became true.
}
}
// Enqueue the msg.
// ...
Although this algorithm solved the problem in this case, we wanted to
generalize the solution. Our goal was to avoid replicating common
code in enqueue and dequeue and to create a
reusable TimedWait class utility. However, this turned
out to require very careful thought. The problem is that the
isFull condition logic must be factored out of the
enqueue method.
As usual, it's ``patterns to the rescue.'' We'll start by using the
Template Method pattern [GoF:95] to implement a
generic TimedWait abstraction:
public class TimeoutException
extends Exception
{ /* ... */ }
public abstract class TimedWait
{
// By default, we ``delegate'' to ourself.
public TimedWait () { object_ = this; }
// Subclasses can also supply us with an
// Object that is delegated the wait() call
// so we can borrow its monitor lock.
public TimedWait (Object obj)
{
object_ = obj;
}
// This is the object we delegate to if a
// subclass gives us a particular object,
// otherwise, we ``delegate'' to ourself
// (i.e., to this).
protected Object object_;
// This hook method must be overridden
// by a subclass to provide the condition.
public abstract boolean condition ();
// Make this final so that no one can
// override this Template Method. This
// method assumes it's called with the
// object_'s monitor lock already held.
public final void timedWait (long msecTimeout)
throws InterruptedException,
TimeoutException
{
if (!condition ()) {
// Only attempt to perform the timed wait
// if the condition isn't true initially.
long start = System.currentTimeMillis ();
long waitTime = msecTimeout;
for (;;) {
// Wait until we are notified
// (releases the monitor lock).
object_.wait (waitTime);
// Recheck the condition.
if (!condition ()) {
long now = System.currentTimeMillis ();
long timeSoFar = now - start;
// Timed out!
if (timeSoFar >= msecTimeout)
throw new TimeoutException ();
else
// We still have some time left to wait,
// so adjust the waitTime.
waitTime = msecTimeout - timeSoFar;
}
else
break; // Condition became true.
}
}
}
// Notify all threads waiting on the object_.
public final void broadcast () {
object_.notifyAll ();
}
}
We use the Template Method pattern to (1) define the skeleton of the
algorithm that computes the timeout in the timedWait
method and (2) defer the definition of the condition hook
method to subclasses. Thus, subclasses can redefine the
condition logic without changing the
timedWait algorithm.
Although our TimedWait class works for simple use cases,
we can't directly extend MessageQueue from
TimedWait since TimedWait only allows us to
specify one condition at a time (i.e., it only has a single
condition hook). This is overly restrictive for the
MessageQueue implementation since its
enqueue and dequeue methods depend upon two
conditions: !isFull() and !isEmpty(),
respectively.
Once again, it's patterns to the rescue. In this case, we can
generalize the TimedWait solution by applying another
pattern -- a variant of the Delegated Notification pattern from [Lea:96] that we call Borrowed
Monitor. By using the Borrowed Monitor pattern, the
condition hook of the TimedWait class
delegates to the appropriate implementation of the
MessageQueue's isFull or
isEmpty methods, as follows:
class NotFullCondition extends TimedWait
{
public NotFullCondition (MessageQueue mq)
{ super (mq); }
public boolean condition () {
// Delegate to the appropriate conditional
// check on the MessageQueue.
MessageQueue mq = (MessageQueue) object_;
return !mq.isFull ();
}
}
class NotEmptyCondition extends TimedWait
{
public NotEmptyCondition (MessageQueue mq)
{ super (mq); }
public boolean condition () {
// Delegate to the appropriate conditional
// check on the MessageQueue.
MessageQueue mq = (MessageQueue) object_;
return !mq.isEmpty ();
}
}
Finally, we can put all the patterns and classes together to create a
MessageQueue that uses the timed
Not{Full|Empty}Condition classes defined above:
public class MessageQueue
{
// ...
public synchronized void enqueue (Message msg,
long timeout)
throws TimeoutException, InterruptedException
{
// Do timedwait (which borrows our monitor lock).
notFullCondition_.timedWait (timeout);
// Enqueue the message....
// Notify all waiting threads.
notEmptyCondition_.broadcast ();
}
public synchronized void dequeue (Message msg,
long timeout)
throws TimeoutException, InterruptedException
{
// Do timedwait (which borrows our monitor lock).
notEmptyCondition_.timedWait (timeout);
// Dequeue the message...
// Notify all waiting threads.
notFullCondition_.broadcast ();
}
// The Delegated Notification mechanisms.
private NotFullCondition notFullCondition_
= new NotFullCondition (this);
private NotEmptyCondition notEmptyCondition_
= new NotEmptyCondition (this);
}
This solution allows us to perform the timeouts without
reimplementing the low-level timeout logic in the
enqueue and dequeue methods.
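The pieces above can be assembled into a compact, compilable sketch. It is an illustration under stated assumptions rather than the ACE Java source: BoundedQueue is a hypothetical stand-in for MessageQueue that stores String values, the two conditions are anonymous TimedWait subclasses instead of the named Not{Full|Empty}Condition classes, and a non-positive remaining time is treated as an immediate timeout (whereas wait(0) would block indefinitely).

```java
import java.util.ArrayDeque;

// Raised when the condition is still false after the timeout elapses.
class TimeoutException extends Exception {}

abstract class TimedWait {
    // The object whose monitor lock we borrow for wait()/notifyAll().
    protected final Object object_;

    protected TimedWait(Object obj) { object_ = obj; }

    // Hook method: subclasses supply the condition to wait for.
    public abstract boolean condition();

    // Template method. Must be called with object_'s monitor lock held.
    public final void timedWait(long msecTimeout)
            throws InterruptedException, TimeoutException {
        long start = System.currentTimeMillis();
        long remaining = msecTimeout;
        while (!condition()) {
            // Assumption of this sketch: no time left means "time out now".
            if (remaining <= 0) throw new TimeoutException();
            object_.wait(remaining);  // releases object_'s monitor while blocked
            remaining = msecTimeout - (System.currentTimeMillis() - start);
        }
    }

    // Must also be called with object_'s monitor lock held.
    public final void broadcast() { object_.notifyAll(); }
}

// Hypothetical stand-in for MessageQueue.
class BoundedQueue {
    private final ArrayDeque<String> items_ = new ArrayDeque<>();
    private final int capacity_;
    private final TimedWait notFull_;
    private final TimedWait notEmpty_;

    BoundedQueue(int capacity) {
        capacity_ = capacity;
        // Both conditions borrow this queue's monitor lock.
        notFull_ = new TimedWait(this) {
            public boolean condition() { return items_.size() < capacity_; }
        };
        notEmpty_ = new TimedWait(this) {
            public boolean condition() { return !items_.isEmpty(); }
        };
    }

    public synchronized void enqueue(String msg, long msecTimeout)
            throws InterruptedException, TimeoutException {
        notFull_.timedWait(msecTimeout);
        items_.addLast(msg);
        notEmpty_.broadcast();
    }

    public synchronized String dequeue(long msecTimeout)
            throws InterruptedException, TimeoutException {
        notEmpty_.timedWait(msecTimeout);
        String msg = items_.removeFirst();
        notFull_.broadcast();
        return msg;
    }
}
```

With a capacity of one, a second enqueue on a full queue raises TimeoutException after roughly the requested delay, while a dequeue on a non-empty queue returns at once.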
It's important to note that our TimedWait scheme only
provides approximate timeout granularity. In particular, the timeout
is really a lower bound since the enqueue and
dequeue methods may not return immediately when a timeout
occurs if they encounter contention when trying to reacquire the
monitor lock. Moreover, in theory, the scheme doesn't even guarantee
liveness because a thread may block indefinitely waiting to reclaim
the monitor lock. This is extremely unlikely to be a problem
in practice, however.
It is worthwhile to point out that all these classes and patterns
would be unnecessary if Java simply differentiated between timeouts
and ``normal'' notifications in the first place! Unfortunately, the
current semantics of Java's wait(long timeout) method
makes this impossible.
If you look closely at the TimedWait class's
timedWait method, you'll notice that it is not
adorned with the synchronized keyword. This omission is
necessary to avoid ``nested monitor lockout,'' which is surprisingly
common in Java.
The nested monitor problem occurs when a thread acquires object
X's monitor lock, e.g., TimedWait, without
relinquishing the lock already held on monitor Y, e.g.,
MessageQueue, thereby preventing a second thread from
acquiring the monitor lock for Y. This can lead to a lockout
since, after acquiring monitor X, the first thread may
wait for a condition to become true that can only change as a result
of actions by the second thread after it has acquired monitor
Y. Naturally, this can't happen as long as the first
thread holds Y's monitor lock...
The following example is based on an example in [Lea:96] and illustrates the nested monitor lockout problem:
class Inner {
protected boolean cond_ = false;
public synchronized void awaitCondition ()
{
while (!cond_)
try { wait (); }
catch (InterruptedException e) {}
// Any other code.
}
public synchronized void signalCondition (boolean c)
{
cond_ = c;
notifyAll ();
}
}
class Outer {
protected Inner inner_ = new Inner ();
public synchronized void process ()
{
inner_.awaitCondition ();
}
public synchronized void set (boolean c)
{
inner_.signalCondition (c);
}
}
The code above illustrates the canonical form of the nested
monitor problem in Java. When a Java thread blocks in a monitor's
wait queue, all its locks are held except the lock of the
object placed in the queue [Lea:96]. Consider
what would happen if a thread T made a call to
Outer.process and as a result blocked in the
wait call in Inner.awaitCondition. Since the
Inner and Outer classes don't share their
monitor locks, the awaitCondition call would release the
Inner monitor, while retaining the Outer
monitor. However, no other thread can acquire the Outer
monitor since it's locked by the synchronized process
method. Therefore, no thread can call Outer.set to set
the condition to be true. As a result, T would continue to be
blocked in wait forever.
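The lockout described above can be observed directly. The following sketch reuses the Inner and Outer classes (slightly condensed) and adds a hypothetical LockoutDemo helper, which is not part of ACE: it starts one thread that blocks in Outer.process, waits until that thread is parked in wait, and then checks whether a second thread calling Outer.set is still stuck after a grace period. Daemon threads and a final interrupt are used only so that the demonstration can unwind.

```java
class Inner {
    private boolean cond_ = false;

    public synchronized void awaitCondition() throws InterruptedException {
        while (!cond_)
            wait();  // releases only Inner's monitor, not Outer's
    }

    public synchronized void signalCondition(boolean c) {
        cond_ = c;
        notifyAll();
    }
}

class Outer {
    private final Inner inner_ = new Inner();

    public synchronized void process() throws InterruptedException {
        inner_.awaitCondition();  // blocks while still holding Outer's monitor
    }

    public synchronized void set(boolean c) {
        inner_.signalCondition(c);
    }
}

// Hypothetical helper used only for this demonstration.
class LockoutDemo {
    // Returns true if the setter thread is still blocked after graceMillis.
    static boolean lockoutObserved(long graceMillis) throws InterruptedException {
        Outer outer = new Outer();
        Thread waiter = new Thread(() -> {
            try {
                outer.process();
            } catch (InterruptedException e) {
                // Interrupted by the demo to unwind; nothing else to do.
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        // Make sure the waiter is parked inside Inner's wait() first.
        while (waiter.getState() != Thread.State.WAITING)
            Thread.sleep(1);

        Thread setter = new Thread(() -> outer.set(true));
        setter.setDaemon(true);
        setter.start();
        setter.join(graceMillis);  // the setter cannot enter Outer.set

        boolean stuck = setter.isAlive() && waiter.isAlive();
        waiter.interrupt();        // break the lockout so both threads can finish
        return stuck;
    }
}
```

Calling LockoutDemo.lockoutObserved(200) is expected to return true: the setter remains blocked on Outer's monitor for as long as the waiter stays parked inside Inner.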
There are several ways to avoid the nested monitor problems in Java [Lea:96]. In our particular example of
MessageQueue, the solution to the nested monitor problem
is to not declare TimedWait.timedWait as a
synchronized method. Instead, we borrow the monitor lock
(which must already be held) from TimedWait.object_.
Thus, in the following code:
public synchronized void enqueue (Message msg,
long timeout)
throws TimeoutException, InterruptedException
{
// Do timedwait (which borrows our monitor lock).
notFullCondition_.timedWait (timeout);
// ...
the notFullCondition_.timedWait call will end up calling
wait on the MessageQueue instance (which is
stored within TimedWait in the object_
field). Therefore, the wait operation will automatically
release the MessageQueue's monitor lock. This is the
correct behavior since it avoids nested monitor lockout.
Although this solution works, it is non-intuitive and overly subtle unless you are intimately familiar with both Java's Monitor semantics and patterns like Template Method, Delegated Notification, and Borrowed Monitor. In our experience, detecting and fixing nested monitor lockout in Java is tricky. In fact, this example illustrates how the simplicity of Java's threading semantics can be a limitation. In general, we found that while implementing simple concurrency models in Java is easy, implementing more complex concurrency models often requires heroic efforts, particularly when trying to alleviate deadlock, race conditions, and synchronization overhead. Therefore, it's crucial to understand Java's threading semantics and design patterns thoroughly to avoid these kinds of nested monitor problems [Lea:96].
Thread.yield calls into the code to give other threads a chance to run. However,
this solution can be tedious, error-prone, and inefficient since it
requires the programmer to second-guess the level of concurrency in
the application and the Java run-time system.
Java does not guarantee that notify wakes up the thread that has been waiting the
longest. Our solution in Java ACE was to port the ACE_Token
class, which implements the Specific Notification pattern [Cargill:96] in order to wake up waiting
threads in a deterministic order.
wait should be awakened by a notify.
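The idea behind waking threads in a deterministic order can be sketched as follows. This is a minimal illustration in the spirit of Specific Notification, not the ACE_Token port: the class name FifoNotifier and its methods are hypothetical. Each waiting thread blocks on its own private object, and the notifier picks the oldest of those objects, so the choice of which thread wakes up no longer depends on the JVM. For brevity, a waiter that is interrupted is not removed from the queue.

```java
import java.util.ArrayDeque;

// Hypothetical FIFO wake-up helper; not part of ACE or the JDK.
class FifoNotifier {
    // One private monitor per waiting thread.
    private static final class Waiter {
        boolean released;
    }

    private final ArrayDeque<Waiter> waiters_ = new ArrayDeque<>();

    // Blocks the calling thread until it reaches the head of the
    // queue and releaseOldest() selects it.
    public void await() throws InterruptedException {
        Waiter me = new Waiter();
        synchronized (this) {
            waiters_.addLast(me);  // record arrival order
        }
        synchronized (me) {
            // The flag guards against notifications that arrive before
            // we start waiting and against spurious wakeups.
            while (!me.released)
                me.wait();
        }
    }

    // Wakes exactly the longest-waiting thread, if there is one.
    public boolean releaseOldest() {
        Waiter oldest;
        synchronized (this) {
            oldest = waiters_.pollFirst();
        }
        if (oldest == null)
            return false;
        synchronized (oldest) {
            oldest.released = true;
            oldest.notify();  // only one thread ever waits on this object
        }
        return true;
    }

    // Number of threads currently queued.
    public synchronized int waiterCount() {
        return waiters_.size();
    }
}
```

Because each thread waits on a distinct object, a single notify call is unambiguous, and the queue fixes the wake-up order as first-come, first-served.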
synchronized keyword in the public interface of a class. Unfortunately, this can
break object encapsulation since it's possible for clients to disrupt
the locking protocol of any Java object with synchronized methods.
For instance, a client can accidentally lock a
MessageQueue, as follows:
class MessageQueue
{
public synchronized int enqueueHead (MessageBlock msg,
TimeValue timeout) { /* ... */ return 0; }
// ...
}
MessageQueue myQueue = new MessageQueue ();
// ...
{
// Any client can acquire myQueue's monitor and spin forever.
synchronized (myQueue) {
for (;;)
continue;
}
}
In C++, this type of problem can be prevented by not exposing the
locking mechanism to clients, as follows:
template <class SYNCH_POLICY>
class ACE_Message_Queue
{
public:
int enqueue_head (Message_Block msg,
ACE_Time_Value *timeout) {
// Note how the locking is hidden from the client.
lock_.acquire ();
// ...
lock_.release ();
return 0;
}
// ...
private:
// A parameterized synchronization strategy.
typename SYNCH_POLICY::MUTEX lock_;
// ...
};
Naturally, the C++ solution is not impervious to malicious programmers
or to other classic sources of program corruption in C/C++ (such as
scribbling over memory due to stray pointers).
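For comparison, a similar degree of hiding can be approximated in Java by synchronizing on a private lock object instead of on the queue itself. The sketch below is an illustration only and is not how Java ACE was written: PrivateLockQueue and its methods are hypothetical names, and the queue stores String values to stay self-contained.

```java
import java.util.ArrayDeque;

// Hypothetical queue that keeps its lock out of the public interface.
class PrivateLockQueue {
    // Clients cannot name this object, so they cannot synchronize on it.
    private final Object lock_ = new Object();
    private final ArrayDeque<String> items_ = new ArrayDeque<>();

    public void enqueueHead(String msg) {
        synchronized (lock_) {
            items_.addFirst(msg);
        }
    }

    // Returns the head message, or null if the queue is empty.
    public String pollHead() {
        synchronized (lock_) {
            return items_.pollFirst();
        }
    }

    public int size() {
        synchronized (lock_) {
            return items_.size();
        }
    }
}
```

A client that executes synchronized (queue) { ... } now holds only the queue object's own monitor, which the methods never use, so other threads can still call enqueueHead. The trade-off is that any wait and notifyAll calls would also have to be made on lock_ rather than on this.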
For instance, consider the following simplified version of the C++ ACE_Message_Queue
(for simplicity, most error handling code has been omitted):
template <class SYNCH_POLICY>
class ACE_Message_Queue
{
public:
ACE_Message_Queue (void) { /* Initialize queue. */ }
~ACE_Message_Queue (void) { /* Delete queue. */ }
// Other methods omitted...
};
Since the destructor is called automatically once an
ACE_Message_Queue object goes out
of scope, we can ensure that the contents in the queue are released.
This ``acquire/release'' protocol is particularly important if the
contents of the queue are resources (like locks or sockets) that are
relatively scarce, compared with virtual memory.
Java lacks general-purpose destructors. Instead, it provides
constructs like finally and finalize to
manage the release of resources. The ACE_Message_Queue
class shown above can be written in Java using the
finalize construct as follows:
class MessageQueue
{
public MessageQueue () {
// Initialize the message queue...
}
public void close () {
// Close down the message queue and
// release all resources...
}
protected void finalize () throws Throwable {
// Delete queue resources when garbage
// collection is run on this object...
}
}
Similarly, the Java finally construct can be used inside
a method to ensure that resources are released when the method goes
out of scope (even if an exception is thrown). Here's an example from
ACE in which the svc method uses a
MessageQueue to process several messages. To ensure that
the MessageQueue is properly closed (and the resources
released) when the method goes out of scope, we use the Java
finally construct.
class Task
{
// Run by daemon thread to do deferred processing.
public int svc () {
try {
for (;;) { // Process the service in a loop.
// Dequeue a message for processing.
MessageBlock msg =
msgQueue ().dequeue ();
// Process the msg...
}
} catch (Exception e) {
System.err.println (e);
return -1;
} finally {
// Note: this block is executed even if an
// exception is raised.
// Shutdown the queue.
msgQueue ().close ();
}
}
}
Unfortunately, Java's finally and finalize
constructs exhibit the following two problems:
With finally in Java ACE, the cleanup code needs to be
inserted manually in finally blocks, which can be tedious
and error-prone.
In contrast, C++ ACE makes it easier since the compiler ensures that a
destructor of the MessageQueue class is called
automatically on exit from the svc method's scope.
Therefore, programmers need not insert finally blocks
containing cleanup code into multiple blocks.
With finalize, the problem centers around ``lazy deletion.''
Since the garbage collector may not get run until the application runs
low on memory, this approach is prone to premature resource
exhaustion. In particular, a program that queues many locks or socket
descriptors will run out of these resources long before running out of
memory. Java does provide hooks to allow programmers to explicitly
request garbage collection using the Runtime class's
gc method. In addition, the Runtime class
also provides a runFinalization method that runs any
pending class finalizers when invoked. However, doing any of this
requires manual intervention on the part of the programmers. Since
garbage collection takes time, running it explicitly multiple times
may not be desirable in time-critical applications.
After the someMethod invocation in the following code, bar
still points to the String object associated with the literal string
``original'':
void someMethod (String foo) {
foo = "changed";
}
void caller () {
String bar = "original";
someMethod (bar);
// bar is unchanged...
}
We encountered this problem in ACE when we tried to implement the
class ACE_SOCK_Stream, which encapsulates the OS data
transfer mechanisms for sockets. The recv and
recv_n methods of C++ ACE rely on the message being
passed back to the caller by reference. Since Java only passes
references ``by value,'' this became problematic. To circumvent this limitation when converting ACE from C++ to Java, we identified the following four solutions:
CString. We then passed a reference to this wrapper
object that contained the actual String. Here's what the
CString class looks like:
public class CString
{
// Constructor.
public CString (String s) { s_ = s; }
// Set the underlying string.
public void rep (String s) { s_ = s; }
// Get the underlying string.
public String toString () { return s_; }
// Note, other type-specific constructors
// can be added.
private String s_;
}
With this change, the String object can now be changed
both by the caller and the callee. Thus, the first
someMethod example can now be rewritten as follows:
void someMethod (CString foo) {
foo.rep ("changed");
}
void caller () {
CString bar = new CString ("original");
someMethod (bar);
}
someMethod example can be rewritten as follows:
void someMethod (String[] foo) {
foo[0] = "changed";
}
void caller () {
String[] bar = new String[1];
bar [0] = "original";
someMethod (bar);
}
Although this syntax is somewhat ugly, it is relatively concise.
Moreover, we can generalize this approach to pass in multiple objects
by reference via a multiple-element array.
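As a small illustration of that generalization, the following sketch returns two values through a two-element array. The OutParams class and its splitEndpoint method are hypothetical names chosen for this example; they are not part of ACE.

```java
// Hypothetical helper that "returns" two strings through an array.
class OutParams {
    // Splits "host:port" and stores the host in out[0] and the port
    // text in out[1]. Returns false (leaving out untouched) if the
    // input has no colon or the array is too small.
    static boolean splitEndpoint(String endpoint, String[] out) {
        int colon = endpoint.indexOf(':');
        if (colon < 0 || out.length < 2)
            return false;
        out[0] = endpoint.substring(0, colon);
        out[1] = endpoint.substring(colon + 1);
        return true;
    }
}
```

A caller allocates the array once, e.g., String[] parts = new String[2], and reads parts[0] and parts[1] after the call. The downside is that nothing but convention documents which slot holds which value.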
StringBuffer instead of String to
pass the messages between the caller and the callee. Using this
solution, the above example can be rewritten as follows:
void someMethod (StringBuffer foo) {
foo.setLength (0);
foo.append ("changed");
}
void caller () {
StringBuffer bar =
new StringBuffer ("original");
someMethod (bar);
}
Note that this is the solution we used in implementing
SOCKStream in ACE Java because it
required the least amount of change to our original design.
Unfortunately, the numeric wrappers (e.g., class Double
and Integer) lack mutator operations. Therefore, they
aren't useful for returning scalar parameters by reference (despite
the claims made by certain Java books...). This forces programmers to
write many additional (and incompatible) wrapper classes that allow
these types to be passed by reference.
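A wrapper of the kind described above might look like the following sketch. MutableInt and RecvDemo are hypothetical names used only for illustration; the recv method here merely copies bytes between arrays and is not the ACE socket API.

```java
// Hypothetical mutable counterpart to Integer.
class MutableInt {
    private int value_;

    MutableInt(int value) { value_ = value; }

    int get() { return value_; }

    void set(int value) { value_ = value; }
}

class RecvDemo {
    // Hypothetical recv-style method: copies as many bytes as fit from
    // src into buf and reports the count through the bytesRead
    // out-parameter.
    static void recv(byte[] src, byte[] buf, MutableInt bytesRead) {
        int n = Math.min(src.length, buf.length);
        System.arraycopy(src, 0, buf, 0, n);
        bytesRead.set(n);
    }
}
```

Each project that writes its own MutableInt ends up with a type that other libraries don't recognize, which is the incompatibility noted above.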
String someMethod (String foo) {
return new String ("changed");
}
void caller () {
String bar = new String ("original");
// Call someMethod and reassign
// bar to the return value.
bar = someMethod (bar);
}
However, to generalize this approach to return multiple values by
reference requires defining even more helper classes.
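One such helper class is sketched below. RecvResult and ReturnDemo are hypothetical names, and the recv method simply truncates a string to stand in for a real data transfer.

```java
// Hypothetical helper that bundles two return values.
final class RecvResult {
    final String data;
    final int charsRead;

    RecvResult(String data, int charsRead) {
        this.data = data;
        this.charsRead = charsRead;
    }
}

class ReturnDemo {
    // Hypothetical recv-style method: returns at most maxChars
    // characters of wire together with the number returned.
    static RecvResult recv(String wire, int maxChars) {
        int n = Math.min(wire.length(), maxChars);
        return new RecvResult(wire.substring(0, n), n);
    }
}
```

Every distinct combination of returned values needs its own class of this kind, which is where the extra bookkeeping comes from.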
inline static functions can be used.
In contrast, adding an extra level of abstraction can increase the
cost of method calls in Java since it does not support explicit
inlining. There are various patterns for handling this problem using
the Sun JDK.
For instance, the JDK Java compiler does allow the user to specify
the "-O" flag, which does some optimization, but the user has no direct
control over this without using convoluted, non-portable techniques.
Whenever possible, methods (or entire classes) in Java should be
declared final to facilitate inlining.
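The following sketch shows what such a declaration looks like for a small value class. The TimeValue class here is a hypothetical simplification, not the Java ACE class of the same name. Declaring the class final means its methods cannot be overridden, which removes one obstacle to inlining; whether a given compiler or virtual machine actually inlines the calls is not guaranteed.

```java
// Hypothetical simplified time value; final so that no subclass can
// override its accessors.
final class TimeValue {
    private final long msec_;

    TimeValue(long msec) { msec_ = msec; }

    // Implicitly final because the enclosing class is final.
    public long msec() { return msec_; }

    // Returns a new value rather than mutating this one.
    public TimeValue plus(TimeValue other) {
        return new TimeValue(msec_ + other.msec_);
    }
}
```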
In general, Java doesn't provide standard hooks (like the C/C++
register keyword) to programmers to manually suggest
optimizations. Instead, Java relies on effective compiler
optimizations that, in theory, remove the need for programmers to
``hand-optimize'' performance. In practice, it remains to be seen
whether Java compilers can perform sufficient optimizations to compete
with the performance of C/C++ compilers.
The following are our recommendations to developers who plan to build systems in Java, either from scratch or by converting code originally written in C++:
The C++ and Java versions of ACE are freely available via the WWW at URL ACE.html.
[AWT:96] AWT Components, Source: http://java.sun.com/tutorial/ui/overview/components.html
[Coplien:92] James Coplien. Advanced C++ Programming Styles and Idioms. Reading, MA.: Addison-Wesley, 1992.
[Schmidt:94] Douglas C. Schmidt, ACE: an Object-Oriented Framework for Developing Distributed Applications, Proceedings of the 6th USENIX C++ Technical Conference, Cambridge, Massachusetts, April, 1994.
[Cargill:96] Tom Cargill, Specific Notification for Java Thread Synchronization, Proceedings of the 3rd Pattern Languages of Programming Conference, Allerton Park, Illinois, September, 1996.
[GoF:95] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Reading, MA: Addison-Wesley, 1995.
This page is maintained by Prashant Jain and Douglas C. Schmidt. If you have any comments or suggestions for improving this document, please send us email.
Last modified 18:06:18 CST 25 January 2019