Department of Computer Science
University of California at Davis
Davis, CA 95616
(530) 752-1953
matloff@cs.ucdavis.edu
Recently there has been considerable interest in software distributed shared memory (SDSM) systems [1]. SDSM is aimed at achieving ``the best of both worlds''--the clarity of the shared-memory programming model and the price/performance benefits of running parallel applications on message-passing hardware such as networks of workstations (NOW).
Several SDSM public-domain tools are available, such as Adsmith [7], CVM [5] and Quarks [6]. However, these are in one sense less suitable for instructional purposes, if the instructor wishes to teach about internal SDSM operation. In other words, if one wishes to explain to the students how this ``miracle'' is accomplished, in which we provide a shared-memory programming paradigm on top of non-shared-memory hardware, the packages listed above are too complex in their implementation to enable easy student access to the basic concepts.
With this in mind, I decided to develop my own package, aimed not necessarily at efficient operation but instead at facilitating simple and clear student access to the major implementation issues involved with SDSM systems. My idea was to provide the user with simple calls which request permission to access a shared-data item and which release that permission, largely as in the CRL package [4], implemented on top of MPI [8]. However, after writing a fairly large amount of code for this, I suddenly realized that I could do it much better by using Linda [3] as the base instead of MPI. It turned out that the use of the Linda paradigm allows a strikingly simple and elegant solution to the problem. I call the resulting package TupleDSM.
In TupleDSM the application programmer declares global variables in a C or C++ program and accesses them in the usual shared-memory fashion. The programmer thinks that the values being accessed are stored in those global variables, but in actuality they are stored in tuple space. This is completely transparent to the programmer; at no time is the programmer asked to deal with a tuple, and he/she never even hears the term tuple. The request/release-permission routines which are called by the programmer do internally access the tuples, but again transparently.
For instructional purposes, the student must play two roles. First, the student should play the role of a TupleDSM application programmer, writing some TupleDSM application programs. At this point, the instructor should avoid exposing the students to the Linda paradigm, so that the students understand the view of the TupleDSM application programmer. Later, the instructor can present material on Linda (see footnote 1), having the students write some Linda programs; once they have become accustomed to that, the instructor can present the internal details of the Linda implementation of TupleDSM.
To illustrate the application programmer and internal views of the system, suppose we have an application program App.c. The application programmer adds a statement
#include <DSMInclude.h>

to App.c, compiles it and links in libDSM.a. The user invokes the resulting binary, say App, by running a front-end program, specifying App and its arguments on the front end's command line. There would be no references to tuples at all in App.c.
Suppose App.c declares a global variable Sum. The programmer views Sum's value as being stored there in that global variable, though he/she must first request ``permission'' to access Sum, with a call to a TupleDSM routine such as DSMSharedReadRequest(). Internally, though, the latest value of Sum is actually stored in a tuple
("var","Sum",I,X,K)where I is an array index (I = 0 for scalars such as Sum), X is the latest value of Sum, and K is the number of current readers of Sum.
So, if the application source code includes, say,
Y += X;

it would be enclosed by TupleDSM calls:

DSMSharedReadRequest("X",&X,0);
Y += X;
DSMSharedReadRelease("X",&X,0);

This formulation makes the SDSM implementation elegant and simple. For example, the internal operation of DSMSharedReadRequest() is roughly:
1. perform a Linda in() operation to remove the tuple ("var","Sum",I,X,K) from the tuple space;

2. execute the assignment Sum = X;

3. perform a Linda out() operation to return the tuple ("var","Sum",I,X,K+1) to the tuple space, the incremented last component reflecting the addition of one more reader.

The Linda approach works particularly well in the implementation of DSMExclusiveWriteRequest(), which needs to wait until there are no readers. We use the Linda in() here on the tuple ("var","Sum",I,X,0); in() will block until a tuple whose last component is 0 exists, which is exactly what we need. The implementation of many other SDSM mechanisms, such as barriers and mutexes, is also made simpler and more elegant by the Linda approach.
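To make the preceding pseudocode concrete, here is a minimal sketch, in POSYBL notation, of how these routines might be coded, assuming int-valued shared variables; qlint() is POSYBL's ``formal'' macro, which receives the value of a matched tuple component, and POSYBL's declarations are assumed to be in scope. This is a sketch of the idea, not the actual TupleDSM source:

void DSMSharedReadRequest(char *VarName,int *Var,int I)
{  int X,K;
   /* atomically remove the variable's tuple; blocks if a writer
      currently holds it */
   in(lstring("var"),lstring(VarName),lint(I),qlint(&X),qlint(&K));
   Var[I] = X;  /* copy the latest value into the local global */
   /* put the tuple back, with the reader count incremented */
   out(lstring("var"),lstring(VarName),lint(I),lint(X),lint(K+1));
}

void DSMSharedReadRelease(char *VarName,int *Var,int I)
{  int X,K;
   /* decrement the reader count, leaving the value unchanged */
   in(lstring("var"),lstring(VarName),lint(I),qlint(&X),qlint(&K));
   out(lstring("var"),lstring(VarName),lint(I),lint(X),lint(K-1));
}

/* the actual 0 in the template makes in() block until the reader
   count is exactly 0; removing the tuple then gives us exclusivity */
void DSMExclusiveWriteRequest(char *VarName,int *Var,int I)
{  int X;
   in(lstring("var"),lstring(VarName),lint(I),qlint(&X),lint(0));
   Var[I] = X;
}

void DSMExclusiveWriteRelease(char *VarName,int *Var,int I)
{  /* deposit the newly written value, with no readers */
   out(lstring("var"),lstring(VarName),lint(I),lint(Var[I]),lint(0));
}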
The shared-variable tuples are created at the outset of the program by user calls to DSMVarInit(), one call for each shared variable. (No such call is made for global variables which are not shared.) For instance, for the variable Sum in our example above, the user program would have a line
DSMVarInit("Sum",1);which tells TupleDSM that we have a shared variable named Sum, consisting of one element.
Also at the outset of the program, the user code has processing node 0 issuing calls to DSMMutexCreate() and DSMBarrierCreate() to set up ("mutex",...) and ("bar",...) tuples for the mutexes and barriers used by the program.
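The mutex tuples lend themselves to the classic Linda lock idiom, in which holding the lock means holding the tuple. Here is a sketch of one natural implementation; the internal details shown are illustrative, not necessarily the actual TupleDSM source:

void DSMMutexCreate(int MutexID)
{  out(lstring("mutex"),lint(MutexID));  /* the lock starts out free */
}

void DSMMutexBegin(int MutexID)
{  /* grab the lock tuple; in() blocks while another node holds it */
   in(lstring("mutex"),lint(MutexID));
}

void DSMMutexEnd(int MutexID)
{  out(lstring("mutex"),lint(MutexID));  /* release the lock */
}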
The user program receives command-line arguments, individual node number and so on from the front-end program, in the form of tuples processed by the user's call to DSMEnvInit().
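The exact layout of these environment tuples is internal to TupleDSM, but one plausible form, shown here purely for illustration and assuming POSYBL's qlstring() formal, would have the front end deposit one tuple per command-line argument:

/* in the front end, one out() per argument of App, e.g.: */
out(lstring("argv"),lint(1),lstring("10000"));

/* in DSMEnvInit() on each node; rd() leaves the tuple in place,
   so that all nodes can read it: */
rd(lstring("argv"),lint(1),qlstring(&DSMArgv[1]));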
Here are excerpts from a TupleDSM application program, showing various declarations and calls:
#include <DSMInclude.h>

#define UPDATE_COUNT 1
#define JOIN 1
...
int Composite[1000];
...

Worker()
{  int First,Last,Iter,NIters,I,Node,Block,MyComposites = 0;
   ...
   DSMNonExclWriteRequest("Composite",Composite,I);
   Composite[I] = 1;
   DSMNonExclWriteRelease("Composite",Composite,I);
   ...
   DSMMutexBegin(UPDATE_COUNT);
   Composite[0] += MyComposites;
   DSMMutexEnd(UPDATE_COUNT);
}

main(int argc,char **argv)
{  DSMEnvInit();
   if (DSMNodeNum == 0)  {
      DSMVarInit("Composite",MAX_N);
      DSMMutexCreate(UPDATE_COUNT);
      DSMBarrierCreate(JOIN);
   }
   N = atoi(DSMArgv[1]);
   Chunk = atoi(DSMArgv[2]);
   Worker();
   DSMBarrier(JOIN);
   if (DSMNodeNum == 0)  {
      DSMSharedReadRequest("Composite",Composite,0);
      DSMSharedReadRelease("Composite",Composite,0);
      printf("number of primes = %d\n",N-Composite[0]-1);
   }
   DSMExit();
}
TupleDSM could easily be adapted to any implementation of Linda, but the one chosen here was POSYBL [9]. POSYBL has the advantage of being easy to install, as it does not require any other software as a base; the Glenda implementation, for instance, requires PVM.
Although TupleDSM was not designed for efficiency, the POSYBL documentation goes into some depth about the efficiency considerations which entered into POSYBL's design. This presents yet another opportunity for instructors, who can discuss the implications of those design decisions for TupleDSM's performance.
The effort involved in adapting TupleDSM to another implementation of Linda would be minimal. The main effort would consist of adapting from one tuple format to another. Consider for instance the tuple
("x",2)In POSYBL this would be represented in the source code for TupleDSM internals as
(lstring("x"),lint(2))where lstring() and lint() are macros which are used to produce the actual tuples; there is one macro for each type, e.g. the char * (``string'') and int types seen here. By contrast, the Glenda implementation of Linda [10] uses Gelertner's original tuple format (the Glenda package includes its own preprocessor), so the TupleDSM file DSMCalls.c would have to be changed accordingly, perhaps automated by a perl script.
This type of conversion would also be needed for p4-linda [2]. In addition, p4-linda differs from POSYBL and Glenda (but is similar to Gelernter's original version) in that it sets up multiple processes at the function level rather than at the program level. To adapt to this, the front-end program in TupleDSM would be changed to a function, which would be called at the beginning of every user program.
TupleDSM was originally implemented in prototype form in a classroom setting in Spring 1998, and is available on the Web at http://heather.cs.ucdavis.edu/~matloff/tdsm.html. I am currently adding some new features, such as the debugging aid described below. However, the educational nature of TupleDSM implies that, unlike in a production-oriented package, the number of features should be kept limited, in order to leave room for the instructor to ask the students to add new capabilities, such as different memory consistency models.
One new feature currently being added is a primitive debugging capability. The user will be able to type ctrl-c to the front-end program, generating a SIGINT signal which results in the application program stopping at all nodes at a ``safe'' time, giving the user a chance to query the values of the shared variables. To implement this feature, we have the front-end program initially execute out() with an ``OK to proceed'' tuple, which is read via a Linda rd() by each access-request call before proceeding with the request, and by each access-release call after completing the request. When the user types ctrl-c, the front end removes the ``OK to proceed'' tuple using in(), causing the request/release routines to block. The user can then inspect the values of the variables of interest, as well as some unusual debugging information, such as the number of current readers a variable has. After the user is done, the front end restores the tuple using out(), and execution resumes. Again, this is only a primitive facility, but it provides more opportunities for the instructor to assign homework in which the students modify the facility. Once again, the use of tuple space is key to making such modification easy.
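Here is a sketch of the gating protocol just described; the tuple name "okgo" is hypothetical:

/* in the front end, at startup: all nodes may proceed */
out(lstring("okgo"));

/* at the top of each access-request routine, and at the bottom of
   each access-release routine; rd() leaves the tuple in place, but
   blocks if the tuple is absent: */
rd(lstring("okgo"));

/* in the front end's SIGINT handler: */
in(lstring("okgo"));   /* all nodes stall at their next rd() */
/* ... query ("var",...) tuples via rd(), showing the user the
   values and current reader counts ... */
out(lstring("okgo"));  /* restore the tuple; execution resumes */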
[4] K.L. Johnson, M.F. Kaashoek and D.A. Wallach. ``CRL: High-Performance All-Software Distributed Shared Memory,'' Proceedings of the Fifteenth Symposium on Operating Systems Principles, December 1995.
[10] B. Seyfarth, J. Bickham and M. Arumughum. Glenda Installation and Use, http://sushi.st.usm.edu/~seyfarth/research/glenda.html.
1 The presentation of the Linda parallel-processing paradigm is itself timely these days, as its tuple-space approach is the basis of the new Jini technology being promoted by Sun Microsystems for appliance drivers.