



Special effects created by high-end computer graphics hardware awe almost everyone. Jurassic Park is a recent and widely seen example. Less theatrical applications are just as worthy of 3d graphics, and many even require the third dimension. Examples include medical imaging, molecular modeling, scientific data visualization, and visualization of detailed information hierarchies. This article briefly describes an architecture for interactive 3d applications, discusses implementation issues and suggestions for that architecture, and describes the Qd3d library that serves as the foundation for such an architecture.
At the center of any 3d application is the organization of the data being modeled. Examples of application specific models include simple geometry information describing a building, a kinetic simulation of a human gymnast, a vector field interpolation of engineering data, and so on. There are three key things to remember regarding the application specific model. First, application code is responsible for saving, organizing, and retrieving the essence of the model. Second, application code breaks the "draw yourself" message down into the 3d drawing primitives recognized by the 3d renderer. Third, to decrease module cross coupling, the model rendering code rarely concerns itself with the 3d viewing parameters.
Separating the lighting model, model editing tools, and view controller elements from the model itself is important from a flexibility and maintainability standpoint. By keeping the lighting model separate, it becomes feasible to have a default high-speed lighting model that can be swapped out with a slower but more realistic lighting model. The model need only tag different elements with attributes such as "sand," "copper," or "red." These attributes are then passed to the lighting model which in turn checks viewing parameters and calculates a suitable RGB color.
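One way to picture this separation is a small interface that the model never looks behind. The following is a minimal C++ sketch under stated assumptions: the names LightingModel, FlatLighting, and DiffuseLighting are hypothetical and are not Qd3d classes, the color table is invented for illustration, and a single intensity value stands in for whatever the viewing parameters would actually contribute.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical types for illustration; none of these are Qd3d classes.
struct RGB { double r, g, b; };

// The model only tags elements with attribute names. A lighting model turns
// a tag plus a view-dependent intensity into an RGB color.
class LightingModel {
public:
    virtual ~LightingModel() = default;
    // intensity in [0, 1] stands in for the result of checking viewing parameters.
    virtual RGB Shade(const std::string& tag, double intensity) const = 0;
};

// Fast default: one constant color per tag, intensity ignored.
class FlatLighting : public LightingModel {
public:
    RGB Shade(const std::string& tag, double) const override {
        return BaseColor(tag);
    }
protected:
    static RGB BaseColor(const std::string& tag) {
        static const std::map<std::string, RGB> table = {
            {"red", {1.0, 0.0, 0.0}},
            {"copper", {0.72, 0.45, 0.20}},
            {"sand", {0.76, 0.70, 0.50}},
        };
        auto it = table.find(tag);
        return it != table.end() ? it->second : RGB{0.5, 0.5, 0.5};  // gray fallback
    }
};

// Slower stand-in for a more realistic model: scales the base color by intensity.
class DiffuseLighting : public FlatLighting {
public:
    RGB Shade(const std::string& tag, double intensity) const override {
        assert(intensity >= 0.0 && intensity <= 1.0);
        RGB c = BaseColor(tag);
        return {c.r * intensity, c.g * intensity, c.b * intensity};
    }
};
```

Because the model code holds only a LightingModel pointer, swapping the fast model for the slower one is a single pointer assignment and requires no change to the "draw yourself" code.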
Even more critical is the flexibility required for the editing tools used for model manipulation. Editing tools often interface to real devices such as the Z Mouse from Multipoint Technology Corporation [Multi93] and the Mattel Power Glove [Glove93]. Alternatively, editing tools may interface to virtual 3d control devices such as the triad mouse [NiO86], or a virtual sphere controller [Chen93]. Editing tools may also take advantage of application specific conditions. For example, if a 3d object's movement is restricted to a plane, 2d desktop mouse movement can be mapped directly to that plane. Query tools such as "tell me the name of the object I'm pointing at" or "what is the height value at this point on the surface" also qualify as editing tools. Finally, users must be able to switch easily between editing tools, much as a draw program lets users switch between translation and rotation operations. Only by separating editing tool code from the model can these manipulation requirements be met.
At the bottom of Figure 1 lies the 3d renderer. The renderer is responsible for creating the image resulting from the primitives called out by the model "draw yourself" code. Editing tools and the view controller also use the renderer to provide 3d manipulation feedback. Finally, the renderer maintains and manipulates the viewing parameters under direction from the view controller.
Key to the 3d application framework of Figure 1 is the lack of specifics and constraints. Swapping out different elements of the architecture, such as the lighting model, has already been discussed. In addition, the lighting model, the model editing tools, and the view controller are optional. When working entirely with wireframes, there is little need for a lighting model. If the model is static, there is no need to provide editing tools. And, if the application is a "fly through" visual explorer, the application is the view controller. Also, the diagram does not preclude multiple view windows from sharing a single application model. Simultaneously viewing a model from multiple directions helps with understanding and manipulation issues. Another consideration is that multiple hardware input devices could feed into the system at once. Finally, nothing prevents this 3d application framework from being built on top of favorite application frameworks such as MacApp, the Think Class Library (TCL), or Bedrock. For example, "viewing window" could be an item in a MacApp or TCL view hierarchy.
Qd3d has Macintosh roots reaching back to the Fall of 1988. After existing as an in house development tool, Qd3d went commercial in the Spring of 1992 with version 1.2. Since that time, individuals and institutions have used Qd3d in applications such as sports medicine, geology, molecular modeling, information sciences, robotics, and scientific visualization. The most recent version, 2.1, was released in September 1993 and features support for the Cyberscope stereoscopic viewer from Simsalabim Systems.
Similar to QuickDraw's rendering of 2d primitives through the GrafPort pointed to by "thePort," Qd3d sends its 3d primitives through "the3dPort." Unlike QuickDraw's thePort structure pointer, the3dPort is an object pointer of type CQd3dPort. As a result, access to Qd3d viewing parameters, such as viewer location and viewing direction, is cleanly wrapped in messages defined for the CQd3dPort class. In addition, all aspects of 3d rendering can be specialized or augmented by subclassing, overriding, and inheriting.
CQd3dPorts maintain six rendering options which affect the appearance of every primitive. These rendering options dramatically modify rendering speed and quality and include OnlyQD, Wireframe, UseZBuff, DepthCue, and clipping level. OnlyQD controls whether Qd3d quickly renders color varying primitives as color constant. Wireframe maps polygon fill primitives to polygon frame primitives. Clipping level controls how accurately Qd3d clips primitives to the boundary of the viewing window. The default clipping level clips primitives so they fit just within the view window. In contrast, the drastic clipping level entirely skips primitives that cross the boundary of the viewing area. UseZBuff requires OnlyQD to be off and uses a "z-buffer" the same size as the image for hidden surface removal. Finally, DepthCue causes distant primitive colors to be blended with the scene background color but leaves primitives closer to the viewer at their original color. This background color blending helps with depth perception. Manipulating these rendering options allows the same model rendering code to be used for quick OnlyQD previews and for final Gouraud-shaded z-buffered renderings.
Text placement primitives render 2d text at the projection point of a 3d location. Text rendering options and attributes include: left, right, and center justification; whether the text honors depth cueing; automatic font scaling with respect to distance from the viewer; whether text is removed if its area is greater than the projection area of a 3d object; and whether all text should be skipped.
CQd3dPort supports parallel and perspective projections. The familiar perspective projection renders close objects larger than more distant objects of the same size. In contrast, parallel projections appear flat because objects of the same size are rendered at the same size regardless of distance from the viewer. Parallel projections appear unnatural but have engineering applications that perspective projections cannot support.
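The difference between the two projections can be illustrated with a few lines of C++. This is only a sketch under assumed conventions (viewer at the origin looking down the positive z axis, image plane at distance d); the function names are hypothetical and say nothing about how CQd3dPort implements its projections.

```cpp
#include <cassert>

struct Point3 { double x, y, z; };
struct Point2 { double x, y; };

// Perspective: screen position shrinks with distance from the viewer.
// Assumed convention: viewer at the origin, image plane at z = d.
Point2 ProjectPerspective(const Point3& p, double d) {
    assert(p.z > 0.0);  // the point must lie in front of the viewer
    return {p.x * d / p.z, p.y * d / p.z};
}

// Parallel: the depth coordinate is simply dropped, so distance has no effect.
Point2 ProjectParallel(const Point3& p) {
    return {p.x, p.y};
}
```

Under these assumptions, a point at (1, 1, 2) and a point at (1, 1, 4) land on the same screen location in the parallel projection, while the perspective projection places the more distant point half as far from the center of the image.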
Qd3d also supports stereoscopic projections with the CStereo3dPort and CCyberscope3dPort subclasses. Depth perception is a function of the visual cortex comparing minor differences between left and right eye views. Unfortunately, the 2d nature of CRT monitors strips 3d graphics of their depth information. This becomes acute in scenes that are unfamiliar to users or when the refresh rate of view reorientation is low. Using stereoscopic techniques restores depth perception by creating and presenting a different image to each of the left and right eyes.
CStereo3dPort acts as a template for Qd3d stereoscopic projections and provides for straight-eyed and cross-eyed side-by-side stereoscopic image pairs. Figure 2 is an example of a cross-eyed side-by-side stereo image pair.
To view Figure 2, hold the images squarely to the line of sight. Look cross-eyed at the image pair and four images will appear. Vary cross-eyedness so the two inner images overlap. Concentrate on the center image until it becomes focused. Practice. Side-by-side stereo images require no additional gadgets for viewing, but beginners find it difficult to fuse image pairs, and such viewing causes eye strain and fatigue.
An economical device for recombining stereo image pairs, priced on the order of US $180, is Simsalabim's Cyberscope [Sims93]. The Cyberscope is a hood velcroed to the front of a regular computer monitor. Looking straight through the Cyberscope reveals the monitor screen as before with the exception of a vertical divider to separate left and right eye views. Looking down in the Cyberscope allows its front surfaced mirrors to recombine left and right eye views rotated on the monitor into a stereoscopic image. Advantages of the Cyberscope over other 3d viewing methods include no need to go cross-eyed, full color, increased depth perception from rotated images being wider than they are tall, and no flicker from shuttered LCD glasses. The CCyberscope3dPort class, included with Qd3d, creates the rotated images as required for the Cyberscope.
The SmartPane library supplements the existing TCL Pane hierarchy with offscreen image buffering that reduces flicker. SmartPane also adds animation support and QuickTime movie recording of images.
The idea of referencing vertices by index is important from a rendering performance standpoint. Transformation and projection of a world coordinate point to screen coordinates is computationally expensive. By using the vertex indexing scheme during polyhedron rendering, each vertex can be transformed and projected once into a cache of screen coordinates. As each individual polygon is rendered, the screen coordinates of the associated vertices are looked up in the screen coordinate cache. Without this cache, rendering a polyhedron can result in approximately 4 to 6 times as many transformation and projection operations as necessary.
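The caching idea can be sketched in C++ as follows. This is not the Hedra implementation; Polyhedron, Projector, and ProjectPolyhedron are hypothetical names, and the projection is a simple stand-in that also counts how often it is called so the saving can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };
struct Point2 { double x, y; };

// A polygon is a list of indices into the shared vertex array.
using Polygon = std::vector<std::size_t>;

struct Polyhedron {
    std::vector<Point3> vertices;
    std::vector<Polygon> polygons;
};

// Stand-in for the expensive transform-and-project step; counts its calls.
struct Projector {
    int calls = 0;
    Point2 Project(const Point3& p) {
        ++calls;
        return {p.x / p.z, p.y / p.z};
    }
};

// Projects every vertex exactly once into a cache, then builds each polygon's
// screen outline by looking its vertex indices up in that cache.
std::vector<std::vector<Point2>> ProjectPolyhedron(const Polyhedron& h, Projector& proj) {
    std::vector<Point2> cache;
    cache.reserve(h.vertices.size());
    for (const Point3& v : h.vertices) cache.push_back(proj.Project(v));

    std::vector<std::vector<Point2>> outlines;
    for (const Polygon& poly : h.polygons) {
        std::vector<Point2> outline;
        for (std::size_t index : poly) {
            assert(index < cache.size());
            outline.push_back(cache[index]);
        }
        outlines.push_back(outline);
    }
    return outlines;
}
```

For a tetrahedron, the four triangles reference twelve vertices in total, yet only four projections are performed.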
Qd3d includes a meta-primitive library called Hedra. Hedra implements the polyhedron description and screen coordinate caching scheme just described. Hedra is fully described in Qd3d documentation [Viv93] and an example usage is given in [Hess92b]. Hedra suffices for very simple modeling requirements but begins to fail when polygonal rendering is used to approximate curved surfaces.
Figure 3 depicts a wireframe polygonal triangular Bézier patch approximation. Composite triangles are rendered triangle by triangle and row by row. For performance reasons the Bézier patch points, the vertices of the triangles, are cached at world coordinate and screen coordinate levels. World coordinate patch points are cached because their calculation is expensive. Screen coordinate points are cached to avoid the redundant transformations and projections previously noted. Colors at triangle vertices are also cached and are used when Gouraud shading the patch. Pseudo code for rendering the patch:
get points, normals, colors, & screen coordinates for row 0 points
for every row i
    get points, normals, colors, & screen coordinates for row i+1 points
    for every triangle j in row i
        fix parameter block cache indices
        render triangle (call PBPoly3dPrim)
    set caches for next row
To avoid the use of a z-buffer and still have realistic pictures, painter's algorithm techniques are recommended. Briefly stated, the painter's algorithm renders the polygons most distant in the scene first, followed by polygons closer to the viewer. When a scene is completed, the closest polygons have been rendered most recently and have overwritten more distant polygons. However, the painter's algorithm will fail when polygons intersect or when individual polygons span a large range of depths.
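In its simplest form, the painter's algorithm amounts to a sort on distance from the viewer. The C++ sketch below assumes each polygon can be summarized by one representative depth (for example, the depth of its centroid), which is exactly the simplification that breaks down in the failure cases just mentioned. The Polygon struct and PainterOrder function are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Polygon {
    int id;        // stands in for the polygon's geometry
    double depth;  // representative distance from the viewer
};

// Returns polygon ids in drawing order: most distant first, closest last.
std::vector<int> PainterOrder(std::vector<Polygon> polys) {
    // Polygons behind the viewer are assumed to have been clipped already.
    for (const Polygon& p : polys) assert(p.depth >= 0.0);

    std::stable_sort(polys.begin(), polys.end(),
                     [](const Polygon& a, const Polygon& b) { return a.depth > b.depth; });

    std::vector<int> order;
    for (const Polygon& p : polys) order.push_back(p.id);
    return order;
}
```

A stable sort is used so that polygons at equal depth keep the order in which the model emitted them.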
Again consider the triangular Bézier patch. As depicted in Figure 3, the rendering order honors the painter's algorithm if the point C is closest to the viewer, then the point B, and finally A. If point A were closest to the viewer, the labeled rendering order would exhibit hidden surface rendering errors. A solution for the triangular Bézier patch is to modify the rendering order depending on the relative distances of A, B, and C from the viewer. For triangular Bézier patches this reordering may still fail if the patch is not well behaved.
retrieve and store the existing transform matrix (CQd3dPort::GetTMat)
create the matrix mapping local object coordinates to destination world coordinates
multiply the mapping matrix and the stored matrix with the stored matrix on the right
set the Qd3d transform matrix to the new matrix (CQd3dPort::SetTMat)
generate the primitives in local coordinates for the transform based object
restore the original transform matrix (CQd3dPort::SetTMat)
For complete transform control, CQd3dPort::Transform could be overridden.
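The steps above can be sketched in C++ against a mock port. The text names GetTMat and SetTMat but does not give their signatures, so the MockPort class, the Matrix4 type, and the row-vector convention (p' = p * M) used here are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>

// 4x4 matrix for row vectors: p' = p * M. Assumed convention, not Qd3d's.
using Matrix4 = std::array<std::array<double, 4>, 4>;

Matrix4 Multiply(const Matrix4& a, const Matrix4& b) {
    Matrix4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Matrix4 Translation(double x, double y, double z) {
    Matrix4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    m[3][0] = x; m[3][1] = y; m[3][2] = z;
    return m;
}

// Mock port exposing only the two calls named in the steps; signatures are guesses.
class MockPort {
public:
    MockPort() : tmat_(Translation(0.0, 0.0, 0.0)) {}
    Matrix4 GetTMat() const { return tmat_; }
    void SetTMat(const Matrix4& m) { tmat_ = m; }
private:
    Matrix4 tmat_;
};

// Runs drawLocal with the port's transform temporarily extended by localToWorld.
template <typename DrawFn>
void DrawTransformed(MockPort& port, const Matrix4& localToWorld, DrawFn drawLocal) {
    const Matrix4 saved = port.GetTMat();         // retrieve and store
    port.SetTMat(Multiply(localToWorld, saved));  // mapping * stored, stored on the right
    drawLocal(port);                              // primitives in local coordinates
    port.SetTMat(saved);                          // restore the original matrix
}
```

Wrapping the save and restore in one function keeps the restore from being forgotten when several transform based objects are drawn in sequence or nested inside one another.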
What are key concepts and traits for a 3d modeling class hierarchy? Usage, composing, containment, processing, and easy expansion.
Usage implies that different types of elements in a model should be accessed and manipulated in similar ways. For example, if a manipulation tool keeps track of selected elements in a selection list, the tool should be able to send list elements "move this direction with this distance" messages and all list elements should respond accordingly. Stated another way, the base class of the modeling hierarchy should define a minimal set of operations that all elements respond to.
Composing and containment relate to building higher level elements from compositions of multiple sub-elements. For example, a "human" is a composite of "head," "abdomen," and 4 "limb" elements. In turn, limbs are composed of "long bone" elements, and so on. Containment implies that the union of the volumes of constituent elements is completely enclosed within the bounding volume of the composed element. Because of this containment property, ray intersections and other volume space queries need not be performed for every single atomic element in a model. If a ray does not intersect the bounding volume of a composite element, the ray cannot possibly intersect one of the contained elements. This type of trivial rejection for queries is important for performance in element and user interaction. Composing does not imply containment because a reference element, such as an alignment plane, could be used in the composite definition of an element. Nevertheless, that composite element does not "contain" the reference element.
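Trivial rejection through containment can be sketched in C++ as below. Bounding spheres are an assumption chosen for brevity (the text does not say which bounding volume to use), and Element, Sphere, and Query are hypothetical names rather than part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };
double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

struct Ray { Vec3 origin, dir; };  // dir is assumed to be unit length

struct Sphere {
    Vec3 center;
    double radius;
    bool HitBy(const Ray& r) const {
        assert(std::fabs(Dot(r.dir, r.dir) - 1.0) < 1e-9);
        Vec3 oc = Sub(center, r.origin);
        double t = Dot(oc, r.dir);                           // closest approach along the ray
        if (t < 0.0) return Dot(oc, oc) <= radius * radius;  // center lies behind the origin
        return Dot(oc, oc) - t * t <= radius * radius;       // squared miss distance vs radius
    }
};

// An element whose bounding sphere encloses the bounds of all its children.
struct Element {
    Sphere bounds;
    std::vector<Element> children;
};

// Collects the leaf elements whose bounds the ray hits. `tested` counts bounding
// tests so the effect of skipping whole subtrees can be observed.
void Query(const Element& e, const Ray& ray, std::vector<const Element*>& hits, int& tested) {
    ++tested;
    if (!e.bounds.HitBy(ray)) return;  // trivial rejection of the entire subtree
    if (e.children.empty()) { hits.push_back(&e); return; }
    for (const Element& child : e.children) Query(child, ray, hits, tested);
}
```

When a composite's bounding sphere is missed, none of its children are ever examined, which is where the savings for interactive queries come from.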
Containment expedites implementation of the painter's algorithm for hidden surface removal. Draw messages are sent to contained elements in back to front order. If the sub-elements also honor the painter's algorithm, a quick hidden surface removal algorithm for the model is complete.
A strict tree composition hierarchy suffices for most models. In tree hierarchies display and query functions can be implemented with simple recursion. However, if a model allows more than a tree hierarchy, such as two composite elements referencing another element for alignment, simple recursion will result in redundant processing or even infinite recursion. Therefore, "processing" refers to the necessary mechanisms to prevent such redundancy and infinite recursion. For a more detailed description of processing issues for a specific application see [Hess92a].
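One common mechanism for this kind of "processing" is a visited set, sketched below in C++. This is a generic technique offered as an illustration, not necessarily the approach taken in [Hess92a]; the Element struct and Process function are hypothetical.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical element that may reference others, possibly forming cycles.
struct Element {
    int id;
    std::vector<Element*> references;  // sub-elements and alignment references
};

// Visits every reachable element exactly once in depth-first order.
void Process(Element* e, std::unordered_set<const Element*>& visited, std::vector<int>& order) {
    assert(e != nullptr);
    if (!visited.insert(e).second) return;  // already processed: stop here
    order.push_back(e->id);
    for (Element* ref : e->references) Process(ref, visited, order);
}
```

The visited set costs one lookup per reference, which is the kind of overhead a strictly tree based model can avoid by using plain recursion.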
The final aspect of a class hierarchy for 3d modeling is the most vague. Creating a new element class should require as little overriding and as few new methods as possible. Ideally, many of the details of maintaining usage, composing, containment, and processing are already taken care of and supported by the existing element hierarchy. When this support is in place, the element hierarchy provides for "easy expansion."
Many application models do not require all the key concepts and traits discussed here. As previously stated, an application model that is strictly tree based does not require an involved mechanism for processing. Trying to implement processing support when simple recursive techniques will do clutters and unnecessarily complicates the application. If an application only performs quality rendering and has no interaction, there may be little need for implementing the containment concept. In short, the model hierarchy need only be as sophisticated as the application requires. Anything more than that becomes a performance, maintenance, and development liability.
One archaic method for placing points in a 3d scene involves the use of at least two different projections of the 3d scene. Operators are required to indicate a 2d point in each projection and those 2d points are then converted into corresponding 3d rays. The 3d point specified is the intersection of the rays or the midpoint of the shortest segment between the rays.
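The midpoint computation can be written out in C++ as follows. The sketch treats both rays as infinite lines and uses the standard closest-points solution for two lines; the names Ray and PointFromTwoRays are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
Vec3 Add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 Scale(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };

// Computes the midpoint of the shortest segment between the two lines
// origin + s * dir. Returns false when the lines are (nearly) parallel,
// because no unique closest pair of points exists in that case.
bool PointFromTwoRays(const Ray& r1, const Ray& r2, Vec3& out) {
    Vec3 w = Sub(r1.origin, r2.origin);
    double a = Dot(r1.dir, r1.dir);
    double b = Dot(r1.dir, r2.dir);
    double c = Dot(r2.dir, r2.dir);
    double d = Dot(r1.dir, w);
    double e = Dot(r2.dir, w);
    double denom = a * c - b * b;
    if (std::fabs(denom) < 1e-12) return false;

    double s = (b * e - c * d) / denom;  // parameter of the closest point on r1
    double t = (a * e - b * d) / denom;  // parameter of the closest point on r2
    Vec3 p1 = Add(r1.origin, Scale(r1.dir, s));
    Vec3 p2 = Add(r2.origin, Scale(r2.dir, t));
    out = Scale(Add(p1, p2), 0.5);
    return true;
}
```

When the two rays actually intersect, the closest points coincide and the midpoint is the intersection itself, so the one routine covers both cases mentioned above.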
A more elegant solution is the triad mouse [NiO86]. Triad mouse operation divides 2d mouse movement into six regions of directional movement. The six directions correspond to the positive and negative directions of the projections of the 3d x, y, and z axes. Relative mouse movement in one of the six 2d directions is mapped and restricted to a relative 3d movement in the associated 3d axis direction. Using the triad mouse also requires "triad cursor" feedback as a visual cue for the six mouse directions. Mapping 2d movements to visually corresponding 3d movements gives the triad mouse an intuitive characteristic unlike other 2d methods for 3d manipulation. Unfortunately, triad mouse operation becomes awkward when one of the 3d axes becomes perpendicular to the screen or when the projections of two of the axes become nearly collinear.
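One plausible way to pick among the six directions is to choose the axis whose screen projection is best aligned with the mouse movement. The C++ sketch below takes that approach; it is an interpretation for illustration only, not the algorithm from [NiO86] or from Qd3d, and TriadMove and MapMouseToAxis are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

struct TriadMove {
    int axis;       // 0 = x, 1 = y, 2 = z; -1 if no axis could be chosen
    double amount;  // signed distance along that 3d axis
};

// axes2d[i] is the screen projection of a unit step along 3d axis i.
// Picks the axis whose projection is most nearly parallel to the mouse delta.
TriadMove MapMouseToAxis(const Vec2 axes2d[3], Vec2 delta) {
    TriadMove best{-1, 0.0};
    double bestAlign = 0.0;
    double deltaLen = std::hypot(delta.x, delta.y);
    if (deltaLen == 0.0) return best;

    for (int i = 0; i < 3; ++i) {
        double len = std::hypot(axes2d[i].x, axes2d[i].y);
        if (len < 1e-9) continue;  // axis is perpendicular to the screen: unusable
        double dot = delta.x * axes2d[i].x + delta.y * axes2d[i].y;
        double align = std::fabs(dot) / (len * deltaLen);  // |cosine| of the angle
        if (align > bestAlign) {
            bestAlign = align;
            best = {i, dot / (len * len)};  // the sign selects + or - direction
        }
    }
    return best;
}
```

The two awkward cases noted above show up directly in this sketch: an axis perpendicular to the screen has a zero-length projection and is skipped, and two nearly collinear projections produce almost identical alignment scores, so small hand tremors can flip the chosen axis.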
Another strength of the Macintosh user interface is the lack of modes and the automatic transition between modes [App93]. If a 3d application supports pull down menus and 3d point specification using a triad mouse, there is an inherent mode transition between 2d and triad mouse behavior. If the user must make a conscious effort for this mode transition, such as issuing a command key solely to start triad operation and another command solely to return to 2d mouse operation, it is likely an example of a poor user interface.
The author's favorite editing tool scheme allows 3d model element selection using the 2d mouse. When a user starts dragging on an element the application interface naturally presumes the user wishes to translate (move) the element in three dimensional space. The mode transition to triad mouse operation is inherently implied, automatically made, and triad mouse operation begins. After the user has specified the 3d translation with accumulated triad mouse movement, the user releases the mouse, the 3d translation is applied, and mouse operation returns to 2d mode.
A 3d application user interface may also be streamlined by taking advantage of application specific constraints. For example, if a user needs to place points on a 3d sphere, there is no need to attempt to use a triad mouse or some scheme of mapping mouse x and y movements to longitude and latitude coordinates on the sphere. Simply use the closest intersection of the sphere with the 3d ray corresponding to a normal clicked 2d mouse location.
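The sphere case reduces to a standard ray and sphere intersection. A minimal C++ sketch follows, assuming the 3d ray for the clicked location has already been obtained and that its direction is unit length; NearestSphereHit is a hypothetical helper, not a Qd3d call.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Finds the intersection of the ray with the sphere that is nearest the ray
// origin. Returns false if the ray misses or the sphere lies behind the origin.
bool NearestSphereHit(Vec3 origin, Vec3 dir, Vec3 center, double radius, Vec3& hit) {
    assert(std::fabs(Dot(dir, dir) - 1.0) < 1e-9);  // dir must be unit length

    // Solve |origin + t * dir - center|^2 = radius^2 for t.
    Vec3 oc = Sub(origin, center);
    double b = Dot(oc, dir);
    double c = Dot(oc, oc) - radius * radius;
    double disc = b * b - c;
    if (disc < 0.0) return false;  // the ray misses the sphere

    double root = std::sqrt(disc);
    double t = -b - root;          // nearer of the two solutions
    if (t < 0.0) t = -b + root;    // origin is inside the sphere
    if (t < 0.0) return false;     // sphere is entirely behind the origin

    hit = {origin.x + t * dir.x, origin.y + t * dir.y, origin.z + t * dir.z};
    return true;
}
```

Taking the smaller non-negative root is what makes the placed point land on the side of the sphere facing the viewer.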
Since conversion of 2d screen locations to world coordinate 3d rays requires access to viewing parameters and transformations, such functionality should be a feature of the foundation 3d library. This functionality is needed to isolate manipulation code from the details of viewing parameters and associated coordinate transformations. Similar arguments apply to the interplay between the 3d axes of the triad mouse and their 2d projections. Qd3d answers these requirements with the GetXRay and UpdateTriadPosition methods. The stereo port subclasses also override these CQd3dPort methods so they function properly with stereoscopic projections. As a result, manipulation code works correctly whether it is used with a mono or a stereo projection.
Consider Figure 4 from [Hess92a] in light of the automatic 2d and 3d mouse mode switching.
The bubbles represent different states of the manipulation tool and the arcs represent different events that can occur. Arcs may also associate processing actions to events such as changing state information and providing user feedback. For example, the "MouseMove" arc of the neutral state could adjust the cursor shape to provide feedback on the type of element underneath the cursor. Using state diagrams is very useful to designers for communicating thoughts and considering interface alternatives. Techniques for finite state machines from theoretical computer science also allow the diagrams to be analyzed for unreachable nodes and states of no return. Finally, the diagrams provide concrete guides to interface implementors.
Event language concepts provide facilities for converting the direct manipulation state diagrams into actual code. Event language "keywords" correspond to events such as ButtonDown and MouseMoved. Event language "handlers" correspond to a single type of manipulation tool by collecting code for recognized keywords and containing tool state information. For implementation, the object oriented class construct provides a framework from which to build event language "handlers." Messages defined in a tool class define the keywords of the event language and instance variables store state information. An object instantiated from such a class is an event handler, that is, a manipulation tool. Additionally, the inheritance in object oriented subclassing facilitates the development or specialization of tools for a specific application. Finally, since tools have been created as objects, it is easy to switch between tools at runtime. This includes switching to tools that connect to physical 3d devices or, when those devices are not present, falling back on virtual 3d tools that use the 2d mouse.
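The mapping from event language to classes can be sketched in C++ as below. ButtonDown and MouseMoved come from the text; ButtonUp, the Tool base class, the TranslateTool subclass, and its two states are assumptions made for illustration and do not reproduce the state diagram of Figure 4.

```cpp
#include <cassert>

// Base class: each virtual message is one event language keyword.
class Tool {
public:
    virtual ~Tool() = default;
    virtual void ButtonDown(int x, int y) = 0;
    virtual void MouseMoved(int x, int y) = 0;
    virtual void ButtonUp(int x, int y) = 0;  // assumed keyword
};

// A handler for one kind of manipulation; instance variables hold tool state.
class TranslateTool : public Tool {
public:
    enum class State { Neutral, Dragging };
    State state = State::Neutral;
    int dx = 0, dy = 0;  // movement accumulated during the current drag
    int applied = 0;     // number of translations committed so far

    void ButtonDown(int x, int y) override {
        if (state != State::Neutral) return;
        state = State::Dragging;  // arc: Neutral -> Dragging
        lastX_ = x; lastY_ = y;
        dx = dy = 0;
    }
    void MouseMoved(int x, int y) override {
        if (state != State::Dragging) return;  // in Neutral, only cursor feedback would go here
        dx += x - lastX_; dy += y - lastY_;
        lastX_ = x; lastY_ = y;
    }
    void ButtonUp(int, int) override {
        if (state != State::Dragging) return;
        ++applied;                // apply the accumulated translation
        state = State::Neutral;   // arc: Dragging -> Neutral
    }
private:
    int lastX_ = 0, lastY_ = 0;
};
```

With this arrangement the application keeps a single Tool pointer for the current tool and forwards events to it, so switching tools at runtime is just a matter of pointing it at a different object.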
[App93] Apple Computer Inc., Macintosh Human Interface Guidelines, Addison-Wesley, 1993.
[Chen93] Chen, M.: 3-D Rotation Using a 2-D Input Device, Develop: The Apple Technical Journal, No. 14, pp. 40-53 (June 1993).
[Glove93] Email subscription list. Send message body "subscribe glove-list <your name>" to listserv@boxer.nas.nasa.gov.
[Gre86] Green, M.: A Survey of Three Dialogue Models, ACM Transactions on Graphics, 5(3), pp. 244-275 (1986).
[Hess92a] Hess, J. A.: A Direct Manipulation Three Dimensional Software Visualization Tool, MS Thesis, Arizona State University, 1992.
[Hess92b] Hess, J. A.: Qd3d in Action, THINKin' CaP: The Journal of the Symantec Programming Languages Association, No. 4, pp. 66-73 (August 1992).
[Multi93] Multipoint Technology Corporation; Suite 201, 319 Littleton Road; Westford, MA 01886; (508) 692-0689; AppleLink: MULTIPOINT
[NiO86] Nielson, G. M. and Olsen, D. R. Jr.: Direct Manipulation Techniques for 3D Objects Using 2D Locator Devices, Proceedings 1986 Workshop on Interactive 3-D Graphics, Chapel Hill, pp. 259-269, 1986.
[Sims93] Simsalabim Systems, Inc.; PO Box 4446; Berkeley, CA 94704-0446; (510) 528-2021.
[Viv93] ViviStar Consulting: Qd3d & 3dPane: User Manual: Version 2, Scottsdale, AZ, 1993.



