Hacking gemsvnc: The Internals of the gemsvnc Server |
Code licensed under the GPL, Documentation licensed under the GFDL |
|
OVERVIEW
The gemsvnc program is a vnc server that permits a vncviewer to remotely view and interact with an already running X-server, typically the console. It was derived from the x11vnc program that is included with the libvncserver library. The x11vnc program was inspired by the x0rfbserver program, which is written in C++ and does not use libvncserver. The x11vnc program, and libvncserver also, are both written in C. Because libvncserver provides support for important compression techniques, such as tight encoding and JPEG compression, gemsvnc is written in C to exploit these techniques. NOTE: The x11vnc program reffered to in the gemsvnc documenataion has been completely re-written by Karl Runge in November of 2002. The libvncserver v0.5 snapshot contains his original version, and since April 2003, additional changes have been made, and are available via the libvncserver CVS. SOURCE FILESBesides the source files for libvncserver, the source files for the gemsvnc vnc server are:
All vnc servers make use of the rfb (remote frame buffer) protocol to communicate with a vnc viewer. The libvncserver library provides support for the rfb protocol for gemsvnc. This library expects the program using it to provide it with a framebuffer that contains the video image to be relayed to the vnc viewer. When the viewer starts up and connects to the server, the library sends a complete copy of this framebuffer to the viewer. After that, the server program notifies the library of changes to the framebuffer by telling the library of the location of a rectangle in the framebuffer that contains the changed information. The library then sends only the information inside the changed rectangle to the viewer. In this way, the viewer is able to maintain a copy of the server's framebuffer. The rfb protocol, as implemented by libvncserver, uses several compression and optimization techniques to minimize the actual amount of data sent over the network. It is beyond the scope of this document to describe these optimizations, as they are part of libvncserver. When the remote user at the vnc viewer moves the mouse, presses or releases any of the mouse buttons, or presses or releases any keys on the keyboard, an input event is sent from the viewer to the server. The libvncserver library handles the details of the rfb protocol for such input events, and makes these events available to the server program by means of a function call. When gemsvnc starts, it first processes any command line options. After this, it does its initialization. Initialization consists of specifying the initial mouse cursor to use, the generation of interlace patterns for the horizontal and vertical probing, and opening the X-server display that is to be relayed. A copy of the entire X-server's framebuffer is made to initialize the gemsvnc server's framebuffer. Initialization of the libvncserver rfb protocol support is performed. The gemsvnc server's framebuffer is allocated, as is the secondary framebuffer where the colorations for the mouse cursor and the 2 categories of rectangles with visual cues will be made. This is followed by initialization of a set of X images for the different sizes and shapes of regions of the X-server's framebuffer that will need to be made during the operation of the server. After initialization is complete, the server begins executing its "big loop". This loop repeats until something causes the server to terminate. The first operation in this loop is to determine if any clients are connected, and to time-out if the server has been waiting for a connection for too long. The server then processes any input events that have arrived. After that, any pending commands from the command language interface are executed. Finally, the host X-server's framebuffer is compared to the gemsvnc server's copy of it to see if there have been any changes to the video display, and if so, they are relayed to the connected viewer. After all this, the process repeats, and the loop starts again from the top. CHECKING FOR IMAGE UPDATESMost of the work done by the gemsvnc server is connected with maintaining a copy of the host X-server's video display on the remote client viewer's display. Consider the size of some X-server framebuffers, such as a dual-headed 1280x1024 system with a framebuffer of 2560x1024 pixels, and 32 bits per pixel. Such a framebuffer contains 10 MB of memory. Obviously, a workable vnc server could not possibly relay all this information on every pass through the big loop and be expected to provide usable responsiveness. Rather than send the entire framebuffer every pass through the big loop, a workable vnc server that attaches to a pre-existing host X-server needs to scan the host's framebuffer, looking for regions where some change has occurred, and then send only the information in those regions. This is the strategy employed by gemsvnc, and was previously used by x0rfbserver as well. This technique was borrowed from x0rfbserver by gemsvnc by re-implementing the scanning algorithm, and then extending it to permit additional flexibility in tuning it to a particular application. The x0rfbserver divides the framebuffer up into tiles of 32x32 pixels each, with partial tiles possibly occuring along the right and bottom edges of the framebuffer. Whenever a change is detected, only the information inside the tile containing the changed pixel is sent to the vnc viewer. With gemsvnc, this same technique is also used, except that gemsvnc allows the height and width of the tiles to be specified by command line options. The change detection scanning algorithm of gemsvnc is likewise a re-implementation of that found in x0rfbserver, with the appropriate modifications to allow for variable sized tiles. Each tile is a rectangle of pixels. These pixels can be thought of as a set of horizontal rows of pixels, and also as a set of vertical columns of pixels. At gemsvnc server initialization, an interlace sequence is generated for the number of rows, and for the number of columns, in a tile. The earlier x0rfbserver just had a statically initialized sequence, since the tiles had the same numer of rows and columns, both being equal to 32. The host X-server's framebuffer is compared to the gemsvnc server's backup framebuffer according to the following technique:
After the scanning for changes described above has been completed, the send_changed_tiles_to_viewer() function scans the dirty bits for all the tiles, copying the contents of each dirty tile from the host X-server's framebuffer to the gemsvnc server's backup framebuffer. This operation is performed by the copy_tile() function. The copy_tile() function first determines where the tile being copied is located. If the tile is on the right edge of the framebuffer, or on the bottom edge, or on the lower right corner, then it is potentially smaller than a full sized tile. During system initialization, X-images were created for each of the possible sizes. The copy_tile() function determines the proper X-image to use, and fetches the data from the host X-server's framebuffer. It then checks to see if any part of the rectangle containing the mouse cursor is also inside that tile, and if so, it sets the "bogus_mouse_cursor_was_clobbered" flag so that the mouse can be restored after all the dirty tiles have been copied. Finally, copy_tile() copies the data from the X-image it fetched from the host X-server's framebuffer into the gemsvnc server's backup framebuffer, and also into the gemsvnc server's markup framebuffer. Because gemsvnc implements special visual cues for certain special rectangular regions of the display, it maintains 2 framebuffers. The first, the backup framebuffer, is a true copy of the host X-server's framebuffer. This backup framebuffer is used during the scanning for changes by comparing the host X-server's framebuffer with this backup framebuffer. The second framebuffer, the markup framebuffer, is used to relay video information to the client vnc viewer. This markup framebuffer has additional modifications made to its contents which would render it unusable for comparison with the original host X-server's framebuffer. The mouse cursor is drawn here, as are any blocked or guarded rectangles. Blocked rectangles are colored with an opaque color, and guarded rectangles are colored with a transparent tint. After all the dirty tiles have been copied into the 2 gemsvnc framebuffers, the visual cues for the guarded and blocked rectangles are colored. As this is being done, a check is made to determine if the mouse is in a blocked or guarded region. The transparent guarded rectangles are colored first, followed by the opaque blocked rectangles. This order is important, as the opaque color must override the transparent tint, and the blocked mouse graphic must override the guarded mouse graphic. After the blocked and guarded rectangles are colored, the mouse is drawn, if necessary. Next, any "image" category rectangles that intersect any dirty tiles are drawn atomically. This is done by clearing all the dirty bits of all tiles intersecting the dirty image, and then the image is marked as a changed rectangle by a call to the libvncserver library. The dirty bits must first be cleared, or else the information would be sent twice, defeatng the optimization intended by the "image" rectangle. Finally, each dirty tile is sent to the vnc viewer by a call to the libvncserver library. OPTIMIZING TUNABLE PARAMETERSInput Events Because mouse motion can generate a lot of input events, the logic in the big loop will attempt to process multiple input events before proceeding on to handling the command language interface or checking for updates to the framebuffer. Without this logic, each mouse motion event would be followed by a subsequent video update operation. Scanning for updates takes cpu time, and transmitting updates takes network time and bandwidth. By processing multiple input events at a time, this effect is reduced; however, another danger exists. If too many input events are processed without a video update, the operator cannot see what he is doing, and becomes mentally decoupled from the mouse motion he is generating. For this reason, the maximum number of multiple input events that will be processed before moving on and sending a video update must be limited. This number is controlled by the "-events" command line option. Tile SizeThe size and shape of the tiles has a direct impact on the responsiveness, speed, and bandwidth usage of gemsvnc. Because of this, gemsvnc provides the "-tw" and "-th" command line options to specify the tile width and height. The width and height may be set to any value between 1 and the maximum dimensions of the host X-server's framebuffer. It seems that choosing a tile size that evenly divides the raster size produces better performance than not doing so. Choosing a tile size that is too small cripples the compression algorithms, and also generates a lot more fixed overhaed in the form of packet headers, etc. Choosing a tile size that is too large causes undue bandwidth usage. Larger tile sizes result in faster scans of the complete set of tiles, but each tile is less thouroughly scanned within itself. Scan LimitingThe "-scans" command line option controls the number of pairs of horizontal and vertical scan lines in each row of tiles that will be scanned before exiting the scanning algorithm and proceeding on to copying the tiles and updating the screen. A larger number here will discover more changes before sending those changes to the viewer, but will also reduce responsiveness to mouse motion events. This parameter is pretty sensitive to the particular application and the way the mouse and video data interact. If performance is important to your application on gemsvnc, you should experiment with this parameter to achieve best results. SPECIAL RECTANGLESBecause gemsvnc is intended for use as a remote control tool for applications that may control physical machinery, and that may contain sensitive information on their display screens, considerations of safety and security require that certain regions of the display must not be visable to the remote user, and certain other regions must not permit interaction with the remote user, even though they are visable. This requirement is met by the use of rectangular regions that are either displayed in an opaque color, thus blocking the remote user's view, or in a transparent tint, as a visual cue that any keystrokes or mouse button presses will be ignored while the mouse cursor is in these regions. These rectangular regions have their edges parallel to the framebuffer's edges. The existence, location, and category of these rectangles is controlled by means of a command language. Each command in the command language is a simple English-like statement represented as an ASCII text string terminated with a newline. These commands are read by gemsvnc from stdin, and the response to these commands is output to stdout. In the future, commands will also be permitted over Berkeley sockets. There are 4 categories of rectangles: hold, guard, block, and image: The hold category provides a sort of holding tank for a rectangle to exist in when it is desired that it not affect the remote display. Commands exist to move rectangles from one category to another. The guard category is used to prohibit interaction but still permit viewing. These areas are indicated by a transparent tint of color on the remote display. Whenever the mouse cursor is in one of these regions, keystrokes and button presses from the remote user will be ignored. The block category likewise inhibits remote keystrokes and button presses, but it uses an opaque color rather than a transparent tint to indicate the affected region of the screen on the remote display. The block category is used to render sensitive information that is displayed on the host display so that it appears as a solid colored rectangle on the remote display. Think of it as a kind of mask, or shield. The image category is somewhat different from the guard and block categories, as it does not affect remote interaction, nor does it produce an explicit visible cue on the remote display. The image category is intended to be used to specify regions of the screen that contain images, such as photographs, etc. that have the property that if any part of such an image changes, then it is highly likely that every part of that image changed. Such rectangles are updated as a unit, rather than being divided up into tiles like the rest of the display. The use of image rectangles permits the compression algorithms to work with larger chunks of data, thereby increasing their effectiveness. It also prevents that region of the remote display from updating piecemeal, tile by tile, which can result in some distracting artifacts while the update is in progress. THE COMMAND INTERPRETERCommands are read by calling the enquiry() function. Responses to that enquiry are sent by calling the response() function. Presently, these functions operate only with stdin and stdout, but the design is intended to support multiple connections via Berkeley sockets. The I/O uses the select() system call, and is non-blocking. If no command was read, NULL is returned. If an end-of-file was returned, it returns the value EOF. The functions defined in the file commands.c -- especially the function commander() -- implement the command interpreter. The commander() function attempts to read and process one command each time that it is called. If a command line is available, it reads it and processes it; if no command is available, then commander() just returns. Each command is broken up into tokens which are delimited by spaces or the newline at the end of the command line. The first token is always the name of the command. Subsequent tokens are arguments to that command. With the present command set of "block", "guard", "hold", "image", "kill", "new", "place", and "show", the second token is always the name of a rectangle, which in every case except for the "new" command, may be a regular expression. Only the "place" command has more than 2 tokens. The "place" command specifies the upper left and lower right coordinates of the rectangle. In other words, it specifies the "place" of the rectangle. If additional commands are added, they must have the first token as the command name, but beyond that, it is the responsibility of that particular command processor to interpret the remainder of the command line. The implementor of any additional commands is encouraged to try to maintain the style of the previously existing commands, although this is not a design requirement. RECTANGLES AS OBJECTSIn order to keep track of these rectangular regions, where the rectangles are located, which category they presently belong to, etc., the files rectangle.h and rectangle.c implement a data structure and a set of functions for the manipulation of such rectangles. Thus a rectangle class of objects is implemented in C instead of C++. A rectangle is a data structure whose typename is rectangle. It is composed of a pair of pointers to rectangles, a pointer to a character string that holds the name of the rectangle, and a set of 4 integers that represent the x and y coordinates of the upper left and lower right corners of the rectangle. The reason for the pair of pointers is so that rectangles can be linked together to form lists of rectangles. These lists are used to represent the different categories to which a rectangle can belong. Since a rectangle is only permitted to belong to one category at a time, it only needs to be on one list at a time. These lists are implemented as doubly linked circular lists, primarily because it is easy to insert and remove nodes from such a list. One rectangle serves as the list head for each list, and list traversal is accomplished by walking through the links from the list head until the list head is again encountered. As an optimization, the rectangle represented by the list head reflects the bounding box that contains all the rectangles on that list. As rectangles are added and removed from the list, the bounding box is updated. There is a constructor and a destructor for a rectangle object. The constructor allocates memory for the data structure and initializes the name string and the corner coordinates. The destructor frees the storage for the data structure and the name string. The show_rectangle() method displays the data structure in a nice human readable format. It is primarily intended to be used for debugging. There are 2 insert methods: insert_rectangle_before() and insert_rectangle_after(). The intent of these methods is that they permit a given rectangle, which is typically not on any list, to be inserted either before or after another given rectangle, which is on a list, even if it is the only element on that list, in which case it is the list head. There is a method to remove a rectangle from a list. It is the remove_rectangle_from_list() function. This takes a single argument: the pointer to the rectangle to remove. The find_rectangle_on_list() method takes 2 arguments: a pointer to the list head, and a pointer to a character string containing a regular expression to match with the name strings of the rectangles on the list. It returns a pointer to the first rectangle whose name matches the rectangle expression. If there is no match, NULL is returned. If there is more than one match, only the first is returned. The area() function takes a pointer to a rectangle and returns its area in pixels. This function is not presently being used by the gemsvnc server. The point_is_inside_rectangle_P() function is a predicate function: it returns TRUE if the specified point is inside the specified rectangle. It takes the x and y coordinates of the point in question, and the pointer to the rectangle. This function uses the point_is_in_interval_P() function as a helper. The point_is_in_interval_P() function takes 3 arguments: the left boundry, the number in question, and the right boundry. If the number is between the boundaries, then it returns TRUE. The construct_empty_list_of_rectangles() method acts a constructor for list heads. It constructs a rectangle that is on a list all by itself, and whose corners are at negative infinity. This represents an impossible bounding box that will be expanded when other rectangles are added to the list to be the proper bounding box for the set of rectangles on that list. The update_bounding_box() method takes a pointer to a list head and a pointer to a rectangle to be added to the bounding box information of that list head. A bounding box potentially expands when another rectangle is added to it, but it will never contract with the addition of another rectangle. If a rectangle is moved or removed from the list, then the bounding box must be reset to negative infinity and every rectangle on the list then added to the bounding box. For this reason, additions are less expensive than deletions or repositionings. Since the number of rectangles on any given list at any given time is not expected to be very large, this should not present any problem in actual operation. The add_rectangle_to_list() method inserts the rectangle on the specified list and then updates the bounding box of the list. The use of this method insures that the bounding box information in the list head will be kept up to date. The remove_named_rectangle_from_list() method takes a pointer to a list head and a pointer to a string containing a regular expression. It searches the list for a rectangle whose name matches the regular expression and removes it from the list. It updates the bounding box information if the removed rectangle had an edge colinear with any edge of the bounding box. It does this by resetting the bounding box to negative infinity and then adding all the remaining rectangles to the bounding box. Finally, remove_named_rectangle_from_list() returns a pointer to the rectangle that it removed, or NULL if if found no matches for the regular expression. If multiple names match the regular expression, only the first match will be returned, but if remove_named_rectangle_from_list() is called repeatedly, it will remove subsequent matching rectangles until it ultimately returns NULL, indicating that no more matches remain. The destroy_named_rectangle_on_list() method is like remove_named_rectangle_from_list(), except that instead of returning the rectangle that was removed, it calls the destructor for that rectangle. The 3 methods initial_rectangle_list_iterator(), continue_rectangle_list_iterator(), and next_rectangle_list_iterator(), are used to traverse a list of rectangles. The 3 functions each take a pointer to a pointer to a structure of the type state_rectangle_list_iterator, which saves the context between calls. These 3 methods are designed to be used in a for-loop as follows: rectangle* list; rectangle* R; state_rectangle_list_iterator* context; for (R = initial_rectangle_list_iterator(list, &context); continue_rectangle_list_iterator(&context); R = next_rectangle_list_iterator(&context)) { do_something(R); } This will use R as the loop variable which points to the current rectangle. The loop will run for all the rectangles on the list; especially, the loop will not run at all if the list only contains the list head itself, and no other rectangles. The reposition_rectangle_on_list() method modifies the corner coordinates of a specified rectangle to have the specified value. Because this operation can potentially change the bounding box of the list, the bounding box is reset and recomputed. The intersection_of_2_rectangles() method take 2 pointers to 2 rectangles, and a pointer to a string containing a name. It returns a new rectangle with that name that is the intersection of the original 2 rectangles. If the 2 original rectangles do not intersect, NULL is returned. COMMAND EXECUTIONblock <name> This command searches the hold, guard, and image lists for a match with the regular expression <name>. It moves any matching rectangles to the block list by removing them from the other list they were on and adding them to the block list. It then tells the framebuffer update scanning mechanism that all the tiles that intersect the rectangle have experienced a change. It does this by marking them as dirty by a call to the mark_tiles_in_rectangle_as_dirty() function. guard <name>This command is like the block command, except that it searches all the lists except guard, and moves matching rectangles to the guard list. hold <name>This command is the same except that the target of the move is the hold list. image <name>This command follows the above pattern, only for the image list. kill <name>This command searches all 4 lists for matching names. It marks the tiles in the rectangle as dirty, and then calls the destructor to destroy the rectangle. new <name>This command first searches all 4 lists to make sure that the name is not already in use. If the name is available, it constructs a new rectangle with that name, and with both corners at the origin. It then puts the new rectangle on the hold list. place <name> ulx uly lrx lryAfter parsing out the 4 corner coordinates from the command line, this command searches all 4 lists for rectangles whose names match the regular expression. For every match, it marks the the tiles in the rectangle's original position as dirty, then repositions the rectangle on that list, and finally marks the tiles at the rectangle's new location as dirty. Thus the change is marked when the rectangle is moved from a location, as the color cues associated with it would disappear, and a change is marked when the rectangle moves to a new location, as the color cues associated with it need to appear. show <name>This command does not affect the framebuffer or the operation of the gemsvnc server. It is intended to provide status to the issuer of the command. This command responds with a one line status message for each rectangle whose name matches the provided regular expression. The message has the rectangle's name, which list it is on, and its corner coordinates. COMMANDERThe commander() function is not a command, but rather the function that reads and interprets command lines. It is called from the big loop of the gemsvnc server. It attempts to read and execute a single command for each time that it is called. Since the reading of the command line is done with non-blocking I/O, if no command line was available to be read, this routine simply returns without executing any command. This permits the big loop to continue to execute to relay video information, keystrokes, mouse buttions, and mouse motion. SCANNING FOR UPDATESThe big loop at the end of main() first processes waiting-for-connection time-outs, then it handles input events, then it executes a pending rectangle command, provided one exists. Finally, it checks for image updates by calling checkForImageUpdates(). The checkForImageUpdates() function makes multiple passes through the framebuffer probing along a vertical or horizontal scan line comparing the pixels from the live framebuffer with the pixels saved in the backup framebuffer. This probing is done by the probe_horizontal() and probe_vertical() functions, which are called alternately inside a for loop in checkForImageUpdates(). the probe_axis variable is static, so its value will persist between calls of checkForImageUpdates(). This permits the for loop to resume with the correct probe -- either horizontal or vertical -- the next time checkForImageUpdates() is invoked. The for loop itself is used to limit the number of probes before a video update to the remote is performed, and then the big loop starts over. The value of the scan_count variable determines how many probes are made of the framebuffer before gettting on with other things. This variable has a default value of 16, but that may be overridden with the -scans command line option. If this variable is set too high, the video, including the host mouse position indication, may update too slowly for human operator at the remote viewer to effectively operate the application programs on the host display. If this variable is set too low, video updates may occur too rapidly, thus swamping the bandwidth and reducing responsiveness. For slower network connections, generally use a higher value. If you are overriding the default values for the tile-width or tile-height with the -tw or -th command line options, you should experiment with the -scans options also, as these things interact to affect overall performance. The probe_horizontal() and probe_vertical() functions work in nearly identical fashion. This description will concentrate on probe_horizontal(), but the generalization to probe_vertical() is straightforward. There are two static variables: j and probe_y. These provide persistence between calls, so that when probe_horizontal() exits and is called again later, it will resume its scanning where it last left off. Thus probe_horizontal() and probe_vertical() operate somewhat like co-routines, or cooperative tasks in a multitasking system, only each gets its scheduled change to run each time the for loop in checkForImageUpdates() makes another iteration. The entire scanning operation gets its chance to run each time the big loop makes another iteration. The end result of the for loop in checkForImageUpdates() is that any differences between the live and the backup framebuffers that were detected will cause the dirty bit of the tile containing the pixel where the difference was detected to be set. This is done by calling set_dirty_bit(x, y), where x and y are the x and y position of the tile. Note that x and y are not the pixel coordinates, but the tile coordinates, beginning at zero, and running up to tilewidth or tileheight, depending on the coordinate. After the for loop finishes, the send_changed_tiles_to_viewer() function is called. This first copies all the tiles that were marked dirty by calling copy_tile(i, j). This function first determines what size tile to use. Most tiles are the size specified by the -tw and -th command line options, or the defaults if these options were not used; however, tiles on the bottom edge can be shorter -- not as high as tileheight, and tiles on the right edge can be narrower -- not as wide as tilewidth. Finally, the bottom right tile can be both shorter and narrower than a normal tile. So, depending on its location, copy_tile(i, j) determines what size tile to use, and also the corresponding Ximage to use to fetch the contents of the tile from the live framebuffer. Once this is determined, the contents of the tile are fetched from the live framebuffer by a call to the getImage() function. The copy_tile(i, j) function then determines whether the mouse cursor will be affected by the copying of the tile, and if so, a flag is set to force re-drawing of the mouse cursor, and the mouse cursor is removed -- just in case it gets moved before we can re-draw it, so no artifacts of the cursor image will be left lying around. After all these details have been taken care of, the tile is copied from the Ximage that was fetched to both the backup framebuffer and the markup framebuffer by a pair of nested for loops. After all the dirty tiles have been copied, send_changed_tiles_to_viewer() then adds the visual cues to the markup framebuffer. It first traverses the guard list, coloring a rectangle by calling color_transparent_rectangle() and then checking if the mouse cursor is inside the guarded rectangle, and if so, then the mouse cursor's graphic image is changed to the image used for guarded rectangles. After all the guarded rectangles have been colored transparently, the blocked rectangles are similarly processed by coloring them with color_opaque_rectangle(). The opaque rectangles must be processed after the transparent ones so that if both kinds overlap, then the opaque coloring will block out the transparent coloring, thus producing the desired visual effect. |
|
|