Notes for Week 11

Case Study - Document Editor

Chapter 2 of our book introduces some of the problems of object-oriented design involved in creating a document editor. This editor combines text and imported material such as graphic images. Some of the problems involved in this project are

Document structure
The internal structure of the document is the method used for storing the information seen on the screen. For even the simplest text editor with formating, an array of char will not give sufficient flexability. If we incorporate non textual material, clearly any structure with only char will not work. Once we have our structure, we need an iterator that can be used in the EvPaint message to show the document as the user expects to see it.
Formatting
In addition to showing the information in the proper row and column alignment, the actual font information must be stored in an efficient way.
Embellishing of the user interface
Must add all the scroll bars, borders and menus normally associated with a windows type document editor.
Supporting multiple look-and-feel standards
In systems where more than one interface standard (Motif and Presentation Manager in XWindows) can be used, the program should easily shift between paradigms.
Supporting mutiple window systems
Different look-and-feel standards are usually implemented in diferent window systems. The design should be independent of the various types of windows systems.
User operations
In addition to menus, we want the user to be able to select items in a document using a mouse in the user interface. Selections in the interface must relate back to the underlying structure.
Spelling checking and hyphenation
Add-on classes such as spelling checkers and hyphenation, should be readily attached to the program.

The Document Structure

The most flexible structure for our document editor is a tree (not a binary tree) in which all nodes are descendents of an abstract base class. This class is often called Glyph. A document with first line shown below

G g
The document is represented by the following tree.

The Type Glyph

Each entry in the tree must be of type a pointer to a glyph. What operations must a glyph be capable of performing?

While Glyph is a purely abstract class, classes like the row class will have all of their functions implemented. Each item in the row can be stored as a pointer in an array contained in the row glyph. To save this array in a file we would have to write the number of items in the array followed by the list of items. Each item would write its type followed by its value.

For our discussion, I will restrict myself to a document with only formatted text. To save the large amounts of space required by glyphs, both the characters and the glyph contex are frequently implemented as flyweights.

The Type Flyweight

In a typical text document there are thousands of characters. If each character was represented by a unique glyph, the document would soon become extremely large. Similarly if the graphic context of each character was stored along with the character we would soon have documents far beyond the size that we would want. The flyweight implementation of a document has one glyph for each character called a concrete Flyweight. Rows consist of pointers to the character glyphs. The pointers are called flyweights. Rows and Columns are unshared concrete flyweights. When a new character is inserted into a row, a glyph factory which contains the concrete flyweights, generates the flyweight for that character. The following diagram represents this structure.

Implementation of The Concrete Flyweights

The class Character is a glyph, so inherits all of the properties of glyph. In addition it has a constructor, overrides the draw function and holds the actual character code.

class Character :public Glyph
{
public:
    Character(char);
    virtual void draw(TWindow *, GlyphContext &);
private:
    char _charCode;
};
typedef RcPointer< Character> Flyweight; 

Implementation of The Flyweight Factory For Characters

This concrete factory contains an array of RcPointers to Character class elements. New characters are added only when needed. This saves the memory needed for an item of type character. Here are samples of the code for the Glyph factory.

class GlyphFactory
{
public:
     GlyphFactory(); //Does nothing in this implementation
     virtual Flyweight createCharacter(char);
     virtual Row* createRow();
     virtual Column* createColumn();
     virtual FontPtr createFont(TFont*,TColor&);
// ....
private:
      Array< Flyweight> _character;
      Array< FontPtr> _font;
};

Flyweight GlyphFactory::createCharacter(char c)
{
     if (!_character[c])
        _character[c] = new Character(c);
     return _character[c];
}

All the createRow and createColumn functions have to do is return pointers to a row or column respectively. The document can wrap row and column pointers in RcPointers to avoid memory loss.

Implementing The Font Factory

Fonts are somewhat system dependent. You can hide the system dependent characteristics by wrapping a font in a glyph envelope. This will allow you to package things like font and font color in one package.

Borland keeps the font style in TFont objects that can be initialized using the TChooseFontDialog. The TDC function SelectObject sets the font when a TFont is passed in as a parameter. This font is used by the TextOut function until another is set using SelectObject. The color is not contained in TFont, and is set using the TDC function SetTextColor which takes a TColor as arguement.

Once the font is set for the TDC it is possible to get line height and other text infomation using functions such as GetTextMetrics and GetTextExtent. These can be used by Row glyphs to determine the height of each row. This information can also be used to find out if a point intersects the bounds of a given character.

The following class can be used to hold a font flyweight.

class Font 
{
public:
    Font(const TFont*,const TColor&);
    TRect bounds();
    int operator == (const Font&);
    TFont style();
    TColor color();
private:
    RcPointer< TFont>  _font;
    TColor _color;
};
typedef RcPointer< Font> FontPtr

Once we have the class Font, we can build the factory. In the character factory, we could just go directly to the location subscripted by the character. We can't use fonts as subscripts, so we have to uses find to see if the font is already in the array. If it is return it, otherwise insert it.

FontPtr GlyphFactory::createFont(TFont* tf ,TColor& c )
{
    FontPtr p = new Font(tf,c);
    int loc = _fonts.find(p);
    if (loc <0)
    {
        //Put recently defined fonts at the beginning since 
        //they will probably be used again shortly
        loc =0;
        _fonts.insertItemAt(p,0);
    }
   return _fonts[loc];
} 

Implementing the GlyphContext

As in the case of characters, we need to reduce the amount of memory used to store fonts. We will not store a font (or even a font pointer) for every character, but maintain a structure that will hold font characteristics for a range of characters. This is based on the diagrams on page 203 of the book.

The book uses a BTree ordered on the position of the beginning of a run of characters with the same format. One could also use a skip list or in fact for most types of documents an ordered list would suffice. No matter which structure is used for our GlyphContext, we will also need an iterator for the structure.

If you use a BTree, the interior nodes of the tree would give range of subscripts information. This allows the functions using the structure to find the location of the format for the character at a given location. Leaf nodes contain a font pointer generated by the font factory and a range of subscripts for which this font is in effect. If a skip list is used, ordering is based on the first subscript of the format. To get the range just find the next node in the list. The structure should be initialized with a default font.

Insertion of a new format or deletion of an existing format is a bit more complicated than for a simple BTree or ordered list. If characters that fall within the range of one node have their format changed, two nodes must be inserted. One for the new format and one for the tail end of the old format. Similarly if the formating change covers the beginning of an existing format, the subscript number in the existing node must be changed. Deletions cause similar problems.

Putting It All Together-The Document

Each document will contain a GlyphFactory, a glyph which contains all of the characters and a FontList for the formats. To display the text, the document glyph calls on an iterator to retrieve each column glyph so it can draw itself. Each column in turn uses an iterator to retrieve row glyphs to display their contents. Finally, each row scrolls through each character glyph which displays itself. Each of the character glyphs must contain a pointer and rows must work in conjunction with columns to maintain a position count. That count is checked agains the current glyph context to see if the format must change. It is only nessary to know where the next context begins and what the font for this is. The iterator for the GlyphContext is incremented after a font change, so that a pointer to the next font and the beginning of the next format are known.