
Introduction

ICSharpCode.AvalonEdit is the WPF-based text editor that I've written for SharpDevelop 4.0. It is meant as a replacement for ICSharpCode.TextEditor, but should be:

Extensible
Easy to use
Better at handling large files

Extensible means that I wanted SharpDevelop AddIns to be able to add features to the text editor. For example, an AddIn should be able to allow inserting images into comments - this way you could put stuff like class diagrams right into the source code!

By Easy to use, I'm referring to the programming API. It should just work™. For example, this means that if you change the document text, the editor should automatically redraw without you having to call Invalidate(). And if you do something wrong, you should get a meaningful exception, not a corrupted state and a crash later at an unrelated location.

Better at handling large files means that the editor should be able to handle large files (e.g. the mscorlib XML documentation file, 7 MB, 74100 LOC), even when features like folding (code collapsing) are enabled.

Using the Code

The main class of the editor is ICSharpCode.AvalonEdit.TextEditor. You can use it much like a normal WPF TextBox:

<avalonEdit:TextEditor
    xmlns:avalonEdit="http://icsharpcode.net/sharpdevelop/avalonedit"
    Name="textEditor"
    FontFamily="Consolas"
    FontSize="10pt"/>

To enable syntax highlighting, use:

textEditor.SyntaxHighlighting = HighlightingManager.Instance.GetDefinition("C#");

AvalonEdit has syntax highlighting definitions built in for: ASP.NET, Boo, Coco/R grammars, C++, C#, HTML, Java, JavaScript, Patch files, PHP, TeX, VB, XML
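
If you load files from disk, you probably don't want to hard-code the definition name. The following is a minimal sketch of picking a built-in definition by file extension; the class and method names (EditorSetup, OpenFile) are made up for this example, and the lookup returns null when no definition is registered for the extension, which simply turns highlighting off.

```csharp
using System.IO;
using ICSharpCode.AvalonEdit;
using ICSharpCode.AvalonEdit.Highlighting;

public static class EditorSetup
{
    // Loads a file into the editor and selects a highlighting definition
    // based on the file extension (e.g. ".cs" or ".xml").
    public static void OpenFile(TextEditor textEditor, string fileName)
    {
        textEditor.Load(fileName);
        // GetDefinitionByExtension returns null for unknown extensions;
        // assigning null disables syntax highlighting.
        textEditor.SyntaxHighlighting =
            HighlightingManager.Instance.GetDefinitionByExtension(Path.GetExtension(fileName));
    }
}
```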

If you need more of AvalonEdit than a simple text box with syntax highlighting, you will first have to learn more about the architecture of AvalonEdit.

Architecture

AvalonEdit consists of a few sub-namespaces that have cleanly separated jobs. Most of the namespaces have a kind of 'main' class, which is listed here after the namespace name:

ICSharpCode.AvalonEdit.Utils: Various utility classes
ICSharpCode.AvalonEdit.Document: TextDocument — text model
ICSharpCode.AvalonEdit.Rendering: TextView — extensible view onto the document
ICSharpCode.AvalonEdit.Editing: TextArea — controls text editing (e.g. caret, selection, handles user input)
ICSharpCode.AvalonEdit.Highlighting: HighlightingManager — highlighting engine
ICSharpCode.AvalonEdit.Highlighting.Xshd: HighlightingLoader — XML syntax highlighting definition support (.xshd files)
ICSharpCode.AvalonEdit.Folding: FoldingManager — enables code collapsing
ICSharpCode.AvalonEdit: TextEditor — the main control that brings it all together

Here is the visual tree of the TextEditor control:

It's important to understand that AvalonEdit is a composite control with three layers: TextEditor (main control), TextArea (editing), and TextView (rendering). While the main control provides some convenience methods for common tasks, for most advanced features you have to work directly with the inner controls. You can access them using textEditor.TextArea or textEditor.TextArea.TextView.
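
As a small illustration of walking down the layers, here is a sketch that reads the caret position from the TextArea and the number of currently built visual lines from the TextView. The helper class LayerAccess is hypothetical and only exists for this example.

```csharp
using ICSharpCode.AvalonEdit;
using ICSharpCode.AvalonEdit.Editing;
using ICSharpCode.AvalonEdit.Rendering;

public static class LayerAccess
{
    // Returns a short status string built from the two inner layers.
    public static string Describe(TextEditor textEditor)
    {
        TextArea textArea = textEditor.TextArea;   // editing layer: caret, selection, input
        TextView textView = textArea.TextView;     // rendering layer: visual lines

        int caretOffset = textArea.Caret.Offset;
        // VisualLines may only be read while the visual lines are valid.
        int visualLineCount = textView.VisualLinesValid ? textView.VisualLines.Count : 0;

        return "caret at offset " + caretOffset + ", " + visualLineCount + " visual lines";
    }
}
```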

Document (The Text Model)

So, what is the model of a text editor that has support for complex features like syntax highlighting and folding?
Would you expect to be able to access collapsed text using the document model, given that the text is folded away?
Is the syntax highlighting part of the model?

In my quest for a good representation of the model, I decided on a radical strategy: if it's not a char, it's not in the model!

The main class of the model is ICSharpCode.AvalonEdit.Document.TextDocument. Basically, the document is a StringBuilder with events. However, the Document namespace also contains several features that are useful to applications working with the text editor.

In the text editor, all three controls (TextEditor, TextArea, TextView) have a Document property pointing to the TextDocument instance. You can change the Document property to bind the editor to another document, but please only do so on the outermost control (usually TextEditor); it will inform its child controls about the change. Changing the document only on a child control would leave the outer controls confused.

Simplified definition of TextDocument:

public sealed class TextDocument : ITextSource
{
    public event EventHandler UpdateStarted;
    public event EventHandler<DocumentChangeEventArgs> Changing;
    public event EventHandler<DocumentChangeEventArgs> Changed;
    public event EventHandler TextChanged;
    public event EventHandler UpdateFinished;

    public TextAnchor CreateAnchor(int offset);
    public ITextSource CreateSnapshot();

    public IList<DocumentLine> Lines { get; }
    public DocumentLine GetLineByNumber(int number);
    public DocumentLine GetLineByOffset(int offset);
    public TextLocation GetLocation(int offset);
    public int GetOffset(int line, int column);

    public char GetCharAt(int offset);
    public string GetText(int offset, int length);

    public void BeginUpdate();
    public bool IsInUpdate { get; }
    public void EndUpdate();

    public void Insert(int offset, string text);
    public void Remove(int offset, int length);
    public void Replace(int offset, int length, string text);

    public string Text { get; set; }
    public int LineCount { get; }
    public int TextLength { get; }
    public UndoStack UndoStack { get; }
}

Offsets

In AvalonEdit, an index into the document is called an offset.

Offsets usually represent the position between two characters. The first offset at the start of the document is 0; the offset after the first char in the document is 1. The last valid offset is document.TextLength, representing the end of the document.

This is exactly the same as the 'index' parameter used by methods in the .NET String or StringBuilder classes. Offsets are used because they are dead simple. To get all text between offset 10 and offset 30, simply call document.GetText(10, 20). Just like String.Substring, AvalonEdit usually uses Offset / Length pairs to refer to text segments.

To easily pass such segments around, AvalonEdit defines the ISegment interface:

public interface ISegment
{
    int Offset { get; }
    int Length { get; } // must be non-negative
    int EndOffset { get; } // must return Offset+Length
}

All TextDocument methods taking Offset/Length parameters also have an overload taking an ISegment instance – I have just removed those from the code listing above to make it easier to read.
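
To make the Offset / Length convention concrete, here is a small sketch that reads the same text once through an offset/length pair and once through an ISegment. The OffsetRange class is a hypothetical ISegment implementation written only for this example.

```csharp
using ICSharpCode.AvalonEdit.Document;

// Hypothetical minimal ISegment implementation, for illustration only.
public sealed class OffsetRange : ISegment
{
    public OffsetRange(int offset, int length) { Offset = offset; Length = length; }
    public int Offset { get; private set; }
    public int Length { get; private set; }
    public int EndOffset { get { return Offset + Length; } }
}

public static class OffsetDemo
{
    public static string ExtractWord()
    {
        TextDocument document = new TextDocument();
        document.Text = "Hello, AvalonEdit!";

        // Offset 7 lies between the space and the 'A'; the word is 10 chars long.
        string byOffset = document.GetText(7, 10);
        string bySegment = document.GetText(new OffsetRange(7, 10));

        // Both overloads address the same text.
        return byOffset == bySegment ? byOffset : null;
    }
}
```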

Lines

Offsets are easy to use, but sometimes you need Line / Column pairs instead. AvalonEdit defines a struct called TextLocation for those.

The document provides the methods GetLocation and GetOffset to convert between offsets and TextLocations. Those are convenience methods built on top of the DocumentLine class.

The TextDocument.Lines collection contains one DocumentLine instance for every line in the document. This collection is read-only to user code and is automatically updated to always* reflect the current document content.

Internally, the DocumentLine instances are arranged in a binary tree that allows for both efficient updates and lookup. Converting between offset and line number is possible in O(lg N) time, and the data structure also updates all offsets in O(lg N) whenever text is inserted/removed.

* tiny exception: it is possible to see the line collection in an inconsistent state inside ILineTracker callbacks. Don't use ILineTracker unless you know what you are doing!
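
The conversions described in this section can be sketched as follows. The LineDemo class is made up for this example; note that line and column numbers are 1-based, while offsets are 0-based.

```csharp
using ICSharpCode.AvalonEdit.Document;

public static class LineDemo
{
    public static string Describe()
    {
        TextDocument document = new TextDocument();
        document.Text = "first\nsecond\nthird";

        // Offset 8 is in front of the 'c' in "second".
        TextLocation location = document.GetLocation(8);
        // Start of the third line.
        int offset = document.GetOffset(3, 1);
        // DocumentLine implements ISegment (without the line delimiter).
        DocumentLine line = document.GetLineByOffset(8);
        string lineText = document.GetText(line);

        return string.Format("{0}:{1} {2} {3}@{4}+{5} {6}",
            location.Line, location.Column, offset,
            line.LineNumber, line.Offset, line.Length, lineText);
    }
}
```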

Change Events

Here is the order in which events are raised during a document update:

BeginUpdate()

UpdateStarted event is raised

Insert() / Remove() / Replace()

Changing event is raised
The document is changed
TextAnchor.Deleted events are raised if anchors were in the deleted text portion
Changed event is raised

EndUpdate()

TextChanged event is raised
TextLengthChanged event is raised
LineCountChanged event is raised
UpdateFinished event is raised

If the insert/remove/replace methods are called without a call to BeginUpdate(), they will call BeginUpdate() and EndUpdate() to ensure no change happens outside of UpdateStarted/UpdateFinished.

There can be multiple document changes between the BeginUpdate() and EndUpdate() calls. In this case, the events associated with EndUpdate will be raised only once after the whole document update is done.

The UndoStack listens to the UpdateStarted and UpdateFinished events to group all changes into a single undo step.
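
If you want to see this order for yourself, a sketch like the following records the events raised while two changes are grouped into one update. ChangeEventDemo is a hypothetical helper; the try/finally makes sure EndUpdate() is called even if one of the changes throws.

```csharp
using System.Collections.Generic;
using ICSharpCode.AvalonEdit.Document;

public static class ChangeEventDemo
{
    public static List<string> RecordEvents()
    {
        TextDocument document = new TextDocument();
        document.Text = "Hello World";

        // Subscribe after the initial text is set so only the update below is logged.
        List<string> log = new List<string>();
        document.UpdateStarted += delegate { log.Add("UpdateStarted"); };
        document.Changing += delegate { log.Add("Changing"); };
        document.Changed += delegate { log.Add("Changed"); };
        document.TextChanged += delegate { log.Add("TextChanged"); };
        document.UpdateFinished += delegate { log.Add("UpdateFinished"); };

        document.BeginUpdate();
        try {
            document.Insert(0, "Oh, ");                   // first change
            document.Remove(document.TextLength - 5, 5);  // second change
        } finally {
            document.EndUpdate();
        }
        return log;
    }
}
```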

TextAnchor

If you are working with the text editor, you will likely run into the problem that you need to store an offset, but want it to adjust automatically whenever text is inserted prior to that offset.

Sure, you could listen to the TextDocument.Changed event and call GetNewOffset on the DocumentChangeEventArgs to translate the offset, but that gets tedious, especially when your object is short-lived and you have to deal with deregistering the event handler at the right time.

A much simpler solution is to use the TextAnchor class. Usage:

TextAnchor anchor = document.CreateAnchor(offset);
ChangeMyDocument();
int newOffset = anchor.Offset;

The document will automatically update all text anchors; and because it uses weak references to do so, the GC can simply collect the anchor object when you don't need it anymore.

Moreover, the document is able to efficiently update a large number of anchors without having to look at each anchor object individually. Updating the offsets of all anchors usually only takes time logarithmic to the number of anchors. Retrieving the TextAnchor.Offset property also runs in O(lg N).

When a piece of text containing an anchor is removed, that anchor will be deleted. First, the TextAnchor.IsDeleted property is set to true on all deleted anchors, then the TextAnchor.Deleted events are raised. You cannot retrieve the offset from an anchor that has been deleted.

This deletion behavior might be useful when using anchors for building a bookmark feature, but in other cases you want to still be able to use the anchor. For those cases, set TextAnchor.SurviveDeletion = true.

Note that anchor movement is ambiguous if text is inserted exactly at the anchor's location. Does the anchor stay before the inserted text, or does it move after it? The property TextAnchor.MovementType will be used to determine which of these two options the anchor will choose. The default value is AnchorMovementType.BeforeInsertion.

If you want to track a segment, you can use the AnchorSegment class which implements ISegment using two text anchors.
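
Here is a short sketch of both behaviors: an anchor that moves behind text inserted exactly at its position, and an AnchorSegment that keeps pointing at the same word. AnchorDemo and its Result class are hypothetical names used only in this example.

```csharp
using ICSharpCode.AvalonEdit.Document;

public static class AnchorDemo
{
    public sealed class Result
    {
        public string TrackedText;
        public int AnchorOffset;
    }

    public static Result Run()
    {
        TextDocument document = new TextDocument();
        document.Text = "Hello World";

        // Anchor in front of the 'W'; it should move behind text inserted right at it.
        TextAnchor anchor = document.CreateAnchor(6);
        anchor.MovementType = AnchorMovementType.AfterInsertion;

        // Segment covering "World", tracked by two anchors.
        AnchorSegment segment = new AnchorSegment(document, 6, 5);

        // Insertion in front of both: the anchor moves from 6 to 10.
        document.Insert(0, "Oh, ");
        Result result = new Result();
        result.TrackedText = document.GetText(segment);

        // Insertion exactly at the anchor: AfterInsertion moves it behind the new text.
        document.Insert(anchor.Offset, "big ");
        result.AnchorOffset = anchor.Offset;
        return result;
    }
}
```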

TextSegmentCollection

Sometimes it is useful to store a list of segments and be able to efficiently find all segments overlapping with some other segment.
Example: you might want to store a large number of compiler warnings and render squiggly underlines only for those that are in the visible region of the document.

The TextSegmentCollection serves this purpose. Connected to a document, it will automatically update the offsets of all TextSegment instances inside the collection; but it also has the useful methods FindOverlappingSegments and FindFirstSegmentWithStartAfter. The underlying data structure is a hybrid between the one used for text anchors and an interval tree, so it is able to do both jobs quite fast.
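
The compiler-warning scenario could look roughly like this. WarningMarker and MarkerDemo are hypothetical classes for this sketch; the collection is connected to the document through its constructor, so the insertion at the start shifts all markers by 10 before the query runs.

```csharp
using System.Collections.Generic;
using ICSharpCode.AvalonEdit.Document;

// Hypothetical segment type carrying a warning message.
public class WarningMarker : TextSegment
{
    public string Message { get; set; }
}

public static class MarkerDemo
{
    public static List<string> WarningsInRegion()
    {
        TextDocument document = new TextDocument();
        document.Text = new string('x', 100);

        var markers = new TextSegmentCollection<WarningMarker>(document);
        markers.Add(new WarningMarker { StartOffset = 5, Length = 10, Message = "A" });
        markers.Add(new WarningMarker { StartOffset = 40, Length = 10, Message = "B" });
        markers.Add(new WarningMarker { StartOffset = 80, Length = 10, Message = "C" });

        // Markers now start at 15, 50 and 90.
        document.Insert(0, "0123456789");

        // Pretend offsets 45 to 70 are the visible region.
        var result = new List<string>();
        foreach (WarningMarker marker in markers.FindOverlappingSegments(45, 25))
            result.Add(marker.Message);
        return result;
    }
}
```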

Thread Safety

The TextDocument class is not thread-safe. It expects to have a single owner thread and will throw an InvalidOperationException when accessed from another thread.

However, there is a single method that is thread-safe: CreateSnapshot().
It returns an immutable snapshot of the document, and may be safely called even when the owner thread is concurrently modifying the document. This is very useful for features like a background parser that is running on its own thread.

The overload CreateSnapshot(out ChangeTrackingCheckpoint) also returns a ChangeTrackingCheckpoint for the document snapshot. Once you have two checkpoints, you can call GetChangesTo to retrieve the complete list of document changes that happened between those versions of the document.
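
A background job based on a snapshot might be sketched like this. SnapshotDemo is a hypothetical helper, and counting letters merely stands in for real parsing work; the point is that the worker only touches the snapshot, never the TextDocument itself.

```csharp
using System.Threading.Tasks;
using ICSharpCode.AvalonEdit.Document;

public static class SnapshotDemo
{
    public static Task<int> CountLettersInBackground(TextDocument document)
    {
        // The snapshot is immutable, so the background task can read it
        // while the owner thread continues editing the document.
        ITextSource snapshot = document.CreateSnapshot();

        return Task.Factory.StartNew(() => {
            int count = 0;
            foreach (char c in snapshot.Text) {
                if (char.IsLetter(c))
                    count++;
            }
            return count;
        });
    }
}
```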

Rendering

Did you notice that throughout the whole 'Document' section, there was no mention of extensibility? The text rendering infrastructure now has to compensate for that by being completely extensible.

The ICSharpCode.AvalonEdit.Rendering.TextView class is the heart of AvalonEdit. It takes care of getting the document onto the screen.

To do this in an extensible way, the TextView uses its own kind of model: the VisualLine. Visual lines are created only for the visible part of the document.

The rendering process looks like this:
[Figure: rendering pipeline]
The last step in the pipeline is the conversion to one or more System.Windows.Media.TextFormatting.TextLine instances. WPF then takes care of the actual text rendering.

Lifetime of visual lines

When the TextView needs to construct visual lines (usually before rendering), it first determines which DocumentLine is the top-most visible line in the currently viewed region. From there, it starts to build visual lines and also immediately does the conversion to TextLine (word-wrapping). The process stops once the viewed document region is filled.

The resulting visual lines (and TextLines) will be cached and reused in future rendering passes. When the user scrolls down, only the visual lines coming into view are created, the rest is reused.

The TextView.Redraw methods are used to remove visual lines from the cache. AvalonEdit will automatically redraw the affected lines when the document is changed, and will invalidate the whole cache when any editor options are changed. You will only have to call Redraw manually if you write extensions to the visual line creation process that maintain their own data source. For example, the FoldingManager invokes Redraw whenever text sections are expanded or collapsed.

Calling Redraw does not cause immediate recreation of the lines. They are just removed from the cache so that the next rendering step will recreate them. All redraw methods will enqueue a new rendering step, using the WPF Dispatcher with a low priority.
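
For an extension with its own data source, the manual invalidation could be sketched like this. RedrawHelper is a hypothetical class; I'm assuming the Redraw overload that takes an ISegment and a DispatcherPriority, so check the overloads available in your AvalonEdit version.

```csharp
using System.Windows.Threading;
using ICSharpCode.AvalonEdit.Document;
using ICSharpCode.AvalonEdit.Rendering;

public static class RedrawHelper
{
    // Call this when the extension's own data changed for some part of the document.
    public static void InvalidateRegion(TextView textView, ISegment changedRegion)
    {
        if (changedRegion != null) {
            // Only drops the cached visual lines touching the segment.
            textView.Redraw(changedRegion, DispatcherPriority.Normal);
        } else {
            // Unknown extent: drop the whole cache.
            textView.Redraw();
        }
    }
}
```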

Elements inside visual line

A visual line consists of a series of elements. These have both a DocumentLength measured in characters as well as a logical length called VisualLength. For normal text elements, the two lengths are identical; but some elements like fold markers may have a huge document length, yet a logical length of 1. On the other hand, some elements that are simply inserted by element generators may have a document length of 0, but still need a logical length of at least 1 to allow addressing elements inside the visual line.

The VisualColumn is a position inside a visual line as measured by the logical length of elements. It is counted starting from 0 at the beginning of the visual line.
Also, inside visual lines, relative offsets are used instead of normal offsets into the text document:
Absolute offset = relative offset + VisualLine.FirstDocumentLine.Offset
This means that offsets inside the visual line do not have to be adjusted when text is inserted or removed in front of the visual line; we simply rely on the document automatically updating DocumentLine.Offset.
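
Expressed in code, the conversions look roughly like this. VisualLineOffsets is a hypothetical helper class; the second method relies on VisualLine.GetVisualColumn, which takes a relative offset.

```csharp
using ICSharpCode.AvalonEdit.Rendering;

public static class VisualLineOffsets
{
    // Relative offset (inside the visual line) to absolute document offset.
    public static int ToAbsoluteOffset(VisualLine visualLine, int relativeOffset)
    {
        return visualLine.FirstDocumentLine.Offset + relativeOffset;
    }

    // Absolute document offset to visual column inside the given visual line.
    public static int ToVisualColumn(VisualLine visualLine, int absoluteOffset)
    {
        int relativeOffset = absoluteOffset - visualLine.FirstDocumentLine.Offset;
        return visualLine.GetVisualColumn(relativeOffset);
    }
}
```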

The main job of a visual line element is to implement the CreateTextRun method. This method should return a System.Windows.Media.TextFormatting.TextRun instance that can be rendered using the TextLine class.

Visual line elements can also handle mouse clicks and control how the caret should move. The mouse click handling might suffice as a light-weight alternative to embedding inline UIElements in the visual lines.

Element Generators

You can extend the text view by registering a custom class deriving from VisualLineElementGenerator in the TextView.ElementGenerators collection. This allows you to add custom VisualLineElements. Using the InlineObjectElement class, you can even put interactive WPF controls (anything derived from UIElement) into the text document.

For all document text not consumed by element generators, AvalonEdit will create VisualLineText elements.

Usually, the construction of the visual line will stop at the end of the DocumentLine. However, if some VisualLineElementGenerator creates an element that's longer than the rest of the line, construction of the visual line may resume in another DocumentLine. Currently, only the FoldingElementGenerator can cause one visual line to span multiple DocumentLines.

[Screenshot: Folding and ImageElementGenerator]

Here is the full source code for a class that implements embedding images into AvalonEdit:

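using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Controls;
using System.Windows.Media.Imaging;
using ICSharpCode.AvalonEdit.Document;
using ICSharpCode.AvalonEdit.Rendering;

public class ImageElementGenerator : VisualLineElementGenerator
{
    static readonly Regex imageRegex = new Regex(@"<img src=""([\.\/\w\d]+)""/?>",
                                                 RegexOptions.IgnoreCase);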
    readonly string basePath;
    
    public ImageElementGenerator(string basePath)
    {
        if (basePath == null)
            throw new ArgumentNullException("basePath");
        this.basePath = basePath;
    }
    
    Match FindMatch(int startOffset)
    {
        // fetch the end offset of the VisualLine being generated
        int endOffset = CurrentContext.VisualLine.LastDocumentLine.EndOffset;
        TextDocument document = CurrentContext.Document;
        string relevantText = document.GetText(startOffset, endOffset - startOffset);
        return imageRegex.Match(relevantText);
    }
    
    /// Gets the first offset >= startOffset where the generator wants to construct
    /// an element.
    /// Return -1 to signal no interest.
    public override int GetFirstInterestedOffset(int startOffset)
    {
        Match m = FindMatch(startOffset);
        return m.Success ? (startOffset + m.Index) : -1;
    }
    
    /// Constructs an element at the specified offset.
    /// May return null if no element should be constructed.
    public override VisualLineElement ConstructElement(int offset)
    {
        Match m = FindMatch(offset);
        // check whether there's a match exactly at offset
        if (m.Success && m.Index == 0) {
            BitmapImage bitmap = LoadBitmap(m.Groups[1].Value);
            if (bitmap != null) {
                Image image = new Image();
                image.Source = bitmap;
                image.Width = bitmap.PixelWidth;
                image.Height = bitmap.PixelHeight;
                // Pass the length of the match to the 'documentLength' parameter
                // of InlineObjectElement.
                return new InlineObjectElement(m.Length, image);
            }
        }
        return null;
    }
    
    BitmapImage LoadBitmap(string fileName)
    {
        // TODO: add some kind of cache to avoid reloading the image whenever the
        // VisualLine is reconstructed
        try {
            string fullFileName = Path.Combine(basePath, fileName);
            if (File.Exists(fullFileName)) {
                BitmapImage bitmap = new BitmapImage(new Uri(fullFileName));
                bitmap.Freeze();
                return bitmap;
            }
        } catch (ArgumentException) {
            // invalid filename syntax
        } catch (IOException) {
            // other IO error
        }
        return null;
    }
}
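
To activate the generator, add an instance to the ElementGenerators collection of the text view. A minimal sketch, assuming the ImageElementGenerator class from the listing above; the ImageSupport class and the imageFolder parameter are made up for this example.

```csharp
using ICSharpCode.AvalonEdit;

public static class ImageSupport
{
    // imageFolder is the directory against which the src="..." paths are resolved.
    public static void Enable(TextEditor textEditor, string imageFolder)
    {
        textEditor.TextArea.TextView.ElementGenerators.Add(
            new ImageElementGenerator(imageFolder));
    }
}
```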

Line Transformers

Line transformers can modify the visual lines after they have been generated. The main usage of this is to colorize the text, as done both by syntax highlighting and the selection.

The base classes ColorizingTransformer and DocumentColorizingTransformer help with this task by providing helper methods for colorizing that split up visual line elements where necessary. The difference between the two classes is that one works using visual columns whereas the other one uses offsets into the document.

Here is an example DocumentColorizingTransformer that highlights the word 'AvalonEdit' using a bold italic font:

public class ColorizeAvalonEdit : DocumentColorizingTransformer
{
    protected override void ColorizeLine(DocumentLine line)
    {
        int lineStartOffset = line.Offset;
        string text = CurrentContext.Document.GetText(line);
        int start = 0;
        int index;
        while ((index = text.IndexOf("AvalonEdit", start)) >= 0) {
            base.ChangeLinePart(
                lineStartOffset + index, // startOffset
                lineStartOffset + index + 10, // endOffset
                (VisualLineElement element) => {
                    // This lambda gets called once for every VisualLineElement
                    // between the specified offsets.
                    Typeface tf = element.TextRunProperties.Typeface;
                    // Replace the typeface with a modified version of the same typeface
                    element.TextRunProperties.SetTypeface(new Typeface(
                        tf.FontFamily,
                        FontStyles.Italic,
                        FontWeights.Bold,
                        tf.Stretch
                    ));
                });
            start = index + 1; // search for next occurrence
        }
    }
}
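
Like element generators, line transformers have to be registered with the text view. A minimal sketch, assuming the ColorizeAvalonEdit class from the listing above; the ColorizerSetup class is a hypothetical name.

```csharp
using ICSharpCode.AvalonEdit;

public static class ColorizerSetup
{
    public static void Enable(TextEditor textEditor)
    {
        // Transformers run in the order in which they appear in this collection.
        textEditor.TextArea.TextView.LineTransformers.Add(new ColorizeAvalonEdit());
    }
}
```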

Background renderers

Background renderers are simple objects that allow you to draw anything in the text view. They can be used to draw nice-looking backgrounds behind the text.

AvalonEdit contains the class BackgroundGeometryBuilder that helps with this task. You can use the static BackgroundGeometryBuilder.GetRectsForSegment to fetch a list of rectangles that contain text from the specified segment (you will get one rectangle per TextLine); or you can use the instance methods to build a PathGeometry for the text's outline. AvalonEdit also internally uses this geometry builder to create the selection with the rounded corners.

Inside SharpDevelop, the first option (getting a list of rectangles) is used to render the squiggly red line for compiler errors, while the second option is used to produce nice-looking breakpoint markers.
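
As a rough sketch of the first option, the following renderer paints a plain rectangle behind a given segment. SegmentHighlightRenderer is a hypothetical class, and the color is arbitrary; it assumes the IBackgroundRenderer interface with a Layer property and a Draw method, as found in current AvalonEdit versions.

```csharp
using System;
using System.Windows;
using System.Windows.Media;
using ICSharpCode.AvalonEdit.Document;
using ICSharpCode.AvalonEdit.Rendering;

public class SegmentHighlightRenderer : IBackgroundRenderer
{
    readonly ISegment segment;

    public SegmentHighlightRenderer(ISegment segment)
    {
        if (segment == null)
            throw new ArgumentNullException("segment");
        this.segment = segment;
    }

    // Draw behind the text and the selection.
    public KnownLayer Layer {
        get { return KnownLayer.Background; }
    }

    public void Draw(TextView textView, DrawingContext drawingContext)
    {
        if (textView.Document == null)
            return;
        textView.EnsureVisualLines();
        // One rectangle per TextLine; parts outside the visible region yield nothing.
        foreach (Rect rect in BackgroundGeometryBuilder.GetRectsForSegment(textView, segment))
            drawingContext.DrawRectangle(Brushes.LightYellow, null, rect);
    }
}
```

Register it using textEditor.TextArea.TextView.BackgroundRenderers.Add(...). Passing an AnchorSegment keeps the highlighted region in place while the document is edited.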

Editing

The TextArea class handles user input and executes the appropriate actions. Both the caret and the selection are controlled by the TextArea.

You can customize the text area by modifying the TextArea.DefaultInputHandler by adding new or replacing existing WPF input bindings in it. You can also set TextArea.ActiveInputHandler to something different than the default to switch the text area into another mode. You could use this to implement an "incremental search" feature, or even a VI emulator.

The text area has the useful LeftMargins property - use it to add controls to the left of the text view that look like they're inside the scroll viewer, but don't actually scroll. The AbstractMargin base class contains some useful code to detect when the margin is attached to or detached from a text view, or when the active document changes. However, you're not forced to use it; any UIElement can be used as a margin.
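
Both customizations might be sketched as follows. TextAreaCustomization is a hypothetical class, the command is whatever ICommand your application provides, and the Ctrl+D shortcut and the gray strip are arbitrary choices for this example.

```csharp
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media;
using ICSharpCode.AvalonEdit.Editing;

public static class TextAreaCustomization
{
    public static void Apply(TextArea textArea, ICommand command)
    {
        // Add a key binding to the default input handler.
        textArea.DefaultInputHandler.InputBindings.Add(
            new KeyBinding(command, Key.D, ModifierKeys.Control));

        // Any UIElement can serve as a margin; here a plain gray strip.
        Border strip = new Border();
        strip.Width = 12;
        strip.Background = Brushes.LightGray;
        textArea.LeftMargins.Add(strip);
    }
}
```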

Folding

Folding (code collapsing) could be implemented as an extension to the editor without having to modify the AvalonEdit code. A VisualLineElementGenerator takes care of the collapsed sections in the text document, and a custom margin draws the plus and minus buttons.

That's exactly how folding is implemented in AvalonEdit. However, to make it a bit easier to use, the static FoldingManager.Install method will create and register the necessary parts automatically.

All that's left for you is to regularly call FoldingManager.UpdateFoldings with the list of foldings you want to provide.

Here is the full code required to enable folding:

foldingManager = FoldingManager.Install(textEditor.TextArea);
foldingStrategy = new XmlFoldingStrategy();
foldingStrategy.UpdateFoldings(foldingManager, textEditor.Document);

If you want the folding markers to update when the text is changed, you have to repeat the foldingStrategy.UpdateFoldings call regularly.
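
One simple way to do that is a timer on the UI thread. This is only a sketch: FoldingSetup is a hypothetical class, and the two-second interval is an arbitrary choice; for large documents you may prefer to update only after the text has actually changed.

```csharp
using System;
using System.Windows.Threading;
using ICSharpCode.AvalonEdit;
using ICSharpCode.AvalonEdit.Folding;

public static class FoldingSetup
{
    // Installs XML folding and refreshes the foldings periodically.
    // Stop the returned timer when the editor is closed.
    public static DispatcherTimer EnableXmlFolding(TextEditor textEditor)
    {
        FoldingManager foldingManager = FoldingManager.Install(textEditor.TextArea);
        XmlFoldingStrategy foldingStrategy = new XmlFoldingStrategy();
        foldingStrategy.UpdateFoldings(foldingManager, textEditor.Document);

        DispatcherTimer timer = new DispatcherTimer();
        timer.Interval = TimeSpan.FromSeconds(2);
        timer.Tick += delegate {
            foldingStrategy.UpdateFoldings(foldingManager, textEditor.Document);
        };
        timer.Start();
        return timer;
    }
}
```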

The sample application to this article also contains the BraceFoldingStrategy that folds using { and }. However, it is a very simple implementation and does not realize that { and } inside strings or comments are not code.

Syntax highlighting

TODO: write this section

Note: although my sample code is provided under the MIT license, ICSharpCode.AvalonEdit itself is provided under the terms of the GNU LGPL.