Charles Petzold



Pagination with DirectWrite

October 11, 2013
New York, N.Y.

Beginning in the August 2013 issue of MSDN Magazine, I've turned the focus of my DirectX Factor column away from XAudio2 and towards Direct2D and DirectWrite. The August installment showed how to implement a simple finger-painting program with Direct2D geometries, and the September column explored various geometry manipulations.

The October 2013 DirectX Factor seemingly abandons Direct2D geometries for some basic DirectWrite text formatting. But I have a longer arc in mind: As you'll see in the months ahead, I'll be using the GetGlyphRunOutline method of IDWriteFontFace to obtain geometries that describe the outline of text characters, and then plunge into all the neat things you can do with that, including Direct2D effects that provide a smooth and gentle entry into the world of 3D graphics.

A prerequisite to calling GetGlyphRunOutline is a familiarity with glyph runs themselves, which is the topic of November's column. (Look for it around the first of the month.) But glyph runs represent a rather advanced approach to rendering text, and I thought it would be perverse to plunge right into glyph runs without first discussing more routine text formatting and rendering techniques. The normal way to render text makes use of the IDWriteTextFormat and IDWriteTextLayout interfaces, and the corresponding DrawText and DrawTextLayout methods defined by ID2D1RenderTarget. That's part of the rationale behind the October column.

As I was originally working on that column, I very much wanted to demonstrate how to paginate text or put text in columns (which are algorithmically equivalent exercises) or wrap text around a picture (which is algorithmically similar). I realized that IDWriteTextLayout came tantalizingly close to providing this facility but it was missing a crucial method. This missing method is conceptually simple but the devil is in the implementation.

Of course, DirectWrite lets you get as low-level as you need, so you can always take on the job of formatting paragraphs and pages yourself by obtaining character metrics and figuring out where the white space is and where the line breaks and paragraph breaks occur. But IDWriteTextLayout does such a good job at paragraph formatting that it would be a shame not to exploit it for pagination jobs.

So that's my basic goal here: Continue to use IDWriteTextLayout for paragraph formatting while also using it to implement pagination. This blog entry provides the missing magic method necessary for this feat. This blog entry is where I would have taken the DirectX Factor column had I chosen to stick with DirectWrite instead of getting into glyph run geometries.

Each of the numbered headers below is associated with a project named PaginationExperimentN, where N is the number of the header. You can download all 6 projects here.

1. Pagination with Whole Paragraphs

The October DirectX Factor column was the first one I wrote where I was able to target Windows 8.1 Preview and use Visual Studio Express 2013 Preview. A previous blog entry discussed the differences between the "DirectX App (XAML)" template in that new version of Visual Studio and the earlier "Direct2D App (XAML)" template in Visual Studio 2012.

The DirectX project templates have changed once again in the release candidate (RC) of Visual Studio 2013, although not nearly as much. The biggest change involves the rendering loop: In Visual Studio 2013 Preview, the "DirectX App (XAML)" template used a CompositionTarget::Rendering event handler, which runs in the user interface thread. In Visual Studio 2013 RC, the rendering loop is implemented in the ProjectNameMain class, and it's an infinite while loop passed to a WorkItemHandler constructor so it runs in a secondary thread:

auto workItemHandler = ref new WorkItemHandler([this](IAsyncAction ^ action)
{
    while (action->Status == AsyncStatus::Started)
    {
        critical_section::scoped_lock lock(m_criticalSection);
        Update();
        if (Render())
        {
            m_deviceResources->Present();
        }
    }
});

Notice the code that instantiates a critical_section::scoped_lock object named lock. I discussed that class in the October DirectX Factor column, and I guess my discussion was somewhat prescient, for this class is now used extensively in the new "DirectX App (XAML)" template. In both the DirectXPage and ProjectNameMain classes, it prevents code from two different threads accessing DirectX concurrently, which can be problematic. An object of type critical_section is a private data member in the ProjectNameMain class, and the class provides a public GetCriticalSection method so that DirectXPage can access it as well. The template architecture implies that Renderer classes should not need access to it.

Let's begin with a simple exercise: Paginating a document without splitting paragraphs between pages. Each resultant page contains only entire paragraphs. This program is based on the ScrollableAlice project in the October DirectX Factor column, and it retains a lot of the code in that project, but converted to the new "DirectX App (XAML)" template.

The earlier ScrollableAlice program used the IDWriteTextLayout interface and DrawTextLayout method to display Chapter 7 of Alice's Adventures in Wonderland with vertical scrolling. The new PaginationExperiment1 program divides the chapter into columns, which it displays horizontally with scrolling. (As I mentioned, dividing text into columns is algorithmically equivalent to pagination.)

The Content folder of the PaginationExperiment1 program contains the same AliceParagraphGenerator class as the earlier project, and it makes use of the same FirstLineIndent class as the earlier project. AliceParagraphGenerator creates a collection of Paragraph objects, one for each paragraph in the chapter. I slightly enhanced the Paragraph structure to contain a new field named Origin:

struct Paragraph
{
    std::wstring Text;
    Microsoft::WRL::ComPtr<IDWriteTextLayout> TextLayout;
    float TextHeight;
    float SpaceAfter;
    D2D1_POINT_2F Origin;
};

This field is not set by AliceParagraphGenerator. It's set instead in the pagination logic when paragraphs are divided into pages. The project also includes a new structure named Page:

struct Page
{
    D2D1_SIZE_F            Size;
    std::vector<Paragraph> Paragraphs;
};

As with the earlier program, the text of the chapter is loaded asynchronously in DirectXPage:

    Loaded += ref new RoutedEventHandler([this](Object^, Object^)
    {
        create_task(PathIO::ReadLinesAsync("ms-appx:///Text/AliceChapterVII.txt",
            UnicodeEncoding::Utf8))
            .then([this](IVector<String^>^ lines)
        {
            std::vector<std::wstring> vLines;

            for (String^ line : lines)
                vLines.push_back(std::wstring(line->Data()));

            critical_section::scoped_lock lock(m_main->GetCriticalSection());
            m_main->SetAliceText(vLines);
        });
    });

The SetAliceText method in PaginationExperiment1Main calls a same-named method in PaginationExperiment1Renderer.

The PaginationExperiment1Renderer class declares a private field to store a collection of Page objects:

std::vector<Page> m_pages;

The chapter is re-paginated whenever the text of the chapter changes (which should only happen once when the text is loaded) and when the window size changes. The pagination code sets the height of each "page" (actually column) to the height of the window, but the width and margins are hard-coded:

void PaginationExperiment1Renderer::CreateWindowSizeDependentResources()
{
    PaginateChapter();
}

void PaginationExperiment1Renderer::SetAliceText(std::vector<std::wstring> lines)
{
    m_aliceParagraphGenerator.SetText(lines);
    PaginateChapter();
}

void PaginationExperiment1Renderer::PaginateChapter()
{
    if (m_aliceParagraphGenerator.ParagraphCount() == 0)
        return;

    Windows::Foundation::Size windowSize = m_deviceResources->GetLogicalSize();

    if (windowSize.IsEmpty)
        return;

    // Set some constants and calculate page and text sizes
    float pageHeight = windowSize.Height;
    float topMargin = 25;
    float bottomMargin = 25;
    float textHeight = pageHeight - topMargin - bottomMargin;

    float pageWidth = 400;
    float leftMargin = 25;
    float rightMargin = 25;
    float textWidth = pageWidth - leftMargin - rightMargin;

    // Set the width of all the paragraphs; also generates formatted heights
    m_aliceParagraphGenerator.SetWidth(textWidth);

    // Prepare for assembling paragraphs into pages
    std::vector<Paragraph> paragraphs = 
                    m_aliceParagraphGenerator.GetParagraphs();
    m_pages.clear();

    // Prepare for the first page
    float x = leftMargin;
    float y = topMargin;
    float maxY = pageHeight - bottomMargin;
    Page page;
    page.Size = SizeF(pageWidth, pageHeight);   // uniform page size

    // Loop through all the paragraphs
    for (Paragraph& paragraph : paragraphs)
    {
        // Check if the paragraph will make the page too long
        if (y + paragraph.TextHeight > maxY)
        {
            // If so, save the page and begin a new page
            m_pages.push_back(page);
            x = leftMargin;
            y = topMargin;
            page.Paragraphs.clear();
        }

        // Determine where the paragraph should appear and save it
        paragraph.Origin = Point2F(x, y);
        page.Paragraphs.push_back(paragraph);

        // Bump up the y coordinate for the next paragraph
        y += paragraph.TextHeight + paragraph.SpaceAfter;
    }

    // Save the last page as well
    m_pages.push_back(page);

    ...
}

Notice the early call to the SetWidth method of AliceParagraphGenerator. That call results in the desired paragraph width and an infinite height being set on each IDWriteTextLayout object through calls to SetMaxWidth and SetMaxHeight. The GetMetrics method of IDWriteTextLayout then obtains information that includes the height of the formatted paragraph, and this height is stored in the TextHeight field of the Paragraph object.

The Render method in PaginationExperiment1Renderer then simply loops through the Page objects and Paragraph objects and displays each one at the indicated Origin field, bumping up the translate transform after each column is completed:

void PaginationExperiment1Renderer::Render()
{
    ...

    for (const Page& page : m_pages)
    {
        for (const Paragraph& paragraph : page.Paragraphs)
        {
            context->DrawTextLayout(paragraph.Origin, 
                                    paragraph.TextLayout.Get(), 
                                    m_blackBrush.Get());
        }

        // Set transform for next column
        D2D1_MATRIX_3X2_F xform;
        context->GetTransform(&xform);
        xform = Matrix3x2F::Translation(page.Size.width, 0) * xform;
        context->SetTransform(&xform);
    }

    ...
}

Here's the result, shrunk down to 1/3 size:

Notice that if a paragraph doesn't fully fit in a column, the entire paragraph is moved to the next column. This is obviously a very primitive pagination algorithm! Fixing that problem is a high priority. The pagination code also contains a bug: If a paragraph is too long to display in a column, an infinite loop will result.
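One way to guard against an oversized paragraph can be sketched in portable C++. This is an illustration under stated assumptions rather than code from the project: ParagraphBox and PaginateWhole are hypothetical stand-ins for the Paragraph structure and PaginateChapter, and the paragraph heights are assumed to be measured already. The idea is to start a new column only when the current column already holds something.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the article's Paragraph: only the
// measurements that the pagination loop needs.
struct ParagraphBox
{
    float textHeight;   // formatted height of the paragraph
    float spaceAfter;   // gap below the paragraph
};

// Returns, for each page, the indices of the paragraphs placed on it.
std::vector<std::vector<std::size_t>> PaginateWhole(
    const std::vector<ParagraphBox>& paragraphs,
    float topMargin, float maxY)
{
    assert(maxY > topMargin);

    std::vector<std::vector<std::size_t>> pages;
    std::vector<std::size_t> page;
    float y = topMargin;

    for (std::size_t i = 0; i < paragraphs.size(); i++)
    {
        // Break only if the page already has content. A paragraph that
        // is taller than the whole column is then placed alone on a
        // page and overflows, so no empty page is emitted and no
        // retry is needed.
        if (!page.empty() && y + paragraphs[i].textHeight > maxY)
        {
            pages.push_back(page);
            page.clear();
            y = topMargin;
        }

        page.push_back(i);
        y += paragraphs[i].textHeight + paragraphs[i].spaceAfter;
    }

    // Save the last page only if something was placed on it
    if (!page.empty())
        pages.push_back(page);

    return pages;
}
```

The oversized paragraph still runs past the bottom margin in this sketch; avoiding that requires splitting the paragraph itself.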

But also notice that otherwise the text is nicely formatted, with a first line indent at the beginning of each paragraph, and with different font sizes and italics, all of which are part of the power of IDWriteTextLayout as I demonstrated in the October 2013 DirectX Factor column.

The program also provides horizontal scrolling for viewing the rest of the chapter. As in the earlier ScrollableAlice project, this program creates a GestureRecognizer class that converts Pointer events to Manipulation events, which I prefer to use for scrolling because they provide touch inertia. This GestureRecognizer is created in the same secondary thread that executes the Pointer events, and fires a ManipulationUpdated event handled in DirectXPage like so:

void DirectXPage::OnManipulationUpdated(GestureRecognizer^ sender, 
                                        ManipulationUpdatedEventArgs^ args)
{
    m_main->ScrollHorizontalDelta(args->Delta.Translation.X);
}

The PaginationExperiment1Main class passes the information on to PaginationExperiment1Renderer. The PaginateChapter method concludes by calculating a maximum scrolling value, so ScrollHorizontalDelta can use that to calculate a current scrolling value:

void PaginationExperiment1Renderer::PaginateChapter()
{
    ...

    // Calculate the maximum scroll
    m_maximumScroll = m_pages.size() * pageWidth - windowSize.Width;
    ScrollHorizontalDelta(0);
}

void PaginationExperiment1Renderer::ScrollHorizontalDelta(float deltaX)
{
    m_currentScroll = max(0, min(m_maximumScroll, m_currentScroll - deltaX));
}

The Render method uses m_currentScroll to set an initial transform prior to the DrawTextLayout calls:

void PaginationExperiment1Renderer::Render()
{
    ...

    Matrix3x2F scrollTransform = Matrix3x2F::Translation(-m_currentScroll, 0);
    context->SetTransform(scrollTransform * 
                          m_deviceResources->GetOrientationTransform2D());

    for (const Page& page : m_pages)
    {
        for (const Paragraph& paragraph : page.Paragraphs)
        {
            context->DrawTextLayout(paragraph.Origin, 
                                    paragraph.TextLayout.Get(), 
                                    m_blackBrush.Get());
        }

        // Set transform for next column
        D2D1_MATRIX_3X2_F xform;
        context->GetTransform(&xform);
        xform = Matrix3x2F::Translation(page.Size.width, 0) * xform;
        context->SetTransform(&xform);
    }

    ...
}

To keep the program simple, I've cut a few corners. Various fields of the Paragraph structure have roles in different parts of the program: The Text field is only used within the AliceParagraphGenerator class to simplify applying italic formatting. The TextHeight and SpaceAfter fields are used by the PaginateChapter method, but only the Origin and TextLayout fields are required in the Render method.

The program can be made more efficient in a couple ways: The PaginationExperiment1Renderer class redisplays the text at the frame rate of the video display, but it only needs to redisplay the text when something is changed, such as the scrolling offset. When rendering, it need only call DrawTextLayout for pages that are currently displayed on the screen. These issues will be addressed much later in this blog entry.
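The second of those optimizations can be sketched independently of Direct2D. Assuming uniform page widths, as in this program, the scroll offset and the window width determine which pages intersect the window. VisiblePageRange is a hypothetical helper written for this illustration and is not part of the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>

// Returns the half-open range [first, last) of page indices that
// intersect the window, given uniform page widths.
std::pair<std::size_t, std::size_t> VisiblePageRange(
    float currentScroll, float windowWidth,
    float pageWidth, std::size_t pageCount)
{
    assert(pageWidth > 0 && windowWidth > 0 && currentScroll >= 0);

    // Page i covers [i * pageWidth, (i + 1) * pageWidth) in document
    // space; the window covers [currentScroll, currentScroll + windowWidth).
    std::size_t first = static_cast<std::size_t>(
        std::floor(currentScroll / pageWidth));
    std::size_t last = static_cast<std::size_t>(
        std::ceil((currentScroll + windowWidth) / pageWidth));

    // Never report pages that don't exist
    first = std::min(first, pageCount);
    last = std::min(last, pageCount);
    return { first, last };
}
```

A Render method using this range would loop over only those page indices, and its initial transform would need an extra horizontal translation of first * pageWidth to account for the skipped pages.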

2. Splitting Paragraphs Between Pages

But I really want to focus now on the problem of splitting paragraphs that come at the end of a page (or column). We need to display part of the paragraph at the bottom of one page, and the remainder of the paragraph at the top of the next page.

Each paragraph is an IDWriteTextLayout object, and IDWriteTextLayout has tools to determine how paragraphs can be split between pages or columns. Besides the GetMetrics method that tells you the total height of the formatted paragraph, IDWriteTextLayout also has a GetLineMetrics method. This method returns a collection of DWRITE_LINE_METRICS structures, one for each line in the formatted paragraph. These tell you the number of text characters in that line, and the rendered height of the line.

Using these DWRITE_LINE_METRICS structures, you can determine how many lines of the paragraph can fit at the bottom of a page, and also the number of characters in those lines. The remaining lines of the paragraph go at the top of the next page.
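That calculation can be sketched without DirectWrite. LineInfo is a hypothetical stand-in carrying just the two DWRITE_LINE_METRICS values mentioned above (the line's length and height), and CountFittingCharacters is likewise an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two DWRITE_LINE_METRICS fields used here.
struct LineInfo
{
    std::uint32_t length;   // number of text positions in the line
    float height;           // rendered height of the line
};

// Walks the lines from the top of the paragraph, starting at y, and
// returns how many characters belong to the lines that fit above maxY.
// A result of 0 means that not even the first line fits.
std::uint32_t CountFittingCharacters(const std::vector<LineInfo>& lines,
                                     float y, float maxY)
{
    assert(y <= maxY);

    std::uint32_t characterCount = 0;

    for (const LineInfo& line : lines)
    {
        // Stop at the first line that would cross the bottom margin
        if (y + line.height > maxY)
            break;

        characterCount += line.length;
        y += line.height;
    }

    return characterCount;
}
```

The returned count doubles as the text offset at which the paragraph would be divided: everything before it stays on the current page, and everything from it onward moves to the next.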

Using GetLineMetrics to determine how a paragraph should straddle two pages is fairly straightforward. But actually splitting that paragraph into two pieces is not so easy. I suppose one approach is to display the same IDWriteTextLayout for the part of the paragraph at the bottom of one page and the part of the paragraph at the top of the next page, but with two different clipping rectangles that restrict the display to exactly the part of the paragraph you want. But this seems tricky to me, and I'm not real fond of the solution.

Moreover, while this clipping approach might work for displaying text in pages or columns, it would not work for a similar problem of wrapping text around a rectangular image. For wrapping you need to determine how much of the paragraph can be displayed on top of the image, and then how much can fit on one side of the image. The part of the paragraph at the side of the image has a different width and text starting position than the part of the paragraph above the image.

What you really want is the ability to split an IDWriteTextLayout object into two new IDWriteTextLayout objects based on an offset into the text string of the original IDWriteTextLayout:

HRESULT Split(IDWriteTextLayout* inputLayout, 
              UINT32 textOffset,
              IDWriteTextLayout** newLayout1,
              IDWriteTextLayout** newLayout2);

For example, suppose the text associated with an IDWriteTextLayout paragraph contains 300 characters. You determine from examining the line heights that 225 characters should be at the bottom of one page, and the remaining 75 at the top of the next page. You pass this IDWriteTextLayout object to Split with a text offset of 225, and you get back two new IDWriteTextLayout objects, one with the first 225 characters of the paragraph and the other with the last 75 characters.

This Split function would be great! But it’s not that easy. First of all, you can’t write such a function solely using information available externally from IDWriteTextLayout. IDWriteTextLayout doesn't even have a GetText method. This text is crucial. You originally specify this text string in the CreateTextLayout method of IDWriteFactory when creating an IDWriteTextLayout object, but you can’t later retrieve that text from the object. This hypothetical Split function can’t even begin to create two new IDWriteTextLayout objects without that original text string.

Of course you can just retain that original text string and include it as an argument to the Split function. But there’s an even more intractable problem: Consider the SetFontStyle method of IDWriteTextLayout. This is how you italicize a group of consecutive characters in the paragraph. You pass this method a member of the DWRITE_FONT_STYLE enumeration to indicate NORMAL, OBLIQUE, or ITALIC; and a DWRITE_TEXT_RANGE structure with a text start position and a length that indicates where the formatting is to be applied.

If you’re splitting an IDWriteTextLayout object into two new IDWriteTextLayout objects, you need to transfer this styling into one or both of the two new objects with adjusted DWRITE_TEXT_RANGE values.

IDWriteTextLayout has a GetFontStyle method that at first seems like it might help to obtain styling currently associated with the paragraph. But the GetFontStyle method must be called with a particular text position, which means that if you need to extract all the italic information from a paragraph, you need to walk through the text, calling GetFontStyle at one position after another. Moreover, there are 14 Set methods that work like SetFontStyle and accept a text range. (I'll refer to these as the "range-based" methods.) All the information passed to these methods must be retained and modified for a successful Split function.

What this strongly suggests to me is that Split can't be external to the IDWriteTextLayout object. It needs to be a public method of the class, and this means that we need a custom class that itself implements the IDWriteTextLayout interface. This class maintains a real IDWriteTextLayout object internally, and it also defines a Split method that creates two instances of this custom class from a single instance. This custom class must itself retain sufficient information when all the 14 range-based Set methods are called so the new instances created from the Split method have the correct formatting applied.
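The bookkeeping this implies can be sketched in isolation. Assume the class records each range-based call as a start position and a length, as DWRITE_TEXT_RANGE does. A split at a text offset then divides each recorded range into a piece for the first layout and a piece for the second, with the second piece rebased so that the offset becomes position 0. Range, SplitRanges, and SplitRange are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical mirror of DWRITE_TEXT_RANGE: a start position and a length.
struct Range
{
    std::uint32_t start;
    std::uint32_t length;
};

struct SplitRanges
{
    Range first;    // portion before the offset (length 0 if none)
    Range second;   // portion at or after the offset, rebased to 0
};

SplitRanges SplitRange(Range range, std::uint32_t offset)
{
    // 64-bit arithmetic so that start + length cannot wrap around
    std::uint64_t begin = range.start;
    std::uint64_t end = begin + range.length;
    std::uint64_t split = offset;

    // Clamp both ends to the region before the offset
    std::uint64_t firstBegin = std::min(begin, split);
    std::uint64_t firstEnd = std::min(end, split);

    // Clamp both ends to the region after the offset, then rebase
    std::uint64_t secondBegin = std::max(begin, split) - split;
    std::uint64_t secondEnd = std::max(end, split) - split;

    SplitRanges result;
    result.first = { static_cast<std::uint32_t>(firstBegin),
                     static_cast<std::uint32_t>(firstEnd - firstBegin) };
    result.second = { static_cast<std::uint32_t>(secondBegin),
                      static_cast<std::uint32_t>(secondEnd - secondBegin) };

    // The two pieces together cover exactly the original range
    assert(result.first.length + result.second.length == range.length);
    return result;
}
```

For the 300-character example above, an italic run at positions 200 through 249 split at offset 225 becomes positions 200 through 224 in the first layout and positions 0 through 24 in the second. A piece with a length of 0 simply means that no call needs to be replayed on that layout.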

As a feasibility study of sorts, the PaginationExperiment2 project includes a class named SplittableTextLayout that implements the IDWriteTextLayout interface and includes a Split method. To keep this feasibility study simple — well, as simple as possible given that it must implement the IDWriteTextLayout interface — SplittableTextLayout does not retain any formatting applied through method calls.

SplittableTextLayout contains a public static method named Create for instantiation. (A static method can return an HRESULT to indicate success or failure whereas a constructor cannot, so the SplittableTextLayout constructor is protected.) This Create method has the same arguments as the CreateTextLayout method of IDWriteFactory, and it uses CreateTextLayout to create an IDWriteTextLayout object that it saves as a private field. All the arguments to Create are also saved as private fields:

HRESULT SplittableTextLayout::Create(_In_ const WCHAR * string,
                                     UINT32  stringLength,
                                     IDWriteTextFormat * textFormat,
                                     FLOAT  maxWidth,
                                     FLOAT  maxHeight,
                                     _Outptr_ SplittableTextLayout ** splittableTextLayout)
{
    ComPtr<IDWriteFactory2> dwriteFactory;

    HRESULT hr = DWriteCreateFactory(DWRITE_FACTORY_TYPE_SHARED,
                                    __uuidof(IDWriteFactory2),
                                    &dwriteFactory);

    if (FAILED(hr))
        return hr;

    ComPtr<IDWriteTextLayout> dwriteTextLayout;
    
    hr = dwriteFactory->CreateTextLayout(string, 
                                         stringLength, 
                                         textFormat, 
                                         maxWidth, 
                                         maxHeight, 
                                         &dwriteTextLayout);

    if (FAILED(hr))
        return hr;

    SplittableTextLayout* pSplittableTextLayout = new SplittableTextLayout();

    pSplittableTextLayout->m_dwriteFactory = dwriteFactory;
    pSplittableTextLayout->m_dwriteTextFormat = textFormat;
    pSplittableTextLayout->m_string = string;
    pSplittableTextLayout->m_stringLength = stringLength;
    pSplittableTextLayout->m_maxWidth = maxWidth;
    pSplittableTextLayout->m_maxHeight = maxHeight;

    dwriteTextLayout.As(&pSplittableTextLayout->m_dwriteTextLayout);

    *splittableTextLayout = pSplittableTextLayout;
    return S_OK;
}

SplittableTextLayout implements the IDWriteTextLayout1 interface, which derives from IDWriteTextLayout, which derives from IDWriteTextFormat, which derives from IUnknown. SplittableTextLayout therefore contains all the methods declared by all these interfaces, and they total over 70. However, in this feasibility-study version, these methods are defined in the header file to simply pass the method call along to the private IDWriteTextLayout1 object, for example:

HRESULT STDMETHODCALLTYPE SetUnderline(BOOL hasUnderline, 
                                       DWRITE_TEXT_RANGE textRange) 
{
    return m_dwriteTextLayout->SetUnderline(hasUnderline, textRange);
}

The exceptions to this rule are SetMaxWidth and SetMaxHeight. The class already defines private fields for these items because they are set in the Create method, so defining these methods to save the new values was very easy:

HRESULT STDMETHODCALLTYPE SetMaxHeight(FLOAT maxHeight) 
{
    m_maxHeight = maxHeight;
    return m_dwriteTextLayout->SetMaxHeight(maxHeight);
}

HRESULT STDMETHODCALLTYPE SetMaxWidth(FLOAT maxWidth) 
{
    m_maxWidth = maxWidth;
    return m_dwriteTextLayout->SetMaxWidth(maxWidth);
}

The only new method that SplittableTextLayout defines on its own is Split, and like the description above it creates two SplittableTextLayout objects by splitting the text of an existing object based on a character offset:

HRESULT SplittableTextLayout::Split(UINT32 stringOffset, 
                                    _Outptr_ SplittableTextLayout ** splittableTextLayout1, 
                                    _Outptr_ SplittableTextLayout ** splittableTextLayout2)
{
    if (stringOffset < 1 || stringOffset >= m_stringLength)
        return E_INVALIDARG;

    std::wstring str1 = std::wstring(m_string).substr(0, stringOffset);
    std::wstring str2 = std::wstring(m_string).substr(stringOffset, m_stringLength - stringOffset);

    SplittableTextLayout* textLayout1;
    HRESULT hr = SplittableTextLayout::Create(str1.data(), 
                                              str1.length(), 
                                              m_dwriteTextFormat.Get(), 
                                              m_maxWidth, 
                                              m_maxHeight, 
                                              &textLayout1);
    if (FAILED(hr))
        return hr;

    SplittableTextLayout* textLayout2;
    hr = SplittableTextLayout::Create(str2.data(), 
                                      str2.length(), 
                                      m_dwriteTextFormat.Get(), 
                                      m_maxWidth, 
                                      m_maxHeight, 
                                      &textLayout2);
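    if (FAILED(hr))
    {
        // Don't leak the first layout if the second can't be created
        // (assumes Create returns an object holding one reference)
        textLayout1->Release();
        return hr;
    }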

    *splittableTextLayout1 = textLayout1;
    *splittableTextLayout2 = textLayout2;

    return S_OK;
}

The PaginationExperiment2 project includes this class, but I made some other changes as well: I removed the TextHeight field from the Paragraph structure and I removed the SetWidth method from AliceParagraphGenerator because I realized that heights of split paragraphs would need to be recalculated during the pagination process. The TextLayout field of Paragraph is now a SplittableTextLayout object:

struct Paragraph
{
    std::wstring Text;
    Microsoft::WRL::ComPtr<SplittableTextLayout> TextLayout;
    float SpaceAfter;
    D2D1_POINT_2F Origin;
};

AliceParagraphGenerator now creates that object instead of creating a normal IDWriteTextLayout object.

The only other real change in this program is the PaginateChapter method in the PaginationExperiment2Renderer class. This method now works in conjunction with another method named AddParagraphToPage that is responsible for determining when and how a paragraph must be split between pages. The Boolean return value of AddParagraphToPage indicates that the page is complete. Here are both methods in their entirety:

void PaginationExperiment2Renderer::PaginateChapter()
{
    if (m_aliceParagraphGenerator.ParagraphCount() == 0)
        return;

    Windows::Foundation::Size windowSize = m_deviceResources->GetLogicalSize();

    if (windowSize.IsEmpty)
        return;

    // Set some constants and calculate page and text sizes
    float pageHeight = windowSize.Height;
    float topMargin = 25;
    float bottomMargin = 25;
    float textHeight = pageHeight - topMargin - bottomMargin;

    float pageWidth = 400;
    float leftMargin = 25;
    float rightMargin = 25;
    float textWidth = pageWidth - leftMargin - rightMargin;

    // Prepare assembling pages
    std::vector<Paragraph> paragraphs = m_aliceParagraphGenerator.GetParagraphs();
    m_pages.clear();

    // Prepare for the first page
    float x = leftMargin;
    float y = topMargin;
    float maxY = pageHeight - bottomMargin;
    Page page;
    page.Size = SizeF(pageWidth, pageHeight);   // uniform page size

    // Loop through all the paragraphs
    for (size_t i = 0; i < paragraphs.size(); i++)
    {
        Paragraph paragraph = paragraphs.at(i);

        // Set paragraph height and width
        DX::ThrowIfFailed(
            paragraph.TextLayout->SetMaxWidth(textWidth)
            );

        DX::ThrowIfFailed(
            paragraph.TextLayout->SetMaxHeight(FloatMax())
            );

        // Add paragraph to page (or possibly pages)
        bool isSplit = false;

        do
        {
            while (AddParagraphToPage(page, paragraph, x, y, maxY, isSplit))
            {
                m_pages.push_back(page);
                x = leftMargin;
                y = topMargin;
                page.Paragraphs.clear();
            }
        } while (isSplit);
    }

    m_pages.push_back(page);

    // Calculate the maximum scroll
    m_maximumScroll = m_pages.size() * pageWidth - windowSize.Width;
    ScrollHorizontalDelta(0);
}

bool PaginationExperiment2Renderer::AddParagraphToPage(Page& page, 
                                                       Paragraph& paragraph, 
                                                       float x, 
                                                       float& y, 
                                                       float maxY, 
                                                       bool& isSplit)
{
    // We already know where the paragraph is going
    paragraph.Origin = Point2F(x, y);

    // Obtain the height of the paragraph
    DWRITE_TEXT_METRICS textMetrics;

    DX::ThrowIfFailed(
        paragraph.TextLayout->GetMetrics(&textMetrics)
        );

    float paragraphHeight = textMetrics.height;

    // Check if it overruns the maximum height
    if (y + paragraphHeight > maxY)
    {
        // Paragraph is too long, check for a split
        UINT32 lineCount = textMetrics.lineCount;

        std::vector<DWRITE_LINE_METRICS> lineMetrics(lineCount);

        DX::ThrowIfFailed(
            paragraph.TextLayout->GetLineMetrics(lineMetrics.data(), 
                                                 lineMetrics.size(), 
                                                 &lineCount)
            );

        // Accumulate the number of characters in the lines that fit
        int characterCount = 0;

        for (DWRITE_LINE_METRICS lineMetric : lineMetrics)
        {
            if (y + lineMetric.height > maxY)
            {
                break;
            }
            characterCount += lineMetric.length;
            y += lineMetric.height;
        }

        // At this point, characterCount might be zero, but is less than the paragraph
        if (characterCount != 0)
        {
            ComPtr<SplittableTextLayout> textLayout1;
            ComPtr<SplittableTextLayout> textLayout2;

            DX::ThrowIfFailed(
                paragraph.TextLayout->Split(characterCount, &textLayout1, &textLayout2)
                );

            // This is the displayable first half of the paragraph
            paragraph.TextLayout = textLayout1;
            page.Paragraphs.push_back(paragraph);

            // This is the second half of the paragraph
            paragraph.TextLayout = textLayout2;
        }

        isSplit = characterCount != 0;
        return true;
    }

    y += paragraphHeight + paragraph.SpaceAfter;
    page.Paragraphs.push_back(paragraph);
    isSplit = false;
    return false;
}

The scrolling logic and the Render method are all the same as the previous program. Here's the result:

As you can see, paragraphs are successfully split to straddle two columns. The feasibility study is a success! However, in the process these split paragraphs have lost a lot of their formatting. There's no first-line indent, there are no italicized words, and the paragraph alignment has reverted back to the default rather than being justified. All these problems are a result of the simple implementation of SplittableTextLayout that does not retain formatting information to transfer to the two new objects during the paragraph split.

But before I get to that, you might be scratching your head a bit. I mentioned that the Render method in PaginationExperiment2 is the same as PaginationExperiment1, which means it contains code that looks like this:

void PaginationExperiment2Renderer::Render()
{
    ...

    for (const Page& page : m_pages)
    {
        for (const Paragraph& paragraph : page.Paragraphs)
        {
            context->DrawTextLayout(paragraph.Origin, 
                                    paragraph.TextLayout.Get(), 
                                    m_blackBrush.Get());
        }

        ...
    }

    ...
}

But that TextLayout field of the Paragraph structure is a SplittableTextLayout! This SplittableTextLayout barely contains any real code! How can this possibly work?

When your program calls DrawTextLayout, you pass to it an object that implements IDWriteTextLayout. But DrawTextLayout barely does anything with that object except to call a method defined by IDWriteTextLayout named Draw. And here's how SplittableTextLayout implements Draw:

HRESULT STDMETHODCALLTYPE Draw(void * clientDrawingContext, 
                               IDWriteTextRenderer * renderer, 
                               FLOAT  originX, 
                               FLOAT  originY)
{
    return m_dwriteTextLayout->Draw(clientDrawingContext, 
                                    renderer, 
                                    originX, 
                                    originY);
}

That Draw method in the real IDWriteTextLayout object is what's actually performing all the rendering, and obviously it has access to all the information in the real IDWriteTextLayout object.
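
The same forwarding pattern can be shown without any DirectWrite at all. This is a rough sketch under simplifying assumptions: ILayout, RealLayout, ForwardingLayout and DrawThroughInterface are hypothetical names standing in for IDWriteTextLayout, the real layout object, SplittableTextLayout and DrawTextLayout, and Draw returns a string instead of rendering anything.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, much-simplified stand-in for IDWriteTextLayout.
struct ILayout
{
    virtual ~ILayout() = default;
    virtual std::string Draw(float originX, float originY) const = 0;
};

// Plays the role of the real layout object that knows how to render.
class RealLayout : public ILayout
{
public:
    explicit RealLayout(std::string text) : m_text(std::move(text)) {}

    std::string Draw(float, float) const override
    {
        return "drew: " + m_text;
    }

private:
    std::string m_text;
};

// Plays the role of SplittableTextLayout: it owns no rendering logic
// and simply forwards the call to the layout it wraps.
class ForwardingLayout : public ILayout
{
public:
    explicit ForwardingLayout(const ILayout* inner) : m_inner(inner)
    {
        assert(inner != nullptr);
    }

    std::string Draw(float originX, float originY) const override
    {
        return m_inner->Draw(originX, originY);
    }

private:
    const ILayout* m_inner;
};

// Plays the role of DrawTextLayout: it sees only the interface.
std::string DrawThroughInterface(const ILayout& layout, float x, float y)
{
    return layout.Draw(x, y);
}
```

Passing a ForwardingLayout that wraps a RealLayout to DrawThroughInterface produces exactly what the RealLayout would have produced on its own, which is why the caller never needs to know a wrapper is involved.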

3. Retaining the Formatting in Split Paragraphs

The PaginationExperiment3 project is identical to PaginationExperiment2 except for an enhanced version of SplittableTextLayout that transfers all the original formatting to the two SplittableTextLayout objects that the Split method creates.

I performed this enhancement in increments. I added the first enhancement to the Split method after the two new SplittableTextLayout objects, named textLayout1 and textLayout2, are created. The following code calls methods that are defined by IDWriteTextFormat and that apply to the entire paragraph:

HRESULT SplittableTextLayout::Split(UINT32 stringOffset, 
                                    _Outptr_ SplittableTextLayout ** splittableTextLayout1, 
                                    _Outptr_ SplittableTextLayout ** splittableTextLayout2)
{
    ...

    // Transfer settings from this instance to the two split text layouts
    DWRITE_FLOW_DIRECTION flowDirection = this->GetFlowDirection();
    textLayout1->SetFlowDirection(flowDirection);
    textLayout2->SetFlowDirection(flowDirection);

    float tabStop = this->GetIncrementalTabStop();
    textLayout1->SetIncrementalTabStop(tabStop);
    textLayout2->SetIncrementalTabStop(tabStop);

    DWRITE_LINE_SPACING_METHOD lineSpacingMethod;
    float lineSpacing, baseline;
    if (FAILED(
        hr = this->GetLineSpacing(&lineSpacingMethod, &lineSpacing, &baseline))
        ) return hr;
    if (FAILED(
        hr = textLayout1->SetLineSpacing(lineSpacingMethod, lineSpacing, baseline))
        ) return hr;
    if (FAILED(
        hr = textLayout2->SetLineSpacing(lineSpacingMethod, lineSpacing, baseline))
        ) return hr;

    DWRITE_PARAGRAPH_ALIGNMENT paragraphAlignment = this->GetParagraphAlignment();
    textLayout1->SetParagraphAlignment(paragraphAlignment);
    textLayout2->SetParagraphAlignment(paragraphAlignment);

    DWRITE_READING_DIRECTION readingDirection = this->GetReadingDirection();
    textLayout1->SetReadingDirection(readingDirection);
    textLayout2->SetReadingDirection(readingDirection);

    DWRITE_TEXT_ALIGNMENT textAlignment = this->GetTextAlignment();
    textLayout1->SetTextAlignment(textAlignment);
    textLayout2->SetTextAlignment(textAlignment);

    DWRITE_TRIMMING trimmingOptions;
    ComPtr<IDWriteInlineObject> trimmingSign;
    if (FAILED(
        hr = this->GetTrimming(&trimmingOptions, &trimmingSign))
        ) return hr;
    if (FAILED(
        hr = textLayout1->SetTrimming(&trimmingOptions, trimmingSign.Get()))
        ) return hr;
    if (FAILED(
        hr = textLayout2->SetTrimming(&trimmingOptions, trimmingSign.Get()))
        ) return hr;

    DWRITE_WORD_WRAPPING wordWrapping = this->GetWordWrapping();
    textLayout1->SetWordWrapping(wordWrapping);
    textLayout2->SetWordWrapping(wordWrapping);

    ...

    *splittableTextLayout1 = textLayout1;
    *splittableTextLayout2 = textLayout2;

    return S_OK;
}

The only use I make of these formatting properties in the display of the "Alice" chapter is the text alignment, which is set to DWRITE_TEXT_ALIGNMENT_JUSTIFIED for everything except the titles. The new split paragraphs should now be justified:

Well, yes and no, and the first time I realized what was happening here, I feared that my entire approach to pagination was doomed.

Yes, doomed.

When a paragraph is justified, the renderer inserts tiny slivers of space between the letters to make each line occupy the full width allowed for the paragraph. This happens for all the lines of the paragraph. All the lines, that is, except the last line. That line is allowed to stop short of the right margin. Take a look at the last line in that first column: "Alice indignantly, and she sat". That line is not justified correctly.

DrawTextLayout thinks that the IDWriteTextLayout objects passed to it represent full paragraphs, so it treats the last line of justified paragraphs specially. But that's not what I want when I split the paragraphs for pagination. I want the last line in the first part of the paragraph treated like the earlier lines. Unfortunately I've found no way to inform DrawTextLayout of my desires.

In fact, I couldn't think of any solution except a kludge. Suppose I were to add a bunch of spaces and an invisible character to the first of the two pieces of the split paragraphs. Those spaces would be discarded (because they occur at the end of a line) and the invisible character would be displayed on the next line, which would be the new last line of the paragraph, so the old last line would be justified correctly.

Here's my kludge, which appears right after the text string is split:

std::wstring str1 = std::wstring(m_string).substr(0, stringOffset);
std::wstring str2 = std::wstring(m_string).substr(stringOffset, m_stringLength - stringOffset);

// Add some space plus non-printable character so last line 
//  of first paragraph segment has proper alignment
if (this->GetTextAlignment() == DWRITE_TEXT_ALIGNMENT_JUSTIFIED)
{
    float fontSize = this->GetFontSize();
    float maxWidth = this->GetMaxWidth();

    // Assume space has width of 5% of em height.
    //  (Very low, but the Parchment font is 9%)
    int numSpaces = (int) (maxWidth / (fontSize * 0.05));

    str1 += std::wstring(numSpaces, ' ') + L"\x0001";
}

Hey! I already admitted it was a kludge! I'm trying to solve problems here! And it seems to work:

I'm not sure how this affects hit-testing, however.
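
To get a feel for how much padding the kludge adds, the arithmetic can be pulled out into a standalone sketch. PaddingSpaceCount and PadForJustification are hypothetical helper names, not part of SplittableTextLayout, and the 5% space width is the same guess used in the kludge itself.

```cpp
#include <cassert>
#include <string>

// Number of spaces needed to span maxWidth if each space is assumed to be
// spaceWidthInEms of the font size wide. The 5% default is a deliberately
// low guess, so the padding overshoots rather than undershoots.
int PaddingSpaceCount(float maxWidth, float fontSize, float spaceWidthInEms = 0.05f)
{
    assert(fontSize > 0 && spaceWidthInEms > 0);
    return static_cast<int>(maxWidth / (fontSize * spaceWidthInEms));
}

// Appends the run of spaces plus one non-printable character, so the
// original last line is no longer the last line of the paragraph.
std::wstring PadForJustification(const std::wstring& firstPart,
                                 float maxWidth, float fontSize)
{
    int numSpaces = PaddingSpaceCount(maxWidth, fontSize);
    return firstPart + std::wstring(numSpaces, L' ') + L"\x0001";
}
```

For a column 310 pixels wide and a 12-pixel font, that works out to 516 spaces plus the one invisible character. Anything that counts characters in the first half of a split paragraph has to allow for that extra text.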

Anyway, at this point I was ready to take on the really big job: Preserving all the range-based character formatting. IDWriteTextLayout and IDWriteTextLayout1 contain a total of 14 range-based character-formatting properties, and these are identified in an enumeration I made private to SplittableTextLayout:

enum class PropertyType
{
    // IDWriteTextLayout
    DrawingEffect,
    FontCollection,
    FontFamilyName,
    FontSize,
    FontStretch,
    FontStyle,
    FontWeight,
    InlineObject,
    LocaleName,
    Strikethrough,
    Typography,
    Underline,

    // IDWriteTextLayout1
    CharacterSpacing,
    PairKerning
};

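These properties are associated with several different data types: interface pointers (DrawingEffect, FontCollection, InlineObject and Typography), string pointers (FontFamilyName and LocaleName), float values (one for FontSize and three for CharacterSpacing), enumeration values (FontStretch, FontStyle and FontWeight), and BOOL values (Strikethrough, Underline and PairKerning).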

The next step is a structure (also private to SplittableTextLayout) that can store values or objects of all these types:

struct PropertySetting
{
    PropertyType                        PropertyType;
    Microsoft::WRL::ComPtr<IUnknown>    IUnknownPointer;
    const WCHAR *                       StringPointer;
    float                               FloatValue[3];
    int                                 EnumValue;
    BOOL                                Boolean;
    DWRITE_TEXT_RANGE                   TextRange;
};

The first field in this structure identifies the PropertyType and the last field is a DWRITE_TEXT_RANGE object. A private data member in SplittableTextLayout stores all the range-based formatting associated with the instance:

std::vector<PropertySetting> m_propertySettings;

Storing values in this vector is the responsibility of five very similar methods:

void SplittableTextLayout::SaveInterfaceType(PropertyType propertyType, 
                                             IUnknown * pIUnknown, 
                                             DWRITE_TEXT_RANGE textRange)
{
    PropertySetting propertySetting;
    propertySetting.PropertyType = propertyType;
    propertySetting.IUnknownPointer = pIUnknown;
    propertySetting.TextRange = textRange;

    m_propertySettings.push_back(propertySetting);
}

void SplittableTextLayout::SaveStringType(PropertyType propertyType, 
                                          const WCHAR * pString, 
                                          DWRITE_TEXT_RANGE textRange)
{
    PropertySetting propertySetting;
    propertySetting.PropertyType = propertyType;
    propertySetting.StringPointer = pString;
    propertySetting.TextRange = textRange;

    m_propertySettings.push_back(propertySetting);
}

void SplittableTextLayout::SaveFloatType(PropertyType propertyType, 
                                         float val0, float val1, float val2, 
                                         DWRITE_TEXT_RANGE textRange)
{
    PropertySetting propertySetting;
    propertySetting.PropertyType = propertyType;
    propertySetting.FloatValue[0] = val0;
    propertySetting.FloatValue[1] = val1;
    propertySetting.FloatValue[2] = val2;
    propertySetting.TextRange = textRange;

    m_propertySettings.push_back(propertySetting);
}

void SplittableTextLayout::SaveEnumType(PropertyType propertyType, 
                                        int enumValue, 
                                        DWRITE_TEXT_RANGE textRange)
{
    PropertySetting propertySetting;
    propertySetting.PropertyType = propertyType;
    propertySetting.EnumValue = enumValue;
    propertySetting.TextRange = textRange;

    m_propertySettings.push_back(propertySetting);
}

void SplittableTextLayout::SaveBoolType(PropertyType propertyType, 
                                        BOOL boolValue, 
                                        DWRITE_TEXT_RANGE textRange)
{
    PropertySetting propertySetting;
    propertySetting.PropertyType = propertyType;
    propertySetting.Boolean = boolValue;
    propertySetting.TextRange = textRange;

    m_propertySettings.push_back(propertySetting);
}

Let me show you just one of the 14 Set methods (in SplittableTextLayout.h) so you see how simple this is with all the infrastructure in place.

HRESULT STDMETHODCALLTYPE SetFontStretch(DWRITE_FONT_STRETCH fontStretch, 
                                         DWRITE_TEXT_RANGE textRange) 
{
    // Record the setting so that Split can transfer it later
    SaveEnumType(PropertyType::FontStretch, (int) fontStretch, textRange);
    return m_dwriteTextLayout->SetFontStretch(fontStretch, textRange);
}

Towards the end of the Split method, a loop enumerates through all the PropertySetting objects in m_propertySettings and adjusts the DWRITE_TEXT_RANGE for the two new SplittableTextLayout objects. If that text range is still valid, then TransferPropertySetting is called to set the property:

for (PropertySetting& propertySetting : m_propertySettings)
{
    DWRITE_TEXT_RANGE textRange = propertySetting.TextRange;
    UINT32 endPosition = textRange.startPosition + textRange.length;
    HRESULT hr = S_OK;

    if (textRange.startPosition < stringOffset)
    {
        DWRITE_TEXT_RANGE textRange1;
        textRange1.startPosition = textRange.startPosition;
        textRange1.length = min(endPosition, stringOffset) - 
                                textRange1.startPosition;
        if (FAILED(hr = TransferPropertySetting(textLayout1, 
                                                propertySetting, 
                                                textRange1)))
            return hr;
    }

    if (endPosition > stringOffset)
    {
        DWRITE_TEXT_RANGE textRange2;
        textRange2.startPosition = max(0, (int)textRange.startPosition - 
                                            (int)stringOffset);
        textRange2.length = endPosition - stringOffset - 
                                            textRange2.startPosition;
        if (FAILED(hr = TransferPropertySetting(textLayout2, 
                                                propertySetting, 
                                                textRange2)))
            return hr;
    }
}
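
The range arithmetic in that loop is easy to get wrong by one, so it's worth checking in isolation. Here's a small sketch of the same calculation with no DirectWrite dependencies. TextRange is a hypothetical stand-in for DWRITE_TEXT_RANGE, and SplitRanges and SplitRange are names made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for DWRITE_TEXT_RANGE.
struct TextRange
{
    std::uint32_t startPosition;
    std::uint32_t length;
};

struct SplitRanges
{
    bool hasFirst;      // part of the range falls before the split
    TextRange first;    // that part, in the first layout's coordinates
    bool hasSecond;     // part of the range falls at or after the split
    TextRange second;   // that part, in the second layout's coordinates
};

SplitRanges SplitRange(TextRange range, std::uint32_t splitOffset)
{
    // The end position must be representable (an assumption of this sketch).
    assert(range.length <=
           std::numeric_limits<std::uint32_t>::max() - range.startPosition);

    std::uint32_t end = range.startPosition + range.length;
    SplitRanges result = {};

    if (range.startPosition < splitOffset)
    {
        result.hasFirst = true;
        result.first.startPosition = range.startPosition;
        result.first.length = std::min(end, splitOffset) - range.startPosition;
    }

    if (end > splitOffset)
    {
        // Positions in the second layout are relative to the split point.
        std::uint32_t start = std::max(range.startPosition, splitOffset) - splitOffset;
        result.hasSecond = true;
        result.second.startPosition = start;
        result.second.length = (end - splitOffset) - start;
    }
    return result;
}
```

A range covering characters 10 through 29 that is split at offset 18 becomes 8 characters starting at 10 in the first layout and 12 characters starting at 0 in the second. A range that lies entirely on one side of the split produces only one piece.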

The TransferPropertySetting method uses the PropertyType field to determine what property to set from what fields of the PropertySetting structure:

HRESULT SplittableTextLayout::TransferPropertySetting(SplittableTextLayout* textLayout, 
                                                      PropertySetting propertySetting, 
                                                      DWRITE_TEXT_RANGE textRange)
{
    HRESULT hr = S_OK;

    switch (propertySetting.PropertyType)
    {
    case PropertyType::DrawingEffect:
        hr = textLayout->SetDrawingEffect(propertySetting.IUnknownPointer.Get(), 
                                          textRange);
        break;

    case PropertyType::FontCollection:
        hr = textLayout->SetFontCollection((IDWriteFontCollection *) 
                                                propertySetting.IUnknownPointer.Get(), 
                                           textRange);
        break;

    case PropertyType::FontFamilyName:
        hr = textLayout->SetFontFamilyName(propertySetting.StringPointer, 
                                           textRange);
        break;

    case PropertyType::FontSize:
        hr = textLayout->SetFontSize(propertySetting.FloatValue[0], 
                                     textRange);
        break;

    case PropertyType::FontStretch:
        hr = textLayout->SetFontStretch((DWRITE_FONT_STRETCH) 
                                            propertySetting.EnumValue, 
                                        textRange);
        break;

    case PropertyType::FontStyle:
        hr = textLayout->SetFontStyle((DWRITE_FONT_STYLE) 
                                            propertySetting.EnumValue, 
                                      textRange);
        break;

    case PropertyType::FontWeight:
        hr = textLayout->SetFontWeight((DWRITE_FONT_WEIGHT) 
                                            propertySetting.EnumValue, 
                                       textRange);
        break;

    case PropertyType::InlineObject:
        hr = textLayout->SetInlineObject((IDWriteInlineObject *) 
                                            propertySetting.IUnknownPointer.Get(), 
                                         textRange);
        break;

    case PropertyType::LocaleName:
        hr = textLayout->SetLocaleName(propertySetting.StringPointer, 
                                       textRange);
        break;

    case PropertyType::Strikethrough:
        hr = textLayout->SetStrikethrough(propertySetting.Boolean, 
                                          textRange);
        break;

    case PropertyType::Typography:
        hr = textLayout->SetTypography((IDWriteTypography *) 
                                            propertySetting.IUnknownPointer.Get(), 
                                       textRange);
        break;

    case PropertyType::Underline:
        hr = textLayout->SetUnderline(propertySetting.Boolean, 
                                      textRange);
        break;

    case PropertyType::CharacterSpacing:
        hr = textLayout->SetCharacterSpacing(propertySetting.FloatValue[0], 
                                             propertySetting.FloatValue[1], 
                                             propertySetting.FloatValue[2], 
                                             textRange);
        break;

    case PropertyType::PairKerning:
        hr = textLayout->SetPairKerning(propertySetting.Boolean, 
                                        textRange);
        break;
    }
    return hr;
}

And you can see that the first-line indent and italic formatting are now restored:

Obviously I haven't tested out this code thoroughly for all the different types of formatting, but aside from the kludge, this seems like a reasonable approach.

4. Eliminating Widows and Orphans

Metaphors sometimes have unfortunate linguistic consequences. I remember a story about a Unix programmer who one day discovered he was writing code to kill a child stuck in a pipe. Despite the heading here, rest assured that I have no plans for acts of hostility against widows and orphans. Some of my best friends and family members are widows and orphans.

In typography, a widow is the last line of a paragraph that appears by itself at the top of a page, and an orphan is the first line of a paragraph that appears by itself at the bottom of a page. These are generally considered to be undesirable. To eliminate an orphan, the line can be moved to the top of the next page, so the paragraph doesn't straddle the page break at all. To eliminate a widow, the penultimate line of the paragraph can also be moved to the top of the next page, except for a three-line paragraph, in which case that would create an orphan, and the first line of the paragraph would also be moved to the top of the next page. If you don't want widows and orphans, two-line paragraphs and three-line paragraphs never straddle a page break.
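
Reduced to line counts, those rules fit in one small function. This is a sketch of the decision only, with LinesToKeepOnPage as a hypothetical name. It assumes the caller already knows how many lines physically fit on the page and that the paragraph doesn't fit in its entirety.

```cpp
#include <algorithm>
#include <cassert>

// Returns how many lines of a paragraph stay on the current page,
// or 0 if the whole paragraph should move to the next page.
int LinesToKeepOnPage(int totalLines, int linesThatFit)
{
    assert(totalLines > 0 && linesThatFit >= 0 && linesThatFit < totalLines);

    // A paragraph of three lines or fewer can't be split without
    // leaving a single line on one side of the break.
    if (totalLines <= 3)
        return 0;

    // Leave at least two lines for the next page (no widow).
    int keep = std::min(linesThatFit, totalLines - 2);

    // Keeping just one line would strand it at the bottom (an orphan).
    return keep <= 1 ? 0 : keep;
}
```

For a ten-line paragraph, nine fitting lines are cut back to eight, five fitting lines stay at five, and a single fitting line sends the whole paragraph to the next page.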

Eliminating widows and orphans involves a little code added to the AddParagraphToPage method. The comments should be adequate to figure out how it works:

bool PaginationExperiment4Renderer::AddParagraphToPage(Page& page, 
                                                       Paragraph& paragraph, 
                                                       float x, 
                                                       float& y, 
                                                       float maxY, 
                                                       bool& isSplit)
{
    // We already know where the paragraph is going
    paragraph.Origin = Point2F(x, y);

    // Obtain the height of the paragraph
    DWRITE_TEXT_METRICS textMetrics;

    DX::ThrowIfFailed(
        paragraph.TextLayout->GetMetrics(&textMetrics)
        );

    float paragraphHeight = textMetrics.height;

    // Check if it overruns the maximum height
    if (y + paragraphHeight > maxY)
    {
        // Paragraph is too long, check for a split
        UINT32 lineCount = textMetrics.lineCount;

        // We already know the paragraph is too long,
        //  so if it consists of only one line, it won't fit.
        // If it has two lines, only one line will fit,
        //  and that creates both a widow and orphan.
        // If it has three lines, either an orphan or 
        //  widow will result.
        // Hence...
        if (lineCount <= 3)
        {
            isSplit = false;
            return true;            // end of page
        }

        // Get the line metrics
        std::vector<DWRITE_LINE_METRICS> lineMetrics(lineCount);

        DX::ThrowIfFailed(
            paragraph.TextLayout->GetLineMetrics(lineMetrics.data(), 
                                                 lineMetrics.size(), 
                                                 &lineCount)
            );

        // Accumulate the number of lines and character count that fit
        int fittableLineCount = 0;
        int characterCount = 0;

        for (DWRITE_LINE_METRICS lineMetric : lineMetrics)
        {
            // Check for next line that exceeds page height
            if (y + lineMetric.height > maxY)
            {
                break;
            }

            fittableLineCount++;
            characterCount += lineMetric.length;
            y += lineMetric.height;

            // Check for maximum lines before a widow is made
            if (fittableLineCount + 2 == lineCount)
            {
                break;
            }
        }
        
        // Check for 0 or 1 line (an orphan) that fits on the page
        if (fittableLineCount <= 1)
        {
            isSplit = false;
            return true;            // end of page
        }
        
        // Ready for a valid split!
        ComPtr<SplittableTextLayout> textLayout1;
        ComPtr<SplittableTextLayout> textLayout2;

        DX::ThrowIfFailed(
            paragraph.TextLayout->Split(characterCount, &textLayout1, &textLayout2)
            );

        // This is the displayable first half of the paragraph
        paragraph.TextLayout = textLayout1;
        page.Paragraphs.push_back(paragraph);

        // This is the second half of the paragraph
        paragraph.TextLayout = textLayout2;

        isSplit = true;
        return true;            // end of page
    }

    y += paragraphHeight + paragraph.SpaceAfter;
    page.Paragraphs.push_back(paragraph);
    isSplit = false;
    return false;               // not end of page
}

As you can see, that eliminates the widows and orphans:

Of course, you'll have to scroll through the rest of the document in the real program to assure yourself that they're all gone. If you've been doing that all along, you might have noticed some verse that gets wrapped because the columns are too narrow. The columns are narrow because I wanted to generate a lot of columns to expose problems and then try to solve them.

5. Getting a Uniform Page Height

With the elimination of widows and orphans, another problem has become even more obvious: The text on each page ends at a different vertical distance from the bottom.

That's unsightly, and the way to get rid of it is to spread around a little extra vertical spacing on the page. On these pages, there are two types of vertical spacing: The paragraphs all have a built-in line spacing based on the font and font size that dictates how the successive lines are spaced. I've also added some inter-paragraph spacing which is indicated by the SpaceAfter data member of the Paragraph structure.

Where should the extra space go? Here was my original reasoning: If it goes only in the SpaceAfter values, that won't work for a page that consists solely of a part of one paragraph. If it goes only in the paragraph line spacing, that won't work for a page that consists entirely of one-line paragraphs (perhaps some Mamet-style dialogue). So at first I thought: Both.

The IDWriteTextFormat interface (from which IDWriteTextLayout derives) defines a SetLineSpacing method, but this may not work the way you think (even though the concepts have been carried over into XAML-based text facilities).

The first argument to the SetLineSpacing method is an enumeration value, which can be either DWRITE_LINE_SPACING_METHOD_DEFAULT or DWRITE_LINE_SPACING_METHOD_UNIFORM.

With the DWRITE_LINE_SPACING_METHOD_DEFAULT option (which is the default), you can't specify any additional information. The lines of the paragraph are spaced based on font metric information. If the paragraph contains fonts of different sizes, the line spacing will be irregular to accommodate those larger fonts. If the lines contain inline objects (set with the SetInlineObject method), the height of the individual lines will accommodate those inline objects.

Let me emphasize again: This option does not allow you to alter the paragraph line spacing. The line spacing is governed entirely by the content of the lines.

If you call SetLineSpacing with the DWRITE_LINE_SPACING_METHOD_UNIFORM option, you specify two other values:

textFormat->SetLineSpacing(DWRITE_LINE_SPACING_METHOD_UNIFORM,
                           lineSpacing, baseline);

The lineSpacing value controls the spacing of the successive lines of the paragraph from baseline to baseline. The baseline value requires a bit of explanation: When you use DrawText or DrawTextLayout to display text, you're specifying the starting location based on the upper-left corner of the paragraph. Internally, however, DirectWrite positions text using the baseline. (This becomes evident when you begin working with glyph runs.) If a line contains fonts of different sizes, you want the baseline to be the same for all the text regardless of font size.

The baseline argument to SetLineSpacing indicates the location of the baseline of the first line of text relative to the position you specify in DrawText or DrawTextLayout. For example, suppose you set lineSpacing to 30 and baseline to 24, and call DrawTextLayout with an origin of (5, 10). The first line of text has its baseline positioned at a Y coordinate of 34 (10 plus 24). The successive lines of text are positioned with their baselines at Y coordinates of 64, 94, 124, and so forth.
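
That arithmetic is simple enough to write down directly. The following sketch only models the positioning described above and calls nothing in DirectWrite. BaselinePositions is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Y coordinates of the baselines for uniformly spaced lines:
// the first sits 'baseline' below the origin, and each later one
// sits 'lineSpacing' below its predecessor.
std::vector<float> BaselinePositions(float originY, float lineSpacing,
                                     float baseline, int lineCount)
{
    assert(lineCount >= 0);

    std::vector<float> positions;

    for (int line = 0; line < lineCount; line++)
    {
        positions.push_back(originY + baseline + line * lineSpacing);
    }
    return positions;
}
```

Calling BaselinePositions(10, 30, 24, 4) gives 34, 64, 94 and 124, matching the numbers in the example.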

For a particular font, you can determine what lineSpacing and baseline should be by examining the DWRITE_FONT_METRICS structure available from the GetMetrics method of IDWriteFont or IDWriteFontFace. For normal behavior, you'd set lineSpacing to the sum of ascent, descent and lineGap (adjusted for the em size, of course), and baseline to the ascent value.
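
The adjustment for the em size is a matter of scaling design units by the font size. Here's a minimal sketch of that conversion. FontMetricsSketch is a hypothetical structure that mirrors only four fields of DWRITE_FONT_METRICS (designUnitsPerEm, ascent, descent and lineGap), and the numbers in the usage note are made up for illustration rather than taken from any particular font.

```cpp
#include <cassert>

// Hypothetical subset of DWRITE_FONT_METRICS, all values in design units.
struct FontMetricsSketch
{
    unsigned designUnitsPerEm;
    unsigned ascent;
    unsigned descent;
    int lineGap;
};

struct UniformSpacing
{
    float lineSpacing;  // baseline-to-baseline distance
    float baseline;     // distance from top of line to baseline
};

// Scales the design-unit metrics to the requested font size.
UniformSpacing SpacingFromMetrics(const FontMetricsSketch& metrics, float fontSize)
{
    assert(metrics.designUnitsPerEm > 0);

    float scale = fontSize / metrics.designUnitsPerEm;
    int totalUnits = static_cast<int>(metrics.ascent) +
                     static_cast<int>(metrics.descent) +
                     metrics.lineGap;

    UniformSpacing spacing;
    spacing.lineSpacing = totalUnits * scale;
    spacing.baseline = metrics.ascent * scale;
    return spacing;
}
```

With 2048 design units per em, an ascent of 1854, a descent of 434, a line gap of 67 and a font size of 16, the line spacing comes out to 18.3984375 and the baseline to 14.484375. The two results would then be the second and third arguments to SetLineSpacing.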

The documentation indicates that a baseline value of 80% of lineSpacing is reasonable so that's what I've used.

The DWRITE_LINE_SPACING_METHOD_UNIFORM option makes most sense when the paragraph contains only a single font of a single font size. When it doesn't, you the programmer are taking responsibility for getting the line spacing right to accommodate the paragraph contents.

Of course, if you're dealing with a document that you know has a uniform font and font size, you can use DWRITE_LINE_SPACING_METHOD_UNIFORM with impunity, and that's what I'll be doing.

When you set the DWRITE_LINE_SPACING_METHOD_UNIFORM option on an IDWriteTextLayout object and then call GetMetrics, the DWRITE_TEXT_METRICS height field will be equal to the product of the lineCount field of the structure and the lineSpacing value you specified in SetLineSpacing. What this means is that this lineSpacing value also controls the height of single-line paragraphs.

Which means that it's possible to make adjustments so that pages end at the same place solely using SetLineSpacing without altering the space between paragraphs, and that's the approach I'll be taking.
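
As a rough illustration of the arithmetic involved, the leftover space at the bottom of a page can be divided evenly among all the lines that are allowed to grow. This sketch is independent of the program's own types. ParagraphSketch, ExtraSpacingPerLine and AdjustedTextBottom are hypothetical names, and the sketch assumes that the text bottom excludes the space after the final paragraph.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ParagraphSketch
{
    int lineCount;
    float height;       // lineCount times the current line spacing
    float spaceAfter;
    bool canAdjust;     // false for titles, for example
};

// Extra spacing to add to every adjustable line so that the text
// bottom moves from textBottom down to desiredBottom.
float ExtraSpacingPerLine(const std::vector<ParagraphSketch>& paragraphs,
                          float textBottom, float desiredBottom)
{
    assert(desiredBottom >= textBottom);

    int adjustableLines = 0;

    for (const ParagraphSketch& paragraph : paragraphs)
    {
        if (paragraph.canAdjust)
            adjustableLines += paragraph.lineCount;
    }

    if (adjustableLines == 0)
        return 0.0f;    // nothing on the page can absorb the space

    return (desiredBottom - textBottom) / adjustableLines;
}

// Bottom of the text once each adjustable line has grown by 'extra'.
float AdjustedTextBottom(const std::vector<ParagraphSketch>& paragraphs,
                         float top, float extra)
{
    float y = top;

    for (std::size_t i = 0; i < paragraphs.size(); i++)
    {
        const ParagraphSketch& paragraph = paragraphs[i];
        y += paragraph.height;

        if (paragraph.canAdjust)
            y += extra * paragraph.lineCount;

        if (i + 1 < paragraphs.size())
            y += paragraph.spaceAfter;
    }
    return y;
}
```

Take a page that starts at 50 and holds a one-line title of height 30 followed by body paragraphs of 10 and 6 lines at 16 pixels per line, with spacing of 10 and 8 between them. The text ends at 354. Stretching it to 370 spreads 16 pixels over 16 adjustable lines, so each line grows by exactly one pixel and the title is left alone.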

Previous versions of the PaginationExperiment program mostly used the same SplittableTextLayout objects as the source for the pagination process and also for rendering. The exception, of course, is when a source SplittableTextLayout must be split between two pages. The Split method creates two new SplittableTextLayout objects for rendering.

PaginationExperiment5 will need to call SetLineSpacing on the SplittableTextLayout objects prior to rendering, but the SplittableTextLayout objects used for pagination should not have SetLineSpacing applied. This means that every source SplittableTextLayout needs to be cloned for rendering.

It's easy enough to add a Clone method to SplittableTextLayout. The logic is basically in there already, and the Split method can be simplified to make use of this new Clone method. I'm sure I don't need to show you the revised code.

I've also enhanced the Paragraph structure:

struct Paragraph
{
    std::wstring             Text;
    Microsoft::WRL::ComPtr<SplittableTextLayout> TextLayout;
    float                    SpaceAfter;
    D2D1_POINT_2F            Origin;

    // New for PaginationExperiment5
    bool                     CanAdjustLineSpacing;
    unsigned int             LineCount;
    float                    Height;
};

I decided I only wanted to adjust line spacing in paragraphs that weren't titles, so CanAdjustLineSpacing is set appropriately in the AliceParagraphGenerator class. The LineCount and Height fields are set during pagination.

The Page structure is also enhanced just a bit:

struct Page
{
    D2D1_SIZE_F            Size;
    std::vector<Paragraph> Paragraphs;

    // New for PaginationExperiment5
    float                  TextBottom;
};

The TextBottom field is set during pagination. It's the maximum Y value that encompasses all the text on the page.

In earlier versions of the program, the Origin field of Paragraph was determined during the pagination process and used to position the paragraph on the page. I won't be touching the rendering logic in this version of the program, so these Origin values must be recalculated based on the new line spacing. The logic that calculates the new line spacing must therefore be working with Paragraph objects whose LineCount, Height and SpaceAfter fields are valid. Here's the method that does it:

void PaginationExperiment5Renderer::MakePageUniformHeight(Page& page, 
                                                          float desiredHeight)
{
    int totalLineCount = 0;

    for (Paragraph& paragraph : page.Paragraphs)
    {
        if (paragraph.CanAdjustLineSpacing)
        {
            totalLineCount += paragraph.LineCount;
        }
    }

    float extraLineSpacing = (desiredHeight - page.TextBottom) / totalLineCount;
    D2D1_POINT_2F origin = page.Paragraphs[0].Origin;

    for (Paragraph& paragraph : page.Paragraphs)
    {
        // Set new origin
        paragraph.Origin = origin;

        if (paragraph.CanAdjustLineSpacing)
        {
            // Calculate and set new line spacing
            float lineSpacing = paragraph.Height / paragraph.LineCount;
            lineSpacing += extraLineSpacing;

            DX::ThrowIfFailed(
                paragraph.TextLayout->SetLineSpacing
                                (DWRITE_LINE_SPACING_METHOD_UNIFORM,
                                 lineSpacing, 0.8f * lineSpacing)
                );

            // Set new paragraph height
            paragraph.Height = lineSpacing * paragraph.LineCount;
        }

        // Calculate origin of next paragraph
        origin.y += paragraph.Height + paragraph.SpaceAfter;
    }
}

This is called as each page is completed, except for the last page. (You don't want to adjust line spacing on the last page! That would be very funny looking.) Here's the call to MakePageUniformHeight towards the end of the PaginateChapter method:

void PaginationExperiment5Renderer::PaginateChapter()
{
    ...

        // Add paragraph to page (or possibly pages)
        bool isSplit = false;

        do
        {
            while (AddParagraphToPage(page, paragraph, x, y, maxY, isSplit))
            {
                // End of page so adjust line spacing & add page to collection
                MakePageUniformHeight(page, maxY);
                m_pages.push_back(page);

                // Begin a new page
                x = leftMargin;
                y = topMargin;
                page.Paragraphs.clear();
            }
        } while (isSplit);
    }

    // Don't adjust line spacing on this last page!
    m_pages.push_back(page);

    ...
}

As we all know, kludges often come back to bite us, and this is certainly true of the kludge I performed on the text string to make sure the last line of the first part of a split paragraph is correctly justified. The problem with this kludge is that GetMetrics will no longer work correctly on this SplittableTextLayout. It will report an extra line count and an extra height for that invisible line. However, the MakePageUniformHeight method requires correct values of LineCount and Height in the Paragraph structure.

For that reason the new AddParagraphToPage method needs to accumulate a fittableHeight value as well as a fittableLineCount value to properly set the Paragraph fields of the first of the two SplittableTextLayout objects created from Split:

bool PaginationExperiment5Renderer::AddParagraphToPage(Page& page, 
                                                       Paragraph& paragraph, 
                                                       float x, 
                                                       float& y, 
                                                       float maxY, 
                                                       bool& isSplit)
{
    // We already know where the paragraph is going
    paragraph.Origin = Point2F(x, y);

    // Obtain the height of the paragraph
    DWRITE_TEXT_METRICS textMetrics;

    DX::ThrowIfFailed(
        paragraph.TextLayout->GetMetrics(&textMetrics)
        );

    // Save information in Paragraph object
    paragraph.LineCount = textMetrics.lineCount;
    paragraph.Height = textMetrics.height;

    // Check if it overruns the maximum height
    if (y + paragraph.Height > maxY)
    {
        // Paragraph is too long, check for a split
        UINT32 lineCount = textMetrics.lineCount;

        // We already know the paragraph is too long,
        //  so if it consists of only one line, it won't fit.
        // If it has two lines, only one line will fit,
        //  and that creates both a widow and orphan.
        // If it has three lines, either an orphan or 
        //  widow will result.
        // Hence...
        if (paragraph.LineCount <= 3)
        {
            isSplit = false;
            return true;            // end of page
        }

        // Get the line metrics
        std::vector<DWRITE_LINE_METRICS> lineMetrics(lineCount);

        DX::ThrowIfFailed(
            paragraph.TextLayout->GetLineMetrics(lineMetrics.data(), 
                                                 lineMetrics.size(), 
                                                 &lineCount)
            );

        // Accumulate the number of lines and character count that fit
        int fittableLineCount = 0;
        float fittableHeight = 0;
        int characterCount = 0;

        for (DWRITE_LINE_METRICS lineMetric : lineMetrics)
        {
            // Check for next line that exceeds page height
            if (y + lineMetric.height > maxY)
            {
                break;
            }

            fittableLineCount++;
            fittableHeight += lineMetric.height;
            characterCount += lineMetric.length;
            y += lineMetric.height;

            // Check for maximum lines before a widow is made
            if (fittableLineCount + 2 == lineCount)
            {
                break;
            }
        }
        
        // Check for 0 or 1 line (an orphan) that fits on the page
        if (fittableLineCount <= 1)
        {
            isSplit = false;
            return true;            // end of page
        }
        
        // Ready for a valid split!
        ComPtr<SplittableTextLayout> textLayout1;
        ComPtr<SplittableTextLayout> textLayout2;

        DX::ThrowIfFailed(
            paragraph.TextLayout->Split(characterCount, &textLayout1, &textLayout2)
            );

        // This is the displayable first half of the paragraph.
        // Create a new Paragraph with this new layout.
        Paragraph newParagraph = paragraph;
        newParagraph.TextLayout = textLayout1;
        newParagraph.Height = fittableHeight;
        newParagraph.SpaceAfter = 0;            // because at end of page
        newParagraph.LineCount = fittableLineCount;
        page.Paragraphs.push_back(newParagraph);
        page.TextBottom = y;

        // This is the second half of the paragraph,
        //  to be handled in next call to method.
        paragraph.TextLayout = textLayout2;

        isSplit = true;
        return true;            // end of page
    }

    y += paragraph.Height + paragraph.SpaceAfter;

    // Create a new Paragraph with a cloned SplittableTextLayout
    Paragraph newParagraph = paragraph;

    DX::ThrowIfFailed(
        paragraph.TextLayout->Clone(&newParagraph.TextLayout)
        );

    // Add it to the page and update page height
    page.Paragraphs.push_back(newParagraph);
    page.TextBottom = y - paragraph.SpaceAfter;
    isSplit = false;
    return false;               // not end of page
}
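The widow and orphan rules buried in that method can be easier to follow in isolation. Here is a minimal sketch in standard C++ that reproduces only the split-point arithmetic. The SplitPoint structure and the FindSplitPoint function are hypothetical names invented for this illustration, and a plain vector of line heights stands in for the height fields of the DWRITE_LINE_METRICS array:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: how much of a paragraph fits on the current page
struct SplitPoint
{
    bool canSplit;          // false means the whole paragraph moves to the next page
    std::size_t lineCount;  // lines kept on the current page
    float height;           // total height of those lines
};

// lineHeights: height of each formatted line (stand-in for DWRITE_LINE_METRICS::height)
// y: current vertical position on the page; maxY: bottom limit for text
SplitPoint FindSplitPoint(const std::vector<float>& lineHeights, float y, float maxY)
{
    SplitPoint result = { false, 0, 0.0f };
    const std::size_t total = lineHeights.size();

    // Three lines or fewer can't be split without a widow or an orphan
    if (total <= 3)
        return result;

    for (float lineHeight : lineHeights)
    {
        // Stop at the first line that would cross the bottom limit
        if (y + lineHeight > maxY)
            break;

        result.lineCount++;
        result.height += lineHeight;
        y += lineHeight;

        // Always leave at least two lines for the next page (no widow)
        if (result.lineCount + 2 == total)
            break;
    }

    // Zero or one line on this page would be an orphan
    if (result.lineCount <= 1)
        return SplitPoint{ false, 0, 0.0f };

    result.canSplit = true;
    return result;
}
```

With six lines of height 10 and 35 units of space left on the page, this sketch keeps three lines with a total height of 30. With only 15 units left, a single line would fit, so it reports that no split is possible.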

And here's the result:

6. From Chapter to Book

Now let's go big. Let's see how this pagination process works for an entire book. And since I'm currently reading Anthony Trollope's Orley Farm, let's get the plain text version of that book from Project Gutenberg.

The final version of this program is named PaginationExperiment6. It contains as a program resource the 1.7 megabyte pg23000.txt file I downloaded from Project Gutenberg. That file is loaded and processed in the Loaded event of MainPage:

Loaded += ref new RoutedEventHandler([this](Object^, Object^)
{
    create_task(PathIO::ReadLinesAsync("ms-appx:///Text/pg23000.txt",
        UnicodeEncoding::Utf8))
        .then([this](IVector<String^>^ lines)
    {
        std::vector<std::wstring> paragraphs;
        std::wstring paragraph;

        for (String^ line : lines)
        {
            // End of paragraph, make new paragraph
            if (line->Length() == 0)
            {
                paragraphs.push_back(paragraph);
                paragraph.clear();
            }
            // Continue the paragraph
            else
            {
                const wchar_t * textLine = line->Data();
                wchar_t firstChar = textLine[0];
                wchar_t lastChar = textLine[line->Length() - 1];

                // Let indented lines retain line breaks
                if (paragraph.length() != 0 && firstChar == ' ')
                {
                    paragraph.append(L"\r\n");
                }

                paragraph.append(textLine);

                // Prepare for concatenation with next line
                if (lastChar != ' ')
                {
                    paragraph.append(L" ");
                }
            }
        }

        critical_section::scoped_lock lock(m_main->GetCriticalSection());
        m_main->SetBookText(paragraphs);

    }, task_continuation_context::use_arbitrary());
});

The paragraphs in Project Gutenberg plain-text files consist of lines of about 70 characters with hard line breaks. Those lines must be concatenated to turn them into wrappable paragraphs. That's the main job going on here. The result is a collection of single-line paragraphs passed to the SetBookText method (formerly known as the SetAliceText method).
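If you want to experiment with this concatenation logic outside a Windows Store project, here is a rough equivalent in standard C++. JoinLines is a hypothetical name, and the final flush of a trailing paragraph is my own assumption for input that doesn't end with a blank line. The Loaded handler above doesn't do that:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Joins hard-wrapped lines into paragraphs; an empty line ends a paragraph
std::vector<std::wstring> JoinLines(const std::vector<std::wstring>& lines)
{
    std::vector<std::wstring> paragraphs;
    std::wstring paragraph;

    for (const std::wstring& line : lines)
    {
        if (line.empty())
        {
            // Blank line: the paragraph collected so far is complete
            paragraphs.push_back(paragraph);
            paragraph.clear();
            continue;
        }

        // Let indented lines retain their line breaks
        if (!paragraph.empty() && line.front() == L' ')
            paragraph.append(L"\r\n");

        paragraph.append(line);

        // Prepare for concatenation with the next line
        if (line.back() != L' ')
            paragraph.append(L" ");
    }

    // Assumption (not in the original handler): keep a trailing paragraph
    // when the input lacks a final blank line
    if (!paragraph.empty())
        paragraphs.push_back(paragraph);

    return paragraphs;
}
```

Like the original loop, this sketch leaves a trailing space on every paragraph and emits an empty paragraph for each extra blank line in a row.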

Turning a plain-text Project Gutenberg file into something a reasonable ebook reader might display is an art in itself. What I'm doing here is the bare minimum. For more extensive pre-processing, see the BookInfo class in PhreeBookReader presented in the November 2011 issue of MSDN Magazine.

Notice the use of task_continuation_context::use_arbitrary() on the callback method. This allows the callback method to run in the thread used by PathIO::ReadLinesAsync rather than the user interface thread. Usually you want these callback methods to run in the UI thread, but in this case the callback is doing a lot of work (enumerating through the 34,555 lines of the file to create 6,654 single-line paragraphs), and the architecture of the program already accommodates the use of multiple threads synchronized with the critical_section::scoped_lock object.

However, this loop only requires about 0.2 seconds on a Surface Pro, whereas the SetBookText method requires over 2-1/2 seconds. That encompasses both the construction of the 6,654 IDWriteTextLayout objects (less than half a second) and the pagination process (a bit over 2 seconds).

As I discussed extensively in the columns leading up to PhreeBookReader, it makes much more sense to paginate on a chapter level rather than a book level. Creating and maintaining 6,654 IDWriteTextLayout source objects and even more rendering objects is definitely not the way to make an ebook reader. But I'm deliberately trying to push the envelope here, and paginating an entire Trollope novel in less than 2.5 seconds isn't too shabby.

However, it's not something you want to do in the UI thread. And that's a distinct possibility because the text must also be repaginated when the display changes size or orientation, and in those cases the CreateWindowSizeDependentResources method definitely runs in the UI thread. Let's see if we can fix that.

In previous versions of this program, the SetAliceText method in the renderer called the SetText method of the AliceParagraphGenerator object, and then PaginateChapter. The CreateWindowSizeDependentResources method just called PaginateChapter.

Because the AliceParagraphGenerator class is specifically geared towards Chapter 7 of Alice's Adventures in Wonderland, it is not part of this project. Instead, the SetBookText method in the renderer now includes the logic to create a collection of Paragraph objects for all the paragraphs in the book:

void PaginationExperiment6Renderer::SetBookText(std::vector<std::wstring> lines)
{
    ComPtr<IDWriteFactory> dwriteFactory = m_deviceResources->GetDWriteFactory();

    // Create a basic IDWriteTextFormat for all book text
    ComPtr<IDWriteTextFormat> textFormat;

    DX::ThrowIfFailed(
        dwriteFactory->CreateTextFormat(L"Century Schoolbook",
                                        nullptr,
                                        DWRITE_FONT_WEIGHT_NORMAL,
                                        DWRITE_FONT_STYLE_NORMAL,
                                        DWRITE_FONT_STRETCH_NORMAL,
                                        24.0f,
                                        L"en-US",
                                        &textFormat)
        );

    // Create first-line indent object
    ComPtr<IDWriteInlineObject> halfInchIndent = new FirstLineIndent(48);

    // Clear out any earlier paragraphs
    m_paragraphs.clear();

    // Loop through the single-line paragraphs
    for (unsigned int index = 0; index < lines.size(); index++)
    {
        // Set the text with an initial character for first-line indenting
        Paragraph paragraph;
        std::wstring text = L"\x200B" + lines.at(index);

        // Create the SplittableTextLayout object
        DX::ThrowIfFailed(
            SplittableTextLayout::Create(text.c_str(),
                                         text.length(),
                                         textFormat.Get(),
                                         0, 0,
                                         &paragraph.TextLayout)
            );

        if (lines[index][0] != ' ')
        {
            // Justify paragraphs that don't begin with a space
            DX::ThrowIfFailed(
                paragraph.TextLayout->SetTextAlignment(DWRITE_TEXT_ALIGNMENT_JUSTIFIED)
                );

            // And first-line indent by a half inch
            DWRITE_TEXT_RANGE textRange = { 0, 1 };

            DX::ThrowIfFailed(
                paragraph.TextLayout->SetInlineObject(halfInchIndent.Get(), textRange)
                );
        }
        else
        {
            // Set smaller fixed-pitch font for other paragraphs
            DWRITE_TEXT_RANGE textRange;
            textRange.startPosition = 0;
            textRange.length = text.length();

            DX::ThrowIfFailed(
                paragraph.TextLayout->SetFontFamilyName(L"Courier New", textRange)
                );

            DX::ThrowIfFailed(
                paragraph.TextLayout->SetFontSize(18.0f, textRange)
                );
        }

        paragraph.SpaceAfter = 12.0f;
        paragraph.CanAdjustLineSpacing = true;
        m_paragraphs.push_back(paragraph);
    }

    // Now perform pagination
    PaginateBook();
    m_needsRedraw = true;
}

Keep in mind that this method is not running in the UI thread, and the critical_section object is in effect, so the Update and Render methods are blocked as well.

Now let's ask ourselves what we want to happen when the display changes size or orientation. We first want any text on the screen to disappear. In other words, we want the m_pages collection to be cleared. At that point, some kind of status indication should be displayed, perhaps the word "Paginating..." Then, when the book has been repaginated for the new display size, we want the display updated with the new pages.

This seems to be pretty much what we get by first defining a ClearPages method in PaginationExperiment6Renderer:

void PaginationExperiment6Renderer::ClearPages()
{
    m_pages.clear();
    m_needsRedraw = true;
}

The CreateWindowSizeDependentResources method in PaginationExperiment6Main can be rewritten very slightly to look like this:

void PaginationExperiment6Main::CreateWindowSizeDependentResources() 
{
    m_PaginationExperiment6Renderer->ClearPages();

    concurrency::create_async([this]()
    {
        critical_section::scoped_lock lock(m_criticalSection);
        m_PaginationExperiment6Renderer->CreateWindowSizeDependentResources();
    });
}

Keep in mind that the critical_section lock is in effect from DirectXPage when ClearPages is called. The concurrency::create_async method creates a block of code that runs in a separate thread, but that thread is blocked by the critical_section::scoped_lock object until CreateWindowSizeDependentResources returns back to DirectXPage, which now happens very quickly. When the critical_section::scoped_lock goes out of scope in DirectXPage, the previously blocked Update and Render calls can execute, and then the asynchronous code in CreateWindowSizeDependentResources takes control of the critical_section::scoped_lock, which allows the pagination to occur.

Nice! (If I do say so myself.)
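The ordering guarantee in that hand-off can be tried out in portable C++. The following is only an analogy, not the program's code: std::mutex stands in for the critical_section, std::thread stands in for concurrency::create_async, and ResizeSequence is a hypothetical name. It shows that work started while the caller still holds the lock can't run until the caller lets go:

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Returns the order in which the two steps ran
std::vector<std::string> ResizeSequence()
{
    std::mutex renderMutex;             // stand-in for the critical_section
    std::vector<std::string> log;
    std::thread worker;

    {
        // The caller (DirectXPage in the article) already holds the lock
        std::lock_guard<std::mutex> callerLock(renderMutex);

        // Start the background work; it can't get past its own lock yet
        worker = std::thread([&]()
        {
            std::lock_guard<std::mutex> workerLock(renderMutex);
            log.push_back("paginate");
        });

        // Cheap work done while the lock is still held
        log.push_back("clear pages");
    }   // callerLock is released here, so the worker can proceed

    worker.join();
    return log;
}
```

However the threads happen to be scheduled, the log should always read "clear pages" followed by "paginate", because both writes occur under the same mutex and the worker can only acquire it after the caller's scope ends.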

For the pagination, the bottomMargin value has been increased to 40 to accommodate a page number at the bottom of each page, and the pageWidth is set to the minimum of 650 and the window width. The Render method is rather different and only updates the screen when necessary. The PaginationExperiment6Renderer class contains a private Boolean data member named m_needsRedraw that is set to true whenever the screen needs to be re-rendered, and Render resets it to false and returns true to indicate that the screen has been updated.

The Render method has also been improved to render only the visible pages. (When you see how many pages are involved, you'll agree this is totally necessary.) The Render method also uses two IDWriteTextFormat objects to display the current page number and status text. These are created by the PaginationExperiment6Renderer constructor. Here's the new Render method:

bool PaginationExperiment6Renderer::Render()
{
    if (!m_needsRedraw)
        return false;

    ID2D1DeviceContext* context = m_deviceResources->GetD2DDeviceContext();
    Windows::Foundation::Size logicalSize = m_deviceResources->GetLogicalSize();

    context->SaveDrawingState(m_stateBlock.Get());
    context->BeginDraw();
    context->Clear(ColorF(ColorF::White));

    // Display status 
    if (m_pages.size() == 0)
    {
        std::wstring text = L"Paginating...";
        D2D1_RECT_F layoutRect = RectF(0, 0, logicalSize.Width, 
                                             logicalSize.Height);
        context->DrawText(text.c_str(), 
                          text.length(), 
                          m_textFormatStatus.Get(), 
                          &layoutRect, 
                          m_blackBrush.Get());
    }

    float x = -m_currentScroll;
    int pageNumber = 0;

    // Loop through pages
    for (const Page& page : m_pages)
    {
        pageNumber++;

        if (x + page.Size.width > 0 && x < logicalSize.Width)
        {
            context->SetTransform(Matrix3x2F::Translation(x, 0) *
                        m_deviceResources->GetOrientationTransform2D());

            for (const Paragraph& paragraph : page.Paragraphs)
            {
                context->DrawTextLayout(paragraph.Origin,
                    paragraph.TextLayout.Get(),
                    m_blackBrush.Get());
            }

            // Draw the page number once per page, after its paragraphs
            std::wstring text = L"Page " + std::to_wstring(pageNumber) + 
                                L" of " + std::to_wstring(m_pages.size());

            D2D1_RECT_F layoutRect = RectF(0, 0, page.Size.width, 
                                                 page.Size.height);
            context->DrawText(text.c_str(), 
                              text.length(), 
                              m_textFormatPageNumber.Get(), 
                              &layoutRect, 
                              m_blackBrush.Get());
        }

        x += page.Size.width;
    }

    // Ignore D2DERR_RECREATE_TARGET here. This error indicates that the device
    // is lost. It will be handled during the next call to Present.
    HRESULT hr = context->EndDraw();
    if (hr != D2DERR_RECREATE_TARGET)
    {
        DX::ThrowIfFailed(hr);
    }

    context->RestoreDrawingState(m_stateBlock.Get());

    m_needsRedraw = false;
    return true;
}
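The visibility test in that loop is worth pulling out on its own. Here is a small sketch in standard C++ that applies the same comparison to a list of page widths and returns the indices of the pages that would be drawn. VisiblePages is a hypothetical helper written for this illustration; the scroll parameter plays the role of m_currentScroll:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the indices of the pages that overlap the view, which spans
// from 0 to viewWidth. Pages are laid out side by side, left to right.
std::vector<std::size_t> VisiblePages(const std::vector<float>& pageWidths,
                                      float scroll, float viewWidth)
{
    std::vector<std::size_t> visible;
    float x = -scroll;      // left edge of the first page in view coordinates

    for (std::size_t index = 0; index < pageWidths.size(); index++)
    {
        float width = pageWidths[index];

        // Same test as in Render: the page's right edge is past the left
        // side of the view, and its left edge is before the right side
        if (x + width > 0 && x < viewWidth)
            visible.push_back(index);

        x += width;
    }

    return visible;
}
```

With five pages 650 units wide, a scroll offset of 700 and a view 1366 units wide, only the pages at indices 1, 2 and 3 pass the test. Because the pages run left to right, the loop could also stop as soon as x reaches viewWidth, which might matter for a book with thousands of pages.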

The front matter that Project Gutenberg inserts into their books looks terrible when displayed by this program, but once you get into the actual chapters, it's not too bad:

Moreover, somehow I've managed to improve performance. I mentioned earlier that an entire repagination job of "Orley Farm" required over 2 seconds on the Surface Pro. But now when turning the tablet from side to side to repaginate, it goes much faster. Often the status text appears in the wrong spot and with the wrong orientation, but the whole book seems to repaginate in only 0.6 seconds now.

I guess DirectX really is pretty fast!