Charles Petzold



Coding for Multi-Touch in Silverlight 3

December 1, 2009
New York, N.Y.

Two weeks ago, at the second-day keynote at the Microsoft Professional Developers Conference (PDC), Windows and Windows Live Division President Steven Sinofsky told us that Microsoft would be giving everyone there a new Acer notebook computer. He got a big round of applause. Nothing quite excites a roomful of programmers more than free hardware.

These Acer notebooks came with a feature that comparatively few programmers have so far been able to explore: a multi-touch screen. Unlike the touch screens and tablet PCs of the past, multi-touch can simultaneously detect multiple fingers or styli. The use of multiple fingers allows on-screen manipulations such as scaling and rotation that are difficult to define with just one finger or a mouse.

We've all seen multi-touch in action: Tom Cruise in Minority Report, newscasters and weathermen on TV, and if you've gone to any Microsoft event in the past couple years, you've probably played around with a coffee-table Microsoft Surface device. But multi-touch is also coming to the small screen: The iPhone, iPod Touch, and the Microsoft Zune HD all have multi-touch capabilities, and many more multi-touch devices will be released in the years ahead.

This means we have a new input paradigm, and Microsoft would very much like us programmers to start coding for it. That's my interpretation of the PDC distribution of thousands of free multi-touch notebooks.

There are different levels of multi-touch support in the Windows 7 API and in WPF, and I suspect this support will develop and perhaps mutate a bit in the years ahead. But I was surprised to find multi-touch support in Silverlight 3. That's what I'll be discussing here.

If your Silverlight application is running on a machine with a multi-touch screen, and if the operating system supports multi-touch, and if the Silverlight plug-in is aware of this operating system support, then the application can read multi-touch input. I don't know how many machines fit these qualifications just yet, but the Acer notebooks we got at the PDC certainly do.

If a Silverlight application chooses not to trap multi-touch events, then they are "promoted" to mouse events, except that you lose the "multi" part of the "multi-touch." Only one finger can be promoted to emulate the mouse, and other fingers are then ignored.

I tried running the Silverlight text-deformation program I posted two days ago on the multi-touch machine, and in theory I could move the Thumb controls with my fingers, but the controls presented way too small a target — even for the stylus. Besides, what fun is multi-touch if you can't use multiple fingers?

I really had no choice but to enhance the program for multi-touch.

The current multi-touch support in Silverlight 3 is very light. You basically get "down," "move," and "up" events but no gesture recognition or inertia support (useful for tossing screen objects with a flick of the wrist). Moreover, these touch events do not go through the visual tree like keyboard, mouse, and stylus events. Just one static event provides all the support.

I'm going to refer to "fingers" as input devices, even though other objects (such as styli) can be used instead. Some multi-touch hardware — such as the Microsoft Surface devices — can detect many simultaneous fingers. On the low end, only two fingers can be sensed, and that's the case with the Acer notebooks we got at the PDC. Of course, when programming, "two" are many more than "one," and "many" are not much more than "two," so if done correctly, any code written for two fingers should run fine on many-finger hardware.

To use multi-touch, a Silverlight program first installs a handler for the static Touch.FrameReported event:

Touch.FrameReported += OnTouchFrameReported;

If the machine does not support multi-touch, no exception will be raised. The event handler looks like this:

void OnTouchFrameReported(object sender, TouchFrameEventArgs args)
{
    ...
}

TouchFrameEventArgs derives from EventArgs and defines a get-only property named Timestamp (which I did not use) and three methods: GetPrimaryTouchPoint, GetTouchPoints (notice the plural), and SuspendMousePromotionUntilTouchUp. You'll probably be using all three.

The GetPrimaryTouchPoint method returns an object of type TouchPoint, and GetTouchPoints returns an object of type TouchPointCollection. Each of these methods has an argument of type UIElement, but as far as I've been able to determine, this argument does not indicate any type of region to restrict touch events. Regardless of this argument, you'll always get touch events that occur anywhere in the application. This argument merely indicates a reference point for position information. You can set this argument to null to get position information relative to the upper-left corner of the application.
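For example, if you'd rather have coordinates relative to a particular element, you can pass that element in. (The myCanvas element here is just a stand-in for something in your own visual tree.)

// Positions in the returned TouchPoint objects are relative to myCanvas
TouchPoint primaryTouchPoint = args.GetPrimaryTouchPoint(myCanvas);
TouchPointCollection touchPoints = args.GetTouchPoints(myCanvas);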

Before I discuss the difference between GetPrimaryTouchPoint and GetTouchPoints, let's look at the TouchPoint object itself.

The TouchPoint is the basic object that tells you what a finger is doing. All properties are get-only. The Action property is TouchAction.Down, TouchAction.Move, or TouchAction.Up. As is customary, many "move" events can be sandwiched between a "down" and an "up." The Position property is the position of the finger relative to the UIElement passed to the method that obtains the TouchPoint.

TouchPoint includes a Size property for those devices that can figure out how large an area of the screen is being touched. Documentation indicates that this property returns (–1, –1) for devices that don't support size information, but I get values of (0, 0) on the Acer.

The final property of TouchPoint is called TouchDevice, of type TouchDevice, and this object has two additional properties: DirectlyOver of type UIElement, indicating the topmost element underneath the finger, and an integer Id. When working with input from multiple fingers, this Id is extremely important because it's how you keep the fingers separate. Each series of "down," "move," and "up" events associated with a particular finger will have a unique Id.
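Just as an illustration (this snippet isn't from the program), here's one way you might examine a single TouchPoint, using Debug.WriteLine from System.Diagnostics to show what's going on:

void ExamineTouchPoint(TouchPoint touchPoint)
{
    // The Id stays the same for a finger's entire down/move/up sequence
    int id = touchPoint.TouchDevice.Id;

    // The topmost element currently underneath the finger
    UIElement element = touchPoint.TouchDevice.DirectlyOver;

    switch (touchPoint.Action)
    {
        case TouchAction.Down:
            Debug.WriteLine("Finger " + id + " down at " + touchPoint.Position +
                            " over " + element);
            break;

        case TouchAction.Move:
            Debug.WriteLine("Finger " + id + " moved to " + touchPoint.Position);
            break;

        case TouchAction.Up:
            Debug.WriteLine("Finger " + id + " up");
            break;
    }
}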

If a finger touches a screen when no other fingers are touching the screen, that finger is known as a "primary" touch point, which is the finger that potentially gets promoted to mouse input. This can be a little confusing — at least it was for me — so let's take an example: Put one finger on the screen. That's the primary touch point, and it is promoted to a MouseLeftButtonDown event. Any movement of the first finger becomes a MouseMove event. Now put a second finger on the screen. Now lift the first finger. That lift becomes a MouseLeftButtonUp event. Now put the first finger back down on the screen. That finger is not a primary touch point and is not promoted to a mouse event because the second finger is still sitting on the screen. Only when all fingers are removed and a finger goes back down on the screen will another primary touch point (and another mouse event) occur.

Now we're ready to discuss GetPrimaryTouchPoint and GetTouchPoints. In the general case, you'll need to use both methods. GetPrimaryTouchPoint returns a TouchPoint containing "down," "move," or "up" information, but only for the finger touching the screen when no other fingers are present. If you press two fingers on the screen and drag them, the FrameReported event handler will be called for both fingers, but GetPrimaryTouchPoint will return information only for the first finger that hit the screen. For events corresponding to the second finger, GetPrimaryTouchPoint returns null.

If you're interested in just one finger, GetPrimaryTouchPoint is fine. Just ignore the event when it returns null. But if you're interested in multiple fingers, you need to call GetTouchPoints, which returns a TouchPointCollection of one or more TouchPoint objects with information about all the fingers.

So why call GetPrimaryTouchPoint at all? Because of the third method defined by TouchFrameEventArgs: SuspendMousePromotionUntilTouchUp. You must call this method if you do not want touch events promoted to mouse events — that's usually the case if you're working with multi-touch — but you can only call it for a primary touch point, and only for an Action of TouchAction.Down.

In summary, the event handler will probably look something like this:

void OnTouchFrameReported(object sender, TouchFrameEventArgs args)
{
    TouchPoint primaryTouchPoint = args.GetPrimaryTouchPoint(null);

    if (primaryTouchPoint != null && 
                    primaryTouchPoint.Action == TouchAction.Down)
        args.SuspendMousePromotionUntilTouchUp();

    TouchPointCollection touchPoints = args.GetTouchPoints(null);

    foreach (TouchPoint touchPoint in touchPoints)
    {
        ...
    }
}

The code is simple enough, but it took me a long time to get there!

The program from two days ago is called ClickAndDeformText. I wanted the new program (TouchAndDeformText) to work pretty much the same as the existing program, except that the Thumb controls pop up not only when you click the screen with the mouse, but also when you touch the screen with a finger. These Thumb controls should be larger for manipulation with the fingers, and I wanted the option to move more than one at a time.

To port ClickAndDeformText to the new TouchAndDeformText program, I knew I needed to replace the Thumb controls with TouchThumb controls that work with both touch and the mouse. I wrote the TouchThumb class from scratch. I would have derived from Thumb were it not sealed, and I didn't feel like messing around with the existing Thumb source code.

For reasons I'll discuss shortly, I wanted all TouchThumb objects in the application to share the same FrameReported event handler. For that reason, the handler is attached to the event in the static constructor:

static TouchThumb()
{
    Touch.FrameReported += OnTouchFrameReported;
}

TouchThumb also has a static collection of TouchThumb objects:

static List<TouchThumb> touchThumbs = new List<TouchThumb>();

The instance constructor adds each new TouchThumb to this collection:

public TouchThumb()
{
    touchThumbs.Add(this);
}

Like Thumb, TouchThumb defines DragStarted, DragDelta, and DragCompleted events. It does not define the IsDragging property or the CancelDrag method, although these could easily be added.
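A minimal sketch of those event declarations, assuming the same delegate types that Thumb uses from System.Windows.Controls.Primitives, looks like this:

public event DragStartedEventHandler DragStarted;
public event DragDeltaEventHandler DragDelta;
public event DragCompletedEventHandler DragCompleted;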

Let's assume that the TouchThumb objects are displayed on the screen. Most of the action in TouchThumb occurs in the static OnTouchFrameReported event handler. It begins functionally the same as the handler I showed earlier and then has a switch statement based around the Action property. For TouchAction.Down the code determines whether the element referenced by the DirectlyOver property (or one of its ancestors) is actually a TouchThumb. If so, we're in business. The Id property becomes a key in a static Dictionary defined like so:

static Dictionary<int, Info> infoDictionary = new Dictionary<int, Info>();

where Info is a tiny private class that defines TouchThumb and Position properties. This is how the program keeps the fingers separate. A DragStarted event is fired.
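A minimal sketch of that Info class (the actual source may differ in detail):

class Info
{
    public TouchThumb TouchThumb { get; set; }
    public Point Position { get; set; }
}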

For TouchAction.Move and TouchAction.Up, the Id provides access to the TouchThumb instance and Position information stored in the Dictionary, which then allows firing DragDelta and DragCompleted events. On TouchAction.Up, the entry in the Dictionary is removed.
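Putting those pieces together, the core of the static handler has roughly the following shape. This is only a sketch based on the description above; FindAncestorTouchThumb and the FireDrag... methods are hypothetical names standing in for whatever the actual code does.

foreach (TouchPoint touchPoint in args.GetTouchPoints(null))
{
    int id = touchPoint.TouchDevice.Id;

    switch (touchPoint.Action)
    {
        case TouchAction.Down:
            // Walk up the tree from DirectlyOver looking for a TouchThumb
            TouchThumb touchThumb =
                FindAncestorTouchThumb(touchPoint.TouchDevice.DirectlyOver);

            if (touchThumb != null)
            {
                infoDictionary.Add(id, new Info { TouchThumb = touchThumb,
                                                  Position = touchPoint.Position });
                touchThumb.FireDragStarted();
            }
            break;

        case TouchAction.Move:
            if (infoDictionary.ContainsKey(id))
            {
                Info info = infoDictionary[id];
                info.TouchThumb.FireDragDelta(touchPoint.Position.X - info.Position.X,
                                              touchPoint.Position.Y - info.Position.Y);
                info.Position = touchPoint.Position;
            }
            break;

        case TouchAction.Up:
            if (infoDictionary.ContainsKey(id))
            {
                infoDictionary[id].TouchThumb.FireDragCompleted();
                infoDictionary.Remove(id);
            }
            break;
    }
}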

This is how it works if the TouchThumb objects are already displayed on the screen. But when the program begins running, they're all hidden. They need to be brought into view by touching the screen. This is why I implemented a static event handler and why I defined a fourth (static) event in TouchThumb named UnhandledFrameReported. This event is fired only for a primary touch point with an Action of TouchAction.Down, and only when no TouchThumb is underneath the finger. The MainPage class installs a handler for this event to make the TouchThumb objects visible — basically the same code as the OnMouseLeftButtonDown override in that class.
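A sketch of that arrangement, assuming the event simply passes along the TouchFrameEventArgs (the actual signature might be different):

// In TouchThumb
public static event EventHandler<TouchFrameEventArgs> UnhandledFrameReported;

// In the MainPage constructor
TouchThumb.UnhandledFrameReported += OnUnhandledFrameReported;

// In MainPage: make the TouchThumb objects visible,
// much like the OnMouseLeftButtonDown override
void OnUnhandledFrameReported(object sender, TouchFrameEventArgs args)
{
    ...
}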

I added mouse support to TouchThumb but didn't do any testing of (or much thinking about) weird interactions between the fingers and the mouse.

I wanted the thumbs to be visually quite different depending on whether they're invoked by a mouse click or a finger tap. For a mouse click the thumbs should be small and solid. For the fingers, I wanted them larger and partially transparent. I handled this by defining two different ControlTemplate objects for the two visual appearances and selecting one when the TouchThumb objects come into view. (TouchThumb does not define a default ControlTemplate, so it has no default visual.)
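One way to make that selection (the resource keys and the fromTouch flag here are hypothetical) is simply to assign the Template property in code when the thumbs are made visible:

// Large, semi-transparent template for fingers; small, solid one for the mouse
touchThumb.Template = (ControlTemplate)Resources[fromTouch ? "fingerThumbTemplate"
                                                           : "mouseThumbTemplate"];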

As the program came into final shape, I discovered a serious problem: The code that performed the calculations on the coordinates of the polylines was so slow that nothing was being updated while a finger was dragging a TouchThumb. I tried reducing the number of multiplications being performed, but that barely helped, and then I decided to get really extreme: The PathGeometry defining the letters is composed of multiple PathFigure objects. For each DragDelta event I decided to update only one of these PathFigure objects, in round-robin fashion, and to update them all on DragCompleted. It actually worked much better than I anticipated!
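The round-robin bookkeeping itself is tiny: keep an index, recalculate one figure per DragDelta, and sweep through all of them when the drag completes. A sketch, with RecalculateFigure standing in for whatever calculation the program actually performs:

int figureIndex;

void UpdateOneFigure(PathGeometry pathGeometry)
{
    // Recalculate just one PathFigure per DragDelta event
    RecalculateFigure(pathGeometry.Figures[figureIndex]);
    figureIndex = (figureIndex + 1) % pathGeometry.Figures.Count;
}

void UpdateAllFigures(PathGeometry pathGeometry)
{
    // On DragCompleted, bring every PathFigure up to date
    foreach (PathFigure figure in pathGeometry.Figures)
        RecalculateFigure(figure);
}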

You can run the program here:


TouchAndDeformText.html

If you went to the PDC, run it on the notebook you brought home and use your fingers! If you have access to another multi-touch device, I'm curious to know how it works. Here's the source code.