February 2009

Volume 24 Number 02

Foundations - Error Handling In Workflows

By Matt Milner | Feb 2009

Code download available

Contents

Handling Faults in Workflows
Handling Faults in the Host Process
Handling Faults in Custom Activities
Using Compensation
Retry Activity

Windows Workflow Foundation (WF) provides the tools to define rich business processes and a runtime environment to execute and manage those processes. In any business process, exceptions to the expected flow of execution occur, and developers need to be able to write robust application logic to recover from those exceptions. Most samples of any technology tend to overlook error handling and proper recovery. In this month's installment, I will show you how to handle exceptions properly at several levels of the WF programming model and how to build rich exception handling capabilities into your workflows and hosts.

Handling Faults in Workflows

Developers building business processes must be able to handle exceptions to the business case to ensure that the process itself is resilient and can continue after failures occur. This is especially important with workflows, as they often define long-running processes, and an unhandled failure, in most cases, means having to restart the process. The default behavior for a workflow is that if an exception happens in the workflow and is not handled, the workflow will be terminated. Thus, it is crucial for workflow developers to correctly scope work, handle errors, and build into the workflows the ability to retry work when failures occur.

Handling faults in workflows has many things in common with handling faults in Microsoft .NET Framework-targeted code, plus a few new concepts. To handle a fault, the first step is to define a scope of execution. In .NET code, this is accomplished with the try keyword. In workflows, most composite activities can be used to create an exception handling scope. Each composite activity has the main view showing the child activities and also has alternate views. In Figure 1, the context menu on a Sequence activity shows how the various views can be accessed and the result of selecting the View Fault Handlers option.

Figure 1 Alternate Views Menu and Selecting View Fault Handlers

When switching to the Fault Handlers view, a FaultHandlers activity is added to the activities collection of the Sequence activity. Inside the FaultHandlers activity, individual FaultHandler activities can be added. Each of the FaultHandler activities has a property to define the fault type and acts like a catch expression in .NET.

The FaultHandler activity is a composite activity that allows workflow developers to add child activities that define how to handle the exception. These activities can provide functionality to log errors, contact administrators, or take any other actions that you would normally take when handling exceptions in your code.

The FaultHandler activity also has a Fault property that contains the exception being caught. Child activities can bind to this property to get access to the exception. This is illustrated in Figure 2, where a custom logging activity has its exception property bound to the Fault property on the FaultHandler activity. The logging activity can now write the exception information to a logging API, the Windows Event Log, Windows Management Instrumentation (WMI), or any other destination.

Figure 2 Binding to Faults

Like catch blocks, the FaultHandler activities are evaluated based on their fault types. When defining the workflow, the FaultHandler activities should be added to the FaultHandlers in order from the most specific fault to the least specific, left to right.
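The ordering rule mirrors the way catch blocks are arranged in C#. The following is a minimal sketch in plain C#, not WF code; the FaultOrderingDemo class and its Classify helper are hypothetical names used only for illustration. The most specific exception type is listed first so that a more general handler does not claim it:

```csharp
using System;
using System.IO;

public static class FaultOrderingDemo
{
    // Hypothetical helper: runs the work and reports which handler caught the failure.
    public static string Classify(Action work)
    {
        try
        {
            work();
            return "no fault";
        }
        catch (FileNotFoundException)   // most specific type first
        {
            return "file not found";
        }
        catch (IOException)             // more general I/O failure
        {
            return "I/O fault";
        }
        catch (Exception)               // least specific type last
        {
            return "general fault";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Classify(() => { throw new FileNotFoundException(); }));
        Console.WriteLine(Classify(() => { throw new EndOfStreamException(); }));
        Console.WriteLine(Classify(() => { throw new InvalidOperationException(); }));
    }
}
```

Reading the catch clauses top to bottom corresponds to reading FaultHandler activities left to right in the designer: the first handler whose type matches the exception is the one that runs.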

Figure 3 Execution Continues after the Composite

When an exception occurs and is caught in .NET code, after the catch block finishes the execution continues after the try scope. Likewise, in a workflow, execution continues with the next activity after the composite activity that handles the exception (see Figure 3).

There are two key concepts about how the FaultHandler activities are evaluated and executed. When the Throw activity executes (as in Figure 3), or another activity throws an exception, the runtime puts the activity in the faulting state and schedules the HandleFault method on the activity to execute. I will go into more detail on how activities implement this method shortly, but for now it is enough to know that this is the chance for the activity to clean up.

When the activity finishes cleaning up and moves to the closed state, the parent activity is put into the faulting state and is also given the chance to clean up any child activities and signal that it is ready to move to the closed state. It is at this point, when the composite activity signals that it is ready to move to the closed state, that the runtime checks the status and, when it is Faulting, examines the FaultHandlers collection. If a FaultHandler activity is found with a Fault type that matches the current exception, that FaultHandler is scheduled and the evaluation is halted. Once the FaultHandler closes, execution can then continue to the next activity.

When an exception occurs, the runtime will attempt to find fault-handling activities on the immediate parent composite activity. If no matching handlers are found, that composite is faulted and the exception bubbles up to the next composite activity in the tree. This is similar to how .NET exceptions bubble up the stack to the calling methods when they are unhandled. If the exception bubbles all the way to the root activity of the workflow and no handler is found, then the workflow is terminated.

Note that the workflow itself is a composite activity and, therefore, can have fault-handling logic defined to deal with exceptions that reach the top level. This is the last chance for a workflow developer to catch and handle exceptions in the business process.

Handling Faults in the Host Process

While creating robust workflows is important, having a host that can deal with exceptions is equally important to the stability of your application. Fortunately, the workflow runtime is robust in handling these exceptions out of the box and shields the host process from most exceptions bubbling up.

When an exception occurs in a workflow, bubbles up through the hierarchy, and is uncaught, the workflow will terminate and an event is raised on the runtime. The host of the runtime can register a handler to get notified when these exceptions occur, but the exceptions do not cause the host to crash. To get notified about these terminations, the host process can use code like the following to get information about the exception from the event arguments:

    workflowRuntime.WorkflowTerminated += delegate(
        object sender, WorkflowTerminatedEventArgs e)
    {
        // The unhandled exception that terminated the workflow
        Console.WriteLine(e.Exception.Message);
    };

In addition to dealing with terminated workflows, the host process also has the ability to get notified about exceptions that occur in runtime services. For example, if SqlWorkflowPersistenceService is loaded into the runtime, it will poll the database and may try to load workflows periodically when they have further work to do. When attempting to load a workflow, the persistence service may throw an exception, for example while trying to deserialize the workflow. When this happens, it is again important that the host process not fail, which is why these services don't rethrow these exceptions. Instead, they raise an event to the workflow runtime. The runtime in turn raises a ServicesExceptionNotHandled event that can be handled in code, as shown here:

    workflowRuntime.ServicesExceptionNotHandled += delegate(
        object sender, ServicesExceptionNotHandledEventArgs snhe)
    {
        // The exception reported by a runtime service
        Console.WriteLine(snhe.Exception.Message);
    };

In general, developers of runtime services have to make a choice when catching an exception about whether the exception is critical. In SqlWorkflowPersistenceService, not being able to load a single workflow does not mean the service cannot function. Therefore, it makes sense in this case to just raise an event to allow the host process to determine whether further action is needed. However, if the persistence service cannot connect to the SQL Server database, then it cannot function at all. In that case, rather than raise an event, it makes more sense for the service to throw an exception and bring the host to a halt so that the issue can be resolved.

When developing custom runtime services, the recommended approach is to have those services derive from the WorkflowRuntimeService base class. This base class provides both access to the runtime and a protected method to raise a ServicesExceptionNotHandled event. When an exception occurs in the execution of a runtime service, the service should only throw that exception if it is truly an unrecoverable error. If the error is related to a single workflow instance and not the general execution of the service, then an event should be raised instead.
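As a rough sketch of that guidance, the service below raises the event for a failure tied to one workflow instance and lets anything else propagate. Only the WorkflowRuntimeService base class and its protected RaiseServicesExceptionNotHandledEvent method come from WF; the InstanceAuditService class, the RecordInstance and WriteAuditRecord methods, and the choice of InvalidOperationException as the "recoverable" case are assumptions made for illustration.

```csharp
using System;
using System.Workflow.Runtime.Hosting;

// Hypothetical runtime service used only to illustrate the pattern.
public class InstanceAuditService : WorkflowRuntimeService
{
    // Hypothetical entry point called on behalf of a single workflow instance.
    public void RecordInstance(Guid instanceId)
    {
        try
        {
            WriteAuditRecord(instanceId);
        }
        catch (InvalidOperationException ex)
        {
            // Assumed to affect this instance only: notify the host through
            // ServicesExceptionNotHandled and keep the service running.
            RaiseServicesExceptionNotHandledEvent(ex, instanceId);
        }
        // Any other exception type is treated as unrecoverable in this
        // sketch and is allowed to propagate to the caller.
    }

    // Placeholder for the real work; assumed to throw on failure.
    protected virtual void WriteAuditRecord(Guid instanceId)
    {
    }
}
```

Which exception types count as recoverable is a design decision for each service; the split shown here is only one possibility.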

Handling Faults in Custom Activities

For activity authors, exception handling takes on a slightly different meaning. The goal with exception handling in activities is twofold: handle exceptions that occur to keep them, when possible, from bubbling up and disrupting the workflow, and clean up properly in cases where an unhandled exception bubbles out of the activity.

Because an activity is just a class, handling exceptions within the activity is no different than in any other class. You use try/catch blocks when calling other components that may throw errors. However, once you catch an exception in an activity, you must decide whether to rethrow the exception. If the exception is something that will not affect the outcome of the activity, or your activity has a more controlled way of indicating that it was not successful, then catching the exception and using that mechanism is the preferred way to provide feedback. If, however, the exception means that your activity has failed and cannot complete its processing or provide an indication of the failure, then you should throw an exception so the workflow developer can design the business process to handle the exception.

The other facet of handling exceptions in activities is dealing with cleaning up activity resources. Unlike in a workflow, where fault handling is focused on the business process, logging, and notification, handling faults in activities is primarily focused on cleaning up the resources used in the activity execution.

The way you handle faults will also depend on whether you are writing a leaf activity or a composite activity. In a leaf activity, the HandleFault method is called when an unhandled exception is caught by the runtime in order to allow the activity to free up any resources that might be in use and clean up any execution that has begun. For example, if the activity is using a database during execution, in the HandleFault method it should make sure to close the connection if it isn't already closed and dispose of any other resources that might be in use. If the activity has begun any asynchronous work, this would be the time to cancel that work and free up the resources being used for that processing.
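A leaf activity that follows this advice might look something like the sketch below. It uses a file stream rather than a database connection to stay self-contained. The ImportFileActivity class, the "import.dat" file name, and the ProcessInput and ReleaseResources helpers are all hypothetical; only the Activity base class and the Execute and HandleFault overrides are WF members.

```csharp
using System;
using System.IO;
using System.Workflow.ComponentModel;

// Hypothetical leaf activity for illustration only.
public class ImportFileActivity : Activity
{
    // Runtime-only resource; excluded from serialization.
    [NonSerialized]
    private Stream input;

    protected override ActivityExecutionStatus Execute(
        ActivityExecutionContext executionContext)
    {
        input = File.OpenRead("import.dat");   // assumed file name
        ProcessInput(input);                    // may throw
        ReleaseResources();
        return ActivityExecutionStatus.Closed;
    }

    protected override ActivityExecutionStatus HandleFault(
        ActivityExecutionContext executionContext, Exception exception)
    {
        // Reached after an unhandled exception: free whatever is still held,
        // then let the base class finish the transition.
        ReleaseResources();
        return base.HandleFault(executionContext, exception);
    }

    private void ReleaseResources()
    {
        if (input != null)
        {
            input.Dispose();
            input = null;
        }
    }

    // Placeholder for the real work; assumed to throw on failure.
    protected virtual void ProcessInput(Stream stream)
    {
    }
}
```

Putting the cleanup in one helper that tolerates being called twice keeps the success path and the fault path from drifting apart.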

For a composite activity, when the HandleFault method is called, it might be because of an error in the logic of the activity itself, or it may be because a child activity has faulted. In either case, the intent in calling the HandleFault method on a composite activity is to allow the activity to clean up its child activities. This cleanup involves making sure the composite doesn't request that any more activities be executed and canceling any activities that are executing. Fortunately, the default implementation of the HandleFault method, defined in the CompositeActivity base class, is to call the Cancel method on the composite activity.

Cancellation is another mechanism for cleanup. It allows activities that have started some work asynchronously, and are currently waiting for that work to complete, to be notified that they should cancel the work they have started and clean up their resources so they can close. An activity may be canceled if another activity has thrown a fault, or under normal circumstances if the control flow logic of the parent composite activity decides to cancel the work.

When an activity is to be canceled, the runtime sets the status of that activity to Canceling and calls the Cancel method on the activity. For example, the Replicator activity can start several iterations of a child activity, one for each piece of data supplied, and schedule those activities to run in parallel. It also has an UntilCondition property that will be evaluated as each child activity closes. It is possible, and likely, that evaluation of the UntilCondition will cause the activity to determine that it should complete.

In order for the Replicator to close, it must first close all child activities. Since each of those activities has already been scheduled and is potentially executing, the Replicator activity checks the current value of the ExecutionStatus property and, if it is Executing, makes a request for the runtime to cancel that activity.

Using Compensation

Handling faults in workflows allows developers to deal with immediate exception conditions. The use of transactions also provides the ability to scope work together to ensure consistency. However, in long-running workflows, it is possible that two units of work require consistency but cannot use a transaction.

For instance, once a workflow starts it may update the data in a line-of-business application, perhaps adding a customer into the CRM system. This work may even be part of a transaction to provide consistency across several operations in the CRM and with the state of the workflow. Then, after waiting for further input from a user, which may take days to happen, the workflow updates an accounting system with the customer information. It is important that both the accounting system and the CRM system have consistent data, but it is not possible to use an atomic transaction for those resources across such a large time span. So the question becomes, how do you handle exceptions that occur when updating the second system to ensure consistency with the changes already committed to the first system?

Figure 4 While Activity as Retry Logic

Because the work in the two systems cannot be made consistent with a transaction, what you need is a mechanism to detect errors that occur when updating the second system and provide an opportunity to go back and undo the work applied in the initial system, or otherwise make changes to ensure consistency. While the act of detecting this change and initiating this process can be automatic, the work of fixing the initial system obviously has to be specified by the developer.

In WF this process is referred to as compensation, and several activities are provided to help develop workflows that use compensation. For more information on compensation and how to use the compensation-related activities, see Dino Esposito's Cutting Edge column on transactional workflows in the June 2007 issue of MSDN Magazine ("Transactional Workflows").

Retry Activity

One of the problems with dealing with exceptions in workflows is that, when an exception occurs, even if you catch it, the execution moves on to the next step in the process. In many business processes, execution actually should not continue until the business logic defined in the workflows executes successfully. Developers often deal with this by using a While activity to provide retry logic and defining the condition of the activity to indicate that the activity should continue to execute as long as an error has occurred. Further, a Delay activity is often used to keep the retry logic from happening immediately.

To enable this retry model, you can use a Sequence activity as the child of a While activity. Further, a specific unit of work in the sequence is often wrapped in another sequence or composite activity to handle the exceptions, acting as the fault-handling scope with all fault handlers defined in the Fault Handlers view. Then an IfElse activity is usually used to modify the state of the workflow to influence the condition on the While activity.

In the case where no exception occurs, the logic sets a property or flag of some sort so the While activity can close. If an exception did occur, then the flag is set to cause the While activity to execute again, and a Delay activity is used to pause before making the next attempt. Figure 4 shows one example of using the While activity to retry activities in a workflow.

While this particular pattern works in many scenarios, imagine a workflow with 5 or 10 different operations that need to be retried. You will quickly realize that it is a lot of work to build the retry logic for each activity. Fortunately, WF enables developers to write custom activities, including custom composite activities. That means I can write my own Retry activity to encapsulate executing the child activities again when an exception occurs. For this to be valuable, I want to provide two key inputs for users: a delay interval between retries, and a maximum number of times to retry the work before letting the exception bubble up and be handled.
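The intended semantics of those two inputs can be sketched in ordinary C# before looking at the WF implementation. This is only an analogy under stated assumptions: RetryDemo.Run and its retryCount and retryInterval parameters are hypothetical names, and Thread.Sleep blocks a thread, whereas a workflow activity would wait on a timer that can be persisted.

```csharp
using System;
using System.Threading;

public static class RetryDemo
{
    // Runs 'work'. On failure, waits 'retryInterval' and tries again, up to
    // 'retryCount' retries. The final failure is allowed to propagate.
    // Returns the number of retries that were needed.
    public static int Run(Action work, int retryCount, TimeSpan retryInterval)
    {
        int attempt = 0;
        while (true)
        {
            try
            {
                work();
                return attempt;
            }
            catch (Exception)
            {
                if (attempt >= retryCount)
                    throw;               // retries exhausted: let the exception bubble up
                attempt++;
                Thread.Sleep(retryInterval);
            }
        }
    }

    public static void Main()
    {
        int calls = 0;
        int retries = Run(delegate
        {
            calls++;
            if (calls < 3)
                throw new InvalidOperationException("transient failure");
        }, 5, TimeSpan.FromMilliseconds(10));

        Console.WriteLine("Succeeded after {0} retries", retries);
    }
}
```

In this sketch the work fails twice and succeeds on the third call, so Run returns 2. If the work kept failing, the sixth failure would propagate to the caller, which corresponds to the exception bubbling up to the fault handlers in a workflow.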

In the remainder of this column, I will detail the logic in the Retry activity. For background information on creating custom activities, see my previous article ("Windows Workflow: Build Custom Activities to Extend the Reach of your Workflows"), and for more information on using the ActivityExecutionContext to create activities that can iterate over a child activity, see the June 2007 installment of this column ("ActivityExecutionContext in Workflows").

To manage the child activity correctly, it is important to be able to monitor the activity to know when errors occur. Thus, when executing the child activity, the Retry activity not only registers to get notified when the child activity closes, but also registers to get notified when the child activity is put into a Faulting state. Figure 5 shows the BeginIteration method used to start each iteration of the child activity. Before scheduling the activity, the Closed and Faulting events have handlers registered.

Figure 5 Executing Child Activities and Registering for Faults

    Activity child = EnabledActivities[0];
    // Each iteration runs in its own execution context
    ActivityExecutionContext newContext =
        executionContext.ExecutionContextManager.CreateExecutionContext(child);
    // Watch for both normal completion and faults
    newContext.Activity.Closed +=
        new EventHandler<ActivityExecutionStatusChangedEventArgs>(child_Closed);
    newContext.Activity.Faulting +=
        new EventHandler<ActivityExecutionStatusChangedEventArgs>(Activity_Faulting);
    newContext.ExecuteActivity(newContext.Activity);

Normally if a child activity faults, the parent activity would also be put in the faulting state. In order to avoid that situation, when the child activity faults, the Retry activity checks to see whether the activity has already been retried the maximum number of times. If the retry count has not been reached, then this code nulls out the current exception on the child activity, thus suppressing the exception:

    void Activity_Faulting(object sender, ActivityExecutionStatusChangedEventArgs e)
    {
        e.Activity.Faulting -= Activity_Faulting;
        // Suppress the exception while retries remain
        if (CurrentRetryAttempt < RetryCount)
            e.Activity.SetValue(
                ActivityExecutionContext.CurrentExceptionProperty, null);
    }

When the child activity closes, the logic must determine how the activity got to the closed state and uses the ExecutionResult property to do so. Since all activities finish in the Closed state, the ExecutionStatus does not provide the information needed to determine the actual outcome, but the ExecutionResult indicates whether the activity faulted, succeeded, or was canceled. If the child activity succeeded, then no retry is needed and the Retry activity simply closes:

    if (e.ExecutionResult == ActivityExecutionResult.Succeeded)
    {
        this.SetValue(ActivityExecutionContext.CurrentExceptionProperty, null);
        thisContext.CloseActivity();
        return;
    }

If the result from the closing activity is not success, and the retry count hasn't been reached, then the activity must be executed again, but not before the retry interval has expired. In Figure 6, instead of starting another iteration directly, a timer subscription is created using the interval configured on the activity.

Figure 6 Creating a Timer Subscription

    if (CurrentRetryAttempt++ < RetryCount &&
        this.ExecutionStatus == ActivityExecutionStatus.Executing)
    {
        this.SetValue(ActivityExecutionContext.CurrentExceptionProperty, null);
        DateTime expires = DateTime.UtcNow.Add(RetryInterval);
        SubscriptionID = Guid.NewGuid();
        // Queue that receives the timer notification
        WorkflowQueuingService qSvc =
            thisContext.GetService<WorkflowQueuingService>();
        WorkflowQueue q = qSvc.CreateWorkflowQueue(SubscriptionID, false);
        q.QueueItemAvailable += new EventHandler<QueueEventArgs>(TimerExpired);
        // Register the timer with the workflow's timer collection
        TimerEventSubscription subscription = new TimerEventSubscription(
            SubscriptionID, WorkflowInstanceId, expires);
        TimerEventSubscriptionCollection timers = GetTimerSubscriptionCollection();
        timers.Add(subscription);
        return;
    }

When the timer expires, the TimerExpired method will be invoked, as shown here:

    void TimerExpired(object sender, QueueEventArgs e)
    {
        ActivityExecutionContext ctx = sender as ActivityExecutionContext;
        CleanupSubscription(ctx);
        BeginIteration(ctx);
    }

Figure 7 The Retry Activity in a Workflow

This will begin the next iteration of the child activity. By using the TimerEventSubscription class and adding the timer to the workflow's timer collection, the activity is able to correctly participate in persistence and resumption with whatever persistence service is currently configured in the runtime. If the retry interval is long, the entire workflow can be taken out of memory until the timer expires.

The key behavior of the activity has been met at this point. If a child activity faults, the Retry activity will not fault. Instead it will pause for the retry interval, then attempt to execute the child activity again.

The final step is to deal with the case where the activity has reached the retry count and the child activity has continued to fail. In this case, the Activity_Faulting method does not clear the exception on the child activity, as the goal is to let that activity fault as normal. And when the child activity closes, the Retry activity also closes.

When the Retry closes after all retry attempts have failed, the result is the same as if the original work had failed in a sequence. The Retry activity can have FaultHandler activities defined, and those fault handlers will only execute after all retries have been executed. Using this model simplifies the development of workflows with actions that may need to be retried, yet maintains the same development experience for workflow developers in regard to handling faults, as shown in Figure 7.

In addition, the fault handlers will be executed for the child activity when the retry attempts have failed, so workflow developers can choose to handle the faults on either activity. The HandleFault method gets called on the child activity for each failure, ensuring that the activity has a chance to clean up on each iteration.

Send your questions and comments to mmnet30@microsoft.com.

Matt Milner is a member of the technical staff at Pluralsight, where he focuses on connected systems technologies. Matt is also an independent consultant specializing in Microsoft .NET technologies with a focus on Windows Workflow Foundation, BizTalk Server, ASP.NET, and Windows Communication Foundation. Matt lives in Minnesota with his wife, Kristen, and his two sons.