P3.NET

Creating an SSIS Custom Task, Updated

Last year I posted a series of articles about creating custom SSIS tasks to cleanly wrap some common work you may find yourself doing in SSIS. You could do this with script tasks but if you need this same functionality everywhere then it becomes a maintenance issue. In that series of articles we created a framework for adding new tasks easily and created one task to generate SSRS reports. Since that time my company has been using these tasks successfully but we did identify a couple of issues. Since that series of blog posts was based upon the lessons we learned it makes sense to update the code with the changes we’ve made over the past year.

Basic Changes

There have been a couple of changes made that don’t have an overall impact on the code.

  1. The instructions for creating new tasks has been converted to Markdown since Visual Studio can display it nicely.
  2. SSDT is integrated into Visual Studio 2017 and supports all versions of SSIS from SQL 2012 on.
  3. The older framework tools have been updated to Windows 10 and .NET 4.6.2.
  4. The build outputs have been modified to build to the same folder so it is easier to distribute later on.

Word of warning when updating to newer versions of SSIS. The Microsoft.DataTransformationServices.Controls assembly is in the GAC but does not show up in VS. To ensure you get the correct version, remove the reference in the projects and then browse to the GAC and select the correct version from there. If you don’t then it might reference the wrong assembly.

Versioning Changes

One of the bigger challenges when working with SSIS is versioning. The designer and runtime are implicitly tied to the version of the assembly containing the task. As the code evolves new versions are released. Provided the product version does not change this is mostly harmless. But as soon as the product version changes all the existing packages will no longer work and even the code is wrong.

To manage this better a new, shared AssemblyVersion.cs file has been added to the solution. This file is where the version information resides. All the projects have been updated to use it. Attributes defined in this new file have been removed from the per-project AssemblyInfo.cs files. Now changing the version simply requires modifying this file and rebuilding.

using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

#if DEBUG
[assembly: AssemblyConfiguration("Debug")]
#endif

[assembly: AssemblyCompany("P3Net")]
[assembly: AssemblyProduct("SSIS Custom Tasks")]
[assembly: AssemblyCopyright("(c) Michael Taylor 2016, All Rights Reserved")]
[assembly: AssemblyTrademark("")]

[assembly: AssemblyVersion(AssemblyMetadata.ProductVersion)]
[assembly: AssemblyFileVersion(AssemblyMetadata.FileVersion)]
[assembly: AssemblyInformationalVersion(AssemblyMetadata.FileVersion)]

internal static class AssemblyMetadata
{
    public const string ProductVersion = "2.0.0.0";

    public const string FileVersion = "2.0.0.0";
}

Notice the new metadata class as well. The assembly attributes use this metadata class for their version information. We put this into a metadata class because we need it for the DTS attributes as well. If you remember each SSIS task must have the DtsTask attribute applied to it. One of the properties on this attribute is the name of the type that implements the UI for it. The type name is a fully qualified type name that requires the version number. If the product version changes then this attribute has to be updated. To avoid having to search the code for this we instead now rely on the metadata class.

[DtsTask(DisplayName=...
       , UITypeName = GenerateSsrsTask.UITypeName)]

In order to be used in an attribute the value must be known at compile time (e.g. a constant). Similar to how we defined constants in the task class to be used by this attribute we put a constant in for the type name but we use the AssemblyMetadata class to provide the actual value.

public const string UITypeName = "P3Net.IntegrationServices.Tasks.GenerateSsrs.UI.GenerateSsrsTaskUI, P3Net.IntegrationServices.Tasks.GenerateSsrs.UI, Version=" + AssemblyMetadata.ProductVersion + ", Culture=Neutral, PublicKeyToken=0b943396c101760f";

Build Changes

Another challenge is with the build process. Unfortunately SSIS requires that you compile against the exact version of the assemblies that the runtime uses. We cannot work around this without creating multiple projects and sharing the code. For the most part we can ignore this unless we need to support multiple SQL versions.

The bigger issue with the build is the build events. To deploy the assemblies we have to register them and copy them to the correct location. This means the build is tied to the SSIS version we’re targeting and the version of the SDK that is installed. To workaround this and make it easier to maintain the solution now has a RegisterAssembly.ps1 script that is called during pre and post build. This script wraps the logic that was previously in the build events.

To make it more maintainable, the script no longer uses GacUtil. Instead it relies on the Gac Powershell module. Unfortunately this is not installed by default. The script attempts to find and install the module first. If that fails then it will have to be installed by hand. Once installed the script will load and use it.

Variable Bug

What really triggered this updated post is a bug in how variables are handled. We have a custom task that receives files from a backend service. The files are generated asynchronously so we query for the file and, if it is not ready yet, we skip it. To skip the file we set the provided SSIS variable to null. This is necessary so that we don’t accidentally reuse a file generated in the last iteration. But this is where the problem comes in.

If you set an SSIS variable to null then it’s type gets changed to Empty. This is certainly unexpected but makes sense if you consider SSIS’s COM background. Under the hood SSIS variables are probably VARIANTs. When you set a variable to null, rather than storing null it changes the type to Empty. This will break any code that is looking for, say, an Object as it is no longer that type. That was the problem here. The SSRS task would have a similar problem because here is how it handles the report being passed back.

var data = conn.DownloadData();
if (!String.IsNullOrWhiteSpace(Content))
    context.Variables.SetValue(Content, data);

If no report is sent back then the variable is set to null which changes its type. If you look at the validation logic for the task you’ll see it will fail if the type is not Object. It wouldn’t be on subsequent loops and so you get a runtime error.

if (variable.DataType != TypeCode.Object)
{
    context.Events.LogError("Content must be of type Object.");
    return DTSExecResult.Failure;
};

We need to handle this and probably the best place is inside the SetValue extension method we created. That method ultimately calls down to SetVar<T> which is where we set the variable. We need to detect null values here but it isn’t as trivial as you might imagine. While probably oversimplified, this is the updated code.

public static void SetVar<T>(this Variables source, object index, T value)
{
    //When clearing an object, create a new object instead otherwise SSIS will change the type to Empty            
    source[index].Value = (object)value ?? new object();
}

We are relying on the fact that the value is generic and so we only need to worry about null values with Object. This may not be true but for the existing tasks it seems to be good enough. If the value is null then we reset the variable to a new object. This is not something we’d normally do in C# but that is exactly what SSIS does when you create new object variables so it is consistent. Now when we assign null to an object variable it will retain its SSIS type.

But this introduces another issue and that is with dealing with “null” values. Since we cannot rely on SSIS actually using null we need to use heuristics. Here’s the method that I’m using now and it seems reasonable for our needs. It may have some cases it misses though.

private static bool IsNull ( object value )
{
    if (value == null)
        return true;

    var dtsVar = value as Microsoft.SqlServer.Dts.Runtime.Variable;
    if (dtsVar != null)
    {
        if (dtsVar.DataType == TypeCode.DBNull || dtsVar.DataType == TypeCode.Empty)
            return true;

        return IsNull(dtsVar.Value);
    };

    //In SSIS a "null" value is of type object so we'll assume that if it is truly just "object" then it must be null
    return value.GetType() == typeof(object);
}

First we do a normal null check. Next we try to convert the value to a DTS variable. That means SSIS tried to assign one variable to another. If it is a DTS variable then we look at the types that could be “null”. If the variable is not of any of those types then we recursively call ourselves on the variable’s value. If the value is not a DTS variable then we do a final check to see if it is a base object. Note that we cannot use is or as here because all types derive from it. Instead we assume that any type that is exactly object must be empty.

With this method defined we then have to update all the code that checks for null to use this method instead. The only method I did not modify was the set method. The code as it exists is working correctly so I’m not ready to change it. But the next time we open up this code I’ll probably switch to using IsNull and verify it still behaves correctly.

Content is Required

Related to the former bug is the fact that Content on the SSRS task is required. The UI and the task don’t mandate this parameter. I built it this way because I wanted to handle the case where an SSIS package wanted to trigger a report but not capture it. However because of the null issue the check actually fails. Looking at the validation logic for the task you see this.

//Validate the content
if (Content != null)
{
   //Validation rules
}

As already mentioned, SSIS does not allow a null value so all strings are set to empty. So this check would never fail. The simple fix is to use String.IsNullOrWhiteSpace instead and now the parameter is not required.

//Validate the content
if (!String.IsNullOrWhiteSpace(Content))
{
   //Validation rules
}

Download the Code.