Creating an SSIS Custom Task, Part 1
Recently I had the need to create a couple of custom tasks for SQL Server Integration Services (SSIS) at work and I was shocked at how few the resources were for doing this. There are a lot of blogs and articles about creating SSIS custom tasks, but most of them are outdated or wrong. Even Microsoft’s own documentation has clearly been promoted from older versions and is no longer accurate. This leads to a lot of wasted time and scratching of your head trying to figure out what is wrong.
In this series of articles I’m going to demonstrate how to create a custom task for SSIS 2014. This information is also applicable to SSIS 2012 but doesn’t necessarily work with earlier or later versions of SSIS. In this first article we will set up a simple SSIS package that we can use for the remainder of the series.
SQL and SSDT Versioning
In order to follow along you’ll need a copy of SQL 2014 with SSIS and SSRS installed. If you have SQL 2012 then these steps should work but you’ll need to adjust the SQL versioning (discussed later). SQL 2016 or later may work as well. To avoid confusing things even more, let’s talk about SQL versioning now. SQL versions do not line up with product versions. As of today here’s the versioning of SQL to be aware of.
- SQL 2008 R2 = 10.0
- SQL 2012 = 11.0
- SQL 2014 = 12.0
It is critically important that you remember to use the correct numeric version for your target version of SSIS. SSIS requires that you use the exact version for SSIS otherwise things won’t work correctly. For example, if you are targeting SSIS 2012 then you must use 11.0 of SQL assemblies. If you attempt to use newer or older versions then it may compile but it will not work properly.
You’ll also need a copy of SSDT in order to build SSIS packages. If you have Visual Studio installed then you already have SSDT most likely. But you must use the version of SSDT that is tied to your version of SSIS. Each SSDT version indicates what version of SSIS it supports (and each one only supports 1 version). Here’s the current mapping.
Since we will be targeting SSIS 2014 for this article then you need SSDT-BI for Visual Studio 2013. If you do not already have VS 2013 installed then it will install the shell. VS2015 and the version of SSDT that ships with it DOES NOT work with SSIS 2012.
Finally, for this article we’ll be using the AdventureWorks database so you’ll want to ensure you have a copy of that installed as well. Additionally go ahead and install the SSRS reports for AdventureWorks. Currently there is not a SQL2014 version so install SQL2012 instead. When prompted for the installation path be sure to use 120 instead of 100 as the SQL version. You’ll need to upload each report into SSRS (I recommend placing them in a separate folder), rename them to remove the database name reference and adjust the data source to use the correct server information.
What is SSIS?
If you are not sure what SSIS then refer to MSDN for a full description but ultimately it is used to perform ETLs of data into or out of SQL Server. Whenever you need to do bulk processing of data and ultimately that data is going into or coming from SQL then SSIS is a useful tool. SSIS is based upon workflow processing of sets of data and ships as part of SQL Server. An SSIS package is the equivalent of a program in .NET. A package can be scheduled to run at certain times or run manually as needed.
Out of the box SSIS comes with lots of tasks that you can use to build an ETL process. Some example tasks include running data queries against SQL and other data sources, transforming data into new formats and saving data to various data sources. Ultimately a task consists of inputs and outputs which you set to control the process to determine where and what data to store. Tasks are the building blocks of the ETL process.
There are other components in SSIS as well. Connection managers are responsible for managing connections to external resources including databases, the file system and web services. Anything outside of SSIS that requires a connection of some sort (URL, connection string, etc.) is managed through a connection manager. Connection managers are configured at the package level so all tasks within a package have access to the same connections. A connection manager isn’t an actual connection to a remote resource but merely the information needed to make the connection. The connection is created at runtime, on demand. The benefit of connection managers is that the connection details can be changed (such as at deployment) without breaking or requiring any code to be changed.
In the cases where SSIS does not have a task that accomplishes what you want there is a free-form script task that allows you to write .NET code. The .NET code runs in the SSIS sandbox and has access to the same information that standard tasks do. Oftentimes we’ll write scripts to do custom work that SSIS doesn’t support directly.
SSIS Variables
Before we dive into creating an SSIS package we should talk about variables. SSIS variables work similar to programming language variables in the sense that they have a name, type, value and scope. But the way SSIS uses variables can result in problems if you are not careful.
The first thing that is important is the variable’s scope. By default variables are scoped at the package level which means all tasks can access them. Variables can also be scoped to any container and some tasks. A package can have multiple tasks running in parallel if needed so sharing variables across tasks needs to be done carefully. In general I recommend that you scope a variable to the container or task that you need it in. If you need to change a variable’s scope after creation then use the Variables
window and click the button to move the variable.
The next thing to know about variables is that they should be locked for reading and writing. Because tasks can be running in parallel SSIS needs to know when you are reading/writing variables. You do this using the VariableDispenser. For most tasks you do not need to worry about this process but for script tasks you have to specify which variables you want to read and write as we’ll see later.
Lastly, variables are case sensitive so the namespace and variable name must match the variable declaration or you will get errors.
Creating an SSIS Package
Let’s set up a basic SSIS package.
- Start SSDT-BI for Visual Studio 2013.
- On the Start Page click New Project and then Templates\Business Intelligence\Integration Services Project.
- Name the project whatever you want and click OK.
The solution contains a single package and is ready for you to add some work. For this article we’ll query for all sales that were over $100K.
- Add an Execute SQL Task to the designer called ‘Get orders over $100K’.
- On the
General
tab- Set
ConnectionType
to ADO.NET and click theConnection
property and select New Connection to create a new connection. - Configure the connection.
- Set
ResultSet
to Full result set to pull back all records. - For
SQLStatement
use the following.SELECT TOP 100 SalesOrderId, SUM(LineTotal) FROM [Sales].[SalesOrderDetail] GROUP BY SalesOrderId HAVING SUM(LIneTotal) > 100000
- Set
- On the
Result Set
tab- Set the
Result Name
column to 0. This is required for ADO.NET results. - Assign the results to a new variable of type
Object
(i.e. SalesOrder).
- Set the
For each order we’ll generate the Sales_Order_Detail
report from SSRS so we need to enumerate the rows using the foreach enumerator. Currently we have a single variable representing the entire rowset. The foreach enumerator will enumerate each row. To access the columns of the row we’ll define new variables scoped to the foreach container.
- Add a
Foreach Loop Container
task below the SQL task. - Connect the tasks using the green arrow so SSIS knows to execute the loop after the query returns.
- On the
Collection
tab- Select
Foreach ADO Enumerator
(the enumerator must match the type of the data coming in). - In the
ADO object source variable
dropdown, select the variable that was created earlier. - Set
Enumeration mode
to be the rows in the first table.
- Select
- On the
Variable Mappings
tab- Click in the
Variables
column to add a new variable calledSalesOrderId
. - Set the scope to the foreach container so we don’t collide with other variables elsewhere.
- Set the type to
Int32
. It is important the type matches the column type. - Click OK to create the variable.
- In the
Index
column set the index to zero (indice are zero-based).
- Click in the
Inside the foreach container we need to generate the SSRS report but SSIS does not have a task for that. For now we’ll add a script task.
- Add a
Script
task inside the foreach container. - On the
Script
tab- Click on the
ReadOnlyVariables
field and select theSalesOrderId
variable we created earlier. - Click
Edit Script
to open the script editor. - Put in the following code.
- Put in the following code where the TODO comment is.
var salesOrderId = Dts.Variables["User::SalesOrderId"]; bool fireAgain = true; Dts.Events.FireInformation(0, "ScriptTask", String.Format("SalesOrderId = {0}", salesOrderId.Value), null, 0, ref fireAgain);
- Build the code to ensure it is valid and then close the editor.
- Click
OK
to close the script task.
- Click on the
When we set up the script task we had to specify which variables we wanted to read and write. If you fail to do this then when you try to reference the variables it will fail. Setting up the variables in this way is actually a shortcut for locking. You don’t need to use the variable dispenser inside script tasks because of this (although you could if you wanted to).
Testing the Package
We can now test the package. In SSDT build and run the solution. If you get any errors then it will show in the designer. You have to stop the debugger and go to the Progress
view to see the errors.
If everything runs correctly then all the tasks and containers will show a green check. The informational messages should be shown in the results view.
Generating a Report
To finish up this post we’ll add the code to generate the report from the script task. For this we’ll use SSRS url access. It is a lot of boilerplate code so we won’t cover it here but we need to set up some stuff first.
We need an HTTP connection to the reporting server so add a new HTTP connection in the connection manager list. Specify the SSRS url (i.e. http://myserver/ReportServer) and credentials to use. We also need a file connection to a folder so we can store the report results. Remember the connection names for later.
Open the script task and replace the existing code with the new code. You need to add a reference to the System.Web
assembly.
public void Main() { bool fireAgain = true; //TODO: Set this to the path to the report on SSRS var reportPath = "/Sales_Order_Detail"; var salesOrderId = Dts.Variables["User::SalesOrderId"].Value; reportPath = HttpUtility.UrlEncode(reportPath); var builder = new StringBuilder(reportPath.StartsWith("/") ? reportPath : "/" + reportPath); //Add parameters, case matters to SSRS builder.AppendFormat("&SalesOrderIDStart={0}", salesOrderId); builder.AppendFormat("&SalesOrderIDEnd={0}", salesOrderId); builder.AppendFormat("&rs:Command=Render&rs:Format=PDF"); //Get the connection var cm = Dts.Connections["ReportServer"]; //Create a copy of the connection because we're going to change the URL //var conn = new HttpClientConnection(cm.AcquireConnection(Dts.Transaction)).Clone(); var conn = new HttpClientConnection(cm.AcquireConnection(Dts.Transaction)); if (conn == null) throw new Exception("Unable to acquire connection."); // SSRS url access format = {serverUrl}?{reportPath}{parameters}{options} var uri = new UriBuilder(conn.ServerURL) { Query = builder.ToString() }; conn.ServerURL = uri.Uri.ToString(); // Generate Report Dts.Events.FireInformation(0, "ScriptTask", String.Format("Generating report: {0}", conn.ServerURL), null, 0, ref fireAgain); try { var data = conn.DownloadData(); Dts.Events.FireInformation(0, "ScriptTask", String.Format("Report generated for {0}", salesOrderId), null, 0, ref fireAgain); //Save the file var file = Dts.Connections["TargetFolder"]; var target = file.AcquireConnection(Dts.Transaction) as string; File.WriteAllBytes(Path.Combine(target, salesOrderId.ToString() + ".pdf"), data); } catch (Exception e) { Dts.Events.FireError(0, "ScriptTask", e.Message, null, 0); Dts.TaskResult = (int)ScriptResults.Failure; }; Dts.TaskResult = (int)ScriptResults.Success; }
Running the package now should generate a series of reports in the specified target folder. If you are getting SSRS errors then try the URL that is generated in the log in a browser window. In most cases the issue is either report paths or security.
Next
Next time we’ll start moving the report generation logic to a custom task.
Download the Code.
Hello,
Thank you for posting your solution. This is exactly what I needed and I have it almost working. I know that the path to the report is good, as I have copied it from the execution results window, but when it tries to save the report to the path I specify in my script, I receive an error that it can’t save the file to C:\. I am not referencing this location anywhere in my package, so I’m not sure what is wrong. I have verified the paths and everything looks correct. If you have any suggestions on what the issue might be, I would appreciate any help.
Thank you,
Arlene
Saving the report would be occurring in a script task that you created (or that came with the sample). Open the script task that is trying to save the file and take a look at the code. Ultimately you’re probably calling File.WriteAllBytes. The first parameter would be the file path. Using the debugger you should be able to verify the path is correct. The C:\ isn’t generally writable so if you are trying to use that path then you’d need to change it to a temp folder or something.
If you are getting this error inside the SSRS Report Task itself then something is wrong as the task shouldn’t be saving any files.
Hello,
I am trying to test it first to a folder I created called C:\Test. I have a variable called FileFolder that is set using an expression, based upon variables being passed into the foreachloop. I added a FileSystemTask that creates this folder if it does not exist. This works right down to the subfolder the report should be written to in the FilePath. My “TargetFolder” connection is set to “ExistingFolder” and the connection string is my Variable called FilePath. The script runs to create the report and save it to the FilePath, which set using the FileFolder and FileName variables. I verified that all variable values are correct and that the report will run if I copy/paste the variable value to a browser window. It then prompts me to open, save or save as. The error doesn’t make sense because I would never be trying to write to C:\.
Thank you,
Arlene
AFAIK FileSystemTask cannot be used to create a file like this. It can work with existing files and folders or create new folders but not create new files. It also usually requires a File Connection object. In the sample code I posted I used a Script task to save the report PDF to disk. Can you try using that script block instead and see if your problem goes away. I hard coded the path in the script task but you can use a parameter if you need to.
Thank you for your reply. How should the TargetFolder connection be setup? Create new file? Your instructions say…
“We also need a file connection to a folder so we can store the report results”. Can this be set to a path that is derived from the parameters that I am passing in from my select statement for the ForEachLoop? The folder structure may not exist when the first report runs.
If you’re using FileSystemTask to create a folder then you can use the following values – Operation = “Create Directory”, IsSourcePathVariable = True, SourceVariable = where is some variable you have created to store the directory path. If you want to create different folders each time through your loop then you can change this variable and the FST will use the updated value. I would probably recommend that you set UseDirectoryIfExists = True if the directory may already exist otherwise it is an error.
But, as I mentioned already, FST doesn’t support the creation of new files. There is a data source that allows you to create a new file but I found it to not be useful for saving binary results like a report. Hence I recommend that you simply use a Script task. You can actually combine the folder creation and the file creation into a single script eliminating the need for FST altogether. Here’s how you might go about doing it:
1) Create or use a variable to represent the target filename, including any directory (i.e. targetFilename). If all files go in the same directory then simply create a variable for the target directory and a separate variable for the filename itself.
2) Update this variable in your loop as needed.
3) Create a variable to store the report results (i.e. reportContent).
4) Create a script task that accepts as Read-Only variables the aforementioned variables.
5) Inside the script block
//Add to the top of the script
using System.IO;
//Inside Main script
var targetFilename = Dts.Variables[“User::targetFilename”].Value as string;
var reportContent = Dts.Variables[“User::reportContent”].Value as byte[];
Directory.CreateDirectory(Path.GetDirectory(targetFilename));
File.WriteAllBytes(targetFilename, reportContent);
Dts.TaskResult = (int)ScriptResults.Success;
Hello and thank you again. I had to start from the beginning and got it working with your original post so that it writes to a local drive. I am only passing a single parameter, but each report will produce 12 reports in total. For one particular report that I tested, the ForEachLoop stopped at the 4th report and gave this error: [Connection manager “ReportServer”] Error: Server returned status code – 500 : Reporting Services Error. This error occurs when the server is experiencing problems. If I run the first four manually, they all take about the same time to complete, less than a minute. I’m not sure what the problem is now if some can run and others not.
Arlene
That error means SSRS failed the call. You’d need to look at the SSRS logs to see if you can find any additional information. When I’m trying to debug SSRS issues I like to see the report that failed. The code I posted logs the URL being used to call SSRS each time so I would recommend that you copy that URL and try running it through the browser to see if it is an SSRS issue. Also note that the code (currently) is not multi-thread safe. There is a comment in the code when creating the connection about using Clone. If you are trying to use the code I have posted so far in multiple pipelines at the same time within a single SSIS package then you will end up with corrupted results. I blogged about this bug in SSIS a while back if you’re interested. If you do have that need right now then uncomment the Clone line. When I convert the script to a custom task in one of the later articles I’ll be handling this automatically.
Hello, and thank you for helping me again.
I didn’t seen the word “Clone” anywhere in the code, but I am only trying to run a single report at a time through my ForEachLoop. I did a test, setting it to run for the one report that it stopped on last time and it runs successfully. Then I tried again to run the package and pass all values so that it runs the report again for each of them. It runs successfully one time, producing all 11 reports, then the next time stops at the third or fourth report. So it can execute the report, pass the parameter value and save the file as a PDF. It just doesn’t complete every time. Is there a setting somewhere that needs to be modified?
Arlene
There isn’t any setting that I’m aware of. It sounds like your SSRS server is having an issue with processing the reports. I recommend that you have your DBA take a look at the SSRS execution logs while you run the SSIS package and see what the underlying server error is that SSRS is generating. It could be something like a locking issue with your DB or perhaps out of memory or something. Diagnosing this kind of problem will likely require that you post to the SSRS forums as they will better be able to assist with SSRS server errors.
Thank you! Will do.