Using a Script Component Source to Consume a WebService
Thanks to Dennis and Jim for this original post at: http://dennysjymbo.blogspot.pt/2013/05/using-script-component-source-to.html
</DisplayStats> This comes from an ASP.NET web service I built using SOAP:
You can see from this screenshot that the web service takes one parameter, offerDate, with a date format. We're going to want to make this a variable that we can alter at run time every time this package is executed. This can be done through values returned from an Execute Sql Task from a database, set via a function, etc. Regardless of how its implemented, in this example were going to hard code the variable to 03/28/2013:
On the script screen, we want to make sure to add our SSIS variable in ReadOnlyVariables:
This screen helps us to define what our output is going to look like. The columns that will represent our XML elements are as follows: WEBLOGIN DT_STR 50, DATEDT_DBTIMESTAMP, CALLSOFFERED DT_I4, CALLSANSWERED DT_I4 and CALLSMISSEDDT_I4. With this all set up, we can go back to the script screen, click on the Edit Script button and begin coding. In the main.cs file, paste the following code:
Let’s now run the package and view the results:
The XML data is now in our data flow and can be sent to a data destination. No physical XML file was needed, which means we didn’t have to rely on an XML Source to get this.
Using HASHBYTES() to compare columns
Recently, while at a client engagement, I was building some SSIS packages an issue came up where they didn?t want to use the CHECKSUM() function in TSQL to do column comparisons because the results can be inaccurate on some rare occasions. I personally have never come across this but others here at Pragmatic Works have. So we have two options freely available to work around this issue. The first is the third party component plugin that you can get free at codeplex called Multiple Hash. The client wasn?t comfortable with having to install this component on multiple servers throughout the environment so that option wasn?t available to me. Instead I had to use the HASHBYTES() function in TSQL.
HASHBYTES() is much more reliable than checksum when it comes to producing accurate results, but it comes at a slight cost.
The first thing to note is how to construct the HASHBYTES() function. In the first part you tell the function which algorithm you are going to use. I?m using SHA1, but be aware that they single tics ? ? are required followed by a comma. Then you must concatenate the columns you wish to use together as seen below.
There you can see its already a bit more arduous than using CHECKSUM(), but not that big of a deal to concatenate a bunch of columns. Lets look at the results.
Uh-Oh here is our first problem. HASHBYTES() doesn?t work with NULL values inside any columns. So we?ve got to handle the Nulls in our query using the ISNULL() function.
Now the results look like this:
Then next thing that you have to look at is how HASHBYTES() handles(or more accurately doesn?t handle) Data types. Here the ID column is an INT data type, but the same holds true for any non-string data type.
We get an error saying that the data type is wrong for the HASHBYTES() function
So now we have to CAST every column that is a non-string data type.
Now after this fix our results look better.
So as you can see already there will be a good deal more T-SQL coding involved with using HASHBYTES then with CHECKSUM(). But this isn?t all. The last little gotcha isn?t quite as obvious as the first two. Lets go back to our Null handling query.
I?ve gone and edited the data some for this example. Please also note that HASHBYTES() is case sensitive meaning that if you have the same spelling but different casing at an individual character level the hash value returned will be different.
The rows are different from one another but when concatenated together for the HASBYTES() function they produce the same exact value. So to handle this we have to update our code again. We are going to add a rarely used character to the concatenation so ensure that they results will return correctly.
Here I chose to use a pipe to basically delimit the columns thereby making them different from each other. The result is much better.
If we take a quick look at results of the two concatenations we can see why we get the different results
The delimited column is obviously different from one row to the next, and the Non-Delimited column is exactly the same for each row.
While HASHBYTES() is definitely more accurate and reliable than using CHECKSUM(), there are some hoops to jump through to insure you get the results you need. I hope this helps you guys out.