Using SSIS to Import HTTP-Based XML Document

May 17, 2015

Introduction

The post, XML Document Sourced SSRS Report Using HTTP URL, demonstrated how a SQL Server Reporting Services (SSRS) data source can reference an Extensible Markup Language (XML) document located via a Hypertext Transfer Protocol (HTTP) URL. Another approach to reporting off an XML document is to first extract the XML data using SQL Server Integration Services (SSIS), store it in a relational engine such as Microsoft SQL Server, and then point your SSRS data sources to SQL Server.

Similar to the aforementioned post, the XML document that we will be using is the sitemap file of select SIFISO blogs, which is located at http://www.blogs.selectsifiso.com/sitemap.xml.

The aforementioned document looks as shown below:

[Screenshot: the sitemap XML document]
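For readers who want to poke at the shape of such a document outside of SSIS, here is a minimal Python sketch. It assumes the sitemap follows the standard sitemaps.org 0.9 format (a `urlset` of `url` entries); the sample document, its URLs and dates, and the `sitemap_rows` helper are all made up for illustration and are not taken from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the sitemaps.org 0.9 format; the URLs and dates are
# invented for illustration and do not come from the real sitemap.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/post-one/</loc>
    <lastmod>2015-05-17</lastmod>
  </url>
  <url>
    <loc>http://example.com/post-two/</loc>
    <lastmod>2015-05-10</lastmod>
  </url>
</urlset>
"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_rows(xml_text):
    """Return one dict per <url> element, keyed by child element name."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    rows = []
    for url in root.findall("sm:url", NS):
        # Strip the "{namespace}" prefix that ElementTree adds to each tag.
        rows.append(
            {child.tag.split("}", 1)[-1]: (child.text or "").strip() for child in url}
        )
    return rows


if __name__ == "__main__":
    for row in sitemap_rows(SAMPLE_SITEMAP):
        print(row)
```

Each `url` entry becomes one flat row, which is roughly the tabular shape a data flow works with.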

Source Code Changesets

The changesets associated with the source code for this article can be found on GitHub and CodePlex as shown below, respectively:

[Image: links to the GitHub and CodePlex changesets]

Walkthrough

Okay, let’s get started.

Add a new SSIS package and assign it a name (I have named mine ImportSitemap_data).

[Screenshot: the new SSIS package]

Navigate to SSIS Toolbox in SQL Server Data Tools (SSDT) SSIS Designer. Drag and drop a data flow task into the control flow. I prefer to assign meaningful names to my SSIS components so I have renamed my data flow task to Data Flow Task – Import Sitemap data.

[Screenshot: the renamed data flow task in the control flow]

Let’s configure the newly added data flow task.

Click on the Data Flow tab next to the Control Flow tab. Navigate to the SSIS Toolbox window. Drag and drop an XML Source component into the data flow. Rename the XML Source component to XML Source – blog sitemap.

At this point your package should look as follows:

[Screenshot: the data flow with the XML Source component]

Let’s configure the XML Source component.

Under the Connection Manager tab, set the Data access mode property to XML file location.

Set the XML location to http://www.blogs.selectsifiso.com/sitemap.xml.
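If you want to check that the URL is reachable before pointing the package at it, a quick test outside of SSIS can help. The sketch below is only an assumption about how you might do that in Python; `fetch_xml` and its `opener` parameter are hypothetical names of my own, and the `opener` argument exists purely so the function can be exercised without a network connection.

```python
import urllib.request


def fetch_xml(url, opener=urllib.request.urlopen, timeout=30):
    """Download the document at `url` and return its raw bytes.

    `opener` defaults to urllib's urlopen; passing a different callable
    (hypothetical, for testing) avoids a real HTTP request.
    """
    with opener(url, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_xml("http://www.blogs.selectsifiso.com/sitemap.xml")` would attempt to download the sitemap used in this post, assuming the address still resolves.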

You will then be required to provide an XML Schema Definition (XSD) file. The easiest way is to generate the XSD file from the document at the XML location. While generating the file, you will be asked for a location to save it (I saved mine on the C drive).

[Screenshot: the XML Source component with the XML location and XSD file set]
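To get a feel for what a schema inferred from a sample document captures, here is a rough Python analogue. It is not what SSIS does internally and it does not produce an XSD; it only reports which child elements appear under a repeating record element and whether every record carries them. The `infer_columns` name and the `record_tag` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def infer_columns(xml_bytes, record_tag):
    """Report child element names found under each `record_tag` element."""
    root = ET.fromstring(xml_bytes)
    counts = Counter()
    records = 0
    for elem in root.iter():
        # Compare on the local name, ignoring any "{namespace}" prefix.
        if elem.tag.split("}", 1)[-1] != record_tag:
            continue
        records += 1
        # Count each child name at most once per record.
        counts.update({child.tag.split("}", 1)[-1] for child in elem})
    # A column is "required" only if every record carries it.
    return {
        name: ("required" if n == records else "optional")
        for name, n in counts.items()
    }
```

For a sitemap in the standard format you would call it with `record_tag="url"`; elements that only some entries carry come back as optional.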

Click on the Columns tab to refresh the metadata and then click OK to commit the configuration settings.

[Screenshot: the Columns tab of the XML Source component]

That basically concludes the settings for extracting the XML data.

However, for the purposes of troubleshooting and ensuring that the package works, we can go on and add a Row Sampling transformation component.

Connect the XML source component to Row Sampling.

[Screenshot: the XML Source component connected to Row Sampling]

Configure the Row Sampling component as follows:

[Screenshots: Row Sampling configuration]
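Conceptually, Row Sampling picks a fixed number of rows at random and splits the input into a selected and an unselected output. A minimal Python sketch of that idea follows; the function name, the parameter names, and the sample size in the usage note are my own assumptions, not the settings used in this package.

```python
import random


def row_sample(rows, number_of_rows, seed=None):
    """Split `rows` into (selected, unselected), picking rows at random.

    Passing the same `seed` gives the same sample on every run.
    """
    rows = list(rows)
    rng = random.Random(seed)
    # Never ask for more rows than the input holds.
    k = min(number_of_rows, len(rows))
    picked = set(rng.sample(range(len(rows)), k))
    selected = [r for i, r in enumerate(rows) if i in picked]
    unselected = [r for i, r in enumerate(rows) if i not in picked]
    return selected, unselected
```

For example, `row_sample(rows, 10, seed=1)` would return ten randomly chosen rows plus the remainder, and the same ten on every run because the seed is fixed.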

Finally, add a data viewer to take a look at the data as you extract it during runtime:

[Screenshot: the data viewer showing the extracted rows]

Your data flow should look as follows:

[Screenshot: the completed data flow]
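The walkthrough stops at viewing the data, while the introduction mentions storing it in a relational engine. As a rough illustration of that loading step, here is a Python sketch that uses the built-in sqlite3 module as a stand-in for SQL Server. The table name `sitemap_url`, its two columns, and the `load_rows` helper are hypothetical, and in an actual package you would add a destination component to the data flow instead.

```python
import sqlite3


def load_rows(conn, rows):
    """Insert (loc, lastmod) tuples into a staging table; return its row count."""
    # Hypothetical staging table; SQLite stands in for SQL Server here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sitemap_url (loc TEXT NOT NULL, lastmod TEXT)"
    )
    conn.executemany(
        "INSERT INTO sitemap_url (loc, lastmod) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM sitemap_url").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    sample = [("http://example.com/post-one/", "2015-05-17")]
    print(load_rows(connection, sample))
```

Once the rows sit in a table, a report data source can query them like any other relational data.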

Conclusion

That’s all folks.

I hope you were able to learn something from this post.

Until next time, cheers.
