Warning: is_writable() [function.is-writable]: open_basedir restriction in effect. File(/f5/sqllion/public/wp-content/uploads/shareaholic/) is not within the allowed path(s): (/fs3d/sqllion/:/nfsn/apps/php53/lib/php/:/nfsn/apps/php5/lib/php/:/nfsn/apps/php/lib/php/) in /fs3d/sqllion/public/wp-content/plugins/sexybookmarks/sexy-bookmarks.php on line 335
Programming Foreach Loop Container – Enumerating Excel Files | SQL Lion
Home > SSIS, SSIS programming > Programming Foreach Loop Container – Enumerating Excel Files

Programming Foreach Loop Container – Enumerating Excel Files

Sometimes we were not aware of the contents of any object and we have to do certain tasks as many times as the no. of contents of that object. For example, a folder contains some text files with employee information data and we have to load all the text files data to our SQL server employee table. Here, we don’t know about the count of the source but we have to roll over every file. For this type of situation, SSIS have a unique container called “Foreach Loop Container” that will enumerate with each text file in the folder.

It can be used in some other cases too like enumerating each sheet of a excel file, or parsing each row of a table, enumerating each child node of a given tree, etc. In SQL server 2000 DTS package, it is bit difficult to loop over any task but fortunately SSIS has a special container to accomplish such tasks.

Benefits from SSIS Foreach Loop Container:

  • It keeps the package neat and clean (avoids building of repeated tasks)
  • It can iterate any object dynamically (even if the contents count is not available)

The foreach loop container acts as a repeating control flow in a package. Its operations are similar to work of Foreach keyword in any advanced programming language. We have a definite type of enumerator for each type of objects. Foreach loop container uses one of this enumerator to carry out its operation:

Below are the lists of various types of enumerators provided by SSIS Foreach Loop Container:

  1. Foreach File Enumerator: It enumerates files in a folder. The plus point here is it can traverse through subfolders also.
  2. Foreach Item Enumerator: It enumerates items in a collection. Like enumerating rows and columns in an Excel sheet.
  3. Foreach ADO Enumerator: Useful for enumerating rows in tables.
  4. Foreach ADO.NET Schema Rowset Enumerator: To enumerate through schema information about a data source. For example, to get list of tables in a database.
  5. Foreach From Variable Enumerator: Used to enumerate through the object contained in a variable. (if the object is enumerable)
  6. Foreach NodeList Enumerator: Used to enumerate the result set of an XML Path Language (XPath) expression.
  7. Foreach SMO Enumerator: It enumerates through SQL Server Management Objects (SMO) objects.




Let’s consider the below example;

Here we have a set of Excel files in a folder containing different data but of same structure. Our requirement is to load all the data from these excel files to our destination table in the SQL Server database.

For this example, the excel files contain information about the sales of Pencil across different countries. We have a total of 5 excel files in the folder named “SampleExcelFiles“. To achieve this, we have to take the help of FOR EACH Loop Container provided by SQL Server Integration Service (SSIS).

[Note: All other components and tasks are described in the earlier article named "SSIS Programming Basic"]

After creation of necessary components, the “Foreach Loop Container” is created where only one Data Flow Task has been kept. The single Data Flow Task will load all the excel files data to the destination after changing the source connection string each time the loop iterates.

For this, one more package level variables have to be created named:

  • CurrentExcelPath: This will keep track of the path of current excel file after each iteration.

Create one Excel connection manager to handle all these excel files. Take one existing excel file to avoid validation errors. This workbook is just for validation purpose. At runtime i.e. while package will execute the Connection string will automatically be updated to the value of CurrentExcelPath. Or else you can keep it empty provided DelayValidation property of this connection manager is set to True. In this example the connection manager name is “Excel Connection Manager“.

In the connection manager property, select “Expressions” and then set the first property as “ConnectionString“.

Property Expression Editor

Click on the ellipse button, and in the “Expression Builder” dialog box, Enter the following code:

“Provider=Microsoft.Jet.OLEDB.4.0;Data Source=” + @[User::CurrentExcelPath] + “;Extended Properties=\”Excel 8.0;HDR=Yes\”"

Expression Builder

Note: Here in this expression, the [User::CurrentExcelPath] has been used to complete the expression at runtime.

Then drag the “Foreach Loop Container” to the package area and double click to open the Foreach Loop Editor. On the Collection page, select For Foreach File Enumerator. In Enumerator Configuration, select the Folder where all the excel files are kept or stored. State the File Filter to be considered. For this example, the folder path is “I:\Site work\SampleExcelFiles” and the File filter is “*.xls”.

Foreach Loop Editor

Then in the Variable Mappings page, map the Index 0 to the user defined string variable that is defined above for storing the current Excel file path i.e. CurrentExcelPath. Click OK to close the Foreach Loop Editor.

Now create the dataflow task that will use the “Excel Connection Manager” as its connection manager to handle Excel files. Add the Data Flow components like Source, row counter and destination and then execute the package.

Execute Package

Output:

Output



Let’s do the same in SQL Server Integration Programming.

The creation name for Foreach Loop Container is {STOCK:FOREACHLOOP} and for Excel Source is {DTSAdapter.ExcelSource.1}.

Here we will see how to add Excel as source and how to change its connection string dynamically by integration programming.

The code below shows the above package programmatically and the code itself is self-descriptive.

Add the below References before developing the code.

Microsoft.SqlServer.Dts.Design
Microsoft.SqlServer.DTSPipelineWrap
Microsoft.SQLServer.DTSRuntimeWrap
Microsoft.SQLServer.ManagedDTS

Programming Integration Service in SQL Server 2005

Expand image C# .Net Expand
Expand image VB .Net Expand

Programming Integration Service in SQL Server 2008

Expand image C# .Net Expand
Expand image VB .Net Expand

 

To make your work more easy and effortless, SQL Lion team comes up with a unique tool named “SSIS Component Explorer” that will help you in retrieving the Creation Name, Component Class ID, etc about any component like Control Flow Tasks, Data Flow Transformations, Connections Managers, Log Providers, etc in SSIS package. It also provides code snippets for each component.
Download
the free version from here.
Categories: SSIS, SSIS programming
  1. June 26th, 2009 at 12:32 | #1

    The style has a professional look and formal navigation process…

  2. kkluwang
    January 5th, 2010 at 21:04 | #2

    I’ve tried this piece of code.It works fine but my requirement is to provide the user to map the fields in the web page. I’ve a datatable which has the selected excel column names as well as the selected sql table column names. I want to override the column names while mapping. Is this possible? I’m not able to access the column names directly .How do i customize the code below:

    //map external metadata to the inputcolumn
    foreach (IDTSInputColumn90 inputColumn in input.InputColumnCollection)
    {
    IDTSExternalMetadataColumn90 exMetaColumn = (IDTSExternalMetadataColumn90)input.ExternalMetadataColumnCollection[inputColumn.Name];
    inputColumn.ExternalMetadataColumnID = exMetaColumn.ID;
    }

    Please reply urgently…………………

  3. Bob
    April 21st, 2010 at 04:54 | #3

    What does the data flow component look like? I cannot get the excel source to work.

  4. April 21st, 2010 at 09:33 | #4

    Hi Bob,
    To check out how Data Flow component works, follow the link
    http://www.sqllion.com/2009/05/ssis-programming-basic/

  5. August 3rd, 2010 at 20:08 | #5

    Arun,

    I invite you to add this to the SQL Server Technet wiki.

    http://social.technet.microsoft.com/wiki/contents/articles/sql-server-overview.aspx

  6. zizou
    August 26th, 2010 at 17:23 | #6

    the package fails when i run it i don’t know why instead of green i have red

  7. August 26th, 2010 at 17:26 | #7

    Hi zizou,
    if you can post the exact error, then only i may suggest you some solution.
    The code is tested on both SQL 2005 and SQL 2008; its working fine.

  8. vinoth
    August 30th, 2010 at 10:56 | #8

    HI Friends,
    I need a urgent help, i have a xml data, i want to import this data into data base using ssis.But i want to use for loop for checking this data and import to database.

    checking means if any rows contain wrong data, that is not import to the correct it will be import into error created table. how can i do this pls help me with screnn shots

  9. August 30th, 2010 at 16:34 | #9

    Hi Vinoth,
    Use error handling in SSIS to check for in-correct data and redirect it to the error handling tables.
    Hope that will resolve your problem. :)

  10. vinoth
    August 31st, 2010 at 12:59 | #10

    Hi Nick,
    Thanks for your immediate reply, could u pls give some small example for this, because i am very new for this ssis.

    pls help me thanks

  11. vinoth
    September 1st, 2010 at 10:16 | #11

    HI Friends,
    I need another urgent help with example.
    Task:-
    1.Create ssis package with import multi format file into same table like excel and xml and db resources.
    2.And create loop for checking this data and import into correct table and error table
    3. And create meta data and read this data path from meta table to import

    pls any one my friends help me with small examples

    My email is : vinothlilly@gmail.com

  12. vinesh
    December 3rd, 2010 at 16:17 | #12

    hi

    we are migrating a datawarehouse application from asp – sql 2000 to dotnet2.0 – sql 2005
    currently we are in the process of coming up with an optimum solution for backend design using sql server 2005.
    This application pulls data from 3 different source systems (oracle,sql server and mainframe db2).

    we have 3 different staging databases corresponding to each of the source application database.
    For eg : RetailGarments application is an oltp application with backend oracle database .
    we have a staging SQL server database called Retailgarments_stg for our datawarehouse application.
    similarly OnlineTravelBooking is an OLTP application with DB2 database .we have corresponding
    OnlineTravelBooking_Staging database for staging this data.

    Each of these source systems have more than 100 tables and we are currently pulling most of the
    tables to our side from the source .

    These staging data from different datasources are accumulated in to a cleansed Schema database
    which is used for Reporting and other OLAP requirements.

    Our question is related to data pull from the source system.
    Current SQL Server database pulls these data from sorce system using Stored proc dynamic queries containing
    link server openquery.

    We would like to improve the performance as part of sql server 2000 to sql 2005 migration
    and would like to use SSIS features for this .

    Stored Proc approch that we currently use for SQL Server 2000 helps us to dynamically frame the query
    through driver tables .(looping through the list of tables for each source system and framing corresponding
    table query using appropriate columns)

    As part of new SQL Server 2005 migration , we were planning to use Source – Destination data transformation methods,
    Can we use foreach loop or for loop for this ?

    Is there any better method for this ..?

    AIM : If we have 90 tables for a source system, we would like to iterate and pull data for 30 tables each
    (3 data pull each having 30 tables) as 3 seperate parallel data pulls.

  13. December 3rd, 2010 at 16:49 | #13

    Hi vinesh,
    It’s not a big deal. Your purpose is to load all the tables from the source server to your destination by using SSIS.
    Use foreach loop container and use “Foreach ADO Enumerator” or “Foreach ADO.NET Schema Rowset Enumerator” enumerator to iterate through each table.
    And inside this container use a script task to generate a simple package with one dataflowtask to populate data for one table and run it.
    And regarding 30 tables per run, you can set some conditions but i would suggest to not have anything like that because it is going to load your tables one by one.
    Hope this will help u.

  14. Thirmal
    July 5th, 2011 at 11:33 | #14

    Hi,
    Thanks for sharing knowleged. Please can you explaine with Access Files?
    My Requirement is i have three access files in different locations having same tables i need to import data into MS Sql Server Using SSIS/ Can you explaine please

  15. Mark
    September 4th, 2011 at 23:40 | #15

    I enjoyed this article and found it very easy to use. This stuff isn’t hard but it can sometimes seem that way due to poor article writing – yours doesn’t suffer from that!

    Thanks

  16. December 5th, 2011 at 11:02 | #16

    Thank you for every other fantastic post. Where else may anybody get that kind of information in such an ideal method of writing? I have a presentation subsequent week, and I am at the search for such information.

  17. Joby
    December 7th, 2011 at 19:09 | #17

    Thank you for the post, it helped me to solve my problem.. Thanks again

  1. August 19th, 2011 at 10:18 | #1
  2. August 19th, 2011 at 10:23 | #2
  3. June 1st, 2012 at 20:44 | #3
You must be logged in to post a comment.