This tutorial demonstrates how to extract and transform data from multiple regions within an Excel worksheet and use them within a data mapper transform, merging the data into a single output file.
Let's assume we have a worksheet that looks something like this (Data.xlsx):
The worksheet contains 3 tables, one for each salesperson.
We want to combine them all into a single XML file complying with the XSD Schema (Sales.xsd):
Method 1
The easiest way to work with regions within an Excel worksheet is to create a table for each region.
Note: If changing the Excel worksheet is not an option then have a look at Method 2.
This can be done by selecting the cells in the region, the using the Insert->Table option.
You can then name this table by using the Formulas->Name Manager option.
The file Data_Version2.xlsx contains the original Data.xlsx file with the addition of tables.
Note: Named cell ranges can also be used instead of tables.
Now when you create a new Data Mapper file and import data from your Excel file you will see the tables you have created in the import list.
If we also add an XML target based on our XSD Schema (Sales.xsd) then after adding a few connections we end up with a transform that looks like this:
Executing this transform will give us the 'Fred' sales in XML format:
<Sales>
<Sale>
<Name>Fred</Name>
<Date>2005-01-01</Date>
<Desc>1000 Pens</Desc>
<Total>19.99</Total>
</Sale>
<Sale>
<Name>Fred</Name>
<Date>2005-01-02</Date>
<Desc>5 Laptops</Desc>
<Total>749</Total>
</Sale>
<Sale>
<Name>Fred</Name>
<Date>2005-01-03</Date>
<Desc>1 Stress ball</Desc>
<Total>5.49</Total>
</Sale>
</Sales>
In order to add the other 2 salesperson's data we add 2 more Excel reader components, selecting the appropriate table for each.
We want to combine these additional data sources into our single output XML file. We can do this by duplicating the output node 'Sale' on the XML writer and connecting each Excel source to its own copy.
Now when we run the transform we get the results from all 3 salespeople.
<Sales>
<Sale>
<Name>Fred</Name>
<Date>2005-01-01</Date>
<Desc>1000 Pens</Desc>
<Total>19.99</Total>
</Sale>
<Sale>
<Name>Fred</Name>
<Date>2005-01-02</Date>
<Desc>5 Laptops</Desc>
<Total>749</Total>
</Sale>
<Sale>
<Name>Fred</Name>
<Date>2005-01-03</Date>
<Desc>1 Stress ball</Desc>
<Total>5.49</Total>
</Sale>
<Sale>
<Name>Sue</Name>
<Date>2005-01-02</Date>
<Desc>200 Memory sticks</Desc>
<Total>1400</Total>
</Sale>
<Sale>
<Name>Sue</Name>
<Date>2005-01-05</Date>
<Desc>1 Label Printer</Desc>
<Total>35</Total>
</Sale>
<Sale>
<Name>James</Name>
<Date>2005-01-03</Date>
<Desc>25 Keyboards</Desc>
<Total>150</Total>
</Sale>
<Sale>
<Name>James</Name>
<Date>2005-01-05</Date>
<Desc>2000 Notepads</Desc>
<Total>450</Total>
</Sale>
<Sale>
<Name>James</Name>
<Date>2005-01-06</Date>
<Desc>3 Printer Cartridges</Desc>
<Total>169</Total>
</Sale>
<Sale>
<Name>James</Name>
<Date>2005-01-09</Date>
<Desc>1 Inspiring cat poster</Desc>
<Total>0.99</Total>
</Sale>
</Sales>
What if data in the Excel file changes?
If rows are added or removed from the tables within the Excel worksheet then when the transform is re-run it will incorporate the changes made in the Excel worksheet.
The columns are assigned based on there position, so the 1st column is assumed to be 'Date', the 2nd 'Description' the 3rd 'Total'. Additional columns can be added but changes to the ones defined in the data mapper will cause issues.
Method 2
If it is not possible to modify the Excel worksheet, then the Excel data sources within the Data Mapper transform can be defined using Custom Ranges.
Taking this approach you have to manually select the region within the worksheet i.e. Sheet1!A2:C5.
The 'Dynamically expand range' option
At this point it is important to consider what happens when data is added/removed from the worksheet.
If this option is NOT checked then the range of cells selected is fixed, adding or removing rows will still result in the data source returning the same number of rows in the DataMapper. This is useful when you are loading fixed lookup values, but not for dynamic data.
If this option IS checked then the range of cells selected dynamically changes depending on the data in the workbook. Starting on the first data row and moving down until an empty row is encountered.
Example
If the custom range was set to 'Sheet1!A2:C5' and the Excel sheet now looks like this
If 'Dynamically expand range' = false, the data for '1000 Pens', '5 Laptops' & '1 Stress ball' would be returned.
If 'Dynamically expand range' = true, the data for '1000 Pens', '5 Laptops', 1 Stress ball' & ' 25 Keyboards' would be returned. Note: '1 Label Printer' is NOT returned as the first empty row ends the table (Row 7).
Summary
If you are gathering data from multiple regions on an Excel worksheet then your best option is to set up either Excel Tables or Named Regions. These can then be imported into the DataMapper, and changes to the Excel data is reflected in the DataMapper.
If changing the Excel worksheet is not possible then you can import custom regions, which can be set to dynamically load data from the Excel worksheet so the DataMapper will import newly added Excel rows .
Download a Free Trial and Free Community Version:
https://www.liquid-technologies.com/trial-download
https://www.liquid-technologies.com/trial-download