Excel
The Excel connector in ETLBox makes it easy to read from or write to Excel files. Use ExcelSource to pull data into your workflows or ExcelDestination to create and update Excel files. With flexible options for headers, column mapping, and dynamic objects, it simplifies working with Excel in ETL processes.
Overview
The ETLBox.Excel
library enables reading data from Excel files into a data flow (ExcelSource
) or writing data to Excel files (ExcelDestination
). It is ideal for workflows that require Excel integration but is limited in performance and maximum row count due to Excel’s constraints. For large datasets, consider using the CsvDestination
to create CSV files, which Excel can also open.
Note
ResourceType
to ResourceType.Http
or ResourceType.AzureBlob
to read from or write to a web endpoint or Azure Blob Storage. See Shared Functionalites for details.The Excel connector provides the following components:
ExcelSource
ExcelSource
reads data from .xls
or .xlsx
files. It allows specifying sheet names and ranges to restrict the scope of data processing. By default, it assumes the first row contains headers, which are matched with the property names of the target object. However, you can override this behavior using attributes or runtime configurations. It also supports dynamic objects for scenarios where the structure of data is not predefined. Blank rows can be ignored, and you can even process files without headers by explicitly specifying column mappings.
ExcelDestination
ExcelDestination
writes data into .xlsx
files, providing flexibility to configure headers, specify target sheet names, and append data to existing files. You can customize how object properties map to Excel columns and use dynamic objects for flexibility. It is important to note that ExcelDestination stores all data in memory before writing it to the file, which can impact performance for large datasets.
ExcelSource
ExcelSource
reads data from an Excel file and maps it to objects. By default, it reads from the first sheet, includes a header row, and maps headers to object property names. However, you can adjust this behavior to suit your requirements, including custom column mappings, dynamic object processing, and reading files without headers.
Basic Usage
For an Excel file like this:
Col1 | Col2 |
---|---|
1 | Test1 |
2 | Test2 |
3 | Test3 |
Define a POCO and use ExcelSource
to load the data:
Using the ExcelColumn
Attribute
You can define mappings explicitly using the ExcelColumn
attribute:
Alternatively, use the ExcelColumns
property for runtime configuration:
Reading Excel Without Headers
For files without headers:
1 | Test1 |
2 | Test2 |
3 | Test3 |
Set HasNoHeader
to true
and define column mappings explicitly:
Ignoring Empty Rows
To skip blank rows when reading from an Excel file, set the IgnoreBlankRows
property to true. For example:
This ensures that rows with completely empty cells are excluded from processing.
Using Dynamic Objects
For scenarios where data structure is unknown or dynamic:
This creates ExpandoObject
instances, with properties inferred from headers or set dynamically.
Reading Field Headers
To read headers without processing rows:
ExcelDestination
ExcelDestination
writes data to .xlsx
files and supports a range of configuration options. Headers can be suppressed, specific sheet names can be defined, and data can be appended to existing files. By default, it writes headers based on the property names of the input object.
Writing Simple Data
Create an Excel file with 3 rows:
This generates:
Col1 | Col2 |
---|---|
1 | Test1 |
2 | Test2 |
3 | Test3 |
Using the ExcelColumn Attribute
The ExcelDestination
can utilize the ExcelColumn
attribute to map object properties to specific columns in the Excel file. For dynamic objects, you can use the ExcelColumns
property. For example:
This generates:
Header2 | Header1 |
---|---|
Test1 | 1 |
Test2 | 2 |
Test3 | 3 |
Note
ExcelColumn
attribute for the ExcelDestination
, all ColumnIndex
values must be set. If any ExcelColumn
lacks a ColumnIndex
, an exception will be thrown. If you want partial column index definitions or need to reorder columns dynamically, use the ColumnTransformation for more flexibility.For dynamic objects, configure the ExcelColumns
property directly:
Using ColumnTransformation for Partial Index Definitions
If you need to define only some ColumnIndex
values or reorder columns, use the ColumnTransformation
for added flexibility. This allows partial definitions without triggering exceptions, as ColumnTransformation handles the remaining logic. For example:
Please note that the ColumnTransformation
converts the incoming POCO into a dynamic object.
Configuration Options
Suppress Headers
To suppress headers:
Specify Sheet Names
Define the sheet name:
Append Data to Existing Files
Append data to a specified sheet:
Using Dynamic Objects
Dynamic object usage mirrors ExcelSource
: