Row Validation
The RowValidation transformation in ETLBox allows you to validate rows in your data flow based on customizable rules. This ensures data quality by enforcing conditions such as non-null values, numeric ranges, or custom validation logic on individual columns or entire rows. You can apply validation rules at both the row and column level, with invalid rows being routed separately from valid rows for distinct processing.
Overview
Validation in RowValidation can be done in two primary ways:
- Row-Level Validation: Define a predicate that evaluates whether a row is valid.
- Column-Level Validation: Apply validation rules to individual columns using the ValidateColumns property.
Built-in validation methods include IsNotNull, IsNull, IsEmpty, IsNotEmpty, IsNumeric, IsPositive, IsNegative, IsZero, IsNotZero, IsEquals, IsNumberBetween, IsDate, IsDateBetween, IsInList, IsNotInList, IsInListIgnoreCase, IsNotInListIgnoreCase, IsBool, and Custom. Custom allows you to write any .NET code as a validator, offering complete flexibility for custom validation logic.
Row-Level Validation
The ValidateRowFunc property allows you to define a custom predicate that evaluates whether a row is valid. This is useful when row-level validation logic is required beyond individual column rules.
For example, a predicate can treat any row where the Salary is less than 0 as invalid while skipping the check for the TEST record.
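A minimal sketch of such a predicate in plain C#, without the ETLBox wiring. The Employee class, its Name and Salary properties, and the idea that the TEST record is identified by its Name are assumptions made for illustration; the sketch also assumes that ValidateRowFunc accepts a predicate of the shape Func<TRow, bool>.

```csharp
using System;

// Hypothetical row type, used only for illustration.
public class Employee
{
    public string Name { get; set; } = "";
    public decimal Salary { get; set; }
}

public static class RowRules
{
    // Returns true when the row is valid.
    // Assumption: the TEST record is identified by its Name and is exempt
    // from the salary check; every other row needs a non-negative Salary.
    public static readonly Func<Employee, bool> IsValidRow =
        row => row.Name == "TEST" || row.Salary >= 0;
}
```

In a data flow, a predicate like RowRules.IsValidRow would be assigned to the ValidateRowFunc property of the transformation.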
Column Validation
For a strongly typed object, you can specify column validation attributes directly on the properties of the class.
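The general mechanism behind attribute-based column rules can be sketched with plain .NET reflection. This is not the ETLBox implementation: NotNullRuleAttribute, Customer, and AttributeValidator are hypothetical names that stand in for the library's own attributes and row types.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical marker attribute; it stands in for a library-provided validation attribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class NotNullRuleAttribute : Attribute { }

// Hypothetical strongly typed row.
public class Customer
{
    [NotNullRule]
    public string? Name { get; set; }

    public string? Comment { get; set; } // no rule attached, so never checked
}

public static class AttributeValidator
{
    // Returns the names of all annotated properties whose value is null.
    public static List<string> FindNullViolations(object row)
    {
        var violations = new List<string>();
        foreach (PropertyInfo prop in row.GetType().GetProperties())
        {
            bool hasRule = prop.GetCustomAttribute<NotNullRuleAttribute>() != null;
            if (hasRule && prop.GetValue(row) == null)
                violations.Add(prop.Name);
        }
        return violations;
    }
}
```

Declaring rules on the class keeps them next to the data they describe, which is why this style suits strongly typed rows.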
Dynamic Object Validation Example
For dynamic objects, validation rules can be added manually using the ValidateColumns property, as attributes cannot be directly applied to dynamic objects.
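Because a dynamic row carries no class definition, its rules have to be kept outside the row and matched by property name. The sketch below illustrates that idea with an ExpandoObject in plain C#; the DynamicRules class, the rule dictionary, and the Name and Age properties are assumptions for illustration and do not reflect the actual shape of ValidateColumns.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class DynamicRules
{
    // Rules keyed by property name; each rule returns true when the value is acceptable.
    public static readonly Dictionary<string, Func<object?, bool>> Rules =
        new Dictionary<string, Func<object?, bool>>
        {
            ["Name"] = value => value != null,
            ["Age"] = value => value is int age && age > 0,
        };

    // A row is valid when every rule passes for the property it names.
    // A property that is absent from the row is treated like a null value here.
    public static bool IsValid(ExpandoObject row)
    {
        var values = (IDictionary<string, object?>)row;
        foreach (var rule in Rules)
        {
            values.TryGetValue(rule.Key, out object? value);
            if (!rule.Value(value)) return false;
        }
        return true;
    }
}
```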
IgnoreMissingProperties
When working with dynamic objects, the IgnoreMissingProperties property can be set to true to prevent errors when a property specified in ValidateColumns is missing from a row.
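The effect of such a switch can be sketched in plain C# as follows. The MissingPropertyDemo class and its ignoreMissing parameter are hypothetical. The sketch assumes that a skipped rule counts as passed and that a missing property otherwise raises an exception; the exact behavior and exception type in ETLBox may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class MissingPropertyDemo
{
    // Checks that the named property holds a non-null value.
    // ignoreMissing mirrors the idea behind IgnoreMissingProperties:
    // when true, a property that is absent from the row is not validated at all.
    public static bool CheckNotNull(ExpandoObject row, string propertyName, bool ignoreMissing)
    {
        var values = (IDictionary<string, object?>)row;
        if (!values.TryGetValue(propertyName, out object? value))
        {
            if (ignoreMissing) return true; // assumption: a skipped rule counts as passed
            throw new InvalidOperationException(
                $"Property '{propertyName}' is missing from the row.");
        }
        return value != null;
    }
}
```

This matters mostly for sources with a varying set of columns, where not every row carries every property.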
SkipValidationOfRowAfterFirstError
The SkipValidationOfRowAfterFirstError property allows you to stop further validation on a row as soon as the first error is encountered. This can improve performance in scenarios where full validation is not necessary after the first failure.
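The trade-off is easiest to see in a small model of the validation loop. The sketch below is plain C# and not the ETLBox implementation; ShortCircuitDemo and the stopAfterFirstError parameter are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ShortCircuitDemo
{
    // Runs the rules in order and returns the indexes of the rules that failed.
    // With stopAfterFirstError set, the loop ends at the first failing rule,
    // so later rules are never evaluated for that row.
    public static List<int> Validate<T>(
        T row, IReadOnlyList<Func<T, bool>> rules, bool stopAfterFirstError)
    {
        var failed = new List<int>();
        for (int i = 0; i < rules.Count; i++)
        {
            if (rules[i](row)) continue; // rule passed
            failed.Add(i);
            if (stopAfterFirstError) break;
        }
        return failed;
    }
}
```

Stopping early saves work on rows that are rejected anyway, at the cost of reporting only the first problem rather than all of them.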
AddValidationErrorToRow
The AddValidationErrorToRow property determines whether validation errors should be attached to the row object. This is useful for logging or debugging invalid data.
- For POCOs (Plain Old CLR Objects), validation errors are stored in any property that is of type ValidationResult (with any property name).
- For dynamic objects, the errors are added as a dynamic property called ValidationResult.
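The two storage conventions can be sketched in plain C# as follows. The ValidationResult class below is a hypothetical stand-in for the library's own type, and Order, ErrorAttacher, and the method names are likewise assumptions for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;

// Hypothetical stand-in for the library's ValidationResult type.
public class ValidationResult
{
    public List<string> Errors { get; } = new List<string>();
}

// Hypothetical POCO: the target property is found by its type, so its name is free.
public class Order
{
    public int Quantity { get; set; }
    public ValidationResult? Problems { get; set; }
}

public static class ErrorAttacher
{
    // POCO case: store the result in the first writable property of type ValidationResult.
    public static void AttachToPoco(object row, ValidationResult result)
    {
        var target = row.GetType().GetProperties()
            .FirstOrDefault(p => p.PropertyType == typeof(ValidationResult) && p.CanWrite);
        target?.SetValue(row, result);
    }

    // Dynamic case: add the result under the fixed property name "ValidationResult".
    public static void AttachToDynamic(ExpandoObject row, ValidationResult result)
    {
        ((IDictionary<string, object?>)row)["ValidationResult"] = result;
    }
}
```

Either way, the destination that receives invalid rows can read the errors directly from the row instead of re-running the checks.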