To work with XML data, add the latest ETLBox.Xml
package to your project. The XML integration is based on System.Xml.
Note
All streaming connectors share a set of common properties. For example, instead of reading from or writing to a file, you can set ResourceType to ResourceType.Http or ResourceType.AzureBlob to read from or write to a web endpoint or an Azure blob. See Shared Functionalities for a list of the properties shared by all streaming connectors.
If you want to start with example code right away, you will find it in the recipes section for the XmlSource and XmlDestination. The components may also appear in other examples.
The XmlSource lets you read data from an XML source.
Let’s assume your xml file looks like this:
XML reading is based on the Microsoft XmlSerializer (using System.Xml.Serialization). You can use the standard XML serialization
attributes to influence how data is read by the XmlSerializer. For the example xml above, the following code could read the xml file:
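The effect of such annotations can be sketched with the XmlSerializer alone, without any ETLBox wiring. The MyRow class, its property names and the inline XML string are assumptions made for illustration; they are not the original sample.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical row type: one value comes from an XML attribute,
// the other from a child element.
[XmlRoot("MySimpleRow")]
public class MyRow
{
    [XmlAttribute]
    public int Col1 { get; set; }   // read from the Col1="..." attribute

    public string Col2 { get; set; } // read from the <Col2> child element
}

public static class Program
{
    public static void Main()
    {
        // Illustrative input, assumed for this sketch.
        const string xml = "<MySimpleRow Col1=\"1\"><Col2>Test1</Col2></MySimpleRow>";

        var serializer = new XmlSerializer(typeof(MyRow));
        using var reader = new StringReader(xml);
        var row = (MyRow)serializer.Deserialize(reader);

        Console.WriteLine($"{row.Col1}, {row.Col2}"); // prints "1, Test1"
    }
}
```

Because the XmlSource relies on the same serializer, a class annotated this way is the kind of row type the source would populate for each matching element.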
You can take full advantage of all Microsoft XmlSerializer attributes. Here is another example that uses XmlElement to map XML elements to a property in an object, and also an object type to access the attributes. It also shows how to set up the namespace.
XmlSource also supports the dynamic ExpandoObject. If you want to use it, you can define an ElementName that contains the data you actually
want to parse - as you normally are not interested in your root element. ETLBox will then look for this element and parse every occurrence of
it into an ExpandoObject and send it into the connected components.
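As a rough sketch of that combination, the following uses only System.Xml.Serialization. The Row and Measurement types, the element and attribute names, and the namespace URI http://example.com/rows are all hypothetical and chosen for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical root type bound to an assumed XML namespace.
[XmlRoot("Row", Namespace = "http://example.com/rows")]
public class Row
{
    // Maps the <Value> element to a property with a different name.
    [XmlElement("Value", Namespace = "http://example.com/rows")]
    public Measurement Reading { get; set; }
}

// Nested object type that exposes the element's attribute and text.
public class Measurement
{
    [XmlAttribute("Unit")]
    public string Unit { get; set; }

    [XmlText]
    public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative input, assumed for this sketch.
        const string xml =
            "<Row xmlns=\"http://example.com/rows\">" +
            "<Value Unit=\"kg\">42</Value>" +
            "</Row>";

        var serializer = new XmlSerializer(typeof(Row));
        using var reader = new StringReader(xml);
        var row = (Row)serializer.Deserialize(reader);

        Console.WriteLine($"{row.Reading.Text} {row.Reading.Unit}"); // prints "42 kg"
    }
}
```

Using a separate object type for the element is what makes its attributes reachable, since a plain string property could only hold the element text.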
Here is an example. If your xml looks like this:
You can parse the two elements with the following code:
Attributes can be accessed via their name and the prefix at_ (and text data outside of elements via the prefix tx_). If you would like to use a different prefix, you can adjust the AttributePrefixForDynamic and TextPrefixForDynamic properties. If you want the same behaviour as in previous ETLBox versions, you can set AttributePrefixForDynamic = "@" and TextPrefixForDynamic = "#".
When reading attributes with the dynamic approach in an ETLBox version prior to 2.7.1, the property names of attributes are prefixed with an @ sign. This makes these properties difficult to access, e.g. in a RowTransformation when converting the row into a dynamic object. So when we reuse the xml file from above again:
When we try to access the attributes in a RowTransformation, the following code won’t work:
Instead, we need to convert the relevant ExpandoObject into an IDictionary<string,object> first:
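The reason the direct access fails is a C# language rule: in source code, @Col1 is a verbatim identifier that simply means Col1, so a dynamic member access never looks up a property literally named "@Col1". The sketch below shows the dictionary workaround on a hand-built ExpandoObject that stands in for a row emitted by the source; the property names Col1 and Col2 are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class Program
{
    public static void Main()
    {
        // Hand-built stand-in for a row with an "@"-prefixed attribute name.
        IDictionary<string, object> seed = new ExpandoObject();
        seed["@Col1"] = "1";
        seed["Col2"] = "Test1";

        dynamic row = seed;

        // Regular property names are reachable through dynamic access.
        Console.WriteLine(row.Col2);

        // The "@"-prefixed name is only reachable through the dictionary view.
        var dict = (IDictionary<string, object>)row;
        Console.WriteLine(dict["@Col1"]);
    }
}
```

Inside a RowTransformation the same cast would be applied to the incoming row before reading the attribute values.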
If you want to access parts of the XML that are not part of your processed data, you can set the property CollectUnparsedData to true. By default, only data that can be written into the objects sent into the data flow is processed. Activating this feature will also read the rest of the XML file. You can access the unparsed data in the UnparsedData property of the XmlSource. If you use the GetNextUri/HasNextUri pattern to paginate through your source data, you can access the unparsed data of the current page in the StreamMetaData.UnparsedData property.
Here is an example for accessing UnparsedData directly in the component:
The next example shows how to access unparsed data in the StreamMetaData object:
In some cases, the source file contains elements with different names that all contain the data of interest. If you are using the dynamic approach, you can use the ElementNameRetrievalFunc to adjust the element name before reading the next element. The provided StreamMetaData object will contain the name of the next element inside the AdditionalData property.