ETLBox allows you to create your own implementation of a source or destination. This gives you high flexibility if you need to integrate systems that are currently not included in the list of default connectors.
The CustomSource and CustomDestination are part of the ETLBox core package - you don’t need to reference any additional packages to use these connectors.
A custom source can generate any type of output you need. It needs two functions to work properly: a read function that generates one data row as output, and a reading completed predicate that returns true once you have reached the end of your data. Both functions receive the current progress count as an input parameter. Using the progress count is optional - it simply tells you how many rows the custom source has processed so far.
The CustomSource has an output buffer - this means that every data row can be cached before it is sent to the next component in the flow. You can restrict the maximum buffer size by setting MaxBufferSize to a value greater than 0. The default value is 100000 rows.
Let’s look at a simple example. Assume we have a list of strings, and we want to wrap each of these strings into an object before sending it into our flow. So for each string in our input array we create an object that we send into the flow. Once all elements are processed, the reading completed predicate returns true.
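A minimal sketch of this could look like the following - the row type MySimpleRow, the input list, and the MemoryDestination used to consume the rows are illustrative choices, and the exact namespaces depend on your ETLBox version:

```csharp
// using ETLBox.DataFlow.Connectors;  (namespace may differ between ETLBox versions)

public class MySimpleRow
{
    public int Id { get; set; }
    public string Value { get; set; }
}

var inputData = new List<string>() { "Test1", "Test2", "Test3" };

var source = new CustomSource<MySimpleRow>();
// The read function creates one output row per call; the progress count
// tells us how many rows have already been produced.
source.ReadFunc = progressCount => new MySimpleRow
{
    Id = progressCount + 1,
    Value = inputData[progressCount]
};
// The reading completed predicate returns true once all elements are processed.
source.ReadingCompleted = progressCount => progressCount >= inputData.Count;

var dest = new MemoryDestination<MySimpleRow>();
source.LinkTo(dest);
source.Execute();
dest.Wait();
```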
Like all other components in ETLBox, CustomSource also has a default implementation that uses ExpandoObject. The above example could be modified like this to work without any strongly typed objects:
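A sketch of the dynamic variant, under the same assumptions as above:

```csharp
using System.Dynamic;

var source = new CustomSource();  // equivalent to CustomSource<ExpandoObject>
source.ReadFunc = progressCount =>
{
    dynamic row = new ExpandoObject();
    row.Id = progressCount + 1;
    row.Value = inputData[progressCount];
    return row;
};
source.ReadingCompleted = progressCount => progressCount >= inputData.Count;
```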
This example demonstrates how to use a CustomSource to read data from a CSV file and process it within an ETL pipeline. The input file InputData.csv is read line by line; each line is split into fields, and the resulting rows are passed to the next component in the pipeline.
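A sketch of this approach - the two-column layout and the semicolon delimiter of InputData.csv are assumptions for illustration:

```csharp
using System.Dynamic;
using System.IO;

// Read the whole file upfront; for large files you could use File.ReadLines instead.
string[] lines = File.ReadAllLines("InputData.csv");

var source = new CustomSource();
source.ReadFunc = progressCount =>
{
    // Split the current line into its fields (assumed delimiter: ';')
    string[] fields = lines[progressCount].Split(';');
    dynamic row = new ExpandoObject();
    row.Id = fields[0];
    row.Value = fields[1];
    return row;
};
source.ReadingCompleted = progressCount => progressCount >= lines.Length;
```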
Instead of parsing your input data into an object, you can of course also simply pass an array into the flow. If we modify our example again to work with arrays, we get the following code:
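A sketch, again using the illustrative inputData list from above:

```csharp
var source = new CustomSource<string[]>();
// Each output row is a string array with two elements: an id and the value.
source.ReadFunc = progressCount => new string[]
{
    (progressCount + 1).ToString(),
    inputData[progressCount]
};
source.ReadingCompleted = progressCount => progressCount >= inputData.Count;
```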
The use of a custom destination is even simpler - a custom destination just calls an action for every received record. In this action you receive each incoming row as well as a progress count of the data received so far. It is your responsibility to do the further processing of the record: e.g. you could execute some code that writes the row into a database, or you could add it to an internal list and then convert this list into JSON. Though both could also be accomplished with the DbDestination or the JsonDestination, here is an example of the latter.
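A minimal sketch of this idea - the rows are collected in a list and serialized once the flow has finished; System.Text.Json is used here purely for illustration:

```csharp
using System.Text.Json;

var rows = new List<MySimpleRow>();

var dest = new CustomDestination<MySimpleRow>();
// The write action receives each incoming row together with the progress count.
dest.WriteAction = (row, progressCount) => rows.Add(row);

source.LinkTo(dest);
source.Execute();
dest.Wait();

// Once the flow has completed, turn the collected rows into JSON.
string json = JsonSerializer.Serialize(rows);
```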
The CustomDestination has an input buffer - this means that incoming data rows can be cached before they are actually processed by your destination. This happens when your processing takes longer than new data takes to arrive. You can restrict the maximum buffer size by setting MaxBufferSize to a value greater than 0. The default value is 100000 rows.
CustomDestination also works with the dynamic ExpandoObject. Simply use the default implementation when you want to work with ExpandoObjects. You can access each dynamic object in your WriteAction, together with the current progress count.
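A sketch of the dynamic variant; the property names match the ExpandoObject rows produced by the source examples above:

```csharp
var dest = new CustomDestination();  // equivalent to CustomDestination<ExpandoObject>
dest.WriteAction = (row, progressCount) =>
{
    // Cast to dynamic to access the properties of the ExpandoObject.
    dynamic r = row;
    Console.WriteLine($"Received row #{progressCount}: Id={r.Id}, Value={r.Value}");
};
```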
You can also use the CustomDestination with arrays. Within the WriteAction you have access to each incoming array row, along with the current progress count.
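A sketch matching the array source example above:

```csharp
var dest = new CustomDestination<string[]>();
dest.WriteAction = (row, progressCount) =>
{
    // row[0] and row[1] hold the two fields of the incoming array row.
    Console.WriteLine($"Row #{progressCount}: {row[0]} / {row[1]}");
};
```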
The CustomDestination forwards your incoming data row by row to your custom method. If you need to process your incoming data in batches, you can use the CustomBatchDestination instead.
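A sketch of what this could look like, assuming the batch destination exposes a BatchSize and a write action that receives one array of rows per batch (check the API of your ETLBox version for the exact property names):

```csharp
var dest = new CustomBatchDestination<MySimpleRow>();
dest.BatchSize = 100;  // assumed property controlling how many rows form one batch
dest.WriteBatchAction = batch =>
{
    // batch contains up to BatchSize rows at once.
    Console.WriteLine($"Received a batch with {batch.Length} rows");
};
```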