The CrossJoin allows you to combine every record from one input with every record from the other input. This lets you simulate the behavior of a cross join in SQL (also known as a Cartesian product).
For example, if your left input has the records 1 and 2, and your right input has the records A, B and C, the CrossJoin will combine 1 with A, B and C, and 2 with A, B and C.
The CrossJoin is a partial blocking transformation. All rows sent to the InMemory target are loaded into memory before the actual join can start. After that, every row arriving at the passing target is joined with every row of the in-memory table using the cross join function. The InMemory target should always receive the input with the smaller amount of data, which reduces memory consumption and processing time. The passing target of the CrossJoin does not store any rows in memory.
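The partial blocking behavior described above can be sketched in plain Python. This is a language-neutral illustration and not the library's API: the function `cross_join` and the parameters `in_memory_rows`, `passing_rows` and `join_func` are hypothetical names chosen for this example.

```python
from typing import Callable, Iterable, Iterator, TypeVar

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


def cross_join(
    in_memory_rows: Iterable[A],
    passing_rows: Iterable[B],
    join_func: Callable[[A, B], R],
) -> Iterator[R]:
    """Hypothetical sketch of a partially blocking cross join."""
    # Blocking part: the complete in-memory input is buffered first.
    buffered = list(in_memory_rows)
    # Streaming part: each passing row is joined with every buffered row
    # and is not kept afterwards.
    for passing_row in passing_rows:
        for memory_row in buffered:
            yield join_func(memory_row, passing_row)


# The smaller input (two numbers) is buffered, the larger one is streamed.
numbers = [1, 2]
letters = iter(["A", "B", "C"])
result = list(cross_join(numbers, letters, lambda n, l: f"{n}{l}"))
print(result)
```

Only the buffered side is held in memory, which is why the smaller input is the better choice for it. The order of the output follows the streamed side: all combinations for the first passing row come first.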
The CrossJoin has an input buffer for each input target.
Let’s assume you have two input sets.
Set one is a list of first names: “Elvis”, “James” and “Marilyn”. Set two is a list of last names: “Presley” and “Monroe”. Our cross join should produce a list of all possible combinations of first and last names: “Elvis Presley”, “Elvis Monroe”, “James Presley”, “James Monroe”, “Marilyn Presley”, “Marilyn Monroe”.
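As a quick check of the expected result, the same Cartesian product can be computed in plain Python with the standard library function `itertools.product`. This only illustrates the output of the example and does not use the CrossJoin itself; the variable names are assumptions made for this sketch.

```python
from itertools import product

first_names = ["Elvis", "James", "Marilyn"]
last_names = ["Presley", "Monroe"]

# Pair every first name with every last name (3 x 2 = 6 combinations).
full_names = [f"{first} {last}" for first, last in product(first_names, last_names)]

for name in full_names:
    print(name)
```

The number of output rows is always the product of the two input sizes, so the result grows quickly as either input gets larger.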
This is our code:
Note
The source from which you expect the smaller amount of incoming data should always be connected to the InMemoryTarget of the CrossJoin. Because the CrossJoin is a partial blocking transformation, all rows from the InMemoryTarget are stored in memory before the actual join can be performed.