Rhino ETL: Thinking about Joins & Merges
Well, I think that I have a solid foundation with the engine and syntax right now. I still have error conditions to verify, but that is something that I can handle as I go along. Now it is time to consider handling joins and merges. My initial thinking was something like:
joinTransform UsersAndOrganizations:
    on:
        Left.Id.ToString().Equals(Right.UserId)
    transform:
        Row.Copy(Left)
        Row.OrgId = Right["Organization Id"]
The problem is that while this gives me an equality operation, I can't handle sets very well. I have to compare each row against every other row, and I would like to do better. It would also mean having to do everything in memory, and I am not really crazy about that (nor particularly worried; I will solve that when I need to).
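To make the cost concrete, here is a minimal sketch of what an arbitrary "on" condition forces the engine to do. It is written in Python rather than the DSL, and the function name, the sample rows, and the in-memory lists are all assumptions made for illustration, not Rhino ETL code.

```python
def nested_loop_join(left_rows, right_rows, on, transform):
    """Join by testing the predicate on every (left, right) pair: O(n * m)."""
    results = []
    for left in left_rows:
        for right in right_rows:
            if on(left, right):
                results.append(transform(left, right))
    return results


# Hypothetical sample data; both inputs must be fully loaded in memory.
users = [{"Id": 1, "UserName": "alice"}, {"Id": 2, "UserName": "bob"}]
orgs = [
    {"UserId": "1", "Organization Id": 10},
    {"UserId": "3", "Organization Id": 30},
]


def on(left, right):
    # Mirrors Left.Id.ToString().Equals(Right.UserId)
    return str(left["Id"]) == right["UserId"]


def transform(left, right):
    # Mirrors Row.Copy(Left) followed by Row.OrgId = Right["Organization Id"]
    row = dict(left)
    row["OrgId"] = right["Organization Id"]
    return row


joined = nested_loop_join(users, orgs, on, transform)
```

Because the predicate is opaque, there is nothing to index on, so every pair has to be checked.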
Another option is:
joinTransform UsersAndOrganizations:
    left: [Row.Id, Row.UserName]
    right: [Row.UserId, Row.FullName]
    transform:
        Row.Copy(Left)
        Row.OrgId = Right["Organization Id"]
This lets me handle it in a better way, since I now have two sets of keys, and I can do comparisons a lot more easily. That is a lot harder to read, though.
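One way explicit key lists could pay off is a hash join: index one side by its key, then probe with the other. The sketch below is again Python, not the DSL. The function name, the sample rows, and the assumption that the key values on both sides already have matching types are mine, and this is only one possible implementation strategy.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key, transform):
    """Index the right side by key, then probe once per left row."""
    index = defaultdict(list)
    for right in right_rows:
        index[right_key(right)].append(right)

    results = []
    for left in left_rows:
        # .get avoids creating empty buckets for keys that do not match
        for right in index.get(left_key(left), []):
            results.append(transform(left, right))
    return results


# Hypothetical sample data; key values are assumed to have matching types.
users = [{"Id": 1, "UserName": "alice"}, {"Id": 2, "UserName": "bob"}]
orgs = [
    {"UserId": 1, "FullName": "alice", "Organization Id": 10},
    {"UserId": 2, "FullName": "robert", "Organization Id": 20},
]


def left_key(row):
    # Mirrors left: [Row.Id, Row.UserName]
    return (row["Id"], row["UserName"])


def right_key(row):
    # Mirrors right: [Row.UserId, Row.FullName]
    return (row["UserId"], row["FullName"])


def transform(left, right):
    row = dict(left)
    row["OrgId"] = right["Organization Id"]
    return row


joined = hash_join(users, orgs, left_key, right_key, transform)
```

This only needs one side held in memory as an index, but it handles equality on the keys and nothing else, which is the trade-off against the free-form condition.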
Any suggestions?
Both on the syntax and implementation strategies...
More posts in "Rhino ETL" series:
- (16 Oct 2007) Importing Data into MS CRM
- (13 Aug 2007) Writing to files
- (05 Aug 2007) Web Services Source
- (05 Aug 2007) Transactions
- (04 Aug 2007) Targets
- (04 Aug 2007) Aggregates
- (26 Jul 2007) Thinking about Joins & Merges
- (24 Jul 2007) First Code Drop
Comments
i'm not sure i like either of those syntax options, actually. Neither of them allows for joins across multiple tables where the join happens down the chain of tables.
for example, how would I join TableA.Field1 to TableB.Field2 to TableC.Field3? if you modify the joinTransform line like this, you can accomplish what I am saying with multiple joinTransforms, though:
joinTransform Source to UsersAndOrganizations:
I also think it would be hard to have complex joins, like Or clauses, with those two syntax specs. you would need something closer to real SQL, while still maintaining an object model
so the final syntax would look something like this for the multiple join chain example:
joinTransform Source to UsersAndOrganizations:
joinTransform UsersAndOrganizations to Departments
and syntax for complex join example:
joinTransform Source to UsersAndOrganizations:
I left the ".Row." in the Left and Right syntax so that you can add something like
Left.Row.NonMatching
or other specialized syntax options, as needed.
Derick, I am not limited to the alternatives I have brought up; I am certainly open to new stuff.
Hmm good start I think. What about a multi-key (I'm not sure about the English term) join?
on: