﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2021 (c) 2026</copyright><ttl>60</ttl><item><title>Tobin Harris commented on Rhino.ETL: Turning Transformations to FizzBuzz tests</title><description>@Ayende
  
I like the idea of using a composite transform for the macro, rather than introducing yet another term. 
  
  
</description><link>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment4</link><guid>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment4</guid><pubDate>Sun, 22 Jul 2007 12:46:24 GMT</pubDate></item><item><title>Jeff Brown commented on Rhino.ETL: Turning Transformations to FizzBuzz tests</title><description>Try set-wise transformations... They consume sets and produce sets.  A set might be obtained using any sort of aggregate grouping operation.  The standard single-row transform can be thought of as a degenerate set-wise operation against a group over row ids.
  
  
So you can define "Set" to represent the set of grouped rows at the input of the transformation.  Then "Row" is equivalent to "Set[0]" in the case where Set.Length == 1.  Or else you can distinguish set-wise transformations from row-wise transformations by declaration type.
  
  
Emitting new rows shouldn't be any more problematic than consuming rows.  Add an Emit or Yield operator that can appear 0 or more times in the transformation.
  
  
This mechanism is compositional if you allow groups to be merged and regrouped after each set-wise transformation.
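
A rough Python sketch of the set-wise idea above. This is not Rhino.ETL's API: setwise, as_setwise and total_per_city are made-up names, and the sample rows are invented for illustration.

```python
from itertools import groupby

def setwise(rows, key, transform):
    """Group rows by key; transform consumes each set and emits 0 or more rows."""
    for _, group in groupby(sorted(rows, key=key), key=key):
        yield from transform(list(group))

def as_setwise(row_transform):
    """Lift a single-row transform; only valid when every set has exactly one row."""
    def on_set(group):
        yield row_transform(group[0])
    return on_set

def total_per_city(group):
    # Consumes a whole set and emits a single summary row.
    yield {"City": group[0]["City"], "Total": sum(r["Amount"] for r in group)}

rows = [
    {"id": 1, "City": "Leeds", "Amount": 10},
    {"id": 2, "City": "York", "Amount": 5},
    {"id": 3, "City": "Leeds", "Amount": 7},
]

# Set-wise: one output row per city.
totals = list(setwise(rows, key=lambda r: r["City"], transform=total_per_city))

# Row-wise as the degenerate case: group over row ids, so each set has one row.
doubled = list(setwise(
    rows,
    key=lambda r: r["id"],
    transform=as_setwise(lambda r: {**r, "Amount": r["Amount"] * 2}),
))
```

Because setwise returns plain rows, its output can be passed to another setwise call with a different key, which is the regrouping step that keeps the mechanism compositional.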
</description><link>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment3</link><guid>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment3</guid><pubDate>Sat, 21 Jul 2007 18:31:13 GMT</pubDate></item><item><title>Ayende Rahien commented on Rhino.ETL: Turning Transformations to FizzBuzz tests</title><description>Well, parameters are supported:
  
  
transform RemoveCommasWithParemeters:
  
   for column in Parameters.ColumnsToClean:
  
       if Row[column] isa string:
  
           Row[column] = Row[column].Replace(",","")
  
  
I am currently working on how to call this cleanly.
  
Macros will probably be something like:
  
  
transform CleanSkandiaValuations:
  
  ApplyTransform("RemoveBlankRows")
  
  ApplyTransform("RemoveCommas", { Range: "Name", "City", "State" } )
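
For comparison, a minimal Python sketch of the same composite idea, assuming transforms are looked up by name in a registry. The names TRANSFORMS, transform and apply_transform are hypothetical and say nothing about how Rhino.ETL will actually do it.

```python
TRANSFORMS = {}

def transform(func):
    """Register a transform under its function name."""
    TRANSFORMS[func.__name__] = func
    return func

@transform
def RemoveBlankRows(rows, params):
    # Keep only rows that have at least one non-empty value.
    return [row for row in rows if any(v not in (None, "") for v in row.values())]

@transform
def RemoveCommas(rows, params):
    # No "Range" parameter means every column is cleaned.
    columns = params.get("Range")
    for row in rows:
        for column in (columns or list(row)):
            if isinstance(row[column], str):
                row[column] = row[column].replace(",", "")
    return rows

def apply_transform(name, rows, params=None):
    """Look a transform up by name and run it with optional parameters."""
    return TRANSFORMS[name](rows, params or {})

def CleanSkandiaValuations(rows):
    # A composite transform is just a transform that applies other transforms.
    rows = apply_transform("RemoveBlankRows", rows)
    return apply_transform("RemoveCommas", rows, {"Range": ["Name", "City", "State"]})

sample = [
    {"Name": "Smith, J", "City": "Leeds", "State": "", "Note": "a,b"},
    {"Name": "", "City": "", "State": "", "Note": ""},
]
cleaned = CleanSkandiaValuations(sample)
```

The "Note" column keeps its comma because it is outside the requested range, and the blank row is dropped before the comma cleanup runs.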
  
  
</description><link>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment2</link><guid>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment2</guid><pubDate>Sat, 21 Jul 2007 15:24:38 GMT</pubDate></item><item><title>Tobin Harris commented on Rhino.ETL: Turning Transformations to FizzBuzz tests</title><description>Nice work, and yes the individual transforms are mostly simple. However, may I spice things up a bit? :) 
  
  
Firstly, my transformations act on *ranges* of data. Sometimes I want to affect all columns, and sometimes I only want to run a transform on a few columns. Rather than defining similar transforms many times for different ranges, it would be nice if there were some way of reusing the logic.
  
  
    transform RemoveCommas( range ):
  
        for column in range.Columns:
  
            if row[column] isa string:
  
                row[column] = row[column].Replace(",","")
  
  
No range could imply all columns. Does this break your current thinking about the responsibilities of a transform?
  
  
Reuse would be with different ranges. So, I might need some kind of "Command" Macro facility? This may look a bit naff and not fit your concept too well, but hopefully it gives you an idea...
  
  
macro CleanSkandiaValuations:
  
  apply transform RemoveBlankRows(Range(ALL))
  
  apply transform RemoveCommas(Range(Columns(1,4,5,9)))
  
  
macro CleanClericalMedicalValuations:
  
  apply transform RemoveBlankRows(Range(ALL))
  
  apply transform RemoveCommas(Range(Columns(2), Rows(ALL)))
  
  
Regarding the repeated column header in data, it's where you have a source such as a CSV file with the header row repeated several times throughout the actual data itself. It's probably because whoever created the data didn't know how to freeze the header row in Excel and therefore copied and pasted it several times. Horrible!
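
To make the repeated-header case concrete, here is a small Python sketch (nothing to do with Rhino.ETL itself; drop_repeated_headers is a made-up name). It assumes the first record is the real header and that every later record identical to it is a pasted copy.

```python
import csv
import io

def drop_repeated_headers(lines):
    """Yield the header once, then every record that is not a copy of it."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to emit
    yield header
    for record in reader:
        if record != header:
            yield record

text = "Name,City\nA,Leeds\nName,City\nB,York\n"
records = list(drop_repeated_headers(io.StringIO(text)))
```

A genuine data row that happens to match the header exactly would also be dropped, so this only suits sources where that cannot occur.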
  
  
Regarding the unpivoting of data, yeah this transformation actually creates new rows which is a challenge. 
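
A hedged Python sketch of what I mean by unpivoting, where one input row turns into several output rows. The function name, the "Attribute" and "Value" column names and the sample data are all invented for illustration.

```python
def unpivot(rows, id_columns, name_column="Attribute", value_column="Value"):
    """Emit one output row per non-id column of every input row."""
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue
            # Copy the identifying columns, then add the column name and its value.
            emitted = {key: row[key] for key in id_columns}
            emitted[name_column] = column
            emitted[value_column] = value
            yield emitted

wide = [{"Fund": "A", "2006": 10, "2007": 12}]
tall = list(unpivot(wide, id_columns=["Fund"]))
```

Here the single wide row becomes two rows, one per year column, which is why a transform that can only modify the current row is not enough for this case.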
  
</description><link>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment1</link><guid>http://ayende.com/2648/rhino-etl-turning-transformations-to-fizzbuzz-tests#comment1</guid><pubDate>Sat, 21 Jul 2007 14:29:16 GMT</pubDate></item></channel></rss>