Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969

, @ Q c

Posts: 6,124 | Comments: 45,475

filter by tags archive

Rhino ETLImporting Data into MS CRM

time to read 14 min | 2692 words

Okay, so this is the "coding in anger" part for Rhino ETL. I need to import files into MS CRM entities. The files are standard CSV files, with the usual corruption of values that such files have. The CRM is accessed through the web services, although I am keeping aside the option of direct DB access, if I can't get the Web Services to perform any faster.

The first problem that I had was that the MS CRM Web Services are not simple services. They accept entities that are defined in the WSDL for them, not simple values. That put me in a complexity spin for a while, until I remembered that I am not working in my own little language, I am working on .NET. A quick trip to Visual Studio and an Add Web Reference + Compile later, I had integrated accessing the MS CRM into Rhino ETL.

Here is how it was done:

import CrmProxy.Crm from CrmProxy

Basically it means that I now had a dll that contains the proxy definitions for the web service, and I imported it. So it is incredibly easy to use.

Then, it was the matter of reading the file. Rhino ETL has integrated with the FileHelpers library, and I couldn't really be happier about it. There are several reasons for that, but the main one is that I run into something that the library can't handle, and I fixed that in 10 minutes, without changing the library code. Speaking of software that I like, this is one of the main criteria that I use to evaluate a piece of software. What happens when I step off the ledge? With FileHelpers, I can extend it so easily, that I really don't care about that.

Anyway, here is a part of the class definition for our file: 

[DelimitedRecord(","), IgnoreFirst]
class Customer:
      [FieldConverter(ConverterKind.Date, "dd/MM/yyyy")] 
      UpdateDate as date
      Id as int
      Name as string
      ResponsibleEmployee as Nullable of int
      [FieldConverter(Rhino.ETL.FileHelpersExtensions.DateTimeConverterWithNullValue, "dd/MM/yyyy","00/00/0000")] 
      ReceptionDate as Nullable of date

As you can see, there isn't much to it except defining the fields, types, etc.

source CustomersFile:
            file = Read(typeof(Customer)).From(Configuration.CustomerFile)
            for customer in file:
                  print "Source ${customer.Id}"
                  SendRow( Row.FromObject(customer) ) 
            if file.HasErrors:
                  AddError("Errors have been written to ${Configuration.CustomerErrorsFile}")

Here I read from the file, use the Row.FromObject() to translate an entity into a row, and then send it forward. One amazing thing here is that FileHelpers will generate an errors file for me on demand. And that one is clear and concise and actually useful. Comparing to the amount of effort that I know are required to pull reasonable errors from SSIS file input, that is a great pleasure.

Anyway, if you missed that, I am very happy about FileHelpers.

Another thing to point out is the Configuration.CustomerFile, etc. The Configuration object is dynamically populated from a config file that you can pass to Rhino ETL (command line arg), which is a simple xml file in the format:


Why XML? Because this seems like a place where I would want to touch with stuff like xmlpoke, etc. So it is easier to work with. It is also a flat configuration scheme, that doesn't have any semantics other than the simple key/value pair.

So, now that I have the data, I can send it to the destination:

destination Crm:
            Parameters.Srv = CrmService(
                  Url: Configuration.Url,
                  Credentials: NetworkCredential(
                  CallerIdValue: CallerId(CallerGuid: Guid(Configuration.CallerId)),
                  UnsafeAuthenticatedConnectionSharing: true,
                  PreAuthenticate: true

            theAccount = account(
                  accountnumber: Row.Id.ToString(),
                  name: Row.Name,
                  telephone1: Row.Phone,
                  telephone2: Row.Cellular,
                  telephone3: Row.AdditionalPhone,
                  fax: Row.Fax,
                  accountreceptiondate: CrmDateTime(Value: Row.ReceptionDate.ToString("yyyy-MM-ddT00:00:00")),
                  address1_city: Row.City,
            result = Parameters.Srv.Create(theAccount)
            print "Created account ${Row.Id} -> ${result}"


As you can see, we have the initialize method, which creates the service, then we instansiate an account instance, fill it with the required parameters, and go to town. It is also notable the easy translation of types from CLR types to CRM types, such as in the case of accountreceptiondate.

All in all, the only difficulities that I had during this were to make heads or tails from the inforamtion in the file, which is where I want the difficulity to lie when I am dealing with ETL processes.

More posts in "Rhino ETL" series:

  1. (16 Oct 2007) Importing Data into MS CRM
  2. (13 Aug 2007) Writing to files
  3. (05 Aug 2007) Web Services Source
  4. (05 Aug 2007) Transactions
  5. (04 Aug 2007) Targets
  6. (04 Aug 2007) Aggregates
  7. (26 Jul 2007) Thinking about Joins & Merges
  8. (24 Jul 2007) First Code Drop


Mark Monster

Hi Oren,

I like it, but two things.

1) Never touch the database directly.

2) Does it support settings import ? ;-) I guess not, because settings cannot be changed through the webservices. I'm still waiting for a Titan VPC on my desk, so I can see if CRM 4.0 has changed this.

Are you setting the id's on your own side? I thought it was generated serverside. Because I've heard something that CRM has done somethings to make sure the Global not-so Unique ID is Unique. Also there is some logic in the ID's generated (look in the settings for more information about this).

Ayende Rahien

1/ I am getting ~700 records per minute, which is ridiculously low. I can get tens of thousands using the DB directly.

2/ I am not even sure what settings import is, so the answer is no.

Note that I am setting the account number, not the id.

Guid is GUID, there is a guarantee that they will be unique no matter what you do to them.

I have yet to hear of a real world guid collision case.


I love the Idea of your ETL framework, and have been playing about with it as a way of introducing myslef to Boo. I was running some tests importing over 31900 sql db rows from one DB into another, but after the first 15000 or so rows the software just hangs.

Is there a debug switch that can be triggered to see what is happening?

Comment preview

Comments have been closed on this topic.


  1. The design of RavenDB 4.0: Making Lucene reliable - 6 hours from now
  2. RavenDB 3.5 whirl wind tour: I’ll find who is taking my I/O bandwidth and they SHALL pay - about one day from now
  3. The design of RavenDB 4.0: Physically segregating collections - 2 days from now
  4. RavenDB 3.5 Whirlwind tour: I need to be free to explore my data - 3 days from now
  5. RavenDB 3.5 whirl wind tour: I'll have the 3+1 goodies to go, please - 6 days from now

And 13 more posts are pending...

There are posts all the way to May 30, 2016


  1. RavenDB 3.5 whirl wind tour (14):
    02 May 2016 - You want all the data, you can’t handle all the data
  2. The design of RavenDB 4.0 (13):
    28 Apr 2016 - The implications of the blittable format
  3. Tasks for the new comer (2):
    15 Apr 2016 - Quartz.NET with RavenDB
  4. Code through the looking glass (5):
    18 Mar 2016 - And a linear search to rule them
  5. Find the bug (8):
    29 Feb 2016 - When you can't rely on your own identity
View all series



Main feed Feed Stats
Comments feed   Comments Feed Stats