Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,640
|
Comments: 51,260
Privacy Policy · Terms
filter by tags archive
time to read 3 min | 485 words


I just watch a very nice diagram of SSIS doing its work, and throwing 70% of the input rows into the garbage. I was a bit concerned about that, as you may imagine. I talked before about the experiance of debugging a problem with SSIS, I present this case as a case study.
First, I should mention that I pipe all the errors to a table that contains three columns (which are provided by SSIS). The columns are ErrorColumn, ErrorCode and ErrorOutput. So when I get an error, I just check the error table and see where they happen. In this case, I got the following results (times 70 thousands):

ErrorCode         ErrorColumn        ErrorOutput
-1073442796    162                   null


"Hm, that is strange", I think, and goes to check the SSIS package for the column. I get to the point of the failure, and then I need to use the advance editor to go through each and every one of the dozens of columns that I have there, searching for a matching LineageId.
I have no idea what a LineageId is, by the way. I can't really think of why I would want to use that as my error column identifier rather than the rather nice column name.
Leaving that aside, I'm left with the entirely too opaque error code. Searching Google for this gives nothing, nor does search MSDN. So I'm basically getting told: "I know what the error was, but I am not going to tell you what it was."
After checking that the connection manager was fine, I noticed that it was converting a string to an int, so I tried trimming the string before converting, but it wasn't that, I kept getting the same error on the same column.
I outputted the offensive column as text to the table, and looked at the values (it sounds easy to do, but it takes over 5 minutes to do this, and adjust everything that it will work correctly).
All the values were perfect numbers, and SQL Server had no problem to convert them to numbers.
This is usually the point where I resort to physical violance and hit my head against the wall until I either find the answer or pass out.
After picking myself from the floor and passing the "Who I am? Where am I? What is this big bulge on my head?" stage, I had tried looking over the raw data. That was fine, but I noticed that several numbers were quite a bit bigger than the others. Then I checked the type of the column in the data base numeric(20,0) and the type I was converting to numeric(3,0), and it was obvious what the problem was.
Now, if I could get something like "truncation would occur", that would have saved me several hours of head butting and severe annoyance.

time to read 1 min | 146 words

  • You can't close a package after execution. You've to manually stop debugging and then close the package.
  • No way that I could find to look at just the errors/ warnings from the log.
  • The Flat File Source just blithedly ignored the fact that the file doesn't exists and carried on its work. I need this behaviour in several places, so it would be useful to do it, but I've no idea why or how this works.
  • You can't edit connection managers without a live connection to use.
  • You don't get any results when you search for the errors on Google.

I'm not sure how I got the situation below, but I did get it. I'm pretty sure that this shouldn't happen. It got so many stuff wrong that I don't know where to begin checking. More so, I don't know how to check this stuff.

time to read 6 min | 1064 words

SSIS is a great tool, and I really like some of the things that it allows me to do. But for crying out loud, you can't debug this thing. The errors it generate are reasonable enough if it is SQL server errors, but if it is something that is generated by SSIS itself, god forbid that you'll try to get what the error was.

Here are some examples for things that really frustrate me:

  • When you get a failure in the data loading, the error code you get refers to the LineageID of the column that caused the failure. The LineageID, as far as I could figure out, is some random integer that is attached to a column on creation. Why this is the error information that I'm getting, I'm not sure about.
  • No way to take a formatted string and turn it into a date unless it is dd-mm-yyyy. It doesn't accept any other format.
  • Speaking of which, it insists on using DB_TIMESTAMP for datetime columns in the SQL Server.
  • The UI to do complex stuff is just plain horrible. You need to do some transformation on a column? You get to write something that looks like a hybrid of the worst stuff in both C and VB, in a one line UI, that has major scrolling problem.
  • When the UI is generous enough to gives you errors, it will display them when you hover over it, for exactly 3 seconds, and then you need to move the mouse so the hover event would fire and then you've another couple of seconds to read the error.
  • Finding out what failed is a matter of trail and error. I had a problem where I had a date formatted as dd-mm-yy, instead of the yyyy that it expects. It worked like a charmed in that task, and then bombed on me when we tried to save to the database with a completely obscure error about not being able to convert timestamps to datetime.
  • You tend to get disk I/O buffer errors occationally, I'm not sure why, trying to run it again usually works.
  • It's really annoying to put it into a source control like VSS, since it set the file state to read-only, which cause SSIS to throw errors and means that if I'm testing a package, no one else can use it.
  • Very limited support for doing things across multiply packages.
  • The UI is sloooow at best.
  • The UI is also obscure as hell, just try to do a pivot table in SSIS. Give me a apache.conf any day, at least that is documented.
  • Some basic operations are hard to do. So hard that I wasn't able to do them, in fact. Getting a value from a stored procedure into a User::Variable is one of them.
  • Did I mention that I find the UI extremely awkward to work with?
  • Copy & Paste takes an exteremly long time, and often produce unpredictable results.
  • Limited ability to write actual code. This mean that in order to solve a very simple problem, I need to either add an assembly to the package (add a script task in VB.Net only, appernatly {which I then would need to share between packages, meaning copy & paste errors}) or do it in some rather nasty macro langauge that I really don't understand the benefit off.
  • Horrible debugging experiance, if something goes wrong, unless you you knows what is wrong (and we found some fairly typical errors that we made), you're screwed. You're left to try this and that until it works or gives a different error message. No way that I found to step through the actions as they occur. (And yes, if in the middle of 1,000,000 records load I get a bad record, I need to be able to see what is wrong with that record. Or all the nine other records that precede and follow it.
  • Try-finally is hard to do. I may have Pre Action, A,B,C Post Action. The Pre & Post Actions need to be executed regardless of what happens in the middle. That is not trivial to do in SSIS, even though it should. I may be able to do so if I put everything in a sequence, and then put the post action as the next action with execute on completion, but that doesn't work if I have a transaction already running (and you can't put two sequences inside one another and then chain stuff inside of them).
  • Exporting / Importing data from SQL Server to Access and back (same schema in both SQL Servers, empty MDB file to start with) is not working. For some reason Access is using memo (NTEXT for SQL) instead of text (NVARCHAR for SQL), which breaks the import process. And that isn't easy to find either, it just gives a general error about data type mistmatch in a column. Which column, it doesn't bother to tell me.

On the other hand, when you actually manage to run it, the operation itself seems to be very quick. And the UI is in shoothing pale yellow that is easy on the eyes.

Heavy Load

time to read 1 min | 88 words

I just put SQL Server through a stress test that had it copy several giga bytes over a network. It tooks over two hours, but I think most of it was network traffic. The actual inserts took relatively little, although toward the end I noticed SQL Server slurping memory like it was free candy (ended on ~1.4Gb RAM taken).
I then run a query on all of that data, which returned much quicker than I expected. The more I work with it, the more I like it.

time to read 1 min | 94 words

One of the most obnoxious thing about writing diagrams (be that DB layout, class hierarchies, SSIS packages, etc) is how to arrange the connections so you can see anything clearly. It has been notoriously hard to do it in most tools that I’ve used, but I just found the Format > Layout > Diagram command in SSIS that arrange a package very neatly in a tree fashion.

This little command is going to save many hours .

time to read 1 min | 159 words

I find myself so busy lately that I really can't dedicate any effort to my pet projects (work & life seems to get in the way). Since I still have some computer time free, and since I really can't gather the mental effort to write good code, I try to spend a short amount of time writing a lecture on various subjects.

I used to teach quite a bit during my time in the army, for a long time, that was my job description, and it is something that I really enjoy. I didn't get to talk much about technical stuff (unless you consider hand cuffs to be technical :-) ), though. I gave a couple of lessons in tech stuff since I left the army, and it had a great time. I'll see if/when I can post some of this stuff online. I would love to get some feedback on this.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. API Design (10):
    29 Jan 2026 - Don't try to guess
  2. Recording (20):
    05 Dec 2025 - Build AI that understands your business
  3. Webinar (8):
    16 Sep 2025 - Building AI Agents in RavenDB
  4. RavenDB 7.1 (7):
    11 Jul 2025 - The Gen AI release
  5. Production postmorterm (2):
    11 Jun 2025 - The rookie server's untimely promotion
View all series

Syndication

Main feed ... ...
Comments feed   ... ...