Databases
Slaying relational hydras (or dating them)
Sometimes client confidentiality can be really annoying, because the problem sets & suggested solutions that come up are really interesting. That said, since I am interesting in having future clients, it is pretty much a must have. As such, the current post represent a real world customer problem, but probably in a totally different content. In fact, I would be surprised if the customer was able to recognize the problem as his. That said, the problem is actually quite simple. Consider a dating site, where you can input your details and what you seek, and the site will match...
Fun with a non relational databases
My post showing a different approach for handling data got a lot of traffic, and a lot of good comments. But I think that there is some misunderstanding with regards to the capabilities of NoSQL databases, so I am going to try to expand on those capabilities in this post. Instead of hand waving, and since I am thinking about this a lot lately, we will assume that we are talking about DivanDB (unreleased version), which has the following API: JsonDoc[] Get(params Id[] ids); Set(params JsonDoc[] docs); JsonDoc[] Query(string...
Rhino Divan DB reboot idea
Divan DB is my pet database. I created it to scratch an itch [Nitpickers: please note this!], to see if I can create Couch DB like system in .NET. You can read all about it in the following series of posts. It stalled for a while, mostly because I run into the hard problems (building map/reduce views). But I think that I actually have a better idea, instead of trying to build something that would just mimic Couch DB, a .NET based Document DB is actually a very achievable goal. The way it would work is actually pretty...
Slaying relational dragons
I recently had a fascinating support call, talking about how to optimize a very big model and an access pattern that basically required to have the entire model in memory for performing certain operations. A pleasant surprise was that it wasn’t horrible (when I get called, there is usually a mess), which is what made things interesting. In the space of two hours, we managed to: Reduced number of queries by 90%. Reduced size of queries by 52%. Increased responsiveness by 60%, even for data set an order of...
You see that database? OFF WITH HIS HEAD!
A while ago I was chatting with a friend that complained about a migration script timing out on his local machine. When I looked at the script, it was fairly obvious what was wrong: DECLARE @VideoID UNIQUEIDENTIFIER
DECLARE @NewID UNIQUEIDENTIFIER
DECLARE VideoCursor CURSOR READ_ONLY
FOR
SELECT ID FROM Video
OPEN VideoCursor
FETCH NEXT FROM VideoCursor
INTO @VideoID
WHILE @@FETCH_STATUS = 0
BEGIN
SET @NewID = NEWID()
INSERT INTO Content (ID, Body, FormatBody, Plain)
SELECT @NewID, ContentItem.Body, Video.FormatBody, Video.Plain
FROM ContentItem
...
JAOO: More on Evolving the Key/Value Programming Model to a Higher Level from Billy Newport
As I already mentioned, this presentation had me thinking. Billy presented a system called Redis, which is a Key/Value store which is intended for an attribute based storage. That means that storing something like User { Id = 123, Name = “billy”, Email = “billy@example.org”} is actually stored as: { "uid:123:name": "billy" }
{ "uid:123:email": "billy@example.org" }
{ "uname:billy": "123" }
Each of those lines represent a different key/value pair in the Redis store. According to Billy, this has a lot of implications. On the advantage side, you get...
Soft Deletes aren’t Append Only model
There seems to be some confusion regarding my post about soft deletes, in particular, people brought up the idea of append only models. I had the chance to work on both types of systems, and I can tell you that I would much rather work with append only model than with soft deletes. The append only model means that you can only ever insert, never delete or update. Thing about the way your bank account works. If I had the clerk transfer money from one account to another, and he had a typo and send tens times the...
Buying the pot as a way of winning the database wars?
The news are just out, Oracle is buying Sun. This is especially interesting since Oracle has previously bought InnoDB (a key component for MySQL in the enterprise) and Sun has bought MySQL. This means that, off the top of my head, Oracle is now the owner of the following database products: Oracle DB BerkleyDB MySQL & InnoDB I am pretty sure that they have others, but even so, that is quite a respectable list, I should think. Of course, with Sun, Oracle is also getting Java…
Repository is the new Singleton
I mentioned in passing that I don’t like the Repository pattern anymore much, and gotten a lot of responses to that. This is the answering post, and yes, the title was chosen to get a rise out of you. There are actually two separate issues that needs to be handled here. One of them is my issues with the actual pattern and the second is the pattern usage. There most commonly used definition for Repository is defined in Patterns of Enterprise Application Architecture: A system with a complex domain model often benefits from a layer,...
Implementing a document database: simple queries
And now this passes: public class PerformingQueries
{
const string query = @"
var pagesByTitle =
from doc in docs
where doc.type == ""page""
select new { Key = doc.title, Value = doc.content, Size = (int)doc.size };
";
[Fact]
public void Can_query_json()
{
var serializer = new JsonSerializer();
var docs = (JArray)serializer.Deserialize(
new JsonTextReader(
...
Designing a document database: What next?
So far I posted quite a few of posts about building the document database. To be frank, the reason that I did this is because the idea has been bouncing in my head a lot recently, and sitting down and actually thinking about it has been great, especially since now I have the design dancing in my head, shiny & beautiful. Here is the full list, in case you missed anything: Schema-less databases Designing a document database Designing a document database: Storage Designing...
Designing a document database: Remote API & Public API
One of the greatest challenges that we face when we try to design API is balancing power, flexibility, complexity, extensibility and version tolerance. There is a set of tradeoffs that we have to make in order to make sure that we design an API, and selecting the right tradeoffs impacts the way that users will work with the software. One of the decisions that I made about the docs db is that it would be actually have two published API. The first would be the remote API, accessible for clients, and the second would be a public API, accessible...
Designing a document database: Looking at views
I was asked how we can handle more complex views scenarios. So let us take a look at how we can deal with them. Joins In a many to one scenario, (post & comments), how can we get both on them in one call to the database? I am afraid that I am not doing anything new here, since the technique is actually described here. The answer to that is quite simple, you don’t. What you can do, however, is generate a view that will allow you to get both of them at the same time. For example:...
Designing a document database: View syntax
The choice of using Linq queries as the default syntax was not an accident. If you look at how Couch DB is doing things, you can see that the choice of Javascript as the query language can cause some really irritating imperative style coding. For example, look at this piece of code: function(doc) {
if (doc.type == "comment") {
map(doc.author, {post: doc.post, content: doc.content});
...
Designing a document database: Aggregation Recalculating
One of the more interesting problems with document databases is the views, and in particular, how are we going to implement views that contain aggregation. In my previous post, I discussed the way we will probably expose this to the users. But it turn out that there are significant challenges in actually implementing the feature itself, not just in the user visible parts. For projection views, the actual problem is very simple, when a document is updated/removed, all we have to do is to delete the old view item, and create a new item, if applicable. For aggregation...
Designing a document database: Aggregation
I said that I would speak a bit about aggregations. On the face of it, aggregation looks simple, really simple. Continuing the same thread of design from before, we can have: The problem is that while this is really nice, it doesn’t really work. The problem is that using this approach, we are going to have to recalculate the view for the entire document set that we have, a potentially very expensive operation. Now, technically I can solve the problem by rewriting the Linq statement. The problem is that it wouldn’t really work. While it...
Challenge: C# Rewriting
Here is an interesting one. Can you write code that would take the first piece of text and would turn it into the second piece of text? First (not compiling) Second (compiling): Hint, you can use NRefactory to do the C# parsing.
Designing a document database: Views
One of the more interesting problems with document databases is how you handle views. But a lot of people already had some issues with understanding what I mean with document database (hint, I am not talking about a word docs repository), so I have better explain what I mean by this. A document database stores documents. Those aren’t what most people would consider as a document, however. It is not excel or word files. Rather, we are talking about storing data in a well known format, but with no schema. Consider the case of storing an XML document or...
Designing a document database: Replication
In a previous post, I asked about designing a document DB, and brought up the issue of replication, along with a set of questions that effect the design of the system: How often should we replicate? As part of the transaction? Backend process? Every X amount of time? Manual? ...
Designing a document database: Attachments
In a previous post, I asked about designing a document DB, and brought up the issue of attachments, along with a set of questions that needs to be handled: Do we allow them at all? We pretty much have to, otherwise we will have the users sticking them into the document directly, resulting in very inefficient use of space (binaries in Json format sucks). How are they stored? In the DB? ...
Designing a document database: Authorization
This is actually a topic that I haven’t considered upfront. Now that I do, it looks like it is a bit of a hornet nest. In order to have authorization we must first support authentication. And that bring a whole bunch of questions on its own. For example, which auth mechanism to support? Windows auth? Custom auth? If we have auth, don’t we need to also support sessions? But sessions are expansive to create, so do we really want that? For that matter, would we need to support SSL? I am not sure how to implement...
Designing a document database: Concurrency
In my previous post, I asked about designing a document DB, and brought up the issue of concurrency, along with a set of questions that effect the design of the system: What concurrency alternatives do we choose? We have several options. Optimistic and pessimistic concurrency are the most obvious ones. Merge concurrency, such as the one implemented by Rhino DHT, is another. Note that we also have to handle the case where we have a conflict as a result of replication. I think that it would make a lot of sense...
Designing a document database: Scale
In my previous post, I asked about designing a document DB, and brought up the issue of scale, along with a set of questions that effect the design of the system: Do we start from the get go as a distributed DB? Yes and no. I think that we should start from the get go assuming that a database is not alone, but we shouldn’t burden it with the costs that are associated with this. I think that simply building replication should be a pretty good task, which mean that we can push more...
Designing a document database: Storage
In a previous post, I asked about designing a document DB, and brought up the issue of storage, along with a set of questions that needs to be handled: How do we physically store things? There are several options, from building our own persistent format, to using an RDMBS. I think that the most effective option would be to use Esent. It is small, highly efficient, require no installation and very simple to use. It also neatly resolve a lot of the questions that we have to ask in addition to that. ...
Designing a document database
A while ago I started experimenting with building my own document DB, based on the concepts that Couch DB have. As it turn out, there isn’t really much to it, at a conceptual level. A document DB requires the following features: Store a document Retrieve document by id Add attachment to document Replicate to a backup server Create views on top of documents The first two requirements are easily handled, and should generally take less than a day to develop. Indeed, after...
Schema-less databases
This post about how Friend Feed is using schema-less storage for most of their work is fascinating. In the ALT.Net Seattle there was a session about that, which generated a lot of interest. My next post will have more details about the actual implementation details of doing something like that in a manner easily accessible in .Net, but just reading the post is very interesting. Another item that I found that was an interesting read, although it is far harder to read is: http://highscalability.com/how-i-learned-stop-worrying-and-love-using-lot-disk-space-scale
How did I end in this position?
I am now in an argument where I am in support for stored procedures. A piece of the dialog: Team Member #1: We have to do something about this, we don’t even have any stored procedures for this. Me: I will write the stored procedure for you.
There is no database
I just noticed that for the last few months I have been consistently denying the existence of a database. I use the term persistent storage when asked, and when asked I usually say: “There is no database”. It has gotten to the point that this is how I draw the DB on most whiteboard sessions:
Hidden Windows Gems: Extensible Storage Engine
Did you know that Windows came with an embedded database? Did you know that this embedded database is the power behind Active Directory & Exchange? Did you know that this is actually part of Windows' API and is exposed to developers? Did you know that it requires no installation and has zero administration overhead? Did you know there is a .Net API? Well, the answer for all of that is that you probably didn't know that, but it is true! The embedded database is called Esent, and the managed library for this API was just released. This is...
Longest time to first test pass, but it now works
public class DivanDatabaseTest
{
[Fact]
public void Can_add_document_to_database()
{
using (var instance = new Instance("test"))
{
instance.Init();
using (var sesion = new Session(instance.JetInstance))
{
var database = new DivanDatabase(instance, sesion, "test.divan");
DocumentId[] add = database.Add(
JObject.Parse("{'name': 'oren', 'email': 'ayende@ayende.com'")
);
using (var view = database.OpenDocumentsView())
{
var doc = view.FindById(add[0].Id);
Assert.Equal("oren", (string) doc["name"]);
Assert.Equal("ayende@ayende.com", (string) doc["email"]);
}
}
}
}
}
Database Schemas
I was asked to comment on the use of DB schemas, so here it is. The first thing that we need to do is decide what a schema is. A schema is an organization unit inside the database. You can think about it as a folder structure with an allowed depth of 1. (Yes, just like MS-DOS 1.0). Like folders in the real file system, you can associate security attributes to the schema, and you can put items in the schema. There is the notion of the current schema, and that about it. Well, so this is what it...
Amazon EC2 now offers RDBMS
That is pretty amazing, since that was a big pain point for developing for EC2 powered systems. They support both Oracle and MySQL, in addition to SimpleDB, which is a non relational DB. From the looks of things, however, there is a significant difference between the Oracle and MySQL offerings. MySQL is a DB limited to a single machine. They talk about the ability to dynamically scale the machine (which sounds just awesome) from small to extra large based on requirement, but not about multi instance databases. Oracle, however, does have this ability, and it is supported on EC2. So...
Observations on Embedded databases
I spent significant parts of the last two weeks dealing with embedded databases. I have used, SQL CE, SQLite, FireBird, db4o and Berkeley DB. My requirements were really simple, or so I thought. I just wanted safe for multi threading and support for transactions. Let me go over them in order: SQL CE This is a really nice DB, syntax is comparable to SQL Server, so it makes a lot of things simpler. It has transaction support and is supposed to be multi threaded safe. It is not. It is very easy to...
Why I don't like FireBird: Part II
Take a look at this code. It is supposed to give me the earliest message, ensuring that I'll get each message once and only once. It doesn't work. I am getting some messages twice. The same code (well, simplified) just works on SQLite. public QueueMessage GetEarliestMessage()
{
byte[] data = null;
bool done = false;
while (done == false)
{
Transaction(cmd =>
{
/*
SELECT FIRST 1 Id, Data FROM IncomingMessages
ORDER BY InsertedAt ASC
*/
cmd.CommandText = Queries.GetEarliestMessageFromIncomingQueue;
string id;
using (var reader = cmd.ExecuteReader())
{
if (reader.Read() == false)
{
done = true;
return;
}
id = reader.GetString(0);
data = (byte[])reader[1];
}
/* DELETE FROM IncomingMessages WHERE Id = @Id */
cmd.CommandText = Queries.DeleteMessageFromIncomingQueue;
cmd.Parameters.Add("@Id", id);
try
{
var rowAffected = cmd.ExecuteNonQuery();
//...
Why I don't like Firebird: Part I
I mentioned before that my main objection to this DB is its utter lack of tooling. I mean, even SQLite has some tools. For Firebird, there seem to be nothing that works on my machine. I needed to debug something there, and this was the easiest solution.
SQL CE Transaction Handling
Update: Yes, I am crazy! Turn out that I forgot to do "command.Transaction = tx;" and then I went and read some outdated documentation, and got the completely wrong picture, yuck! I still think that requiring "command.Transaction = tx;" is bad API design and error prone (duh!). Someone please tell me that I am not crazy. The output out this program is: Wrote item Wrote item Wrote item 3 Wrote item ...
SQLite vs. SQL CE
Those two seems to be the most common embedded databases in the .NET world. This is important to me, since I want to do testing against embedded database. SQL CE can be used with SQL Management Studio, which is nice, but it has three major issues for me so far: It doesn't support memory only operations. SQLite does, and it means a difference of 12 seconds vs. 40 seconds in running ~100 tests that hit the DB. This is important, especially kicking up everything seems to take about 10 seconds anyway (using Test Driven.Net) It doesn't support paging (WTF!)...
Serialazble isolation level on rows that does not exists
Recently I was asked how to solve this problem. An external service makes a call to the application, to create / update an entity. This call can arrive in one of few endpoints. The catch is that sometimes the external service send the call to create a new entity to all the end points at the same time. This obviously caused issue when trying to insert the same row twice. I suggested using serializable isolation level to handle this scenario, but I wasn't sure what kind of guarantees is makes for rows that do not exists. I decided to write...
Another paging approach
I have hard time believing that, but this is something that I learned from how MS CRM works. This is the first time that I ever saw this approach, although it seems so obvious when you think about it. Updated: Fixed a bug found by Nathan Baulch, thanks! The idea here is to solve the problem of SQL 2000 not really having a good way to do paging. The basic pattern is this: select top [N] * from [TableName]
where [orderCriteria] > @orderCriteriaPreviousMaxValue or
(
[Id] > @PreviousID and
[orderCriteria] = @orderCriteriaPreviousMaxValue
)
order by [orderCriteria], [Id]
What do I mean by that?
Let...
Perfoming joins without having all the data in memory
Probably the easier way to perform a join is by a nested loop, given dataset A and dataset B, joining all the rows to dataset C is simple: for row_a in A:
for row_b in B:
if condition(row_a, row_b):
add join(row_a, row_b), C
Supporting left/right/cross joins is simple matter from here, but this has the issue of having to have both datasets in memory. I run into it while processing big files, I don't want to have to hold them in memory, especially if I need several level of joins in order to correctly process them.
I thought about bypassing the entire issue by simply writing...
The CRM Horror
It is rare that I get to the foint where I am just flat out speechless from seeing something. Today I went beyond that, I was flat out speechless and aghast. The image that you see here is a small part of the FilteredAccount view in Micorosft CRM. Yes, you got that right, a small part. I got to this from experimenting with the DB model, trying to figure out how things work. I was strongly adviced not to make any use of any sort of view that started with Filtered. "It would make you cry", they said,...
More on Data Layer Componentization
A while ago I commented on Alex's data-layer componentization idea. Alex replied, and Diego as well, but I didn't get around to answering them until now. To remind you, the basic idea is: A data-layer component simply allows for information to be shared between different applications. By supporting re-use data layer componentization allows you to break the duplication habit. You can still have silos, but the natural tendency to duplicate data disappears. Each new silo will instead contain only new types of data and where old types are required, simply pointers, or cross silo foreign keys. I am...
Schema to wince by...
I was just sent this table schema (minor modifications to protect the responsible party): CREATE TABLE [dbo].[tblEmployees]
(
[iEmployeeSirial] [int] IDENTITY (1, 1) NOT NULL ,
[vcEmployeeID] [nvarchar] (15) COLLATE Hebrew_CI_AI NULL ,
[vcEmployeeName] [nvarchar] (50) COLLATE Hebrew_CI_AI NOT NULL ,
[vcAddress] [nvarchar] (60) COLLATE Hebrew_CI_AI NULL ,
[vcCity] [nvarchar]...
Re: Saving to Blob
Mats (NPersist) is talking about Saving to Blob, basically suggesting that you can turn your persistence strategy into Table(Id, Blob): An alternative would be to serialize the non-identity fields into a blob of some kind and save that to just one column in the database. The database table would then have primary key columns matching the identity fields and then just one more blob column for the serialized non-identity fields of the object. Since this basically prevent any useful filtering on the data, he suggest that you can extract columns outside the blob, much in the way we currently...
Re: SSIS - 15 Faults Rebuttal
Seems like this is the new trend right now :-) Phil Brammer has posted a rebuttal to my original post. Here are my answers: Bad Errors:You have to understand though, that this isn’t .Net. SSIS has many components/engines at work that obtaining the correct error isn’t always at the heart of the SSIS engine. It could be a database error. It could be an ADO error. Whatever it may be, I agree, some are cryptic, but I’ve generally been able to diagnose my errors. And if there is an error I don’t know about, I contact the community and finally...
Free doesn't give it a license to suck
Phil Brammer has a posted a comment here that really annoy me, rebuting To Adolf... You wrote, "It's the windows 98 of the database world." B.S.!! It's free. Go buy Informatica or something. Jamie's comment regarding the budget is true. These guys built a product right out of the gate to fulfill an ETL space in their tool belt. They did it, and packaged it with SQL Server. You buy SQL Server you get SSIS. Pretty neat, huh! So until you have to pay for it, it's a touch absurd to say this product sucks. It's pretty dang powerful...
SSIS: The backlash
Jamie Thomson has responded to my I Hate SSIS post, he agreed that most of them are valid concerns, but he also brought up some counter arguments that I wanted to respond to. The first thing that I wanted to mention is that JT has a solution for watching variables content, and I have updated the previous post & the "I Hate SSIS" page accordingly. Now for the parts I disagree with: Ayende: I wish I had a dollar for every time that SSIS kept track of something it shouldn't. Be it the old configuration, hard coding the connection string...
Rhino ETL: Thinking about Joins & Merges
Well, I think that I have a solid foundation with the engine and syntax right now, I still have error conditions to verify, but that is something that I can handle as I go along. Now it is time to consider handling joins and merges. My initial thinking was something like: joinTransform UsersAndOrganizations:
on:
Left.Id.ToString().Equals(Right.UserId)
transform:
Row.Copy(Left)
Row.OrgId = Right["Organization Id"]
The problem is that while this gives me equality operation, I can't handle sets very well, I have to compare each row vs. each row, and I would like to do it better. It would also mean having to do everything in memory,...
The Small Comforts of Life
Today I managed to capture a screen shot of an SSIS error that had drove me crazy, and I sent it to my boss, it looked something like this one. I had the pleasure of hearing him repeating "But that is not possible" five or six times, it sounded familiar, that is what I had said when we started to run into this. As an aside, I have create the I Hate SSIS page on my wiki, there is a impressive number of issues up there.
Rhino ETL: First Code Drop
First, let me make it clear, it is not ready yet. What we have: 99% complete on the syntax Overall architecture should be stable The engine works - but I think of it as a spike, it is likely to change significantly. What remains to be done: Parallelising the work inside a pipeline Better error messages More logging More tests Transforms over sets of rows Here are a few works about how it works. The DSL is compromised of connection, source, destination and transform, which has one to one mapping with the...
Rhino.ETL: Full Package Syntax
Okay, here is the full package syntax that I have now, which is enough to express quite a bit, I am now getting started on working on the engine itself, I am going to try the message passing architecture for now, since it is much more flexible. connection(
"NorthwindConnection",
ConnectionType: SqlConnection,
ConnectionString: "Data Source=localhost;Initial Catalog=Northwind; Integrated Security=SSPI;"
)
source Northwind, Connection="NorthwindConnection":
Command: "SELECT * FROM Orders WHERE RequiredDate BETWEEN @LastUpdate AND @CurrentDate"
Parameters:
@LastUpdate = date.Today.AddDays(-1)
@CurrentTime = ExecuteScalar("NorthwindConnection", "SELECT MAX(RequiredDate) FROM Orders")
transform ToLowerCase:
for column in Parameters.Columns:
Row[column] = Row[column].ToLower() if Row[column] isa string
destination Northwind, Connection = "NorthwindConnection":
Command: """
INSERT INTO [Orders_Copy]
(
[CustomerID], [EmployeeID], [OrderDate], [RequiredDate], [ShippedDate],[ShipVia],
[Freight],[ShipName],[ShipAddress],[ShipCity],[ShipRegion],[ShipPostalCode],
[ShipCountry]
)
VALUES
(
@CustomerID,@EmployeeID,@OrderDate,@RequiredDate,@ShippedDate,@ShipVia,@Freight,
@ShipName,@ShipAddress,@ShipCity,@ShipRegion,@ShipPostalCode,@ShipCountry
)
"""
pipeline CopyOrders:
Sources.Northwind >> ToLowerCase(Columns: ['ShipCity','ShipRegion'])
ToLowerCase >>...
Rhino.ETL: Turning Transformations to FizzBuzz tests
Tobin Harris has asked some questions about how Rhino.ETL will handle transformations. As you can see, I consider this something as trivial as a FizzBuzz test, which is a Good Thing, since it really should be so simple. Tobin's questions really show the current pain points in ETL processes. Remove commas from numberstransform RemoveCommas:
for column in row.Columns:
if row[column] isa string:
row[column] = row[column].Replace(",","")
Trim and convert empty string to nulltransform TrimEmptyStringToNull:
for column in row.Columns:
val = row[column]
if val isa string:
row[column] = null if val.Trim().Length == 0
Reformat UK postcodes - No idea from what format, and to what format, but let...
Rhino.ETL: Providing Answers
It would be easier to me to answer a few of the questions that has cropped up regarding Rhino.ETL. Boo vs. Ruby: Why I choose to go with Boo rather than Ruby. Very simple reasoning, my familiarity with Boo. I can make Boo do a lot of stuff already, I would have to start from scratch on Ruby. I don't see any value in one over the other, frankly, is there a reason behind the preference? NAnt ETL Tasks: The main problem I have with such an endeavor is that it is back to XML again, if I want to...
Framework building: Rhino.ETL Status Report
I am currently working on making this syntax possible, and letting ideas buzz at the back of my head regarding the implementation of the ETL engine itself. This probably requires some explanation. My idea about this is to separate the framework into two distinct layers. The core engine, which I'll talk about in a second, and the DSL syntax. One of the basic design decisions was that the DSL would be declarative, and not imperative. How does this follow, when I have something like this working: source ComplexGenerator:
CommandGenerator:
if Environment.GetEnvironmentVariable("production"):
return "SELECT * FROM Production.Customers"
else:
return "SELECT * FROM Test.Customers"
This certainly looks like...
Idea: The Boo ETL DSL
Note: This is just syntax idea that I have in my head right now, it has no implementation. I have decided to do something about, and spent some time thinking about it, here is the initial design. It is a textual DSL for ETL. What you can see directly below is in a direct correlation of a lot of the stuff that I need to do with SSIS. The first one is basically a Data Flow task. The script goes like this: Define two data sources, and you can see that I ma using named connection strings in...
Why I hate SSIS: Part N+1
Take a look at this (highly simplified) package. The flow is very simple, truncate shadow tables, copy to shadow tables, perform cleanup and copy to the real tables. The inside of each data flow is simple Source -> Convert to Unicode -> Destination. Not really complicated or ground breaking, right? The shadow tables have the same structure as the real tables, but they have no FKs. It failed on me, in production, but only when running as a job! The error? Primary Key violation in the Products Statuses and Order Statuses tables. I have verified that: I am...
SSIS' 15 Faults
I dislike SSIS intently, and I say this as someone who has done two projects using it, and has spent much time recently struggling to work with it .Instead of harping how much I dislike, I decided that it would be better to list the things that I find it so hateful: Bad Errors: The number one reason that I am using .Net instead of C++ is not the garbage collection, it is the errors. Clear, understandable errors, with the ability to trace them back into their source. Errors that gives you the reason for what happened. SSIS' errors...
private Database myDatabase;
This is an integration story. Once upon a time there was the need to integrate between two systems. A meeting (blah!) was scheduled and it was decided that web services are an appropriate integration technology. Fast forward some time, and now you get a strange situation where the other side has found that they are unable to use web services because of technical limitations in Java. Sound suspicious to me, but whatever. How do you suppose we will integrate, then? Well, they know how to talk to a database... So, they can't figure out web services but you want to talk...
You are allowed to disagree, too
Just to clarify, the other side of this post in this one. Rob Conery says that he landed on my bad side. Allow me to clarify, I may disagree with with Rob, but I like this types of discussions. Oh, and Rob, in the last week I posted 55 posts, 5 of which were about this discussion. Considering that this is usually how much I post in a week month, and considering that this is a topic which is (a) near and dear to my heart, (b) I disagree with Rob on this matter, I wouldn't be surprised to see more coming soon....
What does it takes to calculate a sales tax?
Lately a few people have pointed out at sales tax calculation as a simple business logic that can be done in the database. Let us consider what is involved in such a task, shall we? Well, on the first iteration we have: The sales tax is 10% of the listed price. CalcSalesTax(price): return price * 1.1 On the second iteration, we learn that we now need to support more than a single location, so we need to keep an association of location to sale tax. CalcSalesTax(price, location): salesTax = locationsToSalesTax[location] return price * salesTax On the third...
A developer perspective on SQL Server 2008
I just read this about the SQL Server 2008 features, and I thought to provide my own commentary on the new features: Declarative Management Framework - not really interesting to me. Table Value Parameters - YEAH, that is more like it. There are too many cases when I need to use IN on a large data set and I run into the 2100 parameters limitations. This is going to make working with this so much easier. MERGE SQL Statement - I got tired of writing my own Update/Insert very early. This also means that the tools will support it as...
Generating Random Assoications
Several times recently I had to created what is basically random assoications between two sets of tables. The problem is usually trying to get some data for the UI. Here is what I came up with: update Policies set Status = (select top 1 Id from ...
Roy Osherove, It is Time for Violence
No, I don't intend to beat him up, I just really like the title :-) After infecting me (and pretty much everyone else) with witty songs about development, Roy has released one of his songs: Time for Violence. It is my favorite song of the three or four that he had performed at DevTeach, and I keep humming to myself. Probably because it reflect so well the way that I think about "some databases" that I won't mention by name.
Problem Solving skill - the wrong way around
A friend just sent me this: secureStr("select 1 from Users where name = '' dro;p table Users") You would think that people who are aware of sql injection would understand about parameterized queries!
Transaction Isolation Levels
Craig Freedman has a few posts that explains some of the pitfalls of the different isolation levels, very interesting stuff: Read Committed Isolation Level Query Plans and Read Committed Isolation Level Repeatable Read Isolation Level A developer really should understand the implications of this,...
Converting an object collection to a DataSet
There are many tools that tends to work only with DataSets, the most often cases are reporting tools or data driven systems. I consider this an issue with the tool, usually, but this is a fact of line. Case in point, I want to display a report of customer objects, I say objects here because I retrieve them through an NHibernate query + business logic that can't really be done in a stored procedure. At the end, I get a collection of customer objects, and I need to pass that to a reporting tool...
ODBC is not an API
Phill Haack is talking about integration from the database point of view. I had the pleasure of dealing with several projects that involved interfacing to legacy systems. Invariably, the suggested solution was to use the legacy system database directly, and build the application from that. It was assumed that this would be a less costly endevour than building interfaces at both ends and hooking them together. A few stored procedures and many (many) views later, we had the integration completed on the side of the legacy system. Then it was time to...
SSIS: I know better than you do
If you haven't guess it by now, I am not fond of SSIS. The latest installment is probably a "feature". Assume the following, I develop an SSIS package on my local machine, testing it against a local database. Now, I want to run the package against a remote database. I did the Right Thing and put the connection string in a data source, so I change that and run the package from Visual Studio. Imagine my surprise when Visual Studio does a full cycle, including reporting the number of rows that it copied....
SSIS: You really don't need all this data
Here is SSIS deciding that "No, you really don't need to move this data." I really like the "smarts" that went into that engine. I make sure that I keep busy and don't move on to othe areas of the application, but dedicate my full an undivided attention to SSIS, as it is apperantly should be. Would it be too hard to provide a deteministic engine? SSIS would run only (random) parts of the sequence container if I right click it and tell it...
That might take a while...
Yesterday I started a run of a process that should run every 5 minutes, I got annoyed when it took too long and went home. I just got back to see this number: Maybe I need to rethink the architecture. It is still working on it. The really scary thing is that the select count(*) that I also left running didn't finish in ~16 hours or so. And just to clarify, this is an OLTP DB, not a data warehousing one.d
SSIS Integration Woes
I am not sure if it is me or SSIS, but something is very wrong here. I just spend an ungodly amount of time trying to solve this simple scenario: Source database is Oracle, Destination database is SQL Server A table with large amount of rows The table has a LastModifed field ...
Remote Keys: Cross Database Foreign Keys
Here is an interesting product (via Larkware). RemoteKeys Adds cross-database constraints to SQL Server, which is something that is really nice in many scenarios. Specifically, I am a firm believer that database == application, and that separate applications should use separate databases. (Versioning, deployment, WhoBrokeMyQuery, etc). This product seems to give a nice answer to Referential integrity concerns. I was interested to know how they did it, and it looks like a good UI on top of Linked Server + INSTEAD OF Triggers. I really like the idea, but a couple...
Data Migrations Woes
For some reason, customers insist on putting their data in a database, which is all fine and dandy. But then they also insist on taking it out! Which is hardly fine at all. The fun part starts when the source is a little known database called Oracle, and the destination is a new arival from out of town, who goes by the nick name SQL Server. Fun stuff that I run into today: PK in Oracle are Number, in...
Transactions and concurrency
I just had a head-scratching, hair-pulling, what-the-hell-is-going-on bug. Basically, a classic producer / consumer issue. Each work item is composed of several sub-items, which should always be processed as a single unit (exactly one time). The problem was that I suddenly started to get duplicate processing of the same work item, but with different sub-items each time. I went over the code with a comb, and I couldn't see anything wrong, I investigated the database, and everything was fine there as well. I knew that I was in trouble when I considerred...
Unit Testing Database: Embedded vs. In Memory..
A while ago I extolled the benefits of using an in-memory database for tests. Now, the only in memory database that I know of (that has a ADO.Net provider) is SQLite. SQLite is a great database, except for one tiny issue. It is has a really weak support for dates, requiring your to jump through multiply hops to do anything even slightly interesting with dates. Since I am mostly working on business applications, strong support for dates is a crucial issue for me. With great relucance, I moved my tests to use SqlCE...
From SQLite to SQL Server Compact Edition
I am giving up on SQLite as my embedded DB of choice. The reason for this is very simple. I really want to be able to use the in memory features (and other nice stuff it has to offer), but it has the worst support for date/time manipulations that I have seen. I am trying to formulate a query that should be along the lines of: select * ...
NHibernate Web Cast Series: Thinking about the infrastructure
Here is a status report about the NHibernate Course that I talked about previously. I have talked with Justin, and at the moment is looks like we are going to do a web cast series about NHibernate. The draft syllabus can be found here, although this is merely a list of topics that we would like to cover. We are thinking of roughly 30 hours(!) or so of web casts, which should be enough to cover NHibernate from end to end, and maybe leave me out of a job :-). A you can imagine,...
Messy database: NHibernate as the maid...
If you ever had to work with someone else's database (I assume that yours are perfect), you know that not all databases are nice and clean. In fact, some of them fail to do even simple things, like foreign keys. In some cases, FK is not even possible (think a view over a main frame table that you don't control). In such cases, almost invariablely, the database slightly corrupts itself. Sometimes it is a programmer error, sometime a failed set of commands without transactions, etc. The end result is that you get an...
Complex Queries With Active Record
There are some things that you simply can't do effectively with an abstraction. Take Hierarchical Queries as a good example. They are very hard in SQL, there is no good way to solve the issue in a portable way. Therefor, each database has its own method to handle this issue. In SQL 2000, you would build a table valued function that you would query, in Oracle, you use CONNECT BY, and in SQL 2005 and DB2 (that I know of) you can use Common Table Expressions. Out of the box, NHibernate can't handle this,...
Why this is failing?
Consider the following query (heavily reduced and thus rendered meaningless)... select * from Users user0_, Forum forum1_, ...
Unit testing with NHibernate / Active Record
One of the more difficult regions to test in an application is the data access layer. It is difficult to test for several reasons: It is usually complicated - fetching data effectively is not something trivial in many cases. It can be highly dependant on the platform you are using, and moving between platforms can be a PITA. It is usually hard to...
NHibernate Query Generator 1.5 Is Done! [Time With Magic]
I have been working (and blogging about it) for a while, and I think that I finally hit the right combination of keys to make it compile, so I am shipping it :-) If you fail to recall, it started as a simple way to generate criteria queries in strongly typed manner. The first version didn't do much, and was fairly simple. As soon as I started using it, I began to want more, much more. If I was working with an open compiler (Boo), I could have written this as an extentions and...
Bringing Active Record and NHibernate Query Generator Together
This time, I am going to try to do it without commentry. The next post is where all the fun will begin. NHibernate Query Generator has been updated so it could work with Active Record as well. The way it works is simple, point it at an Active Record assembly, and it will spit out the generated query files. Usage: NHibernate.Query.Generator <cs or vb> asssembly.dll <output-dir> It supports generating C#...
Active Record Rocks!
I have been head down in NHibernate for the last several months, and as a result, I think that I started to miss just how enabling Active Record really is. This weekend I have been working on with it with a venegance, and I love it. The NHibernate 1.2 integration is really important, because it allows to infer even more stuff! Allow me to present, in all its glory, the amount of stuff needed to make a class persistent: The collapsed properties merely hide the get/set...
SQL Refactor
So I got the beta of SQL Refactor, opened SQL Management Studio and liked what I saw: The first thing that I noticed, more important than anything else, is that it has a SQL Beautifier. This is the first I have seen this for T-SQL (there seems to be a lot for Oracle, though), and the first where the beautifier is integrated directly into SQL Management studio. Why is this important? Well, check this out (from Northwind): ...
Should you use NHibernate with Stored Procedure?
In my previous post about NHibernate and Stored Procedures, I showed how it can be done, and I closed of with this:The disadvantages - You take away from NHibernate the ability to comprehend the structure of the tables, this mean that it can't do joins, eager fetching, etc. This is extremely important capability that just vanished. Likewise for queries, NHibernate's abilities to query your objects is being severly limited using this method of operation. Galen commented: It sounds like the following two...
The little query that could... drive me crazy
I have a piece of code that has to calculate some pretty hefty stuff over a large amount of data. Unfortantely, that large amount of data took large amount of time to load. By large amount I mean, I walked away and had time for a coffee, chit chat, about three phone calls and a relaxing bout of head banging, and it still continued to pry into the database, and likely would continue to do so until the end of time or there about. This calculation has two main charactaristics: ...
The really bad side of composite keys...
No, I'm not going to talk about the hard to create/understand queries, performance, or maintainability. The really bad side of composite keys is that you get assoications using part of a key. Here is an example: Now, start working with the associations here. Payments are linked to salaries that falls within their date range and belong to the same employee. Needless to say, this breaks down FK support, so you will end up with bad data that you need to clean. No to mention that they queries that you...
Using SQL Functions in NHibernate
Often, when introducing NHibernate, I need to integrate with existing database and infrastructures. Leaving aside the question of stored procedures (since I already expanded on that in length here), which are avialable on NHibernate 1.2, I want to focus on using SQL Functions here. (One again, I'm back to the Blog -> Posts -> Comments model) Now, there are four types of SQL Functions that you might want to use: Scalar functions that are a part of an entity....
PluralizingNamingStrategy for NHibenrate
One of the nicest parts of developing with NHibernate is that you can get NHibernate to generate the table from the mapping. This is extremely useful during development, when changes to the domain model and the data model are fairly common. This is even more important when you want to try something on a different database. (For instnace, you may want to use a SQLite database during testing / development, and SQL Server for production, etc). I intend to post soon about unit testing applications that uses NHibernate, and this functionality is a extremely important in...
There Be Dragons: Rhino.Commons.SqlCommandSet
After last night's post about the performance benefits of SqlCommandSet, I decided to give the ADO.Net team some headache, and release the results in a reusable form. The relevant code can be found here, as part of Rhino Commons. Beside exposing the batching functionality, it is very elegant (if I say so myself) way of exposing functionality that the original author decided to mark private / internal. I really liked the declaration of this as well: [ ...
Opening Up Query Batching
I have ranted before about the annoying trend from Microsoft, to weld the hood shut in most of the interesting places. One particulary painful piece is the command batching implementation in .Net 2.0 for SQL Server. The is extremely annoying mainly because the implementation benefits are going for those who are going to be using DataSets (ahem, not me), but are not avialable to anyone outside of Microsoft. (See topic: OR/M, NHibernate, etc). Today, I have decided to actually check what the performance difference are all about. In order to do this,...
Measuring NHibernate's Queries Per Page
One of the biggest problems with abstractions is that they may allow you to do stupid things without them being obvious. In OR/M-land, that usually means SELECT N+1 issues.The problem is that you often develop a certain functionality first, and only then realize that while you tested, all was fine and dandy on the five items that you had, but on the real system, you have 5,000, and the DBA is on its way to ER...Anyway, I am currently working with Web Applications, and I wanted to get a good indication about what pages are troublesome.Being who I am, I...
How to quickly judge the quality of a database
When I start using a new database, I usually run the following queries to
find how good or bad the project is going to be:
select count(*) as [Number Of Tables] from information_schema.tables
Anything bigger than a couple of hundreds, and I start to feel
really nervous. Ideally, it is around 20 - 50 main ones, and maybe
additional dozen tables for constants. I have worked on databases where then
numbers are in the thousands (and yes, I did touch every table)
select
table_name, count(*) as [Number Of Columns In Table] from information_schema.columns
group by table_name
order by count(*) desc
If you see a table with...
The Mighty Server and the Killer Queries
I just read a comment that really annoyed me, talking about the lack of need for caching.These days DBMS are not just a “persistent storage”; a DBMS is designed to handle many concurrent hits and transaction processes using multi CPU support and terabytes of memory pages. I don’t understand why we developers are so obsessed with the number of hits. Just to note, I have caused a 32Gb, 64bits, 4 CPU Server to weap tears of shame when I run 900 concurrent queries traversing just under Billion rows. It worked, but it wasn't pretty....
Spot the WTF
Thankfully, I haven't run into this one yet, but can you spot the WTF here?
NHibernate Generics 1.0.10
NHibernate Generics 1.0.10 is a small bugfix release that is intended to allow inheriting from EntityList<T>, EntitySet<T>, etc. Check here for the scenario that was fixed. As usual, binaries and code are here
I want one of those: Active Writer - VS Designer for Active Record
Check out the pretty pictures! Even better information at the source. Gokhan Altinoren has done some tremendous work there. Can't wait to get the bits.
Working with high levels tools: A Performance Perspective
The performance question was raised in the Castle's Forums, hammett has posted a blog entry about it, but I have my own two cents to add. The question was about a performance test made with NHibernate vs. ADO.Net, resulting in NHibernate being quite a bit slower. Just to note, at the moment I'm talking about performance in terms of milli seconds, I'll talk about performance in terms of days in a bit. A test between ADO.Net and NHibernate is going to be meaningless for the most part. Sure, you can issue a SELECT to get...
Fixing log4net 1.2.9 AdoNetAppender NULL bug
In log4net 1.2.9 there is a bug in the AdoNetAppender. It will not log null values appropriately. This bug is fixed in log4net 1.2.10, but it is not always possible to move to the next version. (In my case, both NHibernate and Castle uses log4net 1.2.9) I tried recompiling everything to use 1.2.10, but it took too long, and eventually I simply wrote this little trigger: CREATE TRIGGER [ReplaceNullLookAlikeWithNulls] ...
Riddle: Active Record as a Rule Engine
I mentioned in a previous post that I used Active Record as a Rule Engine, and a couple of people asked just how. Before I will post the answer, I want to see if you can come up with the same idea. The scenario: A rule engine that can process big amount of data. The engine will run as a Windows service, and not as a part of another application.Throughput is important, as well as flexibility in the rules and what they can do. The processing of each rule can be quite complex,...
Pessimistic Locking
I need to be able to ask the database for some rows to work on, and I need to be sure that I am the only one that works on those rows. After thinking about it, I came up with this SQL: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE BEGIN TRANSACTION ...
ADO.Net Entity Framework: Round II
After my last post about the ADO.Net Entity Framework, I got a long comment from Pablo Castro, the ADO.NET Technical Lead. I took the time answerring it, mainly because I wanted to consider my words carefully. Again, I'm probably not a partial side in this matter, with my strong NHibernate bias, but I do have quite a bit of experiance with Object Relational Mapping and how they map to real world applications. My previous post was mainly about extensibility in the framework, and how much I can extend it to fit the needs...
NHibernate's "Stored Procedures" - Named Queries
First things first, don't get too excited, NHibernate doesn't support Stored Procedures (yet). What I am talking about is encapsulating HQL queries in a similar manner to the way stored procedures work. A little background before I start talking about the technical details. When using NHibernate, I'm usually pushing the use of NHibernate's Criteria approach, since this is by far the most easy way for people to comprehend how to query the database using NHibernate. While for simple uses HQL is very simple: ...
Better solution to the Tricky SQL problem
Moran has pointed me to the CROSS APPLY syntax in T-SQL 2005, which does allows joining against a table valued function. Using this, the query goes down to this: SELECT EmpName, ...
Solving a Tricky SQL Problem
I'm starting to get quite a bit of mail from this blog. Some of those questions are about subjects I can answer immediately, some require a fair amount of work (which can be had, if you really want), and the more interesting ones are those that require some thinking, but does not require too much time. This question from Dave is the best one so far, and I got his permission to blog about it, so I'm doubly happy. The issue is working against a legacy database to get the data for...
ADO.Net Entity Framework
I'm currently reading the documentation for ADO.Net Entities Framework. As always, I'm using this as a scratch pad for what I think as I read this. Next-Generation Data Access - June 2006: They still allow access to lower level services with the Entity Framework. Lower level is apperantely IDbConnection and IDbCommand. They mention the 80% brick wall, which is encouraging. The problem is that I don't like this solution. The framework should be flexible enough that I could plug in at all...
Command Batching in ADO.Net 2.0
I got one question about this. How am I supposed to do this? No, using a data adapter is not an option, I want to create my own commands and batch them. Searching gave no answer, and Reflecting over the relevant classes bring me to a whole mess of internal classes and methods in System.Data. Any ideas? I would really like it to be DB agnostic (or be able to query if I can batch or not), but even something for SQL Server only will be helpful....
On System.Transaction
This is an excellent post about System.Transactions, which include a very concise example of how to build your own resource manager for integrating with System.Transactions. Highly recommended.
The Wheel, slightly eliptic
A similar piece of code ( in PL/SQL and cursors ) kept me busy for most of the day. I have translated it to C# because I really don't think that I could stand seeing another cursor today. Figure out what it does (it is tricky). And see if you can suggest an improvement. Both SQL and C#/ VB/ C++ (whatever) are accepted. public void You_Commited_Grave_Sin_In_A_Past_Life_And_Are_Now_Being_Punished() ...
NHibernate Best Practice - Assert Cascade Behaviors
I've said it before, one of the trickiest parts on NHibernate is the cascade behaviors. Why is it tricky? Because a slight configuration can cause the application to get exception at best, or silently lose data at worst. How does this work? Let us assume that I have the Blog -> Posts model, and I want to create a new post. I can do this like this: using(ISession session = factory.OpenSession()) ...
Date Merging Challange Solved
I will give some wierder problem next time, Scott solved the problem (the simple one and the harder) in a really nice way. What more, he posted about how he solved the problem, and this is great. I often don't bother to explicitly think about the step to a solution, so it was a pleasant surprise to see how he got to the answer, not to mention that anyone reading it gains some valuable insight about how to solve those kind of problems. Great job Scott!
SQL Challange, Date Merging
...
25 Reasons Not To Write Your Own Object Relational Mapper
Not long ago I was ask to evaluate what features are need to build a robust data access layer. The choices were:
Build / Code Gen a DAL
Use an existing Object Relational Mapper
Build an Object Relational Mapper spesific for this application.
DataSets were not appropriate to this application, so the the first and last choice were essentially the same. The database schema...
DLinq Answers
I posted this two days ago, and I got some answers to share. I'm not an expert on DLinq, so this isn't writ in stone. DLinq supports lazy loading, but call it delay / deferred loading. Not sure why they pick a different term than the standard one in other O/RM. It also supports rich eager fetching support, which is...
Number Of Joins
When people want to say that a database was normalized too much, they generally say "It takes seventeen joins to get the username an email." Am I the only one who noticed it? Can you guess why this is the number that repeats itself?
The Prison Domain Model Challange: Answers
Chris Bilson give a nearly complete answer of how to find the valid / invalid commands for an inmate in the comments of the post. It is a very elegant solution, in my opinion. I took upon myself to improve his solution a bit. The result is this query: SELECT ...
You're a DBA If...
This list had me laughing out loud too often.It is great.
Bulk Data Operations Using NHibernate
If there is one thing that NHibernate (and most other O/RMs) are not very good at, it is bulk data operations. What does it mean? Well, let's take a simple case of wanting to delete all the records in a table. (I'm going to use Active Record Blog Sample for the sample, it is much simpler). Let take this, for instance:Post.DeleteAll(); This will delete all the rows from the Posts table. Really easy to write or do, but what is the meaning here? What goes under the cover is something like: ...
Primary Keys: Identity vs. Guids
There always seems to be a great debate with regard to what data type will be used as the primary key for tables. The debate is mostly centered on whatever we should use identity fields vs Guids. Guids has the advantage that they are globally unique (duh!), so they are a good solution for the cases where you need to merge data from several sources. Identity fields are easier to understand (Quick, what is the difference between 8833075D-8861-4324-8814-421BD5F04C7D and AD7FF558-DFCB-4354-9D68-C1DD926BC22A) and to debug since you can easily see the difference between concecutive rows. They...
One upon a time, in an IT operation far far away...
Enough bullshitting, lets start from the the following scenario, we have an application that need a database (a rare event, you can imagine).We have three tiers of users here, normal users can view data, editors can change data and the administrator can do whatever they want. It's a simple model, but it is enough for this example. Here are several security treats for the application, depending on the way it was implemented.A hacker gained access to the database and is able to send any query that it wants to the server.If we were using the Stored Procedures (SP)apporach, he doesn't...
You really shouldn't write your own O/RM
You do not have the resources to do as effectively as the existing tools in most scenarios. There is a reason that Nhibernate got over 100,000 Lines of Code. The simplest O/RM is just mapping from columns to properties. This is repetetive work that can be done by a code generation tool easily. Then you realize that you need: Transactions Concurrency Control ...
You can't escape the complexity
For every complex problem, there is a simple, elegant solution. It is also wrong. Sahil is talking about the pitfalls of DLinq, how a seemingly minor change in the code created a drastically different query which drastically different performance implications. Just to note, this is not unique to DLinq, it's something that exists in NHibernate, or any other O/RM in the market. O/RM is a leaky abstraction at best, I have said it before, and I will probably say it many times in the future. This mean that you need to...
Advantages of Stored Procedures?
Eber Irigoyen had this to comment on my post about SPs:...the point is how much scalability, flexibility, security, etctirity the SP give you, remember the rule of thumb, always use the right tool for the job, use what will give you the most benefits in the long run I completely disagree with this, if you didn't catch on so far. Here is my refutal: Scalability: ...
Avoid Mixing O/RM and SQL
It is common to people that are starting use O/RM to get into a situation that they don't know how to solve using the O/RM (or even aware that such a thing is possible). The easiest example is loading just Id and Description for a certain class without loading all the (possibility expensive) properties. There are several ways to do this using most O/RMs, but they are usually not obvious to the beginner. This is usualy a pain point since they can easily see how they would do it using plain SQL,...
Shuffling a list in SQL
Let's assume that you've a table with a position column that you use for indexing or orderring, and you want to shuffle the ordering of items there. You can do it with a single query like this one: UPDATE Attributes SET Position = CASE position WHEN @From THEN @To ELSE Position+1 END WHERE ReferalID = @Referal and Position BETWEEN @To AND @From ...
O/RM are not Magic (but they are close)
I just got a question in Email about NHibernate. The question was using composite keys (key with several columns) and an IN. NHibernate doesn't allow this, and the question was why, and how can it be solved. Well, it took me a couple of minutes to figure out what is happening. NHibernate is not allowing this because there isn't a good story to support this type of functionality. To my knowledge, there is no way to do an IN over multiply queries. The nice thing about NHibernate is that it allows you to do...
Using O/RM Properly
Steve has a post comparing DLINQ to other O/RMs, I'm not sure which O/RM solution he is talking about, but it is either not very good and the develpers used the five minutes walk through to do everything. A home grown one may be implemented so naively, but most real ones has ways to handle this properly. Here is how I would use his examples in NHibernate, notice that at no point I lose any performance. Summing:select sum(p.UnitPrice) from Product p Will result in the correct SQL...
SQL Perfromance Reference: Joins vs. Seperate Queries
I need to find the reference / reasoning for the following argument:A single query using a join will be more efficent in server resources than several queries to get the same information. For instance, consider this query: SELECT Users.Name, ...
On Toads
I don't like Oracle's tools. Today I had a chance to work with Toad a little bit, and I found that the UI is more cluttered than I am used to, but it has several features that I want for SQL Server Management Studio: Color highlighting for columns and tables Auto correct form to from (this is huge, I make this mistake all the time). ...
My WTF of The Day
The following tidbit caused quite a bit of problem, and a loss of a day for two people. I'm working on legacy system right now, and part of this work is refactoring stuff without breaking the behavior. The process is quite big, but we managed quite fine until we suddenly started to get wildly different results. We traced is a dozen times, and eventually we narrowed it down to code that looked a bit like this (PL/SQL): declare...
Clemens on O/R Mapeers: Take II
Clemens has a new post about O/R Mappers, and this time he brings up several relevant points that I agree with :-). He brings up a comment (by Scott E) from the previous post: I've hiked up the learning curve for Hibernate (the Java flavor) only to find that what time was saved in mapping basic CRUD functionality got eaten up by out-of-band custom data access (which always seems to be required) and tuning to get performance close to what it would have been with a more specialized, hand-coded DAL. This comment seems...
More On The Object Relational Siren
Udi Dahan has a post about Clemens anti O/RM post. I really liked his conclustion, and the reasoning behind it. Beyond that, I wanted to comment about knowing SQL & utilizing O/RM to the fullest. If you don't know what is going on under the covers, you'll not be able to diagnoze and fix performance problems. So using O/RM doesn't free you from understanding DB design and understanding how SQL works.
More about Object Relational Mappers vs. Custom DAL
Continuing my reply about Celemens' O/R post, I thought that I would give a more real world example of the secnarios. Here is my schema: I think that this is a pretty standard schema (if a bit simplified) for shipping orders to customers. Now, let's say that I want to show the customer all his orders, and let him see the details of each order. Here is my objects for this application: So far so good, I would...
Transactional Data Defination Language in SQL Server 2005
Did you know that you can have transactions on DDL as well? I run into this more or less by accident, and it is very cool. Check this out:
BEGIN TRANSACTION
CREATE TABLE Foo( Num int) INSERT INTO Foo VALUES (1) INSERT INTO Foo VALUES (2)
SELECT * FROM Foo
ROLLBACK
SELECT object_id(N'Foo') as FooId
The results of this query are:
Num-----12
FooId------NULL
The CREATE TABLE statement was rolled back! I usually don't approve of using DDL in the usual course of things, but what really impressed me was this query:
CREATE TABLE Foo( Num int)INSERT INTO Foo VALUES (1)INSERT INTO Foo VALUES (2)
BEGIN TRANSACTION TRUNCATE...
SQL Server 2005 and database snapshots
I was playing with SQL Server snapshots today, and they are very cool. The idea is that you can freaze your database at a point in time, and always refer back to it. The problem is that it costs you in terms of performance, and it may cost dearly.
The way that database snapshots are implemented is copy-on-write to sparse files, so when you create a snapshot, a sparse file (a file that is filled with zero, which the OS knows not to save to the disk, so you don't waste space). Any time a page has to be written, it's...
Clemens on O/R Mappers
Clemens has a post about O/R Mappers. From his post: Another argument I keep hearing is that O/R mapping yields a significant productivity boost. However, if that were the case and if using O/R mapping would shorten the average development cost in a departmental development project by – say – a quarter or more, O/R mapping would likely have taken over the world by now. It hasn't. And it's not that the idea is new. It’s been around for well more than a decade....
Refactoring SQL
Okay, I just went through a SQL refactoring tha I consider interesting. We'll start with this code:
Cursor transactions IS
SELECT L.CustomerId ,
Sum(Y.Debt) Sum_Debt ,
Sum(Y.Credit) Sum_Credit
FROM Transactions Y join
Customers L
ON L.CustomerId = Y.CustomerId
AND L.Status NOT IN (1 , 2 , 3)
GROUP BY L.CustomerId
For rec0 IN transactions Loop
UPDATE CustomerStatus
SET Debt = rec0.Sum_Debt,
Credit = rec0.Sum_Credit
WHERE CustomerId = rec0.CustomerId
End Loop;
The above is using Oracle syntax, which I find much saner for cursors than what SQL Server requires...
Price of Indirection
How goes that old saying?
Anything in computers can be solved by adding another layer of indirection, except too many levels of indirection.
I find myself contemplating that a lot lately. I'm fighting performace troubles quite a bit lately. And right now my pet devil is this creature and its friends:
CREATE VIEW LastBusinessDayTransactions
AS
SELECT * FROM Transactions
WHERE TransactionDate = (SELECT max(TradeDate) FROM TradeDates
WHERE TradeDate < (SELECT max(TradeDate) FROM TradeDates))
This is supposed to be a view that will return the transactions from the lats business days (assuming that the max trade date is...
Profiling SQL Server
I know about SQL Profiler and Execution plans. What I need is a way to measure how much a stored procedure cost in time.
What I mean is that I've this:
Start
Step 1
Step 2
Step 3
Step 3.1
Step 3.1
I want to know how much time each procedure takes, preferably in real time "The current running procedure is X, started fifteen minutes ago".
Any ideas?
Piping, Primary Keys and Three Hours I'll not get back
Okay, so I mentioned that I'd a query that violated a primary key constraint, and I couldn't figure out why. Well, I finally got a hold of what was going on. It's simple and non-obvious at the same time. Let's start with a code example that shows the problem and then I'll put in a couple of words about the reasons behind it. First, let's create a table with a unique constraint (easier than a primary key):
CREATE TABLE Foo
(
Bar INT UNIQUE
)
Now, let's insert some data into the table:
INSERT INTO Foo
SELECT 1 UNION...
Best Practice
I realized that I didn't explain a little of the background story about the last post. For some reason the (real, not the one I posted) query is violating a primary key constraint. When I tried to check it, it would think for a long while, and then spit out the error.
I know it's evil to optimize buggy code, but at least now when it fails, it fails fast.
Optimization Story
I just got out of struggling with a very long query. If you missed it, I'm working on SQL quite a bit lately, and I keep discovering new things about it. Here is a query similar to the one I was working on, the idea is to update the Summary table with new information about actions that haven't been summed yet. For instance, a transaction that occured three days ago, but only now entered the database. This query took over 20 minutes to run before I killed it. (Sorry for the rather complex query, I'm afraid it doesn't really translate...
Batches
Oren Ellenbogen has a post about IDataReader.NextResult(); I saw that method in intellisense, but I never really understood what is it. It is used to send a batch of statements to the server (seperated by ';', apperantly) and then gets sets of results. One thing that interests me is how this is done on Oracle, since it throws exceptions on ';'.
25 Things that I don't like in Oracle Tools
I'm not going to talk about the database itself just yet, the tools has quite enough things about them to rant about. First, SQL*Plus seem to be one of the main tools to work with Oracle, and it's a piece of crap in so many levels. It's a console application that someone put a Windows wrapper around. The end result is a product that combines the worst of all worlds. Let's start with the fact that SQL*Plus has no text box, it's UI is a console, meaning that: ...
Creating Tables From Queries
Did you know that you can create a table from a query? It's done using the SELECT INTO syntax, and it can be quite useful for creating a table for temporary calculations. What surprised me was that you can get this to create a permenant table as well. Obviously that isn't recommended (I hope you put a little thought into your tables before you create them), but it's useful to have for DBA scenarios.
On Lies & Linked Server
I'm not sure who is at fault here, but Oracle as a linked server really sucks. I'm not talking about general Oracle stuff here (I'll try to post something about it later). I'm talking about the database lying about data types that it doesn't like. For instance, take the following Oracle table defination: CREATE TABLE Foo ( ...
SQL Debugging
I just found out that you can debug a stored procedure the same way you can debug .Net code. To use: Open VS.Net Open Server Explorer (Ctrl+Alt+S) Go to the database where the stored procedure is locate Right click and choose Step Into Stored Procedure You can see what...
Circular Dependencies
So, I need to copy a set of tables from one database to another, the issue is that the source database keeps changing, so, in a true developer fashion, I added another layer of indirection and instead of writing scripts to copy the database I wrote a program that would write scripts that would copy the database. I'm pretty proud of it, it uses the SQL Runner tool that I mentioned earlier, and it work in stages, so the first thing that copied is tables with no foreign keys, then the tables that...
JustDate: Striping the time part from datetime variables
In oracle, there is a handy little function call trunc. That function is a real work hourse, it can truncate numbers, dates, programmers life time, etc. In the most basic form trunc(sysdate) will return the current day without the time part, it has quite a few options that are fairly nice as well. Anyone, for some reason this bulitin functionality is missing in T-SQL, so of course I had to implement it three times. The first was a scratch function that used string manipulation to do its work. The second was a SQL CLR function that...
Using Data Ranges
Okay, let's see some use for having a date range, let's assume that I've the following employees table:
...
The value of parallelism
I said that I used the little script that I posted about for short tasks.The reason for that is that it will run the files serially. This isn't very good if you want to get the maximum from your server. I actually wrote a multi threaded sql running that connects to a database and runs multiply queries in parallel. I'm using that for long queries (where long is thousands of files and the whole process takes hours).
Running it in parallel (and from a different computer than the one the server is running on) improved the performance by a factor of...
The lost art of batch files
Here is a quick & dirty way to execute a command on all the files in a directory.for %f in (*.sql) do sqlcmd -E -S . -d DataBase_Name < %f > %f.log
In this case, I execute all the sql files in a directory and write the results to a log.
This is useful if you've a common set of scripts that you need to run. I'm using it to get results of short operations.
Why short? Well, this is for the next post.
More on date ranges
Like I said, I didn't really like my solution to date ranges using Common Table Expression and recursion. It is very limiting in many ways, performance not the least of them.
So I set out to find a better way, and found it with table valued functions, like this:
ALTER FUNCTION DateRange
(
@start datetime,
@end datetime
)
RETURNS @DateRange TABLE
(
CurrentDate datetime
)
AS
BEGIN
WHILE (@start < @end)
BEGIN
INSERT INTO @DateRange(CurrentDate)
VALUES(@start)
SELECT @start = DATEADD(day,1,@start)
END
RETURN
END
Like the previous function, this take two date ranges, and return a set of all the dates between them,...
SQL Date Ranges
When you're working with SQL, it's often best to work with sets of data, but when you're working with dates, it's not always possible. Especially if you need to work with date ranges.
For instance, how do you find out how many hours each employee worked per day over the last month? Well, you can have a BusinessDays table, which you can join against, but often you will want to so some ad-hoc date ranges, or you don't have a BusinessDays table and you don't want to create one.
I was playing with ideas about generating sets from start - end values...
SQL Date Processing
Nearly a month ago I made a statement about date processing in SQL that made me use SQL CLR in order to solve this problem. Since then I think I learned a couple of things about SQL, so I think I would like to retrack this statement.
The issue that I had was finding out the first day of a month given a spesific year / month combination. I eventually solved it with something similar to:
new DateTime(year,month,1)
The problem was that it wasn't handling null values very well, and I didn't feel like loading VS.Net, adding nullables and checks to the whole...
On Default Error Messages
It's not always wise to give too much information about something in the error message. I'm usually advocating giving as much information as possible, but not when you don't know what is wrong. Case in point, the SqlConnection in .Net 2.0 will throw a message with this information:An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not...
SQL Server 2005 Pitfalls
Okay, this one is a tiny pitfall, but it really drove me mad. I was developing a web service that needed to talk to a database. But nothing that I could do would make it do it. It kept giving me this error:An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) ...
More reasons to hate Oracle...
There is the straight and narrow, and then there is everything else. For instance, I was passing this command with 'select * from customers;', using OracleClient, and I got the following error 'ORA-00911: invalid character'.
The maddening part is that it knows what the invalid character is, but it won't tell me. Trying the exact same command in SQL*Plus worked, of course.
In the end, after much hair pulling and nearly a day down the drain, I tried this command 'select * from customers', which worked. Can you spot the difference?
It the bloody ';' in the end! It cannot process that so it throws...
Favoraite Error Message of The Week
The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
When I got this error, it took me a while to understand just what was wrong, and then I walked away, knowing that I'm smarter than SQL Server. :-)
Why I hate linked servers
Check those out:
select precentage_date from openquery(ORA, 'select precentage_date from PRECENTAGE')
Worked just fine.
select precentage_date from ORA..ADMIN.PRECENTAGE
Fails with: Error converting data type DBTYPE_DBTIMESTAMP to datetime.
There isn't any values outside the range that SQL Server can handle, but still it gives me those errors. Argh!
Performance of SQL Server & SSIS
This is not a valid performance comparasion by any means, but I noticed something interesting when doing simple data transfers from SQL Server to SQL Server via SSIS & SQL Server. The situation was copying ~6.7 Millions records from one database to another (on the same machine). I first tried it using the Import/Export wizard, and then using this script: INSERT INTO Customers SELECT * FROM Another.Cutomers ...
Pitfalls of moving data to SQL Server from Oracle
I need to move a couple oftables from an Oracle database to SQL Server, how do I do this? Well, there is SSMA*, but it has a couple of problems dealing with strange decisions on the Oracle side. It also require a hefty machine to run on, as well as taking quite a bit of time to run. I used it to generate the schema, and then defined a linked server to Oracle and wrote a very simple script to port the data. It failed, miserably. Let's assume that my...
On Production "Agile" Databases
In most cases, the database is the heart and soul of the application. Every database book or class will be sure to mention that most databases outlast the application that created it. This usually cause to organization to threat a database as a fragile thing that need nurturing and three levels of approval (including from the Janitor's Sister, who once read a book about building data driven application in VB 3.0) before the smallest change can go into the database. It's worse when it is a production database. I don't think that I...
Tip: Searching Stored Procedures Text
Here is another tip for those who need to explore big SQL databases with a lot of stored procedures and quite a bit of logic. This will gives you a list of all the procedures that has a certain string in them: CREATE PROCEDURE dev_FindInSP @str nvarchar(max) ...
Performance tuning mystery
I got a stored procedure that was taking an undue amount of time to run, in was about half an hour, and that was really bad. I tried looking at the execution plan and add an index to make it faster, but it wasn't that effective.
The query was an update from a nested select (which contained a rather complex condition and a couple of joins), and I couldn't really figure out how to make it better. I tried running the pieces individually, and they run reasonably fast, so it was the interactions between them that killed the performance. The execution...
Code reuse, cursors and NIH, oh my!
I recently discovered that a certain class of programmers has a really bad case of Not Invented here. It got so bad that they decided to implement JOIN and GROUP BY by using cursors. Of course, due to the nature of the problem, they were so excited by their new discovery that they immediately rushed to propogate it thorughout the code base, using none else than the tried and true method of copy-paste-change-one-charater-and-then-move-on-never-looking-back.
I won't speak of the performance of this method, but I think that you can guess that there are orders of magnitude difference between this new way of...
SQL Frustrations
I knew that it was bound to happen.
I started running into highly mysterious errors in SQL Server. The strangest among them is the Missing Columns Mystery.
When I run a stored procedure that calls to other SPs, and so on, I get an error about invalid columns somewhere deep in the stack. But when I run the SP directly, it passes perfectly.
Of course that when I check it, the table has the supposedly missing columns. It drives me crazy, and I can't really get the error consistently, if I run it for a couple of time, it seems to fix itself,...
Hierarchical queries with Common Table Expressions
[Via Mitch Denny's Blog ] [Via Sql Down Under] It's usualy hard to efficently get hierarchical data using SQL. But apperantely SQL Server can do that easily. The key to this is to create common table expression that recursively points to itself, like this: WITH AllEmployeesOf([EmployeeId], [Name], [ManagerName]) AS ...
Sql Tips
The sp_helptext stored procedure can give you the text of any non-table object in the database. I find this incredibly useful, as it saves the need to locate the object in the management studio UI.Here is an example:sp_helptext Very_Nasty_TriggerResults:Text------------CREATE TRIGGER Very_Nasty_Trigger ON NASTY INSTEAD OF INSERTAS BEGIN SET NOCOUNT ON; Declare @stmt nvarchar(max) SET @stmt = (select [text] from INSERTED); exec sp_executesql @stmtENDNow I need to find something that works on the table level as well.
Triggers & Auditing: Take 2
Okay, my post about auditing & triggers generated some good questions, so here are a couple of answers. Before anything else, I made a mistake in the schema and didn't change the defination of the [Posts_Audit_Trail]'s post_id, it shouldn't be an identity, here is the new schema: CREATE TABLE [dbo].[Posts_Audit_Trail] ( [audit_post_id]...
Triggers and auditing
There is some discussion in the Castle Mailing List about auditing using ActiveRecord. I firmly believe that auditing is one of those things that should be done in the database layer. It's very simple to set up, and then you can forget about it until you need it. Here is an example of how simple it is. Like most of my database samples, it is going to be a Blog->Posts sample. The issue is auditing in a multiply users per blog environment. Here is how the Posts table looks: ...
Performance Tuning In SQL Server
Steve has a good article on Why is SQL Server So Slow?
How to debug your head agasint the wall
I just watch a very nice diagram of SSIS doing its work, and throwing 70% of the input rows into the garbage. I was a bit concerned about that, as you may imagine. I talked before about the experiance of debugging a problem with SSIS, I present this case as a case study.First, I should mention that I pipe all the errors to a table that contains three columns (which are provided by SSIS). The columns are ErrorColumn, ErrorCode and ErrorOutput. So when I get an error, I just check the error table and see where they happen. In this...
More annoying stuff about SSIS
You can't close a package after execution. You've to manually stop debugging and then close the package.No way that I could find to look at just the errors/ warnings from the log.The Flat File Source just blithedly ignored the fact that the file doesn't exists and carried on its work. I need this behaviour in several places, so it would be useful to do it, but I've no idea why or how this works.You can't edit connection managers without a live connection to use.You don't get any results when you search for the errors on Google.I'm not sure how I...
SQL Management Studio
Here is one of the more annoying things in SQL Management Studio. If you try to use the Summary tab to skip the [dbo] prefix in the treeview, you often get this error:I'm getting this thing three or four times a day, and then the summary tab stops showing stuff.
SSIS Debugging Frustrations
SSIS is a great tool, and I really like some of the things that it allows me to do. But for crying out loud, you can't debug this thing. The errors it generate are reasonable enough if it is SQL server errors, but if it is something that is generated by SSIS itself, god forbid that you'll try to get what the error was. Here are some examples for things that really frustrate me: When you get a failure in the data...
SQL Tidbits
Can someone explain to me why Microsoft considers "dummy” as a reserved word in sql?Any idea how I can get an object SQL text programmatically?
Heavy Load
I just put SQL Server through a stress test that had it copy several giga bytes over a network. It tooks over two hours, but I think most of it was network traffic. The actual inserts took relatively little, although toward the end I noticed SQL Server slurping memory like it was free candy (ended on ~1.4Gb RAM taken).I then run a query on all of that data, which returned much quicker than I expected. The more I work with it, the more I like it.
No more occupational therapy
One of the most obnoxious thing
about writing diagrams (be that DB layout, class hierarchies, SSIS packages,
etc) is how to arrange the connections so you can see anything clearly. It has
been notoriously hard to do it in most tools that I’ve used, but I just found
the Format > Layout > Diagram command in SSIS that arrange a package
very neatly in a tree
fashion.
This little command is going to save
many hours .
Comparing LINQ and Its Contemporaries
Ted Neward has an article at MSDN about Linq (Microsoft next generation data accress layer / ORM) and the rest of the tools that are out there in the field for data acess. I agree with most of the article, and I was excited as any when I first saw Linq (can you imagine walking to a client and telling him that this is The Microsoft way to do so, instead of having them insist on datasets?). The problem that I have is that it make some broad generalizations about the rest of the...
Moving from procedural code to sets: Part 2
In the previous post I showed how it is possible to use aggerate functions to avoid procedural code in SQL. The result of this post actually surprised me, since I didn't know that this was possible until I tried it (I intended to show that this is not possible, and then show another way). As an aside, looking into the results of both queries, the cursor based one was wrong by several thousands, because of data type conversions that were happening (losing precision from numeric to money, mostly). Well, I still want show...
SqlClr and anonymous delegates
Okay, I'm playing around with SqlClr (doing runtime DDL stuff, mainly), and I discovered that SqlClr and anonymous delegates don't play along very well. The issue is that code like this:
CallWithAction(delegate(DateTime dt) { return dt; } );
Is actually translated to something like this:if(ClassName.<> HiddenFieldName == null) ClassName.<> HiddenFieldName = new SomeDelegate(HiddenMethodName);CallWithAction(ClassName.<> HiddenFieldName);
And that causes problem with the SqlClr, since the method is trying to store into a static field, which doesn't seem to be allowed on the safe level (which I really don't want to pass.)
It doesn't happen if the anonymous method uses local variables,...
Picking a database... and staying with it.
I just read this post from an ex-googler, talking about how they tried to move from MySQL to another (presumbely Enterprise. Most likely DB2 or Oracle), and failed. I'll gloss over the fact that they implement such a critical system over a database with no trasaction support (and the comments about "it is easy to add on the application level" that made me cringe so hard that for a moment I looked like a fat bagel). I want to talk about migrating a single application from one database to another. This is not something...
NHibernate Generics & EntityList - Update
Okay, the code is in the Subversion repository, but I'm not ready to release it yet. I've a shameful confession to make first: I never used an IList with NHibernate (or ActiveRecord) before, so I'm not really sure whatever my code makes good sense or not. And, in addition to that, I haven't had the time to test it propertly. I would appriciate it if someone who had used it before could take it for a spin and tell me what they think the bad points are.
About SSIS and Code
I just saw Sahil's post SSIS, and I couldn't help but cringe at the last words of the post. While in theory you can write code that will write an SSIS package for you, you really don't want to do that. The issue is that the whole thing is completely under documented, uses COM interop in the strangest way (sometimes you've a managed wrapper, sometimes you don't, there are things that you can't do from the COM interfaces, so you need the managed wrapper, and things that you can't do from the COM interfaces). ...
Verifying you Data Access Layer when using NHibernate, Active Record, And Friends
I sometimes run into situations where people think that ORM is a magic word for no-work-required. This is especially true when they tend to realize that a few hours (on a small project, or a few hours here and there on a larger one) in front of an XML editor and some really simple classes, can more or less eliminate the need of a DAL. It's tempting, I can write a functional DAL very quickly, but I also need to make sure that I test it. The problem is that you may be tempted...
WOW! Native Queries in C# 2.0
Will you take a look at that.
This is about the coolest thing that you are going to see until C# 3 &
lambdas will arrive.
It has so many possibilities that I can't
really comprehend now, the guys at db4o are brilliant, and that is even
before you consider that Rodrigo invented Boo,
the nicest compiler on earth. I'm going to investigate this deeply, and
see if I can do something similar. If I can combine this with lightweight code
generation, it will open up a completely new approach to solving problems.
If you missed it so far, I'm super
excited about this stuff. Great job, and wonderful idea.
SQL Express 2005
I'm a developer, I'm not a dba. I can design and use a SQL database that wouldn't make a sane dba wail in despair, but I'll gladly hand off the desicions about data to someone else. It's not something that interest me. Administrating a database server is something that should be left to the expert, and I don't have time to be an expert. But recently I need to run my project against SQL Express, and I started to get access denied errors. Here are the exact steps that I needed to use...
My First Trigger
So, today I wrote my first trigger. The issue was of data replication on a hierarchial structure. When the parent is updated, all the children should be as well. If a child is update, the link to the parent is severed. The number of children is unlimited, so I can't use NHibernate to do the update, as it would be horribly inefficent to do batch update with it. The parent and child in this case is one of ~15 classes (one inheritance family) that has different meaning for the same fields in...
TDD and SQL Express
Testing against a database is usually a problem, but I run into an unusal one today. My tests would all fail, saying that they couldn't connect to the database. I checked the connection string, and then try to connect using VS 2005, and it worked. When I run the tests again, they got the database and run properly. The only thing that I think of that can cause that is if SQL Express wasn't running when I run the tests the first time, and when I accessed the database in VS 2005, it...
Refactoring the Model & Database Syncing
Most of the interesting applications today talk to a database at one point or another. Unless you're writing a document centric client side application (think Word, Excel), you need to deal with databases. And that is a problem. It's a problem because beyond this point, all your trusty tools leave you to forage forward on you on. A simple thing like a change of a field in the database may require a tremendous amount of effort to fix if you're not careful. One danger that may occur if you subscribe...
Oracle
From Sahil Malik, why oracle is evil...