Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 165 words

While driving to work today, I started wondering what pieces of code are owned by someone, and how you can detect it. Owned means that they are the only one that can touch that code. Whatever it is by policy or simply because they are the only one with the skills / ability to do so.

I wonder if you can use the source control history to figure it out. Something like:

  • Find all files changes within the last year
  • Remove all files whose changes are over a period of less than two weeks (that usually indicate a completed feature).
  • Remove all the files that are modified by more than 2 people.
  • Show the result and the associated names.

That might be a good way to indicate a risky location, some place that only very few people can touch and modify.

I started to think about how to do this in Git, but I got lost. Anyone want to try and take that up?

time to read 1 min | 100 words

Following my post yesterday, I decided that for now, we will have the following system:

  • A github hosted repository with the binaries
  • A github hook to post on notification
  • A server side listener that would get that notification

On push, we will simply call git pull in the website directory. We use git ignores for the logs & data files, but that is about it.

It is primitive in the extreme, and it likely have many failure scenarios, but for now, it works. And very nicely, too.

time to read 2 min | 275 words

Basically, I currently have a very simple procedure for deploying software, it is called: git push, and I really want to be able to do that for my web applications as well.

I know that Rob Conery has talked about this in the past:

And I know about Heroku and AppHarbor, that isn’t what I am talking about.

On my own server, I have a set of web application that I want to be able to update using git push.

Requirements:

  • It has to be an explicit operation (pushing to a specific branch is okay).
  • It can’t be something that happens periodically, I want to push, and as soon as possible, be able to see the changes. Waiting 5 minutes for the periodic check is going to be a non starter.
  • It has to take into account local information (logs, data, etc).
  • I have to be able to easily rollback.
  • I don’t really care for things like migrations, those are handled by the application, or manually.

I specifically don’t care about actually building the code, I am perfectly fine with pushing binaries to the git repository.

At first I thought about simply making the site a git repository and just push there. But you can’t push to non bare repositories by default (and rightly so). When I gave it some more thought, I realized that there are more reasons to want to avoid that.

Any thoughts? Any existing solutions?

time to read 1 min | 179 words

I just love git pull requests, but the new behavior from GitHub is beyond moronic. Take a look at a typical pull request:

image

The problem is that clicking on this button would actually merge the changes to the public repository. I don’t know about you, but there are very few cases where this is what I want to do.

In 99.9999% of the cases, I want to merge this locally to see what the bloody changes are, run some tests, maybe modify the changes before I am taking them.  In this case, this particular pull request contains a failing test. I never want to commit that to the public repo automatically.

What is worse is that I now need to manually construct the pull command in the command line, whereas GitHub previously offered the option to generate that for me, which I liked much more.

time to read 1 min | 70 words

Originally posted at 1/17/2011

Occasionally I get asked what code review tools I use.

The truth is, I tend to use TortoiseGit’s Log Viewer and just read the history, but I am not leaning toward this baby:

image

This is really nice!

time to read 5 min | 880 words

The builtin answer for sharing code between multiple projects is quite simple…

git submodule

But it introduces several problems along the way:

  • You can’t just git clone the repository, you need to clone the repository, then call git submodule init & git submodule update.
  • You can’t just download the entire source code from github.
  • You can’t branch easily with submodules, well, you can, but you have to branch in the related projects as well. And that assumes that you have access to them.
  • You can’t fork easily with submodules, well, you can, if you really feel like updating the associations all the time. Which is really nasty.

Let me present you with a simple scenario, okay? I have two projects that share a common license. Obviously I want all projects to use the same license and the whole thing to be under source control.

Here is our basic setup:

PS C:\Work\temp> git init R1
Initialized empty Git repository in C:/Work/temp/R1/.git/
PS C:\Work\temp> git init R2
Initialized empty Git repository in C:/Work/temp/R2/.git/
PS C:\Work\temp> git init Lic
Initialized empty Git repository in C:/Work/temp/Lic/.git/
PS C:\Work\temp> cd R1
PS C:\Work\temp\R1> echo "Hello Dolly" > Dolly.txt
PS C:\Work\temp\Lic> cd ..\R1
PS C:\Work\temp\R1> git add --all
PS C:\Work\temp\R1> git commit -m "initial commit"
[master (root-commit) 498ab77] initial commit
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Dolly.txt
PS C:\Work\temp\R1> cd ..\R2
PS C:\Work\temp\R2> echo "Hello Jane" > Jane.txt
PS C:\Work\temp\R2> git add --all
PS C:\Work\temp\R2> git commit -m "initial commit"
[master (root-commit) deb45bc] initial commit
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Jane.txt
PS C:\Work\temp\R2> cd ..\Lic
PS C:\Work\temp\Lic> echo "Copyright Ayende (C) 2011" > license.txt
PS C:\Work\temp\Lic> git add --all
PS C:\Work\temp\Lic> git commit -m "initial commit"
[master (root-commit) 8e8b1b4] initial commit
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 license.txt

This just gives us the basics. Now I want to share the license.txt file between the projects. I can do that with submodules, like so:

PS C:\Work\temp\R1> git submodule init
PS C:\Work\temp\R1> git submodule add C:\Work\temp\Lic Legal
Cloning into Legal...
done.
PS C:\Work\temp\R1> cd ..\R2
PS C:\Work\temp\R2> git submodule init
PS C:\Work\temp\R2> git submodule add C:\Work\temp\Lic Legal
Cloning into Legal...
done.

Now, this looks nice, and it works beautifully. Until you start sharing this with other people. Then it starts to become somewhat messy.

For example, let us say that I want to add a disclaimer in R1:

PS C:\Work\temp\R1\Legal> echo "Not for Jihad use" > Disclaimer.txt
PS C:\Work\temp\R1\Legal> git add .\Disclaimer.txt
PS C:\Work\temp\R1\Legal> git commit -m "adding disclaimer"
[master db3987c] adding disclaimer
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Disclaimer.txt

And here is where the problems starts. Let us assume that I want to make a change that is local to just this project.

Well, guess what, you can’t. Not if you intend to share this with other people. You need to push your changes to the submodules somewhere, and that means that if you need to fork the original project, update references to the project. Of course, if there is an update to the original submodule, you need to have two stages to update that.

And we haven’t spoken yet on the fun of pushing the main repository but forgetting to push the submodule. It gives a new meaning to “it works on my machine”.

In short, git submodules looks like a good idea, but they aren’t really workable in the real world. I’ll have a new post shortly showing how to deal with the issue

time to read 2 min | 223 words

I am getting really sick of git submodules, and I am trying to find alternatives.

So far, I have discovered the following options:

PS C:\Work\RavenDB> braid add git@github.com:ravendb/raven.munin.git
F, [2011-01-09T18:41:09.788525 #224] FATAL -- : uninitialized constant Fcntl::F_SETFD (NameError)
C:/Ruby186/lib/ruby/gems/1.8/gems/open4-1.0.1/lib/open4.rb:20:in `popen4'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/operations.rb:103:in `exec'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/operations.rb:114:in `exec!'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/operations.rb:51:in `version'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/operations.rb:57:in `require_version'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/operations.rb:78:in `require_version!'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/command.rb:51:in `verify_git_version!'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/lib/braid/command.rb:10:in `run'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/bin/braid:58:in `run'
C:/Ruby186/lib/ruby/gems/1.8/gems/main-4.4.0/lib/main/program/class_methods.rb:155:in `run!'
C:/Ruby186/lib/ruby/gems/1.8/gems/main-4.4.0/lib/main/program/class_methods.rb:155:in `run'
C:/Ruby186/lib/ruby/gems/1.8/gems/main-4.4.0/lib/main/program/class_methods.rb:144:in `catch'
C:/Ruby186/lib/ruby/gems/1.8/gems/main-4.4.0/lib/main/program/class_methods.rb:144:in `run'
C:/Ruby186/lib/ruby/gems/1.8/gems/main-4.4.0/lib/main/factories.rb:18:in `run'
C:/Ruby186/lib/ruby/gems/1.8/gems/main-4.4.0/lib/main/factories.rb:25:in `Main'
C:/Ruby186/lib/ruby/gems/1.8/gems/evilchelu-braid-0.5/bin/braid:13
C:/Ruby186/bin/braid:19:in `load'
C:/Ruby186/bin/braid:19

Does anyone know about a good solution that will work on Windows? Most specifically, I am looking for something that is plug & play, I don’t want to write code or to understand how git works. I just wanna it to work

time to read 1 min | 101 words

One of the things that some people fear in a distributed source control is that they might run into conflicts all the time.

My experience has shown that this isn’t the case, but even when it is, there isn’t really anything really scary about that.

Here is an example of a merge conflict in Git:

image

Double click the conflicting file, and you get the standard diff dialog. You can then resolve the conflict, and then you are done.

Git is teh SUCK

time to read 2 min | 363 words

Today, I had two separate incidents in which my git repository was corrupted! To the point that nothing, git fsck or git reflog or git just-work-or-i-WILL-shoot-you didn’t work.

The first time, there was no harm done, I just cloned my repository again, and moved on. The second time that it happened, it was after I had ~10 commits locally that weren’t pushed. I had my working copy intact, but I didn’t want to lose the history. I asked around, and got a couple of suggestion to move to mercurial instead, because git has no engineering behind it.

Based on that feedback, I …

Oh, wait, it isn’t this sort of a post.

What I actually did was setup Process Monitor and watched what git.exe was actually doing. I noticed that it was searching for a .git/objects directory, and couldn’t find it anywhere in the path. Indeed, looking there myself, it appeared clear that there was no objects directory under the .git dir. And checking in other repositories showed that they had it. So now I knew why, but I still had no idea who the #*@# decided to randomly @#$%( my repository, totally derailing my productivity.

That is where having multiple personalities come in handy, he did it. The one that isn’t writing this blog post, at some point during the day, there was a need to zip the repository and send it somewhere. Since the working copy is full of crap, that idiot issued the following:

ls -R obj | rm –F

ls -R bin | rm –F

(Not the exact commands, the idiot used the UI to do a search & delete).

You can guess the following from there. At this point, having come to this astounding discovery, I heroically went to the recycle bin, found the objects directory there, and rescued it! All is well, except that there is still a thrashing for uncommon stupidity owed.

And remember, it wasn’t me, it was the other one who did that!

And yes, the spelling mistake in the title is intentional.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}