Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.


Persistent DSL caching issues


A while ago I talked about persistent DSL caching. I was asked why my solution was not a built-in part of Rhino DSL.

The reason is that this is actually not such a simple problem. Let me point out a few of the non-obvious issues.

  • Need to handle removal of scripts
  • Need to handle updating scripts
  • Need to handle new scripts
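
The three cases in the list above can be sketched as a comparison between what the cache recorded last time and what is on disk now. This is only a minimal sketch, not Rhino DSL code: `ScriptCacheDiff` and its parameters are hypothetical names, and it assumes each script is identified by its path plus a checksum of its content.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Rhino DSL: compares what the cache
// recorded last time against the scripts that exist now.
public static class ScriptCacheDiff
{
    public static void Compare(
        IDictionary<string, string> cached,   // script path -> checksum stored with the cache
        IDictionary<string, string> current,  // script path -> checksum computed now
        List<string> added, List<string> updated, List<string> removed)
    {
        foreach (KeyValuePair<string, string> script in current)
        {
            string oldChecksum;
            if (!cached.TryGetValue(script.Key, out oldChecksum))
                added.Add(script.Key);        // new script: compile and cache it
            else if (oldChecksum != script.Value)
                updated.Add(script.Key);      // changed script: recompile and replace the entry
        }

        foreach (string path in cached.Keys)
        {
            if (!current.ContainsKey(path))
                removed.Add(path);            // deleted script: drop the stale entry
        }
    }
}
```

Anything that ends up in `added` or `updated` would need a recompile, and anything in `removed` would need its cached output cleaned up.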

Those are easy, sort of, but what about this one?

  • Need to handle DSL updates

When you are in development mode, you really need to be sure that changing the way the DSL behaves also invalidates the cache.
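
One possible way to get that behavior is to fold the identity of the DSL implementation into the cache key, so that a rebuilt DSL never matches an old entry. The sketch below is an assumption about how that could look, not how Rhino DSL does it: `DslCacheKey` is a hypothetical name, and it uses the module version ID of the assembly that implements the DSL, which changes when that assembly is recompiled.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of Rhino DSL.
public static class DslCacheKey
{
    // Combines the script's content with the identity of the assembly that
    // implements the DSL, so a rebuilt DSL yields a different key.
    public static string Compute(string scriptText, Assembly dslAssembly)
    {
        // The module version ID (MVID) changes when the assembly is recompiled.
        string dslIdentity = dslAssembly.ManifestModule.ModuleVersionId.ToString("N");
        return Compute(scriptText, dslIdentity);
    }

    // Same idea with an explicit identity string, which is easier to test.
    public static string Compute(string scriptText, string dslIdentity)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] input = Encoding.UTF8.GetBytes(dslIdentity + "\n" + scriptText);
            return BitConverter.ToString(sha.ComputeHash(input)).Replace("-", string.Empty);
        }
    }
}
```

This would not notice a change in an assembly that the DSL merely depends on, so a real implementation would probably have to decide which assemblies count as "the DSL".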

I like to keep a very high bar of quality on the software I make, and there is a fine distinction between one-off attempts and reusable ones. One-off attempts can be hackish and stupid. Reusable implementations should be written properly.

And no, there isn't anything overly complex here. It just takes time to cover all the bases.

Anyone feel like submitting a patch?


Comments

josh

hmm.. I'll have to go back and read the persistent caching post, but it doesn't seem like it would be too hard. just keep a cached compiled copy, and compare script dates & maybe sizes on startup, then monitor for changes while running. right?

I'm still catching up in this area so I'm not quite ready to submit a patch. sorry.

pb@pb.com

I'm thinking the DSL updates could be handled with a simple checksum on the file. Anywhere with some guidelines on how submitting a patch works?

Ayende Rahien

pb,

Yes, you could. Now how do you store that? How do you clean this up?

The best example of how to submit a patch is here:

http://www.hanselman.com/blog/ExampleHowToContributeAPatchToAnOpenSourceProjectLikeDasBlog.aspx

pb

How about caching the checksum per absolute file path? That should handle these four scenarios, and the worst case is that you compile again if it is not found, which is what happens every time now anyway.

Ayende Rahien

pb,

where would you store this?

pb

Path.GetTempPath or something similar

Ayende Rahien

I think it would be best to see the code before moving forward in abstract discussion

pb

Got it compiling after svn and reference finding shenanigans, trying to get unit tests going with mbunit now.

pb

Hooray! Got the test passing. Missing a file reference in VS 2005 project to the _differentoperaionts.boo file and had to use mbunit gui since couldn't get testdriven.net to work with mbunit even with registry hack. Ah, the joys of open source.

pb

I think I'm most of the way there but CompilerContext isn't serializable and context.GeneratedAssembly doesn't seem to be something I can store. Suggestions?

pb

Here's the basic idea

    [Test]
    public void Cache_works_when_called_twice()
    {
        string path = Path.GetFullPath(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"DslFactoryFixture\MyDsl.boo"));
        CompilerContext compilerContext = null;

        compilerContext = engine.Compile(path);
        Assert.IsFalse(engine.IsLastCompileCached);

        compilerContext = engine.Compile(path);
        Assert.IsTrue(engine.IsLastCompileCached);
    }

    /// <summary>
    /// Compile the DSL and return the resulting context
    /// </summary>
    /// <param name="urls">The files to compile</param>
    /// <returns>The resulting compiler context</returns>
    public virtual CompilerContext Compile(params string[] urls)
    {
        DslCompilerContextCache cache = new DslCompilerContextCache();
        CompilerContext compilerContext = cache.GetCached(this, urls);

        if (compilerContext == null)
        {
            IsLastCompileCached = false;
            compilerContext = ForceCompile(urls);
        }
        else
        {
            IsLastCompileCached = true;
        }

        return compilerContext;
    }

    /// <summary>
    /// If the last compile was cached or not
    /// </summary>
    public bool IsLastCompileCached
    {
        get { return _IsLastCompileCached; }
        set { _IsLastCompileCached = value; }
    }
    private bool _IsLastCompileCached;

    /// <summary>
    /// Force a compile with no caching
    /// </summary>
    /// <param name="urls"></param>
    /// <returns></returns>
    public virtual CompilerContext ForceCompile(params string[] urls)
    {
        BooCompiler compiler = new BooCompiler();
        compiler.Parameters.OutputType = CompilerOutputType;
        compiler.Parameters.GenerateInMemory = true;
        compiler.Parameters.Pipeline = new CompileToMemory();
        CustomizeCompiler(compiler, compiler.Parameters.Pipeline, urls);
        AddInputs(compiler, urls);
        CompilerContext compilerContext = compiler.Run();
        if (compilerContext.Errors.Count != 0)
            throw CreateCompilerException(compilerContext);
        HandleWarnings(compilerContext.Warnings);

        return compilerContext;
    }

pb

using System;
using System.Collections.Generic;
using System.Text;
using Boo.Lang.Compiler;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Security.Cryptography;
using System.Reflection;

namespace Rhino.DSL
{
/// <summary>
/// Cache for a CompilerContext instance
/// </summary>
public class DslCompilerContextCache
{
    /// <summary>
    /// Returns cached instance if any, or null if none
    /// </summary>
    /// <param name="engine"></param>
    /// <param name="urls"></param>
    /// <returns></returns>
    public CompilerContext GetCached(DslEngine engine, string[] urls)
    {
        if (urls == null || urls.Length == 0) throw new ArgumentNullException("urls");

        string cacheKey = GetCacheKey(urls);

        CompilerContext compilerContext = LoadCompilerContext(cacheKey);

        if (compilerContext == null)
        {
            compilerContext = engine.ForceCompile(urls);
            SaveCompilerContext(cacheKey, compilerContext);
        }

        return compilerContext;
    }

    private string GetCacheKey(string[] urls)
    {
        return String.Join("~", urls);
    }

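    private string GetCacheFileName(string cacheKey)
    {
        // Base64 output can contain '/', which is not valid in a file name
        string filename = System.Convert.ToBase64String(Encoding.UTF8.GetBytes(cacheKey)).Replace('/', '_');
        // Path.Combine handles the separator that Path.GetTempPath() already includes
        return Path.Combine(Path.GetTempPath(), filename);
    }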

    private void SaveCompilerContext(string cacheKey, CompilerContext context)
    {
        string tempFile = GetCacheFileName(cacheKey);

        if (File.Exists(tempFile)) File.Delete(tempFile);

        using (Stream stream = File.Open(tempFile, FileMode.Create))
        {
            new BinaryFormatter().Serialize(stream, context.GeneratedAssembly.Location);
        }
    }

    private CompilerContext LoadCompilerContext(string cacheKey)
    {
        string tempFile = GetCacheFileName(cacheKey);
        if (!File.Exists(tempFile)) return null;

        //Open the file written above and read values from it.
        using (Stream stream = File.Open(tempFile, FileMode.Open))
        {
            string fileName = new BinaryFormatter().Deserialize(stream) as string;
            CompilerContext context = new CompilerContext();
            context.GeneratedAssembly = Assembly.LoadFile(fileName);
            return context;
        }
    }

    private string GetChecksum(string file)
    {
        using (FileStream stream = File.OpenRead(file))
        {
            return GetChecksum(stream);
        }
    }

    private string GetChecksum(byte[] buffer)
    {
        return GetChecksumFromBytes(new SHA256Managed().ComputeHash(buffer));
    }

    private string GetChecksum(Stream stream)
    {
        return GetChecksumFromBytes(new SHA256Managed().ComputeHash(stream));
    }

    private string GetChecksumFromBytes(byte[] checksum)
    {
        return BitConverter.ToString(checksum).Replace("-", String.Empty);
    }
}
}

Ayende Rahien

pb,

Let us take the discussion to the rhino tools dev mailing list.

pb

Ok, posted to http://groups.google.com/group/rhino-tools-dev
