Résumé TeX template and tutorial

This is a short tutorial on creating a kick-ass looking résumé/CV with zero previous TeX knowledge.

It's pretty easy:
  1. Install MiKTeX (Windows) or MacTeX (Mac).
  2. Install TeXworks.
  3. Create a new folder and download the moderncv template.
  4. Open examples/template.tex in TeXworks.
Now you can edit the template and press the Play button to create a PDF. Make sure you have 'pdfLaTeX' selected as the rendering engine.

The template contains a lot of comments to help you with formatting.

Additional notes:




Posted by Martin Konicek on 2:51 AM 36 comments

Implementation of goroutines

I have been reading up on concurrency in Go. In particular, I have been wondering about the statement from a presentation by Brad Fitzpatrick:
n, err := io.Copy(dst, src)

  • copies n bytes from src to dst
  • synchronous (blocks)
  • Go runtime deals with making blocking efficient
In virtually all languages I have worked with (Java, Scala, JavaScript, ...), a call that blocks the execution of the current function also blocks the current thread. The only exception is C# with its rather complex async/await feature implemented in the compiler.

Now in Go, if you make a call that blocks the current execution, you are (usually) only blocking the current goroutine, not a thread. The Go runtime multiplexes goroutines on fewer OS threads.

The question therefore is: What are the fundamental differences in the implementation of goroutines compared to OS threads that make goroutines so much more lightweight?

I have asked this question on the Go mailing list and the members of the Go team (notably Dmitry Vyukov and Ian Lance Taylor) have been incredibly helpful.

The summary is:

  • stack that grows and shrinks on demand
  • smaller context, easier to switch
  • cooperative scheduling at known points is less work (can make assumptions about CPU state)
  • scheduling in user space

Here is the discussion: enjoy!

Note: There are other languages (such as Haskell and Erlang) that also have the concept of lightweight "processes" that allow for blocking calls without blocking OS threads. The concept is sometimes referred to as green threads.

Posted by Martin Konicek on 11:21 AM 200 comments

Transactions in the real world



  • The problem of updating multiple systems consistently
  • Distributed transactions
  • Event sourcing
You might already be familiar with the common "real world" example of transferring money from one bank account to another:

START TRANSACTION;
UPDATE balance SET amount = amount - 1000 WHERE account_id = @id_from;
UPDATE balance SET amount = amount + 1000 WHERE account_id = @id_to;
COMMIT;
This example, however, is quite naive as it assumes both accounts are stored in a single database. In practice, we want to transfer money between banks, and we still want to achieve consistency (no matter what error occurs, either both updates take place, or none).

This problem appears in many situations in practice. For example:

  • In an e-shop, we want to create an order in our database and create a charge in a payment system. Again, both actions should happen, or none.
  • We want to atomically update multiple documents/values in a NoSQL store.
In this article, we will look at two solutions to this problem - a rather classical one (Distributed transactions) and a more recent one (Event sourcing).

Distributed transactions

It is possible to solve the consistency problem using distributed transactions. A common algorithm to ensure correct completion of a distributed transaction is the two-phase commit:


  1. One node is designated the transaction coordinator. This can be your application or an external service.
  2. The transaction coordinator sends a Request to prepare message to all participants and waits for replies from all participants.
  3. The participants execute the transaction up to the point where they would commit. They each write an entry to their undo log and an entry to their redo log.
  4. Each participant replies Prepared if its actions succeeded, or Failed if it experienced a failure that will make it impossible to commit.

Success

If the coordinator received a Prepared reply from all the participants:
  1. The coordinator sends a Commit message to all participants.
  2. Each participant completes its transaction, releases held resources, and sends an acknowledgement to the coordinator.
  3. The coordinator completes the transaction when all acknowledgements have been received.

Failure

If the coordinator received a Failed reply from any of the participants (or the response timeout expires):
  1. The coordinator sends a Rollback message to all participants.
  2. Each participant undoes the transaction using its undo log, releases held resources, and sends an acknowledgement to the coordinator.
  3. The coordinator marks the transaction as failed when all acknowledgements have been received.
As you can see, this protocol is quite complicated, even though it only describes normal operation, where every message is delivered.
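
The flow above can be sketched in a few lines of Python. This is only an in-memory illustration under simplifying assumptions: the Participant class and its prepare, commit and rollback methods are hypothetical names, and there is no network, no undo/redo log, and no timeout or retry handling.

```python
class Participant:
    """Hypothetical in-memory participant; a real one would sit behind a network."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: do the work up to the commit point and report the outcome.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Coordinator: returns True if the transaction committed everywhere."""
    # Phase 1: ask every participant to prepare and collect the replies.
    replies = [p.prepare() for p in participants]
    if all(replies):
        # Phase 2 (success): everyone is prepared, so tell everyone to commit.
        for p in participants:
            p.commit()
        return True
    # Phase 2 (failure): at least one participant failed, so roll everyone back.
    for p in participants:
        p.rollback()
    return False


ok = two_phase_commit([Participant("bank_a"), Participant("bank_b")])
failed = two_phase_commit([Participant("bank_a"), Participant("bank_b", can_commit=False)])
print(ok, failed)
```

Everything that makes the real protocol hard (lost messages, crashed coordinators, participants that time out) is exactly what this sketch leaves out.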

In practice, timeouts and retries have to be added into several steps of the algorithm, because it is impossible to guarantee reliable exactly-once delivery of messages. For example, let's look at Step 2 of the Success case:

    2. Each participant completes its transaction, releases held resources, and sends an acknowledgement to the coordinator.

Notice that if the acknowledgement gets lost (e.g. due to a network partition), the coordinator doesn't know if the participant committed or not and therefore can't declare the transaction successful. The other participants should roll back. But what if the unresponsive participant already committed?
It is possible to solve all such issues, and there are working implementations. Getting every failure case right is, however, quite tricky.

Moreover, and this is the major drawback of distributed transactions, each of the systems involved has to support the distributed transactions protocol, which is often not the case in practice. In particular, if your application uses a NoSQL database or external web services, distributed transactions are simply not an option.

(Note, however, that distributed transactions are not a thing of the past. Google's Spanner database relies on distributed transactions and two-phase commit across datacenters to achieve consistency. Spanner relies on a precise time API using GPS and atomic clocks to achieve good scalability of writes.)

This leads us to the second part of this post: In case you can't or don't want to use distributed transactions, is there a solution that achieves consistency across heterogeneous systems? It turns out there is, although it only guarantees eventual consistency. Enter event sourcing.

Event sourcing


Event sourcing is an architectural pattern that will help us achieve weaker, eventual, consistency across services. Eventual consistency simply means that consistency is reached eventually, but the system might temporarily be seen in an inconsistent state. It turns out this is sufficient for many applications, including banking.

Requirements

For Event sourcing, the only requirement is that it is OK for any participant to receive the same message multiple times - in other words, to see each message at least once. If receiving a message multiple times always leads to the same state as receiving the message once, we say the message receiver is idempotent. Luckily idempotence is quite common in practice. For example:

  • It is OK to update a user's name to the same value multiple times. 
  • It is OK to try to create a user with the same email multiple times: the user will be created once and the subsequent requests fail because a user with given email already exists.
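
The two examples above can be made concrete with a small sketch. The UserStore class and its methods are made up for illustration; the point is only that delivering the same request twice leaves the store in the same state as delivering it once.

```python
class UserStore:
    """Hypothetical in-memory user store keyed by email."""

    def __init__(self):
        self.users = {}

    def create_user(self, email, name):
        # Creating the same user again is rejected and leaves the state unchanged.
        if email in self.users:
            return False
        self.users[email] = name
        return True

    def rename_user(self, email, name):
        # Setting the same name twice gives the same result as setting it once.
        self.users[email] = name


store = UserStore()
first = store.create_user("john@example.com", "John")
second = store.create_user("john@example.com", "John")  # duplicate delivery
store.rename_user("john@example.com", "Johnny")
store.rename_user("john@example.com", "Johnny")  # duplicate delivery
print(first, second, store.users)
```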

Algorithm

We will use the banking example to describe the algorithm. Let's say our application / service is responsible for transferring money between two distinct banks:
  1. Our service receives a request to transfer €1000 from account_A to account_B.
  2. Check with bank A that account_A has sufficient funds available. If not, return an error to the client.
  3. Create a single commit record containing the following two events. Add random globally unique ids (e.g. UUIDs) that will identify each transaction. We will see later why these ids are necessary for idempotence:
    • "Transaction_id=a1: Withdraw €1000 from account_A."
    • "Transaction_id=b1: Deposit €1000 to account_B."
  4. Mark the commit record as Undispatched and store it in a database.
  5. Pass the events one by one to our application's event listeners. Let's say we have a single listener that processes all banking events:
    1. The event listener receives the first event, sends a request to bank A's service and waits for a response.
    2. The event listener receives the second event, sends a request to bank B's service and waits for a response.
  6. If both events are processed successfully by the listener, mark the commit as Dispatched.

Failures

Now we get to the interesting part, which is the possible failure scenarios. Notice that we will sometimes retry events, and we will also retry all Undispatched events on startup of our service in case it crashed:

Step 4: Our database fails to store the commit. We return a 5xx response to the client saying the request couldn't be processed. The state of all systems stays unchanged.

Step 5.1 - first request fails: The request to bank A's service fails or times out. We stop the processing, leave the commit marked as Undispatched and schedule a retry. This can be done e.g. by trying to re-run the listeners of all undispatched commits every minute.

Regarding the response we return to the client, this really depends on our use case. We can return a response saying that the request was accepted for processing.

After several unsuccessful retries, we mark the commit as Failed and notify the client (via in-app notifications, or email) that the request couldn't be completed.

Step 5.2 - second request fails: The request to bank A's service completes but the request to bank B's service fails. We leave the commit marked as Undispatched and schedule a retry.

When trying to re-process the commit, the listener will try to execute both requests again, including the already completed first request "Transaction_id=a1: Withdraw €1000 from account_A.". Here's where the idempotence requirement comes in: Bank A's service must be able to recognize that it already received the message with id=a1 and ignore it.

If a retry succeeds, we are done. If it doesn't, we'll try again in a minute.

After several unsuccessful retries, we mark the commit as Failed and notify the client. Since we have already withdrawn the amount from account A, we need to execute a compensating action: "Return the €1000 from transaction a1 to account_A."

Note that marking the commit as Failed and executing the compensating action has to be done atomically, so we use a new event for that. The compensating action itself can fail, but in this case we probably don't want to give up so we'll keep retrying the action indefinitely until it succeeds.

Step 5.1 or later - our service crashes: After the first request or both requests completed successfully, our service crashes. The commit stays in the Undispatched state. When our service is restarted it goes through all the undispatched events in the database. It will re-execute both events, both web services will ignore the requests because they've received them already, and finally our service will mark the commit as Dispatched.
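
To make the retry behaviour concrete, here is a minimal Python sketch of the dispatch loop under several assumptions: the commit record is a plain dict rather than a database row, BankService is a hypothetical stand-in for the banks' web services (with a failures counter to simulate outages), and the Failed status and compensating action are left out.

```python
import uuid


class BankService:
    """Hypothetical idempotent bank API: repeated transaction ids are ignored."""

    def __init__(self, balances, failures=0):
        self.balances = balances
        self.processed = set()
        self.failures = failures  # number of upcoming calls that will fail

    def execute(self, transaction_id, account, amount):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("bank unavailable")
        if transaction_id in self.processed:
            return  # already applied, ignore the duplicate
        self.processed.add(transaction_id)
        self.balances[account] += amount


def create_commit(amount):
    # One commit record holding both events, each with its own unique id.
    return {
        "status": "Undispatched",
        "events": [
            {"id": str(uuid.uuid4()), "bank": "A", "account": "account_A", "amount": -amount},
            {"id": str(uuid.uuid4()), "bank": "B", "account": "account_B", "amount": amount},
        ],
    }


def dispatch(commit, banks):
    """Try to process all events; leave the commit Undispatched on any failure."""
    try:
        for event in commit["events"]:
            banks[event["bank"]].execute(event["id"], event["account"], event["amount"])
    except ConnectionError:
        return False  # a later retry will replay every event
    commit["status"] = "Dispatched"
    return True


banks = {
    "A": BankService({"account_A": 5000}),
    "B": BankService({"account_B": 0}, failures=1),  # first call to bank B fails
}
commit = create_commit(1000)
first_try = dispatch(commit, banks)   # withdraws from A, then fails at B
second_try = dispatch(commit, banks)  # A ignores the replay, B succeeds
print(first_try, second_try, commit["status"])
```

The second call replays both events, yet account_A is debited only once because bank A recognizes the transaction id it has already processed.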

Posted by Martin Konicek on 11:56 PM 6 comments

Non-blocking IO demystified

This is an article about the inner workings of non-blocking servers, that is, servers that don't block a thread per connected client.

While some simply use the term asynchronous or non-blocking as a synonym for "fast", there seems to be little understanding of what it actually means.

The basic component of non-blocking code is an event-like interface:

onRequest: function(request) {
  response = ... // generate response
  respond(response)
}

The interesting part is what the underlying framework does to provide this interface. The answer is a message loop running under the covers:

while(true) {
  connection = poll_connection_from_OS()
  request = read_request_from(connection)
  onRequest(request)  // call user code
}

There are two observations to make at this point:

  • the server runs entirely on a single thread
  • if your onRequest code does something fast, only CPU bound, this approach is efficient

Now comes a "surprise". Of course, what you would usually do in your onRequest code is something like this:

onRequest: function(request) {
  obj = db.read_object(request.params("id"))
  respond(json(obj))
}

The problem is that the db.read_object is a blocking call. There is absolutely no way the thread can continue, because the method must return the database object. And remember, we are still on the single thread running the message loop.

Therefore, if a thousand clients come at the same time, they will get served one-by-one, the last one waiting for the 999 database calls to complete. In other words, the throughput of our server is terribly low.

So what's the solution to this problem? Well, here it is:

onRequest: function(request) {
  db.read_object(request.params("id"), function(obj) {
    respond(json(obj))
  })
  // returns immediately!
}

The whole difference is in the db library being itself non-blocking. What db.read_object does is that it puts the passed callback function inside some data structure and returns immediately, so our (single!) main thread can happily continue accepting requests. The db object itself is then running its own message loop internally (on its own thread, so our server has two threads now). In its internal loop, the db object polls for responses from the external database and calls back our function.

Now, if a thousand clients come at the same time, a thousand requests to the database will be fired almost instantly and remembered by the db object, and a response will be sent back to each individual client as soon as the database returns each one of the individual requested objects.

Now this is the awesome non-blocking IO that everyone is talking about. The server really is handling a thousand clients "in parallel" using only two threads.

Of course, there will still be one thousand open sockets but we managed to handle them all using only two threads.
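
The effect is easy to reproduce with Python's asyncio. This is a sketch, not the server described above: asyncio runs one event loop on a single thread instead of two cooperating loops, and read_object is a made-up stand-in that simulates a 50 ms database call with asyncio.sleep.

```python
import asyncio
import time


async def read_object(object_id):
    # Stand-in for a non-blocking database call: while this coroutine waits,
    # the event loop is free to run other handlers.
    await asyncio.sleep(0.05)
    return {"id": object_id}


async def on_request(request_id):
    obj = await read_object(request_id)
    return obj


async def serve(n):
    # Start all handlers at once; they all wait for the "database" concurrently.
    return await asyncio.gather(*(on_request(i) for i in range(n)))


start = time.perf_counter()
responses = asyncio.run(serve(1000))
elapsed = time.perf_counter() - start
print(len(responses), "responses in", round(elapsed, 2), "s")
```

Served one by one, a thousand 50 ms calls would take about 50 seconds; here the waits overlap, so the total should stay close to the duration of a single call.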

Appendix - what to do when you only have a blocking client library:

If the only API your client library gives you is 'obj = db.read_object(id)', you will essentially need to do the following:

onRequest: function(request) {
  threadPool.queue(function() {
    obj = db.read_object(request.params("id"))
    respond(json(obj))
  })
}

This way you free the IO thread (the one running the message loop) for accepting more incoming requests, but your server is now blocking, since it uses one thread per client (each of those threads simply sits idle, waiting for the response from the database). If many requests come at once, the threadpool will start new threads. When a maximum number of threads is reached, the calls will be just queued, waiting for threads to complete and become available, therefore seriously limiting throughput.
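
The throughput limit is easy to observe with a small experiment. The sketch below uses Python's ThreadPoolExecutor; read_object is a made-up blocking call simulated with time.sleep, and the request and thread counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_object(object_id):
    # Stand-in for a blocking database call: the calling thread just waits.
    time.sleep(0.1)
    return {"id": object_id}


def handle_all(n_requests, max_threads):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        results = list(pool.map(read_object, range(n_requests)))
    return results, time.perf_counter() - start


# 20 requests on 4 threads have to queue up; on 20 threads they all run at once.
_, few_threads = handle_all(20, max_threads=4)
_, many_threads = handle_all(20, max_threads=20)
print(round(few_threads, 2), round(many_threads, 2))
```

With only 4 threads each thread has to serve 5 requests in a row, so the run takes roughly five times as long as the run with one thread per request.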

The takeaway from this article is: in order for your request handling code to be non-blocking, it has to be composed entirely of non-blocking API calls. A "non-blocking" server toolkit by itself does not guarantee high concurrency/throughput.

Posted by Martin Konicek on 2:24 AM 9 comments

How to implement a rule engine in C#


Recently there was an interesting question on StackOverflow about creating a rule engine in C#.

Say we have a collection of Users (with Name, Age, ...) and rule definitions like this:

static List<Rule> rules = new List<Rule> {
    new Rule("Age", "GreaterThan", "20"),
    new Rule("Name", "Equal", "John"),
    new Rule("Tags", "Contains", "C#")
};

and we want to be able to evaluate the rules:

// Returns true if User satisfies given rule (e.g. 'user.Age > 20')
bool Matches(User user, Rule rule)
{
    // how to implement this?
}
 
One obvious solution is to use reflection. Here we will show a solution which uses Expression trees to compile the rules into fast executable delegates. We can then evaluate the rules as if they were normal boolean functions.

public static Func<T, bool> CompileRule<T>(Rule r)
{
    // the parameter has the generic type T, so the method works for any class, not just User
    var paramUser = Expression.Parameter(typeof(T));
    Expression expr = BuildExpr<T>(r, paramUser);
    // build a lambda function T -> bool and compile it
    return Expression.Lambda<Func<T, bool>>(expr, paramUser).Compile();
}
 
static Expression BuildExpr<T>(Rule r, ParameterExpression param)
{
    var left = MemberExpression.Property(param, r.MemberName);
    var tProp = typeof(T).GetProperty(r.MemberName).PropertyType;
    ExpressionType tBinary;
    // is the operator a known .NET operator?
    if (ExpressionType.TryParse(r.Operator, out tBinary))    {
        var right = Expression.Constant(Convert.ChangeType(r.TargetValue, tProp));
        // use a binary operation, e.g. 'Equal' -> 'u.Age == 15'
        return Expression.MakeBinary(tBinary, left, right);
    } else {
        var method = tProp.GetMethod(r.Operator);
        var tParam = method.GetParameters()[0].ParameterType;
        var right = Expression.Constant(Convert.ChangeType(r.TargetValue, tParam));
        // use a method call, e.g. 'Contains' -> 'u.Tags.Contains(some_tag)'
        return Expression.Call(left, method, right);
    }
}
 
Now we can implement the rule validation by compiling the rules and then simply invoking them:

var rule = new Rule("Age", "GreaterThan", "20");
Func<User, bool> compiledRule = CompileRule<User>(rule);
 
// true if someUser.Age > 20
bool isMatch = compiledRule(someUser);

Of course we compile all the rules just once and then use the compiled delegates:

// Compile all the rules once.
var compiledRules = rules.Select(r => CompileRule<User>(r)).ToList();
 
// Returns true if user satisfies all rules.
public bool MatchesAllRules(User user)
{
    return compiledRules.All(rule => rule(user));
}
 
Note that the “compiler” is so simple because we are using 'GreaterThan' in the rule definition, and 'GreaterThan' is a known .NET name for the operator, so the string can be directly parsed. The same goes for 'Contains' – it is a method of List. If we need custom names we can build a very simple dictionary that just translates all operators before compiling the rules:

Dictionary<string, string> nameMap = new Dictionary<string, string> {
    { "greater_than", "GreaterThan" },
    { "hasAtLeastOne", "Contains" }
};

That’s it. As an improvement we could add error messages to the rule definitions and print the error messages for the unsatisfied rules.
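
For comparison, the same compile-once idea can be sketched in Python. Note that this is not the expression-tree technique: plain closures stand in for compiled delegates, the OPERATORS table plays the role of the name map, and the User class, the lowercase attribute names and the already-typed target values are assumptions made for the example.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    age: int
    tags: list = field(default_factory=list)


# Map operator names to plain functions (left, right) -> bool.
OPERATORS = {
    "GreaterThan": operator.gt,
    "Equal": operator.eq,
    "Contains": operator.contains,
}


def compile_rule(member, op_name, target):
    """Turn a (member, operator, value) rule into a function User -> bool."""
    op = OPERATORS[op_name]
    return lambda user: op(getattr(user, member), target)


rules = [("age", "GreaterThan", 20), ("name", "Equal", "John"), ("tags", "Contains", "C#")]
# Look up the operators once, then reuse the resulting functions.
compiled_rules = [compile_rule(*rule) for rule in rules]


def matches_all_rules(user):
    return all(rule(user) for rule in compiled_rules)


print(matches_all_rules(User("John", 25, ["C#", "Scala"])))
print(matches_all_rules(User("John", 18, ["C#"])))
```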

If you found this useful, feel free to rate the answer on StackOverflow.

Posted by Martin Konicek on 1:10 PM 131 comments

Covariance and contravariance - simple explanation

This is a very concise tutorial on covariance and contravariance. In 10 minutes you should understand what these concepts are and how to use them. The examples are in Scala, but apply to Java or C# as well.

Covariance

Assuming Apple is a subclass of Fruit, covariance lets you treat say List[Apple] as List[Fruit].

val apples = List(new Apple(), new Apple())
processList(apples)

def processList(list:List[Fruit]) = {
  // read the list
}

This seems obvious - indeed, a list of apples is a list of fruit, right?

The surprise comes when we find out this does not work for arrays. Why is that so? Because you could do the following:

val a = Array(new Apple(), new Apple())
processArray(a)

def processArray(array:Array[Fruit]) = {
  array(1) = new Orange() // putting an Orange into array of Apples!
}

The main difference between List and Array here is that the List is immutable (you cannot change its contents) while the Array is mutable. As long as we are dealing with immutable types, everything is OK (as in the first example).

So how does the compiler know that List is immutable? Here is the declaration of List:

sealed abstract class List[+A]

The +A type parameter says "List is covariant in A". That means the compiler checks that there is no way to change contents of the List, which eliminates the problem we had with arrays.

Simply put, a covariant class is a class from which you can read stuff out, but you can't put stuff in.


Contravariance

Now that you understand covariance, contravariance will be easier - it is exactly the opposite in every sense.

You can put stuff in a contravariant class, but you can never get it out (imagine a Logger[-A] - you put stuff in to be logged). That doesn't sound too useful, but there is one particularly useful application: functions. Say you've got a function taking Fruit:

// isGoodFruit is a func of type Fruit=>Boolean
def isGoodFruit(f:Fruit) = f.ageDays < 3

and filter a list of Apples using this function:

val list:List[Apple] = List(new Apple(), new Apple())
list.filter(isGoodFruit) // filter takes a func Apple=>Boolean

So a function on Fruits is a function on Apples - the filter will throw Apples in and isGoodFruit will know how to handle them.

The type of isGoodFruit is actually Function1[Fruit, Boolean] - yes, in Scala even functions are traits, declared as:

trait Function1[-T1, +R]

So functions are contravariant in their parameter types and covariant in their return types.

OK, that's it; this is the minimal explanation I wanted to cover.

Posted by Martin Konicek on 11:39 PM 13 comments

Software engineering radio - best episodes

Software engineering radio is an excellent podcast full of in-depth information for developers; it contains high-quality content different from what you usually find online.

General

NoSQL and MongoDB with Dwight Merriman
Top 10 Architecture Mistakes with Eoin Woods
Being a consultant - honest, informal and funny
Software Craftsmanship with Bob Martin - concentrated motivation
Stefan Tilkov on REST - quite practical
Singularity research OS - microkernels, safety, static code analysis

Scala

Martin Odersky on Scala - great interview with the author of Scala
Scala Update with Martin Odersky - second half provides insights into possible future of programming

OmegaTau


Btw, Markus Völter (the guy behind se-radio) also does a podcast on technology and science. A software engineer talking to a nuclear fusion expert - what more could we want? ;) I really enjoyed the following episodes:

Astrobiology at the NASA Astrobiology Institute
Quantum computing - kudos for mentioning the "next technical revolution"
Nuclear Fusion at MPI für Plasmaphysik



Please help me find great episodes - if you know about an episode that you really enjoyed, post it into comments. Thanks!

Posted by Martin Konicek on 2:00 PM 10 comments

TFS addin for Outlook TaskConnect released!

TaskConnect connects Outlook with issue tracking systems in a very cool way. So far there is support for TFS (2008, 2010) and Redmine.

This is how TaskConnect looks inside Outlook:



It has many useful concepts making you deal with your tasks in no time. Be sure to check out the entirely new website.

Posted by Martin Konicek on 12:08 AM 2 comments

Scripting in Scala

Imagine we have a long text file, for example:

'CHINESE' : 'zh',
'DUTCH': 'nl',  
'ENGLISH' : 'en',
'SPANISH' : 'es',

And we want to turn it into this:

'zh' : 'Chinese',
'nl' : 'Dutch',  
'en' : 'English',
'es' : 'Spanish',

What we want to do is very simple: "Swap the two parts on every line!", and I believe it should be this simple also in code. Let's see how it looks in Scala:

import scala.io.Source

val lineRegex = "\\s*'(.+)'.*'(.+)'.*".r

for(line <- Source.stdin.getLines) {
   val lineRegex(name, code) = line
   println("'%s' : '%s'," format (code, name.toLowerCase.capitalize))
}

That's 5 lines including the import, and the code is readable. Now we can run:

cat languages.txt | scala ourScript.scala > output.txt

That's it.

Btw, let's look at the options we had for solving the problem:
  1. We don't want to be writing code in Java, C++ or C# and actually compiling an executable. Also you can imagine that the code in any of these languages would be overly complicated.
  2. OK, so we definitely should use a scripting language. What options do we have?
    • sed, awk, perl - Learn a (cryptic) language just for the purposes of scripting? No thanks.
    • Python, F#, Scala - All these languages have an amazing property that while you are already building your applications in them, you can use them for writing quick scripts. Scala is just elegant and more innovative than F# or Python. For a deeper look at how the magic with lineRegex works, see this. If unfamiliar with Scala, here is a quick overview.

Posted by Martin Konicek on 4:02 PM 2 comments

Posted by Martin Konicek on 6:46 PM 5 comments

New blogposts about programming Outlook add-ins

I published two new blogposts on the website of our add-in for Outlook called TaskConnect:
Adding custom data to Outlook emails
Searching emails from an Outlook add-in

Knowing the information published there before developing TaskConnect would have helped me a lot, so I'm sharing it. I hope you will also find it useful.

Posted by Martin Konicek on 12:00 AM 4 comments

TFS addin for Outlook TaskConnect goes live!

You can download the first beta of TaskConnect now. I've been working on this for the last year with a team of cool guys in Prague.
TaskConnect is quite unique as it integrates TFS into Outlook in a new, clever way:

  • TaskConnect is a full-fledged TFS client with fulltext search field at your fingertips right in Outlook
  • Work items can be attached to emails: when you are discussing a work item with someone, the work item is visible right next to the email throughout your whole conversation. You can of course edit it right from Outlook.
  • Saving your time is absolutely the primary goal - speed of use is just incomparable to any existing TFS client
We think this is such a good idea that it would be silly just to support TFS. So in the future you can expect Outlook integration of many popular issue tracking systems. We also have a ton of ideas for the future and we are looking forward to your feedback and ideas.

Posted by Martin Konicek on 4:50 PM 0 comments

C# 4 delegates contravariance

A little quiz.

Given the following two delegates:

delegate void EventHandler(object sender, EventArgs e);
delegate void PropertyChangedEventHandler(object sender, PropertyChangedEventArgs e);


Does the following code compile?

void propertyHandler(object sender, PropertyChangedEventArgs e)
{ }

EventHandler handler = propertyHandler;






Did you answer "YES, because PropertyChangedEventHandler IS an EventHandler"?






Unfortunately, the answer is no. Imagine how the following code would work:

handler(this, new EventArgs());


How could just EventArgs be passed to propertyHandler, which expects PropertyChangedEventArgs?

Actually, the exact opposite is correct in C# 4:

void handler(object sender, EventArgs e)
{ }

PropertyChangedEventHandler propertyHandler = handler;


Then, calling

propertyHandler(this, new PropertyChangedEventArgs("Name"))


is perfectly ok, because the handler just sees passed PropertyChangedEventArgs as EventArgs.

So the conclusion is: you can handle specialized events using less specialized handlers (e.g. you can handle the PropertyChanged event using just an EventHandler).

For more info, see C# delegates on msdn.

Posted by Martin Konicek on 4:58 AM 0 comments

Implementation of anonymous delegates in C#

What is the output of the following program?

int x = 5;
Action action = (() => { Console.WriteLine(x); });
x = 6;
action();



The output is of course 6, because delegates capture references to outer variables.

So how is this implemented in the compiler?
The compiler rewrites the code above to something like this:

Scope scope = new Scope();
scope.x = 5;
Action action = new Action(scope.Call);
scope.x = 6;
action();


and defines the Scope class like this:

private sealed class Scope
{
    public int x;

    public Scope() { }

    public void Call()
    {
        Console.WriteLine(this.x);
    }
}



As we would expect, there is no way to create a reference to e.g. an int in IL, so this is solved naturally - the int is wrapped inside the Scope class and all accesses go through the wrapper.
The wrapper also defines the Call method, whose body is exactly the body of the anonymous delegate.

Actually, the compiler-generated name for our Scope class is "<>c__DisplayClass1", and the Call method is named "<Main>b__0".
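
The same behaviour, and the same kind of hand-written wrapper, can be tried out in a few lines of Python. This is only an analogy: Python closures are implemented with cells rather than a generated class, and the Scope class below is written by hand for illustration.

```python
def with_closure():
    x = 5
    action = lambda: x  # captures the variable x, not its current value
    x = 6
    return action()


class Scope:
    """Hand-written stand-in for the compiler-generated wrapper class."""

    def __init__(self):
        self.x = None

    def call(self):
        return self.x


def with_wrapper():
    scope = Scope()
    scope.x = 5
    action = scope.call  # bound method keeps a reference to scope, not a copy of x
    scope.x = 6
    return action()


print(with_closure(), with_wrapper())
```

Both functions return 6, because in both cases the action reads the variable at call time rather than at creation time.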

For a little more complex example, see this post on stackoverflow.

Posted by Martin Konicek on 4:18 PM 0 comments

HTTPS and DNS poisoning attack

The best explanation of HTTPS I have seen was written by Jeff Moser, highly recommended!

After reading the article, there was only one thing left unclear, so I asked the author, Jeff Moser, and he responded:

Me:


Just one thing is really unclear to me - DNS poisoning: The attacker obtains certificate from amazon.com, I enter "amazon.com" to browser, browser goes to attacker's site, which responds by valid amazon.com certificate signed by Verisign. How does the browser tell this is an attack?


Jeff:

Great question! Note that if an attacker did this, they'd run into trouble in the "Trading Secrets" section that I described. Without knowing Amazon.com's private key, they couldn't decrypt the pre-master secret that the client sends out because the certificate from Verisign has Amazon's public key. Thus, the client would use that public key (and not one an attacker generated).


DNS poisoning is an attack in which the attacker fools a DNS server. You type "amazon.com" in the browser, and the browser asks the DNS server to resolve the hostname, that is, to translate it to an IP address. Since the DNS server is poisoned, it returns the attacker's IP address and the browser connects to the attacker's server, while the address bar reads "amazon.com" - quite nasty.

Now everything is 100% clear, thanks Jeff!

Posted by Martin Konicek on 2:08 AM 0 comments

Custom IFormatProvider for doubles

The following example shows how to write a custom IFormatProvider which you can use in String.Format(IFormatProvider, ...).

public class DoubleFormatter : IFormatProvider, ICustomFormatter
{
    // always use dot separator for doubles
    private CultureInfo enUsCulture =
        CultureInfo.GetCultureInfo("en-US");

    public string Format(string format, object arg,
                            IFormatProvider formatProvider)
    {
        // format doubles to 3 decimal places
        return string.Format(enUsCulture, "{0:0.000}", arg);
    }

    public object GetFormat(Type formatType)
    {
        return (formatType == typeof(ICustomFormatter))
            ? this : null;
    }
}



Having this formatter, we can use it like this:

double width = 15.77555;
double height = 12.8497979;
Console.WriteLine(
    string.Format(new DoubleFormatter(), "w={0} h={1}", width, height));



Output:

w=15.776 h=12.850



So now we have a reusable format for doubles - 3 decimal places with dot separator. That is nice, but this formatter is very simple - it formats everything (e.g. DateTime) with the "0.000" format. This is a fast version if you know that you will only use it for formatting lots of doubles.

The real version should look like this:

public class DoubleFormatter : IFormatProvider, ICustomFormatter
{
    // always use dot separator for doubles
    private CultureInfo enUsCulture =
        CultureInfo.GetCultureInfo("en-US");

    public string Format(string format, object arg,
                        IFormatProvider formatProvider)
    {
        if (arg is double)
        {
            if (string.IsNullOrEmpty(format))
            {
                // by default, format doubles to 3 decimal places
                return string.Format(enUsCulture, "{0:0.000}", arg);
            }
            else
            {
                // if user supplied own format use it
                return ((double)arg).ToString(format, enUsCulture);
            }
        }
        // format everything else normally
        if (arg is IFormattable)
            return ((IFormattable)arg).ToString(format, formatProvider);
        else return arg.ToString();
    }

    public object GetFormat(Type formatType)
    {
        return (formatType == typeof(ICustomFormatter)) ? this : null;
    }
}



Example:

Console.WriteLine(string.Format(new DoubleFormatter(),
    "Numbers {0} and {1:0.0}. " +
    "Now a string {2}, a number {3}, date {4} and object: {5}",
    1.234567, -0.57123456, "Hi!", 5, DateTime.Now, new object()));



Output:

Numbers 1.235 and -0.6. Now a string Hi!, a number 5, date 12.6.2009 17:11:35 and object: System.Object



This article should give you an overview of implementing a custom IFormatProvider. Now you should be able to modify the code to suit your specific needs.

Other examples of custom formatters can be found in MSDN. See the example with a formatter for 12-digit account numbers (12345–678–9012).

Posted by Martin Konicek on 2:02 PM 2 comments

Associating data with an event

I just solved the following problem - best explained by a concrete example:

WPF Animation.Completed is an event. I need to subscribe to this event and, when it fires, access custom data associated with it:

void animate(Graph graph)
{
    PointAnimation anim = new PointAnimation();
    anim.Completed += new EventHandler(anim_Completed);

    // in anim_Completed, I want to call graph.Fix() - how?
}

void anim_Completed(object sender, EventArgs e)
{
    // how do I access 'graph' here?
}



Seems like a tough problem. Storing 'graph' in, e.g., a static variable is not an option, especially if several graphs are animated at the same time.
Fortunately, C#'s lambdas come to the rescue:

void animate(Graph graph)
{
    PointAnimation anim = new PointAnimation();
    anim.Completed += new EventHandler((s, e) => { graph.Fix(); });
}



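This works because lambdas and anonymous methods capture the outer variables they use: each handler keeps its own 'graph' alive for as long as the handler itself is reachable.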
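To see that each handler really gets its own captured variable, here is a minimal console sketch. It does not use WPF: Graph and FakeAnimation are hypothetical stand-ins I made up so the example runs on its own, and Finish() simply plays the role of the animation completing.

```csharp
using System;

// Hypothetical stand-in for the real Graph class.
class Graph
{
    public string Name { get; }
    public bool IsFixed { get; private set; }
    public Graph(string name) { Name = name; }
    public void Fix() { IsFixed = true; }
}

// Hypothetical stand-in for PointAnimation: just an object with a Completed event.
class FakeAnimation
{
    public event EventHandler Completed;
    public void Finish() { Completed?.Invoke(this, EventArgs.Empty); }
}

static class ClosureDemo
{
    // Subscribes a handler that captures 'graph' and returns the animation
    // so that the caller can complete it later.
    public static FakeAnimation Animate(Graph graph)
    {
        var anim = new FakeAnimation();
        anim.Completed += (s, e) => graph.Fix();
        return anim;
    }

    public static void Main()
    {
        var a = new Graph("A");
        var b = new Graph("B");
        FakeAnimation animA = Animate(a);
        FakeAnimation animB = Animate(b);

        animB.Finish(); // only the handler that captured 'b' runs
        Console.WriteLine($"{a.Name} fixed: {a.IsFixed}, {b.Name} fixed: {b.IsFixed}");

        animA.Finish(); // now the handler that captured 'a' runs
        Console.WriteLine($"{a.Name} fixed: {a.IsFixed}, {b.Name} fixed: {b.IsFixed}");
    }
}
```

Completing animB fixes only graph B, which is exactly what a single static variable could not guarantee with several animations in flight.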

Posted by Martin Konicek on 2:55 PM 0 comments

Starting with Scala - slides from presentation

I have started learning the Scala programming language.

Scala rocks! It is statically typed, has functional features, and aims for developer productivity. In Scala you can often express what you want with less code than in C#. Scala runs on the JVM and any Java code can be called from Scala.

Here are the slides from my presentation at Charles University in Prague: pdf, powerpoint, examples (.zip)

Some of Scala's clever concepts surprised me:

  • Actors - a totally different way of thinking about concurrency than you may be used to. Instead of threads, shared state and locks, actors live independently, share no state, and communicate by sending each other messages.
  • Adding what look like new keywords to the language just by writing ordinary methods. Consider adding C#'s using or lock keyword to Scala, without compiler support!
  • Traits - behaviors can be mixed in freely at instantiation time, sort of a built-in Strategy design pattern.
  • Absence of the static keyword - it turns out it is not needed! Instead of having static state and methods on a class, you declare a singleton object and call its methods.

If you want to see concrete examples, check out the slides... it's coding time ;)

Links to get started:
First steps to Scala - a great tutorial.
Twitter on Scala - detailed interview - did you know that Twitter runs on Scala?

Thanks to my friends Joe and Pz for support!

Posted by Martin Konicek on 12:21 AM 4 comments

I'm in!

Great news! I've been accepted to Google Summer of Code. I'll be working on debugger visualizers for SharpDevelop.
This is a great opportunity to continue my work on the Object graph visualizer, and even get it integrated into an open source IDE.
I'll also be working on other visualizers and things (one of them is totally cool, in my opinion).
The cool thing about this is that none of these visualizers is present in Visual Studio.
I hope the visualizers will make your debugging experience even better :)

Posted by Martin Konicek on 11:52 AM 2 comments

Debugger Visualizer for Visual Studio

Recently, I have been playing with implementing a Visual Studio add-in that displays your data structures as graphs live as you debug.



The graph on the right shows the current state of the data structure being watched. The graph updates as you step through the code.
Here is an example of a more complex structure:



The layout is done using the GLEE graph layout engine, which could possibly be replaced by Graphviz or something else.

As I suspected before starting the implementation, there is an important issue to solve: when the user steps in the debugger, the object graph is rebuilt from scratch and the layout engine calculates a new layout. Even if the graph changes only slightly, the layout can change significantly and confuse the user.
I am currently thinking about matching nodes from the current step to the nodes from the previous step (provided the graph doesn't change much between steps, which is typically true - and when it changes too much, we don't mind layout changes anyway). The matching of the two graphs could be used to preserve the layout until the nodes really have to move. And if they have to move, they should move in a way that preserves their positions relative to each other. There is room for some interesting algorithms here.
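The simplest form of this idea can be sketched in a few lines. This is not the add-in's code, only an illustration under assumptions: I assume every node can be given a stable string id (for example derived from the debugged object's identity), and LayoutMatcher, Point and Merge are hypothetical names. Matched nodes keep their previous position, new nodes take the position proposed by the layout engine, and nodes that disappeared are dropped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal 2D point; a real add-in would use the layout engine's own type.
struct Point
{
    public double X, Y;
    public Point(double x, double y) { X = x; Y = y; }
}

static class LayoutMatcher
{
    // previous: node id -> position shown in the previous debugger step
    // fresh:    node id -> position proposed by the layout engine for this step
    // Returns the positions to display for this step.
    public static Dictionary<string, Point> Merge(
        IDictionary<string, Point> previous,
        IDictionary<string, Point> fresh)
    {
        var result = new Dictionary<string, Point>();
        foreach (var pair in fresh)
        {
            Point old;
            // a node that existed before stays where it was; a new node uses the fresh layout
            result[pair.Key] = previous.TryGetValue(pair.Key, out old) ? old : pair.Value;
        }
        return result;
    }

    public static void Main()
    {
        var previous = new Dictionary<string, Point>
        {
            { "root", new Point(0, 0) },
            { "left", new Point(-50, 80) }
        };
        var fresh = new Dictionary<string, Point>
        {
            { "root", new Point(10, 5) },
            { "left", new Point(-40, 90) },
            { "right", new Point(60, 90) }
        };

        foreach (var pair in Merge(previous, fresh))
            Console.WriteLine($"{pair.Key}: ({pair.Value.X}, {pair.Value.Y})");
    }
}
```

Here "root" and "left" keep their old coordinates and only "right" is placed by the engine. The sketch deliberately ignores the hard parts mentioned above: it does not detect overlaps, and it does not move nodes while preserving their relative positions.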

The current source code of the add-in is available here.

Posted by Martin Konicek on 3:01 PM 5 comments