I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.


Over the Internet, your name is your personal trademark

I have been dealing with freelancers for various tasks (translation, graphic design, development), and I still find it unbelievable that most freelancers pay no attention to keeping a consistent name in their communications. Let me clarify this point: I do not care about the exact legal name of any freelancer I am dealing with. But how can I even recognize the person if messages never get signed twice with the same name?

Over the Internet, your name is your personal trademark. If you're not careful, people will simply not remember who you are, and this rule isn't restricted to freelancers. The most common consequence of poor name branding is that people will filter out your communication attempts (e-mail, instant messenger and the like) as spam. A clear naming policy means that your name must be explicit and obvious in all your communications, ranging from Skype to regular postal mail.

In my experience, the most common naming inconsistencies are the following:

  • The e-mail address must completely match the person's name. If your name is John Smith, then your e-mail address must be […], not […] or […]. If you have a lengthy name, so will your e-mail address.

  • The Skype/MSN/Whatever username must completely match the person's name too. I have found instant messaging practices to be even worse. Fantasy names like batman4ever are not uncommon. A good practice is to use your e-mail address as an instant messenger id.

  • Lack of a personal home page: Google is the de facto yellow pages. When somebody types your name, your home page must come up. If your name is really John Smith then it's going to be tough. Well, in that case, just add some random nickname between the first and the last name to become distinguishable.

It may appear obvious to some of us, but it seems that many people do not realize that the lack of consistency in their communications has a strong (negative) impact on the people they are communicating with, especially if the name is the only concrete reference to the person.


Don't booby trap your ASP.Net Session state

The ASP.Net Session state is a pretty nice way to store a (limited) amount of data on the server side. I am not going to introduce the ASP.Net session itself in this post; you can refer to the previous link if you have no idea what I am talking about.

Although many articles can be found on the web arguing that the major issue with the session state is scalability, don't believe them! Well, as long as you're not doing crazy things like storing your whole DB in the session state, which is not too hard to avoid. Additionally, 3rd party providers (see ScaleOut for example) seem to offer pretty much plug-and-play solutions if you're in desperate need of more session space (disclaimer: I haven't tested any of those products myself).

In my (limited) experience, I have found that the main issue with the session state is the lack of strong typing. Indeed, the lack of strong typing has a lot of painful consequences, such as:

  • Hard collisions: The same session key gets used to store 2 distinct objects of different .Net types. This situation is not too hard to debug because the web application will crash in a sufficiently brutal way to be quickly detected.

  • Smooth collisions: The same session key gets used to store 2 distinct objects of the same type. This situation is much worse than the previous case because, if you're unlucky, your web app will not crash. Instead, your users will just experience very weird app behaviors from time to time. Users browsing with multiple tabs will be among the first victims.

  • No documentation: static fields get documented, right? There is no good reason to leave session state entries undocumented.

  • No refactoring: Errare Humanum Est, session keys do get poorly or inconsistently named like any other class/method/whatever. No automated refactoring means that manual refactoring attempts will introduce bugs with a high probability.

A simple approach to solve most of those issues consists of strongly typing your session keys. This can be easily achieved with the following pattern:

partial class MyPage : System.Web.UI.Page
{
    // Ns = the namespace, SKey = SessionKey
    const string Foo1SKey = "Ns.MyPage.Foo1";
    const string Foo2SKey = "Ns.MyPage.Foo2";
}

Rather than typing the session key as a string literal each time, you use the const string fields. If you need to share session keys between several pages, then a dedicated class (gathering the session keys) can be used along the same lines. This pattern basically solves all the issues mentioned above.

  • Collisions get avoided because of the combination of namespace and class prefixes in the session key definitions (*).

  • Documentation is straightforward, just add a <summary/> documentation tag to the const string definition.

  • Refactoring is easy, just refactor the field or change its value (depending on the refactoring intent).

(*) It would have been possible to prefix all session keys directly in the code with the very same combination of namespace and class name, but that becomes a real pain if you start using the session frequently in your code.
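To make the pattern more concrete, here is a minimal sketch of how the const keys might be used through typed properties. It is only an illustration under stated assumptions: FakeSession is a hypothetical stand-in for the ASP.Net session (so that the sketch runs outside a web application), and the Foo1/Foo2 property names and their types are made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the ASP.Net session, used only so that this
// sketch can run outside a web application. In a real page, the Session
// property of the page would be used instead.
public class FakeSession : Dictionary<string, object> { }

public class MyPage
{
    /// <summary>Session key for Foo1 (assumed here to hold a string).</summary>
    const string Foo1SKey = "Ns.MyPage.Foo1";

    /// <summary>Session key for Foo2 (assumed here to hold an int).</summary>
    const string Foo2SKey = "Ns.MyPage.Foo2";

    readonly FakeSession session;

    public MyPage(FakeSession session)
    {
        this.session = session;
    }

    // The cast lives in exactly one place, next to the key definition.
    public string Foo1
    {
        get { return session.TryGetValue(Foo1SKey, out var v) ? (string)v : null; }
        set { session[Foo1SKey] = value; }
    }

    public int Foo2
    {
        get { return session.TryGetValue(Foo2SKey, out var v) ? (int)v : 0; }
        set { session[Foo2SKey] = value; }
    }
}
```

With such properties, the rest of the page reads and writes Foo1 and Foo2 without ever spelling out a key or a cast, so renaming a key or changing its value touches a single line.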


Resx2Word, when simplistic is not enough

RESX files are great (and simple) containers of textual resources for your .Net/Asp.Net applications. They are especially useful if you're planning to translate your application into multiple languages (PeopleWords has been translated into 13 languages, with all textual content put into RESX files). Yet, using Microsoft Visual Studio as a RESX file editor is quite an overkill for translators (whose programming skills are often nil, since it's not their job anyway). In a previous post, I discussed ResxEditor, a simplistic and stand-alone RESX file editor.

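Où que tu sois je te retrouverai, car si tu ne viens pas à Lagardère, Lagardère ira à toi! ("Wherever you are, I will find you, for if you do not come to Lagardère, Lagardère will come to you!" In other words: if you do not come to RESX, RESX will come to you.)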

Yet, I am still not entirely satisfied with ResxEditor. Indeed, during the translation process of PeopleWords, half of the translators (smart and educated btw) were surprised by the very existence of text editors other than MS Word. I imagine that this kind of thing can happen if you have been working your entire life with MS Word.

As a result, no matter how many times you tell those translators not to use Word, they can't resist the urge, and the RESX file gets opened and translated through Word ... and then funny things happen. For example, I now have several translations of the Microsoft RESX instructions ("Microsoft ResX Schema, Version 2.0, The primary goals of this format is to allow a simple XML format that is mostly human readable. ..."), the large XML comment created by VS by default at the beginning of each RESX file. This XML comment is going to be one of the most translated pieces of MS literature (I do not think that the VS engineers were expecting this when they wrote those RESX instructions).

In order to escape the curse of the RESX instructions translation, I have just released Resx2Word, a RESX to MS Word converter (and vice versa). Naturally, it's not possible to convert an arbitrary MS Word document to RESX; only MS Word documents generated by Resx2Word can be converted back into RESX by Word2Resx. Any feedback?
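For the curious, the first step of such a conversion can be sketched in a few lines. This is not the actual Resx2Word code, just a minimal illustration that assumes the usual RESX layout (data elements carrying a name attribute and a value child, directly under the root element); the ResxReader class and its ReadEntries method are hypothetical names. The point is that only the name/value pairs are extracted, so the large XML comment never reaches the translator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Resx2Word.
public static class ResxReader
{
    // Returns the (name, value) pairs found in the RESX content.
    // XML comments and non-data elements are simply ignored.
    public static List<KeyValuePair<string, string>> ReadEntries(string resxXml)
    {
        XDocument doc = XDocument.Parse(resxXml);
        return doc.Root
            .Elements("data")
            .Select(d => new KeyValuePair<string, string>(
                (string)d.Attribute("name"),
                (string)d.Element("value") ?? string.Empty))
            .ToList();
    }
}
```

A converter could then write each pair as one row of a two-column table in the Word document, and read the rows back to rebuild the RESX file.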


Motivations behind the "PeopleWords free invitations"

I have just recently upgraded PeopleWords (online platform for the translation business). Among various small fixes and improvements, PeopleWords now provides free invitations for the translators. If you're not familiar with the "invitation feature" of PeopleWords (it happens that some people are not), then just have a look at our white paper. In this post, I will explain the (commercial) motivations underlying this feature.

I have already explained (see my previous post) that there is a strong imbalance of risks in freelance translation jobs. The risk is much higher on the customer side than on the translator side (well, at least the "perceived" risk, because, in my experience, the risk is low anyway). Indeed, translators "feel" intuitively that there is little risk for them in multiplying their job sources (it does not really matter where the job comes from). Customers, on the contrary, are seeking stable and reliable translators and are quite reluctant to send their offers "into the wild".

As a direct consequence of this perception, there is a huge imbalance between the number of registered translators and the number of registered customers on PeopleWords. Basically, the number of registered translators is more than one order of magnitude greater than the number of registered customers. In my opinion, it's a really bad situation because it means that, on average, rather than relying on a dedicated platform (such as PeopleWords), customers rely on e-mail based processes to get their documents translated. As a customer, my experience indicates that managing freelancers by e-mail is just hell.

Did I say that to the Russian translator? Or maybe it was the e-mail for the Spanish translator? Did I not already pay the Polish guy? Or maybe that was just the previous Polish job? How many Japanese documents do I have left untranslated? What was the price initially agreed for the Chinese translation? Was it consistent with the previous translation job that was completed last week?

Therefore, I have the feeling that there are many benefits for translators in using a platform (as opposed to e-mails), even for their own personal customers. Yet, in that case, PeopleWords was taking a 10% fee that would have been considered unacceptable. The (free) invitations have been designed so that a translator can invite his own customers and leverage the PeopleWords platform without having to pay the regular 10% fee.

From the viewpoint of PeopleWords, why should I provide such a free service? If translators start using PeopleWords for free, how am I going to buy the coffee that I need every morning? The immediate (but wrong) answer would be: once the customer is registered on PeopleWords, he will start posting offers to see if he can get lower prices (thus leaving the translator who brought him to PeopleWords). This situation is very unlikely because of the risk imbalance mentioned above. Once the customer knows a good translator, he is not going to switch to save a few bucks, especially if it's the company's money anyway.

A more probable situation is this: once the customer is registered on PeopleWords, one day or another, he will need translations in languages that the original translator is not able to provide. In such cases, posting an offer on PeopleWords is the most straightforward option for the customer; and thanks to the 10% fee, I am able to buy some coffee the next day.


Additional goodies from the blog spammers

An interesting thing about running a web application is that people never cease to surprise you. I have already discussed the behavior of the scammers within PeopleWords. Now, I am encountering a new kind of annoying people: the blog spammers.

Given that PeopleWords has no blog, there is no reason for blog spammers to get interested in PeopleWords, right?

Wrong. PeopleWords has no blog, but blog spammers don't care. What blog spammers detect is the link toward an XML feed.

<link rel="alternate" type="application/atom+xml"
title="PeopleWords, JobBot" href="AlertAtomFeed.aspx" />

This XML feed does not carry blog posts, but PeopleWords translation jobs. As a side effect, the zombie army[*] commanded by spammers considers any website providing an XML feed to be a blog, and therefore tries to fill up all the available forms with various crappy content.

On PeopleWords, the only form accessible without user registration is actually the user login form. As a consequence, I get a dozen blog spamming attempts every day, generating errors that are caught in the server logs.

[*] The PeopleWords server logs indicate that the blog spam attempts are coming from a large number of different IPs. This would indicate that the blog spammers are using a large number of ordinary (but infected) machines over the Internet.