Author

I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.


Entries in localization (8)

Saturday, Jan 9, 2010

Lokad.Translate v1.0 released (and best wishes for 2010)

A few weeks ago, I discussed the idea of continuous localization. In summary, the whole point is to do for localization (of either websites or webapps) what the continuous integration server does for integration.

Obviously, the translation itself should be done by professional translators, as automated translation tools are still light years away from the required quality.

Beyond this aspect, nearly all the mundane steps involved in localization work can be automated.

This project has taken the shape of an open-source contribution codenamed Lokad.Translate. This webapp is based on ASP.NET MVC / C# / NHibernate and targets Windows Azure.

This first release comes as a single-tenant app. We are hosting our own instance at translate.lokad.com, but if you want to start using Lokad.Translate, you will need to set up your own.

Considering that Lokad.Translate uses both a WebRole and a WorkerRole (plus a 1 GB SQL Azure instance), hosting Lokad.Translate on Azure should cost a bit less than $200 / month (see the Windows Azure pricing for the details).

Obviously, that's pretty steep pricing for a small webapp. It's not surprising that "Make it less expensive to run my very small service on Azure" is the No. 1 most-voted feature request from the community.

Yet, I think the situation will improve within 2010. Many cloud hosting providers such as RackSpace already offer small VMs that would be more than sufficient for a single-tenant version of Lokad.Translate. Assuming that Microsoft offers similar VMs at some point, that alone would drop the price to around $30.

If we also consider that CPU pricing isn't going to stay stuck at $0.12 forever, the hosting price of Lokad.Translate is likely to drop significantly within 2010.
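As a rough check, the estimate above can be reproduced with back-of-the-envelope arithmetic. The sketch below (in Python, for brevity) assumes that the $0.12 figure is billed per instance-hour, that both roles run around the clock over a 30-day month, and that the database costs about $10 / month; that last figure is a placeholder, not a quoted price.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, instances always on


def monthly_hosting_cost(instances, hourly_rate, sql_monthly_fee):
    """Estimate the monthly bill: compute hours plus a flat database fee."""
    compute = instances * hourly_rate * HOURS_PER_MONTH
    return compute + sql_monthly_fee


# One WebRole + one WorkerRole at $0.12 per instance-hour.
# The $10 database fee is an assumed placeholder, not a quoted price.
estimate = monthly_hosting_cost(instances=2, hourly_rate=0.12, sql_monthly_fee=10.0)
print(f"Estimated cost: ${estimate:.2f} / month")
```

Under these assumptions the compute hours dominate the bill, which is why smaller VMs or a lower hourly rate would change the picture so much.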

Obviously, the most efficient way to cut the hosting costs would be to turn Lokad.Translate into a multi-tenant webapp. Depending on community feedback, we might consider going down that road later on.

The next milestone for Lokad.Translate will be to add support for RESX files in order to support not only website localization, but webapp localization as well.

Monday, Nov 30, 2009

Continuous Localization or l10n 2.0

Nothing is easier to sell globally than software. Yet, it still surprises me how little effort small software companies put, on average, into localization.

Disclaimer: I am not saying that selling software anywhere is easy. Some places are really tough. I am just saying that selling just about anything else worldwide is 10x harder.

Translation is (relatively) cheap

Localization (l10n for short) is easier and cheaper than you think. In a past project of mine, a few years ago, I managed to translate a web app (a freelance marketplace) into 13 languages for less than $2000. Yes, that's right, it was roughly $150 per language.

The first thing to understand here is that freelance translation is cheaper than usually thought. Generic freelance considerations apply, but compared to the massive efforts needed to actually develop and maintain any small piece of software, translation is just dirt cheap.

... but management is not ...

Yet, if translation is inexpensive, managing translators is not. Each time I took over localization work, I realized that managing half a dozen remote translators on an ongoing basis required a nearly full-time commitment on my side.

I think most companies realize this effect intuitively. The bottom-line result is that localization is typically performed in big batches.

Big batches seem to be the archetype of the non-agile process. Every two years, package all documents and hand them over to some translation agency. Wait for two months. Publish the (already outdated) documents, and wait more. Two years later, once the documents are desperately outdated, repeat.

I can't blame the community for doing it that way, as I was doing no better. Yet, this process felt wrong. Since localization is such a big time-consuming mess, we do it only once in a while, and in the meantime prospects and customers suffer outdated materials on an ongoing basis, which, somehow, is even worse than poor-quality translation.

... hence Continuous Localization

Among all good practices in software development, I have found continuous integration to be one of the few breakthroughs that have significantly improved agility in project management. The core idea behind continuous integration is that integration becomes part of your daily process.

Instead of updating the deployment logic once every 18 months, you do it on an ongoing basis, so that the software is always ready to ship. Yet, continuous integration comes with a gotcha: you can't do it by hand. It takes another layer of automation: the integration server.

Thinking about it, localization is similar to integration

The simple idea of incremental localization without some automation seemed doomed as it would require insane communication efforts between manager and translators.

Then, what about adding this automation layer to simplify the process?

Let's see how the localization process could be automated:

  1. (automated) Get all source documents and incremental updates.
  2. (automated) Map updates to every target language.
  3. (manual) Apply corresponding incremental updates to target documents.
  4. (automated) Keep track of the amount of work done by each translator.
  5. (automated) Keep track of work batches so that translators get paid.

Obviously, the one step that cannot be automated is the translation itself; all the other steps can be largely automated.
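The five steps above can be sketched as a small data model. This is only an illustration in Python, not how Lokad.Translate is implemented: the `Update` and `Task` types, the helper functions, the example.com URLs and the choice of counting changed words as the unit of work are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Step 1: an incremental change detected on a source document."""
    url: str
    changed_words: int


@dataclass
class Task:
    """One unit of manual work: apply an update to one target language."""
    update: Update
    language: str
    translator: str
    done: bool = False


def plan_tasks(updates, translators):
    """Step 2: map every update to every target language.

    `translators` maps a language code to the translator in charge of it.
    """
    return [Task(u, lang, who) for u in updates for lang, who in translators.items()]


def words_per_translator(tasks):
    """Steps 4 and 5: total completed work per translator, as a basis for payment."""
    totals = defaultdict(int)
    for task in tasks:
        if task.done:
            totals[task.translator] += task.update.changed_words
    return dict(totals)


updates = [Update("https://example.com/pricing", 120), Update("https://example.com/faq", 45)]
translators = {"fr": "alice", "de": "bob"}
tasks = plan_tasks(updates, translators)

# Step 3 (manual): translators mark tasks as done once the target page is updated.
for task in tasks:
    if task.language == "fr":
        task.done = True

print(words_per_translator(tasks))
```

The only human input in this sketch is flipping `done` on a task; everything else is bookkeeping that a server can do on its own.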

This idea has been the starting point of a project codenamed Lokad.Translate. This project is nothing more than a webapp playing the role of a localization server and providing all the automation that we can get to speed up the localization process - on the management side, and on the translator side as well.

Tech note: Lokad.Translate is ASP.NET MVC + NHibernate on top of Azure.

Since we did not want to reinvent the wheel, we decided to leverage the capabilities of the wiki powering our own company website. In particular, in order to retrieve the list of incremental changes, we use nothing but the web feed generated by the wiki (RSS in the present case, although it does not matter much) that represents recent changes. The nice thing about web feeds is that most webapps already provide one (think blogs).
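Retrieving the changed pages from a recent-changes feed takes little code. The sketch below is in Python and is not the project's actual implementation; the inline feed, the example.com URLs and the `changed_pages` helper are made up for illustration, and a real setup would download the feed instead of embedding it.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical recent-changes feed, embedded so that the sketch runs offline.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Recent changes</title>
    <item>
      <title>Pricing</title>
      <link>https://example.com/wiki/Pricing</link>
      <pubDate>Mon, 30 Nov 2009 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Pricing</title>
      <link>https://example.com/wiki/Pricing</link>
      <pubDate>Sun, 29 Nov 2009 18:30:00 GMT</pubDate>
    </item>
    <item>
      <title>FAQ</title>
      <link>https://example.com/wiki/FAQ</link>
      <pubDate>Sat, 28 Nov 2009 09:15:00 GMT</pubDate>
    </item>
    <item>
      <title>About</title>
      <link>https://example.com/wiki/About</link>
      <pubDate>Sun, 01 Nov 2009 12:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def changed_pages(feed_xml, since):
    """Return the URLs of pages changed after `since`, without duplicates."""
    root = ET.fromstring(feed_xml)
    seen, pages = set(), []
    for item in root.iter("item"):
        link = item.findtext("link")
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published > since and link not in seen:
            seen.add(link)
            pages.append(link)
    return pages


last_sync = datetime(2009, 11, 15, tzinfo=timezone.utc)
for url in changed_pages(SAMPLE_FEED, last_sync):
    print(url)
```

Several edits to the same page collapse into a single entry here, since the translator only needs to know which pages to revisit.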

Then, concerning document management (both originals and translations), there is a gotcha: there is no need to manage the documents themselves, as managing the URLs pointing to the documents is enough. Once the URL is known, if a REST API is provided by the wiki, all other commands (view/diff/edit/...) can be inferred with a simple regex.
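To illustrate how commands can be inferred from a page URL alone, here is a minimal sketch in Python. The URL scheme (`/wiki/<PageName>`), the `?action=...` query strings and the `-<language>` suffix for translated pages are all hypothetical conventions chosen for the example; an actual wiki will have its own patterns.

```python
import re

# Hypothetical URL scheme: https://<host>/wiki/<PageName>
PAGE_URL = re.compile(r"^(?P<base>https?://[^/]+/wiki)/(?P<page>[^/?#]+)$")


def command_urls(page_url, language=None):
    """Derive the view/edit/history URLs of a page (or of its translation)."""
    match = PAGE_URL.match(page_url)
    if match is None:
        raise ValueError(f"not a recognized page URL: {page_url}")
    base, page = match["base"], match["page"]
    if language:
        page = f"{page}-{language}"  # assumed naming convention for translations
    return {
        "view": f"{base}/{page}",
        "edit": f"{base}/{page}?action=edit",
        "history": f"{base}/{page}?action=history",
    }


for name, url in command_urls("https://example.com/wiki/Pricing", language="fr").items():
    print(name, url)
```

With such a mapping, the localization server only has to store one URL per source page; the links handed to translators are computed on demand.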

My objective would be to achieve a near-continuous localization of the content posted on our website with, say, no more than 2 weeks of delay between the initial post and its localization, with minimal overhead on both the management side and the translator side. We will soon start deploying Lokad.Translate for our internal needs; we will see how it goes.

Then, depending on community feedback, we will probably release Lokad.Translate one way or another. Stay tuned for more (and don't hesitate to contact me if you're interested).

Saturday, Jul 28, 2007

RESX utilities open-sourced

Due to popular demand, I have finally open-sourced my RESX utilities. All the content (source code as well as binaries) is now available at resx.sourceforge.net, released under the GPL open-source license.

The release includes RESX Editor, a simple yet efficient RESX file editor. It can be very handy if the translator is not very familiar with either XML or Visual Studio.

The release also includes Resx2word, a RESX to Microsoft Word converter. The converter has been packaged as command-line utilities (resx2word.exe and word2resx.exe).

Saturday, Oct 14, 2006

ResxEditor reloaded - version 1.2 released

There were a couple of long-standing issues with ResxEditor. Most of them were actually reported as comments on the blog post of the initial ResxEditor release. All of those issues are now fixed. Bug fixes and new features are detailed on the ResxEditor page.

Special thanks to Nick Pasko for carrying out most of the work and finding a solution to get rid of the previous cell saving behavior that was driving translators nuts.

Saturday, Jul 22, 2006

Resx2Word, when simplistic is not enough

RESX files are great (and simple) containers of textual resources for your .Net/Asp.Net applications. They are especially useful if you're planning to translate your application into multiple languages (PeopleWords has been translated into 13 languages, with all textual content being put into RESX files). Yet, using Microsoft Visual Studio as a RESX file editor is quite an overkill solution for translators (whose programming skills often amount to zero, since it's not their job anyway). In a previous post, I was discussing ResxEditor, a simplistic and stand-alone RESX file editor.

Où que tu sois je te retrouverai, car si tu ne viens pas à Lagardère, Lagardère ira à toi! (If you do not come to RESX, RESX will come to you)

Yet, I am still not entirely satisfied by ResxEditor. Indeed, during the translation process of PeopleWords, half of the translators (smart and educated btw) were surprised by the mere existence of other text editors beyond MS Word. I imagine that this kind of thing can happen if you have been working your entire life with MS Word.

As a result, no matter how many times you tell those translators not to use Word, they can't resist the urge, and the RESX file gets opened and translated through Word ... and then funny things happen. For example, I now have several translations of the Microsoft RESX instructions ("Microsoft ResX Schema, Version 2.0, The primary goals of this format is to allow a simple XML format that is mostly human readable. ..."), the large XML comment created by VS by default at the beginning of each RESX file. This XML comment is going to be one of the most translated pieces of MS literature (I do not think that the VS engineers were expecting this when they wrote those RESX instructions).

In order to escape the curse of the RESX instructions translation, I have just released Resx2Word, a RESX to MS Word converter (and vice-versa). Naturally, it's not possible to convert a generic MS Word document to RESX; only MS Word documents generated by Resx2Word can be converted back into RESX by Word2Resx. Any feedback?
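To give an idea of the round trip involved, here is a minimal sketch in Python that pulls only the translatable strings out of a RESX file and writes translations back. It is not the Resx2Word implementation, and the Word export itself is not shown; the sample content and the two helper functions are made up for illustration. It relies on the RESX layout where each string lives in a `data` element with a `name` attribute and a `value` child.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed RESX content.
SAMPLE_RESX = """<root>
  <!-- Microsoft ResX Schema (instructions that must not be translated) -->
  <resheader name="version">
    <value>2.0</value>
  </resheader>
  <data name="Welcome" xml:space="preserve">
    <value>Welcome to PeopleWords</value>
  </data>
  <data name="Logout" xml:space="preserve">
    <value>Log out</value>
  </data>
</root>"""


def translatable_strings(resx_xml):
    """Return {resource name: text}, ignoring comments and resheader entries."""
    root = ET.fromstring(resx_xml)
    return {d.get("name"): d.findtext("value", default="") for d in root.findall("data")}


def apply_translations(resx_xml, translations):
    """Return a new RESX document with the given resources replaced."""
    root = ET.fromstring(resx_xml)
    for d in root.findall("data"):
        value = d.find("value")
        if value is not None and d.get("name") in translations:
            value.text = translations[d.get("name")]
    return ET.tostring(root, encoding="unicode")


print(translatable_strings(SAMPLE_RESX))
french = apply_translations(SAMPLE_RESX, {"Logout": "Déconnexion"})
print(translatable_strings(french))
```

Since the translator only ever sees the extracted name/text pairs, the instructions comment never reaches them; note that this particular parser drops XML comments, so the regenerated file in the sketch no longer contains it.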