
Wednesday, Dec 22, 2010

A few tips for Web API design

During the spring 2010 iteration that led Lokad to release its Forecasting API v3, I thought a lot about how to design proper Web APIs in this age of cloud computing.

Designing a good Web API is surprisingly hard. Because Remote Procedure Call has been around forever, one might think that API design is a well-established practice; and yet, after suffering the defects of Forecasting API v1 and v2, we learned the hard way that it isn't.

In this post, I will try to cover some of the gotchas about Web API design that we learned the hard way.

1. Full API stack ownership

When you expose an API on the web, you tend to rely on building blocks such as XML, SOAP, HTTP, ... You must be prepared to accept full ownership of this stack: in other words, problems are likely to happen at all levels, and you must be ready to address them even if the problem arises in one of those building blocks.

For example, at Lokad, we initially opted for SOAP, and I now believe it was a mistake from Day 1. There is nothing really wrong with SOAP itself (*); the problem was that the Lokad team (myself included) was not ready to support SOAP in full. Whenever a client raised a subtle question about some obscure XML namespace or SOAP versioning matter, we were very far from our comfort zone. In short, we were relying on a complex SOAP toolkit which proved to be a leaky abstraction.

When it comes to a public API, you have to be ready to support all the abstraction leaks from TCP, HTTP, XML, SOAP... That's why, in the end, for our Forecasting API v3, we settled for POX (Plain Old XML) over HTTP with a basic REST philosophy. Now, when an issue arises, we can truly own the problem along with its solution, instead of being helpless in front of a leaky abstraction that we don't have the resources to embrace.

(*) Although there are definitely many gray areas with SOAP (just look at SOAP Exceptions for example).
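To make the contrast concrete, here is a minimal sketch of what a POX-over-HTTP call can look like from .NET. The endpoint URL and the XML layout are hypothetical, not the actual Forecasting API v3 contract; the point is that every layer (HTTP verb, content type, XML payload) is plainly visible and thus fully ownable.

    // Minimal POX-over-HTTP call: plain XML in, plain XML out.
    // The endpoint and document layout below are hypothetical.
    using System;
    using System.IO;
    using System.Net;
    using System.Text;
    using System.Xml.Linq;

    class PoxSample
    {
        static void Main()
        {
            var payload = new XElement("TimeSerie",
                new XAttribute("name", "sales2010"),
                new XElement("Value", 42)).ToString();

            var request = (HttpWebRequest)WebRequest.Create(
                "https://api.example.com/series/sales2010"); // hypothetical
            request.Method = "PUT";
            request.ContentType = "text/xml";

            byte[] bytes = Encoding.UTF8.GetBytes(payload);
            request.ContentLength = bytes.Length;
            using (Stream body = request.GetRequestStream())
                body.Write(bytes, 0, bytes.Length);

            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
                Console.WriteLine(reader.ReadToEnd()); // plain XML back
        }
    }

When something breaks here, there is no toolkit in between to second-guess.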

2. Be minimalistic

Dealing with all the software layers that sit between the client code and the Web API is bad enough; if the API adds complexity of its own, the task quickly becomes maddening.

While refactoring our Forecasting API v2 into the Forecasting API v3, I reduced the number of web methods from more than 20 down to 8. Looking back, it was probably one of the best insights I had.

When developing a software library, it is usually convenient for the client developer to benefit from plenty of syntactic sugar, that is to say, method variants that achieve the same purpose following various coding styles. For a Web API, the reverse is true: there should be one and only one way to achieve each task.

Redundant methods only cause confusion and extra friction when developing against the API, and trigger an endless stream of questions about whether method X should be favored over method X* when both achieve essentially the same thing.

3. Idempotence

The network is unreliable. At a low frequency, API calls will fail because of network glitches. The easiest way to deal with those glitches is simply to have retry policies in place: if the call fails, try again. If your API semantics are idempotent, retry policies will integrate seamlessly with your API. If not, each network glitch is likely to wreak havoc within the client app.
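As an illustration, a retry policy can be as bare-bones as the following sketch; the attempt count and delays are arbitrary illustrative values, not a recommendation from the Forecasting API documentation.

    using System;
    using System.Net;
    using System.Threading;

    static class RetryPolicy
    {
        // Retries a web call a few times when a network glitch occurs.
        // Only safe if the underlying call is idempotent.
        public static T Execute<T>(Func<T> apiCall, int maxAttempts)
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    return apiCall();
                }
                catch (WebException)
                {
                    if (attempt >= maxAttempts) throw;
                    // Transient glitch: wait a bit, then try again.
                    Thread.Sleep(TimeSpan.FromSeconds(attempt));
                }
            }
        }
    }

With idempotent semantics, every call can then be blindly wrapped, e.g. RetryPolicy.Execute(() => client.Upsert(series), 3) with some hypothetical client object.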

For example, coming from a SQL background, most developers might consider having both an INSERT (only for new items) and an UPDATE (only for existing items) method. INSERT is a bad choice, because if the method is attempted twice due to a retry, the second call will fail, probably triggering some unexpected exception on the client side. Instead, you should adopt UPSERT semantics, i.e. UPDATE or INSERT, which play much nicer with retry policies.
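The difference is easy to picture with an in-memory dictionary standing in for whatever storage backs the API (a sketch, not the actual Forecasting API implementation):

    using System.Collections.Generic;

    class SeriesStore
    {
        readonly Dictionary<string, double[]> _series =
            new Dictionary<string, double[]>();

        // INSERT-like semantics: a retried call blows up on the 2nd attempt.
        public void Insert(string name, double[] values)
        {
            _series.Add(name, values); // throws if 'name' already exists
        }

        // UPSERT semantics: calling twice with the same payload is harmless,
        // so blind retry policies integrate seamlessly.
        public void Upsert(string name, double[] values)
        {
            _series[name] = values;
        }
    }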

4. Explicit input/output limitations

No web request or response should be allowed to be arbitrarily large, as it would cause a lot of problems server-side in maintaining decent performance across concurrent web requests. This means that, from Day 1, every message that goes in or out of your API should be explicitly capped: don't let your users discover your API limitations through reverse engineering. In practice, no array, list or string should be left unbounded: a max length should always be specified.

Then, unless you precisely want to move plain text around, I suggest keeping tight limitations on any string being passed to your API. In the Forecasting API, we have opted for the regex pattern ^[a-zA-Z0-9]{1,32}$ for all string tokens. Obviously, this is rather inflexible, but again, unless your API is intended for plain text / binary data storage, there is no point in supporting the whole UTF-8 range, which can prove very hard to debug and can trigger SQL-injection-like problems.
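In practice, the server-side guards boil down to a few lines; the token pattern below is the one quoted above, while the cap of 1000 values per request is a made-up figure for illustration:

    using System;
    using System.Text.RegularExpressions;

    static class InputGuards
    {
        // The token pattern quoted above: 1 to 32 alphanumeric characters.
        static readonly Regex Token = new Regex("^[a-zA-Z0-9]{1,32}$");

        public static void CheckToken(string token)
        {
            if (token == null || !Token.IsMatch(token))
                throw new ArgumentException("Invalid token: " + token);
        }

        // Every incoming collection is explicitly capped (1000 is illustrative).
        public static void CheckValues(double[] values)
        {
            if (values == null || values.Length > 1000)
                throw new ArgumentException("Too many values in one request.");
        }
    }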

 

Monday, Jan 11, 2010

Scaling-down for Tactical Apps with Azure

Cloud computing is frequently credited with unleashing the scalability potential of your apps, and the press goes wild quoting the story of XYZ, a random Web 2.0 company that has gone from a few web servers to 1 zillion web servers in 3 days due to a massive traffic surge.

Yet, the horrid truth is: most web apps won’t ever need to scale, and an even smaller fraction will ever need to scale out (as opposed to scaling up).

A pair of servers already scales up to millions of monthly visitors for an app as complex as StackOverflow. Granted, the people behind SO did a stunning job at improving the performance of their app, but still, it illustrates that moderately scaling up already brings you very far.

At Lokad, although we direly need our core forecasting technology to be scalable, nearly all other parts do not. I wish we had so many invoices to process that we would need to scale out our accounting database, but I don’t see that happening any time soon.

Actually, over the last few months, we have discovered that cloud computing has the potential to unleash yet another force in the software industry: the power of scaled-down apps.

There is an old rule of thumb in software development that says that increasing the complexity of a project by 25% increases the development effort by 200%. Obviously, it does not look too good for the software industry, but the bright side is: if you cut the complexity by 20%, then you halve the development effort as well.

Based on this insight, I have refined Lokad's strategy with tactical apps. Basically, a tactical app is a stripped-down web app:

  • not part of our core business: if the app crashes, it’s annoying, but not a showstopper.
  • features are fanatically stripped down.
  • from idea to live app in less than 2 weeks, with a single developer in command.
  • no need to scale, or rather, the probability of needing scalability is really low.
  • addresses an immediate need; ROI is expected in less than a few months.

Over the last couple of weeks, I have released 3 tactical apps based on Windows Azure: Lokad.Translate, Lokad.Debug and Lokad.Leads.

Basically, each app took me less than 10 full working days to develop, and each app addresses some long-standing issue in its own narrow yet useful way:

  • Website localization had been a pain for us from the very beginning. Formalized processes were tedious, and by the time the results were obtained, translations were already outdated. Lokad.Translate automates most of the mundane aspects of website localization.
  • Helping partners figure out their own implementation bugs while they were developing against our Forecasting API was a slow, painful process. We had to spend hours guessing what the problem in the partner's code could be (as we typically don’t have access to the code). Lokad.Debug addresses this issue.
  • When helping prospects figure out how to integrate Lokad into their IT, we end up facing about 20 new environments (ERP/CRM/MRP/eCommerce/…) every week, which is a daunting task for a small company such as Lokad. Hence, we really need to plug partners in, and Lokad.Leads is helping us do that in a more straightforward manner.

Obviously, if we were to reach 10k visits per day for any one of those apps, that would already be a LOT of traffic.

The power of Windows Azure for tactical apps

Tactical apps are not so much a type of app but rather a fast-paced process to deliver short-term results. The key ingredient is simplicity. In this respect, I have found the combination of Windows Azure + ASP.NET MVC + SQL Azure + NHibernate + OpenID to be a terrific combo for tactical apps.

Basically, ASP.NET MVC offers an app template that is ready to go (following the Ruby on Rails motto of convention over configuration). Actually, for Lokad.Translate and Lokad.Debug, I did not even bother re-skinning the app.

Then, Windows Azure + SQL Azure offer an ultra-standardized environment. No need to think about setting up the environment: the environment is already set up, and it leaves you very little freedom to change anything, which is GREAT as far as productivity is concerned.

Also, ultra-rapid development is obviously error-prone (which is OK, because tactical apps are really simple). Nevertheless, Azure provides very strong isolation from one app to the next (VM-level isolation). It does not matter much if one app fails and dies from some terminal design error; the damage will be limited to the app itself anyway. Obviously, that would not have been the case in a shared environment.

Finally, through OpenID (and its nice .NET implementation), you can externalize the bulk of your user management (think registration, confirmation emails, and so on).

At this point, the only major limitation for tactical apps is the Windows Azure pricing, which is rather unfriendly to this sort of app, but I expect the situation to improve over 2010.


Saturday, Jan 9, 2010

Lokad.Translate v1.0 released (and best wishes for 2010)

A few weeks ago, I discussed the idea of continuous localization. In summary, the whole point is to do for localization (of either websites or webapps) what the continuous integration server does for integration.

Obviously, the translation itself should be done by professional translators, as automated translation tools are still light years away from the required quality.

Beyond this aspect, nearly all the mundane steps involved in localization work can be automated.

This project has taken the shape of an open-source contribution codenamed Lokad.Translate. This webapp is based on ASP.NET MVC / C# / NHibernate and targets Windows Azure.

This first release comes as a single-tenant app. We are hosting our own instance at translate.lokad.com, but if you want to start using Lokad.Translate, you will need to set up your own.

Considering that Lokad.Translate uses both a WebRole and a WorkerRole (plus a 1 GB SQL Azure instance), hosting Lokad.Translate on Azure should cost a bit less than $200/month: two roles at $0.12 per hour amount to roughly $175/month, plus about $10 for the database (see the Windows Azure pricing for the details).

Obviously, that's pretty steep pricing for a small webapp. It's not surprising that "Make it less expensive to run my very small service on Azure" comes as the No. 1 most-voted community feature.

Yet, I think the situation will improve during 2010. Many cloud hosting providers such as RackSpace are already featuring small VMs that would be vastly sufficient for a single-tenant version of Lokad.Translate. Assuming Microsoft offers similar VMs at some point, that alone would drop the price to around $30/month.

If we also add that CPU pricing isn't going to stay stuck at $0.12 forever, the hosting price of Lokad.Translate is likely to drop significantly within 2010.

Obviously, the most efficient way to cut the hosting costs would be to turn Lokad.Translate into a multi-tenant webapp. Depending on community feedback, we might consider going down that road later on.

The next milestone for Lokad.Translate will be to add support for RESX files in order to support not only website localization, but webapp localization as well.

Friday, Dec 11, 2009

Right level of interruption and productivity

Lokad was at LeWeb'09 along with the other BizSpark One startups. It was a great time for us. Our showcase client, Pierre-Noël Luiggi, CEO and founder of Oscaro, was at our booth and did a tremendous job explaining why inventory and back-office matter if you intend to build a great eCommerce business.

Among the various discussions I had with Pierre-Noël during those two days of hectic trade show, we talked a lot about productivity in the presence of interruptions (LeWeb'09 was focused on the real-time web, hence real-time interruptions).

At Lokad, we use loads of communication tools, and this seemed to Pierre-Noël a rather puzzling aspect of my company, as he expected mathematicians / developers to need an awful lot of concentration, and thus to stay as far as possible from all those distracting Web 2.0 thingies.

Well, it turned out that Pierre-Noël's insights were 100% correct: working on complicated projects such as Lokad requires an insane amount of concentration to get things right. Yet, it is precisely because of this requirement that we use so many tools in the first place: we leverage those tools to help us, as a team, keep the interruption level as low as possible.

Basically, we have a range of interruption levels that goes as follows (ordered by decreasing intrusiveness):

  1. Face to face meetings
  2. Phone calls
  3. Skype
  4. SMS
  5. Email
  6. Google Wave
  7. Blogs (private, public)
  8. Task Trackers
  9. Wiki

Face-to-face meetings come on top as the most interrupting communication means. Although we don't have a policy that forbids knocking on somebody else's door, we try to keep all direct interactions for early morning, lunch time or late evening.

Then, we also try to avoid company meetings altogether. As a matter of fact, we have virtually no regular company meeting of any kind. It's not that we have any religious thinking about it; it's just that, after close examination, it appears there are very few situations that actually call for a meeting with more than 2 people.

Synchronous media such as phone calls or Skype chat come in 2nd as the most intrusive media. The policy is to keep those media for events that really need immediate attention. For example, an alert related to our forecasting infrastructure typically falls in this category. Yet, if the matter is not pressing to the minute, synchronous media should be avoided in favor of less interrupting means.

SMS and email come next as the most disruptive asynchronous media, although they are already way less intrusive than synchronous communications. Our policy is to treat emails as hot potatoes that need prompt attention: you don't need to process the item right now, but an email should be addressed within hours, not days. Obviously, if the message does not need to (or cannot) be addressed within hours, then our policy is not to send an email in the first place.

Google Wave is the latest entrant in our communication practices at Lokad. It might ultimately supplant email altogether, but for now we keep it separated from our regular email practices. What makes GW so appealing is that email is terribly ill-designed to deal with ongoing discussions. Any discussion that involves more than 5 emails is a usability nightmare, no matter how much effort smart email clients (GMail, Outlook 2010) put into recovering conversation threads out of the underlying mess. Google Wave has a much more natural model to deal with long, ongoing discussions.

The policy at Lokad is now to move any complex topic away from email, as there is no chance that such an issue could be addressed within hours. Dealing with 100+ reply conversations is just a breeze with Google Wave, while it was a nightmare with email.

Note: I can't wait for Microsoft Outlook to support Wave as a client; that would be an incredible combo.

Still, Google Wave does require attention from your audience, as sending a wave is like forcing people into reading what you've just written. In this respect, Google Wave is no different from email.

Then come the lazy asynchronous media, the least intrusive kind. Blogs, trackers and wikis fall into this category. I call those media lazy because you do not force any attention from your audience. It's up to the audience to decide whether it has enough time and interest to read your content.

At Lokad, everyone has a personal private blog that we call a worklog. Everyone is expected to post once a week (and no more), giving a status update on what they've been doing during the week, and what they expect to do next week. In our limited experience, this practice gives consistency across all individual actions without the massive overhead typically involved with weekly team meetings (which never existed at Lokad anyway).

The true strength of those lazy media is the opportunity you give people not to read you. Your audience knows better than you do what it really needs to know and learn, and when the need will occur. Hence, rather than pushing content toward people, it lets them pull info whenever it feels appropriate.

Monday, Nov 30, 2009

Continuous Localization or l10n 2.0

There is nothing easier to sell globally than software. Yet, it's still surprising to me how little effort is made, on average, by small software companies toward localization.

Disclaimer: I am not saying that selling software anywhere is easy. Some places are really tough. I am just saying that selling just about anything else at a worldwide level is 10x harder.

Translation is (relatively) cheap

Localization (l10n for short) is easier and cheaper than you think. In a past project of mine, a few years ago, I managed to get a web app (a freelance marketplace) translated into 13 languages for less than $2000. Yes, that's right: roughly $150 per language.

The first thing to understand here is that freelance translation is cheaper than usually thought. Generic freelancing considerations apply, but compared to the massive efforts needed to actually develop and maintain any small piece of software, translation is just dirt cheap.

... but management is not ...

Yet, if translation is inexpensive, managing translators is not. Each time I took over localization work, I realized that managing half a dozen remote translators on an ongoing basis nearly required a full-time commitment on my side.

I think most companies realize this effect intuitively. The bottom-line result is that localization is typically performed in big batches.

Big batches seem to be the archetype of the non-agile process. Every two years, package all the documents and hand them over to some translation agency. Wait for two months. Publish the (already outdated) documents, and wait some more. Two years later, once the documents are desperately outdated, repeat.

I can't blame the community for doing it that way, as I was doing no better. Yet, this process felt wrong. Since localization is such a big, time-consuming mess, we do it only once in a while, and in the meantime prospects and customers suffer outdated materials on an ongoing basis, which, somehow, is even worse than a poor-quality translation.

... hence Continuous Localization

Among all the good practices in software development, I have found continuous integration to be one of the few breakthroughs that have significantly improved agility in project management. The core idea behind continuous integration is that integration becomes part of your daily process.

Instead of updating the deployment logic once every 18 months, you do it on an ongoing basis, so that the software is always ready to ship. Yet, continuous integration comes with a gotcha: you can't do it by hand. It takes another layer of automation: the integration server.

Thinking about it, localization is similar to integration.

The simple idea of incremental localization without some automation seems doomed, as it would require insane communication efforts between the manager and the translators.

Then, what about adding this automation layer to simplify the process?

Let's see how the localization process could be automated:

  1. (automated) Get all source documents and incremental updates.
  2. (automated) Map updates to every target language.
  3. (manual) Apply the corresponding incremental updates to the target documents.
  4. (automated) Keep track of the amount of work done by each translator.
  5. (automated) Keep track of work batches to get translators paid.

Obviously, the one step that cannot be automated is the translation operation itself; but all the other steps can be vastly automated.

This idea has been the starting point of a project codenamed Lokad.Translate. This project is nothing more than a webapp playing the role of a localization server, providing all the automation we can get to speed up the localization process, on the management side but on the translator side as well.

Tech note: Lokad.Translate is ASP.NET MVC + NHibernate on top of Azure.

Since we did not want to reinvent the wheel, we decided to leverage the capabilities of the wiki powering our own company website. In particular, in order to retrieve the list of incremental changes, we use nothing but the web feed of recent changes generated by the wiki (RSS in the present case, although it does not matter much). The nice thing about web feeds is that most webapps already provide one (think blogs).
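As a sketch, the "recent changes" retrieval amounts to a few lines in .NET thanks to System.ServiceModel.Syndication; the feed URL below is hypothetical:

    using System;
    using System.ServiceModel.Syndication;
    using System.Xml;

    class RecentChanges
    {
        static void Main()
        {
            // Hypothetical 'recent changes' feed exposed by the wiki.
            using (XmlReader reader = XmlReader.Create(
                "http://www.example.com/wiki/recent-changes/rss"))
            {
                SyndicationFeed feed = SyndicationFeed.Load(reader);
                foreach (SyndicationItem item in feed.Items)
                {
                    // Each item points to a source document that was updated,
                    // hence whose translations need to be refreshed.
                    Console.WriteLine("{0} updated on {1}",
                        item.Links[0].Uri, item.LastUpdatedTime);
                }
            }
        }
    }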

Then, concerning document management (both originals and translations), there is a shortcut: there is no need to manage the documents themselves, as managing the URLs pointing to the documents is enough. Once the URL is known, if a REST API is provided by the wiki, all the other commands (view/diff/edit/...) can be inferred with simple regexes.
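Here is a sketch of that inference, assuming hypothetical wiki URL patterns (each wiki engine has its own scheme, hence the regex):

    using System.Text.RegularExpressions;

    static class WikiUrls
    {
        // Hypothetical wiki URL scheme: capture the page name once,
        // then derive every other command by simple substitution.
        static readonly Regex PageUrl = new Regex(
            @"^http://www\.example\.com/wiki/(?<page>[^/?]+)$");

        public static string EditUrl(string viewUrl)
        {
            string page = PageUrl.Match(viewUrl).Groups["page"].Value;
            return "http://www.example.com/wiki/" + page + "?action=edit";
        }

        public static string DiffUrl(string viewUrl, int fromRevision)
        {
            string page = PageUrl.Match(viewUrl).Groups["page"].Value;
            return "http://www.example.com/wiki/" + page
                + "?action=diff&from=" + fromRevision;
        }
    }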

My objective would be to achieve near-continuous localization of the content posted on our website, with, say, no more than 2 weeks of delay between an initial post and its localization, with minimal overhead on both the management and translator sides. We will soon start deploying Lokad.Translate for our internal needs; we will see how it goes.

Then, depending on community feedback, we will probably release Lokad.Translate one way or another. Stay tuned for more (and don't hesitate to contact me if you're interested).