I am Joannes Vermorel, founder of Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.




Nearly all web APIs get paging wrong

Data paging, that is, the retrieval of a large amount of data through a series of smaller data retrievals, is a non-trivial problem. Through Lokad, we have implemented about a dozen extensive API integrations, and reviewed a few dozen other APIs as well.

The conclusion is that as soon as paging is involved, nearly all web APIs get it wrong. Obviously, rock-solid APIs like the ones offered by Azure or AWS get it right, but those outstanding APIs are the exception rather than the norm.

The obvious pattern that doesn't work

I have lost count of the APIs that propose the following broken pattern to page through their data, a purchase order history for example:
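
    GET /purchase-orders?page=2&pagesize=25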

Where page is the page number and pagesize the number of orders to be retrieved. This pattern is fundamentally unsafe. Any order deleted while the enumeration is in progress will shift the indices, which, in turn, is likely to cause another order to be skipped.

There are many variants of the pattern, and every time the problem boils down to this: the "obvious" paging pattern leads to a flawed implementation that fails whenever concurrent writes are involved.

The "updated_after" filter doesn't work either

Another popular approach for paging is to leverage a filter on the update timestamp of the elements to be retrieved, that is:
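
    GET /purchase-orders?updated_after=2017-03-01T22:00:00Z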

Then, in order to page the request, the client is supposed to take the most recent updated_at value from the response and feed this value back to the API to further enumerate the elements.

However, this approach does not (really) work either. Indeed, what if all elements have been updated at once? This can happen because of a system upgrade or because of any kind of bulk operation. Even if the timestamp can be narrowed down to the microsecond, if there are 10,000 elements to be served all having the exact same update timestamp, then the API will keep sending a response where max(updated_at) is equal to the request timestamp.

The client is not enumerating anymore; the pattern has failed.

Sure, it's possible to tweak the timestamps to make sure that all the elements gracefully spread over distinct values, but it's a very non-trivial property to enforce. Indeed, a datetime column isn't really supposed to be defined with a uniqueness constraint in your database. It's feasible, but odd and error-prone.
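
A minimal sketch of the stall, in Python, with an in-memory fetch_orders standing in for the actual API (all names here are illustrative):

    # 10,000 elements sharing the exact same update timestamp.
    orders = [{"id": i, "updated_at": "2017-03-01T22:00:00.000000Z"}
              for i in range(10000)]

    def fetch_orders(updated_after, page_size=100):
        # The server filters on updated_at >= updated_after; a strict '>'
        # filter would silently skip the 9,900 remaining elements instead.
        return [o for o in orders if o["updated_at"] >= updated_after][:page_size]

    cursor = "1970-01-01T00:00:00.000000Z"
    while True:
        batch = fetch_orders(cursor)
        new_cursor = max(o["updated_at"] for o in batch)
        if new_cursor == cursor:
            break  # stuck: max(updated_at) equals the request timestamp
        cursor = new_cursor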

The fallacy of the "power" APIs

Some APIs provide powerful filtering and sorting mechanisms. Through those mechanisms, it is possible to implement paging correctly, for example by combining two filters: one on the update datetime of the items and one on the item identifier. A correct implementation is far from trivial, however.

Merely offering the possibility to do the right thing is not sufficient: doing the right thing should be the only possibility. This is something that Lokad learned the hard way early on: web APIs should offer one and only one way to do each intended operation.

If the API offers a paging mechanism but the only way to implement paging correctly is to not use it, then rest assured that the vast majority of client implementations will get it wrong. From a design viewpoint, it's like baiting developers into a trap.

The "continuation token" as the only pattern that works

To my knowledge, there is only about one pattern that works for paging: the continuation token pattern.
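
    GET /purchase-orders
    GET /purchase-orders?continuation_token=c3a9...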

Where every request to a paged resource, like the purchase orders, may return a continuation token on top of the returned elements whenever not all elements could be returned in one batch.

On top of being correct, that pattern has two key advantages:

  • It's very hard to get it wrong on the client side. There is only one way to do anything with the continuation token: feed it back to the API (a client sketch follows this list).
  • The API is not committed to returning any specific number of elements (in practice, a high upper bound can still be documented). Then, if some elements are particularly heavy or if the server is already under a heavy workload, smaller chunks can be returned.
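
As a minimal client-side sketch in Python (the endpoint and field names are illustrative, not any specific vendor's API):

    import requests  # third-party HTTP client

    def enumerate_purchase_orders(base_url):
        token = None
        while True:
            params = {"continuation_token": token} if token else {}
            response = requests.get(base_url + "/purchase-orders", params=params)
            response.raise_for_status()
            payload = response.json()
            for order in payload["items"]:
                yield order
            token = payload.get("continuation_token")
            if token is None:
                return  # no token: the enumeration is complete

The client makes exactly one decision per response: feed the token back, or stop.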

This enumeration should not provide any guarantee that the same element won't be enumerated more than once. The only guarantee that paging through tokens should provide is that, ultimately, all elements will be enumerated at least once. Indeed, you don't want to end up with tokens that embed some kind of state on the API side; in order to keep things smooth and stateless, it's important to lift this constraint.

Then, continuation tokens should not expire. This property is important in order to offer the client the possibility to perform incremental updates on a daily, weekly or even monthly schedule, depending on what makes sense from a business viewpoint.

No concurrency but data partitions

The continuation token does not support concurrent data retrieval: the next response has to be retrieved before the next request can be posted. Thus, in theory, this pattern somewhat limits the amount of data that can be retrieved.

Well, it's somewhat true, and yet mostly irrelevant. First, Big (Business) Data is exceedingly rare in practice, as the transaction data of even the largest companies tends to fit on a USB key. For all the APIs that we have integrated, putting aside the cloud APIs (aka Azure or AWS), there was not a single integration where the amount of data came even close to justifying concurrent data access. Slow data retrieval is merely a sign of non-incremental data retrieval.

Second, if the data is so large that concurrency is required, then partitioning the data is typically a much better approach. For example, if you wish to retrieve all the data from a large retail network, then the data can be partitioned per store. Partitioning makes things easier both on the API side and on the client side, as the sketch below illustrates.
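
A sketch of the partitioned variant, reusing the enumeration loop above (store identifiers and routes are, again, illustrative):

    # One independent, sequential enumeration per store partition; the
    # partitions themselves can be handed to separate workers if needed.
    store_ids = ["store-001", "store-002", "store-003"]

    def enumerate_all_stores(base_url):
        for store_id in store_ids:
            yield from enumerate_purchase_orders(base_url + "/stores/" + store_id)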


You don't know how much you'd miss an O/C mapper till you get one

When we started moving our enterprise app toward Windows Azure, we quickly realized that scalable enterprise cloud apps were tough to develop, real tough.

Windows Azure wasn't at fault here, quite the opposite actually; rather, the cloud computing paradigm itself makes enterprise apps tough to develop. Indeed, scalability in enterprise apps can't be solved by just piling up tons of memcached servers.

Enterprise apps aren't about scaling out some super-simplistic webapp to a billion users who will be performing reads 99.9% of the time, but rather about scaling out complex business logic along with the accordingly complex business data.

This led us to implement Lokad.Cloud, an open source .NET O/C mapper (object-to-cloud), similar in spirit to O/R mappers such as NHibernate but tailored for NoSQL storage.

I am proud to announce that Lokad.Cloud has reached its v1.0 milestone.

As a matter of fact, you've probably never heard of O/C mappers, so I will explain why relying on a decent O/C mapper should be a primary concern for any ambitious cloud app developer.

To illustrate the point, I am going to list a few subtleties that arise as soon as you start using the Queue Storage. As far as cloud apps are concerned, Queue Storage is one of the most powerful and most handy abstractions to achieve true scale-out behavior.

Microsoft provides the StorageClient, which is basically a .NET wrapper around the REST API offered by the Queue Storage. Let's see how an O/C mapper implemented on top of the StorageClient can make queues even better (a condensed sketch of these behaviors follows the list):

  • Strongly typed messages: Queue Storage deals with binary messages, not with objects. Obviously, you don't want to entangle your business logic with serialization/deserialization logic. Business logic only cares about the semantics of the processing, not about the underlying data format used for persistence while the data transits over the cloud. The O/C mapper is here to provide a strongly typed view of the Queue Storage.
  • Overflowing messages: Queue Storage caps messages at 8kB. This limitation is fine, as the Blob Storage is available to deal with large (even gigantic) blobs. Yet again, you don't want to mix storage contingencies (the 8kB message limit) with your business logic. The O/C mapper lets large messages overflow into the Blob Storage.
  • Garbage collection: you might think that manually handling overflowing messages is just fine. Not quite. What happens to your overflowing messages, conveniently stored in the Blob Storage, if the queue (for good or ill reasons) happens to be cleared? Simple: you end up with a cloud storage leak; dead pieces of data start to pile up in your storage, and you get charged for them. In order to avoid such a situation, you need a cloud garbage collector that makes sure expired data is automatically collected. The O/C mapper embeds a storage garbage collector.
  • Auto-deletion of messages: messages should not only be retrieved from the queue, but also deleted once processed. Following the GC idea, developers should not be expected to delete queue messages when the message processing goes OK, much like you don't have to care about destroying objects that go out of reach. The O/C mapper auto-deletes queue messages upon process completion.
  • Delayed messages: Queue Storage does not offer any simple way to schedule a message to reappear in the queue at a specified time. You can come up with your own custom logic, but again, why should the business logic even bother with such details? The O/C mapper supports delayed messages so that you don't have to think about it.
  • Poisoned queues: this one is deadly subtle. A poisoned queue message refers to a message that leads to faulty processing, typically an uncaught exception being thrown by the business logic while trying to process the message. The problem is intricately coupled with the good behavior of the queue: if a retrieved message fails to be deleted within a certain amount of time, the message reappears in the queue. This behavior is excellent for building robust cloud apps, but deadly if not properly handled. Indeed, faulty messages are going to fail and reappear over and over, consuming ever-increasing cloud resources for no good reason. In a way, poisoned messages represent processing leaks. The O/C mapper detects poisoned messages and isolates them for further investigation and eventual reprocessing once the code is fixed.
  • Abandoning messages: in the cloud, you should not expect VM instances to stay up forever. In addition to hardware faults, the fabric might decide at any time to shut down one of your instances. If a worker instance gets shut down while processing a message, then the processing is lost until the message reappears in the queue. Nevertheless, such an extra delay might negatively impact your business service level, as an operation that was supposed to take only half a minute might suddenly take 1h (the expiration delay of your message). If the VM gets the chance to be notified of the upcoming shutdown, the O/C mapper abandons in-process messages, making them available for processing again without waiting for expiration.
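
Here is a condensed, self-contained sketch of those behaviors in Python, with in-memory structures standing in for Queue Storage and Blob Storage (this mimics the ideas, not the actual Lokad.Cloud API):

    import pickle, uuid

    MAX_MESSAGE_BYTES = 8 * 1024   # Queue Storage message size limit
    MAX_DEQUEUE_COUNT = 3          # beyond this, a message is "poisoned"

    queue, blobs, poison = [], {}, []

    def put_message(obj):
        data = pickle.dumps(obj)                   # strongly typed in, bytes out
        if len(data) > MAX_MESSAGE_BYTES:          # overflow into Blob Storage
            blob_name = "overflow/" + str(uuid.uuid4())
            blobs[blob_name] = data
            data = pickle.dumps({"overflow": blob_name})
        queue.append({"data": data, "dequeue_count": 0})

    def process_next(handler):
        if not queue:
            return
        msg = queue.pop(0)
        msg["dequeue_count"] += 1
        body = pickle.loads(msg["data"])
        blob_name = body.get("overflow") if isinstance(body, dict) else None
        if blob_name:
            body = pickle.loads(blobs[blob_name])
        try:
            handler(body)
            # auto-deletion: on success the message is simply not re-enqueued,
            # and its overflow blob (if any) is garbage-collected
            if blob_name:
                del blobs[blob_name]
        except Exception:
            if msg["dequeue_count"] >= MAX_DEQUEUE_COUNT:
                poison.append(msg)   # isolate for investigation and reprocessing
            else:
                queue.append(msg)    # the message reappears, as after a timeout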

I have only illustrated a few points about Queue Storage here, but Blob Storage, Table Storage, the Management API, performance monitoring, etc., also need to rely on higher-level abstractions, as offered by an O/C mapper such as Lokad.Cloud, to become fluently usable.

Don't waste any more time crippling your business logic with cloud contingencies, and start using some O/C mapper. I suggest Lokad.Cloud, but I admit this is a biased viewpoint.


Paging indices vs Continuation tokens

Developers coming from the world of relational databases are well familiar with indexed paging. Paging is rather straightforward:

  • Each row gets assigned a unique integer, starting from 0, and incrementing by 1 for each additional row.
  • The query is said to be paged because constraints specify that only the rows assigned an index greater than or equal to N and lower than N+PageSize are retrieved.
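
As a minimal sketch of this client-driven pattern, in Python (fetch_page is a hypothetical stand-in for the actual paged query):

    # Client-driven chunked enumeration: the client picks the chunk size
    # and advances the index itself.
    def enumerate_rows(fetch_page, page_size=100):
        n = 0
        while True:
            rows = fetch_page(n, page_size)   # rows N to N + page_size - 1
            if not rows:
                return
            yield from rows
            n += page_size  # the client, not the server, drives the paging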

I call such a pattern a chunked enumeration: instead of trying to retrieve all the data at once, the client app retrieves chunks of data, potentially splitting a very large enumeration into a very large number of much smaller chunks.

Indexed paging is a client-driven process. It is the client code (that is, the code retrieving the enumerated content) that decides how big each chunk is supposed to be, and it's the client code that is responsible for incrementally updating the indices from one request to the next. In particular, the client code might decide to make a fast-forward read on the data.

Although indexed paging is a well-established pattern, I have found that it's not such a good fit for cloud computing. Indeed, client-driven enumeration causes several issues:

  • Chunks may be highly heterogeneous in size.
  • Retrieval latency on the server-side might be erratic too.
  • If a chunk retrieval fails (chunk too big), client code has no option but to initiate a tedious trial-and-error process to gradually go for smaller chunks.
  • Chunking optimization is done on the client side, injecting replicated logic into every single client implementation.
  • Fast forward may be completely impractical to implement on the server side.

For those reasons, continuation tokens are usually favored in a cloud computing situation. This pattern is simple too:

  1. Request a chunk, passing a continuation token if you have one (not for the first call)
  2. The server returns an arbitrarily sized chunk plus, possibly, a continuation token.
  3. If no continuation token is retrieved, then the enumeration is finished.
  4. If a token is returned, then go back to 1, passing the token in the call.

Although this pattern looks similar to indexed paging, the constraints are very different. Continuation tokens are a server-driven process: it's up to the server to decide how much data should be sent for each request, which yields many benefits (a server-side sketch follows the list):

  • Chunk sizes and retrieval latencies can be made much more homogeneous.
  • The server has (potentially) much more local info to wisely choose appropriate chunk sizes.
  • Clients no longer hold complex optimization logic for data retrieval.
  • Fast-forward is disabled which leads to a simpler server-side implementation.

Then, there are even more subtle benefits:

  • Better resilience against denial of service. If the server suffers an overload, it can delay the retrieval by returning nothing but the current continuation token (a proper server-driven way of saying "busy, try again later" to the client).
  • Better client resilience to evolution. Indeed, the logic that optimizes the chunking process might evolve on the server side over time, but the client code is not impacted and implicitly benefits from those improvements.

Bottom line: unless you specifically want to offer support for fast-forward, you are nearly always better off relying on continuation tokens in your distributed computing patterns.


Azure Management API concerns

Disclaimer: this post is based on my (limited) understanding of the Azure Management API; I only started reading the docs a few hours ago.

Microsoft has just released the first preview of their Management API for Windows Azure.

As far as I understand the newly released API (check the MSDN reference), it just lets you automate what was done manually through the Windows Azure Console so far.

At this point, I have two concerns:

  1. No way to adjust your instance count for a given role.

  2. Auto-management (*) involves loads of quirks.

(*) Auto-Management: the ability for a cloud app to scale itself up and down depending on the workload.

I am not really satisfied with this Management API, as it does not seem to address the basic requirements for easily scaling my (future) cloud app up or down.

Being able to deploy a new Azure package programmatically is nice, but we were already doing that in Lokad.Cloud. Thanks to the AppDomain restart trick, I suspect we will keep deploying that way, as deployment through Lokad.Cloud is likely to still be 100x faster.

That being said, the Management API is powerful, but it does not seem to address auto-management, at least not in a simple fashion.

The single feature I was looking forward to was being able to adjust the number of instances on demand through a very, very simple API that would have let me do three things:

  1. Create a new instance for the current role.

  2. Shut down the current instance.

  3. Get the status of instances attached to the current role.

That's it!

Notice that I am not asking here to deploy a new package, or to change the production/staging status. I just need to be able to tweak the instance count.

In particular, I would expect a non-SSL REST API for those limited operations, much like the other REST APIs available for the cloud storage.
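
In sketch form, the hypothetical API I have in mind (purely illustrative paths, not actual Azure endpoints) would be no more than:

    POST   /roles/{roleName}/instances        -> create a new instance
    DELETE /roles/{roleName}/instances/{id}   -> shut an instance down
    GET    /roles/{roleName}/instances        -> list instances and their status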

Indeed, the security concerns related to instance count management are nearly identical to the ones related to the cloud storage. Well, not really: in practice, securing your storage is far more sensitive.


Thinking the Table Storage of Windows Azure

Disclaimer: I am not exactly a Table Storage expert. In this post, I am just trying to sort out my own thoughts about this service offered with Windows Azure. Check my follow-up post.

Soon after the announcement of the release of our new O/C mapper (object-to-cloud) named Lokad.Cloud, folks on the Azure Forums raised the question of the Table Storage.

Although it might be surprising, Lokad.Cloud does not provide - yet - any support for Table Storage.

At this point, I feel very uncertain about Table Storage, not in the sense that I do not trust Microsoft to end up with a finely tuned product, but rather at the patterns-and-practices level.

Basically, the Table Storage is an entity storage that features three special system properties:

  • PartitionKey: a grouping criterion; data having the same PartitionKey is kept close together.

  • RowKey: the unique identifier for the entity.

  • Timestamp: the equivalent of the Blob Storage ETag.

So far, I have gotten the feeling that many developers feel attracted to the Table Storage for the wrong reasons. In particular, Table Storage is not a substitute for your plain old SQL tables:

  • No support for transactions.

  • No support for keys (let alone foreign keys).

  • No possible refactoring (properties are frozen at setup).

If you are looking for those features, you're most likely betting on the wrong horse. You should be considering SQL Azure instead.

Then, some might argue that SQL Azure won't scale above 10GB (at least considering the current pricing plans offered by Microsoft). Well, the trick is that Table Storage won't scale either, at least not unless you're very cautious with your queries.

AFAIK, the only indexed column of the Table Storage is the RowKey. Thus, any filtering criterion based on custom entity properties is likely to suffer abysmal performance as soon as your Table Storage gets large.

Well, sort of: the most probable scenario is likely to be worse, as your queries are just going to time out after exceeding 60s.

Again, my goal here is not to bash the Table Storage, but it must be understood that the Table Storage is clearly not a magically scalable equivalent of the plain old SQL tables.

Back to Lokad.Cloud: we did not consider adding Table Storage support because we did not feel the need for it, although our forecasting back-end is probably very high in the current complexity spectrum of cloud apps.

Indeed, the Blob Storage is surprisingly powerful, with very predictable performance too:

  • Storing complex objects is a non-issue with a serializer at hand.

  • A blob name prefix is a very efficient substitute for the PartitionKey (see the sketch below).

Basically, it seems to me that any Table Storage operation can be executed with the same performance against the Blob Storage for now. Later on, when the Table Storage starts supporting secondary indexes, this situation is likely to evolve, but in the meantime I still cannot think of a single situation that would definitively support Table Storage over Blob Storage.
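
As a minimal sketch of the prefix trick, in Python, assuming a serializer at hand and an in-memory dict standing in for the Blob Storage container:

    import pickle

    blobs = {}  # stand-in for a Blob Storage container

    def put_entity(partition_key, row_key, entity):
        # the blob name plays the role of PartitionKey + RowKey
        blobs[partition_key + "/" + row_key] = pickle.dumps(entity)

    def get_partition(partition_key):
        # listing blobs by name prefix plays the role of a PartitionKey scan
        prefix = partition_key + "/"
        return [pickle.loads(v) for k, v in blobs.items() if k.startswith(prefix)]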