I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.


Entries in storage (9)


A few design tips for your NoSQL app

Since the migration of Lokad toward Windows Azure about 18 months ago, we have been relying almost exclusively on NoSQL - namely Blob Storage, Table Storage and Queue Storage. Similar cloud storage abstractions exist for all major cloud providers; you can think of them as NoSQL as a Service.

It took us a significant effort to redesign our apps around NoSQL. Indeed, cloud storage isn't a new flavor of SQL, it's a radically different paradigm and it required in-depth adjustment of the core architecture of our apps.

In this post, I will try to summarize the gotchas we picked up while (re)designing our apps.

You need an O/C (object to cloud) mapper

NoSQL services are orders of magnitude simpler than your typical SQL service. As a consequence, the impedance mismatch between your object oriented code and the storage dialect is also much lower compared to SQL; this is a direct consequence of the relative lack of expressiveness of NoSQL.

Nevertheless, introducing an O/C mapper was a major complexity buster. At present, we no longer access cloud storage directly, and the O/C mapper layer is a major bonus that abstracts away many subtleties such as retry policies, MD5, queue message overflow, ...
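To give an idea of the kind of subtlety such a layer can hide, here is a minimal sketch of a retry policy. It is not the Lokad.Cloud implementation: the RetryPolicy class, its parameters and the isTransient callback are hypothetical names chosen for illustration. The sketch retries only failures that the caller flags as transient and lets everything else fail immediately.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only (not the Lokad.Cloud API).
public static class RetryPolicy
{
    // Runs 'action' and retries it only when 'isTransient' flags the failure
    // as worth retrying. Any other exception propagates on the first attempt.
    public static T Execute<T>(Func<T> action, Func<Exception, bool> isTransient,
                               int maxAttempts = 5, int baseDelayMs = 100)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Exponential back-off: baseDelayMs, then 2x, 4x, ...
                Thread.Sleep(baseDelayMs * (1 << (attempt - 1)));
            }
        }
    }
}
```

A caller would wrap each storage call, for example RetryPolicy.Execute(() => ReadBlob(name), ex => ex is TimeoutException), where ReadBlob stands for whatever storage call the mapper performs.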

Performance is obtained mostly by design

NoSQL is not only simpler but more predictable as well when it comes to performance. However, it does not mean that a solution built on top of NoSQL automatically benefits from scalability and performance - quite the opposite actually. NoSQL comes with strict built-in limitations. For example, you can't expect more than 20 updates / second on a single blob, which is almost ridiculously low compared to its SQL counterpart.

Your design needs to embrace the strengths of NoSQL and be really cautious about not hitting bottlenecks. Good news, those are much easier to spot. Indeed, no later optimization will save your app from abysmal performance if the storage architecture doesn't match dominant I/O patterns of your app (see the Table Storage or the 100x cost factor).

Go for contract-based serializer

A serializer, i.e. a component that lets you turn an arbitrary object graph into a serialized byte stream, is extremely convenient for NoSQL. In particular, it provides a near-seamless way to let your object-oriented code interact with the storage. In many ways, the impedance mismatch objects vs NoSQL is much lower than it was for objects vs SQL.

Sometimes, though, serializers are nearly too powerful. In particular, it's easy to serialize objects that are part of the runtime, which can prove brittle over time. Indeed, upgrading the runtime might end up breaking your serialization patterns. That's why I advise going for simple yet explicit contract-based serialization schemes.
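As an illustration of what an explicit contract buys you, the sketch below uses the DataContractSerializer from System.Runtime.Serialization. The CustomerV1 and CustomerV2 classes, the namespace URI and the ContractSerializer helper are made up for this example. The point is that the payload is tied to the declared contract (name, namespace, members) rather than to a concrete .NET type, so data written with the first version can still be read after an optional member has been added.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Hypothetical entity, version 1 of the contract.
[DataContract(Name = "Customer", Namespace = "urn:example:entities")]
public class CustomerV1
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int Orders { get; set; }
}

// Version 2 adds an optional member; payloads written as V1 remain readable.
[DataContract(Name = "Customer", Namespace = "urn:example:entities")]
public class CustomerV2
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int Orders { get; set; }
    [DataMember(IsRequired = false)] public string Region { get; set; }
}

// Hypothetical helper wrapping the serializer for byte[] round-trips.
public static class ContractSerializer
{
    public static byte[] Serialize<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return stream.ToArray();
        }
    }

    public static T Deserialize<T>(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream(data))
        {
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

Reading a V1 payload as CustomerV2 leaves Region at its default value (null), which is usually what you want for a member that did not exist when the data was written.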

Although we did use a lot of XML in our early days on the cloud, we are now migrating away from XML in favor of JSON, Protocol Buffers or ad hoc high-density binary encodings, which provide a better readability vs flexibility vs performance tradeoff in our experience.

Entity isolation is the easiest path to versioning

One mistake of Lokad in our early NoSQL days was to apply too much of the DRY principle (Don't Repeat Yourself). Indeed, sharing the same class between entities is a sure way to end up with painful versioning issues later on. Touching entities once data has been serialized with them is always somewhat risky, because you can end up with data that you can't deserialize any more.

Since the schema evolution required for one entity doesn't necessarily match the evolution of the other entities, you ought to keep them apart upfront. Hence, I suggest giving up on DRY early - when it comes to entities - to ease later evolutions.

With proper design, aka CQRS, needs for SQL drop to near-zero

Over the last two decades, SQL has been king. As a consequence, nearly all apps embed SQL structural assumptions very deep into their architecture, making a relational database an irreplaceable component - by design.

Yet, we have found out that when the app deeply embraces concepts such as CQRS, event sourcing, domain driven design and task-based UI, then there is no more need for SQL databases.

This aspect was a surprise to us, as we initiated our cloud migration extensively leveraging SQL databases. Now, as we are gaining maturity at developing cloudy apps, we are gradually phasing those databases out: not because of performance or capabilities, simply because they aren't needed anymore.


Why perfectly reliable storage is not enough

Cloud computing now offers near perfectly reliable storage. Amazon S3 is announcing a 99.999999999% durability and the Windows Azure storage is in the same league.

Yet, perfectly reliable data storage does not prevent data loss - not by a long shot. It only prevents data loss caused by hardware failure, which nowadays is no longer the most frequent cause of data loss.

The primary danger threatening your data is just plain accidental deletion. Yes, it's possible to set up administrative rights and so on to minimize the surface area of potential trouble. But at the end of the road, someone wields sysadmin powers over the data, and this person is just a few clicks away from causing a lot of trouble.

A long established pattern to avoid this kind of trouble is automated data snapshots, taken on a daily or weekly basis, that can be restored when something goes utterly wrong. In the SQL world, snapshots are a given, as any serious RDBMS nowadays provides snapshotting as a basic feature.

Yet, in the NoSQL world, things aren't that bright, and at Lokad, we realized that such an obvious feature was still missing from the Windows Azure Storage.

Thus, today, we are releasing Lokad.Snapshot, an open source C#/.NET app targeting the Windows Azure Storage and running on Windows Azure itself. Kudos to Christoph Rüegg, the primary architect of this app. Lokad.Snapshot offers automated snapshots for tables and blobs. In addition, Lokad.Snapshot exposes a Really Simple Monitoring endpoint to be consumed by Lokad.Monitoring.

The Lokad.Snapshot codebase should still be considered as beta, although the app is already in production for our own internal needs at Lokad. If your Azure data isn't snapshotted yet, make sure to have a look at Lokad.Snapshot, it might be a life-saver sooner than expected.


Fat entities for Table Storage in Lokad.Cloud

After realizing the value of the Table Storage, giving a lot of thought to higher level abstractions, and stumbling upon a lot of gotchas, I have finally ended up with what I believe to be a decent abstraction for the Table Storage.

The purpose of this post is to outline the strategy adopted for this abstraction which is now part of Lokad.Cloud.

Table Storage (TS) comes with an ADO.NET provider that is part of the StorageClient library. Although I think that TS itself is a great addition to Windows Azure, frankly, I am disappointed by the quality of the table client library. It looks like a half-baked prototype, far from what I typically expect from a v1.0 library produced by Microsoft.

In many ways, the TS provider now featured by Lokad.Cloud is a pile of workarounds for glitches that are found in the underlying ADO.NET implementation; but it's also much more than that.

The primary benefit brought by TS is a much cheaper way of accessing fine grained data on the cloud, thanks to the Entity Group Transactions.

Although secondary indexes may bring extra bonus points in the future, cheap access to fine grained data is basically the only advantage of Table Storage compared to the Blob Storage at present.

I believe there are a couple of frequent misunderstandings about Table Storage. In particular, TS is nowhere near an alternative to SQL Azure. TS features nothing you would typically expect from a relational database.

TS does feature a query language (a pseudo equivalent of SQL) that supposedly supports querying entities against any properties. Unfortunately, for scalability purposes, TS should never be queried without specifying row keys and/or partition keys. Indeed, specifying arbitrary properties may give a false impression that it just works; yet to perform such queries, TS has no alternative but to scan the entire storage, which means that your queries will become intractable as soon as your storage grows.

Note: If your storage is not expected to grow, then don't even bother with Table Storage, and go for SQL Azure instead. There is no point in dealing with the quirks of a NoSQL store if you don't need to scale in the first place.

Back to the original point, TS features cheaper data access costs, and obviously this aspect had to be central in Lokad.Cloud - otherwise, it would not have been worthwhile to even bother with TS in the first place.

Fat entities

To some extent, Lokad.Cloud puts aside most of the property-oriented features of TS. Indeed, query aspects of properties don't scale anyway (except for the system ones).

Thus, the first idea was to go for fat entities. Here is the entity class shipped in Lokad.Cloud:

    public class CloudEntity<T>
    {
        public string RowKey { get; set; }
        public string PartitionKey { get; set; }
        public DateTime Timestamp { get; set; }
        public T Value { get; set; }
    }

Lokad.Cloud exposes the 3 system properties of TS entities. Then, the CloudEntity class is generic and exposes a single custom property of type T.

When an entity is pushed toward TS, the entity is serialized using the usual serialization pattern applied within Lokad.Cloud.

This entity is said to be fat because the maximal size for CloudEntity is 1MB (actually it's 960KB), which corresponds to the maximal size for an entity in TS in the first place.

Instead of going for a 64KB limitation per property, Lokad.Cloud offers an implementation that comes with a single 1MB limitation for the whole entity.

Note: Lokad.Cloud relies, under the hood, on a hacky implementation which involves spreading the serialized representation of the CloudEntity over 15 binary properties.
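The spreading itself can be sketched as follows. This is only an illustration of the idea, not the actual Lokad.Cloud internals: the FatEntityChunker class and its method names are assumptions. It cuts the serialized payload into at most 15 chunks of at most 64KB each (15 x 64KB = 960KB) and glues them back together on read.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: names and layout are assumptions, not the Lokad.Cloud internals.
public static class FatEntityChunker
{
    public const int MaxPropertySize = 64 * 1024; // 64KB per binary property
    public const int MaxProperties = 15;          // 15 x 64KB = 960KB in total

    // Cuts the serialized payload into chunks that each fit a binary property.
    public static List<byte[]> Split(byte[] payload)
    {
        if (payload == null) throw new ArgumentNullException(nameof(payload));
        if (payload.Length > MaxPropertySize * MaxProperties)
            throw new ArgumentException("Payload exceeds the 960KB limit.", nameof(payload));

        var chunks = new List<byte[]>();
        for (int offset = 0; offset < payload.Length; offset += MaxPropertySize)
        {
            int size = Math.Min(MaxPropertySize, payload.Length - offset);
            var chunk = new byte[size];
            Buffer.BlockCopy(payload, offset, chunk, 0, size);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Concatenates the chunks, in order, back into the original payload.
    public static byte[] Join(IReadOnlyList<byte[]> chunks)
    {
        int total = 0;
        foreach (var chunk in chunks) total += chunk.Length;

        var payload = new byte[total];
        int offset = 0;
        foreach (var chunk in chunks)
        {
            Buffer.BlockCopy(chunk, 0, payload, offset, chunk.Length);
            offset += chunk.Length;
        }
        return payload;
    }
}
```

Each chunk would then be stored in its own binary property of the underlying TS entity; how those properties are named is an implementation detail not covered here.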

At first glance, this design appears questionable as it introduces some serialization overhead instead of relying on the native TS property mechanism. Well, a raw naked entity costs about 1KB due to its Atom representation. In fact, the serialization overhead is negligible, even for small entities; and for complex entities, our serialized representation is usually more compact anyway due to GZIP compression.

The whole point of fat entities is to remove as much friction as possible for the end-developer. Instead of worrying about tight 64KB limits for each property, the developer only has to worry about a single and much higher limitation.

Furthermore, instead of trying to cram your logic into a dozen supported property types, Lokad.Cloud offers full strong-typing support through serialization.

Batching everywhere

Lokad.Cloud features a table provider that abstracts the Table Storage. A couple of key methods are illustrated below.

public interface ITableStorageProvider
{
  void Insert<T>(string tableName, IEnumerable<CloudEntity<T>> entities);
  void Delete<T>(string tableName, string partitionKeys, IEnumerable<string> rowKeys);
  IEnumerable<CloudEntity<T>> Get<T>(string tn, string pk, IEnumerable<string> rowKeys);
}

Those methods have no limitations concerning the number of entities. Lokad.Cloud takes care of building batches of 100 entities - or less, since the group transaction should also ensure that the total request weighs less than 4MB.
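The batching logic can be sketched roughly as below. The Batcher class and the sizeOf callback are hypothetical names, and the size of each entity is assumed to be an estimate supplied by the caller (the real request weight also includes the protocol overhead). Grouping by partition key, which Entity Group Transactions also require, is left out to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class Batcher
{
    // Groups items into batches of at most 'maxCount' items
    // whose cumulated size stays within 'maxBytes'.
    public static IEnumerable<List<T>> Slice<T>(IEnumerable<T> items, Func<T, long> sizeOf,
        int maxCount = 100, long maxBytes = 4L * 1024 * 1024)
    {
        var batch = new List<T>();
        long batchBytes = 0;

        foreach (var item in items)
        {
            long size = sizeOf(item);
            if (size > maxBytes)
                throw new ArgumentException("A single item exceeds the batch size limit.");

            // Close the current batch when adding the item would break a limit.
            if (batch.Count == maxCount || batchBytes + size > maxBytes)
            {
                yield return batch;
                batch = new List<T>();
                batchBytes = 0;
            }

            batch.Add(item);
            batchBytes += size;
        }

        if (batch.Count > 0) yield return batch;
    }
}
```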

Note that the 4MB restriction of the TS for transactions is a very reasonable limitation (I am not criticizing this aspect), but the client code of the cloud app is really not the right place to enforce this constraint as it significantly complicates the logic.

Then, the table provider also abstracts away all the subtle retry policies that are needed while interacting with TS. For example, when posting a 4MB transaction request, there is a non-zero probability of hitting an OperationTimedOut error. In such a situation, you don't want to just retry your transaction, because it's very likely to fail again. Indeed, the time-out happens when your upload speed does not match the 30s time-out of the TS. Hence, the transaction needs to be split into smaller batches, instead of being retried as such.
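A minimal sketch of this split-instead-of-retry behavior is given below. The SplitOnTimeout class and the send callback are hypothetical, and TimeoutException merely stands in for the OperationTimedOut error reported by the storage client. Note that once a batch is split, the halves are no longer covered by a single transaction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class SplitOnTimeout
{
    // Sends 'batch' through 'send'. On a time-out, the batch is cut in two
    // halves that are sent separately, instead of retrying the same request.
    public static void Send<T>(IReadOnlyList<T> batch, Action<IReadOnlyList<T>> send)
    {
        if (batch.Count == 0) return;

        try
        {
            send(batch);
        }
        catch (TimeoutException) when (batch.Count > 1)
        {
            int half = batch.Count / 2;
            Send<T>(Slice(batch, 0, half), send);
            Send<T>(Slice(batch, half, batch.Count - half), send);
        }
    }

    // Copies 'count' items of 'source', starting at index 'start'.
    private static List<T> Slice<T>(IReadOnlyList<T> source, int start, int count)
    {
        var result = new List<T>(count);
        for (int i = start; i < start + count; i++) result.Add(source[i]);
        return result;
    }
}
```

A batch reduced to a single entity that still times out is not split further; the exception then propagates to the caller.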

Lokad.Cloud goes through those details so that you don't have to.


Table Storage gotcha in Azure

Table Storage is a powerful component of the Windows Azure Storage. Yet, I feel that there is quite a significant friction working directly against the Table Storage, and it really calls for more high level patterns.

Recently, I have been toying more with the v1.0 of the Azure tools released in November'09, and I would like to share a couple of gotchas with the community hoping it will save you a couple of hours.

Gotcha 1: no REST level .NET library is provided

Contrary to other storage services, there is no .NET library provided as a wrapper around the raw REST specification of the Table Storage. Hence, you have no choice but to go for ADO.NET.

This situation is rather frustrating because ADO.NET does not really reflect the real power of the Table Storage. Intrinsically, there is nothing fundamentally wrong with ADO.NET; it just suffers from the law of leaky abstractions, and yes, the table client is leaking.

Gotcha 2: constraints on Table names are specific

I would have expected all the storage units (that is to say queues, containers and tables) in Windows Azure to come with similar naming constraints. Unfortunately it's not the case, as table names do not support hyphens for example.
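Since the table client performs little checking on its own, a small client-side guard can save a round-trip. The sketch below encodes the table naming rules as I understand them from the Table service documentation (letters and digits only, no leading digit, 3 to 63 characters); the TableNames class is a hypothetical helper, and the rules should be double-checked against the current documentation.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class TableNames
{
    // Letters and digits only, first character a letter, 3 to 63 characters in total.
    private static readonly Regex Valid = new Regex(@"^[A-Za-z][A-Za-z0-9]{2,62}\z");

    public static bool IsValid(string name)
    {
        return name != null && Valid.IsMatch(name);
    }
}
```

With this check, a name such as "my-table", which would be perfectly fine for a queue or a container, gets rejected before any call is made to the storage.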

Gotcha 3: table client performs no client-side validation

If your entity has properties that do not match the set of supported property types, then those properties get silently ignored. I got burned by an int[] property that I was naively expecting to be supported. Note that I am perfectly fine with the limitations of the Table Storage, yet, I would have expected the table client to throw an exception instead of silently ignoring the problem.

Similarly, since the table client performs no validation, DataServiceContext.SaveChangesWithRetries behaves very poorly with the default retry policy, as a call failing due to, say, an entity that already exists in the storage is going to be attempted again and again, as if it was a network failure. In this sort of situation, you really want to fail fast, not to spend 180s re-attempting the operation.

Gotcha 4: no batching by default

By default, DataServiceContext.SaveChanges does not save entities in batch, but performs 1 storage call for each entity. Obviously, this is a very inefficient approach if you have many entities. Hence, you should really make sure that SaveChanges is called with the option SaveChangesOptions.Batch.

Gotcha 5: paging takes a lot of plumbing

Contrary to the Blob Storage library, which abstracts away most nitty-gritty details such as the management of continuation tokens, the table client does not. You are forced into a lot of plumbing to perform something as simple as paging through entities.

Then, back to the method SaveChanges, if you need to save more than 100 entities at once, you will have to deal with the capacity limitations of the Table Storage yourself. Simply put, you will have to split your calls into smaller ones: the table client doesn't do that for you.

Gotcha 6: random access to many entities at once takes even more plumbing

As outlined before, the primary benefit of the Table Storage is to provide a cloud storage much more suited than the Blob Storage for fine-grained data access (up to 100x cheaper actually). Hence, you really want to grab entities in batches of 100 whenever possible.

It turns out that retrieving 100 entities following a random access pattern (within the same partition obviously) is really far from being straightforward. You can check my solution posted on the Azure forums.

Gotcha 7: table client supports limited tweaks through events

Although there is no REST level API available in the StorageClient, the ADO.NET table client does support limited customization through events: ReadingEntity and WritingEntity.

It took me a while to realize that such customization was possible in the first place as those events feel like outliers in the whole StorageClient design. It's about the only part where events are used, and leveraging side-effects on events is usually considered as really brittle .NET design.

Stay tuned for an O/C mapper to be included in Lokad.Cloud for Table Storage. I am still figuring out how to deal with overflowing entities.


Serialization in the cloud: SharedContract vs. SharedType

Every time developers decide not to go for relational databases in cloud apps, they end up with custom storage formats. In my (limited) experience, that's one of the inescapable laws of cloud computing.

Hence, serialization plays a very important role in cloud apps either for persistence or for transient computations where input data need to be distributed among several computing nodes.

In the case of Lokad.Cloud, our O/C mapper (object to cloud), the blob storage abstraction relies on seamless serialization. Looking for a serialization solution, we did initially go the quick & dirty way through the BinaryFormatter that has been available since .NET 1.1, that is to say forever in the .NET world.

The binary formatter is easy to set up, but pain lies ahead:

  1. No support for versioning, i.e. what will happen to your data if your code happens to change?
  2. Since it embeds all .NET type info, it's not really compact, even for small data structures (if you just want to serialize a 1M double array, it's OK though, but that's not the typical situation).
  3. It offers little hope for interoperability of any kind. Even interactions with other distinct .NET Framework versions can be subject to problems.

Robust serialization approach is needed

With the advent of WCF (Windows Communication Foundation), Microsoft teams came up with a much improved vision for serialization. In particular, they introduced two distinct serialization behaviors:

  1. Shared contract, through the DataContractSerializer.
  2. Shared type, through the NetDataContractSerializer.

Both serializers produce XML streams but there is a major design gap between the two.

Shared contract assumes that the contract (the schema in the XML terminology) will be available at deserialization time. In essence, it's a static spec while the implementation is subject to evolution. The benefits are that versioning, and even performance to some extent, can be expected to be great as the schema is both static and closed.

Shared type, on the other hand, assumes that the concrete .NET implementation will be available at deserialization time. The main benefit of the shared type approach is its expressiveness, as basically any .NET object graph can be serialized (objects just need to be marked as [Serializable]). Yet, as the price to pay for this expressiveness, versioning does suffer.

Serialization and O/C mapper

Our O/C mapper is designed not only to enable persistence (and performance), but also to ease the setup of transient computations to be run over the cloud.

As far as persistence is concerned, you really want to go for a SharedContract approach, otherwise data migration from old .NET types to new .NET types is going to heavily mess up your design through the massive violation of the DRY principle (you would typically need to have old and new types side by side).

Then, for transient computations, SharedType is a much friendlier approach. Indeed, why should you care about data schema and versioning, if you can just discard old data, and re-generate them as part of your migration? That's going to be a lot easier, but outdated data are considered as expendable here.

As a final concern for O/C mapper, it should be noted that CPU is really cheap compared to storage. Hence, you don't want to store raw XML in the cloud, but rather GZipped XML (which comes as a tradeoff CPU vs Storage in the cloud pricing).
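The compression step is only a few lines with the GZipStream class from System.IO.Compression. The sketch below uses a hypothetical GzipXml helper and assumes that the XML is handled as a UTF-8 string; a real mapper would rather chain the serializer and the GZipStream directly, without the intermediate string.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, for illustration only.
public static class GzipXml
{
    // Encodes the XML as UTF-8 and compresses it with GZip.
    public static byte[] Compress(string xml)
    {
        byte[] raw = Encoding.UTF8.GetBytes(xml);
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            } // disposing the GZipStream flushes the remaining compressed data
            return output.ToArray();
        }
    }

    // Decompresses a GZip payload back into the original XML string.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

XML is verbose and repetitive, so it typically compresses very well, which is what makes the CPU vs storage tradeoff attractive here.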

The case of Lokad.Cloud

For Lokad.Cloud, we will provide a GZipped XML serializer based on a combination of both the DataContractSerializer and the NetDataContractSerializer to get the best of both worlds. DataContractSerializer will be used by default, but it will be possible to switch to NetDataContractSerializer through a simple attribute (the idea has been borrowed from Aaron Skonnard).