Author

I am Joannes Vermorel, founder of Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software engineering and data mining for almost two decades.



Tuesday, Jul 27, 2010

Wish list for Relenta CRM

At Lokad, we have been using the Relenta CRM for nearly two years. It's an excellent lean CRM that comes with a core focus on emails, which happen to represent about 90% of our interactions with clients and prospects. If you happen to be an ISV, Relenta is worth a closer look.

Still, I have been missing a few key features in Relenta for a long time. Hence, I am taking the time here to post my wish list for Relenta.

1. Accounts

Relenta only deals with Contacts; yet, when prospecting larger companies, many contacts are typically involved. It would be much nicer if it were possible to create 1-to-many relationships between Accounts and Contacts. In particular, this would let the Relenta user browse at a glance all the latest interactions related to a particular account, instead of jumping from one contact to the next.
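
To make the idea concrete, here is a minimal sketch of what such a data model could look like; the class and property names are mine, not Relenta's actual schema:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data model - not Relenta's actual schema.
public class Account
{
    public string Name { get; set; }            // company name
    public List<Contact> Contacts { get; set; } // 1-to-many relationship

    // Latest interactions across all contacts of the account,
    // so the whole history can be browsed at a glance.
    public IEnumerable<Activity> RecentActivities()
    {
        return Contacts
            .SelectMany(c => c.Activities)
            .OrderByDescending(a => a.Date);
    }
}

public class Contact
{
    public string Email { get; set; }
    public List<Activity> Activities { get; set; }
}

public class Activity
{
    public DateTime Date { get; set; }
    public string Description { get; set; }
}
```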

2. Recent updates

One of the features that I like the most in wikis is their ability to display recent changes. Through recent changes, you gain immediate insight into what other people are doing, without having to actually ask them.

Presently, there is no way to easily figure out who has been doing what in Relenta. It would be much nicer if a stream of recent updates were available for browsing. In particular, the display could be made more or less compact by aggregating updates per contact (or per account). The stream could even be made available as RSS.

3. Activity capture API

The Lead Capture API of Relenta is a killer feature due to its simplicity. For an ISV, it's a super simple way to collect all the trial registrations flowing through our online apps, with extremely limited integration grunt-work.

Yet, although it's very simple to automate Contact creation in Relenta, it's not possible to automate the insertion of Activities on that very same contact later on. This feature would be extremely handy to automatically report payments, or any kind of noticeable activity (in the case of Lokad, that would be large forecast retrievals, for example).
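
Such an Activity capture API could be just as simple as the Lead Capture one. Here is a sketch of what the call could look like; the endpoint URL and field names are pure guesses on my part, since no such API exists today:

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

class ActivityCapture
{
    static void Main()
    {
        // Hypothetical endpoint and field names - Relenta offers no such API today.
        var fields = new NameValueCollection
        {
            { "api_key", "YOUR_API_KEY" },
            { "contact_email", "john.doe@example.com" },
            { "activity_note", "Payment received for invoice #1234." }
        };

        using (var client = new WebClient())
        {
            // A single POST would attach a new Activity to the existing Contact.
            client.UploadValues("https://app.relenta.com/api/activities", fields);
        }
    }
}
```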

4. Refined tagging

Tagging is one of the best ideas of the Web 2.0 wave. It's a great way to organize complex yet loosely structured content.

Relenta already provides a minimal tagging system, yet there is no tag auto-completion (a killer feature) and it's not possible to search against multiple tags. Investing a bit more work in tags would go a long way toward making the most of them.
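
Neither feature is much work, conceptually: auto-completion is a prefix match, and multi-tag search is a set intersection. A minimal sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TagFeatures
{
    // Tag auto-completion: case-insensitive prefix match over the known tags.
    public static IEnumerable<string> Complete(IEnumerable<string> allTags, string prefix)
    {
        return allTags
            .Where(t => t.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .OrderBy(t => t);
    }

    // Multi-tag search: keep the contacts carrying ALL the requested tags.
    public static IEnumerable<T> WithTags<T>(
        IEnumerable<T> contacts, Func<T, ISet<string>> tagsOf, params string[] tags)
    {
        return contacts.Where(c => tags.All(t => tagsOf(c).Contains(t)));
    }
}
```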

5. iCalendar support

iCalendar is a very nice and popular format for sending meeting requests. Presently, Relenta does not support .ics attachments, and meeting requests appear completely garbled. It would be really nice if Relenta supported iCalendar, with the possibility to acknowledge meeting requests.
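
Even minimal support would go a long way: extracting a handful of properties from the .ics payload is enough to display a readable meeting request. The sketch below only scratches the surface; a real parser must also handle line folding, property parameters and time zones (RFC 5545):

```csharp
using System;
using System.Collections.Generic;

static class IcsPeek
{
    // Extracts a few key properties from an .ics meeting request.
    // Real iCalendar parsing (RFC 5545) also requires handling line
    // folding, property parameters and time zones; this is the gist only.
    public static Dictionary<string, string> Parse(string icsText)
    {
        var props = new Dictionary<string, string>();
        var lines = icsText.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var line in lines)
        {
            var split = line.IndexOf(':');
            if (split <= 0) continue;
            var key = line.Substring(0, split).Split(';')[0]; // drop parameters
            if (key == "METHOD" || key == "SUMMARY" || key == "DTSTART" || key == "ORGANIZER")
                props[key] = line.Substring(split + 1);
        }
        return props; // e.g. METHOD=REQUEST, DTSTART=20100727T140000Z
    }
}
```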

Monday, Feb 8, 2010

Big Wish List for Windows Azure

At Lokad, we have been working with Windows Azure for more than a year now. Although Microsoft is a late entrant in the cloud computing arena, I am so far extremely satisfied with this choice, as Microsoft is definitely moving in the right direction.

Here is my Big Wish List for Windows Azure. These are the features that would turn Azure into a killer product, deserving a lion-sized share of the cloud computing marketplace.

My wishes are ordered by component:

  • Windows Azure
  • SQL Azure
  • Table Storage
  • Queue Storage
  • Blob Storage
  • Windows Azure Console
  • New services

Windows Azure

Top priority:

  • Faster CPU burst: the total time between the initial VM request (through the Azure Console or the Management API) and the start of the client code execution is long, typically 20 min, and in my (limited) experience above 1h for any larger number of VMs (say, 10+ VMs). Obviously, we are nowhere near real-time elastic scalability. In comparison, SQL Azure needs no more than a few seconds to instantiate a new DB. I would really like to see such excellent behavior on the Windows Azure side.
  • Smaller VMs: for now, the smallest VMs come with 2GB of RAM and cost about $90/month, which brings the cost of a modest web app to about $200/month (considering one web role and one worker role). Competitors (such as Rackspace) are already offering much smaller VMs, down to 256MB per instance, priced about 10x cheaper. I would really like to see that on Azure as well. Otherwise, scaled-down apps are just not possible.
  • Per-minute charge: for now, Azure charges by the hour, which means that any hour you start consuming is charged in full. Charging by the minute would be a great incentive to improve performance, as developers could really fine-tune their cloud usage to meet demand without wasting resources. Obviously, such a feature makes little sense as long as VMs take 1h to get started.
  • Per-VM termination control: currently, it is not possible to tell the Azure fabric which VM should be terminated when scaling down, which is rather annoying. For example, long-running computations can be interrupted at any moment (and will have to be performed again) while idle VMs are kept alive.
  • Bandwidth and storage quotas: most apps are never supposed to require truckloads of bandwidth or storage. If they do, it usually means that something is going really wrong. Think of a loop endlessly polling some data from a remote data source. With pay-as-you-go, a single VM can easily generate 10x its own monthly cost through faulty behavior. To prevent such situations, it would be much nicer to be able to assign quotas per role.

Nice to have:

  • Instance count management through RoleEnvironment: the .NET class RoleEnvironment provides basic access to the properties of the current Azure instance. It would be really nice to also provide native .NET access to instance termination (as outlined above) and to instance allocation requests, considering that each role should be handling its own scalability (see the sketch after this list).
  • Geo-relocation of services: currently, the geolocation of a service is set at setup time, and cannot be changed afterward. Yet, the default location is "Asia" (the first item in the list), which makes the process quite error-prone (any manual process should be considered error-prone anyway). It would be nicer if it were possible to relocate a service, possibly with a limited downtime, as it's only a corrective measure, not a production imperative.
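
To make the RoleEnvironment wish concrete, here is what such an API could look like. RoleEnvironment and CurrentRoleInstance are real; the termination and allocation methods are pure wishful thinking, kept as comments:

```csharp
using Microsoft.WindowsAzure.ServiceRuntime; // real namespace from the Azure SDK

static class ElasticRole
{
    // Wishful thinking: none of the commented-out methods exist in the SDK today.
    static void ScaleOnQueuePressure(int pendingWorkItems)
    {
        var self = RoleEnvironment.CurrentRoleInstance; // real property

        if (pendingWorkItems > 1000)
        {
            // Hypothetical: ask the fabric for one more instance of this role.
            // RoleEnvironment.RequestAdditionalInstance(self.Role);
        }
        else if (pendingWorkItems == 0)
        {
            // Hypothetical: gracefully terminate THIS very instance once idle,
            // instead of letting the fabric pick a victim at random.
            // RoleEnvironment.RequestInstanceTermination(self);
        }
    }
}
```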

SQL Azure

Top priority:

  • DB snapshot & restore toward the Blob Storage: even if the cloud is perfectly reliable, cloud app developers are not. The data of a cloud app (like any other app, btw) can be corrupted by faulty app behavior. Hence, frequent snapshots should be taken to make sure that data can be restored after a critical failure. The ideal solution for SQL Azure would be to dump DB instances directly into the Blob Storage; since DB instances are kept small (10GB max), SQL Azure would be really well suited for this sort of behavior (see the sketch after this list).
  • Smaller DBs (starting at 100MB for $1/month): 100MB is already a lot of data. SQL Azure is a very powerful tool to support scaled-down approaches, possibly isolating the data of every single customer (in the case of a multi-tenant app) into its own DB. At $10/month, the overhead is typically too large to go for such strong isolation; but at $1/month, it would become the de facto pattern, leading to smaller and more maintainable DB instances (as opposed to desperately trying to scale up monolithic SQL instances).
  • Size auto-migration: currently, a 1GB DB cannot be upgraded into a 10GB instance. The data has to be manually copied first, and the original DB deleted afterward (and vice-versa the other way around). It would be much nicer if SQL Azure took care of auto-scaling the size of DB instances up or down (within the 10GB limit, obviously).
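
While waiting for a native snapshot feature, the upload part of the story can already be sketched with the StorageClient library shipped with the Azure SDK; producing the DB dump itself (BCP, custom serialization, ...) is precisely the grunt-work that SQL Azure would ideally handle natively:

```csharp
using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient; // Azure SDK 1.x client library

class DbSnapshot
{
    // Pushes an existing DB dump file into the Blob Storage, one blob per day,
    // so that snapshots stay browsable and restorable after a critical failure.
    static void PushDumpToBlob(string storageConnectionString, string dumpPath)
    {
        var account = CloudStorageAccount.Parse(storageConnectionString);
        var container = account.CreateCloudBlobClient()
                               .GetContainerReference("db-snapshots");
        container.CreateIfNotExist();

        var blobName = "mydb/" + DateTime.UtcNow.ToString("yyyy-MM-dd");
        container.GetBlobReference(blobName).UploadFile(dumpPath);
    }
}
```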

Nice to have:

  • Geo-relocation of services: same as above. Downtime is OK too; it's just a corrective measure.

Table Storage

Top priority:

  • REST-level .NET client library: at present, Table Storage can only be accessed through an ADO.NET implementation that proves to be rather troublesome. ADO.NET gets in the way if you really want to get the most out of Table Storage. Instead, it would be much nicer if a thin .NET wrapper around the REST API were provided as a low-level access.
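
To illustrate how thin such a wrapper could be, here is a raw entity query against the Table Storage REST API with SharedKeyLite authentication. This is a sketch based on my reading of the REST specification; double-check the header and signature details against the official docs before relying on it:

```csharp
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;
using System.Text;

class RawTableQuery
{
    // Minimal REST GET against the Table Storage, bypassing ADO.NET entirely.
    static string Query(string account, string base64Key, string tableQuery)
    {
        // e.g. tableQuery = "Customers()?$top=10"
        var date = DateTime.UtcNow.ToString("R"); // RFC 1123 date
        var resource = "/" + account + "/" + tableQuery.Split('?')[0];
        var stringToSign = date + "\n" + resource; // SharedKeyLite for tables

        string signature;
        using (var hmac = new HMACSHA256(Convert.FromBase64String(base64Key)))
            signature = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));

        var request = (HttpWebRequest)WebRequest.Create(
            "https://" + account + ".table.core.windows.net/" + tableQuery);
        request.Headers["x-ms-date"] = date;
        request.Headers["Authorization"] = "SharedKeyLite " + account + ":" + signature;
        request.Accept = "application/atom+xml";

        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd(); // ATOM feed of entities
    }
}
```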

Nice to have:

  • Secondary indexes: this one has already been announced; but I am re-posting it here as it would be a really nice feature nonetheless. In particular, this would be very handy to reduce the number of I/O operations in many situations.

Queue Storage

Nice to have:

  • Push multiple messages at once: the Queue offers the possibility to dequeue multiple messages at once, but messages can only be queued one by one. Symmetrizing the queue behavior by offering batch writes as well would be really nice.
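
The asymmetry is plainly visible in the StorageClient API: the batch read is a single call, while the write side costs one round-trip per message (sketch below, assuming an already-instantiated CloudQueue):

```csharp
using System.Collections.Generic;
using Microsoft.WindowsAzure.StorageClient; // Azure SDK 1.x client library

static class QueueBatching
{
    static void Illustrate(CloudQueue queue, IEnumerable<string> payloads)
    {
        // Batch read exists: up to 32 messages retrieved in a single call.
        foreach (var message in queue.GetMessages(32))
            queue.DeleteMessage(message);

        // Batch write does not: one network round-trip per message.
        foreach (var payload in payloads)
            queue.AddMessage(new CloudQueueMessage(payload));
    }
}
```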

Blob Storage

Nice to have:

  • Reverse Blob enumeration: prefixed Blob enumeration is probably one of the most powerful features of the Blob Storage. Yet, items can only be enumerated in increasing order of their respective blob names, while in many situations the "canonical" order is exactly the opposite of what you want (e.g., retrieving blob names prefixed by dates, starting with the most recent ones). It would be really nice if it were possible to enumerate the other way around too.
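
While waiting for native support, the usual workaround is to invert the key at write time, so that the ascending enumeration (the only one supported) returns the most recent items first. A minimal sketch:

```csharp
using System;

static class ReverseChronologicalNames
{
    // Produces blob names whose ascending lexicographic order matches
    // the reverse chronological order of the timestamps.
    static string InvertedDateName(DateTime timestampUtc, string suffix)
    {
        var inverted = DateTime.MaxValue.Ticks - timestampUtc.Ticks;
        // Zero-padding keeps the lexicographic order aligned with the numeric one.
        return "events/" + inverted.ToString("D19") + "/" + suffix;
    }
}
```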

Windows Azure Console

The Windows Azure Console is probably the weakest component of Windows Azure. In many ways, it's a real shame to see such a good piece of technology dragged down so much by the abysmal usability of its administrative web client.

Top priority:

  • 100x speed-up: when I say 100x, I really mean it; and even with a 100x factor, it will still be rather slow by most web standards, as a refresh latency of 20 min is not uncommon after updating the configuration of a role.
  • Basic multi-user admin features: for now, the console is a mono-user app, which is quite a pain in any enterprise environment (what happens when Joe, the sysadmin, goes on vacation?). It would be much nicer if several Live IDs could be granted access rights to an Azure project.
  • Billing is a mess, really: besides the fact that about 10 counter-intuitive clicks are required to navigate from the console to your consumption records, the consumption reporting is still of substandard quality. Billing cries for massive look-and-feel improvements.

Nice to have:

  • Project rename: once named, projects cannot be renamed. This situation is rather annoying, as there are many situations that call for a naming correction. At present, if you are not satisfied with your project name, you've got no choice but to open a new Azure account and start all over again.
  • Better handling of large projects: the design of the console is OK if you happen to have a few services to manage, but beyond 10 services, the design gets messy. Clearly, the console has not been designed to handle dozens of services. A compact tabular display of the service list would be way nicer.
  • Aggregated dashboard: Azure services are spread across many panels. With the introduction of new services (Dallas, ...), getting the big picture of your cloud resources is becoming more and more complex. Hence, it would be really nice to have a dashboard aggregating all the resources used by your services.
  • OpenID access: Live ID is nice, but OpenID is nice too. OpenID is gaining momentum, and it would be really nice to see Microsoft supporting OpenID here. Note that there is no issue with supporting Live ID and OpenID side by side.

New services

Finally, there are a couple of new services that I would be thrilled to see featured by Windows Azure:

  • .NET Role Profiler: in a cloud environment, optimizing has a very tangible ROI, as each performance gain is reflected in a lower consumption bill. Hence, a .NET profiler would be a killer service for cloud apps based on .NET. Even better, low-overhead sampling profilers could be used to collect data even for systems in production.
  • MapReduce: already featured by Amazon WS, it would be massively useful for those of us (like Lokad) who perform intensive computations on the cloud. Microsoft has already been moving in this direction with DryadLinq, but I am eager to see how Azure will be impacted.

 

This is a rather long list already. Did I forget anything? Just let me know.

Tuesday, Jul 28, 2009

Thoughts about the Windows Azure pricing

Microsoft has recently unveiled its pricing for Windows Azure. In short, Microsoft aligned exactly with the pricing offered by Amazon. CPU costs $0.12/h, meaning that a single instance running 24/7 for a month costs $86.4, which is fairly expensive compared to classical hosting providers, where you can get more for basically half the price.
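
The arithmetic behind those figures, for reference:

```csharp
using System;

class AzureCostMath
{
    static void Main()
    {
        const double ratePerHour = 0.12;     // Azure (and Amazon) CPU price
        var monthly = ratePerHour * 24 * 30; // = $86.4 per instance-month
        var yearly = monthly * 12;           // ~ $1037 per instance-year
        Console.WriteLine("{0} / month, {1} / year", monthly, yearly);
    }
}
```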

But well, this situation was expected, as Microsoft probably does not want to start a price war with its business partners still selling dedicated Windows Server hosting. The current Azure pricing is sufficiently high to deter most companies, except the ones that happen to have peaky needs.

To me, the Azure pricing is fine except in 3 areas:

  • Each Azure WebRole costs at least $86.4/month, no matter how little web traffic you have (reminder: with Azure, you need a distinct WebRole for every distinct webapp). This situation is caused by the architecture of Windows Azure, where a VM gets dedicated to every WebRole. If we compare to Google App Engine (GAE), the situation does not look too good for Azure; indeed, with GAE, hosting a low-traffic webapp is virtually free. Free vs. $1000/year is likely to make a difference for most small/medium businesses, especially if you end up with a dozen webapps to cover all your needs.

  • Cloud Storage operations are expensive: the storage itself is rather cheap at $0.15/GB/month, but the cost of $0.01 per 10K operations might be a killer for cloud apps relying intensively on small storage operations. Yes, one can argue that this price is no cheaper with AWS, but this is not entirely true, as AWS provides other services such as block storage that come with a 10x lower price per operation (EBS could be used to lower the pressure on blob storage whenever possible).

  • Raw CPU at $0.12/h is expensive, and Azure offers no solution to lower this price, whereas AWS offers CPU at $0.015/h through their MapReduce service.

Obviously, those pricing weaknesses closely reflect cloud technologies that Azure is missing (at the moment). The MapReduce issue will be fixed when Microsoft ports DryadLinq to Azure. Block storage and shared low-cost web hosting might also be on their way (although I have little info on that matter). As a side note, the Azure Cache Provider might be a killer tool to reduce the pressure on cloud storage (but its pricing is unknown yet).

As a final note, it's interesting to see how much cloud computing pricing depends on the quality of the software used to run the cloud. Better software typically leads to computing hardware being delivered at much lower costs, almost 10x lower in many situations.

Tuesday, Nov 11, 2008

Cloud computing: a personal review about Azure, Amazon, Google Engine, VMWare and the others

My own personal definition of cloud computing: a hosting provider that delivers automated, near real-time, arbitrarily large allocation of computing resources such as CPU, memory, storage and bandwidth.

For companies such as Lokad, I believe that cloud computing will shape many aspects of the software business in the next decade.

Obviously, all cloud computing providers put limits on the amount of resources that one can get allocated, but I want to emphasize that, for the end-user, the cloud is expected to be so large that the limitation is the cost of resource allocation, as opposed to hitting technical obstacles such as the need to perform a two-week upgrade from one hosting solution to another.

Big players arena


Considering that the entry ticket for state-of-the-art data centers is now reaching $500M, cloud computing is an arena for big players. I don't expect small players to stay competitive for long in this game.

The current players are:

  • Amazon Web Services, probably the first production-ready cloud offer on the market.

  • Google App Engine, a Python cloudy framework by Google.

  • Windows Azure just unveiled by Microsoft a few weeks ago.

  • VMWare, the virtualization specialist, which unveiled its Cloud vService last September.

  • Salesforce and their Platform as a Service offering. Definitely cloud computing, but mostly restricted to B2B apps oriented toward CRM.

Then, I expect a couple of companies to enter the cloud computing market within the next three years (just wild guesses; I have no insider info on those companies).

  • Sun might go for a Java-oriented cloud computing framework, much like Windows Azure, leveraging their VirtualBox product.

  • Yahoo will probably release something based on Hadoop because they have publicly expressed a lot of interest in this area.

There will most probably be a myriad of small players providing tools and utilities built on top of those clouds, but I do not really expect small or medium companies to succeed in gaining momentum with their own grids.

In particular, it's unclear to me whether open source is going to play any significant role - at the infrastructure level - in the future of cloud computing, although open source will be present at the application level.

Indeed, open source is virtually nonexistent in areas such as web search engines (yes, I am aware of Lucene, but it's very far from being significant in this market). I am expecting a similar situation for the cloud market.

Benefits


Some people worry about privacy, security and reliability issues when opting for a cloud provider. My personal opinion is that those points are probably among the strongest benefits of the cloud.

Indeed, only those who have never managed loads of applications may believe that homemade IT infrastructure management efficiently addresses privacy, security and reliability concerns. In my experience, achieving a good level of security and reliability is hard for IT-oriented medium-sized companies, and much harder for large non-IT-oriented companies.

Also, I am pretty sure that those concerns are among the top priorities of the big cloud players. A no-name small hosting company can afford a data leak, but for a Google-sized company, the damage caused by such an accident would be immense. As a result, the most rational option consists in investing massive amounts of effort to prevent those accidents.

Basically, I think that clouds can significantly reduce the need for system administrators and infrastructure managers by providing a secure and reliable environment where getting security patches and fighting botnets is part of the service.

Drawback: re-design for the cloud


The largest drawback that I can see is the amount of work needed to migrate applications toward clouds. Indeed, cloud hosting is a very different beast compared to regular hosting.

  • Scalability only applies with proper application design - which varies from one cloud to another.

  • Data access latency is large: you need data caching everywhere.

  • ACID properties of your storage are loose at best.

Thus, I expect that the strongest hindering factor for cloud adoption will be the technical challenges caused by the cloud itself.

If you don't need scalability, hosting on expensive-but-reliable dedicated servers is still the fastest way to bring a software product to market. Then, if you happen to have massive computing needs, you probably have massive sales as well; and, well, sales fixes everything.

Computing resources being commoditized? Not so sure.


With all those emerging clouds, will we see a commoditization of the computing resources? I don't expect it.

Actually, cloud frameworks are very diverse, and switching from one cloud to another is going to involve massive changes at best, and a complete rewrite at worst. Let's see:

  • Amazon provides on-demand instantiation of near-physical servers running either Linux or Windows. The code is natively executed on top of a custom OS. Scalability is achieved through programmatic computing node instantiation.

  • Google App Engine provides a Python-only (*) web app framework. Each web request gets treated independently, and scalability is a given. The code is executed in a sandboxed virtual environment. The OS is mostly irrelevant.

  • Windows Azure offers a .NET execution environment associated with IIS. The code is executed in a sandboxed virtual environment on top of a virtualized OS. Scalability is achieved by having worker instances "sleeping" and waiting for a surge of incoming work.

  • VMWare takes any OS image and brings it to the cloud. Scalability is limited, but other benefits apply.

  • SalesForce provides a specific framework oriented toward enterprise applications.

(*) I guess that Google will probably release a reduced Java framework at some point, much like Android.

Thus, for the next couple of years, choosing a cloud hosting provider will most probably mean a significant vendor lock-in. One more reason not to go for small players.

Cloud computing will remain an emerging market for at least 5 years. YAWG - Yet Another Wild Guess: 18 months to get the cloud offers out of their beta status, 18 months to train hordes of developers on those new frameworks, 18 months to write or migrate the apps. During this time, I expect aggressive pricing from all actors, and little or no abuse of the "lock-in" power.

Then, when the market matures, I guess that third-party providers will offer tools to ease, if not automate, the migration from one cloud to another, much like the Java-to-.NET conversion tools.

Sunday, Oct 21, 2007

Velib from a software engineer's viewpoint

The Velib is becoming insanely popular in Paris because of the strikes (strikes in public transportation are a national sport in France, a bit like baseball is in the US). Thus, I took my first Velib ride yesterday, a few months after the initial launch.


Velib is both the name of a public bike rental system in Paris and the name of the bike itself. There are now 10,000 Velib's in Paris (the figure will increase to 20,000 at the beginning of 2008). The key idea is that you take a Velib from any Velib station and put it back at any other Velib station (it does not have to be the same station).

Velib's are a bit bulky (17kg), but overall they are quite nicely designed.

In my opinion, there are two main weaknesses in the current Velib system:

  • the Velib traffic regulation

  • the software interface of the Velib renting system

The idea of taking/leaving a Velib wherever you want is quite nice. Yet, in practice, there are very significant daily migrations of Velib's within Paris. Basically, in the morning, all the Velib's are taken (by the people) toward the inner center of Paris. Then, at the end of the day, there is the opposite flux, and the Velib's massively migrate back to the outer parts of Paris.

For the average user, strong migrations mean that you have a hard time actually finding a free Velib in your starting area, but also a hard time finding a free slot to park your Velib at your arrival. To overcome this situation, the deal with JCDecaux (the company in charge of the Velib system) includes some Velib traffic regulation to organize counter-migrations of the Velib's (through special trucks). Yet, I suspect that the initial deal massively under-estimated the strength of the migrations in Paris.

At this point, I can hope for two things: Paris re-negotiates another agreement with JCDecaux to increase the Velib traffic regulation; and/or JCDecaux upgrades its traffic regulation software to anticipate the migrations and respond to them more proactively.

Also, the software interface used to rent your Velib is a pain. The first mistake comes from the fact that there is not one but two display devices: a big color digital screen that displays the main interface and, below it, a small alphanumeric display that shows some information related to the credit card processing. Together, those two display devices are a real pain, because you are never sure where to look while waiting for the next instruction.

Then, the total number of keys that have to be pressed iteratively on the numeric pad for a rent-for-the-day operation is completely insane. I quickly lost track, but it must be around 50 key presses, which takes 10 min no matter how familiar you are with the system (yesterday, I was assisted by somebody who dictated the instructions to me in order to speed up the process).

Among the things that are plain nuts with the current UI, I think the password management is a design truly born in Hell. You have to choose a password, then confirm your password, then re-enter your password yet again. Next, you are asked to enter your credit card PIN; don't mix the two of them, or you're going to block your credit card (and get sent back to the starting point). Actually, the whole password thing is completely useless. The credit card should be the default way to perform authentication for those who do not have an RFID pass (the RFID pass comes with the 1-year subscription). That would save half of the operations.

I am not sure the Velib UI would have survived any hallway usability testing; yet, during the strikes, you've got the perfect excuse for being late anyway. You can perfectly afford a 20 min struggle to rent your Velib.