Author

I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades,

Entries in cloudcomputing (12)

Saturday
Jul 10, 2010

Top 10 cloud computing predictions

The Microsoft Worldwide Partner Conference 2010 is due to begin next Monday, and it's clear that Windows Azure is going to be one of the products that will get the most attention this year.

Over the last 2 years, I have attended and even taken part in many cloud computing talks, and I keep hearing tons of very confused opinions on cloud computing, and even more on the future of cloud computing. Hence, here are my top 10 cloud computing predictions for the next 5 years.

1) Cloud will become mainstream in enterprise adoptions

Cloud computing is already mainstream in consumer markets. Amazon, Google, Yahoo, Microsoft, ... all of them are running on top of their own clouds. If you're using a web search engine, then you're using the cloud already. In the next 5 years, I expect the cloud to become the mainstream adoption pattern. I am NOT saying that the cloud will dominate the enterprise just 5 years from now; I am saying that it will dominate setups and upgrades. It might take one or two decades to progressively move away from strict on-premise solutions.

2) ISVs will vastly dominate the overall cloud consumption

Yet, the migration toward the cloud will be implicit. Indeed, enterprises care little about cloud computing itself; they will buy SaaS solutions, not raw processing power. The vast majority of those SaaS solutions will be powered by public clouds, but for non-IT companies this fact will be irrelevant. Economic forces will drive ISVs toward the cloud, and ISVs will vastly dominate the overall cloud consumption. Single-tenant apps have a very hard time competing with the low management costs of multi-tenant apps. Nothing will actually prevent companies from buying raw cloud processing power, but I expect this behavior to be marginalized as the SaaS ecosystem grows.

3) Private clouds are nonexistent and will remain marginal

We keep hearing about private clouds, yet, if we exclude the few private clouds designed by consumer internet leaders (eBay, Yahoo, Facebook, Yandex, ...) that have not been turned into public clouds, there is NOTHING even close to a private cloud on the market at present. The only product that starts to look like a private cloud is Eucalyptus, but it's still light-years away from the global solutions built on top of containerized data centers that public clouds represent. The skills and the costs required to operate a cloud are steep; I can't figure out why companies would go for private clouds. Some will argue that control is of utmost importance, but shareholders might not agree when they realize that a small cloud costs millions upfront, and millions more for ongoing management. Companies with ad-hoc data centers will keep improving them, probably importing best practices established by major cloud hosters, but that's it. Those improved data centers will still be extremely far from public clouds feature-wise, reliability-wise and security-wise.

4) Hybrid clouds are fantasy and will remain fantasy

Another myth I keep hearing about is the idea of hybrid clouds: you have your own private cloud, and when you lack capacity, you rent some extra from a public cloud. Although the idea is fascinating, IMHO, it's vastly impractical. Designing a true auto-scalable app on top of a cloud - any cloud - is already quite hard. Clouds are easing the scale-out process by offering some very normalized environments, but scaling out remains a challenge, especially for enterprise apps. Offloading processing power into some heterogeneous computing environment is a bad idea: software complexity would skyrocket, and it will fail like grid computing failed before. What was a nice idea in theory was just way too difficult to be routinely implemented. Please note that I am not stating that hybrid clouds are impossible; I am just stating that they are very unwise, and that complexity will come back as punishment.

5) Cloud mashups will be the dominant pattern

I expect SaaS mashups to become the dominant pattern in enterprise environments - for consumer environments, it's already the case. Companies and people alike will combine the apps they want most, irrespective of the underlying clouds. As a result, scenarios where a single company adopts Salesforce for the CRM, Microsoft BPOS for the collaborative suite and NetSuite for the ERP are likely. Obviously, those mashups will require very capable integration tools, which will also be offered on the cloud. RunMyProcess would be a good example of such tools.

6) Self-hosted servers will be considered a liability

Some people consider self-hosted servers as more secure than remote or cloud-hosted solutions. As far as I can tell, 99.99% of the time, this appears to be a complete fallacy. Securing a computing environment takes skills that even my bank (a very large international bank) is obviously lacking. The situation is worse in nearly all non-IT companies I have investigated while running Lokad. Some companies happen to be very confident in their IT security, but most of the time, it's just over-confidence, with no tangible processes to support this confidence. As cloud computing grows more mature, I expect the community consensus to gradually converge toward the opinion that unless proven otherwise, any self-hosted server should be considered as an IT liability.

7) No1 cloud issue will remain the lack of qualified manpower

Media, influencers, integrators, and cloud providers keep discussing the relative strengths and weaknesses of the cloud, but there is one issue that dwarfs all others, and yet this issue is barely mentioned: the extreme lack of talented people to develop in the cloud. Don't believe me? Just try to hire any experienced cloud computing software architect. Hiring good developers is already extremely hard; hiring good developers who happen to have skills and experience in large scale distributed systems is even harder.

8) Fine-grained geolocation will be the No1 entry barrier

Two years ago I was stating that cloud computing was an arena for big players. I still believe this isn't going to change. In particular, geolocation capabilities - aka the possibility to bring the computing resources close to the end-user - are already exponentially increasing the entry costs in the cloud market. Closer data centers will mean lower latencies, and smoother UI behaviors for cloud-hosted apps. Ultra-responsive UIs are so much more enjoyable that it's little wonder that Google recently started to add website speed as an extra criterion for their website ranking. In 5 years, clouds will no longer be expected to have half a dozen worldwide locations (Windows Azure has 6 locations at the moment), but dozens, with a data center close to every major megalopolis. Considering that each data center costs more than a few hundred million USD, entering the cloud market will be just impossible for anyone but the largest IT companies of the planet.

9) Cloud computing is not going to kill desktop apps

Some believe that the cloud is going to kill desktop apps. I don't. I believe that all software areas will be growing (cloud / desktop / embedded / games, ...). There will be more desktop apps 5 years from now, and WAY more cloud apps. However, cloud computing will shift the purpose and the value of desktop apps. The AppStore is a good example of the strong interactions that are likely to exist between non-web apps and the cloud: apps are available on the cloud at any time, typically interact with the cloud, and bring a top user experience that would be very hard to deliver otherwise. And no, I don't think that World of Warcraft is going to run on HTML 5 any time soon.

10) Dev stacks are going to develop their cloud affinity

The software world is basically divided between a handful of development stacks: Microsoft/.NET, Linux/LAMP, Oracle/Java, ... I expect each stack to develop some growing affinity with one public cloud in particular. The .NET world already has a very natural orientation toward Windows Azure. Linux-based solutions will keep moving forward with Amazon, possibly Rackspace. As Google is expanding the coverage of its App Engine, I expect more Java/Python development tools to be released - basically the ones internally developed and used at Google. Some are dreaming about cloud interoperability, but considering the pace of change in the cloud computing world, I don't see that happening in the next 5 years.

What are your predictions for the cloud in the next 5 years?

Saturday
May 15, 2010

Really Simple Monitoring

Moving toward cloud computing relieves us from (most) hardware downtime worries, yet cloud computing is no magic pill that guarantees that every single one of our apps is ready to serve users as expected.

You need a monitoring system to achieve this. In particular, OS uptime and simple HTTP responsiveness only scratch the surface as far as monitoring is concerned.

In order to go beyond plain uptime monitoring, Lokad has started a new Windows Azure open source project named Lokad.Monitoring. The project comes with several components:

  • A monitoring philosophy,
  • An XML format, Really Simple Monitoring (shamelessly inspired by RSS),
  • A web client for Windows Azure

The beta version is already in production. Check the project introduction page.

Wednesday
Apr 28, 2010

Sqwarea, open source game on Windows Azure

Beyond running a small software company, I am also responsible for the Software Engineering and Distributed Computing course at the ENS Paris. For the fourth year in a row, Microsoft offered gracious support for this course (including some Windows Azure resources).

Every year, about a dozen 1st year Computer Science students take on a software project. Last year, my students produced Clouster, a scalable clustering algorithm on top of Windows Azure. It was already a significant achievement considering the beta status of Windows Azure at the time (students upgraded twice from one SDK version to another during the course).

This year, my students went (*) for a massively multiplayer online strategy game named Sqwarea (heavy contraction of square+war+area).

You are a King battling over a gigantic map to conquer the world. Train soldiers, conquer new territories, and resist the assault of other kingdoms. The world is flat, see for yourself.

Despite my teaching methods, students managed to do really well (especially considering that we are only 2/3 of the way through the project at this point in time), so let's review a few salient facts about this project:

  • Open source, see sqwarea.codeplex.com
  • ASP.NET MVC, C#, jQuery, OpenId for the front-end.
  • Lokad.Cloud for the persistence, and back-end execution framework.
  • Windows Azure used as the hoster.
  • Table Storage for the persistence (1 entity per map square).
  • Queue Storage to spread the workload among VMs.

Then, in order to make sure the project wasn't going to be easy, I included a game rule that is really hard to implement:

People and soldiers have to be constantly reminded who is the King; otherwise, they just do it their own way. If, after a conquest, a part of your kingdom is no longer connected to your King through a path of controlled land squares, then the disconnected area reverts to neutral.

Apparently, students managed to implement a good (and expectedly complicated) scheme to get this connectivity rule working in a very scalable way.
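To make the rule concrete, here is a minimal single-machine sketch in Python. It is NOT the students' scheme (which is not described here and had to scale across VMs); it is a plain breadth-first search over an in-memory map. The dictionary layout, the function names and the choice of 4-neighbour adjacency are all assumptions made for illustration.

```python
from collections import deque


def disconnected_squares(owner, king_square, player):
    """Return the squares owned by `player` that have no path to the King.

    `owner` maps (x, y) -> player id; a missing key or None means neutral.
    This representation is hypothetical, not the one used by Sqwarea.
    """
    owned = {sq for sq, p in owner.items() if p == player}
    if owner.get(king_square) != player:
        # The King's own square is not controlled: nothing is connected.
        return owned

    reachable = {king_square}
    frontier = deque([king_square])
    while frontier:
        x, y = frontier.popleft()
        # Assumption: squares connect through their 4 direct neighbours.
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in reachable and owner.get(nxt) == player:
                reachable.add(nxt)
                frontier.append(nxt)
    return owned - reachable


def apply_connectivity_rule(owner, king_square, player):
    """Revert every disconnected square of `player` to neutral, in place."""
    lost = disconnected_squares(owner, king_square, player)
    for sq in lost:
        owner[sq] = None
    return lost


# Player "A" holds two squares next to the King and two cut off by "B".
world = {(0, 0): "A", (1, 0): "A", (2, 0): "B", (3, 0): "A", (3, 1): "A"}
print(sorted(apply_connectivity_rule(world, (0, 0), "A")))  # [(3, 0), (3, 1)]
```

A full traversal like this touches every square of a kingdom after each conquest, which is exactly what becomes painful on a gigantic map stored as one entity per square.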

(*) Actually, every year, I choose the project to be carried on by my students. Hence, if you think the project idea is lame, blame me.

Friday
Apr 9, 2010

.NET profiler for Windows Azure

Under modern managed runtimes, performance profiling comes in two flavors:

  • CPU profiling
  • memory profiling

In the last decade, the No1 breakthrough in the profiling arena was the introduction of sampling. Instead of intercepting every single method call and every single object allocation - introducing a 10x slowdown in the process - the profiler only takes samples at regular intervals.

Sampling trades some accuracy for a large gain in performance. In practice, sampling is not just a tradeoff, it's a game changer.

Indeed, even a modest sampling rate - say 2 or 3% of your processing capacity - already gives you an incredibly precise execution profile. Hint: with a 2GHz CPU, 1% already accounts for 20M cycles per second.
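The cycle budget behind that hint is simple arithmetic: 1% of a 2 GHz clock is 20 million cycles per second. A few lines of Python (function and constant names are purely illustrative) spell it out for the 1 to 3% range:

```python
def cycles_per_second(clock_hz, percent):
    """CPU cycles per second that a given percentage of the clock represents."""
    return clock_hz * percent // 100


CLOCK_HZ = 2_000_000_000  # a 2 GHz CPU, as in the hint

for percent in (1, 2, 3):
    budget = cycles_per_second(CLOCK_HZ, percent)
    print(f"{percent}% of 2 GHz = {budget // 1_000_000}M cycles per second")
# 1% of 2 GHz = 20M cycles per second
# 2% of 2 GHz = 40M cycles per second
# 3% of 2 GHz = 60M cycles per second
```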

With sampling, it becomes possible to aggregate fine-grained execution statistics under production-like conditions, or even in actual production, leaving the profiler ON all the time.
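To illustrate the principle (and only the principle), here is a toy sampling profiler in Python rather than .NET. It has nothing to do with the Microsoft profiler or dotTrace: a background thread wakes up at a fixed interval and records which function the target thread is currently executing. The class name, the 5 ms interval and the `hot_loop` workload are assumptions for the sake of the example, and `sys._current_frames()` is a CPython-specific call.

```python
import collections
import sys
import threading
import time


class SamplingProfiler:
    """Toy sampler: counts which function a target thread is in at each tick."""

    def __init__(self, target_thread_id, interval=0.005):
        self.target_thread_id = target_thread_id
        self.interval = interval  # seconds between two samples
        self.counts = collections.Counter()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            # Snapshot of the frame currently executing in the target thread.
            frame = sys._current_frames().get(self.target_thread_id)
            if frame is not None:
                self.counts[frame.f_code.co_name] += 1
            time.sleep(self.interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()


def hot_loop(duration):
    """Busy work standing in for an application hotspot."""
    end = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < end:
        total += 1
    return total


with SamplingProfiler(threading.get_ident()) as profiler:
    hot_loop(0.2)
print(profiler.counts.most_common(3))
```

The profiled code is never instrumented; the only cost is the periodic wake-up of the sampler thread, which is why the overhead stays small enough to leave it running.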

In the .NET ecosystem, Microsoft has been offering for years a free (yet rudimentary) memory profiler - while 3rd party vendors were providing more advanced tools, such as the excellent dotTrace by JetBrains.

Lately, I discovered that Microsoft had released a new free CPU Profiler for .NET along with Visual Studio 2008. Caution: while running this tool for the 1st time, I did get a BSOD caused by an unsupported processor; the problem was fixed through this hotfix.

The MS profiler is rather crude, especially on the UI side. Yet, its strong orientation toward command-line and CSV/XML exports makes it rather handy for continuous integration scenarios where the profiler is run behind unit tests (or batch execs) putting the system under performance stress.

Back to the cloudy part announced in the post title, I believe that profilers will soon be considered must-have components for cloud computing. Indeed, with the cloud you end up being charged precisely for the resources you consume. Thus, the performance gains obtained with a profiler have a very real and very measurable ROI.

Cloud computing is not cheap per se: if you really want cheap stuff, you can roll your own hardware and get a 90% discount. Cloud computing is low cost only if performance is kept under control: no need to be a performance hero, but poor performance - that could be tolerated in the good ol' days when the customer was paying for the hardware too - now impacts the SaaS vendor instead.

Forecasters expect the cloud computing market to exceed tens of billions: cost-killer technologies are bound to emerge in such a large market, and I expect profilers to be one of them.

Comparing the very marginal overhead of a sampling profiler to the significant savings that could be obtained by fine-tuning the precise hotspots of cloud apps, I expect cloud profilers to be used in the background for all apps in both testing AND production environments alike.

The strong orientation of Windows Azure toward .NET makes it one of the best clouds in which to introduce such a profiling layer on top of cloud apps early on.

I am actually toying with the idea of trying to run the MS profiler on Azure directly (you can run arbitrary exec files); however, it may prove a bit difficult for the time being.

Monday
Mar 22, 2010

Thinking about an academic package for Azure

This year is the 4th time that I am teaching Software Engineering and Distributed Computing at the ENS. The classroom project of 2010 is based on Windows Azure, like the one of 2009.

Today, I have been kindly asked by Microsoft folks to suggest an academic package for Windows Azure, to help those like me who want to get their students started with cloud computing in general, and Windows Azure in particular.

Thus, I decided to post here my early thoughts on that one. Keep in mind I have strictly no power whatsoever over Microsoft's strategy. I am merely publishing here some thoughts in the hope of getting extra community feedback.

I believe that Windows Azure, with its strong emphasis on tooling experience, is a very well suited platform for introducing cloud computing to Computer Science students.

Unfortunately, getting an extra budget allocated for the cloud computing course in the Computer Science Department of your local university is typically complicated. Indeed, very few CS courses require a specific budget for experiments. Furthermore, the pay-as-you-go pricing of cloud computing goes against nearly every single budgeting rule that universities have in place - at least in France, but I would guess similar policies exist in other countries. Administrations tend to be wary of “elastic” budgeting, and IMHO, rightfully so. This is not a criticism, merely an observation.

In my short 2-year cloud teaching experience, a nice academic package sponsored by Microsoft has tremendously simplified the delivery of the “experimental” part of my cloud computing course.

Obviously, unlike software licenses, offering cloud resources costs very real money. In order to make the most of resources allocated to academics, I would suggest narrowing the offer down to the students who are the most likely to have an impact on the software industry in the next decade.

The following conditions could be applied to the offer:

  1. The course is set up by a Computer Science Department.
  2. It focuses on cloud computing and/or software engineering.
  3. Hands-on, project-oriented teaching.
  4. Small classroom, 30 students or fewer.

Obviously there are plenty of situations where cloud computing would make sense without fitting into these constraints, such as a bioinformatics class with a data crunching project, or large audience courses with 100+ students, but the package that I am proposing below is unlikely to match the needs of those situations anyway.

For the academic package itself, I would suggest:

  1. 1 book per student on Visual Studio 20XY (replace XY by the latest edition).
  2. 4 months of Azure hosting with:
    • 4 small VMs
    • 30 storage accounts, cumulative storage limited to 50GB.
    • 4 small SQL instances of 1GB.
    • Cumulative bandwidth limited to 5GB per day.

Although it might be surprising in this day and age of cloud computing, I have found that artifacts made out of dead trees tend to be the most valuable ingredient for a good cloud computing course, especially those 1000+ page ones about the latest versions of Visual Studio / .NET (future editions might even include a chapter or two about Windows Azure, which would be really nice).

Indeed, in order to tackle the cloud, students must first overcome difficulties posed by their programming environments. One can argue that everything can be found on the web. That’s true, but there is so much information online about .NET and Visual Studio that students get lost and lose their motivation if they have to browse through an endless flow of information.

Furthermore, I have found that teaching the basics of C# or .NET in a Computer Science course is a bad idea. First, it's like an attempt to kill students out of sheer boredom. Just imagine yourself listening for 3h straight to someone enumerating the keywords of a programming language. Second, you have little or no control over the background of your students. Some might be Java or C++ gurus already, while some might have never heard of OO programming.

With the book on hand, I suggest simply asking students to read a couple of chapters from one week to the next, and quizzing them on their reading at the beginning of each session.

Then, concerning the Windows Azure package itself, I suggest 4 months' worth of CPU, as it should fit most courses. If the course runs longer than 4 months, then I would suggest that students start optimizing their app so as not to use all 4 VMs all the time.

4 VMs seem just enough to feel both the power and the pain of scaling out. They bring a handy 4x speed-up if the app is well designed, but represent a world of pain if the app does not correctly handle concurrency.

The same idea applies to SQL instances. Offering a single 10GB instance would make things easier, but the course should be focused on scaling out, not scaling up. Thus, there is no reason to make things easier here.

In practice, I have found that offering individual storage accounts simplifies experiments, although there is little justification for offering either a lot of storage or a lot of bandwidth.

In total, the package would represent a value of roughly $2500 (assuming $30 per book), and, from a different angle, about $100 per student. Not cheap, but attracting talented students seems a worthy (albeit long-term) investment.
