As a consultant specializing in enterprise Drupal, I often find myself answering a relatively small (and consistent) set of questions about the Drupal platform early on in any engagement. Many of the larger Drupal sites are media and publishing oriented, which makes these questions even more common across projects. As a result, I've started to compile a list of the most common Drupal questions and the answers I've been able to provide. First up is the most common concern enterprises have about Drupal: scalability.
To be fair, scalability and performance mean different things to different organizations, so this question can be a bit misleading. Still, it's Drupal's biggest perceived issue, so it's generally the thing that gets asked first.
To begin with, let's take a look at some numbers. On projects in which I've been involved, I've seen the following stats achieved:
While those numbers might not represent the top tier of high-traffic sites, clearly many businesses' needs can effectively be met by a well-built Drupal site. Of course, "well-built" is the key. Fortunately, when it comes to scalability, many of the challenges that need to be overcome for a Drupal site are the same ones that crop up on any typical LAMP stack site.
Current Drupal best practice is to generally build your site for its features and then allocate time to tune modules, queries, database setup, hardware, caching, etc. Overall this practice works well, but it does rely on the development team to be experienced and disciplined throughout the project so that any refactoring needed at the end of the project doesn't become too onerous. A few things to watch out for along the way:
The database layer is generally where most scaling bottlenecks occur (again, as is typical in LAMP stack web apps) and there are a few quick and easy steps to take as a first pass at tuning up MySQL.
Because Drupal is, in the end, a database-intensive application, the best way to improve performance is through aggressive caching. APC, memcache, CDNs, MySQL Proxy and static page caching all play a role in speeding up an enterprise Drupal deployment. A typical solution combines three things: a master/slave database setup, optionally with MySQL Proxy to split database reads and writes; Drupal's Cache Router module to implement caching with memcache; and a CDN to serve static content (something that can be done without hacking Drupal, despite what you might read).
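To make two of those ideas concrete, here is a minimal sketch in Python rather than Drupal's own PHP. It is not how Cache Router or MySQL Proxy are implemented. The names `TTLCache`, `route_query` and `cached_fetch` are hypothetical, the in-process dictionary merely stands in for memcache, and the routing rule is deliberately naive. It only illustrates the general shape of read/write splitting and of caching the result of an expensive query.

```python
import time


class TTLCache:
    """In-process stand-in for a memcache-style key/value store (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has outlived its time-to-live, so treat it as a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def route_query(sql):
    """Send plain SELECTs to a replica and everything else to the master."""
    first_word = sql.lstrip().split(None, 1)[0].upper() if sql.strip() else ""
    return "replica" if first_word == "SELECT" else "master"


def cached_fetch(cache, key, loader, ttl=60):
    """Return the cached value if present; otherwise load, store and return it."""
    value = cache.get(key)
    if value is None:
        value = loader()  # the expensive database query would happen here
        cache.set(key, value, ttl)
    return value
```

In this sketch a second request for the same key within the time-to-live never reaches the database, which is the whole point of the caching layer. A real read/write splitter also has to account for things this one ignores, such as replication lag, reads inside transactions and statements like SELECT ... FOR UPDATE, all of which belong on the master.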
So the final word on Drupal scaling is that, while it may not be ready to run Amazon and Yahoo, there's not too much left that Drupal can't do performance-wise. Of course, that doesn't mean it's completely ready to go out of the box, but as long as adequate attention is paid to performance-related best practices, Drupal can be a solid option for high-volume sites.