There is a lot of information available on how to build scalable web applications. I thought it would be nice to share a little of how we at Ibuildings build them.
Often we're dealing with large applications that need to scale very well. Sometimes this is due to high load or traffic, but it can also be because the web application is used in many ways. For example, an application can have a heavily used CRM-like back-end system, while the front-end website uses the same system to publish information from it. You can imagine that if this isn't set up correctly, increased load on the back-end affects the website users, and vice versa.
To scale well, your application has to be designed with scalability in mind. At Ibuildings we always start with a functional and technical design specification. The design covers scalability, and therefore we address it at an early stage. Of course every application requires its own level of scalability and performance.
When designing there are a number of things to consider. Performance is the first that comes to mind. Better performance usually means handling more requests per second with less intensive operations. A good caching strategy is important because it increases application performance. A general rule of thumb is to not use PHP at all. Don't get me wrong: I am not saying you should use some other programming language. I am saying cache your data as close to the HTML output as possible. Or even better, generate static HTML, which is the fastest way to serve your website. Luckily there are great tools out there that help you cache your data. We often use tools like Zend Platform, APC, memcached and Smarty.
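To make "cache close to the HTML output" concrete, here is a minimal sketch of caching rendered HTML with an expiry time. It is written in Python for brevity; the same pattern applies in PHP with memcached or APC as the store. The `TTLCache` class, the `get_or_render` method and the key name are hypothetical names for illustration, not part of any of the tools mentioned above.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache; stands in for memcached or APC in this sketch."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached HTML)
        self._items: Dict[str, Tuple[float, str]] = {}

    def get_or_render(self, key: str, ttl: float, render: Callable[[], str]) -> str:
        """Return cached HTML for key; call render() only on a miss or after expiry."""
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        html = render()  # the expensive part: queries, templates, and so on
        self._items[key] = (now + ttl, html)
        return html


def render_homepage() -> str:
    # Hypothetical renderer; a real one would query the database and fill a template.
    return "<h1>Latest news</h1>"


cache = TTLCache()
page = cache.get_or_render("homepage", ttl=60, render=render_homepage)
```

The application code only runs when the cached copy is missing or older than `ttl` seconds; every other request is served straight from the cache.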
Build software that is loosely coupled. If it becomes necessary in the future, it will be much easier to serve different components from different servers. Often you'll see bottlenecks in just a small part of the application. If your software is loosely coupled, it is much easier to take that part and put it on dedicated hardware.
The infrastructure architecture must support the application architecture. Take PHP sessions, for example: if you have a web server cluster, you need a solution for sharing sessions between servers. What about storage? Do you allow uploads of files such as images or documents? Can all servers access the same data? We use everything from expensive high-end SANs to simple rsync solutions. Do servers have their own local cache, or do you use a shared caching mechanism? Is it possible to offload tasks to designated servers? Building on a service-oriented architecture (SOA) can help spread load across multiple servers.
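The session problem can be sketched as follows: every web server talks to the same session backend instead of keeping session data locally. This is an illustrative Python sketch under stated assumptions; `SessionBackend`, `SharedBackend` and `WebServer` are hypothetical names, and the in-memory dictionary stands in for a real shared store such as memcached or a database table.

```python
from typing import Dict, Optional, Protocol


class SessionBackend(Protocol):
    """What a web server needs from session storage, and nothing more."""

    def load(self, session_id: str) -> Optional[Dict[str, str]]: ...

    def save(self, session_id: str, data: Dict[str, str]) -> None: ...


class SharedBackend:
    """Stand-in for a store that all servers in the cluster can reach."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, str]] = {}

    def load(self, session_id: str) -> Optional[Dict[str, str]]:
        row = self._rows.get(session_id)
        return dict(row) if row is not None else None

    def save(self, session_id: str, data: Dict[str, str]) -> None:
        self._rows[session_id] = dict(data)


class WebServer:
    def __init__(self, name: str, sessions: SessionBackend) -> None:
        self.name = name
        self._sessions = sessions

    def login(self, session_id: str, user: str) -> None:
        self._sessions.save(session_id, {"user": user})

    def current_user(self, session_id: str) -> Optional[str]:
        data = self._sessions.load(session_id)
        return data["user"] if data else None


# Two servers behind a load balancer, sharing one backend.
shared = SharedBackend()
web1 = WebServer("web1", shared)
web2 = WebServer("web2", shared)
web1.login("abc123", "alice")
user_on_web2 = web2.current_user("abc123")  # the session follows the user
```

Because `WebServer` depends only on the small `SessionBackend` interface, the storage can later be moved to dedicated hardware without touching the rest of the application.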
When it comes to databases we prefer MySQL. Designing your database includes thinking about scalability and performance. Replication is a good way to spread read queries across multiple servers. Unfortunately it doesn't help when you have lots of writes, because every write still has to be applied on every replica. Data partitioning is also worth considering. For example, each group of database servers can serve a part of your data, which lets you spread writes across different groups of servers. Of course this has downsides too: the application has to be aware of the partitioning, and fetching data that spans partitions becomes more complex.
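A rough sketch of what "the application has to be aware of this partitioning" can mean in practice: a small router that picks a server group per customer, sends writes to that group's primary and spreads reads over its replicas. This is Python for illustration only; the host names, the `ShardGroup` and `Router` classes, and partitioning by `customer_id` modulo the number of groups are all assumptions, not a description of a specific setup.

```python
import itertools
from typing import List, Sequence


class ShardGroup:
    """One partition: a primary for writes plus replicas for reads."""

    def __init__(self, primary: str, replicas: Sequence[str]) -> None:
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_cycle = itertools.cycle(list(replicas) or [primary])

    def host_for(self, is_write: bool) -> str:
        # Writes always hit the primary; reads rotate over the replicas.
        return self.primary if is_write else next(self._read_cycle)


class Router:
    def __init__(self, groups: List[ShardGroup]) -> None:
        if not groups:
            raise ValueError("at least one shard group is required")
        self._groups = groups

    def host_for(self, customer_id: int, is_write: bool) -> str:
        # Simple modulo partitioning: all data for a customer lives in one group.
        group = self._groups[customer_id % len(self._groups)]
        return group.host_for(is_write)


router = Router([
    ShardGroup("db-a-primary", ["db-a-replica1", "db-a-replica2"]),
    ShardGroup("db-b-primary", ["db-b-replica1"]),
])
write_host = router.host_for(customer_id=4, is_write=True)
read_host = router.host_for(customer_id=4, is_write=False)
```

Modulo partitioning is the simplest choice but makes adding a group later expensive, since most customers would map to a different group. A lookup table that maps customers to groups is a common alternative.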
While developing, make sure you work against a database that reflects the real thing. That is to say, fill your database with content before you start developing. A database with just a couple of records behaves differently than one with millions of records. Also think about how a user manages millions of records. This usually requires a different approach than when you only have three.
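Filling a development database can be as simple as a small seeding script. The sketch below uses Python with an in-memory SQLite database as a stand-in for MySQL; the `customers` table, its columns and the `seed_customers` function are hypothetical and only serve as an example.

```python
import random
import sqlite3


def seed_customers(conn: sqlite3.Connection, rows: int, seed: int = 42) -> None:
    """Fill a hypothetical customers table with `rows` generated records."""
    rng = random.Random(seed)  # fixed seed keeps the test data reproducible
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    cities = ["Amsterdam", "Utrecht", "Rotterdam", "Eindhoven"]
    # A generator avoids building the full list of rows in memory.
    generated = ((f"customer-{i}", rng.choice(cities)) for i in range(rows))
    with conn:  # one transaction for the whole batch instead of one per row
        conn.executemany(
            "INSERT INTO customers (name, city) VALUES (?, ?)", generated
        )


conn = sqlite3.connect(":memory:")
seed_customers(conn, 100_000)
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(total)
```

With a realistic number of rows in place, slow queries, missing indexes and unusable list screens show up during development instead of in production.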
In future blog posts I'll discuss the use of some tools and best practices in detail. So if you found this post interesting - stay tuned.