Web & Cache System

Wow… It seems I took a short vacation from my writing hobby 🙂. Apologies to my friends who have been waiting to read more of my thoughts.

This topic is not widely known. I have been discussing it with my mentor Santhosh Tuppad & thought of sharing it with my fellow friends. I learned a lot about cache systems from Perze Ababa & Vaishali Antala.

The World Wide Web has grown terrifically over the past decade & so has the data & information provided by websites. If a website is really friendly & interesting, then it’s just a matter of time before the number of end users visiting it grows exponentially. The problem was how to manage millions of user requests & the traffic coming to the site. This is how people got the idea to start eating each other’s heads 🙂. Oops! What I meant is, people started thinking of ways to provide better performance & speed for their websites.

We are all aware of the “cache” memory available in our CPUs. A cache is a type of memory that stores temporary data which is likely to be used repeatedly. This saves time & reduces load on the slower storage behind it (the physical disk, say): when the same data (let us suppose a file) is accessed often, it is fetched from the physical disk only the first time & from the cache from the 2nd time onwards.
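The idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming a plain dictionary as the cache; `ReadThroughCache` and `slow_read` are hypothetical names I made up, & `slow_read` merely stands in for a real disk read.

```python
# Minimal read-through cache sketch. All names here are hypothetical.
class ReadThroughCache:
    def __init__(self, loader):
        self._loader = loader  # function that fetches from the slow source
        self._store = {}       # in-memory cache: key -> data
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Not cached yet: fetch once from the slow source and remember it.
        self.misses += 1
        data = self._loader(key)
        self._store[key] = data
        return data


def slow_read(name):
    # Stand-in for reading a file from the physical disk.
    return f"contents of {name}"


cache = ReadThroughCache(slow_read)
first = cache.get("report.txt")   # miss: goes to the slow source
second = cache.get("report.txt")  # hit: served from memory
```

After the two calls, the slow source has been asked only once; the second request never leaves memory.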

The scenario here is that of websites with millions of users sending requests to the site’s server. The cache memory of a single system isn’t sufficient here. Thus, people started building caches dedicated to web application acceleration.

Magical Applications making this possible

A combination of applications, when used together, could make your web app faster than “Iron Man flying in the movie Avengers 🙂”

  1. Level 1 – Varnish
  2. Level 2 – Akamai

You could ask here: do we need to use both of them? To understand this, let us consider the client-server architecture, which could be roughly described by the following pic:

Client-Server Architecture

Now, for the first level of caching you could use Varnish, which resides on the server side. Then, for the second level, to get better performance & reduce load, you could use Akamai on the client side. Let us understand the functionality & working of both.

Overview of Varnish & Akamai

Cache System
  1. Akamai

Akamai aids in caching content viewed by users on the client side. To accomplish this, Akamai makes use of the peer-to-peer networking concept. (Much of Akamai’s caching also happens on Akamai’s own servers placed close to users; the description here focuses on the peer-to-peer part.) For example, consider a group of 5 users. user_1 tries to view the page http://in.bookmyshow.com/movies/The-Avengers-3D-/ET00007152 for the first time. The request made by user_1 goes to the server of the BookMyShow website, which returns that page as the response. Akamai downloads the page/content onto the client system & uses the client system as a server for other nearby users trying to access the same page/content. So, when user_3 tries to access the same page/content & user_3 is near user_1, the response will be sent by the user_1 server (which is actually a client system) instead of the centralized server (i.e., the BookMyShow server).

Tada… This reduces the load on the original server & the page also loads faster because user_1’s system is near user_3, resulting in better performance of the web app.
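To make the user_1 / user_3 example concrete, here is a small sketch of “serve from the nearest system that already has the page, else from the centralized server”. It is not Akamai’s actual selection logic; `Node`, `pick_source` & the one-dimensional `position` are assumptions invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    position: float  # 1-D location, purely illustrative
    pages: set = field(default_factory=set)


def pick_source(requester, url, peers, origin):
    """Return the closest peer that already holds url, else the origin."""
    holders = [p for p in peers if url in p.pages and p is not requester]
    if not holders:
        return origin
    return min(holders, key=lambda p: abs(p.position - requester.position))


origin = Node("origin", position=1000.0)
user_1 = Node("user_1", position=1.0)
user_3 = Node("user_3", position=3.0)
url = "/movies/The-Avengers-3D-/ET00007152"

# user_1 asks first: nobody nearby has the page, so the origin answers.
first_source = pick_source(user_1, url, [user_1, user_3], origin)
user_1.pages.add(url)  # user_1 now holds a copy

# user_3 asks next: user_1 is nearby and has the page.
second_source = pick_source(user_3, url, [user_1, user_3], origin)
```

Here `first_source` is the origin & `second_source` is user_1, which mirrors the story above.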

    1. Working
Akamai process

Working of Akamai could be explained in the following 4 simple steps:

      1. The user sends a request to the original server.
      2. The original server looks up the index.html file.
      3. The index.html file points to the Akamaized page on the Akamai server.
      4. The Akamai server delivers the content to the user as the response.
    2. Akamai’s Approach
      1. Eliminate long routes whenever possible:

The peer-to-peer concept lets Akamai use a client system as a temporary server that caches or replicates content. Thus, when a request is made, the content is delivered from a server close to the end user instead of the centralized server.

      2. Optimize routes:

Akamai maps paths across the Internet to avoid trouble spots, compresses content, & replicates packets to ensure fast, complete delivery.

      3. Perform computing closer to the user:

This helps avoid long internet latencies, i.e., the time taken to transfer data from one end to the other.

    3. Akamaize the page

As mentioned in step 3 of “Working” of Akamai above, the index.html file redirects the request to the Akamai server by referring to the Akamaized url of the same request. (Don’t be scared. I will explain the meaning of “Akamaized page/url” 🙂)

We have a centralized server which has an address/url for every file & page in the website structure. E.g., www.cnn.com/images/hub2000/ad.info.gif

Now, all these urls of a website are converted into “Akamaized urls”, which are understood by the Akamai server. This task is done by the “Free Flow Launcher” tool. Converting here means that the tool adds some data in front of the normal url structure, which helps point to the appropriate Akamai server.

E.g., Akamaized url: http://a388.g.akamaitech.net/7/388/21/fc35ed7f236388/cnn.com/images/hub2000/ad.info.gif
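The “adds some data in front” idea can be mimicked with plain string handling. This is only a sketch that reproduces the example above; it is not the actual tool, `akamaize` is a hypothetical helper, & dropping the leading “www.” is an assumption made just so the output matches the example. The prefix is copied as-is from the example & I am not decoding what its parts mean.

```python
def akamaize(url, prefix):
    """Prepend an Akamai-style prefix to a plain url (illustrative only)."""
    # Drop the scheme and a leading "www." so the result matches the example.
    for scheme in ("http://", "https://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
    if url.startswith("www."):
        url = url[len("www."):]
    return f"{prefix.rstrip('/')}/{url}"


# Prefix copied from the example above; its fields are not decoded here.
PREFIX = "http://a388.g.akamaitech.net/7/388/21/fc35ed7f236388"
akamaized = akamaize("www.cnn.com/images/hub2000/ad.info.gif", PREFIX)
```

With these inputs, `akamaized` is the same string as the Akamaized url shown in the example.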

    4. Peer-to-peer networking

Here is how a peer-to-peer network looks. All the nodes (systems) are connected to each other.

Akamai caches data on the end user’s system using Akamai’s “Net Session Interface” tool.

Peer-to-peer Architecture
    5. References

Want to learn more about Akamai? Refer to these:

      1. Wikipedia
      2. Google baba – aka Google search 🙂 
  2. Varnish
    1. Need

Now we have a rough idea of how Akamai works by caching content on the client side. There are two problems with this:

      1. We can’t use an end user’s system to store cached data forever. So, we need to set an expiry time for cached data stored on the end user’s system. This expiry time could be a short interval like 2-5 mins.
      2. What happens if there is a miss in the Akamai cache, i.e., the cached content is not available on any end-user server? In this case the request will be directed to the centralized server.
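Both points can be seen in a tiny expiry-based cache. This is a sketch under my own assumptions, not how Akamai stores data: `TTLCache` is a hypothetical class, & the current time is passed in as a plain number of seconds so that the example stays predictable.

```python
class TTLCache:
    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._store = {}  # key -> (value, time stored)

    def put(self, key, value, now):
        self._store[key] = (value, now)

    def get(self, key, now):
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if now - stored_at >= self._ttl:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value


cache = TTLCache(ttl_seconds=120)  # 2 minutes, the low end of the range above
cache.put("/home", "<html>home</html>", now=0)
fresh = cache.get("/home", now=60)     # still within the expiry time
expired = cache.get("/home", now=180)  # past the expiry time
```

`fresh` holds the page, while `expired` is `None`, which is exactly the “miss” case from point 2: the caller would now have to go to the centralized server.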

Now, what if there is a miss every time & every request gets redirected to the centralized server? This results in increased load on the centralized server & long internet latencies.

Thus, there was a need to cache content on the server side as well 🙂

This is where Varnish comes into the picture.
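A rough sketch of why the server-side level helps: even when the cache near the user misses, a cache sitting in front of the centralized server can answer without rebuilding the page. This is not Varnish configuration; plain dictionaries stand in for both cache levels, & `origin_fetch`, `server_cache` & `fetch` are hypothetical names.

```python
origin_requests = 0


def origin_fetch(url):
    """Stand-in for the centralized server building a page."""
    global origin_requests
    origin_requests += 1
    return f"<html>{url}</html>"


server_cache = {}  # plays the role of the server-side cache (Varnish's place)


def fetch(url, client_cache):
    # Level 2: cache close to the user.
    if url in client_cache:
        return client_cache[url]
    # Level 1: cache in front of the centralized server.
    if url not in server_cache:
        server_cache[url] = origin_fetch(url)
    client_cache[url] = server_cache[url]
    return client_cache[url]


# Two user groups with empty near-user caches request the same page.
page_a = fetch("/movies", client_cache={})
page_b = fetch("/movies", client_cache={})
```

Both requests miss near the user, yet `origin_requests` ends up at 1, because the second miss is absorbed by the server-side cache.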

    2. Working

Hmmm, I guess after reading my long post you must be wondering when I will stop blabbering 🙂

So, let me show you a lively video of how the cache works:

https://www.youtube.com/watch?v=x7t2Sp174eI&feature=player_embedded

or

https://www.varnish-cache.org/about

    3. Credits

As I am a very good person, I would like to give credit to

http://deglos.com/blog/2010/09/22/drupal-and-varnish-quick-intro

which is the best reference guide explaining the advantages, disadvantages & optimization of Varnish with Drupal.

If you have any questions on Akamai & Varnish then please do leave comments. I will try to answer them to the best of my knowledge & that will also help me make my own view clearer 🙂

8 thoughts on “Web & Cache System”

  1. Wonderfully described, Yagnesh…. Superb, worth and worth reading. Thanks for sharing it to all of us.

    Keep up writing these, you have enormous talent.

    Kudos!!

    Thanks,
    Sudhamshu

  2. This is just great!! Elaborated & yet explained in simple terms.
    Maybe you should also come up with test ideas one could use while testing a cache system in any web application; that would be of great help to testers who want to test this area of web apps.

    Thanks for sharing the knowledge!

    Regards,
    Sunil
