VMware HA : ‘V’irtually ‘M’ore - High Availability

Hi Guys,


Today I want to discuss the High Availability feature of VMware. First of all, what is high availability? Why do we need to concentrate on it? How can we accomplish it with the help of VMware? Come, let's support our users by providing high availability in the environment we work in.


What is High Availability?


High Availability is nothing but the continuity of a service. That means the service should be available to the users 24x7 without any interruption.


Why do we need to concentrate on this?


Nowadays every business, company and organization is integrated with internet marketing, and they are making more money by depending on it. The internet has become the medium between the user / customer and the company / business. So a company can make money all over the world without having offices or stores in each locality. Instead, it places servers in each location, because the servers are what provide the services to the users / customers.


If the user faces any interruption in the service, the company loses business. To avoid that, every company concentrates on providing high availability for its environment by means of clustering, redundancy at all levels (software and hardware), load balancing and so on. For that reason, as systems engineers, we need to concentrate on these things too.


How can we accomplish this High Availability in our environments?


There are a lot of technologies available today to provide HA to an environment. The main thing is that we need to know about them: for example Clustering, Grid Computing, Load Balancing, Hardware Redundancy, Software Redundancy, DR Sites and finally VMware HA. :)


How can we accomplish this High Availability with VMware HA?


After eating up a lot of your time, I am coming to the point now. I discussed these things because there are a lot of people who want to come into this industry with a lot of ambition, but without ignition. I want to support them by igniting their knowledge. Finally, let's talk about VMware HA.


VMware HA


As usual, this is also a very good feature from VMware. With it we can provide high availability to the virtual machines running on an ESX cluster. We already have a lot of features like VMotion, DRS, SMP etc., so why do we need this one? We need it because we need our services to run without interruption. Assume that, for some reason, one of the ESX servers in the cluster suddenly goes down. What happens to the virtual machines running on that particular server? Do they continue to run, or do they go down? They go down too. But with the help of VMware HA, these VMs can be restarted immediately on the other ESX servers in the same cluster. You will still get a downtime of 5 to 10 minutes here, because a server crash is an unexpected event.
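
To put the 5 to 10 minutes of downtime in perspective, here is a small Python sketch that turns downtime into an availability percentage. It assumes, purely for illustration, that there is exactly one host crash in a whole year; the function name and the one-year period are my own choices and not anything defined by VMware.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability_percent(downtime_minutes: float,
                         period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the percentage of the period during which the service was up."""
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


# Assumption: a single crash per year, with the VMs back after 5 or 10 minutes.
print(f"{availability_percent(5):.4f}")   # 99.9990
print(f"{availability_percent(10):.4f}")  # 99.9981
```

So even with an unexpected crash, a restart within 10 minutes keeps the yearly figure close to 99.998% under this assumption. More crashes per year would lower it accordingly.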


How does it work?


When you add an ESX server to an HA-enabled cluster in VC, VC installs an agent called the HA agent on that ESX server. Using this agent, the servers keep track of each other by sending heartbeat packets that say “I am ALIVE”. Every 5 seconds, each server in the cluster sends this packet to all the other servers, so that every server knows about the others. If a server crashes, hangs or is in maintenance mode, it stops sending this packet. At that moment, every live server in the cluster becomes aware of the dead server. The HA agents immediately calculate the available resources and try to restart the VMs on the servers that are still running.
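
The heartbeat idea described above can be sketched in a few lines of Python. This is only a toy model, not the real HA agent: the class name, the host names and the rule of declaring a host dead after three missed heartbeats are assumptions I made for the example. Only the 5 second interval comes from the description above.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL = 5.0   # seconds, as described in the article
MISSED_BEATS_ALLOWED = 3   # assumption for this sketch, not a VMware default


@dataclass
class HeartbeatMonitor:
    """Remembers when each peer host last said "I am ALIVE"."""
    last_seen: dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, host: str, now: float) -> None:
        # Called whenever a heartbeat packet arrives from a peer host.
        self.last_seen[host] = now

    def dead_hosts(self, now: float) -> list[str]:
        # A host counts as dead once it has been silent for longer
        # than the allowed number of heartbeat intervals.
        timeout = HEARTBEAT_INTERVAL * MISSED_BEATS_ALLOWED
        return sorted(host for host, seen in self.last_seen.items()
                      if now - seen > timeout)


monitor = HeartbeatMonitor()
for host in ("esx01", "esx02", "esx03"):   # hypothetical host names
    monitor.record_heartbeat(host, now=0.0)

# esx03 crashes; only the other two keep sending heartbeats.
monitor.record_heartbeat("esx01", now=15.0)
monitor.record_heartbeat("esx02", now=15.0)

print(monitor.dead_hosts(now=20.0))  # ['esx03']
```

Timestamps are passed in by hand here so the example runs instantly. A real agent would use the clock and the network, and would then move on to restarting the VMs of the dead host.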


Note: Once it is configured, the HA feature does not depend on VC or VMotion to do its job.


Even if a host gets disconnected from VC, the HA agent keeps running on that host. The only consequence is that the resource data of that host is not updated in the database, because VC uses a SQL/Oracle database to store the VM, resource, host, cluster, datacenter and all other such information.


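And remember, HA never uses VMotion to restart the VMs. The VMs on a failed host are already down, so they are powered on again on another host rather than migrated live.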
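
The step where the HA agents calculate the resources and try to restart the VMs on the running servers can be pictured with the simplified Python sketch below. It is not VMware's actual placement logic. It assumes that only free memory matters and that each VM goes to the surviving host with the most free memory; the function name, host names and numbers are all made up for the example.

```python
def place_vms(free_mb: dict[str, int],
              vm_needs_mb: dict[str, int]) -> tuple[dict[str, str], list[str]]:
    """Greedy placement: each VM goes to the host with the most free memory."""
    free = dict(free_mb)           # copy, so the caller's data is not modified
    placed: dict[str, str] = {}    # VM name -> host it is restarted on
    unplaced: list[str] = []       # VMs that did not fit anywhere
    for vm, need in vm_needs_mb.items():
        host = max(free, key=free.get, default=None)
        if host is None or free[host] < need:
            unplaced.append(vm)
            continue
        free[host] -= need
        placed[vm] = host
    return placed, unplaced


# Hypothetical cluster: esx01 has failed, these are the survivors (free MB).
survivors = {"esx02": 2048, "esx03": 1000}
failed_host_vms = {"vm1": 1024, "vm2": 1024, "vm3": 1024, "vm4": 512}

placed, unplaced = place_vms(survivors, failed_host_vms)
print(placed)    # {'vm1': 'esx02', 'vm2': 'esx02', 'vm4': 'esx03'}
print(unplaced)  # ['vm3']
```

In this example vm3 cannot be restarted because no surviving host has 1024 MB free any more, which is exactly why spare capacity has to be planned in advance.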


……………. That's all for now. I will discuss more in the next article.


Thanks for showing extraordinary interest in my blog and for supporting it. Keep reading interesting articles like these.



Comments

  1. Hi friends,
    Suppose a cluster has 3 ESX servers and one of them goes down. 5 VMs are running on that server, and they require a minimum of 3GB RAM and 2GHz of processor speed. But the remaining 2 ESX servers have only 2GB RAM and 2GHz of processor speed available. In this case, will all the VMs start? If not, which ones will start, and on what basis?

    In the other case, where the required resources are not available at all, what will happen? How will the end user know about this situation?

    Meanwhile, will the end user notice this breakdown, or will the VMs restart immediately? Otherwise, how much time does it take to restart?

    Please clarify the above doubts.

    Thanks in advance
    Ranjith

  2. Hi Ranjith,

    This is a very good and common question. Here we go....

    1. First of all, in production or large environments, the people who handle the infrastructure perform capacity planning before implementing or upgrading, and they purchase servers according to their requirements. In such environments they keep some resources reserved for situations like these. If anybody plans their environment the way you stated above, they are not following the standards needed to maintain 100% uptime.

    2. Answer to your question: HA options are available for ESX servers and for virtual machines too. If you add an ESX server to an HA-enabled cluster, then in case of a physical failure all the virtual machines running on that physical server will be restarted on the other servers, using the available resources.

    3. So, using the VM HA options, you can set a priority so that HA restarts the VMs with High priority first, Medium next and Low last.

    4. After restarting all the High and Medium priority VMs on the new ESX server, HA will try to calculate the remaining resources in the cluster using DRS and initiate VMotions of some virtual machines (it shuffles the VMs between the hosts), so that resources may become available on some server. After that, it will try to restart the Low priority VMs on the server with resources available.

    5. If, for example, there are no resources available and there is a high priority virtual machine, it will try to shut down one of the low priority VMs already running on the server and then restart the high priority VM. These things happen when you select the automation level Fully Automated (in the wizard when you configure HA for the cluster).

    6. On servers running in production, there will be a monitoring tool installed on each and every server. So the admin learns about these kinds of things by getting alerts, warnings and errors on his screen, through email, etc.

    7. The last thing is that you must plan your environment so that it has sufficient resources in case of failures. This is what we call a contingency plan, in other words a backup plan. That is the best solution, because you are spending money to purchase only one server, not 100.

    Good Luck.

    Hope this is informative for you. Mail me at charan@isupportyou.net if you have any questions like these.

    Regards
    Charan
    @iSUPPORTu

  3. Hi Charan,

    Very explanatory. I am a newbie. I have a question for you.

    What does "restarting" a VM mean? Does it have any impact on the user?

  4. Hi charan,

    Thanks for sharing your knowledge, it's helpful. I have one doubt.

    I have enabled HA in a cluster, but not DRS. The cluster has 3 hosts, each running 5 instances.

    Suddenly one host went down and is not accessible. At this point HA will restart its VMs on another host.

    The problem host is still not up. During this time, will the VMs be accessible or not?

    FYI: DRS is not enabled.

  5. If they were able to restart on the other hosts, they will absolutely be accessible. If for some reason there are no resources available on the other hosts, HA is unable to restart the VMs. If HA finds that resources are available on the other hosts, the VMs will be restarted according to the preset priority conditions. That's all. Mail me at charan@isupportyou.net if you have any further questions.

    thanks
    charan

  6. thanks...Charan.

    Detailed explanation....very useful for me.

  7. Very good explanations ..... really understandable and simple ....

  8. Hi Charan,

    Greetings of the day !!

    I have one doubt. Can you please clarify it?

    Suppose I have one vCenter Server (with, say, 3 ESX hosts) that is configured for HA. One of the ESX servers has lost its connectivity due to a power failure, so this ESX host is no longer on the network and has no power at all. Can the other ESX servers still start the VMs that were on this failed host? Can you please explain what would happen in this case: will my VMs still come up on the other two ESX hosts? If yes, then how?


    Hope you got my question.

    Waiting for your answer.

    Thanks in advance :-)

  9. Hi Charan,

    Thanks for the information. I am a fresher in VMware, so I would like to ask which training center is the best in the Pune location.

    Thanks,
    shiva

  10. Excellent, Charan, you made me very interested in learning HA.

  11. Charan, you are an excellent guy.
    Very simple and clear information.
    You are an awesome blogger...

  12. Simple !! and easily understandable !!!

    Thanks,
