As much as I like VMware, I’m really struggling with why the View team and the vSphere team can’t seem to get on the same page. They just released both 5.1 products and then immediately came out with a warning that View and vSphere weren’t compatible, without saying why. I get it when you have to work with another company and it takes a while to get updates done, but when you work in the same company you’d think they would be more in sync than an outside partner. #Frustrating!
VMware vSphere Replication… nice add but can be a challenge
So over the past month or so my team and I have been working on setting up VR (vSphere Replication), which is part of SRM (Site Recovery Manager). We found this is the only way we can have a replicated solution for our non-3PAR storage connected VMs. We use NexSan for things that don’t need fast storage, but NexSan doesn’t have a replication technology built into the current products. I hear that is changing, but for our environment that doesn’t really help… So I thought I’d help out by putting together some of my thoughts from our installation.
The product is deployed from the SRM interface as a built-in feature, assuming you installed it when you deployed SRM in the first place. There is a point during the SRM install where it asks whether you want to install vSphere Replication. If you didn’t, you’ll have to go back and install that part to use it. Once installed it’s actually pretty easy to deploy. From within SRM there is a vSphere Replication group on the left, where the Sites, Array Managers, etc. groups are. The Getting Started pages are actually pretty helpful for setting it up (and all of SRM for that matter), so you can just follow them and have it deployed fairly quickly. If only that was all you had to do, how easy life would be.
The steps are below:
- Step one is you have to deploy a VRM (VRMS) server, which is a vSphere Replication Management Server. They reference it as VRM in some places and VRMS in others.. go figure. This is an appliance deployment from an OVF file, and the wizard is straightforward, asking you for the VM name (the default is a really long word, so if you’re like me you’ll want it to be named like a server). Give it the name you want, if other than the default, then click next.
- Now select the cluster this will be deployed to. I’d suggest placing it and the other servers that are part of the solution on the destination cluster you’ll be migrating VMs over to.
- Next select the LUN to place the VM on.
- Select your disk format (thin, thick).
- Next the network needs to be specified.
- Here is the customization for the appliance. Give it a password for root (it runs SUSE 11 Linux). Give it a default gateway and DNS IPs (comma separated for more than one), then the IP, and finally the subnet mask. Notice there is no place to put the DNS suffix search order for the domain, so it can’t connect to things like SQL by short name. This is where one of my annoyances comes from. I’ll elaborate more on that later. One other note: the wizard doesn’t sanity check your inputs, so if you mess up the IPs it’ll take whatever you give it. It does make sure you put the password in correctly twice.
- Next you tell it to configure this server as an extension. Oh but wait! You have to cancel the wizard and go into the Runtime settings of vCenter and specify a management IP that this can talk to. Seems a little redundant, and why on earth are they not using a DNS NAME, for god’s sake? That was the next thing that got under my skin about this setup. VMware seems to have developed a bad habit of having developers that don’t talk to each other. One team puts a part of the product out that uses DNS happily, the next one uses DNS in one part but IP in another, and I’ve seen yet others that only use IP addresses. Get some consistency going on guys!! I love VMware and advocate it loudly. I’m even working on a book about vSphere 5, but I also will point out things I think they are screwing up (you’ve probably figured that part out LOL). Anywho, unless you already had the mgmt IP set, go set it, then start the wizard over to get back to here. It will give you a green check if it’s happy and let you click next.
- Click finish to deploy. Once the OVF finishes deploying you’ll see it turn on the VM, configure it, and register it with vCenter. On the summary tab you’ll know it’s done when it says “Connected” next to status. The next set of commands and setup can’t be done until it’s up and talking with SRM anyway.
- Once it’s talking you’ll be able to do the next part, which is configuring the VRM server. Wait, didn’t I just do that? Alas no, not everything was part of the wizard. When you click the Configure VRM server link a browser will pop open and ask you to log in. The login is root and whatever password you gave it. Assuming that all worked you should get a VMware Studio interface like all the recent appliances they’ve put out. This web page has a getting started screen too, so you can use it, but all it does is shoot you over to the various tabs, so I’m not sure how much good it really serves other than giving directions. Go to configuration. In case you didn’t already read up on how this is set up, the product can use MS SQL, Oracle or DB2 for its database. I chose SQL. Select Manual Configuration. Make sure you have an account that has dbo rights to the database, as it needs them to add the tables to the database. You specify your DB host, the port, user, password, and database name. Now here is where one of those pesky annoyances came into play. Since you can’t give this a DNS suffix you can’t just give it a short name, so make sure you use an FQDN for your SQL server. Don’t bother trying to modify the resolv.conf file manually; anytime the application is restarted it overwrites resolv.conf with just nameserver entries. I tried to get real cute and set resolv.conf to read only and it reset that too :$ As long as you use either an IP or an FQDN you’ll be fine and won’t run into that problem. Moving on…. Why on earth you have to put in the IP address of the box you are CONFIGURING NOW I’ll never understand, but you do. Put the IP of this box into the VRM Host box. You can leave the name alone or change it. This shows up in the SRM interface. Now specify the vCenter address (FQDN), port, username, password, and server team email address. Once done click Save and Restart Service and cross your fingers.
A little-known tidbit I didn’t find anywhere in the documentation is that the vCenter server address you specify has to match how vCenter itself is configured. So hopefully you set up vCenter with an FQDN (which I didn’t). Otherwise you’ll end up having to go change that so they match. Another stupid annoyance: why does it care, as long as it can talk to it…. Assuming it worked you can now go to the Security button. The only thing you can do here is change the password, but if you set it in the wizard why would you want to? Now in the network tab you can set a normal hostname for the box so it doesn’t show up as “localhost.localhost”. Otherwise the rest can be left alone. Assuming all worked well you can go back into SRM.
- Rinse and repeat the above steps at DR. If you’ve already set up SRM you know you have to have it set up at both sites on separate servers. So once you deploy the VMs at DR you can come back to this screen and continue.
- Waiting for that to be done …
- Ok, now you can configure the VRMS connection (see, here is a point where they change the terminology). When you configure the connection you’ll be telling it which VRM boxes will be used, authentication, etc. It’s pretty straightforward (mine is screwing up so I can’t walk through it at the moment; I’ll update this once it’s fixed).
- Now you can deploy a VR server. These appliances are treated like agents, similar to how vShield works, so I was instructed to put one on each host. You can do whatever you want to do, that is just what I was told. Deploying a VR server uses a similar method to the VRM/VRMS server. It will follow an OVF wizard and deploy a VM (or several if you do one per host). I would deploy a few of them because the replication jobs will sit on these boxes, so you may want to spread the load out. I’ve noticed that when replication is active, even a single VM being replicated pegs out the resources of the appliance. I ended up with 8 (one on each host) because I have that many VMs I want to replicate. Now, you only need VR servers at the destination location. You don’t need them on your prod site unless you want to be prepared for reverse replication after a disaster, when you are ready to move things back to prod. That’s up to you how you handle that one. Once the VR server is deployed you have to wait for it to come online. If you click Register VR server (the next step) and it’s not up yet, you won’t see anything to select. Once it’s up and has registered, that server will show up as selectable in the register screen. When you register it you’ll be prompted to accept the certificate, since it’s a self-signed cert. Once you do that it’ll process the request and the VR server will show up under the name of the VRMS box you specified in its setup. You can configure the VM similar to how you configure the VRM server, but you have to go to it manually. The URL is https://ip_of_VR:5480. The password for this box is always vmware, and I haven’t really been able to change it. Honestly, I got so frustrated with it that I just didn’t want to screw with it anymore, so changing it may well work ok. If I figure that out I’ll amend this with that information.
- Now you are ready to configure a VM for replication. You go into VMs and Templates, find the VM you want to replicate, right click on it, then select vSphere Replication. This will launch a wizard that will step you through where to put the VM placeholder, where you want the disks replicated, etc. It’s also pretty straightforward. You’ll want to have destination volumes already on the DR site to replicate the VMDKs to. Once you have that all set up you can go into your protection groups and add these VMs to one, and then to a recovery plan. It ties together pretty nicely.
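Since the OVF wizard takes whatever network values you give it, and the appliance can’t resolve short names, it can be worth checking your inputs before you click through. Here is a minimal Python sketch of that idea. The function names and the sample addresses are my own for illustration and aren’t part of any VMware tool; it only uses the standard library `ipaddress` module.

```python
import ipaddress


def check_appliance_network(ip, netmask, gateway, dns_servers):
    """Return a list of problems found in the wizard's network inputs."""
    problems = []
    try:
        # strict=False allows a host address instead of the network address
        network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    except ValueError as exc:
        return [f"bad IP or subnet mask: {exc}"]

    try:
        if ipaddress.ip_address(gateway) not in network:
            problems.append(f"gateway {gateway} is not in {network}")
    except ValueError:
        problems.append(f"bad gateway address: {gateway}")

    # The wizard takes DNS servers as one comma separated string.
    for entry in dns_servers.split(","):
        entry = entry.strip()
        try:
            ipaddress.ip_address(entry)
        except ValueError:
            problems.append(f"bad DNS server address: {entry}")
    return problems


def is_short_name(host):
    """True if host is neither an IP address nor a dotted name."""
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        return "." not in host


if __name__ == "__main__":
    # Hypothetical values; substitute your own.
    print(check_appliance_network("10.1.1.50", "255.255.255.0",
                                  "10.1.1.1", "10.1.1.10, 10.1.1.11"))
    print(is_short_name("sqlprod01"))
```

An empty list from `check_appliance_network` means nothing obviously wrong was found, and `is_short_name` returning True for your DB host is a hint to swap in the FQDN or the IP before saving the VRM configuration.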
Some points to be aware of. If you have a VM with more than 2 or 3 VMDKs, the wizard apparently doesn’t like that and will crash. If it does, then when you try a second time it tells you it’s already done. The actual message is “The object has already been deleted or has not been completely created”. The only way I’ve found to fix it is to shut the VM down, detach the VMDKs from the VMX config, delete the original server entry (the VMDKs are safe since they aren’t attached), then create a fresh VM (thereby creating a new vmx file) and attach the VMDKs to it. Then you can try it again, but in my experience it has blown up at the same point each time, and so far there is no rhyme or reason to it. I’m planning on calling support about it but just haven’t had time yet. But I can say I have 2 VMs with 8 VMDKs each and they both do the same thing at the same place. That can’t be a coincidence….. Another annoyance.
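If you want to know which VMs might hit this before you run the wizard, a quick filter over your disk counts does the trick. This is only a sketch: the dictionary of VM names to VMDK counts is made up (you’d pull it from your own inventory export), and the threshold of 3 is my guess based on what I’ve seen, not a documented limit.

```python
def vms_at_risk(vmdk_counts, max_disks=3):
    """Return the names of VMs with more than max_disks VMDKs, sorted by name."""
    return sorted(name for name, count in vmdk_counts.items()
                  if count > max_disks)


if __name__ == "__main__":
    # Hypothetical inventory: VM name -> number of VMDKs attached.
    inventory = {"sql01": 8, "web01": 1, "file01": 8, "app01": 3}
    for name in vms_at_risk(inventory):
        print(f"{name}: may crash the replication wizard")
```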
I hope this is helpful and I’ll update it as I find more out.
Nexus 1000v
So in my research for my book, vSphere Performance, I’m finding out that the Nexus 1000v has some major advantages over the other switch types. However, in vSphere 5 there appears to be a possible bug that makes VMs disconnect when they vMotion from host to host. Still working with support on why.
Writing a book
Is far different than I thought it would be. You can’t be yourself as much as you’d like, and you have to edit edit edit…. Not what I expected..
Lost it all….
So sometimes when you think you have a good backup of your blog you find out the hard way that you don’t.. My hosting provider disappeared sometime in the last few weeks and I didn’t notice, and I can’t find a backup, so now I have to try to remember what all I’ve blogged about :$