Wednesday, 14 May 2008

Get rid of that pesky VMware Tools update notification...

I have enough issues with the "VMware Tools out of date" notification appearing in the VI client every time an ESX patch is applied... it's almost useless information, as generally there are no support issues with running the RTM version of Tools regardless of the patch level of the host.

But even more annoying is the default behaviour of the VI 3.5 version of Tools: it shows a visual notification in the systray (on Windows guests) whenever an update is available, controlled by the checkbox below:


Too bad if you happen to certify particular driver (i.e. Tools) versions with corporate SOE versions, like every large enterprise on this planet does. And say goodbye to your standards when a bleary-eyed admin sees this little yellow exclamation in the systray at 2am and gets the idea that upgrading VMware Tools may solve whatever problem they got woken up for. Hopefully they'll remember to raise a retrospective change request after some sleep. I won't even begin to imagine what the curious VDI user might do.

Fire up your registry monitoring tool of choice and clear the checkbox, and you will invariably be directed towards a modification in HKCU, meaning if you want to effect a machine wide change you would need to load the default user hive and mod the value in there, as well as the all users hive.

The good news is that you can more easily control the display on a machine-wide basis by modifying the (default) value of 'HKLM\SOFTWARE\VMware, Inc.\VMware Tools'. Setting it to a DWORD value of 0 is the equivalent of clearing the checkbox (yes, I know by default it's a REG_SZ - just turn it into a DWORD).

For anyone out there even half as lazy as me, copy this into your install script after the tools installer has been run:

REG ADD "HKLM\SOFTWARE\VMware, Inc.\VMware Tools" /V "" /T REG_DWORD /D "0x0" /F

Tuesday, 13 May 2008

What VMware Site Recovery Manager isn't...

Straight up front - this is not a cynical post. My main point is NOT that SRM has some kind of product or design flaw. The reason for such a post is that there will be many people who will write about what SRM does offer, so I thought I'd balance it a little... to help people keep sight of the fact that it is not a panacea (not that it's purporting to be, but the marketing hyperbole is hardly going to point out why you need additional BCP / DR products). Personally I consider SRM a necessity, for the mere fact that keeping those BCP / DR VMs offline will save a fortune in system administration overheads associated with having them online, which are easily the biggest chunk of TCO. Enough of the disclaimers, onto the meat of the post!

When you think about why you would invoke a DR plan in the virtual world, it pretty much boils down to 2 things:

1) Catastrophes, like an entire datacenter or array outage.

2) Configuration errors that can't be recovered from within the application's RTO

Point 1 is obviously what SRM is designed to address; it is called Site Recovery Manager after all.

Point 2 however, is not what SRM can / should be used for, and one would certainly hope that configuration errors, like an OS or application patch that breaks something or a change request gone wrong, are much more probable than catastrophes.

Of course there are a number of ways you can address point 2. Snapshots can go some way towards it, but that can be very difficult in large enterprises where the VMware admins may not know about application changes in time to take the snap beforehand. You could schedule regular snaps and merges, effectively keeping VMs continuously in a snapshotted state, but I seem to recall something about SCSI reservations being used by VMFS to do metadata updates... stuff like extending a snapshot file when it gets written to - if you've got 20 VMs on a LUN that simultaneously kick off a virus scan which writes to a log as well as reads the entire filesystem, that might have some implications. Regular image level VCB backups could be used to similar effect, but you probably don't want to use SRM and take images of the entire virtual infrastructure. And as there's not really an elegant interface to track and manage specific VM image backups via VCB (at least not that I have seen), there's definitely room for the 3rd party tools that offer scheduled / asynchronous replication of an online production VM to an offline DR partner. If anything, SRM makes the need for their products more obvious.

So it's probably worth keeping the above in mind if you're coming up with a business case for putting SRM into your environment... include the need for a single VM recovery solution as well if you don't have one already, to save getting caught out and then having to explain why that DR application you spent all that money on actually isn't the be all and end all.

Sunday, 27 April 2008

SAAAP!? VDI - Why is Storage Such an Issue?

A rather contentious title for a post, but there are a few things on the storage front that have been bugging me lately. The body of this post will probably have very little to do with storage, just bear with me on this little derailed train of thought.

Most of what we see around the interwebs lately on the storage front seems to be targeted at the storage cost of VDI. Single instancing, de-duplication, linked cloning, disk streaming... you all know what I'm talking about. But is this storage issue best solved by the storage vendors?

Maintaining state on the endpoints is the reason why we need so much storage for VDI. Even with some jedi mind trickery on the storage side to reduce duplicate OS files, there's still the applications. Which could be addressed to a similar degree with the same tricks and some kind of de-duplication, but I honestly don't think the technology is there yet. Not when Microsoft is releasing security patches on a monthly basis. The de-dup functionality from the likes of NetApp is way cool no doubt, but it's a post-processing operation. And while it may be pure philosophy on my part, I just can't see that approach scaling too well - prevention is better than cure, after all. And while de-dup via post processing is certainly a more viable option than what is currently available in a pre-processing engine, all I can say is keep an eye on that space.

Another kind of de-dup I'm sure we'll see bandied about more is Citrix Provisioning Server, the rebranded Ardence product. But again, while very, very cool, it's not very practical currently due to the non-persistence of the cache. One reboot and you're back to square 1 - which is great for a grid compute farm, but not so great for VDI.

Application virtualisation obviously goes a long way towards solving this problem. VMware knows this, otherwise the Thinstall acquisition wouldn't make much sense. With app virtualisation and presentation virtualisation, we're finally starting to address the storage problem. But a major piece is still missing - environment virtualisation.

Environment virtualisation is something that people in the presentation virtualisation space (read: Citrix folk) are very familiar with. I'm not talking about simple profile redirection or mandatory profiles - I'm talking about real environment virtualisation. The kind of thing that allows you to log on to a desktop, fire up Excel, then log on to a VDI machine on the other side of the world and open Excel on that, then fire up another Excel session from a presentation server in New York, and then another one from a presentation server in Sydney, and have all your application preferences individually delivered and saved accordingly. The kind of technology that streams your profile to wherever it needs to go. Mark it for replication around the globe or don't - the choice is yours.

Which leads me inexorably back to the title of this post. Why is storage such an issue for VDI? If the endpoint was stateless, would this issue still exist? In such a world, could we use, dare I say it, local storage on a large scale? Does this have greater implications - with a fully virtualised (machine, app, and environment) stack, could Microsoft finally have a leg to stand on with the "Quick Migration vs VMotion" debate? Comments are open - what do you think?

NOTE: I changed the title of this post after finishing it and realising the original title was just plain wrong. So apologies if you came here via an RSS feed!

SAAAP!? //Sunday Afternoon Architecture And Philosophy

I really wanted to keep this blog strictly technical, but alas I'm going to have to indulge in some ranting philosophy on at most a weekly basis. Hence a new post category, which is a rather poor wordplay on the abbreviated form of "what's up". I couldn't think of anything classier unfortunately.

But with a GTA IV PS3 pack on the way to my place next week and GT5 and MGS4 only a few months away, I'm sure this little excursion won't be too painful for you all - I'll probably be lucky to get something out once a month!

Sunday, 20 April 2008

First Official VI Client Plug-in Document Released - round of applause for VC architects / devs!

OK, so I've been giving the VC devs a _bit_ of a hard time lately. So just to show I'm as quick to dish out praise as I am to dish out flames, I wanted to say damn good job on the VI plugin architecture!

I have been eagerly awaiting some official documentation surrounding VI plug-in development (both client and server), ever since Andrew Kutz's amazing job of reverse engineering the framework. I guess the document release probably got lost in the publicity surrounding the Virtual Disk Development Kit, which was released a few days before it.

Personally I think the whole VI plugin idea is nothing short of a stroke of genius. What better way to combat the potential for the likes of Microsoft to relegate VirtualCenter to the background by building full VMware infrastructure management into SCVMM, than to open up the VI management framework in such a way? And the architecture... as much as I might accuse VMware's tools of lacking enterprise scalability from time to time, they've absolutely nailed it this time. I mean, look at this (from the document):



Outstanding! On the coding side, it looks so straightforward that even a hacker like me should be able to put something together fairly easily... running off the example in the document, you could pretty easily add a context menu item to ESX hosts that fires up the out of band management web interface (i.e. the iLO page for HP kit) of that particular host.

Hopefully it won't be too long before the framework moves out of experimental status, and we start seeing 3rd party tools leverage the functionality. I've already seen the Xsigo plugin and it looks great.

Download it here

Friday, 18 April 2008

VirtualCenter 2.5 Update 1 Upgrade Process - Not So Sucky After All!

EEEK! In my original post on the VC 2.5 Update 1 upgrade process, I incorrectly stated that the VC database user needs full sysadmin on the database server for the upgrade, and pondered why that was the case. Well it turns out that is entirely NOT the case at all!

Had I RTFM, I would have seen that the VC database user just needs the db_owner role on the MSDB system database, not sysadmin on the entire server. This is apparently a requirement for the SQL Agent job creation. Curiously, although the documentation states this is a requirement for a clean install as well, you only get the permissions error if you run the interactive installer. If you grant the VC database user only the db_datareader, public and SQLAgentUserRole roles on the MSDB system database (I work in the finance industry - least privilege is important!) and then run the installer silently, you don't get the permissions error and the SQL Agent jobs are still created!
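A rough T-SQL sketch of that least-privilege setup, for illustration only. It assumes SQL Server 2005, that the VC database login already exists on the server, and that it is called 'vcuser' (a made-up name, substitute your own). SQLAgentUserRole is the name of the fixed SQL Agent role in msdb on SQL Server 2005; membership of public is automatic, so it needs no statement.

```sql
-- 'vcuser' is a hypothetical login name; replace it with your VC database login.
USE msdb
GO

-- Map the existing server login to a user in msdb.
CREATE USER [vcuser] FOR LOGIN [vcuser]
GO

-- Grant only the two non-default roles; every user is already in public.
EXEC sp_addrolemember 'db_datareader', 'vcuser'
EXEC sp_addrolemember 'SQLAgentUserRole', 'vcuser'
GO
```

For the documented approach, the equivalent would be a single sp_addrolemember call for 'db_owner' in place of the two calls above.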

Thanks to John over at the VMTN blog and the VMware support guys for sorting me out on that one!

Thursday, 17 April 2008

Veeam Backup Install with Remote Database

Even with VMware Consolidated Backup and the release of VMware Site Recovery Manager around the corner, there is still a (gaping) hole left by these 2 products - the ability to fail over a single guest to a regularly sync'ed and otherwise offline DR partner.

In the physical world there were 2 main reasons to invoke DR for a single node - physical hardware failure, or a configuration error that couldn't be recovered from fast enough. Virtual hardware obviously doesn't fail however (I know, I know, the underlying host could fail, but you've all got HA enabled clusters to allow quick recovery from that, don't you!), which leaves us pretty much with the configuration error problem. Whether it be due to OS or application patching, a change request gone pear shaped, rogue developers or administrators, whatever... the probability of a configuration error causing an outage is much greater than that of a datacenter outage (one would hope!). There is definitely a need to offer such a capability, as companies like Platespin (or Novell now I guess), Veeam and Vizioncore know all too well.

Veeam Backup is something I've been messing around with lately, and it certainly impresses in a number of areas, like being able to manage the inventories of multiple VirtualCenters and backup / replicate VMs or ESX host files between them from a single interface. I'm still waiting for someone in this space to eliminate the need for direct ESX host access though, but I digress...

NOTE that the following is pure hackery, and should not be used in production environments!

Anyway, on to the topic of the post. By default, Veeam Backup installs a SQL 2005 Express instance named 'VEEAM' on the box. Thankfully, they have built their product with enterprise scalability in mind (rare for a version 1 product from a small, relatively new software company), so you can point it at a remote database instead - you just need to know how to configure it. Which would be as follows:

1. Extract the files from the Veeam Backup single install binary.

2. Run 'setuplight.exe'. Make sure you have a generic domain account ready to use for the Veeam Backup service.

3. Go to the install directory, and modify the 'dbconfig.xml' file to point to your remote database host and instance.

4. While you're in the install directory, copy the 'DBcreate.sql' file to the remote database host.

5. On the remote database host, create a login for the generic domain account used in step 2.

6. Create a database named 'VeeamBackup' and set the owner to be the generic domain account login created in step 5. Don't just grant the login the db_owner role - the login should be directly mapped to the dbo account for the VeeamBackup database.

7. Open up the 'DBcreate.sql' file you copied over in step 4, add the following at the start of the file and execute it in Query Analyzer:

USE VeeamBackup
GO

8. Now jump back on the box where Veeam Backup was installed, and fire up the GUI - you're done!
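Steps 5 and 6 can also be done from a query window rather than the GUI. The following is only a sketch under assumptions: the remote instance is SQL Server 2005 or later, and the service account is called CORP\svc-veeam (a hypothetical name, use whatever account you picked in step 2).

```sql
-- CORP\svc-veeam is a hypothetical account; substitute the one from step 2.
USE master
GO

-- Step 5: create a Windows login for the service account.
CREATE LOGIN [CORP\svc-veeam] FROM WINDOWS
GO

-- Step 6: create the database...
CREATE DATABASE VeeamBackup
GO

-- ...and make the login the database owner, so it maps directly to dbo
-- instead of merely being a member of the db_owner role.
ALTER AUTHORIZATION ON DATABASE::VeeamBackup TO [CORP\svc-veeam]
GO

-- Optional check: the owner column should show the service account.
SELECT name, SUSER_SNAME(owner_sid) AS owner
FROM sys.databases
WHERE name = 'VeeamBackup'
GO
```

ALTER AUTHORIZATION is used here because it changes the owner itself, which is what step 6 asks for; adding the account to db_owner would leave dbo mapped to whoever created the database.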

Video coming soon... in the meantime, go grab the bits and check it out for yourselves!

UPDATE: The good folk at Veeam got in touch regarding this post... turns out I inadvertently stumbled upon something they're planning for a future Enterprise focused release. So don't deploy this configuration in your production environments just yet, and keep an eye out for the upcoming Enterprise product release. I'll be sure to run that through its paces too :-)