Playing IT Fast and Loose

It’s been a long time since I’ve been at work from dusk ’til dawn. I’m not saying that I’m the reason we have such fabulous uptime; there are a lot of factors that play into it. We’ve got a well-rounded NetOps team, we try to buy decent hardware, we work to keep everything backed up and we don’t screw with things when they are working. And we’ve been lucky for a long time.

It also helps that our business model doesn’t require selling things to the public or answering to many external “customers”. That puts us in the interesting position where it’s almost okay if we are down for a day or two, as long as we can get things back to pretty close to where they were before they went down. It also sets us up to make some very interesting decisions come budget time. They aren’t necessarily “wrong”, but they can end up being awkward at times.

For example, we’ve been working over the last two years to virtualize our infrastructure. This makes lots of sense for us – our office space requirements are shrinking and our servers aren’t heavily utilized individually, yet we tend to need lots of individual servers due to our line of business. When our virtualization project finally got rolling, we opted to use a small array of SAN devices from Lefthand (now HP). We’ve always used Compaq/HP equipment and we’ve been very happy with the dependability of the physical hardware. Hard drives are considered consumables and we do expect failures of those from time to time, but whole systems really biting the dust? Not so much.

Because of all the factors I’ve mentioned, we made the decision to NOT mirror our SAN array. Or do any network RAID. (That’s right, you can pause for a moment while the IT gods strike me down.) We opted for using all the space we could for data and weighed that against the odds of a failure that would destroy the data on a SAN, rendering the entire RAID 0 array useless.
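
To put rough numbers on "the odds", here is a minimal sketch of how the risk grows when an array has no mirroring. It assumes each SAN node fails independently with the same probability over some period, and that losing any one node takes the whole striped array down with it. The 2% figure and the node counts are made-up illustrations, not measured failure rates for our hardware or anyone else's.

```python
def p_any_node_fails(p_node: float, nodes: int) -> float:
    """Chance that at least one of `nodes` independent units fails,
    when each one fails with probability `p_node` over the same period."""
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be between 0 and 1")
    if nodes < 0:
        raise ValueError("nodes must not be negative")
    # All nodes survive with probability (1 - p_node) ** nodes;
    # the array is lost in every other case.
    return 1.0 - (1.0 - p_node) ** nodes


if __name__ == "__main__":
    # Hypothetical per-node failure probability, for illustration only.
    for nodes in (1, 2, 4, 8):
        print(nodes, round(p_any_node_fails(0.02, nodes), 4))
```

The point of the sketch is only that every node added to an unmirrored array raises the chance that something takes all of it offline.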

Early this week, we came really close. We had a motherboard fail on one of the SANs, taking down our entire VM infrastructure. This included everything except the VoIP phone system and two major applications that have not yet been virtualized. We were down for about 18 hours total, which included one business day.
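
For a sense of scale, the arithmetic below turns that outage into an availability figure. It is a sketch that assumes the 18 hours were the only downtime in a 365-day year, which is an assumption for illustration and not something we track this way.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability(downtime_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Fraction of the period during which systems were up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must fall within the period")
    return 1.0 - downtime_hours / period_hours


if __name__ == "__main__":
    # 18 hours of downtime, assumed to be the only outage that year.
    print(f"{availability(18) * 100:.2f}%")  # prints 99.79%
```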

Granted, we spent the majority of our downtime waiting for parts from HP and planning for the ultimate worst – restoring everything from backup. While we may think highly of HP hardware overall, we don’t think very highly of their 4-hour response windows on Sunday nights.  Ultimately, over 99% of the data on the SAN survived the hardware failure and the VMs popped back into action as soon as the SAN came back online. We only had to restore one non-production server from backup after the motherboard replacement.

Today, our upper management complimented us on how we handled the issue and was pleased with how quickly we got everything working again.

Do I recommend not having redundancy on your critical systems? Nope.

But if your company management fully understands and agrees to the risks related to certain budgeting decisions, then as an IT Pro your job is to simply do the best you can with what you have and clearly define the potential results of certain failure scenarios.

Still, I’m thinking it might be a good time to hit Vegas, because Lady Luck was certainly on our side.

Upcoming Tech Events in 2011

Looking to fill your calendar with some free or low cost tech events in early 2011?  Consider some of these:
  • TechNet Events Presents: Virtualization 101 – Microsoft Evangelists will talk about the creation of the hypervisor and demonstrate usage scenarios ranging from the home user up to multinational corporations. Discussions will also include how virtualization has given rise to “the Cloud”. The event is free and will be in San Francisco on 2/2/11, but check the list for dates in Los Angeles, Irvine, Denver, Portland and other locations on the west coast.
  • Data Connectors Tech-Security Conferences – Just like the one-day event I attended a few weeks ago, Data Connectors will be all over the west coast in early 2011.   In particular, find it in San Jose, CA on 2/10/11.
  • She’s Geeky unConference – For all those women who embrace their geekiness, save the date for “She’s Geeky Bay Area #4” running January 28-30th. 
  • Register by 1/21 and snag a free Expo Only pass to the SPTechCon (The SharePoint Technology Conference) in San Francisco February 7-9th. The full event doesn’t fall into the “low cost” category, but if SharePoint is your thing, you might want to consider more than just the expo.
  • RSA 2011 – Another one of my favorites, the “Expo Plus” pass at RSA gets you into the expo hall, the keynotes and one conference session of your choice. RSA will be at the Moscone Center in San Francisco, February 14-18th. 
Plan your time well and you won’t have to be in the office for much of the first quarter! 

Virtualized Domain Controllers? I’ll pass, thanks.

There are a lot of good arguments for virtualizing DCs. You should have several of them for redundancy, but depending on the number of employees and general work load, DCs tend to be underutilized and it can be hard to warrant having a whole physical server for each one. But after losing a second domain controller following what was essentially basic VM maintenance, I’m not sold.
You may remember a previous post of mine from the summer of 2009 about NTDS Error 2103, when the DCs in a small child domain were virtualized. I had agreed to virtualize both DCs from that domain as the domain was not supporting any user accounts and had less than a half dozen servers as members. One did not convert well and we decided to just leave the remaining DC as the sole one standing for that domain after vetting out the risks. There are several “rules” to follow when virtualizing DCs, particularly not restoring snapshots of them and not putting yourself in the situation where your VM host machine needs to authenticate to DCs that can’t start up until your host authenticates.
Fast forward about 16 months, to now.  Our system administrator who handles the majority of our ESX management was migrating many of our VMs to our newly installed SAN.  He reported that he shut down the DC normally, moved the VM and then started it back up a few hours later after all the server files had been copied over. The few servers that use that DC were working properly and everything looked good. 
But alas, a few weeks later, the server reported a USN rollback condition. Replication and netlogon services stopped.  I checked the logs to see if I could figure out the cause, but only saw things that added to the confusion.  The DC was mysteriously missing logs from between the time of the VM relocation and the time of the NTDS error.  And the forest domain controllers had logs indicating it had been silent for nearly 2 weeks. At this point, I can only speculate what went bad.
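
For anyone unfamiliar with the term, a USN rollback broadly means a DC's database has slipped back to an older state while the DC still presents the same identity to its replication partners. The toy model below sketches that idea only. The class, field and function names are hypothetical, the numbers are invented, and this is not how Active Directory implements the check internally.

```python
from dataclasses import dataclass


@dataclass
class PartnerRecord:
    """What a replication partner remembers about a source DC (simplified)."""
    invocation_id: str     # identifies the source DC's database instance
    highest_usn_seen: int  # highest update sequence number already received


def looks_like_usn_rollback(record: PartnerRecord,
                            source_invocation_id: str,
                            source_highest_usn: int) -> bool:
    """True when the source claims the same database instance but reports
    a USN lower than one the partner has already received from it."""
    same_instance = source_invocation_id == record.invocation_id
    return same_instance and source_highest_usn < record.highest_usn_seen


if __name__ == "__main__":
    partner_view = PartnerRecord(invocation_id="db-instance-A", highest_usn_seen=5000)
    # Same database identity, but the counter has gone backwards.
    print(looks_like_usn_rollback(partner_view, "db-instance-A", 4200))
    # A properly restored DC gets a new identity, so old USNs are not compared.
    print(looks_like_usn_rollback(partner_view, "db-instance-B", 4200))
```

In this simplified picture, the trouble is that the partners believe they already have everything up to the higher number, so changes the rolled-back DC makes with reused numbers never replicate out.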
We slapped a bandage on the server by restarting netlogon so those few servers could authenticate, but without replication happening properly, the server will simply choke up again. And after the tombstone lifetime passes, the forest domain will consider it a lost cause.  It’s essentially a zombie.
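
The tombstone clock is simple date arithmetic, sketched below. The dates are invented, and the 180-day lifetime is only an assumed value for the example; the real number is a forest-level setting, so check your own before relying on it.

```python
from datetime import date, timedelta


def tombstone_deadline(last_replication: date, tombstone_lifetime_days: int) -> date:
    """Date after which partners treat the silent DC as a lost cause."""
    return last_replication + timedelta(days=tombstone_lifetime_days)


def days_remaining(last_replication: date, tombstone_lifetime_days: int, today: date) -> int:
    """Days left before the deadline; negative once it has passed."""
    return (tombstone_deadline(last_replication, tombstone_lifetime_days) - today).days


if __name__ == "__main__":
    # Hypothetical dates and an assumed 180-day tombstone lifetime.
    last_good = date(2010, 12, 1)
    print(tombstone_deadline(last_good, 180))
    print(days_remaining(last_good, 180, today=date(2010, 12, 15)))
```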
So begin our final steps to decommission that child domain. I have no interest in restoring that domain from backup, since removing that domain has been an operations project that has been bumped for a long time. Now our hand has been forced and the plan is simple. Change a couple of service accounts, move 2 servers to join the forest root domain and then NTDSUTIL that DC into nothingness.
As for our two forest root domain controllers?  I’ll throw my body in front of their metal cases for a long time to come.

My TechEd Session Wish List

Had a great time at TechEd this year, do not get me wrong. But like all the other conferences of the past, there is often too much good stuff to get it all in.
This year, just about all the breakout sessions are available online. While some may think this reduces the value of actually attending the conference, I disagree. The more intimate sessions, like Birds-of-a-Feather and the “Interactive” style sessions were not recorded. So when I could, I attended those sessions over the traditional breakouts, chatted with Microsoft experts in the TLC areas, or spent time networking with others in the Expo and Community Lounge.
If I could have tailored TechEd to fit my schedule and I had more than 4 days, here are the sessions I would have attended. I did get to a few of them during the conference; they are marked with a (*). Since it will probably take me a while to view all the ones I missed, if you caught one of these and it’s especially good or bad, comment and let me know!
Management Track
MGT314* – Technical Introduction to Microsoft System Center Essentials 2010
Office & SharePoint
OSP314* – Microsoft Outlook and Exchange 2010: Better Together Overview
OSP208 – Microsoft Office 2010 for IT Professionals
OSP203 – (SharePoint) Designing Governance: How Information Management and Security Must Drive Your Design
Security, Identity & Access
SIA333 – Useful Hacker Techniques: Which Part of Hackers’ Knowledge Will Help You in Efficient IT Administration?
SIA230 – Why Security Fixes Won’t Fix Your Security
SIA306 – Night of the Living Directory: Understanding Windows Server 2008 R2 Active Directory Recycle Bin, Undeletion and Reanimation
Unified Communications
UNC303* – Upgrading from Microsoft Exchange Server 2003/2007 to Exchange Server 2010: Tips, Tricks and Lessons Learned
UNC307* – What’s New in Archiving, Retention, and Discovery in Microsoft Exchange Server 2010 SP1
UNC201 – Microsoft Exchange Server 2010 SP1: An Overview of What’s Coming
UNC306 – Going Big! Deploying Large Mailboxes with Microsoft Exchange Server 2010 without Breaking the Bank
UNC203 – What’s New in OWA, Mobility, and Calendaring in Microsoft Exchange Server 2010 SP1
UNC301 – Microsoft Exchange Server 2010: Sizing and Performance – Get It Right the First Time

Virtualization
VIR310 – Networking and Windows Server 2008 R2 Hyper-V: Deployment Considerations
VIR403 – Virtualization FAQ, Tips and Tricks
VIR316 – Remote Desktop Session Host vs. Virtual Desktop Infrastructure Smackdown
Windows Client
WCL304 – Best Practices Guide to Managing Applications
WCL205 – Windows 7 Deployment Tips from Early Adopters
Windows Server
WSV208* – Best Practices in Architecting and Implementing Windows Server Update Services (WSUS)
WSV333 – DNSSEC and Windows: Get Ready, ‘Cause Here It Comes!
WSV201 – 10 Hot Topics Every IT Admin Needs to Know about Windows Server 2008 R2
WSV303 – Death of a Network: Identify the Hidden Causes of Lousy Network Performance
WSV301 – Administrators’ Idol: Windows and Active Directory Best Practices
WSV307 – Windows Server 2008 R2 SP1

Developer Tools, Languages & Frameworks
DEV211 – Microsoft Professional, Master and Architect Level Certifications: Notes from Those Who Have Conquered and Lived to Tell the Tale

Upcoming Events for Techies

The Citrix and Microsoft Roadshow – a free, half-day event being held in multiple locations across the US covering desktop virtualization. If you are in CA, catch it in Sacramento on May 25th, in San Francisco on June 10th or in Los Angeles on June 17th.
Enterprise Content Management in SharePoint – another free, half-day seminar hosted by Microsoft, QuickStart Intelligence, and KnowledgeLake. Learn how to lower costs and increase productivity by transforming your existing Microsoft SharePoint into an Advanced Enterprise Content Management system using SharePoint 2010. This is being held June 18th in Microsoft’s San Francisco office.
Also don’t forget about the Microsoft Bus Tour if you’ll be on the east coast, which starts today! I’m hearing some cities are already fully booked, so don’t miss out if you can still grab a slot.
The Bus Tour ends at TechEd in New Orleans and I’m looking forward to a fun-filled week of learning. Visit me at the Springboard booth in the TLC area if you are going to be there.

Gearing Up for Virtualization Certification

There are several virtualization exams available from Microsoft, some shiny and new and one that’s been around for a bit of time now. There is indication that there will be a new MCITP certification that’s not yet on the Microsoft certification list – MCITP: Windows Server 2008 R2, Virtualization Administrator.

At the moment, there are 3 exams that count toward this certification, though without a final say from the Microsoft website, I’m thinking that the full certification is not fully baked yet. However, there’s no reason you can’t get started. In the past, I’ve taken 70-652 (TS: Windows Server Virtualization, Configuring), which is a stand-alone Technology Specialist exam for virtualization with Hyper-V on Windows 2008. It does not cover Server 2008 R2 technologies.

The other 3 exams are new and are specifically geared toward Windows Server 2008 R2.

  • 70-669 TS: Windows Server 2008 R2, Desktop Virtualization
  • 70-659 TS: Windows Server 2008 R2, Server Virtualization
  • 70-693 Pro: Windows Server 2008 R2, Virtualization Administrator

There are very few study/prep materials available for these exams at the moment. However, expect that you’ll need to know about configuring and managing Hyper-V and RDS, as well as VDI, MED-V and App-V technologies.

Don’t forget, the Microsoft Second Shot offer is still available for exams taken through June 30th. Drop me an email if you need a voucher number for the second shot offer.

On the Tech Radar: Upcoming Events

Looking for some technology events for your calendar in the upcoming months? Here are a few that you might want to check out.
Start out April with the regular Pacific IT Professionals meeting on April 6th. Hear from Neustar about their Webmetrics and UltraDNS solutions. PacITPros will also be having a special TechDays event on Computer Forensics on April 12th. Sign up soon to secure your spot!
A one day Windows Intelligence event is being held in Burlingame on April 26th, hosted by QuickStart Intelligence and Microsoft. Technical tracks include Windows 7, Server 2008 R2 and Virtualization, Exchange and Office.
On June 10th, Microsoft and Citrix will team up and come to San Francisco to talk about desktop virtualization. Other cities and dates are on the schedule from now through June as part of the 2010 Virtualization Summit.
Finally, don’t forget some of the multi-day events, which are always a lot of fun – the Microsoft Management Summit in Las Vegas (April 19-23) and TechEd in New Orleans in June.

Two days at Microsoft: What makes an Optimized Desktop?

This week I’ve had the honor of spending two days at the Microsoft campus in Redmond, learning about the components of MDOP (Microsoft Desktop Optimization Pack) and the concept of the “Optimized Desktop”.

The discussion topics for the training revolved around the primary problem with desktop management: the components of a PC are bound together, making hardware and software difficult and expensive to replace and manage. Software and OS upgrades can slow drastically when the life-cycle of aging hardware components dictates what’s possible in the organization. Also, applications need consistent management to allow for ease of maintenance and the eventual retirement of dated and insecure tools.

Also, with new opportunities and challenges with cloud services, highly mobile workers and cutting edge consumer products, IT Professionals have a lot of needs to juggle to keep everyone working effectively. Users want easy access to their data from different devices, regardless of where it’s located – local to their office PC or laptop, on the corporate network or in the cloud.

The next generation optimized Windows desktop uses several applications found in MDOP to separate user data & settings, applications and the operating system from the hardware so they can be managed independently. This can make the adoption of newer, more secure operating systems easier to attain.

Ultimately, the Optimized Desktop helps bring some essential features to the finger tips of both the IT Pros and the users they support: end-to-end management, better application experiences, improved security and data protection, anywhere access for users, and reliable business continuity.

The components of MDOP include:

  • Enterprise Desktop Virtualization (MED-V)
  • Application Virtualization (App-V)
  • Diagnostics and Recovery Toolset (DaRT)
  • System Center Desktop Error Monitoring (DEM)
  • Asset Inventory Service (AIS)
  • Advanced Group Policy Management (AGPM)

I won’t drill down into each of those components in this particular post, but trust you’ll see more about these tools in the near future. Brad McCabe, Senior Product Manager for Windows Client, put together a full agenda for those of us in attendance and I was excited to be able to participate.

Finally, if you aren’t sure where you can go and what you can do with Desktop Virtualization (VDI), don’t miss out on the Desktop Virtualization Hour, Thursday 3/18 at 9am.

Tech Triple Play in San Francisco – March 2nd

Is your schedule empty on March 2nd? If so, you can fill your day with several technology events being held in downtown San Francisco.
Start your morning with a Microsoft TechNet Event (8am-Noon) for Windows Azure, Hyper-V and Windows 7 Deployment. Get an overview of Windows Azure, look at the tools and techniques available for building virtual environments in Hyper-V version 2.0, then learn how to simplify your Windows 7 deployments.
Then for the price of an Expo pass at RSA, spend the afternoon checking out the vendors in the Exposition hall. The Expo pass also gets you the afternoon keynotes on Wednesday, Thursday and Friday.
Finally, spend the evening hanging out with the Pacific IT Professionals at their monthly meeting, held at Microsoft’s downtown office at 6pm. Be sure to check out the site for meeting information and RSVP so there are enough snacks to go around.
See you there!

IT Roadmap at Moscone Center

Yesterday was the Network World IT Roadmap in San Francisco. I had the experience of being the user case study presenter for the virtualization session. If you happened to catch it, I apologize for talking too fast. I’m working on that!

Other sessions covered application delivery, green IT, IP communications, data center, cloud, network management, security and compliance and WAN, LAN and mobility. Phew. Network World offered a lot in one day, plus several additional keynotes and the expo hall. My co-worker caught the WAN, LAN and mobility session, so I’m curious to see what trouble he’ll be looking to cause in the office next week.

There was some twittering happening related to the conference, but I was disappointed to see that the @itroadmap Twitter handle didn’t tweet at all during the event. They had advertised Twitter on the conference site as a way to stay connected during the conference yet didn’t reach out to that audience once. Twitter is becoming a popular way to interact as things happen – several attendees were tweeting during sessions – so it seems like Network World missed out on an opportunity there.