Six Easy Pieces: How to Do Cost of Ownership Analysis Better

If cost of ownership analysis is a painful exercise for IT organizations, why has almost every company done it (and continued to do it) multiple times? Simply because management requires an accurate understanding of current IT costs and strengths so they can better assess new ideas and technologies. In this article, we will identify six key elements of effective cost of ownership analysis, which you can use to improve the accuracy and eliminate the frustration associated with this necessary step in your IT evolution.

1) Analyze Platforms, Not Servers

First, evaluate the current "platforms" within your environment, including all servers of all types, in order to simplify the process.

One of the most difficult things to "get right" in an analysis of this type is an exact match between a given technology and its associated costs. The easiest way to do this is not to limit the technology scope to a few machines or a single new application, but to expand it to match all the technology in the IT budget. Limiting the scope makes cost of acquisition simple to determine, but it makes every other cost almost impossible to quantify without controversy. A platform approach results in the development of a new "view" of the IT budget that is platform based, and the advantage of this approach is that the total in this view should match the total in the budget. That gives the study team tremendous leverage if discussions should wander to "I think this amount is too high for platform A": if the amount is reduced for platform A, it must be raised for platform B. What does B think of that? This places the entire cost discussion on solid footing - the IT budget - and allows the process to be managed dispassionately, a key to later acceptance of the results.
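As an illustration, here is a minimal sketch of what building such a view might look like; the line items, platform names, and dollar amounts below are purely hypothetical:

    # Build a platform-based "view" of the IT budget (hypothetical data).
    from collections import defaultdict

    it_budget_line_items = [
        # (description, platform, annual cost in $)
        ("Mainframe hardware maintenance", "System z", 1_200_000),
        ("Mainframe software licenses",    "System z", 2_500_000),
        ("x86 server refresh",             "x86",      1_800_000),
        ("x86 OS and virtualization",      "x86",        900_000),
        ("UNIX server leases",             "UNIX",     1_400_000),
        ("Enterprise storage",             "Shared",   2_000_000),
        ("Network",                        "Shared",   1_100_000),
    ]

    view = defaultdict(int)
    for description, platform, cost in it_budget_line_items:
        view[platform] += cost

    for platform, cost in sorted(view.items()):
        print(f"{platform:10s} ${cost:>12,}")

    # The acid test: the platform view must add back up to the full IT budget.
    assert sum(view.values()) == sum(cost for _, _, cost in it_budget_line_items)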

2) Focus on a Representative Application and Include All the Pieces

Next, let's consider a new business critical application or workload that requires platform selection. By definition, a critical application will require careful design, careful sizing, careful maintenance, operation, support, and disaster recoverability. It may also require new or dedicated infrastructure, but at a minimum it will tax existing infrastructure. Once again, the key to success is not to limit the view to a subset of components. The "view" developed in the previous step should facilitate this type of analysis.

Each of these components, and their associated costs, should be included in any cost of ownership comparison. Over the past ten years, our group within IBM Lab Services has been doing IT Systems and Storage Optimization ("Scorpion") Studies that focus on this type of view and component based analysis. Our findings show that a typical ratio of production Web, application, and database servers to "everything else" is about one to one. This means that any analysis that omits those other components for support, maintenance, disaster recovery, and so on may miss half of the real costs. The discrepancy grows for very large critical applications and is largely why our industry hasn't done so well sizing many new enterprise application suites. Vendors feed this controversy to gain competitive advantage. We've all heard the stories.
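To see what that ratio means in practice, consider a small worked example; every cost below is invented purely for illustration:

    # Hypothetical illustration of the ~1:1 ratio of production servers
    # to "everything else" (all figures invented).
    production = {                    # the servers most analyses count
        "web servers": 60_000,
        "application servers": 120_000,
        "database servers": 150_000,
    }
    everything_else = {               # the components often left out
        "development and test": 90_000,
        "failover / disaster recovery": 110_000,
        "systems management and monitoring": 40_000,
        "backup and security infrastructure": 70_000,
    }

    prod_total = sum(production.values())        # 330,000
    other_total = sum(everything_else.values())  # 310,000
    full_cost = prod_total + other_total

    print(f"Production only: ${prod_total:,}")
    print(f"Everything else: ${other_total:,}")
    print(f"Missed by a production-only analysis: {other_total / full_cost:.0%}")  # roughly half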

3) Consider Practical Capacity, Not Vendor Ratings

System capacity and performance can quickly become a very tedious and esoteric discussion, and in many cost of ownership efforts, it does. This can be avoided. Our experience is that the most important aspect of performance analysis within cost of ownership is not which vendor claim or benchmark is used as a base, but rather (a) what system utilizations are "normal" in your current environment, and (b) what is a reasonable expectation for the future. Often, distributed server utilizations are very low, and there is a good reason for it: an underutilized server requires no capacity planning. If average server utilizations in your environment are low, model a future state of 2 or 3 times the current for each component in the possible solution. No higher.

Most cost analyses are considered part of a technology acquisition process, so higher future state utilizations are assumed. This is particularly true with the rise of virtualization, which is almost always assumed in cost of ownership comparisons; transitioning from a non-virtualized to a virtualized server environment has some significant advantages, including higher potential utilization. Use any reasonable performance metric - the expected utilizations are far more important.
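A minimal sizing sketch along these lines, with purely hypothetical capacity and utilization figures, might look like this:

    import math

    # "Practical capacity" sizing: plan around expected utilization,
    # not vendor ratings. All numbers are hypothetical.
    def servers_needed(demand, capacity_per_server, target_utilization):
        # Servers required if we only intend to run them at target_utilization.
        return math.ceil(demand / (capacity_per_server * target_utilization))

    capacity_per_server = 100                      # arbitrary capacity units
    current_utilization = 0.12                     # measured in today's environment
    future_utilization = 3 * current_utilization   # "2 or 3 times the current - no higher"
    demand = 600                                   # work carried today by 50 servers at 12% busy

    print(servers_needed(demand, capacity_per_server, current_utilization))  # 50 servers
    print(servers_needed(demand, capacity_per_server, future_utilization))   # 17 servers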

Don't assume world class utilization numbers unless you know what kind of effort it will take to attain them. Capacity must be managed, and that has a cost. IBM's System z mainframe environments typically demonstrate this fact quite well: mainframes usually run at very high utilizations around the clock. They can do this because the level of internal standardization and automation is much higher than on other platforms. Other platforms will eventually attain these levels, but that is still years of vendor development away.

4) Don't Ignore Labor Costs to Protect the Innocent

The most difficult topic within cost of ownership is undoubtedly the cost of labor.

In a down economic cycle, most staff see nothing positive in quantifying the cost of labor for "their" platform. Therein lies a problem. High Full Time Equivalent (FTE) ratios have been an industry target for years, and most IT professionals can quote the current best practice and describe how they are exceeding it. IT infrastructure support organizations have been managing to these ratios for years using two basic strategies: (a) improve efficiency; or (b) push work onto other parts of the IT organization. The extent to which strategy (b) is used differs by platform for a variety of reasons, but the result is the same: any cost of ownership analysis that limits labor calculations to IT infrastructure support headcount will likely miss major portions of the real support costs and skew the results.

A good solution to this problem is an approach similar to item one. The same kind of "view" is now developed for the assignment of labor, with the underlying organization chart as the foundation. Consider the entire IT organization and apportion every group that is not truly platform neutral (and even individuals within an otherwise neutral group, like network support) to the appropriate platform labor category. The results will look quite different from industry published norms. They will be higher - up to 2 times or more on x86 platforms - but they will reflect true insight into cost. Because the resulting labor cost numbers cross organizational lines, no one group will feel responsible for them, or for lowering them.
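A simple sketch of that apportionment, using an invented organization chart and invented apportionment fractions, could look like this:

    # Apportion labor to platforms with the org chart as the foundation.
    # Groups, headcounts, and fractions below are hypothetical.
    org_chart = {
        # group: (FTEs, {platform: share of that group's work})
        "Intel server support":    (12, {"x86": 1.0}),
        "Mainframe operations":    (6,  {"System z": 1.0}),
        "Database administration": (10, {"x86": 0.6, "UNIX": 0.2, "System z": 0.2}),
        "Help desk":               (15, {"x86": 0.7, "UNIX": 0.1, "System z": 0.2}),
        "Network support":         (8,  {"x86": 0.5, "UNIX": 0.3, "System z": 0.2}),
    }

    labor_view = {}
    for group, (ftes, shares) in org_chart.items():
        for platform, share in shares.items():
            labor_view[platform] = labor_view.get(platform, 0) + ftes * share

    for platform, ftes in sorted(labor_view.items()):
        print(f"{platform:10s} {ftes:5.1f} FTEs")

In this made-up example, the x86 labor total ends up far above what the dedicated Intel support group alone would report, which is exactly the effect described above.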

A side benefit to the process stems from the two strategies often used to manage FTE ratios - productivity improvement and narrowing of responsibility. If the FTE ratios are changed significantly by the new "view" of the organization, the need for productivity tools will be evident. Resistance to the process will be lessened and, again, buy-in should be improved.

5) Quantify QoS in a Way that Makes Sense

Quality of Service (QoS) is an elusive topic since it has so many aspects that differ in importance between companies, but some general trends can give guidance. In the years that we have been working with customers doing these studies, we've seen an alarming trend toward high complexity within distributed systems - old hardware, old software, multiple releases of everything to be maintained - and a lack of investment in systems management software. This is in stark contrast to the mainframe, where software costs tend to be high while systems are maintained at strict currency levels, with the result that staffing has been flat or dropping for years with steadily improving QoS.

In this age of real-time systems, disaster recovery has become a universal need. Two key metrics in disaster recovery are Recovery Time Objective (RTO - the time to bring alternative systems online for use) and Recovery Point Objective (RPO - the age of the data on those recovered systems). If we consider the dominant RTO and RPO for a given platform, we gain insight into both cost and QoS. Though any system can be made disaster recoverable, there is a huge cost differential between making a single mainframe recoverable and making 1,000 distributed systems recoverable; the majority of customers we've worked with have done the former and not the latter because of the cost. This differential can be quantified very easily with a call to a recovery services provider and should be included in the platform cost comparison. Cloud computing and other metered services will certainly offer a recoverable option, and here lies an opportunity. As IT will have to compete with public clouds, the above cost analysis can be used to set internal cost guidelines for a corporate cloud infrastructure very early in the cloud development process. Or it can be used to steer workload onto the platform that is already recoverable, thus eliminating some of the need to develop a recovery capability where it currently does not exist.
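A hypothetical illustration of how that differential might be tabulated follows; the per-system prices and RTO/RPO figures are invented, and real numbers would come from a recovery services provider:

    # Invented disaster recovery comparison for two platforms.
    platforms = {
        # platform: (systems to recover, annual DR cost per system, RTO hours, RPO hours)
        "Mainframe":   (1,    250_000, 4,  1),
        "Distributed": (1000,   3_000, 48, 24),
    }

    for name, (systems, cost_per_system, rto, rpo) in platforms.items():
        print(f"{name:12s} {systems:5d} systems  "
              f"DR cost ${systems * cost_per_system:>10,}/yr  "
              f"RTO {rto}h  RPO {rpo}h")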

6) Look at Costs Incrementally - Plot Your Own Course

The last topic to consider is primarily financial. Just as the first chip to roll off a fabrication line is worth billions and the second worth pennies, the first workload for a platform is far more expensive to provision than subsequent ones. There is a "sunk cost" and an "incremental cost" associated with IT infrastructure that must be considered. This is especially true for the mainframe, since the technology may be physically refreshed but financially the transaction is handled as an upgrade. This is unlike the distributed world, where technology and book value are tied together. IBM has taken this concept a step further with the arrival of mainframe "specialty" engines that have much lower price points and drastically reduced impact on software costs. However, they cannot run alone; they must be added to an existing system. It is not unusual for mainframe systems in production to cost $4,000/MIP while specialty engine upgrades may run only $200/MIP - in this case, the incremental cost is 1/20th of the current cost.
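A quick worked example using those figures (the installed capacity and growth numbers are hypothetical):

    # Sunk vs. incremental cost on the mainframe (growth figures invented).
    current_mips = 2_000
    current_cost_per_mip = 4_000       # $/MIP for the installed production system
    growth_mips = 500                  # new workload eligible for a specialty engine
    specialty_cost_per_mip = 200       # $/MIP for the specialty engine upgrade

    sunk_cost = current_mips * current_cost_per_mip          # already spent or committed
    incremental_cost = growth_mips * specialty_cost_per_mip  # what the growth actually adds

    print(f"Sunk cost of the existing footprint: ${sunk_cost:,}")           # $8,000,000
    print(f"Incremental cost of 25% more capacity: ${incremental_cost:,}")  # $100,000
    print(f"Incremental $/MIP vs. current: {specialty_cost_per_mip / current_cost_per_mip:.0%}")  # 5%, i.e. 1/20th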

These kinds of dramatic differences must be considered in cost of ownership and are often large enough to justify a change in course for IT. Exploiting these areas of low incremental cost to support growth can significantly improve the overall cost of IT. Virtualization is expected to have a significant effect on other platforms, so the need is universal and growing.

With 33 years at IBM, John Schlosser is currently a Senior Managing Consultant for the Scorpion practice within IBM Systems and Technology Group - Lab Services & Training. He is a founding member of the group, which was started in 1999, and has developed and modified many of the methodologies the team uses for IT infrastructure cost analysis.
