Feb 4

Sivakumar, Head – Technology & Development (siva.k@appnomic.com)

 

The ITIL-based guidelines recommend the following set of processes for Operational Management –

  • Incident Management – Process to handle any disruption of service with measurable Service Level Agreements
  • Problem Management – Process to find the root cause of, and close out, repeating incidents or new incidents without a known solution, with the intent of building a knowledge base so that a recurrence of the same problem can be solved quickly.
  • Change Management – Process to handle any technology, system or application addition or change in a stable production setup with minimal impact on services.
  • Release Management – Process to test & verify any change before rollout into the production setup.
  • Configuration Management – Process to keep the Configuration Database updated with the latest information as and when any change is implemented in the production setup.

Ref: ITIL Standards

Autonomic Computing defines systems that conform to the following aspects of self-management –

  • Self-configuration - Automated configuration of components and systems follows high-level policies. The rest of the system adjusts automatically and seamlessly.
  • Self-optimization - Components and systems continually seek opportunities to improve their own performance and efficiency.
  • Self-healing - System automatically detects, diagnoses, and repairs localized software and hardware problems.
  • Self-protection - System automatically defends against malicious attacks or cascading failures. It uses early warning to anticipate and prevent system-wide failures.

Ref: www.research.ibm.com/autonomic/research/papers/AC_Vision_Computer_Jan_2003.pdf
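
To make the self-healing aspect concrete, here is a minimal sketch of a detect, diagnose and repair loop. It is only an illustration under my own assumptions: the Component class, the fault labels ("crashed", "hung") and the restart action are all hypothetical and do not come from the IBM paper or any real product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Component:
    """A hypothetical managed component with a simple health flag."""
    name: str
    healthy: bool = True
    fault: Optional[str] = None  # illustrative fault label, e.g. "crashed"


def restart(component: Component) -> None:
    """Hypothetical repair action: pretend a restart clears the fault."""
    component.healthy = True
    component.fault = None


# Assumed mapping from a diagnosed fault to a known repair action.
REPAIRS: Dict[str, Callable[[Component], None]] = {
    "crashed": restart,
    "hung": restart,
}


def detect(components: List[Component]) -> List[Component]:
    """Detect: return the components that currently report as unhealthy."""
    return [c for c in components if not c.healthy]


def diagnose(component: Component) -> Optional[str]:
    """Diagnose: return the fault if a repair is known for it, else None."""
    return component.fault if component.fault in REPAIRS else None


def self_heal(components: List[Component]) -> List[str]:
    """Repair what can be repaired; return names that need a human."""
    escalated = []
    for component in detect(components):
        fault = diagnose(component)
        if fault is None:
            escalated.append(component.name)  # no known repair
        else:
            REPAIRS[fault](component)
    return escalated


components = [
    Component("web"),
    Component("db", healthy=False, fault="crashed"),
    Component("app", healthy=False, fault="disk-corruption"),
]
escalated = self_heal(components)
```

In this sketch the "db" component is repaired locally, while "app" has a fault with no known repair and is handed to a person.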

Imagine ITSM after Autonomic Computing systems are in abundance. Incident Management could become non-existent, as the systems would be able to correct themselves when they detect a failure. Problem Management may still exist, as there would always be complex issues that cannot be self-healed by the system and hence would need the intelligence of a human being. But then the human being needs to be far more advanced than the complex system to solve such problems.

Change & Release Management would continue to exist in one form or another, as there would always be a need to change & improve the systems for automation and efficiency. Configuration Management, which is at the heart of ITIL, may be fully automated, with complete versioning & backup available as of any point in time. Configuration changes may be made automatically by the system in response to increased load or localized failures in order to meet the SLA. Policy definitions (including the SLA) will become the only configuration item that needs to be maintained by the user.
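
As a rough sketch of what policy-driven configuration could look like, the example below lets the user maintain only a policy, while the system derives the instance count from the current load and keeps every applied configuration as a version. The Policy fields, the ConfigHistory class and the numbers used are my own assumptions for illustration, not an existing tool or API.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Policy:
    """Hypothetical user-maintained policy derived from the SLA."""
    max_load_per_instance: float  # load one instance can serve within the SLA
    min_instances: int = 1
    max_instances: int = 10


def desired_instances(policy: Policy, current_load: float) -> int:
    """Instances needed for the load, clamped to the policy limits."""
    needed = math.ceil(current_load / policy.max_load_per_instance)
    return max(policy.min_instances, min(policy.max_instances, needed))


class ConfigHistory:
    """Keeps every applied configuration so any version can be recovered."""

    def __init__(self, initial_instances: int) -> None:
        self.versions: List[int] = [initial_instances]

    @property
    def current(self) -> int:
        return self.versions[-1]

    def reconcile(self, policy: Policy, current_load: float) -> bool:
        """Apply a new configuration only if the policy requires a change."""
        target = desired_instances(policy, current_load)
        if target == self.current:
            return False
        self.versions.append(target)
        return True


policy = Policy(max_load_per_instance=100.0, min_instances=2, max_instances=5)
config = ConfigHistory(initial_instances=2)
changes = [config.reconcile(policy, load) for load in (250.0, 260.0, 900.0)]
```

Here the load rises from 250 to 900 units, so the configuration moves from 2 to 3 and then to the policy ceiling of 5 instances, while the step from 250 to 260 needs no change. The only thing a person edits is the Policy.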

Are we anywhere close to reaching the ideal state described above? What are the technologies that are helping us progress in this direction? Is autonomic computing a viable solution in the near future? How much progress has been made in this direction? Is there a different approach to solving the ITSM problem? I’m trying to find answers to these questions and would like to discuss my thoughts with groups working in these technology areas. I will keep posting my thoughts and findings in the coming weeks.

Feb 4

 

Complex Enterprise Class Applications support tens of thousands of users and process millions of transactions a day. Downtime of even a few minutes could translate into lost business and have a direct financial impact on the Organization. Such applications have complex interactions with different elements: Host Operating Systems, Network Components (LAN & WAN), Application Servers, Database Servers, Web Servers & Client Operating Systems.

Managing such an application is not the same as managing the individual elements in the setup. Ensuring that the devices are performing well does not always translate to better performance for the end users. It is becoming increasingly difficult to find experts in the application who also have in-depth knowledge of the different infrastructure components, so that incidents & problems can be identified and resolved in the quickest possible time. The predicted global trend is that Infrastructure Capital costs are going down while Infrastructure Management costs are increasing. In many Organizations, Management costs account for more than 60% of the IT budget. If this trend continues, the CTO of an Organization has the following options to reduce the Management cost:

  1. Outsource and, if required, offshore operations to a cheaper vendor and location. This option has proved successful, with large Organizations setting up Captives & outsourcing the routine operations of IT Management. The ITSM guidelines based on ITIL standards have become the reference for managing IT operations, and parts of these processes can be outsourced to a cost-effective vendor.
  2. Increase the use of tools & automation to eliminate manual activities. There has been a slew of specialized tools built by different vendors to manage & automate specific sets of activities. IBM, HP, CA & Remedy each have a whole suite of tools to monitor, manage & automate most IT operational activities. But I find that there is a huge cost involved in setting up & integrating the set of tools needed to cover the complete scope of IT operations. The SMB segment is finding it difficult to implement such high-end tools, and this in turn has created a set of low-cost solution providers who target the SMB segment with specific tools. The current state of IT operations in most Organizations is typically a combination of specific tools for critical operations and manual, process-driven work for non-critical operations, to balance the IT cost.

Looking into the future, I see new technology solutions maturing to tackle the problems of IT management. One such approach is Autonomic Computing; many companies are working on this initiative, and IBM is a key proponent.

Definition - Autonomic computing is a self-managing computing model named after, and patterned on, the human body’s autonomic nervous system. An autonomic computing system would control the functioning of computer applications and systems without input from the user, in the same way that the autonomic nervous system regulates body systems without conscious input from the individual. The goal of autonomic computing is to create systems that run themselves, capable of high-level functioning while keeping the system’s complexity invisible to the user.

If this lives up to its promise, then IT infrastructure systems & applications would manage themselves, and IT Management as a whole would focus on defining the governance policies & SLA instead of worrying about operational procedures. I will be exploring this technology and other technology solutions that would impact ITSM, and will keep this post updated in the coming weeks.